PredictBias

identification of genomic and pathogenicity islands in prokaryotic genome
 
A) Input parameters
Genome: 1197.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic): (blank)
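
The E-value parameter above acts as a cutoff on the RPSBlast hits reported in section C. A minimal sketch of such a filter follows, using the island 2 hits from this page as input. The helper name `filter_hits` and the choice of a less-than-or-equal comparison are assumptions made for illustration, not PredictBias internals.

```python
# Hits copied from island 2 in section C: (locus tag, signature, best E-value).
HITS = [
    ("CGSHiGG_00570", "GPOSANCHOR", 1e-04),
    ("CGSHiGG_00590", "ALARACEMASE", 0.015),
    ("CGSHiGG_00610", "PF05043", 0.018),
    ("CGSHiGG_00635", "PF05616", 0.003),
    ("CGSHiGG_00650", "BCTERIALGSPD", 2e-23),
    ("CGSHiGG_00660", "HTHFIS", 2e-22),
]


def filter_hits(hits, max_evalue=0.05):
    """Keep hits whose E-value does not exceed the cutoff.

    Hypothetical helper: the <= comparison is an assumption.
    """
    return [hit for hit in hits if hit[2] <= max_evalue]


if __name__ == "__main__":
    # Cutoff used on this page (0.05) versus a stricter, illustrative one.
    for cutoff in (0.05, 1e-3):
        kept = [name for _, name, _ in filter_hits(HITS, cutoff)]
        print(cutoff, kept)
```

With the page's cutoff of 0.05 all six hits pass; a stricter cutoff such as 1e-3 would keep only the three strongest signatures.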
 
 
C) Potential GIs and PAIs in NC_009567
S.No  Start          End            Bias  Virulence  Insertion elements  Prediction
1     CGSHiGG_00085  CGSHiGG_00120  Y     N          N                   Genomic Island
LocusTag       DNBias  CDNBias  %GCBias    Product
CGSHiGG_00085  3       20       -2.083436  hypothetical protein
CGSHiGG_00090  3       21       -3.200187  hypothetical protein
CGSHiGG_00095  2       20       -3.993186  hypothetical protein
CGSHiGG_00100  1       19       -3.982427  lic-1 operon protein
CGSHiGG_00105  1       23       -1.479661  lic-1 operon protein
CGSHiGG_00110  3       23       -2.962724  lic-1 operon protein
CGSHiGG_00115  2       23       -2.637526  hypothetical protein
CGSHiGG_00120  3       22       -2.442355  hypothetical protein
2     CGSHiGG_00500  CGSHiGG_00670  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
CGSHiGG_00500  -1      13       -4.203076  putative zinc protease
CGSHiGG_00505  1       10       -0.503179  zinc protease
CGSHiGG_00520  0       10       -1.491763  putative molybdenum-pterin binding protein
CGSHiGG_00525  0       10       -1.595166  putative dissimilatory sulfite reductase,
CGSHiGG_00540  0       10       -1.936924  condesin subunit F
CGSHiGG_00545  0       11       -2.392359  modification methylase HaeII
CGSHiGG_00570  1       11       -1.838922  cell division protein MukB
CGSHiGG_00575  4       18       -6.041317  hypothetical protein
CGSHiGG_00580  2       19       -4.890450  cell division protein MukB
CGSHiGG_00585  5       22       -5.001733  exonuclease I
CGSHiGG_00590  10      27       -5.874244  hypothetical protein
CGSHiGG_00595  10      25       -5.838513  hypothetical protein
CGSHiGG_00600  9       25       -5.620542  hypothetical protein
CGSHiGG_00605  4       25       -4.368019  RstR-like phage repressor protein
CGSHiGG_00610  3       26       -4.434565  hypothetical protein
CGSHiGG_00615  1       26       -4.764314  hypothetical protein
CGSHiGG_00620  0       25       -4.008741  hypothetical protein
CGSHiGG_00625  1       24       -3.768141  hypothetical protein
CGSHiGG_00630  1       23       -5.447513  hypothetical protein
CGSHiGG_00635  1       23       -5.640925  hypothetical protein
CGSHiGG_00640  2       21       -5.471482  putative phage-like membrane protein
CGSHiGG_00645  1       19       -4.847950  putative phage-like membrane protein
CGSHiGG_00650  2       20       -4.981577  putative phage-like secreted protein
CGSHiGG_00655  2       20       -4.561327  phosphate regulon sensor protein PhoR
CGSHiGG_00660  3       18       -3.139541  phosphate regulon transcriptional regulatory
CGSHiGG_00665  3       18       -3.461599  phosphate transporter ATP-binding protein
CGSHiGG_00670  2       14       -2.463578  phosphate transport system permease protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_00570  GPOSANCHOR    39     1e-04    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 38.9 bits (90), Expect = 1e-04
Identities = 34/292 (11%), Positives = 83/292 (28%), Gaps = 33/292 (11%)

Query: 881 EINRERNEIDRELNQFNNGEQQLRIQLDNAKEKLQLLNKLIPQLNVLADEDLIDRIEECR 940
+++ + ++ + +L + L I ++L R +
Sbjct: 75 DLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKI--------QELEARKADLE 126

Query: 941 EQLDIAE----QDEYFIRQYGVTLSQLEPIANSLQSDPENYEGLKNELTQAIERQKQVQQ 996
+ L+ A D I+ + L L+ E + I+ + +
Sbjct: 127 KALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKA 186

Query: 997 RVFALADVVQRKPHFGYEDAGQAET------------SELNEKLRQRLEQMQAQRDTQRE 1044
+ A +++ + + L + LE
Sbjct: 187 ALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSA 246

Query: 1045 QVRQKQSQFAEYNRVLIQLQSSYDSKYQLLNELIGEISDLGVRADDGAEERARIR----- 1099
+++ +++ A +L+ + + +I L E+A +
Sbjct: 247 KIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQV 306

Query: 1100 ----RDELHQQLSTSRQRRSYVEKQLTLIESEADNLNRLIRKTERDYKTQRE 1147
R L + L SR+ + +E + +E + + RD RE
Sbjct: 307 LNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASRE 358



Score = 33.5 bits (76), Expect = 0.007
Identities = 52/358 (14%), Positives = 114/358 (31%), Gaps = 29/358 (8%)

Query: 414 ANDALEESQAQFEQTEIEIDAVRAQLADYQKALDAQQTRALQYQQAIAALEKAKTLCGLA 473
+ E S ++ V+ + ++ + + + AL+
Sbjct: 34 VVNTNEVSAVATRSQTDTLEKVQERADKFEIENNTLKLKNSDLSFNNKALKD-------- 85

Query: 474 DLSVKNVEDYHAEFEAHAESLTETVLELEHKMSISEAAKSQFDKAYQLVCKIAGEMPRSA 533
+ + + + +++ E K+ EA K+ +KA + + +
Sbjct: 86 --HNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTAD-SAK 142

Query: 534 AWESAKELLREYPSQKLQAQQTPQLRTKLHELEQRYAQQQSAVKLLNDFNQRTNLSLQTA 593
E + + + ++ L +L+ A
Sbjct: 143 IKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGA 202

Query: 594 EELEDYHAEQEALIEDISAGLSEQVENRSTLRQKRENLTALYDENARKAPAWLTAQAALE 653
A I+ + A + ++ L + E ++ K T +A
Sbjct: 203 MNFST---ADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIK---TLEAEKA 256

Query: 654 RLEQQSGETFEHSQDVMNFMQSQLVKERELTMQRDQLEQKRLQLDE--QISRLSQPDGSE 711
LE + E + + MNF + K + L ++ LE ++ L+ Q+ ++
Sbjct: 257 ALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRR 316

Query: 712 DPRLNMLAERFGGVLLSELYDDVTIEDAPYFSALYGPSRHAIVVRDLNAVREQLAQLE 769
D + A++ +L + I +A SR + + RDL+A RE QLE
Sbjct: 317 DLDASREAKKQLEAEHQKLEEQNKISEA---------SRQS-LRRDLDASREAKKQLE 364



Score = 30.8 bits (69), Expect = 0.042
Identities = 35/201 (17%), Positives = 63/201 (31%), Gaps = 31/201 (15%)

Query: 337 KAKAEQNLSQHRLVDLSREAAELAENERTLEVDHQSAVDHLNLVLNALRHQEKITRYQED 396
A ++ L E A L + LE + A++ + + +
Sbjct: 236 GAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFST---ADSAKIKTLEAEKAA 292

Query: 397 IAELTERLEEQKMV----VEDANDALEESQAQFEQTEIEI-------DAVRAQLADYQKA 445
+ LE Q V + L+ S+ +Q E E A ++
Sbjct: 293 LEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRD 352

Query: 446 LDAQQTRALQYQQAIAALEKAKTLCGLADLSV--------------KNVEDYHAEFEAHA 491
LDA + +Q A +K + +++ S K VE E +
Sbjct: 353 LDASRE---AKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQVEKALEEANSKL 409

Query: 492 ESLTETVLELEHKMSISEAAK 512
+L + ELE ++E K
Sbjct: 410 AALEKLNKELEESKKLTEKEK 430
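
The `Identities`, `Positives` and `Gaps` lines in the alignment blocks above give a count, an alignment length and a percentage. A minimal sketch of how such a line can be parsed and the percentages rechecked is shown below. The regular expression and the function names are illustrative assumptions; the observation that the reported percentages are truncated rather than rounded is drawn only from the lines tested here (for example 34/292 is about 11.6% and is reported as 11%).

```python
import re

# Matches e.g. "Identities = 34/292 (11%)" within a statistics line.
STAT = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)")


def parse_stats(line):
    """Return {name: (count, length, reported_percent)} for one statistics line."""
    return {name: (int(n), int(d), int(p)) for name, n, d, p in STAT.findall(line)}


def floor_percent(count, length):
    """Percentage truncated toward zero, as the lines on this page appear to use."""
    return (100 * count) // length


if __name__ == "__main__":
    line = "Identities = 34/292 (11%), Positives = 83/292 (28%), Gaps = 33/292 (11%)"
    for name, (n, d, reported) in parse_stats(line).items():
        print(name, f"{n}/{d}", "reported:", reported, "recomputed:", floor_percent(n, d))
```

Lines without a `Gaps` field, such as `Identities = 8/30 (26%), Positives = 13/30 (43%)`, are handled by the same pattern because each field is matched independently.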


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_00590  ALARACEMASE   27     0.015    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 26.7 bits (59), Expect = 0.015
Identities = 8/30 (26%), Positives = 13/30 (43%)

Query: 42 DRLETAEQTQARFEQMEHLGEFLKELTNSS 71
+ + AR EQ E + L+NS+
Sbjct: 164 EHPDGISGAMARIEQAAEGLECRRSLSNSA 193


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_00610  PF05043       30     0.018    Transcriptional activator
		>PF05043#Transcriptional activator

Length = 493

Score = 29.9 bits (67), Expect = 0.018
Identities = 13/65 (20%), Positives = 27/65 (41%), Gaps = 7/65 (10%)

Query: 10 IEQDFGIEIPEELLLSIYGQYLMVVTEGGEI-------QKSRVTGKYHHKGSYCDEVSIK 62
E ++ I + EE++ ++ Y + E + S V YH + D++S+K
Sbjct: 245 FESEYNISLDEEVVCQLFVSYFQKMFFIDESLFMKCVKKDSYVEKSYHLLSDFIDQISVK 304

Query: 63 ISGSV 67
+
Sbjct: 305 YQIEI 309


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_00635  PF05616       33     0.003    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 32.8 bits (74), Expect = 0.003
Identities = 36/140 (25%), Positives = 57/140 (40%), Gaps = 18/140 (12%)

Query: 337 LNFNYSPDMFDDKKTQKPINPSRPSNPGGSSNNTHSNKHD--------EDNYGDPNYPNL 388
LN + +PD D + +P +P+ P P G D D P
Sbjct: 359 LNPDANPDT-DGQPGTRPDSPAVPDRPNGRHRKERKEGEDGGLLCKFFPDILACDRLP-- 415

Query: 389 EPPTAHQILEPFKKFFPEFQNLTIQGKAAQCP---TWSFNALNRT----YTIDSHCPILE 441
EP A + P + EFQ I +AQCP T++ L+ + ++ ++ C I E
Sbjct: 416 EPNPAEDLNLPSETVNVEFQKSGIFQDSAQCPAPVTFTVTVLDSSRQFAFSFENACTIAE 475

Query: 442 QNRAILGALFTLIWSIIAIR 461
+ R +L AL + + IR
Sbjct: 476 RLRYMLLALAWAVAAFFCIR 495


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_00650  BCTERIALGSPD  95     2e-23    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 95.4 bits (237), Expect = 2e-23
Identities = 50/284 (17%), Positives = 113/284 (39%), Gaps = 36/284 (12%)

Query: 125 KDEEGAVSASG--DKLVYYGKTEDIARVKSVLKGVDVPSREVVVTGYVFEVQTEEKEGSG 182
D+ + A G + L+ + + ++ V+ +D+ +V+V + EVQ + G
Sbjct: 306 LDKNIIIKAHGQTNALIVTAAPDVMNDLERVIAQLDIRRPQVLVEAIIAEVQDADGLNLG 365

Query: 183 INLLAKLLSGKLGINIGV-------------------------KQNYENFI-TVNTGNLD 216
I K N G+ ++ GN
Sbjct: 366 IQWANKNAGMTQFTNSGLPISTAIAGANQYNKDGTVSSSLASALSSFNGIAAGFYQGNWA 425

Query: 217 AMIELFRTDSRFHVVSSPTLRVKSGSKGNFSVGSDVPVLGAVTYDRDGRAVQSIEYRSSG 276
++ + ++ ++++P++ + F+VG +VPVL ++E ++ G
Sbjct: 426 MLLTALSSSTKNDILATPSIVTLDNMEATFNVGQEVPVLTGSQTTSGDNIFNTVERKTVG 485

Query: 277 VIFDIQPTI-KNNAIDLKINQQLSNFVKTDTGVNQS--PTLIKRDIVTDVTLKSGDIVVL 333
+ ++P I + +++ L+I Q++S+ + + T R + V + SG+ VV+
Sbjct: 486 IKLKVKPQINEGDSVLLEIEQEVSSVADAASSTSSDLGATFNTRTVNNAVLVGSGETVVV 545

Query: 334 GGLAENKITEGETGFSFLPKGILSGR-----SKSNSKTDIVILL 372
GGL + +++ L + G SK SK ++++ +
Sbjct: 546 GGLLDKSVSDTADKVPLLGDIPVIGALFRSTSKKVSKRNLMLFI 589


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_00660  HTHFIS        88     2e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 88.3 bits (219), Expect = 2e-22
Identities = 36/130 (27%), Positives = 62/130 (47%), Gaps = 4/130 (3%)

Query: 1 MTR-KILIVEDECAIREMIALFLSQKYYDVIEASDFKTAINKI-KENPKLILLDWMLPGR 58
MT IL+ +D+ AIR ++ LS+ YDV S+ T I + L++ D ++P
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 59 SGIQFIQYIKKQESYAAIPIIMLTAKSTEEDCIACLNAGADDYITKPFSPQILLARIEAV 118
+ + IKK +P+++++A++T I GA DY+ KPF L+ I
Sbjct: 61 NAFDLLPRIKKA--RPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA 118

Query: 119 WRRIYEQQSQ 128
+ S+
Sbjct: 119 LAEPKRRPSK 128


3     CGSHiGG_00840  CGSHiGG_00950  Y     N          N                   Genomic Island
LocusTag       DNBias  CDNBias  %GCBias    Product
CGSHiGG_00840  2       23       -2.642323  hypothetical protein
CGSHiGG_00845  4       25       -2.570519  putative antirepressor protein
CGSHiGG_00850  4       28       -2.808448  hypothetical protein
CGSHiGG_00855  2       30       -1.171886  hypothetical protein
CGSHiGG_00860  2       33       0.279362   hypothetical protein
CGSHiGG_00865  2       32       0.169566   hypothetical protein
CGSHiGG_00870  2       28       0.772020   hypothetical protein
CGSHiGG_00875  1       25       1.527565   hypothetical protein
CGSHiGG_00880  1       26       1.946062   putative recombination protein NinB
CGSHiGG_00885  2       25       1.441462   hypothetical protein
CGSHiGG_00890  2       25       1.606692   hypothetical protein
CGSHiGG_00895  0       25       1.854688   hypothetical protein
CGSHiGG_00900  2       22       -0.408650  hypothetical protein
CGSHiGG_00905  4       24       -1.892056  hypothetical protein
CGSHiGG_00910  6       27       -2.529967  hypothetical protein
CGSHiGG_00915  5       27       -2.296171  hypothetical protein
CGSHiGG_00920  4       28       -2.682591  transcriptional regulator
CGSHiGG_00925  6       29       -4.395935  hypothetical protein
CGSHiGG_00930  4       32       -4.475146  hypothetical protein
CGSHiGG_00935  4       27       -4.409357  hypothetical protein
CGSHiGG_00940  4       28       -3.675054  hypothetical protein
CGSHiGG_00945  4       29       -3.974993  hypothetical protein
CGSHiGG_00950  1       30       -3.425671  hypothetical protein
4     CGSHiGG_01475  CGSHiGG_01560  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
CGSHiGG_01475  2       16       -0.509036  hypothetical protein
CGSHiGG_01480  2       14       0.291452   nuclease
CGSHiGG_01485  2       12       1.116512   putative selenocysteine lyase
CGSHiGG_01490  0       20       3.078094   hypothetical protein
CGSHiGG_01495  -1      17       2.309988   hypothetical protein
CGSHiGG_01500  -2      17       1.488391   7-cyano-7-deazaguanine reductase
CGSHiGG_01505  -2      17       1.289486   bifunctional chorismate mutase/prephenate
CGSHiGG_01510  -1      22       0.926505   tRNA pseudouridine synthase B
CGSHiGG_01525  0       16       0.949262   putative type I restriction enzyme HindVIIP M
CGSHiGG_01530  2       12       -1.253338  putative type I restriction enzyme HindVIIP
CGSHiGG_01535  3       14       -0.852034  hypothetical protein
CGSHiGG_01540  3       15       0.195423   hypothetical protein
CGSHiGG_01545  2       11       0.667547   hypothetical protein
CGSHiGG_01560  2       10       1.475923   translation initiation factor IF-2
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_01560  TCRTETOQM     70     2e-14    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 70.3 bits (172), Expect = 2e-14
Identities = 59/278 (21%), Positives = 96/278 (34%), Gaps = 77/278 (27%)

Query: 350 IMGHVDHGKTSLLDYIRKAKVAAGEAG------------------GITQHIGAYHVEMDD 391
++ HVD GKT+L + + A E G GIT G + ++
Sbjct: 8 VLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQWEN 67

Query: 392 GKMITFLDTPGHAAFTSMRARGAKATDIVVLVVAADDGVMPQTIEAIQHAKAAGAPLVVA 451
K + +DTPGH F + R D +L+++A DGV QT + G P +
Sbjct: 68 TK-VNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIPTIFF 126

Query: 452 VNKIDKPEANP-----------------------------------DRVEQELLQHDVIS 476
+NKID+ + ++ + + +D +
Sbjct: 127 INKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEGNDDLL 186

Query: 477 EKFGGDVQ------------------FVPV---SAKKGTGVDDLLDAILLQSEVLELTAV 515
EK+ PV SAK G+D+L++ I ++ T
Sbjct: 187 EKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVIT--NKFYSSTHR 244

Query: 516 KDGMASGVVIESYLDKGRGPVATILVQSGTLRKGDIVL 553
G V + + R +A I + SG L D V
Sbjct: 245 GQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVR 282


5     CGSHiGG_01880  CGSHiGG_01905  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
CGSHiGG_01880  4       20       -1.902131  translation initiation factor Sui1
CGSHiGG_01885  2       18       -1.506705  orotidine 5'-phosphate decarboxylase
CGSHiGG_01890  3       18       -0.895975  tetratricopeptide repeat protein
CGSHiGG_01895  5       20       -0.301162  hypothetical protein
CGSHiGG_01900  4       17       0.305850   integration host factor subunit beta
CGSHiGG_01905  4       18       0.462447   30S ribosomal protein S1
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_01880  SECA          33     1e-04    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 33.3 bits (76), Expect = 1e-04
Identities = 21/55 (38%), Positives = 29/55 (52%)

Query: 52 DLSDEELKKLAAELKKRCGCGGAVKNGIIEIQGEKRDLLKQLLEQKGFKVKLSGG 106
LSDEELK AE + R G ++N I E R+ K++ + F V+L GG
Sbjct: 37 KLSDEELKGKTAEFRARLEKGEVLENLIPEAFAVVREASKRVFGMRHFDVQLLGG 91


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_01900  DNABINDINGHU  107    3e-34    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 107 bits (268), Expect = 3e-34
Identities = 34/89 (38%), Positives = 51/89 (57%), Gaps = 1/89 (1%)

Query: 2 TKSELMEKLSAKQPTLPAKEIENMVKDILEFISQSLENGDRVEVRGFGSFSLHHRQPRLG 61
K +L+ K+ A+ L K+ V + +S L G++V++ GFG+F + R R G
Sbjct: 3 NKQDLIAKV-AEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKG 61

Query: 62 RNPKTGDSVNLSAKSVPYFKAGKELKARV 90
RNP+TG+ + + A VP FKAGK LK V
Sbjct: 62 RNPQTGEEIKIKASKVPAFKAGKALKDAV 90


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_01905  ACRIFLAVINRP  30     0.029    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 30.2 bits (68), Expect = 0.029
Identities = 25/162 (15%), Positives = 51/162 (31%), Gaps = 22/162 (13%)

Query: 318 PSKVVSLGDTVEVMVLEIDEERRRISLG-LKQCKANPWTQFADTHNKGDKVTGKIKSITD 376
+ T ++ ++ + +I+ G L A P Q N + K+ +
Sbjct: 190 ADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQL----NASIIAQTRFKNPEE 245

Query: 377 FGIFIGLEGGIDGLVHLSDISWSISGEEAVRQYKKGDEVSAVVLAV------------DA 424
FG +V L D++ G E + + A L + A
Sbjct: 246 FGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALDTAKA 305

Query: 425 VKERISLGIKQLEED-----PFNNFVAINKKGAVVSATVVEA 461
+K +++ + P++ + V T+ EA
Sbjct: 306 IKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEA 347


6     CGSHiGG_02000  CGSHiGG_02025  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
CGSHiGG_02000  2       18       -1.532801  2-oxoglutarate dehydrogenase E1 component
CGSHiGG_02005  8       32       -5.370631  hypothetical protein
CGSHiGG_02010  6       30       -5.887105  hypothetical protein
CGSHiGG_02015  4       25       -4.767172  hypothetical protein
CGSHiGG_02020  2       21       -3.375793  hypothetical protein
CGSHiGG_02025  3       19       -2.859101  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_02000  SALSPVBPROT   32     0.011    Salmonella virulence plasmid 65kDa B protein signature.
		>SALSPVBPROT#Salmonella virulence plasmid 65kDa B protein signature.

Length = 591

Score = 32.0 bits (72), Expect = 0.011
Identities = 32/127 (25%), Positives = 49/127 (38%), Gaps = 15/127 (11%)

Query: 600 ETMAYATLLDEGVNVRLSGEDAGRGTFFHRHAVVHNQNDGTGYVPLTHLHANQGRFEVWD 659
E + Y+ L + G NV L+G +AGR R+ + T P L+ +W
Sbjct: 202 EHIYYSYLAENGDNVDLNGNEAGRDRSAMRYLSKVQYGNAT---PAADLY-------LWT 251

Query: 660 SVLSEES---VLAFEYGYATTDPKTLTIWEAQFGDFANGAQIVIDQFISSGEQKWGRMCG 716
S L F+YG DP+ + AQ A Q + E + R+C
Sbjct: 252 SATPAVQWLFTLVFDYGERGVDPQVPPAFTAQNSWLAR--QDPFSLYNYGFEIRLHRLCR 309

Query: 717 LVMLLPH 723
V++ H
Sbjct: 310 QVLMFHH 316


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_02020  RTXTOXIND     37     9e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 36.7 bits (85), Expect = 9e-05
Identities = 19/107 (17%), Positives = 38/107 (35%), Gaps = 13/107 (12%)

Query: 124 DINECKAFKRDYNNAISELRKQIQE----QKKQAIAKQKAVEQQRKAQQIQAEKKRKARE 179
+ A Y N + ++ + KQAIAK +EQ+ K + E
Sbjct: 215 ERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNE------- 267

Query: 180 AYLKTPQGQAELARQQQAEYQRQMLAQQREYQQQMLEMQRQAAQQAA 226
L+ + Q E + + + + ++ ++L+ RQ
Sbjct: 268 --LRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIG 312


7     CGSHiGG_02205  CGSHiGG_02245  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
CGSHiGG_02205  0       13       -3.338085  molybdate-binding periplasmic protein
CGSHiGG_02210  1       14       -5.298870  molybdenum transport protein
CGSHiGG_02215  4       18       -7.467069  putative UDP-galactose--lipooligosaccharide
CGSHiGG_02230  7       18       -7.584711  putative UDP-GlcNAc--lipooligosaccharide
CGSHiGG_02235  2       15       -4.428042  lipopolysaccharide biosynthesis protein
CGSHiGG_02240  0       14       -3.357426  hypothetical protein
CGSHiGG_02245  0       12       -3.153182  lipopolysaccharide biosynthesis protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_02245  56KDTSANTIGN  31     0.004    Rickettsia 56kDa type-specific antigen protein sign...
		>56KDTSANTIGN#Rickettsia 56kDa type-specific antigen protein signature.

Length = 533

Score = 31.5 bits (71), Expect = 0.004
Identities = 19/74 (25%), Positives = 28/74 (37%), Gaps = 6/74 (8%)

Query: 153 KIKKHYTVYPNYKNIVSNIEPISLWDNKVDGDIDG--KVSFFIGQPLLNTKEENISLIKK 210
I+K + + P + PIS+ D DI + QP LN ++ + I
Sbjct: 127 PIRKPFKLTPPQPTMS----PISIADRDFGIDIPNIPQAQRQAAQPPLNDQKRAAARIAW 182

Query: 211 LKEQISFDYYFPHP 224
LK DY P
Sbjct: 183 LKNCAGIDYMVKDP 196


8     CGSHiGG_02690  CGSHiGG_02745  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
CGSHiGG_02690  -3      11       -3.376223  rod shape-determining protein MreD
CGSHiGG_02695  -3      14       -3.183866  hypothetical protein
CGSHiGG_02700  -2      12       -2.682093  exonuclease III
CGSHiGG_02705  -1      12       -2.995607  exonuclease III
CGSHiGG_02710  0       14       -3.582369  hypothetical protein
CGSHiGG_02715  1       15       -3.041755  FtsH-interacting integral membrane protein
CGSHiGG_02720  2       16       -3.151388  *hypothetical protein
CGSHiGG_02725  2       15       -3.245730  hypothetical protein
CGSHiGG_02740  2       15       -3.304958  glucuronate isomerase
CGSHiGG_02745  2       15       -3.525820  glucuronide permease
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_02745  TCRTETA       37     1e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 37.5 bits (87), Expect = 1e-04
Identities = 36/161 (22%), Positives = 59/161 (36%), Gaps = 9/161 (5%)

Query: 246 SNKPLQIFILIAIILLLANLLIGAMNPYLYIDYFNSKFALSIGGTLP---VIASFCVAPF 302
N+PL + + + + LI + P L D +S + G L + F AP
Sbjct: 3 PNRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPV 62

Query: 303 AQTLVKKFGKKESASIALLLTGIGYFILFGVKTTDVWLYMAIAFVSLLGLNYFMVIIWAF 362
L +FG++ ++L + Y + LY+ + G + A+
Sbjct: 63 LGALSDRFGRRPVLLVSLAGAAVDYA-IMATAPFLWVLYIGRIVAGITGAT--GAVAGAY 119

Query: 363 ITDIIDYQFLKTHRREDGTIYAVYSFARKIGQALAGGLGGV 403
I DI D R G + A + F G L G +GG
Sbjct: 120 IADITD---GDERARHFGFMSACFGFGMVAGPVLGGLMGGF 157



Score = 30.6 bits (69), Expect = 0.014
Identities = 25/171 (14%), Positives = 54/171 (31%), Gaps = 20/171 (11%)

Query: 187 IIPEMFTIIMGVLMLIAFYLYQFCWRNSVERIQLPEKVGRRHNRNECFGDVKAIFVSLFS 246
P L + F F S + + P + + S
Sbjct: 157 FSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRR-----------EALNPLASFRW 205

Query: 247 NKPLQIFILIAIILLLANLLIGAMNPYLYIDYFNSKFALS---IGGTLPVIA---SFCVA 300
+ + + + + + L +G + L++ + +F IG +L S A
Sbjct: 206 ARGMTVVAALMAVFFIMQL-VGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQA 264

Query: 301 PFAQTLVKKFGKKESASIALLLTGIGYFILFGVKTTDVWLYMAIAFVSLLG 351
+ + G++ + + ++ G GY +L T W+ I + G
Sbjct: 265 MITGPVAARLGERRALMLGMIADGTGYILL--AFATRGWMAFPIMVLLASG 313


9     CGSHiGG_03470  CGSHiGG_03590  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
CGSHiGG_03470  3       18       0.268176   hypothetical protein
CGSHiGG_03475  3       21       0.127936   23S rRNA pseudouridine synthase D
CGSHiGG_03480  4       23       -0.435287  hypothetical protein
CGSHiGG_03485  5       20       0.140945   hypothetical protein
CGSHiGG_03490  4       17       0.753049   pyruvate formate lyase-activating enzyme 1
CGSHiGG_03495  3       16       1.644266   formate acetyltransferase
CGSHiGG_03500  2       13       3.233102   formate acetyltransferase
CGSHiGG_03505  1       13       3.012074   putative formate transporter
CGSHiGG_03510  0       16       3.223047   N-acetyl-D-glucosamine kinase
CGSHiGG_03515  -2      16       3.028970   amino acid carrier protein
CGSHiGG_03520  -1      16       2.411231   esterase
CGSHiGG_03525  -1      16       0.494509   alcohol dehydrogenase class III
CGSHiGG_03530  1       13       -2.498040  putative HTH-type transcriptional regulator
CGSHiGG_03535  -1      14       -3.370495  Sec-independent protein translocase protein
CGSHiGG_03540  -2      13       -3.371617  sec-independent translocase
CGSHiGG_03545  -2      15       -3.179280  Sec-independent protein translocase protein
CGSHiGG_03560  -2      12       -3.789543  ferric uptake regulation protein
CGSHiGG_03565  -3      12       -2.425306  flavodoxin FldA
CGSHiGG_03570  -2      12       -1.795366  esterase/lipase
CGSHiGG_03575  -3      13       -1.172002  replication initiation regulator SeqA
CGSHiGG_03580  -1      12       -0.628926  O-succinylbenzoic acid--CoA ligase
CGSHiGG_03585  -1      11       -0.256407  potassium efflux protein KefA
CGSHiGG_03590  0       11       3.377482   chorismate synthase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_03515  PF04335       29     0.028    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 29.4 bits (66), Expect = 0.028
Identities = 10/38 (26%), Positives = 17/38 (44%), Gaps = 1/38 (2%)

Query: 403 FVYFGAVRSGNVVWNFADTVMAVMAIINLIAILMLSPI 440
A RS + W A V +A ++A+ L+P+
Sbjct: 23 DKLAAAERSKKLAWVVA-GVAGALATAGVVAVAALTPL 59


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_03540  TATBPROTEIN   169    3e-56    Bacterial sec-independent translocation TatB protein...
		>TATBPROTEIN#Bacterial sec-independent translocation TatB protein signature.

Length = 171

Score = 169 bits (429), Expect = 3e-56
Identities = 72/171 (42%), Positives = 100/171 (58%), Gaps = 4/171 (2%)

Query: 1 MFDIGFSELILLMVLGLVVLGPKRLPIAIRTVMDWVKTIRGLAANVQNELKQELKLQELQ 60
MFDIGFSEL+L+ ++GLVVLGP+RLP+A++TV W++ +R LA VQNEL QELKLQE Q
Sbjct: 1 MFDIGFSELLLVFIIGLVVLGPQRLPVAVKTVAGWIRALRSLATTVQNELTQELKLQEFQ 60

Query: 61 DSIKKVESLNLQALSPELSKTVEELKAQADKMK----AELEDKAAQAGTTVEEQIKEIKS 116
DS+KKVE +L L+PEL +++EL+ A+ MK A +KA+ T+ + +
Sbjct: 61 DSLKKVEKASLTNLTPELKASMDELRQAAESMKRSYVANDPEKASDEAHTIHNPVVKDNE 120

Query: 117 AAENAEKPQNAISVEEAAETLSEAEKTPTDLTALETHEKVELNTHLSSYYP 167
AA P A + + E E P A + K + SS P
Sbjct: 121 AAHEGVTPAAAQTQASSPEQKPETTPEPVVKPAADAEPKTAAPSPSSSDKP 171


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_03585  RTXTOXIND     42     1e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 42.1 bits (99), Expect = 1e-05
Identities = 36/255 (14%), Positives = 77/255 (30%), Gaps = 45/255 (17%)

Query: 51 EGE--AKNRLLAEL---QTSIDLLQQIQAQQKINDALQTTLSHSESEIRKNNAEIQALKK 105
EGE K +L +L D + Q+ QT I N L
Sbjct: 114 EGESVRKGDVLLKLTALGAEADT-LKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPD 172

Query: 106 QQETATSTDDNAQSQDYLQNSLTKLNDQLQDTQNALSTANAQLAGQSSISERAQAALTEN 165
+ +++ L Q QN L + + A +
Sbjct: 173 EPYFQNVSEEEVLRLTSLIKE------QFSTWQNQKYQKELNLDKKRAERLTVLARINRY 226

Query: 166 VVRTQQINQQLANNDIGSILRKQYQIELQLIDLKNSYNQNLLKNNDQLSLLYQSRYDLLN 225
++ +L +D S+L KQ + +++ +N Y +
Sbjct: 227 ENLSRVEKSRL--DDFSSLLHKQAIAKHAVLEQENKYVE-------------------AV 265

Query: 226 LRLQVQQQNIIAIQEVINQKNLQQSQNQVEQAQQQQKTVQNDYIQKELDRNAQLSQYLLQ 285
L+V + + I+ +++ A+++ + V + + LD+ Q + +
Sbjct: 266 NELRVYKSQLEQIE------------SEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGL 313

Query: 286 QTEKANSLTQDELRM 300
T + + +
Sbjct: 314 LTLELAKNEERQQAS 328



Score = 32.5 bits (74), Expect = 0.011
Identities = 24/217 (11%), Positives = 67/217 (30%), Gaps = 33/217 (15%)

Query: 255 EQAQQQQKTVQNDYIQKELDRNAQLSQYLLQQTE----KANSLTQDELRMRNILDSLTQT 310
E ++ ++ + E D S L + E + S + + ++ +
Sbjct: 116 ESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPY 175

Query: 311 QRTIDEQISALQGTLVLSRIIQQQKQKLPTNLNIQGLSKQIADLRVHIFDITQK------ 364
+ + E+ +L+ + Q QK LN+ + + I
Sbjct: 176 FQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKS 235

Query: 365 ---------------RNELYDLDNYINKVESEDGKQFTEAERTQVKTLLTERR--KMTSD 407
++ + + +N + +E ++ E+ + + L + +T
Sbjct: 236 RLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQL 295

Query: 408 LIKSLNNQLNLAISLELTQLQITQISDQIQSKLEQQS 444
+ ++L I ++ ++ E+Q
Sbjct: 296 FKNEILDKLRQT------TDNIGLLTLELAKNEERQQ 326


10    CGSHiGG_03885  CGSHiGG_04050  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
CGSHiGG_03885  7       20       -7.167649  single-stranded DNA-binding protein
CGSHiGG_03890  9       22       -8.359666  hypothetical protein
CGSHiGG_03915  5       22       -6.083727  hypothetical protein
CGSHiGG_03920  4       19       -5.711241  patatin
CGSHiGG_03925  2       17       -5.941440  hypothetical protein
CGSHiGG_03930  3       17       -5.685429  hypothetical protein
CGSHiGG_03935  1       15       -2.601713  hypothetical protein
CGSHiGG_03950  1       17       -2.572824  TonB
CGSHiGG_03955  1       14       -3.350818  biopolymer transport protein
CGSHiGG_03960  0       17       -3.059337  transport protein ExbB
CGSHiGG_03965  0       19       -2.523982  thioredoxin-dependent thiol peroxidase
CGSHiGG_03970  3       19       0.692911   dihydrodipicolinate synthase
CGSHiGG_03975  4       22       1.193015   lipoprotein
CGSHiGG_03980  4       25       2.266736   hypothetical protein
CGSHiGG_03985  4       27       2.562997   hypothetical protein
CGSHiGG_03990  4       27       3.102821   hypothetical protein
CGSHiGG_03995  5       28       3.360769   hypothetical protein
CGSHiGG_04000  2       30       2.429106   hypothetical protein
CGSHiGG_04005  2       30       -0.413812  hypothetical protein
CGSHiGG_04010  5       26       -3.783718  hypothetical protein
CGSHiGG_04015  7       25       -6.375586  hypothetical protein
CGSHiGG_04020  8       25       -7.241056  hypothetical protein
CGSHiGG_04025  8       24       -8.329173  hypothetical protein
CGSHiGG_04040  7       21       -5.750628  hypothetical protein
CGSHiGG_04045  4       17       -3.826929  hypothetical protein
CGSHiGG_04050  2       15       -1.618439  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_03950  TONBPROTEIN   150    9e-47    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 150 bits (379), Expect = 9e-47
Identities = 56/209 (26%), Positives = 84/209 (40%), Gaps = 25/209 (11%)

Query: 55 MVLEEPAPEPEDVQKEPENVQKEPEPEKQEIVEDPTIKPEPKKIKEPEKEKPKPKEKPKE 114
MV P+ VQ PE EPEPE + I E P E P EKPK
Sbjct: 49 MVTPADLEPPQAVQPPPE-PVVEPEPEPEPIPEPPK-------------EAPVVIEKPKP 94

Query: 115 KPKNKPKKEVKPQKKPINKELPKGDENIDSSANVNDKASTTSAANSNAQVAGSGTDTSEI 174
KPK KPK K Q++P +++ + S A TS+ + A + S
Sbjct: 95 KPKPKPKPVKKVQEQP-KRDVKPVESRPASPFENTAPARLTSSTATAATSKPVTSVASGP 153

Query: 175 AAYRSAIRREIESHKRYPTRAKIMRKQGKVSVSFNVGADGSLSGAKVTKSSGDESLDKAA 234
A +YP RA+ +R +G+V V F+V DG + ++ + ++
Sbjct: 154 RALSRN-------QPQYPARAQALRIEGQVKVKFDVTPDGRVDNVQILSAKPANMFEREV 206

Query: 235 LDAINVSRSVGTRPAGFPSSLSVQISFTL 263
+A+ R +P S + V I F +
Sbjct: 207 KNAMRRWRYEPGKPG---SGIVVNILFKI 232


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_03975  PF06291       28     0.011    Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 28.1 bits (62), Expect = 0.011
Identities = 11/41 (26%), Positives = 19/41 (46%)

Query: 1 MKKIILSLTTAIILVGCSSNPETLKASNDSFQKSEASIPHF 41
MKK++ S A+++ GC+ T+ + E HF
Sbjct: 6 MKKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKETITHHF 46


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_03990  ANTHRAXTOXNA  27     0.011    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 26.6 bits (58), Expect = 0.011
Identities = 12/27 (44%), Positives = 16/27 (59%), Gaps = 1/27 (3%)

Query: 48 PEMFAYMQELIEDAELVRIAEERKAEG 74
P+MF YM +L E +I+E K EG
Sbjct: 262 PDMFEYMNKL-EKGGFEKISESLKKEG 287


11CGSHiGG_04700CGSHiGG_05115Y        Y        YPathogenicity Island (biased-composition)
LocusTagDNBiasCDNBias%GCBiasProduct
CGSHiGG_04700216-1.759047cysteine desulfurase
CGSHiGG_04705118-0.916585hypothetical protein
CGSHiGG_04710118-0.778289hypothetical protein
CGSHiGG_047151190.606168hypothetical protein
CGSHiGG_047200191.501216hypothetical protein
CGSHiGG_047351182.151841hypothetical protein
CGSHiGG_047401202.693738hypothetical protein
CGSHiGG_047450234.695387hypothetical protein
CGSHiGG_047500235.200132hypothetical protein
CGSHiGG_047551254.844254hypothetical protein
CGSHiGG_047601235.161381hypothetical protein
CGSHiGG_047652235.647584hypothetical protein
CGSHiGG_047702235.464810hypothetical protein
CGSHiGG_047750264.707757hypothetical protein
CGSHiGG_047800264.749265hypothetical protein
CGSHiGG_047851265.504190hypothetical protein
CGSHiGG_047902295.258417hypothetical protein
CGSHiGG_047952315.341738hypothetical protein
CGSHiGG_048003316.223764hypothetical protein
CGSHiGG_048053286.245270hypothetical protein
CGSHiGG_048102265.289649hypothetical protein
CGSHiGG_048151244.786303hypothetical protein
CGSHiGG_048201254.816509hypothetical protein
CGSHiGG_048251264.287073hypothetical protein
CGSHiGG_048302273.638067hypothetical protein
CGSHiGG_048352241.679868hypothetical protein
CGSHiGG_04840       2       26    2.048912  hypothetical protein
CGSHiGG_04845       2       28    1.413739  hypothetical protein
CGSHiGG_04850       3       31    0.015627  hypothetical protein
CGSHiGG_04855       3       30    0.545286  hypothetical protein
CGSHiGG_04860       1       31    1.114861  hypothetical protein
CGSHiGG_04865      -1       33    3.679031  hypothetical protein
CGSHiGG_04870       0       28    2.517222  hypothetical protein
CGSHiGG_04875      -1       25    3.034464  hypothetical protein
CGSHiGG_04880       1       26    2.992588  hypothetical protein
CGSHiGG_04885       0       29    2.702647  hypothetical protein
CGSHiGG_04890       1       31    2.766470  hypothetical protein
CGSHiGG_04895       2       35    3.457476  hypothetical protein
CGSHiGG_04900       2       36    3.675489  hypothetical protein
CGSHiGG_04915       3       34    1.065490  hypothetical protein
CGSHiGG_04920       4       30    1.158373  hypothetical protein
CGSHiGG_04925       4       27    0.656852  hypothetical protein
CGSHiGG_04930       4       28    1.326933  hypothetical protein
CGSHiGG_04935       5       24    0.050119  hypothetical protein
CGSHiGG_04940       3       22   -0.663286  hypothetical protein
CGSHiGG_04945       3       26    1.171237  hypothetical protein
CGSHiGG_04950       4       28    1.416765  hypothetical protein
CGSHiGG_04955       4       28    1.913803  hypothetical protein
CGSHiGG_04960       2       27    1.636253  hypothetical protein
CGSHiGG_04965       0       27    1.971571  hypothetical protein
CGSHiGG_04970      -1       32    3.286092  hypothetical protein
CGSHiGG_04975      -1       27    3.165812  hypothetical protein
CGSHiGG_04980       0       26    2.640023  hypothetical protein
CGSHiGG_04985       0       27    2.102905  hypothetical protein
CGSHiGG_04990       2       28    2.000185  hypothetical protein
CGSHiGG_04995       2       29    4.032468  hypothetical protein
CGSHiGG_05000       2       29    3.370239  hypothetical protein
CGSHiGG_05005       0       28    3.130249  recombination associated protein
CGSHiGG_05010       2       25    2.292573  hypothetical protein
CGSHiGG_05015       2       25    3.269869  hypothetical protein
CGSHiGG_05020       2       25    3.145575  hypothetical protein
CGSHiGG_05025       2       22   -0.614185  hypothetical protein
CGSHiGG_05030       0       23   -0.182128  hypothetical protein
CGSHiGG_05035       0       24    1.090907  hypothetical protein
CGSHiGG_05040      -1       23    2.178419  hypothetical protein
CGSHiGG_05045      -1       16    2.286955  hypothetical protein
CGSHiGG_05050       2       16    2.995755  hypothetical protein
CGSHiGG_05055       2       15    2.838641  hypothetical protein
CGSHiGG_05060       3       15    2.672145  hypothetical protein
CGSHiGG_05065       3       17    2.265828  **hypothetical protein
CGSHiGG_05070       2       18    1.836242  translocation protein TolB
CGSHiGG_05075       4       18    0.238302  cell envelope integrity inner membrane protein
CGSHiGG_05080       2       17   -4.027928  colicin uptake protein TolR
CGSHiGG_05095      -1       12   -2.860677  hypothetical protein
CGSHiGG_05110      -1       11   -2.665983  hypothetical protein
CGSHiGG_05115       0       10   -3.045898  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_04855  BACINVASINB      25  0.019    Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 25.5 bits (55), Expect = 0.019
Identities = 13/45 (28%), Positives = 24/45 (53%)

Query: 29 RLGNAKDAIKSAQMAYTNPNFVRQIKKDPDKLIYQGIQTLIAKYG 73
+LGNA + + PN ++Q+ ++ KL QG+Q + + G
Sbjct: 436 KLGNALSKMMGETIKKLVPNVLKQLAQNGSKLFTQGMQRITSGLG 480


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_04900  FbpA_PF05833     28  0.033    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 28.3 bits (63), Expect = 0.033
Identities = 12/71 (16%), Positives = 27/71 (38%), Gaps = 1/71 (1%)

Query: 117 LPPLVNELNPEPSIEPSDNHHIKKTTQKSESEILLEQF-GITGQLAKDFIAHRKAKKGVI 175
PP +LNP + K+ + + I + F G++ L+ + K +
Sbjct: 164 YPPKSPKLNPFDFSYDMIENFTKENSLQLNDNIFSKIFTGVSKTLSSEICFRLKNNSIDL 223

Query: 176 NQTQLNRLQKQ 186
+ + L + +
Sbjct: 224 SLSNLKEIVEV 234


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_04970  60KDINNERMP      31  0.006    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 30.7 bits (69), Expect = 0.006
Identities = 19/85 (22%), Positives = 30/85 (35%), Gaps = 17/85 (20%)

Query: 166 WFEWEKDNRKEIPKLHPT-------------QKPIAVLKRLIEIFTDEGDVVIDPVAG-- 210
W WE+D + T P + +LI + TD D+ I+ G
Sbjct: 20 WQAWEQDKNPQPQAQQTTQTTTTAAGSAADQGVPASGQGKLISVKTDVLDLTINTRGGDV 79

Query: 211 -SASTLRAARELNRPSYGFEIKKDS 234
A +ELN F++ + S
Sbjct: 80 EQALLPAYPKELNSTQ-PFQLLETS 103


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_05000  PF00577          28  0.049    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 27.9 bits (62), Expect = 0.049
Identities = 14/77 (18%), Positives = 22/77 (28%), Gaps = 2/77 (2%)

Query: 118 SNTGNGSVATNTGYQSVATNTGDLSVATNTGDLSAATNTGDRSVATNTGYQ-SVATNTGD 176
S+ NG + G +LS + TG + Y+
Sbjct: 624 SHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIG 683

Query: 177 LSAATNTGDLSAVEVSG 193
S + + L VSG
Sbjct: 684 YSHSDDIKQLYY-GVSG 699


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_05065  OMPADOMAIN      108  4e-31    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 108 bits (270), Expect = 4e-31
Identities = 37/117 (31%), Positives = 54/117 (46%), Gaps = 8/117 (6%)

Query: 42 ADLQQRYNT----VYFGFDKYDITGEYVQILDAHAAYLNAT--PAAKVLVEGNTDERGTP 95
++Q ++ T V F F+K + E LD + L+ V+V G TD G+
Sbjct: 208 PEVQTKHFTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSD 267

Query: 96 EYNIALGQRRADAVKGYLAGKGVDAGKLGTVSYGEEKPAVLGHDEAAYSKNRRAVLA 152
YN L +RRA +V YL KG+ A K+ GE P V G+ K R A++
Sbjct: 268 AYNQGLSERRAQSVVDYLISKGIPADKISARGMGESNP-VTGN-TCDNVKQRAALID 322


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_05075  IGASERPTASE      64  6e-13    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 63.9 bits (155), Expect = 6e-13
Identities = 35/188 (18%), Positives = 60/188 (31%), Gaps = 4/188 (2%)

Query: 98 HQQEVQRQEELKRQQEIKKQQEQARQEALEKQKQAEEARAKQAAEAAKLKA-DAEAKRLA 156
+ EV+++ + I E AR +A A +E
Sbjct: 981 YNPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETV 1040

Query: 157 AAAKQAEEEAKAKAAEIAAQKAKQEAEA--KAKLEAEAKAKAVAEAKAKAEAEAKAKAAA 214
A + E + K + A + Q E +AK +A + A++ +E +
Sbjct: 1041 AENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTET 1100

Query: 215 EAKAKAEAEAKAKAEAKAKAEAKAKAEAEAKAKAEAEAKAKAAAEAKAKADAEAKAATEA 274
+ A E E KAK E + E K ++ K E + AE + D
Sbjct: 1101 KETATVEKEEKAKVETEKTQEV-PKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQ 1159

Query: 275 KRKADQAS 282
+ A
Sbjct: 1160 SQTNTTAD 1167



Score = 61.2 bits (148), Expect = 4e-12
Identities = 36/246 (14%), Positives = 77/246 (31%), Gaps = 21/246 (8%)

Query: 43 EGEGDVIGAVIVDTGTAAQEWGRIQQQKKGQSDKQKRPEPVVEEKPPEPNQEEIKHQQEV 102
E + + T Q S+ ++ +E P P +
Sbjct: 986 EKRNQTVDTTNITTPNNIQ-----ADVPSVPSNNEEIARV--DEAPVPPPAPATPSETTE 1038

Query: 103 QRQEELKRQQEIKKQQEQARQEALEKQKQAEEARAKQAAEAAKLKADAEAKRLAAAAKQA 162
E K++ + ++ EQ E A+ ++ A+ AK A + A+
Sbjct: 1039 TVAENSKQESKTVEKNEQDATET--------TAQNREVAKEAKSNVKANTQT-NEVAQSG 1089

Query: 163 EEEAKAKAAEIAAQKAKQEAEAKAKLEAEAKAKA---VAEAKAK-AEAEAKAKAAAEAKA 218
E + + E + A E E KAK+E E + ++ K ++E A A+
Sbjct: 1090 SETKETQTTE-TKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARE 1148

Query: 219 KAEAEAKAKAEAKAKAEAKAKAEAEAKAKAEAEAKAKAAAEAKAKADAEAKAATEAKRKA 278
+ +++ A + A+ + + ++ + E T
Sbjct: 1149 NDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQ 1208

Query: 279 DQASLD 284
+ +
Sbjct: 1209 PTVNSE 1214



Score = 46.2 bits (109), Expect = 2e-07
Identities = 29/228 (12%), Positives = 69/228 (30%), Gaps = 10/228 (4%)

Query: 67 QQQKKGQSDKQKRPEPVVEEKPPEPNQEEIKHQ-----QEVQRQEELKRQQEIKKQQEQA 121
Q ++ + K + + E + Q + ++E K + E +K QE
Sbjct: 1064 QNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVP 1123

Query: 122 RQEALEKQKQAEEARAKQAAEAAKLKADAEAKRLAAAAKQAEEEAKAKAAEIAAQKAKQE 181
+ + KQ + + AE A+ + ++ + A+ + +Q
Sbjct: 1124 KVTSQVSPKQEQSETVQPQAEPARENDPTVNIK-EPQSQTNTTADTEQPAKETSSNVEQP 1182

Query: 182 AEAKAKLEAEAKAKAVAEAKAKAEAEAKAKAAAEAKAKAEAEAKAKAEA---KAKAEAKA 238
+ E A + + + K K ++ + +
Sbjct: 1183 VTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSN 1242

Query: 239 KAEAEAKAKAEAEAKAKAAAEAKAKADAEAKAATEA-KRKADQASLDD 285
A + ++A+AKA A +A + Q +++
Sbjct: 1243 DRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAVSQHISQLEMNN 1290



Score = 32.3 bits (73), Expect = 0.004
Identities = 23/184 (12%), Positives = 54/184 (29%), Gaps = 3/184 (1%)

Query: 57 GTAAQEWGRIQQQKKGQSD---KQKRPEPVVEEKPPEPNQEEIKHQQEVQRQEELKRQQE 113
A E + Q+ K S KQ++ E V + P + + +E Q Q E
Sbjct: 1110 EKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTE 1169

Query: 114 IKKQQEQARQEALEKQKQAEEARAKQAAEAAKLKADAEAKRLAAAAKQAEEEAKAKAAEI 173
++ + E + + + + + ++
Sbjct: 1170 QPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRS 1229

Query: 174 AAQKAKQEAEAKAKLEAEAKAKAVAEAKAKAEAEAKAKAAAEAKAKAEAEAKAKAEAKAK 233
+ + A + ++A+AKA A +A ++ ++ +
Sbjct: 1230 VPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAVSQHISQLEMN 1289

Query: 234 AEAK 237
E +
Sbjct: 1290 NEGQ 1293


12  CGSHiGG_05635  CGSHiGG_05685  Y  N  N  Genomic Island
LocusTag       DNBias  CDNBias     %GCBias  Product
CGSHiGG_05635       2       20   -0.147191  hypothetical protein
CGSHiGG_05640       2       19    0.396695  imidazole glycerol phosphate synthase subunit
CGSHiGG_05645       3       21    0.275843  F0F1 ATP synthase subunit epsilon
CGSHiGG_05650       2       24    0.329351  F0F1 ATP synthase subunit beta
CGSHiGG_05655      -1       19   -0.568988  F0F1 ATP synthase subunit gamma
CGSHiGG_05660       1       19   -0.903544  F0F1 ATP synthase subunit alpha
CGSHiGG_05665       1       15   -3.103261  F0F1 ATP synthase subunit delta
CGSHiGG_05670       0       16   -3.790098  F0F1 ATP synthase subunit B
CGSHiGG_05675      -1       14   -3.320332  F0F1 ATP synthase subunit C
CGSHiGG_05680      -1       19   -3.330735  F0F1 ATP synthase subunit A
CGSHiGG_05685      -2       21   -3.275550  F0F1 ATP synthase subunit I
13  CGSHiGG_05920  CGSHiGG_05995  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias     %GCBias  Product
CGSHiGG_05920      -1       19    3.154888  hypothetical protein
CGSHiGG_05925       0       18    4.279277  aspartate ammonia-lyase
CGSHiGG_05930       0       23    5.862227  urease accessory protein UreH
CGSHiGG_05935       0       22    5.661984  urease accessory protein
CGSHiGG_05940       0       17    4.747055  urease accessory protein UreF
CGSHiGG_05945       0       19    4.081163  urease accessory protein UreE
CGSHiGG_05950       1       24    4.037051  urease subunit alpha
CGSHiGG_05965       3       30    2.349314  urease subunit gamma
CGSHiGG_05970       4       32    2.419773  co-chaperonin GroES
CGSHiGG_05975       3       31    2.469357  chaperonin GroEL
CGSHiGG_05980       7       23   -1.866866  50S ribosomal protein L9
CGSHiGG_05985       6       20   -4.314860  30S ribosomal protein S18
CGSHiGG_05990       6       20   -4.903925  primosomal replication protein N
CGSHiGG_05995       2       18   -3.032581  30S ribosomal protein S6
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_05950  UREASE         1040  0.0      Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 1040 bits (2692), Expect = 0.0
Identities = 361/575 (62%), Positives = 435/575 (75%), Gaps = 8/575 (1%)

Query: 1 MALTISRAQYVATYGPTVGDKVRLGDTNLWATIEQDLLTKGDECKFGGGKSVRDGMAQSG 60
M+ +SRA Y +GPTVGDKVRL DT L+ +E+D T G+E KFGGGK +RDGM QS
Sbjct: 1 MSYRMSRAAYANMFGPTVGDKVRLADTELFIEVEKDFTTHGEEVKFGGGKVIRDGMGQS- 59

Query: 61 TATRDNPNVLDFVITNVMIIDARLGIIKADIGIRDGRIVGIGQAGNPDTMDNVTPNMIIG 120
TR+ +D VITN +I+D GI+KADIG++DGRI IG+AGNPD V +I+G
Sbjct: 60 QVTREG-GAVDTVITNALILDH-WGIVKADIGLKDGRIAAIGKAGNPDMQPGV--TIIVG 115

Query: 121 ASTEVHNGVHLIATAGGIDTHIHFICPQQAQHAIESGVTTLIGGGTGPADGTHATTCTPG 180
TEV G I TAGG+D+HIHFICPQQ + A+ SG+T ++GGGTGPA GT ATTCTPG
Sbjct: 116 PGTEVIAGEGKIVTAGGMDSHIHFICPQQIEEALMSGLTCMLGGGTGPAHGTLATTCTPG 175

Query: 181 AWYMEHMFQAAEALPVNVGFFGKGNCSTLDPLREQIEAGALGLKIHEDWGATPAVIDSAL 240
W++ M +AA+A P+N+ F GKGN S L E + GA LK+HEDWG TPA ID L
Sbjct: 176 PWHIARMIEAADAFPMNLAFAGKGNASLPGALVEMVLGGATSLKLHEDWGTTPAAIDCCL 235

Query: 241 KVADEMDIQVAIHTDTLNESGFLEDTMKAIDGRVIHTFHTEGAGGGHAPDIIKAAMYSNV 300
VADE D+QV IHTDTLNESGF+EDT+ AI GR IH +HTEGAGGGHAPDII+ NV
Sbjct: 236 SVADEYDVQVMIHTDTLNESGFVEDTIAAIKGRTIHAYHTEGAGGGHAPDIIRICGQPNV 295

Query: 301 LPASTNPTRPFTKNTIDEHLDMLMVCHHLDKRVPEDVAFADSRIRPETIAAEDILHDMGV 360
+P+STNPTRP+T NT+ EHLDMLMVCHHL +PED+AFA+SRIR ETIAAEDILHD+G
Sbjct: 296 IPSSTNPTRPYTVNTLAEHLDMLMVCHHLSPTIPEDIAFAESRIRKETIAAEDILHDIGA 355

Query: 361 FSIMSSDSQAMGRIGEVVIRTWQTADKMKMQRGELGNE--GNDNFRIKRYIAKYTINPAI 418
FSI+SSDSQAMGR+GEV IRTWQTADKMK QRG L E NDNFR+KRYIAKYTINPAI
Sbjct: 356 FSIISSDSQAMGRVGEVAIRTWQTADKMKRQRGRLKEETGDNDNFRVKRYIAKYTINPAI 415

Query: 419 AHGIAEHIGSLEVGKIADIVLWKPMFFGVKPEVVIKKGFISYAKMGDPNASIPTPQPVFY 478
AHG++ IGSLEVGK AD+VLW P FFGVKP++V+ G I+ A MGDPNASIPTPQPV Y
Sbjct: 416 AHGLSHEIGSLEVGKRADLVLWNPAFFGVKPDMVLLGGTIAAAPMGDPNASIPTPQPVHY 475

Query: 479 RPMYGAQGLATAQTAVFFVSQAAEKADIRAKFGLHKETIAVKGCR-NVGKKDLVHNDVTP 537
RPM+GA G + ++V FVSQA+ A + + G+ KE +AV+ R +GK ++HN +TP
Sbjct: 476 RPMFGAYGRSRTNSSVTFVSQASLDAGLAGRLGVAKELVAVQNTRGGIGKASMIHNSLTP 535

Query: 538 NITVDAERYEVRVDGELITCEPVDSVPLGQRYFLF 572
+I VD E YEVR DGEL+TCEP +P+ QRYFLF
Sbjct: 536 HIEVDPETYEVRADGELLTCEPATVLPMAQRYFLF 570


14  CGSHiGG_07815  CGSHiGG_07895  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias     %GCBias  Product
CGSHiGG_07815       2       12    2.167526  DNA polymerase I
CGSHiGG_07820       2       10    0.396204  23S rRNA (guanosine-2'-O-)-methyltransferase
CGSHiGG_07825       1       10   -0.021825  23S rRNA (guanosine-2'-O-)-methyltransferase
CGSHiGG_07830       1       11   -1.455232  hypothetical protein
CGSHiGG_07835       0       10   -2.942296  pyridoxamine 5'-phosphate oxidase
CGSHiGG_07840       2       13   -4.565068  GTP-binding protein TypA/ BipA
CGSHiGG_07845       5       18   -7.681217  CMP-neu5Ac--lipooligosaccharide alpha 2-3
CGSHiGG_07850       6       21   -8.502073  glutamine synthetase
CGSHiGG_07855      13       29  -10.897750  hypothetical protein
CGSHiGG_07875      11       27   -9.261065  hypothetical protein
CGSHiGG_07880       7       24   -7.015041  hypothetical protein
CGSHiGG_07885       5       24   -6.156319  putative polysaccharide polymerase
CGSHiGG_07890       4       19   -3.921327  N-acetylneuraminic acid synthase-like protein
CGSHiGG_07895       3       15   -1.794779  undecaprenyl-phosphate
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_07815  HTHFIS           38  1e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 38.3 bits (89), Expect = 1e-04
Identities = 32/160 (20%), Positives = 57/160 (35%), Gaps = 28/160 (17%)

Query: 570 VIGQEEAVDAVANAIRRRRAGLSDPNRPIGSFLFLGPTGVGKTELCKTLAKFLFDSEDAM 629
++G+ A+ + + R L + + + G +G GK + + L +
Sbjct: 139 LVGRSAAMQEIYRVLAR----LMQTDLTL---MITGESGTGKELVARALHDYGKRRNGPF 191

Query: 630 VRIDMSEFMEKHSVSRLVGAPPGYVGYEEGGYLTEAVRRRPYSV-------ILLDEVEKA 682
V I+M+ S L G +E+G + T A R + LDE+
Sbjct: 192 VAINMAAIPRDLIESELFG-------HEKGAF-TGAQTRSTGRFEQAEGGTLFLDEIGDM 243

Query: 683 HADVFNILLQVLDDG---RLTDGQGRTVDFRNTVVIMTSN 719
D LL+VL G + D R ++ +N
Sbjct: 244 PMDAQTRLLRVLQQGEYTTVGGRTPIRSDVR---IVAATN 280



Score = 36.3 bits (84), Expect = 5e-04
Identities = 14/68 (20%), Positives = 29/68 (42%), Gaps = 3/68 (4%)

Query: 151 DQNAEESRQALEKYTIDLTARAESG-KLDPVIGRDEEIRRAIQVLQRRTKNN-PVLI-GE 207
+ +AL + + + P++GR ++ +VL R + + ++I GE
Sbjct: 109 TELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGE 168

Query: 208 PGVGKTAI 215
G GK +
Sbjct: 169 SGTGKELV 176


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_07840  TCRTETOQM       175  2e-49    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 175 bits (446), Expect = 2e-49
Identities = 103/443 (23%), Positives = 177/443 (39%), Gaps = 62/443 (13%)

Query: 9 KLRNIAIIAHVDHGKTTLVDKLLQQSGTFESARGDVDE--RVMDSNDLEKERGITILAKN 66
K+ NI ++AHVD GKTTL + LL SG G VD+ D+ LE++RGITI
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITEL-GSVDKGTTRTDNTLLERQRGITIQTGI 60

Query: 67 TAINWNDYRINIVDTPGHADFGGEVERVLSMVDSVLLVVDAFDGPMPQTRFVTQKAFAHG 126
T+ W + ++NI+DTPGH DF EV R LS++D +L++ A DG QTR + G
Sbjct: 61 TSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMG 120

Query: 127 LKPIVVINKVDRPGARPDWVVDQVFDLF---------VNLGASDEQLDFPII--YASALN 175
+ I INK+D+ G V + + V L + +F + + +
Sbjct: 121 IPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIE 180

Query: 176 G--------VAG--LEHEDLAEDMT-----------------------PLFEAIVKYVES 202
G ++G LE +L ++ + L E I S
Sbjct: 181 GNDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYS 240

Query: 203 PKVELDAPFQMQISQLDYNNYVGVIGIGRIKRGSIKPNQPVTIINSEGKTRQGRIGQVLG 262
+ ++ +++Y+ + R+ G + V I E +I ++
Sbjct: 241 STHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI----KITEMYT 296

Query: 263 HLGLQRYEEDVAYAGDIVAITGLGELNISDTICDINAVEALPSLTVDEPTVTMFFCVNTS 322
+ + + D AY+G+IV + L ++ + D + + P + +
Sbjct: 297 SINGELCKIDKAYSGEIVILQNEF-LKLNSVLGDTKLLPQRERIENPLPLLQTTVEPSKP 355

Query: 323 PFAGQEGKYVTSRQILERLNKELVHNVALRVEETPNPDEFRVSGRGELHLSVLIENMRRE 382
Q L ++ + LR E +S G++ + V ++ +
Sbjct: 356 ---QQREML---LDALLEISDS---DPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEK 406

Query: 383 -GYELAVSRPKVIYRDIDGKKQE 404
E+ + P VIY + KK E
Sbjct: 407 YHVEIEIKEPTVIYMERPLKKAE 429



Score = 32.9 bits (75), Expect = 0.004
Identities = 18/89 (20%), Positives = 30/89 (33%), Gaps = 3/89 (3%)

Query: 404 EPYEQVTIDVEEQHQGSVMEALGIRKGEVRDMLPDGKG-RVRLEYIIPSRGLIGFRGDFM 462
EPY I +++ + D K V L IP+R + +R D
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVD--TQLKNNEVILSGEIPARCIQEYRSDLT 594

Query: 463 TMTSGTGLLYSSFSHYDEIKGGDIGQRKN 491
T+G + + Y G + Q +
Sbjct: 595 FFTNGRSVCLTELKGYHVTTGEPVCQPRR 623


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_07845  PF08280          29  0.033    M protein trans-acting positive regulator
		>PF08280#M protein trans-acting positive regulator

Length = 530

Score = 28.7 bits (64), Expect = 0.033
Identities = 14/49 (28%), Positives = 25/49 (51%), Gaps = 1/49 (2%)

Query: 207 KENIIKLLPSFSQSKSQSNIHSMEYDLSAL-HFLQKHYGINIYCISPES 254
+E +I LL +F S++ I EY + L L +GI +Y ++ +
Sbjct: 156 REALIPLLRNFELKLSKNKIVGEEYRIRYLIALLYSKFGIKVYDLTQQD 204


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_07855  TYPE4SSCAGA      27  0.032    Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 27.4 bits (60), Expect = 0.032
Identities = 24/90 (26%), Positives = 42/90 (46%), Gaps = 5/90 (5%)

Query: 33 SLNQTNQEFKQGFNLRLDELRFSKEQIEENLTEA---KKV--QVENLTNALDIAKNTPVV 87
+LN EFK G N ++ +K +E ++ + +KV +V+NL A+ +AK T
Sbjct: 741 NLNAALNEFKNGKNKDFSKVTQAKSDLENSVKDVIINQKVTDKVDNLNQAVSVAKATGDF 800

Query: 88 YPANYYITDRQLFKLNKLASQLELVEDVKT 117
+ D + F +LA Q + E +
Sbjct: 801 SRVEQALADLKNFSKEQLAQQAQKNESLNA 830


15  CGSHiGG_08575  CGSHiGG_08680  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias     %GCBias  Product
CGSHiGG_08575       4       22   -3.359959  hypothetical protein
CGSHiGG_08580       2       18   -2.177621  transferrin-binding protein 2
CGSHiGG_08585       0       16   -2.518023  chromosomal replication initiation protein
CGSHiGG_08590       0       16   -3.079054  50S ribosomal protein L34
CGSHiGG_08595      -1       16   -3.188096  ribonuclease P
CGSHiGG_08600      -1       14   -2.578722  hypothetical protein
CGSHiGG_08615       0       13   -2.447793  tRNA modification GTPase TrmE
CGSHiGG_08620      -1       12   -2.677147  peptidyl-prolyl cis-trans isomerase D
CGSHiGG_08625      -2       14   -2.191135  tRNA modification GTPase TrmE
CGSHiGG_08630       0       11    1.215706  lipoprotein signal peptidase
CGSHiGG_08635       0       11    1.692760  4-hydroxy-3-methylbut-2-enyl diphosphate
CGSHiGG_08640       2       11    1.198305  hypothetical protein
CGSHiGG_08655       4       12    1.294313  3-hydroxyisobutyrate dehydrogenase
CGSHiGG_08660       2        9    0.444374  hypothetical protein
CGSHiGG_08665       2        9    0.506279  4-hydroxy-3-methylbut-2-enyl diphosphate
CGSHiGG_08670       2       10    0.065742  putative aldolase
CGSHiGG_08675       2       10   -0.239881  hypothetical protein
CGSHiGG_08680       2       12   -0.186865  gluconate transport protein GntT
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_08680  PHAGEIV          29  0.038    Gene IV protein signature.
		>PHAGEIV#Gene IV protein signature.

Length = 426

Score = 29.1 bits (65), Expect = 0.038
Identities = 16/60 (26%), Positives = 26/60 (43%), Gaps = 2/60 (3%)

Query: 316 GAGGMFGKVLDASGVGKALADVLSSTGLPV-LLLGFILAALLRAAQGSATVAVITTATIL 374
AG G V + L VLSS G + G +L +RA + ++ +++ IL
Sbjct: 218 AAGSQRGTVAGGVNTDR-LTSVLSSAGGSFGIFNGDVLGLSVRALKTNSHSKILSVPRIL 276


16  CGSHiGG_08735  CGSHiGG_08915  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias     %GCBias  Product
CGSHiGG_08735       3       20   -3.767103  3-keto-L-gulonate-6-phosphate decarboxylase
CGSHiGG_08750       3       19   -3.106321  putative L-xylulose 5-phosphate 3-epimerase
CGSHiGG_08755       3       19   -3.163401  L-xylulose kinase
CGSHiGG_08760       2       16   -2.361449  hypothetical protein
CGSHiGG_08765       1       15   -1.929435  hypothetical protein
CGSHiGG_08770       1       12   -1.776533  hypothetical protein
CGSHiGG_08775      -1       12   -2.056657  2,3-diketo-L-gulonate reductase
CGSHiGG_08780      -2       11   -2.841112  L-ribulose-5-phosphate 4-epimerase
CGSHiGG_08785      -2       11   -2.032082  phosphoserine phosphatase
CGSHiGG_08790      -1       13   -2.266379  putative nucleotide-binding protein
CGSHiGG_08795       0       15   -2.114156  magnesium/nickel/cobalt transporter CorA
CGSHiGG_08800       2       18   -2.663688  integral membrane protein
CGSHiGG_08805       1       19   -1.197375  hypothetical protein
CGSHiGG_08810       0       21   -1.452716  hypothetical protein
CGSHiGG_08815       1       21   -1.604824  hypothetical protein
CGSHiGG_08820       3       18   -1.159201  hypothetical protein
CGSHiGG_08845       3       19   -0.764686  hypothetical protein
CGSHiGG_08850       4       15   -0.479833  B12-dependent methionine synthase
CGSHiGG_08855       4       15   -0.486035  twin-argninine leader-binding protein DmsD
CGSHiGG_08860       4       15   -0.131923  anaerobic dimethyl sulfoxide reductase chain C
CGSHiGG_08865       3       12    0.055297  anaerobic dimethyl sulfoxide reductase chain B
CGSHiGG_08870       1       13    0.242859  anaerobic dimethyl sulfoxide reductase chain A
CGSHiGG_08875       0       12   -0.397777  anaerobic dimethyl sulfoxide reductase chain A
CGSHiGG_08880      -1       14   -0.212021  hypothetical protein
CGSHiGG_08885      -2       14   -0.953054  hypothetical protein
CGSHiGG_08890       0       15   -0.740046  hypothetical protein
CGSHiGG_08895       0       15   -1.294379  ABC transporter ATP-binding protein
CGSHiGG_08900       2       19   -1.425995  transcriptional regulator AraC family protein
CGSHiGG_08905       3       20   -0.569428  hypothetical protein
CGSHiGG_08910       4       20   -0.582945  hypothetical protein
CGSHiGG_08915       4       19   -0.857715  putative type III restriction-modification
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_08755  PF03309          29  0.031    Bvg accessory factor
		>PF03309#Bvg accessory factor

Length = 271

Score = 29.3 bits (66), Expect = 0.031
Identities = 12/74 (16%), Positives = 21/74 (28%), Gaps = 12/74 (16%)

Query: 5 LGIDCGGTFIKAAIFDQNGTLQSIARRNIPIISEKPGYAERDMDELWNLCAQVIQKTIRQ 64
L ID T + +G + + I +E E DEL +I
Sbjct: 3 LAIDVRNTHTVVGLISGSGDHAKV-VQQWRIRTEP----EVTADELALTIDGLIGDD--- 54

Query: 65 SSILPQQIKAIGIS 78
+++
Sbjct: 55 ----AERLTGASGL 64


17  CGSHiGG_09540  CGSHiGG_09580  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias     %GCBias  Product
CGSHiGG_09540       2       15   -2.736718  hypothetical protein
CGSHiGG_09545       2       14   -2.330500  hypothetical protein
CGSHiGG_09550       1       15   -1.515487  para-aminobenzoate synthase component I
CGSHiGG_09555       1       13   -0.692906  putative anthranilate synthase component II
CGSHiGG_09560       1       12   -0.038453  S-adenosylmethionine synthetase
CGSHiGG_09565       3       14   -1.157901  hypothetical protein
CGSHiGG_09570       2       14   -0.712916  opacity protein
CGSHiGG_09575       3       16   -0.772121  hypothetical protein
CGSHiGG_09580       2       17   -1.012805  arginine transporter permease subunit ArtM
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_09570  OMPADOMAIN       28  0.023    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 27.6 bits (61), Expect = 0.023
Identities = 46/203 (22%), Positives = 67/203 (33%), Gaps = 45/203 (22%)

Query: 1 MKKSLLAVIVGAFAFASVANA-----NIYAEGDIGLSQTKANGSSNTRVEPR-------V 48
MKK+ +A+ V FA+VA A Y +G SQ G N
Sbjct: 1 MKKTAIAIAVALAGFATVAQAAPKDNTWYTGAKLGWSQYHDTGFINNNGPTHENQLGAGA 60

Query: 49 SVGYKVGNTRVA--------GDYTHHGKVDGT--KIQG------LGASVLYDFDTNSKVQ 92
GY+V N V G + G V+ K QG LG + D D ++
Sbjct: 61 FGGYQV-NPYVGFEMGYDWLGRMPYKGSVENGAYKAQGVQLTAKLGYPITDDLDIYTR-- 117

Query: 93 PYVGARVATNQFKYTNRAEQKFKSSSDIKLGYGVVAGAKYKLDGNWYANGGVEYNRLGNF 152
+G V K + + D + G +Y + +EY N
Sbjct: 118 --LGGMVWRADTK-----SNVYGKNHDTGVSPVFAGGVEYAITPEIAT--RLEYQWTNNI 168

Query: 153 -----DSTKVNNYGAKVGVGYGF 170
T+ +N +GV Y F
Sbjct: 169 GDAHTIGTRPDNGMLSLGVSYRF 191


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_09575  PF03895          39  5e-07    Serum resistance protein DsrA.
		>PF03895#Serum resistance protein DsrA.

Length = 79

Score = 39.0 bits (91), Expect = 5e-07
Identities = 18/72 (25%), Positives = 30/72 (41%), Gaps = 3/72 (4%)

Query: 97 KGTAGSVSAGNEIR--GITRQITGVAAGTYRGQSAVAAGVRYTPKPNMVISLSGSADTNN 154
G A + ++ G+ + A G YR ++A+A GV + +T N
Sbjct: 7 TGLANQSALSMLVQPNGVGKTSVSAAVGGYRDKTALAIGVGSRITDRFTAKAGVAFNTYN 66

Query: 155 GVGAATGVSFGF 166
G G + G S G+
Sbjct: 67 G-GMSYGASVGY 77


18  CGSHiGG_03225  CGSHiGG_03280  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias     %GCBias  Product
CGSHiGG_03225      -1       12    0.713307  ferric transporter ATP-binding subunit
CGSHiGG_03230      -2       11    0.728138  ferric transport system permease protein FbpB
CGSHiGG_03235      -1       10    0.511847  ferric transporter ATP-binding subunit
CGSHiGG_03240       0        9    0.114897  uridine kinase
CGSHiGG_03245      -2       12   -0.148958  deoxycytidine triphosphate deaminase
CGSHiGG_03255      -2       10   -0.163257  sugar efflux transporter
CGSHiGG_03260      -2       12   -0.198793  GTP-binding protein EngA
CGSHiGG_03265      -2       16   -0.915788  **DNA polymerase III subunit epsilon
CGSHiGG_03270      -1       16   -0.748537  ribonuclease H
CGSHiGG_03275      -1       13   -0.099733  Outer membrane protein P2
CGSHiGG_03280      -2       13    0.816548  N-acetylglucosamine-6-phosphate deacetylase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_03225  PF05272          32  0.003    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.4 bits (73), Expect = 0.003
Identities = 13/56 (23%), Positives = 22/56 (39%)

Query: 1 MSNNDFLVLKNITKAFGKAVVIDNLDLTIKRGTMVTLLGPSGCGKTTVLRLVAGLE 56
L+ + K V ++ K V L G G GK+T++ + GL+
Sbjct: 565 YKPRRLRYLQLVGKYILMGHVARVMEPGCKFDYSVVLEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_03255  TCRTETB          55  2e-10    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 55.3 bits (133), Expect = 2e-10
Identities = 41/185 (22%), Positives = 81/185 (43%), Gaps = 1/185 (0%)

Query: 36 LSDIAQSFDMQTADTGLMMTVYAWTVLIMSLPAMLATGNMERKSLLIKLFIIFIVGHILS 95
L DIA F+ A T + T + T I + + + K LL+ II G ++
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 96 VIAWNFW-ILLLARMCIALAHSVFWSITASLVMRISPKHKKTQALGMLAIGTALATILGL 154
+ +F+ +L++AR + F ++ +V R PK + +A G++ A+ +G
Sbjct: 97 FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGP 156

Query: 155 PIGRIVGQLVGWRVTFGIIAVLALSIMFLIIRLLPNLPSKNAGSIASLPLLAKRPLLLWL 214
IG ++ + W I + +++ FL+ L + K I + L++ + L
Sbjct: 157 AIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFML 216

Query: 215 YVTTA 219
+ T+
Sbjct: 217 FTTSY 221


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_03260  MYCMG045         32  0.006    Hypothetical mycoplasma lipoprotein (MG045) signature.
		>MYCMG045#Hypothetical mycoplasma lipoprotein (MG045) signature.

Length = 483

Score = 32.0 bits (72), Expect = 0.006
Identities = 15/49 (30%), Positives = 24/49 (48%)

Query: 166 EKMENADENDRTSEEEQDEWEQEFDFDSEEDTALIDDALDEELEEEQDK 214
K NA+ + +Q E+EFD+ +E AL++ EL E + K
Sbjct: 388 SKKNNAEMKSKQMSTDQMTSEKEFDYYTETLKALLEKEDSAELNENEKK 436


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_03265  cloacin          29  0.023    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 28.9 bits (64), Expect = 0.023
Identities = 13/46 (28%), Positives = 22/46 (47%)

Query: 100 PFDVGFMDYEFRKLNLNVKTDDICLVTDTLQMARQMYPGKRSNLDA 145
P + +YE + LN +D+ + A Q+Y ++S LDA
Sbjct: 315 PVEAAERNYERARAELNQANEDVARNQERQAKAVQVYNSRKSELDA 360


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_03275  ECOLIPORIN       44  6e-07    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 43.7 bits (103), Expect = 6e-07
Identities = 60/280 (21%), Positives = 101/280 (36%), Gaps = 49/280 (17%)

Query: 2 KKTLAALIVGAFAASAANAAVVYNNEGTKVELGGRLSVIAEQSNTTVDDQKQQHGALRNQ 61
+K LA +I AA AA+AA +YN +G K++L G++ + S+ + D Q +
Sbjct: 3 RKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTY------ 56

Query: 62 GSRFHIKATHNFGDGFYAQGYLETRLVTDVTKNSADHFGDITTKYAYVTLGNKALGEVKL 121
R K D G E + + T+ + T+ A+ L G
Sbjct: 57 -MRVGFKGETQINDQLTGYGQWEYNVQANTTEGEGA---NSWTRLAFAGLKFGDYGSFDY 112

Query: 122 GRAKTIADGITSAED--KEYGVLNNSKYVPTNGNTVGYTFKGIDGLVLGANYLLAQERST 179
GR + + D E+G G++ Y D + G +A R+T
Sbjct: 113 GRNYGVLYDVEGWTDMLPEFG-----------GDSYTYA----DNYMTGRANGVATYRNT 157

Query: 180 SDFFGTPGEVSPQKISNGVQVGAKYDANNIIAAIAFGRTNYRENIQPNVSLSGR---KQQ 236
DFFG + +G+ +Y N ++ NI N +G
Sbjct: 158 -DFFG---------LVDGLNFALQYQGKN------ESQSADDVNIGTNNRNNGDDIRYDN 201

Query: 237 LEGVLSTLGYRFSDLGLLVSLDSGYAKTKNHKEMPSRSGN 276
+G + Y D+G+ S + Y + E + G
Sbjct: 202 GDGFGISTTY---DIGMGFSAGAAYTTSDRTNEQVNAGGT 238



Score = 33.3 bits (76), Expect = 0.001
Identities = 28/122 (22%), Positives = 48/122 (39%), Gaps = 15/122 (12%)

Query: 151 NGNTVGY--TFKGIDGLVLGANYLLAQERSTSDFFGTPGEVSPQKISNGVQVGAKYDANN 208
NG+ G T+ G GA Y T++ G ++ ++ G KYDANN
Sbjct: 201 NGDGFGISTTYDIGMGFSAGAAY--TTSDRTNEQVNAGGTIAGGDKADAWTAGLKYDANN 258

Query: 209 IIAAIAFGRTN-----YRENIQPNVSLSGRKQQLEGVLSTLGYRFSDLGLLVSLDSGYAK 263
I A + T + + + ++ + Q E Y+F D GL ++ +
Sbjct: 259 IYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVTAQ---YQF-DFGLRPAV--SFLM 312

Query: 264 TK 265
+K
Sbjct: 313 SK 314


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_03280  UREASE           33  0.002    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 32.8 bits (75), Expect = 0.002
Identities = 12/29 (41%), Positives = 19/29 (65%)

Query: 339 PARAIGVDDRLGSVEKGKIANLAVFTPNY 367
PA A G+ +GS+E GK A+L ++ P +
Sbjct: 413 PAIAHGLSHEIGSLEVGKRADLVLWNPAF 441


19  CGSHiGG_06890  CGSHiGG_06920  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias     %GCBias  Product
CGSHiGG_06890      -2       12    0.240836  ATP-dependent protease ATP-binding subunit ClpX
CGSHiGG_06895      -1       12    0.962813  preprotein translocase subunit SecE
CGSHiGG_06900      -1       12    1.180990  transcription antitermination protein NusG
CGSHiGG_06905      -2       13    0.425969  lipoprotein
CGSHiGG_06910      -2       14   -0.020659  transcription antitermination protein NusG
CGSHiGG_06915      -1       16   -0.504735  heat shock protein HtpX
CGSHiGG_06920      -2       14   -1.731927  sulfur transfer protein SirA
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_06890  HTHFIS           34  9e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 34.0 bits (78), Expect = 9e-04
Identities = 33/192 (17%), Positives = 66/192 (34%), Gaps = 53/192 (27%)

Query: 51 LEESAVENKDKLPTPHEIRAHLDDYVIGQDYAKKVLSVAVYNHYKRLRTNYESNDVELGK 110
+ A+ + P+ E + ++G+ A +Y R +
Sbjct: 114 IIGRALAEPKRRPSKLEDDSQDGMPLVGRSAA----MQEIYRVLAR---------LMQTD 160

Query: 111 SNILLIGPTGSGKTLLAQTL---ARRLNVPFAMADATTLTEAGYVGEDVENVLQKLLQNC 167
+++ G +G+GK L+A+ L +R N PF + + ++++ L
Sbjct: 161 LTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIP---------RDLIESELFGH 211

Query: 168 EYDT------------EKAEKGIIYIDEIDKISRKSEGASITRDVSGEGVQQALLKLIEG 215
E E+AE G +++DEI + Q LL++++
Sbjct: 212 EKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMP--------------MDAQTRLLRVLQQ 257

Query: 216 TIASIPPQGGRK 227
GGR
Sbjct: 258 G--EYTTVGGRT 267


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_06895  SECETRNLCASE    139  9e-46    Bacterial translocase SecE signature.
		>SECETRNLCASE#Bacterial translocase SecE signature.

Length = 127

Score = 139 bits (351), Expect = 9e-46
Identities = 49/121 (40%), Positives = 72/121 (59%), Gaps = 1/121 (0%)

Query: 18 EGKSKGLNTFLWVLAVIFFAAAAIGNIYFQQIYSLPIRVIGMAIALVIAFILAAITNQGT 77
+G +GL WV+ V A +GN ++ I LP+R + + I + A +A +T +G
Sbjct: 8 QGSGRGLEAMKWVVVVALLLVAIVGNYLYRDIM-LPLRALAVVILIAAAGGVALLTTKGK 66

Query: 78 KARAFFNDSRTEARKVVWPTRAEARQTTLIVIGVTMIASLFFWAVDSIIVTVINFLTDLR 137
AF ++RTE RKV+WPTR E TTLIV VT + SL W +D I+V +++F+T LR
Sbjct: 67 ATVAFAREARTEVRKVIWPTRQETLHTTLIVAAVTAVMSLILWGLDGILVRLVSFITGLR 126

Query: 138 F 138
F
Sbjct: 127 F 127


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_06905  VACJLIPOPROT    319  e-113    VacJ lipoprotein signature.
		>VACJLIPOPROT#VacJ lipoprotein signature.

Length = 251

Score = 319 bits (820), Expect = e-113
Identities = 111/253 (43%), Positives = 159/253 (62%), Gaps = 11/253 (4%)

Query: 4 KAILTAL-LGAIALTGCANNNDTKQASERNDSLEGFNRTMWKFNYNVLDRYVLEPVAKGW 62
K L+AL LG L GCA++ +Q R+D LEGFNRTM+ FN+NVLD Y++ PVA W
Sbjct: 2 KLRLSALALGTTLLVGCASSGTDQQ--GRSDPLEGFNRTMYNFNFNVLDPYIVRPVAVAW 59

Query: 63 NNYVPKPISSGLAGIANNLDEPVSFINRLIEGEPKKAFVHFNRFWINTVFGLGGFIDFAS 122
+YVP+P +GL+ NL+EP +N ++G+P + VHF RF++NT+ G+GGFID A
Sbjct: 60 RDYVPQPARNGLSNFTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAG 119

Query: 123 -ASKELRIDNQRGFGETLGSYGVDAGTYIVLPIYNATTPRQLTGAVVDAAYMYPFWQWVG 181
A+ +L+ FG TLG YGV G Y+ LP Y + T R G + DA +YP W+
Sbjct: 120 MANPKLQRTEPHRFGSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMADA--LYPVLSWLT 177

Query: 182 GPWALVKYGVQAVDGRAKNLNNAELLRQAQDPYITFREAYYQNLQFKVNDGKLVESK--- 238
P ++ K+ ++ ++ RA+ L++ LLRQ+ DPYI REAY+Q F N G+L +
Sbjct: 178 WPMSVGKWTLEGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGGELKPQENPN 237

Query: 239 -ESLPDDILKDID 250
+++ DD LKDID
Sbjct: 238 AQAIQDD-LKDID 249


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_06920  PF01206          90  5e-28    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 90.2 bits (224), Expect = 5e-28
Identities = 24/71 (33%), Positives = 42/71 (59%)

Query: 8 QTLNTLGLRCPEPVMLVRKNIRHLNDGEILLIIADDPATTRDIPSFCQFMDHTLLQSEVE 67
Q+L+ GL CP P++ +K + +N GE+L ++A DP + +D SF + H LL+ + E
Sbjct: 6 QSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKEE 65

Query: 68 KPPFKYWVKRG 78
+ + +KR
Sbjct: 66 DGTYHFRLKRA 76


20  CGSHiGG_08020  CGSHiGG_08120  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias     %GCBias  Product
CGSHiGG_08020       0       10   -0.564490  transcriptional repressor
CGSHiGG_08025       0       10    0.002705  putative membrane-fusion protein
CGSHiGG_08030      -1        9    0.018359  acriflavine resistance protein
CGSHiGG_08035      -2       12   -0.095618  cell division protein
CGSHiGG_08050      -1       13    0.619947  multidrug resistance protein A
CGSHiGG_08055      -1       17    1.286960  dihydrofolate reductase
CGSHiGG_08060       0       17    1.214256  gamma-glutamyl kinase
CGSHiGG_08065      -1       20    1.009108  dinucleoside polyphosphate hydrolase
CGSHiGG_08070      -2       15    1.337885  hypothetical protein
CGSHiGG_08085       0       16    1.387938  thymidylate synthase
CGSHiGG_08090      -1       11    0.614028  hypothetical protein
CGSHiGG_08095       0       11   -0.154934  hypothetical protein
CGSHiGG_08100       0        9    0.377405  hypothetical protein
CGSHiGG_08105       0       12    0.539249  preprotein translocase subunit SecA
CGSHiGG_08110      -2       13   -0.045054  mutator protein MutT
CGSHiGG_08115      -1       15   -0.399437  glutathione-regulated potassium-efflux system
CGSHiGG_08120       0       19   -0.656125  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
CGSHiGG_08020  HTHTETR  58     8e-13    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 58.1 bits (140), Expect = 8e-13
Identities = 18/77 (23%), Positives = 35/77 (45%)

Query: 1 MRQAKTDLAEQIFSATDRLMAREGLNQLSMHKLAKEANVAAGTIYLYFKNKDELLEQFAH 60
+Q + + I RL +++G++ S+ ++AK A V G IY +FK+K +L +
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 61 RVFSMFMATLEKDFDET 77
S + +
Sbjct: 65 LSESNIGELELEYQAKF 81


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
CGSHiGG_08025  RTXTOXIND  53     1e-09    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 52.5 bits (126), Expect = 1e-09
Identities = 17/55 (30%), Positives = 30/55 (54%)

Query: 90 GAVSQVLVQNGQNVKKGEVLVELDSSVERANLQAAQAQLSALRQTYQRYVGLLNS 144
V +++V+ G++V+KG+VL++L + A+ Q+ L R RY L S
Sbjct: 105 SIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRS 159



Score = 44.4 bits (105), Expect = 5e-07
Identities = 33/216 (15%), Positives = 68/216 (31%), Gaps = 39/216 (18%)

Query: 118 RANLQAAQAQLSALRQTYQRYVGLLNSNAVSRQEMDNAKAAYDAQVASIESLKAAIERRK 177
+ + +A+ + + Q ++ + ++ + + +
Sbjct: 279 ESEILSAKEEYQLVTQLFKNEI---------LDKLRQTTDNIGLLTLELAKNEERQQASV 329

Query: 178 IVAPFDGKAGIVKVN-VGQYVNVGT---EIVRVEDTSSMKVDFALSQNDLDKLHIGQRVT 233
I AP K +KV+ G V IV +DT ++V + D+ +++GQ
Sbjct: 330 IRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDT--LEVTALVQNKDIGFINVGQNAI 387

Query: 234 ATTDARLGKTFSARITAIEPAINSSTGLVDVQATFDPEDGHKLLSGMFSRLRIALPTETN 293
+A F + +++ A D G + I++
Sbjct: 388 IKVEA-----FPYTRY---GYLVGKVKNINLDAIEDQRLGL------VFNVIISIEENCL 433

Query: 294 QVVVPQVAISYNMYGE----------IAYLLEPLSE 319
+ +S M I+YLL PL E
Sbjct: 434 STGNKNIPLSSGMAVTAEIKTGMRSVISYLLSPLEE 469


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_08030  ACRIFLAVINRP  896    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 896 bits (2316), Expect = 0.0
Identities = 323/1044 (30%), Positives = 537/1044 (51%), Gaps = 48/1044 (4%)

Query: 5 DIFIRRPVLAVSISLLMIILGLQAISKLAVREYPKMTTTVITVSTAYPGADANLIQAFVT 64
+ FIRRP+ A +++++++ G AI +L V +YP + ++VS YPGADA +Q VT
Sbjct: 3 NFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVT 62

Query: 65 SKLEESIAQADNIDYMSSTSAPS-SSTITIKMKLNTDPAGALADVLAKVNAVKSALPNGI 123
+E+++ DN+ YMSSTS + S TIT+ + TDP A V K+ LP +
Sbjct: 63 QVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEV 122

Query: 124 EDPSVSSS-SGGSGIMYISFRSKKLDSSQ--VTDYINRVVKPQFFTIEGVAEVQVFGAAE 180
+ +S S S +M F S ++Q ++DY+ VK + GV +VQ+FGA +
Sbjct: 123 QQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA-Q 181

Query: 181 YALRIWLDPQKMAAQNLSVPTVMSALSANNVQTAAGNDN------GYYVSYRNKVETTTK 234
YA+RIWLD + L+ V++ L N Q AAG G ++ +T K
Sbjct: 182 YAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 235 SVEQLSNLIISSNGD-DLVRLRDIATVELNKENDNSRATANGAESVVLAINPTSTANPLT 293
+ E+ + + N D +VRL+D+A VEL EN N A NG + L I + AN L
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 294 VAEKIRPLYESIKTQLPDSMESDILYDRTIAINSSIHEVIKTIGEATLIVLVVILMFIGS 353
A+ I+ ++ P M+ YD T + SIHEV+KT+ EA ++V +V+ +F+ +
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 354 FRAILIPILAIPISLIGVLMLLQSFNFSINLMTLLALILAIGLVVDDAIVVLENIDRHIK 413
RA LIP +A+P+ L+G +L +F +SIN +T+ ++LAIGL+VDDAIVV+EN++R +
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 414 AGETPFRAAII-GTREIAVPVISMTIALIAVYSPMALMGGITGTLFKEFALTLAGAVFIS 472
+ P + A +I ++ + + L AV+ PMA GG TG ++++F++T+ A+ +S
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 473 GVVALTLSPMMSSKLLKSNAKP---------TWMEERVEHTLGKVNRVYEYMLDLVMLNR 523
+VAL L+P + + LLK + W +H+ VN + ++
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHS---VNHYTNSVGKILGSTG 538

Query: 524 KSMLAFAVVIFSTLPFLFNSLSSELTPNEDKGAFIAIGNAPSSVNVDYIQNAMQP----Y 579
+ +L +A+++ + LF L S P ED+G F+ + P+ + Q + Y
Sbjct: 539 RYLLIYALIVAGMV-VLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYY 597

Query: 580 MKNVMETPEVSF---GMSIAGAPTSNSSLNIITLKDWKERSRK---QSAIMNEINEKAKS 633
+KN E F G S +G N+ + ++LK W+ER+ A+++ +
Sbjct: 598 LKNEKANVESVFTVNGFSFSG-QAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGK 656

Query: 634 IPEVSVSAFNIPEIDTG--EQGPPVSIVLKTAQDYKSLANTAEKFLS-AMKASGKFIYTN 690
I + V FN+P I G ++ + + +L + L A + +
Sbjct: 657 IRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVR 716

Query: 691 LDLTYDTAQMTISVDKEKAGTYGITMQQISSTLGSFLSGATITRVDVDGRAYKVISQVKR 750
+ DTAQ + VD+EKA G+++ I+ T+ + L G + GR K+ Q
Sbjct: 717 PNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADA 776

Query: 751 DDRLSPESFQNYYLTASNGQSVPLSSVISMKLETQPTSLPRFSQLNSAEISAVPMPGTSS 810
R+ PE Y+ ++NG+ VP S+ + L R++ L S EI PGTSS
Sbjct: 777 KFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSS 836

Query: 811 GDAIAWLQQHATDNLPQGYTFDFKSEARQLVQEGNALAVTFALAVIIIFLVLAIQFESIR 870
GDA+A ++ LP G +D+ + Q GN A++ +++FL LA +ES
Sbjct: 837 GDAMALMEN-LASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWS 895

Query: 871 DPMVIMISVPLAVSGALVSLNILSFFSIAGTTLNIYSQVGLITLVGLITKHGILMCEVAK 930
P+ +M+ VPL + G L++ + ++Y VGL+T +GL K+ IL+ E AK
Sbjct: 896 IPVSVMLVVPLGIVGVLLAAT------LFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAK 949

Query: 931 EEQL-NGKTRIEAITHAAKVRLRPILMTTAAMVAGLIPLLYATGAGAVSRFSIGIVIVAG 989
+ GK +EA A ++RLRPILMT+ A + G++PL + GAG+ ++ ++GI ++ G
Sbjct: 950 DLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGG 1009

Query: 990 LSIGTIFTLFVLPVVYSYVATEHK 1013
+ T+ +F +PV + + K
Sbjct: 1010 MVSATLLAIFFVPVFFVVIRRCFK 1033


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
CGSHiGG_08035  PF04647  31     0.003    Accessory gene regulator B
		>PF04647#Accessory gene regulator B

Length = 212

Score = 31.3 bits (71), Expect = 0.003
Identities = 8/41 (19%), Positives = 15/41 (36%)

Query: 2 AHRDFAARRSSNNKKKNKKKNTNVLIFLALFIVLVFVVGLY 42
D SN +++ K ++ + LF + LY
Sbjct: 122 VPVDNPRNLISNTEQRKTLKLKTSMVLMVLFGGSIGAYRLY 162


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
CGSHiGG_08050  RTXTOXIND  81     1e-18    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 80.6 bits (199), Expect = 1e-18
Identities = 57/415 (13%), Positives = 121/415 (29%), Gaps = 101/415 (24%)

Query: 25 SIFILLLLIIGIACALYWFFFLKDFEETEDAYVGGNQVMVSSQV-------AGNVAKINA 77
+ ++ + A + E ++ S + V +I
Sbjct: 57 PRLVAYFIMGFLVIAFILSVL----GQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIV 112

Query: 78 DNMDKVHAGDILVELDDTNAKLSFEQAKSNLANA----------VRQIEQLGFTV----- 122
+ V GD+L++L A+ + +S+L A R IE
Sbjct: 113 KEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPD 172

Query: 123 ----QQLQSAVHANEISLAQAQGNLARRVQLEKMGAIDK---------ESFQHAKEAVEL 169
Q + SL + Q + + + +K +DK + +
Sbjct: 173 EPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRV 232

Query: 170 AKANLNA----------SKNQLAANQALL------RNVPLREQPQIQNAMSSLKQ----- 208
K+ L+ +K+ + + V + QI++ + S K+
Sbjct: 233 EKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLV 292

Query: 209 ----------------------------AWLNLQRTKIRSPIDGYVARRNVQ-VGQAVSV 239
Q + IR+P+ V + V G V+
Sbjct: 293 TQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTT 352

Query: 240 GGALMAVV-SNEQMWLEANFKETQLTNMRIGQPVKIHFDLYGKNK--EFDGVINGIEMGT 296
LM +V ++ + + A + + + +GQ I + + + G + I +
Sbjct: 353 AETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDA 412

Query: 297 GNAFSLLPSQNATGNWIKVVQRVPVRIKLDPQQFTETPLRIGLSATAKVRISDSS 351
G V+ + + PL G++ TA+++ S
Sbjct: 413 -------IEDQRLGLVFNVIISIEENCLSTGNK--NIPLSSGMAVTAEIKTGMRS 458


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_08060  CARBMTKINASE  37     6e-05    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 37.5 bits (87), Expect = 6e-05
Identities = 33/133 (24%), Positives = 48/133 (36%), Gaps = 22/133 (16%)

Query: 129 IPVINENDAVATAEIKVGDNDNLSALVAILVQAEQLYLLTDQQGLFDSDPRKNSEAKLIP 188
+PVI E+ + E V D D +A V A+ +LTD G E L
Sbjct: 197 VPVILEDGEIKGVE-AVIDKDLAGEKLAEEVNADIFMILTDVNGAA-LYYGTEKEQWLRE 254

Query: 189 V-VEQITDHIRSIAGGSGTNLGTGGMMTKIIAADVATRSGIETIIAPGNRPNVIADL--- 244
V VE++ + + G M K++AA G +IA L
Sbjct: 255 VKVEELRKYYEE------GHFKAGSMGPKVLAAIRFIEW--------GGERAIIAHLEKA 300

Query: 245 --AYEQNIGTKFI 255
A E GT+ +
Sbjct: 301 VEALEGKTGTQVL 313



Score = 33.6 bits (77), Expect = 8e-04
Identities = 16/49 (32%), Positives = 25/49 (51%), Gaps = 4/49 (8%)

Query: 3 KKTIVVKFGTSTLTQGSPKLNSPHMMEIVR----QIAQLHNDGFRIVIV 47
K +V+ G + L Q K + MM+ VR QIA++ G+ +VI
Sbjct: 2 GKRVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVIT 50


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_08095  FbpA_PF05833  29     0.003    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 28.7 bits (64), Expect = 0.003
Identities = 12/68 (17%), Positives = 30/68 (44%), Gaps = 8/68 (11%)

Query: 12 VNITDVLEQSPFAKIMKK---GLAINELNQ-KFNRIFPQEFHGKFRIGNITDHLIFIEV- 66
+ + ++ F +++K I +++Q +RI +F +G + + + IE+
Sbjct: 64 LTKPNPIKAPMFCMVLRKYISNAKIVDIHQINQDRIVVIDFESTDELGFNSIYSLIIEIM 123

Query: 67 ---SNAIV 71
SN +
Sbjct: 124 GRHSNMTL 131


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
CGSHiGG_08100  PF06580  26     0.023    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 26.4 bits (58), Expect = 0.023
Identities = 13/49 (26%), Positives = 23/49 (46%), Gaps = 4/49 (8%)

Query: 53 KVAREVQRQAIPQPSISRQTEK---QLKIQPHFFTEALN-ISAPIRAGP 97
K ++ + S++++ + + +I PHF ALN I A I P
Sbjct: 142 KNYKQAEIDQWKMASMAQEAQLMALKAQINPHFMFNALNNIRALILEDP 190


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits  Score  E-value  Comments
CGSHiGG_08105  SECA  1328   0.0      SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 1328 bits (3438), Expect = 0.0
Identities = 613/897 (68%), Positives = 728/897 (81%), Gaps = 1/897 (0%)

Query: 1 MSILTRIFGSRNERVLRKLKKQVVKINKMEPAFEALSDDELKAKTQEFRDRLSGGETLQQ 60
+ +LT++FGSRN+R LR+++K V IN MEP E LSD+ELK KT EFR RL GE L+
Sbjct: 3 IKLLTKVFGSRNDRTLRRMRKVVNIINAMEPEMEKLSDEELKGKTAEFRARLEKGEVLEN 62

Query: 61 ILPEAFATVREASKRVLGMRHFDVQLIGGMVLTNRCIAEMRTGEGKTLTATLPCYLIALE 120
++PEAFA VREASKRV GMRHFDVQL+GGMVL RCIAEMRTGEGKTLTATLP YL AL
Sbjct: 63 LIPEAFAVVREASKRVFGMRHFDVQLLGGMVLNERCIAEMRTGEGKTLTATLPAYLNALT 122

Query: 121 GKGVHVVTVNDYLARRDAETNRPLFEFLGMSVGVNIPGLSPEEKREAYAADITYATNSEL 180
GKGVHVVTVNDYLA+RDAE NRPLFEFLG++VG+N+PG+ KREAYAADITY TN+E
Sbjct: 123 GKGVHVVTVNDYLAQRDAENNRPLFEFLGLTVGINLPGMPAPAKREAYAADITYGTNNEY 182

Query: 181 GFDYLRDNLAHSKEERFQRTLGYALVDEVDSILIDEARTPLIISGQAEKSSELYIAVNKL 240
GFDYLRDN+A S EER QR L YALVDEVDSILIDEARTPLIISG AE SSE+Y VNK+
Sbjct: 183 GFDYLRDNMAFSPEERVQRKLHYALVDEVDSILIDEARTPLIISGPAEDSSEMYKRVNKI 242

Query: 241 IPSLIKQEKEDTEEYQGEGDFTLDLKSKQAHLTERGQEKVEDWLIAQGLMPEGDSLYSPS 300
IP LI+QEKED+E +QGEG F++D KS+Q +LTERG +E+ L+ +G+M EG+SLYSP+
Sbjct: 243 IPHLIRQEKEDSETFQGEGHFSVDEKSRQVNLTERGLVLIEELLVKEGIMDEGESLYSPA 302

Query: 301 RIVLLHHVMAALRAHTLFEKDVDYIVKDGEIVIVDEHTGRTMAGRRWSDGLHQAIEAKEG 360
I+L+HHV AALRAH LF +DVDYIVKDGE++IVDEHTGRTM GRRWSDGLHQA+EAKEG
Sbjct: 303 NIMLMHHVTAALRAHALFTRDVDYIVKDGEVIIVDEHTGRTMQGRRWSDGLHQAVEAKEG 362

Query: 361 VDIKSENQTVASISYQNYFRLYERLAGMTGTADTEAFEFQQIYGLETVVIPTNRPMIRDD 420
V I++ENQT+ASI++QNYFRLYE+LAGMTGTADTEAFEF IY L+TVV+PTNRPMIR D
Sbjct: 363 VQIQNENQTLASITFQNYFRLYEKLAGMTGTADTEAFEFSSIYKLDTVVVPTNRPMIRKD 422

Query: 421 RTDVMFENEQYKFNAIIEDIKDCVERQQPVLVGTISVEKSEELSKALDKAGIKHNVLNAK 480
D+++ E K AIIEDIK+ + QPVLVGTIS+EKSE +S L KAGIKHNVLNAK
Sbjct: 423 LPDLVYMTEAEKIQAIIEDIKERTAKGQPVLVGTISIEKSELVSNELTKAGIKHNVLNAK 482

Query: 481 FHQQEAEIVAEAGFPSAVTIATNMAGRGTDIILGGNWKAQAAKLENPTQEQIEALKAEWE 540
FH EA IVA+AG+P+AVTIATNMAGRGTDI+LGG+W+A+ A LENPT EQIE +KA+W+
Sbjct: 483 FHANEAAIVAQAGYPAAVTIATNMAGRGTDIVLGGSWQAEVAALENPTAEQIEKIKADWQ 542

Query: 541 KNHEIVMKAGGLHIIGTERHESRRIDNQLRGRSGRQGDPGSSRFYLSLEDGLMRIYLNEG 600
H+ V++AGGLHIIGTERHESRRIDNQLRGRSGRQGD GSSRFYLS+ED LMRI+ ++
Sbjct: 543 VRHDAVLEAGGLHIIGTERHESRRIDNQLRGRSGRQGDAGSSRFYLSMEDALMRIFASDR 602

Query: 601 KRNLMRKAFTVAGEAMESKMLAKVIASAQAKVEAFHFDGRKNLLEYDDVANDQRHAIYEQ 660
+MRK GEA+E + K IA+AQ KVE+ +FD RK LLEYDDVANDQR AIY Q
Sbjct: 603 VSGMMRKLGMKPGEAIEHPWVTKAIANAQRKVESRNFDIRKQLLEYDDVANDQRRAIYSQ 662

Query: 661 RNYLLDNDDISETINAIRHDVFNGVIDQYIPPQSLEEQWDIKGLEERLSQEFGMELPISN 720
RN LLD D+SETIN+IR DVF ID YIPPQSLEE WDI GL+ERL +F ++LPI+
Sbjct: 663 RNELLDVSDVSETINSIREDVFKATIDAYIPPQSLEEMWDIPGLQERLKNDFDLDLPIAE 722

Query: 721 WLEEDNNLHEESLRERIVEIAEKEYKEKEALVGEDAMHHFEKGVMLQTLDELWKEHLASM 780
WL+++ LHEE+LRERI+ + + Y+ KE +VG + M HFEKGVMLQTLD LWKEHLA+M
Sbjct: 723 WLDKEPELHEETLRERILAQSIEVYQRKEEVVGAEMMRHFEKGVMLQTLDSLWKEHLAAM 782

Query: 781 DYLRQGIHLRGYAQKDPKQEYKKESFRMFTEMLDSLKHHVIMTLTRVRVRTQEEMEEAEL 840
DYLRQGIHLRGYAQKDPKQEYK+ESF MF ML+SLK+ VI TL++V+VR EE+EE E
Sbjct: 783 DYLRQGIHLRGYAQKDPKQEYKRESFSMFAAMLESLKYEVISTLSKVQVRMPEEVEELEQ 842

Query: 841 ARQEMATRINQ-NNLPVDENSQTTQNSETEDYSDRRIGRNEPCPCGSGKKYKHCHGS 896
R+ A R+ Q L ++ + +R++GRN+PCPCGSGKKYK CHG
Sbjct: 843 QRRMEAERLAQMQQLSHQDDDSAAAAALAAQTGERKVGRNDPCPCGSGKKYKQCHGR 899


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_08120  ADHESNFAMILY  30     0.007    Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 30.2 bits (68), Expect = 0.007
Identities = 15/63 (23%), Positives = 25/63 (39%), Gaps = 1/63 (1%)

Query: 68 AKVIGTDLSEKMLEQAEKDLQKCGQFSARF-SLYHLPMEKLAELPESHFDVITSSFAFHY 126
AK I LS K E + +++ + L +K ++P ++TS AF Y
Sbjct: 151 AKNIAKQLSAKDPNNKEFYEKNLKEYTDKLDKLDKESKDKFNKIPAEKKLIVTSEGAFKY 210

Query: 127 IEN 129

Sbjct: 211 FSK 213


21  CGSHiGG_09225  CGSHiGG_09260  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias   Product
CGSHiGG_09225  0       15       1.863514  ADP-L-glycero-D-mannoheptose-6-epimerase
CGSHiGG_09230  -1      16       2.275245  ADP-L-glycero-D-mannoheptose-6-epimerase
CGSHiGG_09245  0       16       2.884163  deoxyribose-phosphate aldolase
CGSHiGG_09250  1       18       2.378927  ribosome biogenesis GTP-binding protein YsxC
CGSHiGG_09255  -1      18       2.388561  D-xylose transport permease protein
CGSHiGG_09260  -3      19       2.773800  GTPase EngB
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
CGSHiGG_09225  NUCEPIMERASE  99     6e-26    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 98.7 bits (246), Expect = 6e-26
Identities = 78/348 (22%), Positives = 130/348 (37%), Gaps = 68/348 (19%)

Query: 2 IIVTGGAGFIGSNIVKALNDLGRKDILVVDNLKD--------------GTKFANLVDLDI 47
+VTG AGFIG ++ K L + G ++ +DNL D +D+
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAG-HQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 48 ADYCDKEDFIASIIAGDEFGDIDAVFHEGACSATTEWDGKYIMHNNYEYSK-------EL 100
AD + + + A F + VF A +Y + N + Y+ +
Sbjct: 62 ADR----EGMTDLFASGHF---ERVFISPHRLAV-----RYSLENPHAYADSNLTGFLNI 109

Query: 101 LHYCLDREIP-FFYASSAATYGDTKV--FREEREFEGPLNVYGYSKFLFDQYVRNILPEA 157
L C +I YASS++ YG + F + + P+++Y +K +
Sbjct: 110 LEGCRHNKIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLY 169

Query: 158 KSPVCGFRYFNVYGPRENHKGSMASVAFHLNNQILKGENPKLFAGSEGFRRDFVYVGDVA 217
P G R+F VYGP + MA F +L+G++ ++ +RDF Y+ D+A
Sbjct: 170 GLPATGLRFFTVYGPWG--RPDMA--LFKFTKAMLEGKSIDVY-NYGKMKRDFTYIDDIA 224

Query: 218 AVNI------------WCWQNGISG-------IYNLGTGNAESFRAVADAVVKFHG-KGE 257
I W + G +YN+G + A+ G + +
Sbjct: 225 EAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAK 284

Query: 258 IETIPFPEHLKSRYQEYTQADLTKLRS-TGYDKPFKTVAEGVAEYMAW 304
+P T AD L G+ P TV +GV ++ W
Sbjct: 285 KNMLPLQPG----DVLETSADTKALYEVIGF-TPETTVKDGVKNFVNW 327


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits  Score  E-value  Comments
CGSHiGG_09230  SECA  28     0.022    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 27.9 bits (62), Expect = 0.022
Identities = 14/37 (37%), Positives = 20/37 (54%), Gaps = 11/37 (29%)

Query: 67 TSPA-INSLANEGYQVVSVALRSGNEADVNDYLSKHD 102
T PA +N+L +G VV+V NDYL++ D
Sbjct: 113 TLPAYLNALTGKGVHVVTV----------NDYLAQRD 139


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
CGSHiGG_09245  HTHFIS  38     1e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 37.5 bits (87), Expect = 1e-04
Identities = 39/178 (21%), Positives = 58/178 (32%), Gaps = 53/178 (29%)

Query: 194 DLTDIIGQ----QHAKRALTIAAAGQHNLLFLGPPGTGKTMLASRLTGLLPEMTDLEAIE 249
D ++G+ Q R L L+ G GTGK ++A A+
Sbjct: 135 DGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVA-------------RALH 181

Query: 250 TASVTSLVQNELNFHNWKQRPFRAPHHSASMP------ALVG-------GGTIPKPGEIS 296
+ PF A + A++P L G G G
Sbjct: 182 DYG------------KRRNGPFVAI-NMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFE 228

Query: 297 LATNGVLFLDEL----PEFERKVLDALRQPLESGEIIISRANAKIQFPARFQLVAAMN 350
A G LFLDE+ + + ++L L+ GE I+ R +VAA N
Sbjct: 229 QAEGGTLFLDEIGDMPMDAQTRLLRV----LQQGEYTTVGGRTPIRSDVR--IVAATN 280


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
CGSHiGG_09260  HTHFIS  30     0.011    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 30.2 bits (68), Expect = 0.011
Identities = 9/16 (56%), Positives = 12/16 (75%)

Query: 54 VVGESGCGKSTLARAI 69
+ GESG GK +ARA+
Sbjct: 165 ITGESGTGKELVARAL 180



 
Contact Sachin Pundhir for Bugs/Comments.