PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
A) Input parameters
Genome: sequence.gb
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic): (no value shown)
 
C) Potential GIs and PAIs in NZ_CP007504
S.No  Start  End  Bias  Virulence  Insertion elements  Prediction
1  CG09_RS00135  CG09_RS00225  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CG09_RS00135  -1  10  -4.656967  hypothetical protein
CG09_RS00140  0  12  -3.476682  hypothetical protein
CG09_RS00145  -1  15  -2.792667  hypothetical protein
CG09_RS00150  -1  14  -3.286786  hypothetical protein
CG09_RS00155  -1  15  -2.950871  hypothetical protein
CG09_RS00165  1  14  -2.846380  monofunctional biosynthetic peptidoglycan
CG09_RS00170  0  13  -2.746345  metallophosphoesterase
CG09_RS00175  3  16  -2.628251  hypothetical protein
CG09_RS00180  4  17  -1.974942  catalase
CG09_RS00185  6  15  -1.549324  hypothetical protein
CG09_RS00195  5  15  -1.975132  hypothetical protein
CG09_RS00200  4  16  -3.142512  hypothetical protein
CG09_RS00205  4  17  -3.402925  glycan metabolism protein RagB
CG09_RS00215  -2  15  -3.574848  hypothetical protein
CG09_RS00220  -3  14  -4.044095  hypothetical protein
CG09_RS00225  -2  14  -3.085712  GLPGLI family protein
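In the raw page export, each row of the per-ORF table is run together with no column separators (for example `CG09_RS00135-110-4.656967hypothetical protein`). The following is a minimal sketch of how such a row could be split back into its five columns. It is not part of PredictBias; the function name `parse_row` and the column widths are assumptions inferred from the rows on this page: a locus tag ending in `_RS` plus five digits, a single-digit DNBias (optionally negative), a one- or two-digit CDNBias, and a %GCBias with one integer digit and six decimals.

```python
import re

# Assumed layout, inferred from the rows on this page (not from tool docs):
#   locus tag  = <prefix>_RS + 5 digits
#   DNBias     = one digit, optional leading minus
#   CDNBias    = one or two digits
#   %GCBias    = optional minus, one digit, ".", six decimals
#   Product    = everything that remains
ROW = re.compile(
    r"^(?P<locus>[A-Za-z0-9]+_RS\d{5})"
    r"(?P<dn>-?\d)"
    r"(?P<cdn>\d{1,2})"
    r"(?P<gc>-?\d\.\d{6})"
    r"(?P<product>.*)$"
)


def parse_row(line):
    """Split one fused table row into (locus, DNBias, CDNBias, %GCBias, product)."""
    match = ROW.match(line.strip())
    if match is None:
        raise ValueError(f"row does not fit the assumed layout: {line!r}")
    return (
        match["locus"],
        int(match["dn"]),
        int(match["cdn"]),
        float(match["gc"]),
        match["product"].strip(),
    )


if __name__ == "__main__":
    print(parse_row("CG09_RS00135-110-4.656967hypothetical protein"))
    print(parse_row("CG09_RS01065210-0.5820646-phosphofructokinase"))
```

Because the %GCBias field is matched with a fixed number of decimals, product names that start with a digit (such as `6-phosphofructokinase`) stay attached to the product column. Rows whose values fall outside the assumed widths raise an error instead of being split silently.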
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CG09_RS00200  PF00577  29  0.006  Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 29.4 bits (66), Expect = 0.006
Identities = 12/51 (23%), Positives = 21/51 (41%), Gaps = 3/51 (5%)

Query: 6 PYNGNVTSINPEDIESTVILKDATATAIYGARGANGVVLINTKTGRGRSVI 56
Y N +++ + V L +A A + RGA +V K G ++
Sbjct: 752 EYRENRVALDTNTLADNVDLDNAVANVVPT-RGA--IVRAEFKARVGIKLL 799
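
Each alignment on this page carries a statistics line of the form `Score = 29.4 bits (66), Expect = 0.006`, and the input parameters list an RPSBlast E-value of 0.05. As a rough sketch, such a line could be parsed and checked against that cutoff as follows. The helper names are hypothetical, and the `<=` comparison is an assumption; the page does not state exactly how the tool applies the threshold.

```python
import re

# Matches lines such as "Score = 29.4 bits (66), Expect = 0.006"
STATS = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*([0-9.eE+-]+)"
)


def parse_hit_stats(line):
    """Return (bit_score, raw_score, e_value) from an alignment statistics line."""
    match = STATS.search(line)
    if match is None:
        raise ValueError(f"not a statistics line: {line!r}")
    return float(match.group(1)), int(match.group(2)), float(match.group(3))


def passes_evalue(line, threshold=0.05):
    """Assumed rule: keep a hit when its E-value does not exceed the threshold."""
    return parse_hit_stats(line)[2] <= threshold


if __name__ == "__main__":
    print(parse_hit_stats("Score = 29.4 bits (66), Expect = 0.006"))
    print(passes_evalue("Score = 151 bits (382), Expect = 8e-50"))
```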


2  CG09_RS00800  CG09_RS00870  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CG09_RS00800  3  25  -0.514066  cell filamentation protein Fic
CG09_RS00805  2  24  -0.687241  excisionase
CG09_RS00810  2  25  -0.385095  phage antirepressor protein
CG09_RS00815  2  29  0.932813  transcriptional regulator
CG09_RS00820  2  32  0.994205  hypothetical protein
CG09_RS00825  5  33  0.111486  hypothetical protein
CG09_RS00830  5  35  0.639812  hypothetical protein
CG09_RS00835  4  28  -0.641532  hypothetical protein
CG09_RS00845  0  20  -1.446701  hypothetical protein
CG09_RS00850  2  16  -2.508550  hypothetical protein
CG09_RS00855  2  20  -3.491017  arsenate reductase family protein
CG09_RS00860  2  19  -3.691672  hypothetical protein
CG09_RS00865  -1  16  -3.972508  hypothetical protein
CG09_RS00870  -2  14  -3.344232  cyclic nucleotide-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CG09_RS00855  LPSBIOSNTHSS  26  0.019  Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 25.9 bits (57), Expect = 0.019
Identities = 11/37 (29%), Positives = 19/37 (51%), Gaps = 2/37 (5%)

Query: 56 KDRQIPDEVIVGGYEALNPKKRPLFYRDKALEYIKQL 92
+ ++ D+V V NP K+P+F + LE I +
Sbjct: 22 RGCRLFDQVYVA--VLRNPNKQPMFSVQERLEQIAKA 56


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CG09_RS00860  STREPTOPAIN  29  0.017  Streptopain (C10) cysteine protease family signature.
		>STREPTOPAIN#Streptopain (C10) cysteine protease family signature.

Length = 398

Score = 29.3 bits (65), Expect = 0.017
Identities = 18/72 (25%), Positives = 37/72 (51%), Gaps = 7/72 (9%)

Query: 186 IEGYIKNIREDGKIDVSLQPEGYTNIDEFKQKILDKLDENYGLLYLSDQSSPEEIKTELQ 245
+E Y++ I+E+ K+D + Y E KQ ++ L ++ G+ Y +Q +P + T +
Sbjct: 121 MESYVEQIKENKKLDTT-----YAGTAEIKQPVVKSLLDSKGIHY--NQGNPYNLLTPVI 173

Query: 246 MSKKNFKKAIGG 257
K +++ G
Sbjct: 174 EKVKPGEQSFVG 185


3  CG09_RS01055  CG09_RS01135  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CG09_RS01055  3  14  -0.341611  translation initiation factor
CG09_RS01060  3  14  -1.408615  ribonuclease Y
CG09_RS01065  2  10  -0.582064  6-phosphofructokinase
CG09_RS01070  0  10  -1.182951  hypothetical protein
CG09_RS01075  -1  8  -1.056116  DNA topoisomerase IV subunit B
CG09_RS01080  -3  8  -0.371691  hypothetical protein
CG09_RS01085  -3  9  -0.977299  RelA/SpoT family protein
CG09_RS01090  -3  10  -0.851756  sodium:proton antiporter
CG09_RS01100  -1  15  -1.663658  gamma-glutamyltransferase
CG09_RS01105  1  16  -0.395625  hypothetical protein
CG09_RS01110  3  20  -1.792778  hypothetical protein
CG09_RS01115  3  22  -0.143760  tryptophan synthase subunit alpha
CG09_RS01120  4  20  0.523733  hypothetical protein
CG09_RS01125  4  20  0.695797  tryptophan synthase subunit beta
CG09_RS01130  4  20  1.067289  phosphoribosylanthranilate isomerase
CG09_RS01135  2  17  -0.481234  DNA-3-methyladenine glycosylase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CG09_RS01065  GPOSANCHOR  38  1e-04  Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 37.7 bits (87), Expect = 1e-04
Identities = 23/166 (13%), Positives = 45/166 (27%), Gaps = 4/166 (2%)

Query: 38 EDAKKNAENLIEKAKVKAESIKQEKNIQAKERFLELKTEHDAHIQQREKKMQEVEKRARD 97
+D + AK K + + +A + + A +++ +
Sbjct: 84 KDHNDELTEELSNAKEKLRKNDKSLSEKASKI--QELEARKADLEKALEGAMNFSTADSA 141

Query: 98 KEQKLNDELSKVGKLEKDLDKKLSDLSRKEEQLEKKQEELSVAIAKKVELLEKVSGYSAE 157
K + L E + + + DL+K L K + L A ++ A
Sbjct: 142 KIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELE--KAL 199

Query: 158 EAKEELVETMKSEAKTKAQSYVQNIMEEAQLNAKNEAKKIVIQTIQ 203
E ++ KT +A L E
Sbjct: 200 EGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADS 245


4  CG09_RS01450  CG09_RS01545  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
CG09_RS01450  3  17  -1.116463  membrane protein
CG09_RS01455  6  20  0.527852  hypothetical protein
CG09_RS01460  6  15  1.339967  dipeptidyl carboxypeptidase II
CG09_RS01470  4  12  1.527417  barnase inhibitor
CG09_RS01480  -1  17  2.718860  glycosyl transferase family 2
CG09_RS01485  -3  13  1.373927  RNA methyltransferase
CG09_RS01490  0  15  0.206403  abortive infection protein
CG09_RS01495  -2  12  -0.104058  hypothetical protein
CG09_RS01500  -1  12  -0.625988  DEAD/DEAH box helicase
CG09_RS01510  2  16  -3.489577  hypothetical protein
CG09_RS01515  4  16  -3.917750  hypothetical protein
CG09_RS01520  3  13  -1.385371  hypothetical protein
CG09_RS01525  1  15  -1.841355  hypothetical protein
CG09_RS01535  2  14  0.019424  DNA gyrase subunit A
CG09_RS01540  1  14  -0.711901  hypothetical protein
CG09_RS01545  2  14  -0.248984  DNA gyrase subunit B
5  CG09_RS01770  CG09_RS01830  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CG09_RS01770  6  26  3.684829  5-hydroxyisourate hydrolase
CG09_RS01775  6  25  3.777824  hypothetical protein
CG09_RS01780  7  24  3.928812  hypothetical protein
CG09_RS01785  8  23  2.748916  5-hydroxyisourate hydrolase
CG09_RS01790  4  22  3.734383  transketolase
CG09_RS01795  3  20  2.954045  acetyltransferase
CG09_RS01805  -1  14  0.090702  NAD-dependent epimerase
CG09_RS01810  -1  16  0.403065  hypothetical protein
CG09_RS01815  -1  18  0.603266  molecular chaperone GroEL
CG09_RS01820  0  18  -0.483455  molecular chaperone GroES
CG09_RS01825  0  15  -0.486683  META domain-containing protein
CG09_RS01830  2  16  -1.075750  phenylalanine--tRNA ligase subunit beta
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CG09_RS01815  PF00577  31  0.007  Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 31.0 bits (70), Expect = 0.007
Identities = 13/72 (18%), Positives = 23/72 (31%), Gaps = 10/72 (13%)

Query: 163 VYLRF----GRPAVPVFMPEDMPFEIGKGILLQEGKDVTIVATGHLVW-ESLVAAEQL-- 215
V F G + + P G + + + IVA V+ + A ++
Sbjct: 786 VRAEFKARVGIKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQV 845

Query: 216 ---EKEGISCEV 224
E+E C
Sbjct: 846 KWGEEENAHCVA 857


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CG09_RS01825  NUCEPIMERASE  84  2e-20  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 83.7 bits (207), Expect = 2e-20
Identities = 68/336 (20%), Positives = 118/336 (35%), Gaps = 57/336 (16%)

Query: 7 KILITGALGQIGTELTAKLVE----IYGKDNVIASGID-----KWREGITTAG-HYERID 56
K L+TGA G IG ++ +L+E + G DN + D E + G + +ID
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDN-LNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 57 VTNFKLLEDFIKENKITTVYHLASLLSGT--SEKQPLFAWKLNLEPLLHLCELAKEGYLK 114
+ + + + D V+ S + P NL L++ E + ++
Sbjct: 61 LADREGMTDLFASGHFERVFISPHR-LAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQ 119

Query: 115 KIFWPSSIAVFGKGIPKHNVGQDVVLNPTTVYGISKMAGEKWCEYYHDKYGVDVRSIRY- 173
+ + SS +V+G D V +P ++Y +K A E Y YG+ +R+
Sbjct: 120 HLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGLRFF 179

Query: 174 ----PGLISWKAPAGGGTTDYAVEIFYEAVEKGE-YQCFISENTAMPMLYMDDAINATIK 228
P W P D A+ F +A+ +G+ + Y+DD A I+
Sbjct: 180 TVYGP----WGRP------DMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIR 229

Query: 229 LMQEPAENISVWGS--------------YNLGGMSFTP-AELTNEI-----KKVMPNFKI 268
L + W YN+G S + + + N
Sbjct: 230 LQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLP 289

Query: 269 SYQPDFRQSIADSWPASIDDSKAKEDWGLSYEFDIK 304
D ++ AD+ E G + E +K
Sbjct: 290 LQPGDVLETSADT-------KALYEVIGFTPETTVK 318


6  CG09_RS02045  CG09_RS02075  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
CG09_RS02045  2  15  0.132430  ATP synthase F1 subunit gamma
CG09_RS02050  2  16  0.111303  ATP synthase subunit alpha
CG09_RS02055  3  19  -0.684373  ATP synthase F1 subunit delta
CG09_RS02060  5  22  -0.369140  ATP synthase F0 subunit B
CG09_RS02065  5  24  0.117739  ATP synthase F0 subunit C
CG09_RS02070  4  25  0.672058  ATP synthase F0 subunit A
CG09_RS02075  3  24  0.396529  hypothetical protein
7  CG09_RS02180  CG09_RS02275  Y  N  Y  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
CG09_RS02180  -1  24  5.164787  ABC transporter permease
CG09_RS02185  0  23  4.888882  ABC transporter
CG09_RS02190  0  25  5.338231  carbamoyl phosphate synthase large subunit
CG09_RS02195  1  27  6.140912  hypothetical protein
CG09_RS02205  1  21  4.674438  carbamoyl phosphate synthase small subunit
CG09_RS02210  2  17  1.224756  aspartate carbamoyltransferase
CG09_RS02215  1  17  1.457050  L-asparaginase 1
CG09_RS02220  -2  15  1.673910  *cold-shock protein
CG09_RS02225  -3  13  0.409132  cold-shock protein
CG09_RS02230  -2  14  0.609786  DEAD/DEAH box helicase
CG09_RS02240  -2  12  1.185439  23S rRNA (adenine(2503)-C(2))-methyltransferase
CG09_RS02245  -2  12  0.371908  tRNA preQ1(34) S-adenosylmethionine
CG09_RS02255  0  12  -2.707337  oxidoreductase
CG09_RS02260  1  15  -3.632573  hypothetical protein
CG09_RS02270  3  23  -6.151292  hypothetical protein
CG09_RS02275  1  17  -3.403254  WYL domain-containing protein
8  CG09_RS02755  CG09_RS02785  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CG09_RS02755  3  15  1.250341  DNA polymerase III subunit epsilon
CG09_RS02760  6  19  -1.534942  2-hydroxyhepta-2,4-diene-1,7-dioate isomerase
CG09_RS02765  7  20  -3.512750  universal stress protein UspA
CG09_RS02770  6  21  -3.348928  hypothetical protein
CG09_RS02775  4  17  -2.948549  sulfurtransferase
CG09_RS02780  3  16  -2.878360  DNA sulfur modification protein DndD
CG09_RS02785  2  14  -2.562775  DNA sulfur modification protein DndE
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CG09_RS02785  TYPE4SSCAGA  32  0.008  Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 32.4 bits (73), Expect = 0.008
Identities = 29/118 (24%), Positives = 52/118 (44%), Gaps = 6/118 (5%)

Query: 432 NDNYSRSKNDLDSIRRKIRA-----AEKDAED-EYIANLRNEKTRLDNRVYSIDKEVYDL 485
N N +K +S + +I A A +DA Y NL+ K L +++ +++K + D
Sbjct: 638 NKNKMEAKAQANSQKDEIFALINKEANRDARAIAYAQNLKGIKRELSDKLENVNKNLKDF 697

Query: 486 SEKIGSFKNEIKTLKQRQEELRKKIDDSRRYSDKDKITQRQIENLRNFIKDFKDATKK 543
+ FKN + EE K + S + + ++ENL + +FK+ K
Sbjct: 698 DKSFDEFKNGKNKDFSKAEETLKALKGSVKDLGINPEWISKVENLNAALNEFKNGKNK 755


9  CG09_RS03720  CG09_RS03770  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
CG09_RS03720  2  24  -0.717055  sugar transferase
CG09_RS03725  3  23  -0.834428  pyridoxal phosphate-dependent enzyme apparently
CG09_RS03730  1  23  -1.261882  acetyltransferase
CG09_RS03735  1  23  -2.656938  sugar transferase
CG09_RS03740  2  23  -2.766886  hypothetical protein
CG09_RS03745  2  23  -3.302026  glycosyl transferase family 1
CG09_RS03750  2  21  -2.311007  asparagine synthetase B
CG09_RS03755  1  22  -2.915592  capsule biosynthesis protein CapM
CG09_RS03760  2  23  -3.482406  glycosyl transferase
CG09_RS03765  1  22  -2.643495  glycosyl transferase family 2
CG09_RS03770  1  23  -3.001268  lipopolysaccharide biosynthesis protein
10  CG09_RS04220  CG09_RS04300  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CG09_RS04220  -2  20  3.681317  hypothetical protein
CG09_RS04225  -1  18  3.407267  acetyl-CoA carboxylase carboxyltransferase
CG09_RS04235  -1  17  2.477804  glutamate--tRNA ligase
CG09_RS04240  0  20  1.954045  auxin-regulated protein
CG09_RS04245  -1  19  2.121278  aminoacyl-tRNA hydrolase
CG09_RS04250  1  15  2.516532  hypothetical protein
CG09_RS04255  0  13  2.695235  hypothetical protein
CG09_RS04260  0  15  3.318545  transposase
CG09_RS04265  -1  14  2.821829  hypothetical protein
CG09_RS04270  -2  8  0.398489  membrane protein
CG09_RS04275  -1  9  -0.129864  hypothetical protein
CG09_RS04280  2  11  -1.992581  carbonic anhydrase
CG09_RS04285  4  12  -3.418110  serine acetyltransferase
CG09_RS04290  3  14  -3.640909  glycosyl transferase
CG09_RS04295  3  14  -3.767652  hypothetical protein
CG09_RS04300  1  14  -3.366131  O-acyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CG09_RS04340  adhesinb  32  0.002  Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 32.1 bits (73), Expect = 0.002
Identities = 15/86 (17%), Positives = 37/86 (43%), Gaps = 12/86 (13%)

Query: 208 EAKGKWELNPKQ---IEALKENLDYIKKRNIPYILVQAPITKKLYE--SKTNNKS----- 257
+ WE+N ++ + +K ++ ++K +P + V++ + + + SK N
Sbjct: 219 PSAYIWEINTEEEGTPDQIKTLVEKLRKTKVPSLFVESSVDDRPMKTVSKDTNIPIYAKI 278

Query: 258 -VDSLLSTMGTYKNFYGELPLN-DTI 281
DS+ ++Y + N + I
Sbjct: 279 FTDSVAEKGEEGDSYYSMMKYNLEKI 304


11  CG09_RS04755  CG09_RS04845  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CG09_RS04755  3  14  -0.008298  hypothetical protein
CG09_RS04760  3  11  -1.277967  TonB family protein
CG09_RS04765  3  19  0.083446  biopolymer transporter ExbD
CG09_RS04775  3  20  0.431209  biopolymer transporter ExbD
CG09_RS04780  2  19  0.860864  flagellar motor protein MotA
CG09_RS04785  2  20  1.233943  leucine--tRNA ligase
CG09_RS04790  0  19  1.233943  glycosyl transferase family 2
CG09_RS04795  0  16  1.246932  protease
CG09_RS04800  0  15  0.703239  collagenase
CG09_RS04810  0  14  0.636393  4Fe-4S ferredoxin
CG09_RS04815  0  14  0.206316  5-formyltetrahydrofolate cyclo-ligase
CG09_RS04820  2  19  -0.458177  hypothetical protein
CG09_RS04825  2  15  -1.950807  hypothetical protein
CG09_RS04830  4  16  -2.582145  serine--tRNA ligase
CG09_RS04840  3  21  -2.967086  hypothetical protein
CG09_RS04845  3  17  -2.712430  tRNA(Ile)-lysidine synthetase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CG09_RS04795  PF03544  42  1e-06  Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 41.9 bits (98), Expect = 1e-06
Identities = 38/190 (20%), Positives = 62/190 (32%), Gaps = 11/190 (5%)

Query: 86 ILEQPKEETPPPPPPPKVEEEKIEIIQNVVPEPVKAPTVETPPPPISKQLETTTGLVNQE 145
++ E P PP + E +PEP K V P + V +
Sbjct: 54 MVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPK--PKPKPVKKV 111

Query: 146 GVKKPSYAPPPPPPSTGKGTTVEVKPQVSTTEVYTTVDQEAEFSGGGINGFRSAFQESFD 205
K P P++ T +P ST + G +
Sbjct: 112 EQPKRDVKPVESRPASPFENTAPARPTSSTAT--AATSKPVTSVASGPRALSRNQPQYPA 169

Query: 206 TSVMEGDEGTLKAEVTFVVERDGSLSQVKVTGS--NSTFNREAERAVKSIKKKWTPGKVN 263
+ EG +K V F V DG + V++ + + F RE + A++ ++ PGK
Sbjct: 170 RAQALRIEGQVK--VKFDVTPDGRVDNVQILSAKPANMFEREVKNAMRR--WRYEPGKPG 225

Query: 264 GE-PVRSRFR 272
V F+
Sbjct: 226 SGIVVNILFK 235


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CG09_RS04825  V8PROTEASE  67  2e-14  V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 67.3 bits (164), Expect = 2e-14
Identities = 32/167 (19%), Positives = 57/167 (34%), Gaps = 32/167 (19%)

Query: 120 PTGLGSGVIISPDGYIISNNHVVAGASKLEVTLS------------NKKTYVAKLIGSDP 167
T + SGV++ D +++N HVV L N ++
Sbjct: 100 GTFIASGVVVGKD-TLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSG 158

Query: 168 STDIALLKIED--------SGLPYLNFANSDLLEVGQWVVAVGNPLGLNSTVTAGIVSAK 219
D+A++K + +N+ +V Q + G P A + +K
Sbjct: 159 EGDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGDKPV---ATMWESK 215

Query: 220 GRSIDLLRQQSKTPIESFIQTDAVINRGNSGGALVNLSGDLVGINSA 266
G+ L +Q D GNSG + N +++GI+
Sbjct: 216 GKITYL--------KGEAMQYDLSTTGGNSGSPVFNEKNEVIGIHWG 254


12  CG09_RS06030  CG09_RS06095  Y  N  Y  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
CG09_RS06030  0  15  -3.079260  transposase
CG09_RS06035  1  16  -6.147602  hypothetical protein
CG09_RS06040  2  16  -6.270030  quinol:cytochrome C oxidoreductase
CG09_RS06045  1  17  -3.806170  quinol:cytochrome C oxidoreductase
CG09_RS06050  0  17  -1.502096  molybdopterin oxidoreductase
CG09_RS06060  2  20  0.890713  membrane protein
CG09_RS06065  4  22  1.755900  quinol:cytochrome C oxidoreductase
CG09_RS06070  4  23  1.781383  hypothetical protein
CG09_RS06075  4  23  1.014072  hypothetical protein
CG09_RS06080  3  23  1.072650  adenine phosphoribosyltransferase
CG09_RS06085  3  21  0.037630  hypothetical protein
CG09_RS06090  3  20  -0.086734  6,7-dimethyl-8-ribityllumazine synthase
CG09_RS06095  2  19  0.406664  metalloprotease
13  CG09_RS06255  CG09_RS06325  Y  N  Y  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
CG09_RS06255  3  16  -3.819796  hypothetical protein
CG09_RS06260  4  23  -7.369504  hypothetical protein
CG09_RS06265  5  21  -8.661492  transcriptional regulator
CG09_RS06270  6  20  -8.423018  hypothetical protein
CG09_RS06275  5  20  -8.037693  hypothetical protein
CG09_RS06280  6  18  -8.746330  hypothetical protein
CG09_RS06285  7  18  -8.698028  ATPase AAA
CG09_RS06290  7  16  -7.465230  hypothetical protein
CG09_RS06295  6  17  -6.475599  integrase
CG09_RS06300  6  18  -6.774519  *amidohydrolase
CG09_RS06305  6  17  -6.928993  hypothetical protein
CG09_RS06310  3  17  -5.066849  excinuclease ABC subunit B
CG09_RS06315  2  16  -4.435833  Crp/Fnr family transcriptional regulator
CG09_RS06320  2  16  -4.575475  hypothetical protein
CG09_RS06325  0  15  -3.934904  hypothetical protein
14  CG09_RS06895  CG09_RS06960  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CG09_RS06895  3  19  1.115244  acetyltransferase
CG09_RS06900  4  21  1.349935  hypothetical protein
CG09_RS06905  4  19  1.918243  uroporphyrinogen decarboxylase
CG09_RS06910  4  18  2.688697  uroporphyrinogen III methyltransferase
CG09_RS06915  2  17  3.962894  hydroxymethylbilane synthase
CG09_RS06920  1  19  4.625244  hypothetical protein
CG09_RS06925  0  17  4.506220  hypothetical protein
CG09_RS06930  -1  16  2.757300  GLPGLI family protein
CG09_RS06935  -1  13  1.494770  hypothetical protein
CG09_RS06940  -1  13  -0.006977  hypothetical protein
CG09_RS06945  -1  13  -2.870827  hypothetical protein
CG09_RS06950  -1  13  -3.647772  fumarate hydratase
CG09_RS06955  1  13  -4.589394  hypothetical protein
CG09_RS06960  -1  15  -3.470220  peptidase T
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CG09_RS06930  SACTRNSFRASE  28  0.006  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 27.6 bits (61), Expect = 0.006
Identities = 9/33 (27%), Positives = 20/33 (60%)

Query: 37 LIISFVNIFPKFEGRGLGKALIREAISFAREHQ 69
+I + + + +G+G AL+ +AI +A+E+
Sbjct: 90 ALIEDIAVAKDYRKKGVGTALLHKAIEWAKENH 122


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CG09_RS06950  OMPADOMAIN  36  2e-04  OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 35.7 bits (82), Expect = 2e-04
Identities = 24/90 (26%), Positives = 40/90 (44%), Gaps = 8/90 (8%)

Query: 6 NTIKIGTR--NSPLALWQAKEVAAALEQKNYATEIVPIVSSGDKNLTQPLYSLGITGVFT 63
T +IG+ N L+ +A+ V L K + + G+ N P+ V
Sbjct: 260 YTDRIGSDAYNQGLSERRAQSVVDYLISKGIPADKISARGMGESN---PVTGNTCDNVKQ 316

Query: 64 KDLDIALL--NKQIDIAVHSLKDVPTQLPQ 91
+ I L +++++I V +KDV TQ PQ
Sbjct: 317 RAALIDCLAPDRRVEIEVKGIKDVVTQ-PQ 345


15  CG09_RS07020  CG09_RS07235  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CG09_RS07020  2  14  2.514421  Negative regulator of beta-lactamase expression
CG09_RS07025  3  16  2.719539  hypothetical protein
CG09_RS07030  3  16  2.961753  hypothetical protein
CG09_RS07035  4  15  3.106156  hypothetical protein
CG09_RS07040  5  17  3.308618  hypothetical protein
CG09_RS07045  5  19  3.491676  hypothetical protein
CG09_RS07050  5  24  2.951346  hypothetical protein
CG09_RS07055  2  21  3.453316  hypothetical protein
CG09_RS07065  1  25  4.129898  hypothetical protein
CG09_RS07070  5  28  5.279827  hypothetical protein
CG09_RS07075  5  30  5.720796  hypothetical protein
CG09_RS07080  6  32  5.925542  hypothetical protein
CG09_RS07090  6  32  3.986240  hypothetical protein
CG09_RS07095  6  30  3.914202  hypothetical protein
CG09_RS07100  3  23  1.571272  hypothetical protein
CG09_RS07105  3  23  1.302374  hypothetical protein
CG09_RS07110  1  20  -1.368872  hypothetical protein
CG09_RS07115  2  21  -1.205284  hypothetical protein
CG09_RS07120  1  19  -1.560735  hypothetical protein
CG09_RS07125  2  19  -3.023172  hypothetical protein
CG09_RS07130  4  24  -0.604636  hypothetical protein
CG09_RS07140  4  25  -1.255882  hypothetical protein
CG09_RS07145  4  27  -0.859766  hypothetical protein
CG09_RS07150  3  28  1.775619  hypothetical protein
CG09_RS07155  4  28  1.131969  hypothetical protein
CG09_RS07160  5  29  0.571531  hypothetical protein
CG09_RS07165  3  31  2.125216  hypothetical protein
CG09_RS07170  3  31  2.432610  hypothetical protein
CG09_RS07175  4  30  2.627432  Fe-S oxidoreductase
CG09_RS07180  5  29  3.003148  catalase/peroxidase HPI
CG09_RS07185  4  30  4.008632  GNAT family acetyltransferase
CG09_RS07190  4  28  3.591205  rRNA methyltransferase
CG09_RS07195  4  23  2.509876  DNA recombination protein RmuC
CG09_RS07200  4  18  0.164464  nicotinate phosphoribosyltransferase
CG09_RS07205  4  17  -0.634204  DNA starvation/stationary phase protection
CG09_RS07210  1  17  0.849223  hypothetical protein
CG09_RS07220  -1  14  1.234081  hypothetical protein
CG09_RS07225  -1  16  2.040512  hypothetical protein
CG09_RS07230  1  22  3.987078  transposase
CG09_RS07235  0  23  3.980657  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CG09_RS07090  HTHTETR  28  0.015  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 28.1 bits (62), Expect = 0.015
Identities = 16/81 (19%), Positives = 35/81 (43%), Gaps = 5/81 (6%)

Query: 1 MAKKGRLSNKEREQK-KEYAKILFLQE--KNITIKDLAERVGVSVNTLSEWIKAEKWEGL 57
MA+K + +E Q + A LF Q+ + ++ ++A+ GV+ + K + L
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDK--SDL 58

Query: 58 RRNILLTRQEQLVQMQDELAE 78
I + + +++ E
Sbjct: 59 FSEIWELSESNIGELELEYQA 79


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CG09_RS07120  TYPE3OMGPROT  29  0.009  Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 28.7 bits (64), Expect = 0.009
Identities = 18/105 (17%), Positives = 40/105 (38%), Gaps = 13/105 (12%)

Query: 50 DTFHDVLQQFNAEFTLSLVVKAGDTSSLTEENRREQAMAYLDLSEKIYSKLQGYEDGHF- 108
++ D+L F A + ++VV ++ + + +L +Y+ L Y DG+
Sbjct: 43 ESLRDLLTDFGANYDATVVVSDKINDKVSGQFEHDNPQDFLQHIASLYN-LVWYYDGNVL 101

Query: 109 ----------ETFTFQSASEPNLRKGLKIVAL-RYSCGWKQEATK 142
Q + L++ L+ + GW+ +A+
Sbjct: 102 YIFKNSEVASRLIRLQESEAAELKQALQRSGIWEPRFGWRPDASN 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CG09_RS07205  FbpA_PF05833  31  0.022  Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 30.6 bits (69), Expect = 0.022
Identities = 15/92 (16%), Positives = 33/92 (35%), Gaps = 7/92 (7%)

Query: 563 NQNGEKYFVAYAEPKRSFEVVPKLMPDGEKEQAQKDMQVAELEYQRDLQLLRDLEKKTGI 622
+QN + Y+ Y + K+S E + + E+E + + + + +++
Sbjct: 380 SQNVQSYYKKYNKLKKSEEAANEQLLQNEEELNYLYSVLTNINNADNYDEIEEIK----- 434

Query: 623 STERLIAEQEMAVKTQNINSKKLNIKADRGES 654
+ LI + K + K K S
Sbjct: 435 --KELIETGYIKFKKIYKSKKSKTSKPMHFIS 464


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CG09_RS07240  SACTRNSFRASE  51  1e-10  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 51.1 bits (122), Expect = 1e-10
Identities = 23/96 (23%), Positives = 41/96 (42%), Gaps = 7/96 (7%)

Query: 53 EELQDEFSEFYFARVDGVLAGYLKLNFGVSQTELKDPKAIEIERIYVLKAFQGKRVGQAL 112
+++E + ++ G +K+ + L IE I V K ++ K VG AL
Sbjct: 58 SYVEEEGKAAFLYYLENNCIGRIKIRSNWNGYAL-------IEDIAVAKDYRKKGVGTAL 110

Query: 113 YEHALQLARDRGVDYIWLGVWEQNHKAIRFYEKNGF 148
A++ A++ + L + N A FY K+ F
Sbjct: 111 LHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CG09_RS07250  TYPE4SSCAGA  32  0.010  Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 31.6 bits (71), Expect = 0.010
Identities = 42/174 (24%), Positives = 76/174 (43%), Gaps = 18/174 (10%)

Query: 48 LKNWISKTKDL-EQSLNLEKEHYKAKTTENESLKDSLSKTSATLETAHSQVEELKTQLQT 106
+K+++S K+L ++LN K AK T N D + K LE + + E L+ +++
Sbjct: 574 IKDFLSSNKELVGKTLNFNKAVADAKNTGN---YDEVKKAQKDLEKSLRKREHLEKEVEK 630

Query: 107 QTLNLTQLQEKNQHYYAKISELSAKNETLEQSLVNQKKEIQELQEATKLQFENIANKILE 166
L AK S K+E +L+N++ ++A + + I
Sbjct: 631 ---KLESKSGNKNKMEAKAQANSQKDEIF--ALINKEAN----RDARAIAYAQNLKGIKR 681

Query: 167 EKTEKFTSLNKENLGHILKPFQEKITELKNTVHETYDKEAKERFSLGAKVKELA 220
E ++K ++NK LK F + E KN ++ + K + +L VK+L
Sbjct: 682 ELSDKLENVNKN-----LKDFDKSFDEFKNGKNKDFSKAEETLKALKGSVKDLG 730


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CG09_RS07260  HELNAPAPROT  151  8e-50  Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 151 bits (382), Expect = 8e-50
Identities = 47/140 (33%), Positives = 75/140 (53%)

Query: 17 ITEKLNILLANYSIFYQNTRGAHWNIKGADFFTLHPKFEELYDSLVLKIDEIAERILTLG 76
+ LN L+N+ + Y HW +KG FFTLH KFEELYD +D IAER+L +G
Sbjct: 13 VENSLNTQLSNWFLLYSKLHRFHWYVKGPHFFTLHEKFEELYDHAAETVDTIAERLLAIG 72

Query: 77 ATPNHNYSDYLKVSSIKESKEVTDGNKCVEQILEAFKIVIDLQREILEIAGEAGDEGTNS 136
P +Y + +SI + T ++ V+ ++ +K + + ++ +A E D T
Sbjct: 73 GQPVATVKEYTEHASITDGGNETSASEMVQALVNDYKQISSESKFVIGLAEENQDNATAD 132

Query: 137 QMSDYIKEQEKEVWMYNAFL 156
I+E EK+VWM +++L
Sbjct: 133 LFVGLIEEVEKQVWMLSSYL 152


16  CG09_RS07295  CG09_RS07340  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
CG09_RS07295  2  24  0.800923  GLPGLI family protein
CG09_RS07300  0  21  4.126088  GLPGLI family protein
CG09_RS07305  1  26  5.818093  DUF5103 domain-containing protein
CG09_RS07310  0  25  5.430945  hypothetical protein
CG09_RS07315  0  23  5.350462  hypothetical protein
CG09_RS07320  4  27  5.672513  gamma carbonic anhydrase family protein
CG09_RS07330  3  25  0.328986  GCN5 family acetyltransferase
CG09_RS07335  2  26  -0.535247  ATPase AAA
CG09_RS07340  2  26  -0.476048  aminoglycoside phosphotransferase
17  CG09_RS07520  CG09_RS07600  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
CG09_RS07520  0  12  -3.228445  glycosyl transferase family 2
CG09_RS07525  1  17  -4.564212  exopolysaccharide biosynthesis protein
CG09_RS07530  1  18  -5.295442  polysaccharide biosynthesis protein
CG09_RS07540  1  16  -5.520991  nucleotidyltransferase
CG09_RS07545  0  17  -5.208854  hypothetical protein
CG09_RS07550  2  18  -5.143807  preprotein translocase
CG09_RS07555  3  15  -5.213297  protein translocase TatA
CG09_RS07560  3  13  -4.468942  hypothetical protein
CG09_RS07565  3  16  -3.778008  hypothetical protein
CG09_RS07570  1  16  -3.716489  GTP cyclohydrolase II
CG09_RS07575  -1  14  -2.281062  glycoside hydrolase
CG09_RS07580  -2  13  -1.544134  N-acetyl-alpha-D-glucosaminyl L-malate synthase
CG09_RS07585  0  17  -0.707800  hypothetical protein
CG09_RS07590  5  17  -1.034648  hypothetical protein
CG09_RS07600  8  20  -0.385779  hypothetical protein
18  CG09_RS07870  CG09_RS07970  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CG09_RS07870  3  20  -1.329293  thrombospondin
CG09_RS07875  4  23  0.119584  hypothetical protein
CG09_RS07900  4  25  0.915001  hypothetical protein
CG09_RS07905  7  31  1.113307  hypothetical protein
CG09_RS07915  7  30  0.120711  hypothetical protein
CG09_RS07920  3  25  1.297001  DNA-directed RNA polymerase subunit beta
CG09_RS07925  3  24  1.106229  DNA-directed RNA polymerase subunit beta'
CG09_RS07930  1  24  -0.502032  hypothetical protein
CG09_RS07935  1  25  -0.730364  TonB-dependent receptor
CG09_RS07940  2  28  -0.442741  ABC transporter
CG09_RS07945  1  20  0.940812  hypothetical protein
CG09_RS07950  5  26  1.591114  uracil phosphoribosyltransferase
CG09_RS07955  5  27  1.786223  enoyl-CoA hydratase
CG09_RS07960  4  23  1.826506  hypothetical protein
CG09_RS07970  4  22  1.701873  guanylate kinase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CG09_RS07945  BINARYTOXINB  30  0.022  Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 30.0 bits (67), Expect = 0.022
Identities = 10/15 (66%), Positives = 11/15 (73%)

Query: 34 DLDSDNDGIPDSIEK 48
D DNDGIPDS+E
Sbjct: 204 VPDRDNDGIPDSLEV 218


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CG09_RS07975  RTXTOXIND  33  0.009  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 33.3 bits (76), Expect = 0.009
Identities = 13/58 (22%), Positives = 28/58 (48%), Gaps = 10/58 (17%)

Query: 1155 AVVTEID-GVVSYGKIK-RGNRELIVESKSGEIKKYLVKLSNQILVQENDFVRAGSPL 1210
+V+ +++ + GK+ G + I ++ +K +I+V+E + VR G L
Sbjct: 75 SVLGQVEIVATANGKLTHSGRSKEIKPIENSIVK--------EIIVKEGESVRKGDVL 124


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CG09_RS07985  ECOLNEIPORIN  31  0.022  E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 30.5 bits (69), Expect = 0.022
Identities = 17/89 (19%), Positives = 33/89 (37%), Gaps = 10/89 (11%)

Query: 639 GFHQYNAKIGYKKE---------FGKWSLDAYVAGNNLTNRVNYAFLFVGNAIGDTDLGN 689
G +KIG+K + + A +AG + +F+ + G +G
Sbjct: 52 GIVDLGSKIGFKGQEDLGNGLKAIWQVEQKASIAGTDSGWGNRQSFIGLKGGFGKLRVGR 111

Query: 690 GY-PVGVTTDVNPGPARAYFFGGTTIKYS 717
+ T D+NP +++ + G I
Sbjct: 112 LNSVLKDTGDINPWDSKSDYLGVNKIAEP 140


19  CG09_RS08025  CG09_RS08165  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CG09_RS08025  2  16  0.202871  uracil-DNA glycosylase
CG09_RS08030  -1  19  -3.600186  phospholipase D
CG09_RS08035  -3  17  -4.264280  WYL domain-containing protein
CG09_RS08040  -3  16  -4.167797  **hypothetical protein
CG09_RS08045  -3  15  -3.232622  RNA polymerase sigma-54 factor
CG09_RS08050  -3  15  -3.068771  4-hydroxybutyrate CoA-transferase
CG09_RS08055  -2  14  -3.226416  homogentisate 1,2-dioxygenase
CG09_RS08060  1  16  -1.025356  hypothetical protein
CG09_RS08065  1  21  0.157584  hypothetical protein
CG09_RS08070  -1  19  0.426491  hypothetical protein
CG09_RS08075  0  16  0.611589  hypothetical protein
CG09_RS08080  -1  15  1.597602  hypothetical protein
CG09_RS08110  -1  19  3.130783  hypothetical protein
CG09_RS08115  -1  18  3.263997  hypothetical protein
CG09_RS08120  0  17  4.048863  hypothetical protein
CG09_RS08125  1  18  5.258867  alanine--tRNA ligase
CG09_RS08130  3  22  6.131350  sugar kinase
CG09_RS08140  -1  29  6.396729  gliding motility lipoprotein GldD
CG09_RS08145  -2  29  4.779008  A/G-specific adenine glycosylase
CG09_RS08150  1  34  5.143342  integration host factor subunit beta
CG09_RS08155  0  30  4.625699  ribonuclease E/G
CG09_RS08160  0  23  3.805993  metalloprotease
CG09_RS08165  1  20  3.201454  carbohydrate-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CG09_RS08210  DNABINDINGHU  86  7e-26  Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 85.9 bits (213), Expect = 7e-26
Identities = 29/89 (32%), Positives = 47/89 (52%)

Query: 2 TKAELVNTISNKLGIEKNDTQKVIEAFMQEIRGSLYEGNNVYLRGFGSFIIKTRAAKTGR 61
K +L+ ++ + K D+ ++A + L +G V L GFG+F ++ RAA+ GR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 62 NISKNTAIEIPAHNIPAFKPSKSFAERVK 90
N I+I A +PAFK K+ + VK
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAVK 91


20  CG09_RS08350  CG09_RS08455  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CG09_RS08350  1       24       5.235044   transposase
CG09_RS08355  1       26       6.083700   hypothetical protein
CG09_RS08360  1       27       6.227791   hypothetical protein
CG09_RS08365  2       27       5.689876   hypothetical protein
CG09_RS08370  3       30       6.537035   phage shock protein A
CG09_RS08375  6       38       9.380386   hypothetical protein
CG09_RS08380  6       37       8.814847   DNA ligase (NAD(+)) LigA
CG09_RS08385  7       37       8.724370   hypothetical protein
CG09_RS08390  4       37       8.273691   TonB-dependent receptor
CG09_RS08395  3       33       7.796840   MFS transporter
CG09_RS08400  2       33       7.341110   thioredoxin
CG09_RS08405  2       26       3.276031   porin
CG09_RS08410  3       28       3.228152   ribonuclease N1
CG09_RS08415  3       28       2.966829   membrane protein
CG09_RS08425  1       20       -1.363652  hypothetical protein
CG09_RS08430  -1      17       1.806175   hypothetical protein
CG09_RS08435  -1      13       1.680221   hypothetical protein
CG09_RS08440  -1      13       2.021210   potassium transporter Kup
CG09_RS08445  -1      13       2.582637   transcriptional repressor
CG09_RS08450  -1      13       3.045311   organic solvent tolerance protein OstA
CG09_RS08455  -1      12       3.028437   aspartate aminotransferase family protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CG09_RS08460  TCRTETB       33     0.002    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 33.3 bits (76), Expect = 0.002
Identities = 26/126 (20%), Positives = 49/126 (38%), Gaps = 4/126 (3%)

Query: 48 LSIDPAQASLIYGYFTGFVYFTPLIGGWLADKFLGQRLSITIGGVLMMLGQFTLFAINTH 107
+ PA + + F + G L+D+ +RL + G ++ G F ++
Sbjct: 44 FNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL-FGIIINCFGSVIGFVGHSF 102

Query: 108 FGLYI-GLLLLIIGNGFFKPNISVLVGNLYEEGDERRDSAFSIFYMGINLGALIAPLVIG 166
F L I + G F + V+V + E R AF + + +G + P + G
Sbjct: 103 FSLLIMARFIQGAGAAAFPALVMVVVARYIPK--ENRGKAFGLIGSIVAMGEGVGPAIGG 160

Query: 167 VLTDDI 172
++ I
Sbjct: 161 MIAHYI 166


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CG09_RS08470  56KDTSANTIGN  29     0.034    Rickettsia 56kDa type-specific antigen protein sign...
		>56KDTSANTIGN#Rickettsia 56kDa type-specific antigen protein signature.

Length = 533

Score = 29.2 bits (65), Expect = 0.034
Identities = 21/77 (27%), Positives = 32/77 (41%), Gaps = 12/77 (15%)

Query: 43 QLHLDALPYYNFGKGVGITSPDSLFQFNIRFRMQNRLEADFNDNRSTEYKAAIRRLRLRF 102
Q++ D P F GI PD+ + N + ++ E + LR F
Sbjct: 270 QIYSDIKP---FADIAGINVPDT--------GLPNSASIEQIQSKIQELGDTLEELRDSF 318

Query: 103 DGYVGNPRFLYAIQLSF 119
DGY+ N F+ I L+F
Sbjct: 319 DGYINNA-FVNQIHLNF 334


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CG09_RS08515  DPTHRIATOXIN  29     0.037    Diphtheria toxin signature.
		>DPTHRIATOXIN#Diphtheria toxin signature.

Length = 567

Score = 28.9 bits (64), Expect = 0.037
Identities = 21/70 (30%), Positives = 36/70 (51%), Gaps = 8/70 (11%)

Query: 261 EIMQSLSHSPKLGHITTFGG-NPLIAAASYA----TLKEVLESELMEEVEEKEALFRQLL 315
E Q+ P+L + T G NP+ A A+YA + +V++SE + +E+ A L
Sbjct: 281 EFHQTALEHPELSELKTVTGTNPVFAGANYAAWAVNVAQVIDSETADNLEKTTA---ALS 337

Query: 316 VHPKIKNING 325
+ P I ++ G
Sbjct: 338 ILPGIGSVMG 347


21  CG09_RS08520  CG09_RS08555  Y  N  N  Genomic Island
LocusTag      DNBias  CDNBias  %GCBias    Product
CG09_RS08520  2       15       -0.489692  tRNA pseudouridine(38-40) synthase TruA
CG09_RS08525  1       17       1.348081   xenobiotic ABC transporter ATP-binding protein
CG09_RS08535  -2      22       5.367351   multidrug ABC transporter ATP-binding protein
CG09_RS08540  -2      23       5.450537   phenylalanine--tRNA ligase subunit alpha
CG09_RS08545  -2      24       5.270154   hypothetical protein
CG09_RS08550  -1      23       4.909341   coproporphyrinogen III oxidase
CG09_RS08555  -1      20       4.514862   membrane protein
22  CG09_RS08615  CG09_RS08640  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CG09_RS08615  3       19       -1.249824  TonB-dependent receptor
CG09_RS08620  4       23       -0.839643  DNA-binding response regulator
CG09_RS08625  5       23       -0.652761  polyisoprenoid-binding protein
CG09_RS08630  5       24       -0.854409  transcription-repair coupling factor
CG09_RS08635  3       20       -0.581043  peptide deformylase
CG09_RS08640  2       15       -0.986254  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CG09_RS08670  HTHFIS        83     1e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 83.3 bits (206), Expect = 1e-20
Identities = 32/142 (22%), Positives = 62/142 (43%), Gaps = 3/142 (2%)

Query: 1 MKGKRILLIDDEPDILEIISYNLKKEGYEVHTANNGNEGIEKAKEILPHLILLDVMMPDK 60
M G IL+ DD+ I +++ L + GY+V +N L++ DV+MPD+
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 DGIETCQEIRKIATLEGCLIVFLSARGEEFSQLAGYQAGANDYIVKLIKPKVLISKV-NA 119
+ + I+K ++ +SA+ + + + GA DY+ K LI + A
Sbjct: 61 NAFDLLPRIKKA--RPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA 118

Query: 120 LMNLTQQVEEKNKILKIGHLII 141
L ++ + + G ++
Sbjct: 119 LAEPKRRPSKLEDDSQDGMPLV 140


23  CG09_RS08685  CG09_RS08780  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CG09_RS08685  -1      19       3.129299   proteinase inhibitor
CG09_RS08690  4       27       2.419921   hypothetical protein
CG09_RS08695  2       22       2.111018   ABC transporter
CG09_RS08700  0       20       0.450605   nitrous oxide reductase maturation protein nosd
CG09_RS08705  -1      18       0.041568   hypothetical protein
CG09_RS08710  -2      15       -0.253304  hypothetical protein
CG09_RS08715  -1      15       -0.878274  nitrous-oxide reductase
CG09_RS08720  -2      15       -2.196052  cytochrome c
CG09_RS08725  -1      16       -2.152794  cytochrome c
CG09_RS08730  -2      15       -1.580763  hypothetical protein
CG09_RS08740  -1      16       -3.494625  hypothetical protein
CG09_RS08745  0       16       -3.445614  Crp/Fnr family transcriptional regulator
CG09_RS08750  0       19       -3.051051  iron-sulfur cluster repair di-iron protein
CG09_RS08755  1       19       -3.566373  Rrf2 family transcriptional regulator
CG09_RS08765  1       17       -0.876632  hypothetical protein
CG09_RS08770  1       19       -0.590102  thiamine biosynthesis protein ApbE
CG09_RS08780  2       18       0.585455   nitric oxide synthase
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CG09_RS08755  cdtoxina      30     0.004    Cytolethal distending toxin A signature.
		>cdtoxina#Cytolethal distending toxin A signature.

Length = 258

Score = 30.1 bits (67), Expect = 0.004
Identities = 11/27 (40%), Positives = 15/27 (55%)

Query: 5 IILTGLLSLGLLTGCNSQKQNDMNEPK 31
I + G+L LL GC+S K +PK
Sbjct: 8 IFIAGILIPILLNGCSSGKNKAYLDPK 34


24  CG09_RS09160  CG09_RS09235  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CG09_RS09160  2       18       1.527915   DNA mismatch repair protein
CG09_RS09165  3       18       1.915609   DNA-binding protein
CG09_RS09170  2       18       2.320324   hypothetical protein
CG09_RS09175  1       17       2.580520   hypothetical protein
CG09_RS09180  2       16       2.875316   hypothetical protein
CG09_RS09185  3       17       2.641260   hypothetical protein
CG09_RS09190  4       19       3.704136   hypothetical protein
CG09_RS09195  3       17       3.179111   hypothetical protein
CG09_RS09200  3       19       3.246434   MoxR-like ATPase
CG09_RS09205  4       20       3.452825   hypothetical protein
CG09_RS09210  4       15       2.874188   hypothetical protein
CG09_RS09215  3       13       2.716512   hypothetical protein
CG09_RS09220  1       15       1.935664   hypothetical protein
CG09_RS09225  1       14       2.197033   hypothetical protein
CG09_RS09230  2       14       1.647742   hypothetical protein
CG09_RS09235  2       14       1.285541   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CG09_RS09270  HTHFIS        33     0.002    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 32.9 bits (75), Expect = 0.002
Identities = 33/153 (21%), Positives = 61/153 (39%), Gaps = 17/153 (11%)

Query: 32 GKVVIGQS----YMVDRLLVGLLGNGHILLEGVPGLAKTL---AIKTLSEALQGQFSRIQ 84
G ++G+S + L + + +++ G G K L A+ + G F I
Sbjct: 136 GMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAIN 195

Query: 85 FTPDLLPADVVGTMIYNVKENDFSIKKGPVFANFVLA-------DEINRAPAKVQSALLE 137
+P D++ + ++ ++ F+ + F A DEI P Q+ LL
Sbjct: 196 MAA--IPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLR 253

Query: 138 VMQEKQVT-IGDETLQLPKPFLVMATQNPIDQE 169
V+Q+ + T +G T +V AT + Q
Sbjct: 254 VLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQS 286


25  CG09_RS09325  CG09_RS09795  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CG09_RS09325  -3      14       3.344790   HDIG domain-containing protein
CG09_RS09330  -2      14       3.743509   aconitate hydratase
CG09_RS09335  -1      16       2.834957   GLPGLI family protein
CG09_RS09340  -1      16       2.411972   hypothetical protein
CG09_RS09345  0       17       2.298277   hypothetical protein
CG09_RS09350  0       18       1.416349   class A beta-lactamase
CG09_RS09355  0       17       0.406729   aminoglycoside 6-adenylyltransferase AadS
CG09_RS09360  0       17       -1.636700  type B chloramphenicol O-acetyltransferase
CG09_RS09370  2       12       -2.188652  EreD family erythromycin esterase
CG09_RS09375  2       12       -2.804978  dihydrofolate reductase
CG09_RS09380  1       13       -1.172278  type B chloramphenicol O-acetyltransferase
CG09_RS09390  -1      13       2.662620   class D beta-lactamase
CG09_RS09395  -1      16       3.185427   TetX family tetracycline inactivation enzyme
CG09_RS09400  -2      15       1.746905   hypothetical protein
CG09_RS09405  -1      16       1.848732   esterase
CG09_RS09410  -2      14       0.600074   cation transporter
CG09_RS09420  3       21       -3.296916  GLPGLI family protein
CG09_RS09425  4       21       -3.112363  hypothetical protein
CG09_RS09435  9       19       -4.332784  mercury transporter
CG09_RS09440  9       19       -2.891751  AraC family transcriptional regulator
CG09_RS09445  10      21       -2.864828  hypothetical protein
CG09_RS09450  11      24       -3.405588  toxin RelE
CG09_RS09455  10      23       -2.035854  transcriptional regulator
CG09_RS09460  10      22       -3.271615  cation transporter
CG09_RS09465  9       23       -2.940238  toxin RelE
CG09_RS09470  9       24       -3.251677  transcriptional regulator
CG09_RS09475  9       25       -4.360641  hypothetical protein
CG09_RS09480  9       24       -3.273286  hypothetical protein
CG09_RS09485  10      26       -3.735221  hypothetical protein
CG09_RS09490  12      27       -2.308866  restriction endonuclease subunit M
CG09_RS09495  12      24       -1.693518  type II restriction endonuclease MboI
CG09_RS09500  11      22       -0.124013  site-specific DNA-methyltransferase
CG09_RS09505  12      24       1.023295   GLPGLI family protein
CG09_RS09510  12      24       1.109240   esterase
CG09_RS09520  12      26       -0.087506  hypothetical protein
CG09_RS09525  11      29       -0.426626  hypothetical protein
CG09_RS09530  14      38       -3.273367  hypothetical protein
CG09_RS09535  14      34       -3.805108  hypothetical protein
CG09_RS09540  14      34       -3.520657  hypothetical protein
CG09_RS09545  12      29       -5.455173  membrane protein
CG09_RS09550  14      31       -4.799531  hypothetical protein
CG09_RS09555  13      27       -4.464511  twitching motility protein PilT
CG09_RS09565  12      26       -3.724361  hypothetical protein
CG09_RS09570  8       23       -4.418235  hypothetical protein
CG09_RS09575  9       26       -3.694511  prevent-host-death protein
CG09_RS09580  9       26       -3.889537  toxin YoeB
CG09_RS09585  7       26       -3.841952  hypothetical protein
CG09_RS09590  7       25       -3.693472  twitching motility protein PilT
CG09_RS09595  7       24       -4.918019  hypothetical protein
CG09_RS09605  11      25       -6.177092  hypothetical protein
CG09_RS09610  10      26       -7.340831  restriction endonuclease
CG09_RS09615  10      28       -7.306081  hypothetical protein
CG09_RS09620  8       27       -6.721562  hypothetical protein
CG09_RS09625  6       26       -6.944990  DNA-binding protein
CG09_RS09630  8       29       -6.949582  hypothetical protein
CG09_RS09635  6       28       -7.327205  hypothetical protein
CG09_RS09640  6       31       -5.441940  hypothetical protein
CG09_RS09645  3       28       -5.387712  hypothetical protein
CG09_RS09650  6       28       -5.688794  hypothetical protein
CG09_RS09655  7       26       -5.277839  hypothetical protein
CG09_RS09660  6       25       -5.376488  GLPGLI family protein
CG09_RS09665  7       25       -4.276875  hypothetical protein
CG09_RS09670  7       22       -5.128430  hypothetical protein
CG09_RS09675  10      20       -4.810273  hypothetical protein
CG09_RS09680  8       21       -4.652776  hypothetical protein
CG09_RS09685  8       21       -5.172585  hypothetical protein
CG09_RS09690  6       20       -5.610708  hypothetical protein
CG09_RS09695  5       21       -6.232217  ATP-dependent Clp protease proteolytic subunit
CG09_RS09705  3       21       -6.306492  DNA primase
CG09_RS09710  2       23       -6.994246  alkaline phosphatase
CG09_RS09715  3       23       -7.377810  peptide chain release factor 2
CG09_RS09720  5       22       -6.410454  MATE family efflux transporter
CG09_RS09725  7       25       -5.878597  YggS family pyridoxal phosphate enzyme
CG09_RS09730  7       21       -6.984016  large-conductance mechanosensitive channel
CG09_RS09735  7       21       -4.191806  prolipoprotein diacylglyceryl transferase
CG09_RS09740  7       21       -3.022375  membrane protein insertion efficiency factor
CG09_RS09745  7       21       -3.017149  ATPase AAA
CG09_RS09750  5       19       -3.189178  threonylcarbamoyl-AMP synthase
CG09_RS09755  4       16       -2.951860  1-acyl-sn-glycerol-3-phosphate acyltransferase
CG09_RS09760  3       16       -1.589495  coenzyme A pyrophosphatase
CG09_RS09765  4       17       -3.947646  potassium transporter TrkA
CG09_RS09775  5       16       -4.930357  ATPase
CG09_RS09780  6       14       -4.579799  prephenate dehydrogenase
CG09_RS09785  4       15       -3.486708  hypothetical protein
CG09_RS09790  3       13       -2.930843  alanine dehydrogenase
CG09_RS09795  2       13       -1.780127  tRNA
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CG09_RS09445  BLACTAMASEA   171    3e-54    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 171 bits (436), Expect = 3e-54
Identities = 60/296 (20%), Positives = 131/296 (44%), Gaps = 17/296 (5%)

Query: 4 LKLIIISFTFLIINSCATVHDNNLKYQIEKIISSK-KGDFGISIIDENNN--IIEINGNK 60
++ I + L+ VH + + K+ S+ G G+ +D + + ++
Sbjct: 1 MRYIRLCIISLLATLPLAVHASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRADE 60

Query: 61 SYPLLSTFKFPIALTILHKVENGELLMQQQIFIKKEELLENTWSPFKEKYPNGNISISLE 120
+P++STFK + +L +V+ G+ ++++I ++++L+ +SP EK+ ++ +
Sbjct: 61 RFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDLV--DYSPVSEKHLADGMT--VG 116

Query: 121 EALHWMIVYSDNNMTDILLRLIGGTNAVEKF---IDDENFVIKNNEDEMHKDWNSQFINK 177
E I SDN+ ++LL +GG + F I D + E E+++ +
Sbjct: 117 ELCAAAITMSDNSAANLLLATVGGPAGLTAFLRQIGDNVTRLDRWETELNEALPGDARDT 176

Query: 178 STPNSFTKLLKNFSEGKMLNSENTKWLYESMVNSKTGVKRLKGKLP-NVKIAQRAGTSFT 236
+TP S L+ + L++ + + L + MV+ + ++ LP IA + G
Sbjct: 177 TTPASMAATLRKLLTSQRLSARSQRQLLQWMVDDRVAGPLIRSVLPAGWFIADKTGA--- 233

Query: 237 NDDGITGAINNVGIMQLPNNQKIYITVFIHNTSEEFNKGEEIIADIAKTTYEFYTK 292
G GA V ++ N + + +++ +T + + IA I E + +
Sbjct: 234 ---GERGARGIVALLGPNNKAERIVVIYLRDTPASMAERNQQIAGIGAALIEHWQR 286


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CG09_RS09845  MECHCHANNEL   126    6e-41    Bacterial mechano-sensitive ion channel signature.
		>MECHCHANNEL#Bacterial mechano-sensitive ion channel signature.

Length = 136

Score = 126 bits (319), Expect = 6e-41
Identities = 61/132 (46%), Positives = 82/132 (62%), Gaps = 10/132 (7%)

Query: 1 MGFIKEFKEFALRGNVIDLAVGVIIGGAFGKIVNSFVEDVVTPALLSPALEKLG------ 54
M IKEF+EFA+RGNV+DLAVGVIIG AFGKIV+S V D++ P L + +
Sbjct: 1 MSIIKEFREFAMRGNVVDLAVGVIIGAAFGKIVSSLVADIIMP-PLGLLIGGIDFKQFAV 59

Query: 55 --AENIAQLSWNGIKYGSFLSAVISFLCIAFVLFVMIKGINKM-KKEEKVEEAPAGPTQE 111
+ + + YG F+ V FL +AF +F+ IK INK+ +K+E+ APA +E
Sbjct: 60 TLRDAQGDIPAVVMHYGVFIQNVFDFLIVAFAIFMAIKLINKLNRKKEEPAAAPAPTKEE 119

Query: 112 ELLAEIRDLLKK 123
LL EIRDLLK+
Sbjct: 120 VLLTEIRDLLKE 131


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CG09_RS09860  HTHFIS        38     9e-05    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 37.5 bits (87), Expect = 9e-05
Identities = 52/245 (21%), Positives = 95/245 (38%), Gaps = 50/245 (20%)

Query: 31 TIRKMLDTDRLNSLILWGPPGTGKTTLAEILSEQSGRK---FFKL--SAVSSGVKE---- 81
+ +++ TD +L++ G GTGK +A L + R+ F + +A+ + E
Sbjct: 152 VLARLMQTDL--TLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELF 209

Query: 82 --VREVIDDAKKQH--LF---SGKSPILFIDEIHRFNKSQQDSLLHAVEKGWVILIGATT 134
+ A+ + F G + LF+DEI Q LL +++G +G T
Sbjct: 210 GHEKGAFTGAQTRSTGRFEQAEGGT--LFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRT 267

Query: 135 ENPS-FEVVSA-----------------LLSRSQVY--ILKPLSYEKLEELATIA---VA 171
S +V+A L R V L PL ++ E++ + V
Sbjct: 268 PIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLR-DRAEDIPDLVRHFVQ 326

Query: 172 RFNQD--ENTDFTIENNQGFIQYS-GGDARKLINSVEWVLNQYKGSGQNQLSSETILKVL 228
+ ++ + F E + + G+ R+L N V + Y + ++ E I L
Sbjct: 327 QAEKEGLDVKRFDQEALELMKAHPWPGNVRELENLVRRLTALYPQ---DVITREIIENEL 383

Query: 229 QETIP 233
+ IP
Sbjct: 384 RSEIP 388


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CG09_RS09880  NUCEPIMERASE  36     7e-05    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 36.3 bits (84), Expect = 7e-05
Identities = 13/32 (40%), Positives = 22/32 (68%), Gaps = 1/32 (3%)

Query: 1 MKYIIVGLGNF-GASLAQKLTVQGNEVIGIDN 31
MKY++ G F G ++++L G++V+GIDN
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDN 32


26  CG09_RS10015  CG09_RS10070  Y  N  N  Genomic Island
LocusTag      DNBias  CDNBias  %GCBias    Product
CG09_RS10015  2       11       2.032670   phosphohydrolase
CG09_RS10020  2       10       2.466815   acyl-CoA thioesterase
CG09_RS10025  3       10       2.285783   amino acid transporter
CG09_RS10030  0       14       1.731957   T9SS C-terminal target domain-containing
CG09_RS10035  -1      16       1.125194   dipeptidase E
CG09_RS10040  1       17       0.155274   thioesterase
CG09_RS10050  5       22       -7.496854
CG09_RS10060  8       24       -8.304539
CG09_RS10065  3       19       -6.025122
CG09_RS10070  1       14       -3.733201
27  CG09_RS05210  CG09_RS05235  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CG09_RS05210  -2      11       -1.407531  short-chain dehydrogenase
CG09_RS05215  -2      10       -1.195938  aldehyde dehydrogenase
CG09_RS05220  -3      10       -1.031963  cell division protein FtsZ
CG09_RS05225  -3      9        0.275269   cell division protein FtsA
CG09_RS05230  -3      11       0.423644   hypothetical protein
CG09_RS05235  -3      10       0.779885   UDP-N-acetylmuramate--L-alanine ligase
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CG09_RS05235  DHBDHDRGNASE  79     2e-19    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 78.9 bits (194), Expect = 2e-19
Identities = 49/194 (25%), Positives = 90/194 (46%), Gaps = 3/194 (1%)

Query: 2 SKKFQHKNILITGGASGIGKIMARLSLEKGAKVIIWDIDQSKIDETILQFSSLG-SIFGY 60
+K + K ITG A GIG+ +AR +GA + D + K+++ + + +
Sbjct: 3 AKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAF 62

Query: 61 KVDVSNYDEVQHFAIKTKQKIGNVDILINNAGIVVGKYFHEHSQKDILKTIEINTNAPMV 120
DV + + + ++++G +DIL+N AG++ H S ++ T +N+
Sbjct: 63 PADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFN 122

Query: 121 ITNLFLQDMLTQNSGHICNIASSAGLVSNPKMSVYAGSKWAVVGWSDSLRLEMEQLKKNI 180
+ + M+ + SG I + S+ V M+ YA SK A V ++ L LE+ + NI
Sbjct: 123 ASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAE--YNI 180

Query: 181 KVTTIMPYYINTGM 194
+ + P T M
Sbjct: 181 RCNIVSPGSTETDM 194


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CG09_RS05250  SHAPEPROTEIN  62     9e-13    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 62.5 bits (152), Expect = 9e-13
Identities = 56/216 (25%), Positives = 90/216 (41%), Gaps = 23/216 (10%)

Query: 151 KRLEANFHVVVGQMSSIKNIS-RCVKE----AGLEMESLTLEPLASSEAVLTKEEKEAGV 205
+ + V+V + R ++E AG L EP+A++ + G
Sbjct: 102 SFMRPSPRVLVCVPVGATQVERRAIRESAQGAGAREVFLIEEPMAAAIGAGLPVSEATGS 161

Query: 206 AIVDIGGGTTDIAIFKDNIIRHTCVIPYGGGIITEDI------KDGCSIIEKHAEQLKVR 259
+VDIGGGTT++A+ N + ++ + GG E I G I E AE++K
Sbjct: 162 MVVDIGGGTTEVAVISLNGVVYSSSVRIGGDRFDEAIINYVRRNYGSLIGEATAERIKHE 221

Query: 260 FGSAVPELEKESTFVTIPGLHGRTEKEISLKTLAKIIHARVEEILEMVNTELKAYGAHEK 319
GSA P E V GR E + + +E + E + + A +
Sbjct: 222 IGSAYPGDEVREIEV-----RGRNLAEGVPRGFTLNSNEILEALQEPLTGIVSAVMVALE 276

Query: 320 KRK--LIA-----GIVLTGGGSNLKHLRQLANYITG 348
+ L + G+VLTGGG+ L++L +L TG
Sbjct: 277 QCPPELASDISERGMVLTGGGALLRNLDRLLMEETG 312


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CG09_RS05255  TOXICSSTOXIN  29     0.013    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 28.8 bits (64), Expect = 0.013
Identities = 29/149 (19%), Positives = 53/149 (35%), Gaps = 25/149 (16%)

Query: 81 QRVPVFRLSKGKKEFYVDEKGVEFPINRNYSASCMLISGNVQPEEYPQLIE--LVKKINQ 138
P F + K + Y ISG E+ P IE L K++
Sbjct: 91 YYSPAFTKGEKVDLNTKRTKKSQHTSEGTYIHFQ--ISGVTNTEKLPTPIELPLKVKVHG 148

Query: 139 DDFSKKFFIGVVKERENYYLIANEENYRVELGSLENIDFKVKGFKAFVEKYLVYQPSDK- 197
D K+ ++ ++ +DF+++ + + +Y+ SDK
Sbjct: 149 KDSPLKY----------GPKFDKKQL------AISTLDFEIR--HQLTQIHGLYRSSDKT 190

Query: 198 --YTKISLKYDNQIVTTLSKGYKEETYKE 224
Y KI++ + + LSK ++ T K
Sbjct: 191 GGYWKITMNDGSTYQSDLSKKFEYNTEKP 219


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CG09_RS05260  SALSPVBPROT   30     0.024    Salmonella virulence plasmid 65kDa B protein signature.
		>SALSPVBPROT#Salmonella virulence plasmid 65kDa B protein signature.

Length = 591

Score = 29.7 bits (66), Expect = 0.024
Identities = 19/60 (31%), Positives = 30/60 (50%), Gaps = 3/60 (5%)

Query: 160 KDYSVVEADEYDRSFLNLAPDWAIITSTDADHLDIYGDKSTIEKGFRDFAHLVSEERQLF 219
K+Y+ + D++F++ +PD A I T L+IY +K + D AH E LF
Sbjct: 485 KEYTTIGNIIIDKAFMSTSPDKAWINDTI---LNIYLEKGHKGRILGDVAHFKGEAEMLF 541


28  CG09_RS05605  CG09_RS05625  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CG09_RS05605  0       9        -0.653419  hemolysin D
CG09_RS05610  -1      13       -0.171860  multidrug transporter AcrB
CG09_RS05615  -1      14       0.098039   RND transporter
CG09_RS05620  0       15       -0.360461  phosphoglucosamine mutase
CG09_RS05625  0       14       -0.095552  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CG09_RS05625  RTXTOXIND     50     6e-09    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 50.2 bits (120), Expect = 6e-09
Identities = 22/95 (23%), Positives = 41/95 (43%), Gaps = 6/95 (6%)

Query: 59 EVRAQGKGFLDKIYVDEGQYVKAGQVLFRIMPQVYEAELMKTRAEVEQARIEYQNASILA 118
E++ + +I V EG+ V+ G VL ++ EA+ +KT++ + QAR+E
Sbjct: 98 EIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLE--QTRYQI 155

Query: 119 GNNIVSKNE----KALAKAKLDAASAEMRMAQLHL 149
+ + N+ K + S E + L
Sbjct: 156 LSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSL 190



Score = 39.0 bits (91), Expect = 2e-05
Identities = 25/109 (22%), Positives = 51/109 (46%), Gaps = 8/109 (7%)

Query: 75 EGQYVKAGQVLFRIMPQVYEAELMKTRAEVEQARIEYQNASILAGNNIVSKNEKALAKAK 134
E +YV+A L +VY+++L + +E+ A+ EYQ + L N I+ K +
Sbjct: 258 ENKYVEAVNEL-----RVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQT--TDN 310

Query: 135 LDAASAEMRMAQLHLSFTTIRAPFSGIINRIPLK-LGSLIEEGDLLTSL 182
+ + E+ + + IRAP S + ++ + G ++ + L +
Sbjct: 311 IGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVI 359


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CG09_RS05630  ACRIFLAVINRP  912    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 912 bits (2358), Expect = 0.0
Identities = 378/1039 (36%), Positives = 588/1039 (56%), Gaps = 15/1039 (1%)

Query: 1 MFKTFIKRPVLSIVISLIIVFLGVLSLLSLPITQFPSISPPKVNITAEYPGANNELLVKS 60
M FI+RP+ + V+++I++ G L++L LP+ Q+P+I+PP V+++A YPGA+ + + +
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VVIPLEQALNGVQGMKYITSDAGNDGVASIQVVFNLGTDPNLAAVNVQNRVSSAINKLPP 120
V +EQ +NG+ + Y++S + + G +I + F GTDP++A V VQN++ A LP
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 LVVREGVKITREEPNMLMYVNLYSDDPKADQKFLFNYADINILPELRRVNGVGFADILGT 180
V ++G+ + + + LM SD+P Q + +Y N+ L R+NGVG + G
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG- 179

Query: 181 REYAMRIWLKPDRLTAYNISTDEVMEALSSQSLEASPGRTGESSGKRSQAFEYVLKYPGR 240
+YAMRIWL D L Y ++ +V+ L Q+ + + G+ G + Q + R
Sbjct: 180 AQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 241 FDNEKDYGNIIVKANSNGEFVRLKDVADVEFGSSMYDIYSTLNGKPSAAITIKQSYGSNA 300
F N +++G + ++ NS+G VRLKDVA VE G Y++ + +NGKP+A + IK + G+NA
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 301 SEVIKNVKTLLEELNKTSFPKGMHYEISYDVSRFLDASMEKVVHTLFEAFVLVGIVVFIF 360
+ K +K L EL + FP+GM YD + F+ S+ +VV TLFEA +LV +V+++F
Sbjct: 300 LDTAKAIKAKLAEL-QPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLF 358

Query: 361 LGDWRSTLIPALAVPVSLIGAFAVMSSFGITVNMITLFALVMAIGVVVDDAIVVIEAVHA 420
L + R+TLIP +AVPV L+G FA++++FG ++N +T+F +V+AIG++VDDAIVV+E V
Sbjct: 359 LQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVER 418

Query: 421 KMEEKHLSPLEATQEAMGEISGAIIAITLVMAAVFIPVAFMSGPVGVFYRQFSITMASSI 480
M E L P EAT+++M +I GA++ I +V++AVFIP+AF G G YRQFSIT+ S++
Sbjct: 419 VMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAM 478

Query: 481 ILSGIVALTLTPALCALILKNNHGKERKKTPINRFIDGFNRVFAKGTKRYETLLYKTVSK 540
LS +VAL LTPALCA +LK F FN F Y + K +
Sbjct: 479 ALSVLVALILTPALCATLLKPV--SAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGS 536

Query: 541 KWITLGGLSVFCFLVYFLNNGLPSGFIPNEDQGMIYAIVQTPPGSTIERTNQQALKIQKI 600
L ++ + L LPS F+P EDQG+ ++Q P G+T ERT + ++
Sbjct: 537 TGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDY 596

Query: 601 A--EGIDGVKSVSSLAGYEILSEGTGANSGTCLINLKNWDERDK---SATEIIEELEEKC 655
V+SV ++ G+ G N+G ++LK W+ER+ SA +I + +
Sbjct: 597 YLKNEKANVESVFTVNGF--SFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMEL 654

Query: 656 KDIGGSNIEFFQPPSIPGYGAAGGFELRLLDKTGSNDYAKMEEVSRNFVKELSKRP-ELA 714
I + F P+I G A GF+ L+D+ G + + + + ++ P L
Sbjct: 655 GKIRDGFVIPFNMPAIVELGTATGFDFELIDQAG-LGHDALTQARNQLLGMAAQHPASLV 713

Query: 715 SVFTFYSASFPQYMLKVDNDIAEQKGVSIGSAMNNLSTLIGSNYETGFIRFGKPYKVIVQ 774
SV Q+ L+VD + A+ GVS+ +ST +G Y FI G+ K+ VQ
Sbjct: 714 SVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQ 773

Query: 775 AAPQYRALPQDIMNLYVKNDKEEMVPYSDFMHMEKVYGMSEITRHNMYNSAQISGYPSEG 834
A ++R LP+D+ LYV++ EMVP+S F VYG + R+N S +I G + G
Sbjct: 774 ADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPG 833

Query: 835 YSSGQAIEAIKETADKTLPRGYGIDWAGISKDEVGRGNEAVYIFLICLGFVYLILSAQYE 894
SSG A+ ++ A K LP G G DW G+S E GN+A + I V+L L+A YE
Sbjct: 834 TSSGDAMALMENLASK-LPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYE 892

Query: 895 SFILPLPVILCLPAGIFGAFLFLKLFGLENNIYAQVALVMLIGLLGKNAVLIVEYAVQRK 954
S+ +P+ V+L +P GI G L LF +N++Y V L+ IGL KNA+LIVE+A
Sbjct: 893 SWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLM 952

Query: 955 -NEGATILKAAIEGAVTRFRPILMTSFAFIAGLIPLALATGPGAIGNRTIGTAAAGGMFI 1013
EG +++A + R RPILMTS AFI G++PLA++ G G+ +G GGM
Sbjct: 953 EKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVS 1012

Query: 1014 GTIFGVVLIPGLYLIFGKI 1032
T+ + +P +++ +
Sbjct: 1013 ATLLAIFFVPVFFVVIRRC 1031


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CG09_RS05635  RTXTOXIND     32     0.006    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.7 bits (72), Expect = 0.006
Identities = 24/139 (17%), Positives = 42/139 (30%), Gaps = 19/139 (13%)

Query: 180 LSSLIAEVAQSYYELLALDSQYSYLKKYIELQRKALEVSKIQKQAAATTELSVKKFEAEL 239
L AE + ++ K ++ L I K A E + EL
Sbjct: 209 LDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNEL 268

Query: 240 AKSSANLYTVQQSILEKENDINLLLGRFYQPIPRSSAEFLELVPQSIKTGIPSELLANRP 299
+ L ++ IL + + +LV Q K I +L
Sbjct: 269 RVYKSQLEQIESEILSAKEEY-------------------QLVTQLFKNEILDKLRQTTD 309

Query: 300 DVKQAELELEAAKLDVEAA 318
++ LEL + +A+
Sbjct: 310 NIGLLTLELAKNEERQQAS 328


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CG09_RS05645  SYCDCHAPRONE  39     9e-06    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 39.1 bits (91), Expect = 9e-06
Identities = 22/124 (17%), Positives = 43/124 (34%), Gaps = 4/124 (3%)

Query: 201 FEYGQFYFNRKNYEEAIRGFDYLLAINSNSVGVYANKAACYEAMQEWDKAVEVYEEMLEL 260
+ + YE+A + F L ++ + AC +AM ++D A+ Y +
Sbjct: 40 YSLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIM 99

Query: 261 EYTKAYTYYKIGLCHKENKKPLLALKSFQKSLVEDPQFYLSMMEQSFLYEEMGQMREALH 320
+ + + C + + A + L + E L + M EA+
Sbjct: 100 DIKEPRFPFHAAECLLQKGELAEA----ESGLFLAQELIADKTEFKELSTRVSSMLEAIK 155

Query: 321 FAKE 324
KE
Sbjct: 156 LKKE 159



Score = 32.2 bits (73), Expect = 0.002
Identities = 16/97 (16%), Positives = 33/97 (34%)

Query: 307 FLYEEMGQMREALHFAKEATVLNETNVDYQKRLAFLYIDSELFEESLPCLEKMVEAEPDR 366
F + G+ +A + VL+ + + L ++ ++ +
Sbjct: 44 FNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMDIKE 103

Query: 367 FYNWYAYTEVLMLVGQYNKALEILQKAISLHKDRAEF 403
+ E L+ G+ +A L A L D+ EF
Sbjct: 104 PRFPFHAAECLLQKGELAEAESGLFLAQELIADKTEF 140



Score = 32.2 bits (73), Expect = 0.002
Identities = 19/69 (27%), Positives = 27/69 (39%), Gaps = 7/69 (10%)

Query: 252 EVYEEMLELEYTKAYTYYKIGLCHKENKKPLLALKSFQKSLVEDPQFYLSMMEQSFLYEE 311
E+ + LE Y+ A+ Y+ G K A K FQ V D + +
Sbjct: 30 EISSDTLEQLYSLAFNQYQSG-------KYEDAHKVFQALCVLDHYDSRFFLGLGACRQA 82

Query: 312 MGQMREALH 320
MGQ A+H
Sbjct: 83 MGQYDLAIH 91



Score = 29.1 bits (65), Expect = 0.025
Identities = 16/88 (18%), Positives = 26/88 (29%)

Query: 173 FYKLNKNDDAIKFLNHYIEEFPFSETAWFEYGQFYFNRKNYEEAIRGFDYLLAINSNSVG 232
Y+ K +DA K + + G Y+ AI + Y ++
Sbjct: 46 QYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMDIKEPR 105

Query: 233 VYANKAACYEAMQEWDKAVEVYEEMLEL 260
+ A C E +A EL
Sbjct: 106 FPFHAAECLLQKGELAEAESGLFLAQEL 133


29  CG09_RS05785  CG09_RS05815  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CG09_RS05785  -1      13       0.973425   type III pantothenate kinase
CG09_RS05790  -2      14       0.593911   aminopeptidase
CG09_RS05795  -1      12       -0.039757  hypothetical protein
CG09_RS05800  -2      13       -0.213356  two-component sensor histidine kinase
CG09_RS05805  -2      10       -0.698905  transcriptional regulator
CG09_RS05810  -2      10       -0.591894  hypothetical protein
CG09_RS05815  0       14       -1.070292  UDP-N-acetylglucosamine
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CG09_RS05805  PF03309       196    1e-64    Bvg accessory factor
		>PF03309#Bvg accessory factor

Length = 271

Score = 196 bits (501), Expect = 1e-64
Identities = 75/257 (29%), Positives = 125/257 (48%), Gaps = 16/257 (6%)

Query: 4 IVVNIGNTNIRFGLFNNEGCSL----SWVINTKPYRTKDELFVQFLMHYQSYDIKPKEID 59
+ +++ NT+ GL + G W I T+P T DEL + + +
Sbjct: 3 LAIDVRNTHTVVGLISGSGDHAKVVQQWRIRTEPEVTADELALTID---GLIGDDAERLT 59

Query: 60 QLIIGSVVPQMTNDIVRALEKIHHLKPILV---DRNTPSEVKPKS-KQMGTDIYANLVAA 115
S VP + +++ LE+ P ++ T + + K++G D N +AA
Sbjct: 60 GASGLSTVPSVLHEVRVMLEQYWPNVPHVLIEPGVRTGIPLLVDNPKEVGADRIVNCLAA 119

Query: 116 HHLYPNKSKIIFDFGTALTASCISHSGETLGAIIAPGIITSLKSLIQDTAQLLEIELQAP 175
+H Y + I+ DFG+++ +S GE LG IAPG+ S + +A L +EL P
Sbjct: 120 YHKYG-TAAIVVDFGSSICVDVVSAKGEFLGGAIAPGVQVSSDAAAARSAALRRVELTRP 178

Query: 176 KSVLGLDTVSCMQSGMVYGYLGMVEGFIERINREI----GEETFVIATGGVSHVYKPLSD 231
+SV+G +TV CMQ+G V+G+ G+V+G + RI ++ G + V+ATG + + P
Sbjct: 179 RSVIGKNTVECMQAGAVFGFAGLVDGLVNRIRDDVDGFSGADVAVVATGHTAPLVLPDLR 238

Query: 232 KIHIADRLHTLKGLYFL 248
+ DR TL GL +
Sbjct: 239 TVEHYDRHLTLDGLRLV 255


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CG09_RS05820  PF06580       40     2e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 39.8 bits (93), Expect = 2e-05
Identities = 30/189 (15%), Positives = 72/189 (38%), Gaps = 35/189 (18%)

Query: 319 SGLIKQENLRMKKQVENVLNMSKLERNEMKLF-LRETNLRELIRNIANSVRLIVNERGGR 377
LI ++ K E + ++S+L R ++ R+ +L + + + + ++L + R
Sbjct: 183 RALILEDP---TKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDR 239

Query: 378 LT--EDFKAERYNLKVDEFHLSNTLINLLDNANKY----SPDKPEIKIATRNEGNYYVIE 431
L +++V + L++N K+ P +I + + +E
Sbjct: 240 LQFENQINPAIMDVQVPPM----LVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLE 295

Query: 432 ISDKGMGMEPQNKTKIFEKFFREETGNVHNVKGQGLGLSYVKKIIELHKG---QISVETQ 488
+ + G K + G GL V++ +++ G QI + +
Sbjct: 296 VENTGSLALKNTK------------------ESTGTGLQNVRERLQMLYGTEAQIKLSEK 337

Query: 489 KGKGSTFIV 497
+GK + ++
Sbjct: 338 QGKVNAMVL 346


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CG09_RS05825  HTHFIS        91     1e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 91.4 bits (227), Expect = 1e-23
Identities = 34/128 (26%), Positives = 64/128 (50%)

Query: 4 RILLVEDDQSFGAVLKDYLSINNFEVTLATDGEEGLKEYTNNDFDICIFDVMMPKKDGFT 63
IL+ +DD + VL LS ++V + ++ + D D+ + DV+MP ++ F
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 64 LAEDVKKLGKNIPIIFLTARNLREDILKGYQLGADDYITKPFDTELLLYKIKAILSRSTS 123
L +KK ++P++ ++A+N +K + GA DY+ KPFD L+ I L+
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 124 LEEEEQEQ 131
+ ++
Sbjct: 125 RPSKLEDD 132


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CG09_RS05835  FLGPRINGFLGI  29     0.040    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 28.7 bits (64), Expect = 0.040
Identities = 17/49 (34%), Positives = 21/49 (42%), Gaps = 6/49 (12%)

Query: 26 ALQILCAVLLTDQEVRIKNIPDIQDV--NKLIGILGDLGVKVTKNGKGD 72
AL L RIK+I +Q N+LIG G+ V G GD
Sbjct: 15 ALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGY----GLVVGLQGTGD 59


30  CG09_RS06780  CG09_RS06855  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CG09_RS06780  -1      15       -1.821527  epimerase
CG09_RS06785  -1      14       -1.669523  hypothetical protein
CG09_RS06790  -1      16       -1.738703  preprotein translocase subunit SecA
CG09_RS06795  -2      17       -1.101019  hypothetical protein
CG09_RS06800  -1      21       -0.893873  TonB-dependent receptor
CG09_RS06805  -2      15       -0.838415  hypothetical protein
CG09_RS06810  0       16       -0.887319  glucokinase
CG09_RS06820  0       13       0.423140   ribosome biogenesis GTPase Der
CG09_RS06825  -3      11       -0.036880  rod shape-determining protein
CG09_RS06830  -2      9        -0.061018  rod shape-determining protein MreC
CG09_RS06840  -3      9        1.075362   rod shape-determining protein MreD
CG09_RS06845  -1      11       0.677150   peptidoglycan glycosyltransferase
CG09_RS06850  -1      11       0.753315   rod shape-determining protein RodA
CG09_RS06855  -1      15       1.039515   membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CG09_RS06830  NUCEPIMERASE  373    e-131    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 373 bits (959), Expect = e-131
Identities = 128/351 (36%), Positives = 192/351 (54%), Gaps = 37/351 (10%)

Query: 5 TYLVTGGSGFIGSHLVEALLKNGHFVINVDNFDDFYNYKTKINNTLESLGITTNFDFENK 64
YLVTG +GFIG H+ + LL+ GH V+ +DN +D+Y+ K LE L
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLK-QARLELLA---------- 50

Query: 65 NLDIKKLASLVNKGNYKFYYQDIRDKEGLEKIFKNHRPDVVIHLAALAGVRPSIERPLEY 124
+ ++F+ D+ D+EG+ +F + + V VR S+E P Y
Sbjct: 51 ------------QPGFQFHKIDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAY 98

Query: 125 QEVNIKGTMNIWEVAKDLGICKFVIASSSSVYGNNEKIPFSEEDNVDRPISPYAATKKCV 184
+ N+ G +NI E + I + ASSSSVYG N K+PFS +D+VD P+S YAATKK
Sbjct: 99 ADSNLTGFLNILEGCRHNKIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKAN 158

Query: 185 EVLGHTYHHLYGMDMVQLRFFTVYGPRQRPDLAIHKFAKIIKDNKQVPFYGDGNTARDYT 244
E++ HTY HLYG+ LRFFTVYGP RPD+A+ KF K + + K + Y G RD+T
Sbjct: 159 ELMAHTYSHLYGLPATGLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFT 218

Query: 245 FVDDIIDGIMKSIKYVEE--------------NAGVYEIFNLGESEVIPLHKMLSTIEEE 290
++DDI + I++ + + Y ++N+G S + L + +E+
Sbjct: 219 YIDDIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDA 278

Query: 291 LGVKATLNKLPMQAGDVQKTNADIRKAQQKIGYAPTTNFQNGIKKFVEWFL 341
LG++A N LP+Q GDV +T+AD + + IG+ P T ++G+K FV W+
Sbjct: 279 LGIEAKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYR 329


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits  Score  E-value  Comments
CG09_RS06840  SECA  860    0.0      SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 860 bits (2223), Expect = 0.0
Identities = 392/1051 (37%), Positives = 537/1051 (51%), Gaps = 254/1051 (24%)

Query: 4 LNTILKSFLGNKNEKDLKEVKKVVAKIKAVEPEVGKLSDDGLRQKTEEFQNKIKEATSKI 63
L +L G++N++ L+ ++KVV I A+EPE+ KLSD+ L+ KT EF+ ++++
Sbjct: 2 LIKLLTKVFGSRNDRTLRRMRKVVNIINAMEPEMEKLSDEELKGKTAEFRARLEKGEV-- 59

Query: 64 TSQVEELKEKIKTSKDVDEKEALFNKIEELKKEAYQIEEKVLTDILPEAFAVLKETARRW 123
L +++PEAFAV++E ++R
Sbjct: 60 -----------------------------------------LENLIPEAFAVVREASKR- 77

Query: 124 AQNGEIRVKANDRDRALAATKDFVVIEGDEAVWLNHWDAAGTKVQWDMVHYDVQFIGGVV 183
+ M H+DVQ +GG+V
Sbjct: 78 --------------------------------------------VFGMRHFDVQLLGGMV 93

Query: 184 LHGGKIAEMATGEGKTLVGTLPIYLNALPGRGVHVVTVNDYLARRDSAWMGPLYEFHGLS 243
L+ IAEM TGEGKTL TLP YLNAL G+GVHVVTVNDYLA+RD+ PL+EF GL+
Sbjct: 94 LNERCIAEMRTGEGKTLTATLPAYLNALTGKGVHVVTVNDYLAQRDAENNRPLFEFLGLT 153

Query: 244 IDCIDNHQPNSDARRKAYQCNITYGTNNEFGFDYLRDNMVNSPNEMVQGELNYAIVDEVD 303
+ P + A+R+AY +ITYGTNNE+GFDYLRDNM SP E VQ +L+YA+VDEVD
Sbjct: 154 VGINLPGMP-APAKREAYAADITYGTNNEYGFDYLRDNMAFSPEERVQRKLHYALVDEVD 212

Query: 304 SVLIDDARTPLIISGPVPQGDRQEFDVLKPSVDRIVDVQKKTVSAIFHEAKKLIAQGNTK 363
S+LID+ARTPLIISGP + S ++ K+I
Sbjct: 213 SILIDEARTPLIISGPA-----------------------EDSSEMYKRVNKIIP----- 244

Query: 364 EGGFKLLQAYRGLPKNRQLIKFLSETGNKALLQKVEAQYMQDNNREMPKVDKDLYFVIDE 423
+ + E + + F +DE
Sbjct: 245 -----------------------------------HLIRQEKEDSETFQGEGH--FSVDE 267

Query: 424 KNNQIDLTDKGVEYMSQGNSDPNFFVLQDIGTELAELEAQNLPKEEEFAKKEELFRDFAV 483
K+ Q++LT++G+ + + + D G L L
Sbjct: 268 KSRQVNLTERGLVLIEELLVKEG---IMDEGESLYSPANIML------------------ 306

Query: 484 KSERIHTLNQLLKAYTLFEKDDQYVVMDGEVKIVDEQTGRIMEGRRYSDGLHQAIEAKEN 543
+H + L+A+ LF +D Y+V DGEV IVDE TGR M+GRR+SDGLHQA+EAKE
Sbjct: 307 ----MHHVTAALRAHALFTRDVDYIVKDGEVIIVDEHTGRTMQGRRWSDGLHQAVEAKEG 362

Query: 544 VKIEAATQTFATITLQNYFRMYNKLAGMTGTAETESGEFWEIYRLDVVVIPTNRPIQRND 603
V+I+ QT A+IT QNYFR+Y KLAGMTGTA+TE+ EF IY+LD VV+PTNRP+ R D
Sbjct: 363 VQIQNENQTLASITFQNYFRLYEKLAGMTGTADTEAFEFSSIYKLDTVVVPTNRPMIRKD 422

Query: 604 KHDLVYKTNREKYNAVIEEVEKLTSAGRPVLVGTTSVEISQLLSKALQLRKIPHQVLNAK 663
DLVY T EK A+IE++++ T+ G+PVLVGT S+E S+L+S L I H VLNAK
Sbjct: 423 LPDLVYMTEAEKIQAIIEDIKERTAKGQPVLVGTISIEKSELVSNELTKAGIKHNVLNAK 482

Query: 664 LHKKEAEIVAEAGRAGVVTIATNMAGRGTDIKL--------------------------- 696
H EA IVA+AG VTIATNMAGRGTDI L
Sbjct: 483 FHANEAAIVAQAGYPAAVTIATNMAGRGTDIVLGGSWQAEVAALENPTAEQIEKIKADWQ 542

Query: 697 --SKEVKDAGGLAIIGTERHDSRRVDRQLRGRAGRQGDPGSSQFYVSLEDNLMRLFGSER 754
V +AGGL IIGTERH+SRR+D QLRGR+GRQGD GSS+FY+S+ED LMR+F S+R
Sbjct: 543 VRHDAVLEAGGLHIIGTERHESRRIDNQLRGRSGRQGDAGSSRFYLSMEDALMRIFASDR 602

Query: 755 IAKMMDRLGHKEGEVIQHSMITKSIERAQKKVEENNFGIRKRLLEYDDVMNKQRDVIYKR 814
++ MM +LG K GE I+H +TK+I AQ+KVE NF IRK+LLEYDDV N QR IY +
Sbjct: 603 VSGMMRKLGMKPGEAIEHPWVTKAIANAQRKVESRNFDIRKQLLEYDDVANDQRRAIYSQ 662

Query: 815 RKNALFGDHLKYDIANMIFDVSHSIVNQTKMHGDYKDFEFEVIKYFTMEAPVSEADFKNK 874
R L + I ++ DV + ++ + E+ ++ + +
Sbjct: 663 RNELLDVSDVSETINSIREDVFKATIDAYIPPQSLE----EMWDIPGLQERLKNDFDLDL 718

Query: 875 TVKELTDVVFKKAQEDYEMKLNLLKEKSFPIIENVYQNQGNMFKMIQVPFSDGTKTMTIL 934
+ E D ++ E+ L+E+ IL
Sbjct: 719 PIAEWLD-------KEPELHEETLRER-------------------------------IL 740

Query: 935 ADLKEAYETQCDSL----INDFEKNICLSIIDENWKLHLREMDDLRRSSQGAVYEQKDPL 990
A E Y+ + + + + FEK + L +D WK HL MD LR+ Y QKDP
Sbjct: 741 AQSIEVYQRKEEVVGAEMMRHFEKGVMLQTLDSLWKEHLAAMDYLRQGIHLRGYAQKDPK 800

Query: 991 VIYKQESFHLFSEMVDKINKEIISFLYKGEI 1021
YK+ESF +F+ M++ + E+IS L K ++
Sbjct: 801 QEYKRESFSMFAAMLESLKYEVISTLSKVQV 831


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
CG09_RS06855  INTIMIN  27     0.022    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 27.0 bits (59), Expect = 0.022
Identities = 15/69 (21%), Positives = 23/69 (33%), Gaps = 7/69 (10%)

Query: 56 HHSHFSTQKHYKSFFSASYFVLPKLVNIPSLLKHKR-------EKKIADYRKWQIVKYTF 108
+ +S Q Y+ S + P+ VN L R I +Y+K I+
Sbjct: 399 NDLLYSMQFRYQFDKPWSQQIEPQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNI 458

Query: 109 THSNRGPPH 117
H G
Sbjct: 459 PHDINGTER 467


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
CG09_RS06860  PF03309  36     2e-04    Bvg accessory factor
		>PF03309#Bvg accessory factor

Length = 271

Score = 35.5 bits (82), Expect = 2e-04
Identities = 25/148 (16%), Positives = 51/148 (34%), Gaps = 27/148 (18%)

Query: 10 MALGIDIGGTDTKFGLVN---HRGEILGKGRIKTDYDEIDDFINALYKEIEPILEQHNAK 66
M L ID+ T T GL++ +++ + RI+T+ + D + L +A+
Sbjct: 1 MLLAIDVRNTHTVVGLISGSGDHAKVVQQWRIRTEPEVTADELALTIDG----LIGDDAE 56

Query: 67 SQLEGIGIG--APNGNYYKGTIENAPNLKWKGIVPLAEKMTAKFGVQCKVTND------- 117
+L G P+ + + W + + + + G+ V N
Sbjct: 57 -RLTGASGLSTVPSVLH---EVRVMLEQYWPNVPHVLIEPGVRTGIPLLVDNPKEVGADR 112

Query: 118 -ANAAAYGEMMFGAARGMKDFIMITLGT 144
N A + I++ G+
Sbjct: 113 IVNCLA------AYHKYGTAAIVVDFGS 134


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits       Score  E-value  Comments
CG09_RS06865  TCRTETOQM  33     0.003    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 32.5 bits (74), Expect = 0.003
Identities = 35/169 (20%), Positives = 64/169 (37%), Gaps = 19/169 (11%)

Query: 2 SNIVAIVGRPNVGKSTLFNRLLERREAIVDSVAGVTRDRHYGKSEWNGVEFTVIDTGGYD 61
S + +G + G + N LLER+ I + W + +IDT G+
Sbjct: 27 SGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQ-------WENTKVNIIDTPGH- 78

Query: 62 VGTDDIFEEEIRHQVQLAVDEATSIIFMLNVEEGLTDTDQEIHELLRRSNKPIYIVVNKV 121
D + E V +D A I +++ ++G+ + + LR+ P +NK+
Sbjct: 79 --MDFLAEVYRSLSV---LDGA---ILLISAKDGVQAQTRILFHALRKMGIPTIFFINKI 130

Query: 122 DSAKEELPATEFYQLGIEKYYTLSSATGSGTGDLLDAVVADFPTTEYKD 170
D +L YQ I++ + + V +F +E D
Sbjct: 131 DQNGIDLSTV--YQ-DIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWD 176



Score = 31.0 bits (70), Expect = 0.010
Identities = 30/138 (21%), Positives = 55/138 (39%), Gaps = 30/138 (21%)

Query: 178 ITIAGRPNVGKSTLTNALLDNKRNI----VTDIAGTTRDSIE-------------TIYNK 220
I + + GK+TLT +LL N I D T D+ T +
Sbjct: 6 IGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQW 65

Query: 221 FGHEFVLVDTAGMRKKSKVSENLEFYS-VMRSVRAIEHSDVVVIMVDATQGWESQDMNIF 279
+ ++DT G +++F + V RS+ + D ++++ A G ++Q +F
Sbjct: 66 ENTKVNIIDTPG---------HMDFLAEVYRSLSVL---DGAILLISAKDGVQAQTRILF 113

Query: 280 GIAQKNRKGIVILVNKWD 297
+K + +NK D
Sbjct: 114 HALRKMGIPTIFFINKID 131


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CG09_RS06870  SHAPEPROTEIN  362    e-127    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 362 bits (930), Expect = e-127
Identities = 159/339 (46%), Positives = 221/339 (65%), Gaps = 9/339 (2%)

Query: 3 LFDLFTQEIAIDLGTANTLIIHNNK-IVIDQPSIVAIDRQSGRP----IAVGEQAKHMQG 57
+F+ +++IDLGTANTLI + IV+++PS+VAI + AVG AK M G
Sbjct: 5 FRGMFSNDLSIDLGTANTLIYVKGQGIVLNEPSVVAIRQDRAGSPKSVAAVGHDAKQMLG 64

Query: 58 KTHEDIRTVRPLKDGVIADFHASEHMIKEFIKQIPGIKGKLFQPALKIVICIPSGITEVE 117
+T +I +RP+KDGVIADF +E M++ FIKQ+ +P+ ++++C+P G T+VE
Sbjct: 65 RTPGNIAAIRPMKDGVIADFFVTEKMLQHFIKQV--HSNSFMRPSPRVLVCVPVGATQVE 122

Query: 118 KRAVRDSAQKVNAKEVRLIYEPMAAAIGVGIDVQKPEGNMIIDIGGGTTEIAVVALGGIV 177
+RA+R+SAQ A+EV LI EPMAAAIG G+ V + G+M++DIGGGTTE+AV++L G+V
Sbjct: 123 RRAIRESAQGAGAREVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNGVV 182

Query: 178 CDKSVKIAGDVFTNDIAYYLRTHHNLFIGERTAERIKIEVGSAVEELDIDIEDIPVQGRD 237
SV+I GD F I Y+R ++ IGE TAERIK E+GSA ++ +I V+GR+
Sbjct: 183 YSSSVRIGGDRFDEAIINYVRRNYGSLIGEATAERIKHEIGSAYP--GDEVREIEVRGRN 240

Query: 238 LITGKPKEIMIGYKEIARALDKSIIRIEDAVMETLSMTPPELAADIYKTGIYLAGGGALL 297
L G P+ + EI AL + + I AVM L PPELA+DI + G+ L GGGALL
Sbjct: 241 LAEGVPRGFTLNSNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGALL 300

Query: 298 RGLADRLHKKTGLPVFVAEDPLRAVVRGTGIALKNMDKF 336
R L L ++TG+PV VAEDPL V RG G AL+ +D
Sbjct: 301 RNLDRLLMEETGIPVVVAEDPLTCVARGGGKALEMIDMH 339


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits        Score  E-value  Comments
CG09_RS06895  OMPADOMAIN  69     2e-15    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 68.8 bits (168), Expect = 2e-15
Identities = 40/176 (22%), Positives = 66/176 (37%), Gaps = 20/176 (11%)

Query: 98 SNAYIKQLISTNARNDSLNLALSNKLKRSLDNVADQDVQVKVLKGVV--MISLSDKMLYR 155
+N I T N L+L +S + + + V +L +L+
Sbjct: 166 NNIGDAHTIGTRPDNGMLSLGVSYRFGQG-EAAPVVAPAPAPAPEVQTKHFTLKSDVLFN 224

Query: 156 SGDYNILPAAQEVLGKVAKVINDYD--KYSVLIEGNTDNVPLNSASLPKDNWDLSALRAT 213
+ P Q L ++ +++ D SV++ G TD + N LS RA
Sbjct: 225 FNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRI-----GSDAYNQGLSERRAQ 279

Query: 214 SVAKVLQNQFGVDPSRITAGGRSEYNPKATNMS---------VSGRAENRRTEIII 260
SV L ++ G+ +I+A G E NP N + A +RR EI +
Sbjct: 280 SVVDYLISK-GIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334


31  CG09_RS08285  CG09_RS08315  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias   Product
CG09_RS08285  -2      14       1.344554  ribosomal protein L11 methyltransferase
CG09_RS08290  -2      16       2.047795  glycerol-3-phosphate dehydrogenase
CG09_RS08295  -2      16       1.998221  GCN5 family acetyltransferase
CG09_RS08300  -2      17       1.759161  sulfurtransferase
CG09_RS08305  -2      16       1.646929  tRNA epoxyqueuosine(34) reductase QueG
CG09_RS08310  -3      15       0.943517  hypothetical protein
CG09_RS08315  -1      11       0.431196  metalloendopeptidase
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits      Score  E-value  Comments
CG09_RS08350  MYCMG045  29     0.019    Hypothetical mycoplasma lipoprotein (MG045) signature.
		>MYCMG045#Hypothetical mycoplasma lipoprotein (MG045) signature.

Length = 483

Score = 29.3 bits (65), Expect = 0.019
Identities = 19/61 (31%), Positives = 30/61 (49%), Gaps = 5/61 (8%)

Query: 30 FDSFTEETNGILAYIPKNDLNEDAIKSLYIFEQEGVEIDYTYTEMPNINWNEEWEKNFSP 89
FD +TE +L +LNE+ K + E ++ YT + +I WN+ EK SP
Sbjct: 411 FDYYTETLKALLEKEDSAELNENEKKLV-----ETIKKAYTIEKDSSIRWNQLVEKPISP 465

Query: 90 I 90
+
Sbjct: 466 L 466


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CG09_RS08360  SACTRNSFRASE  30     0.003    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 29.9 bits (67), Expect = 0.003
Identities = 13/101 (12%), Positives = 30/101 (29%), Gaps = 14/101 (13%)

Query: 48 DGVGYVWVQGDEVLGYAVLMLNNEPAYDNIEGEWLSNGDYLVVHRVVVHDRCLGKGIAKQ 107
+++ + +G + N W Y ++ + V KG+
Sbjct: 64 GKAAFLYYLENNCIGRIKIRSN-----------W---NGYALIEDIAVAKDYRKKGVGTA 109

Query: 108 MFLWIEGWAKQQNIYSVKVDTNYDNQPMLHILQHLGYQYCG 148
+ WAK+ + + ++T N H +
Sbjct: 110 LLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGA 150


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CG09_RS08375  BINARYTOXINB  28     0.027    Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 28.5 bits (63), Expect = 0.027
Identities = 12/38 (31%), Positives = 19/38 (50%), Gaps = 1/38 (2%)

Query: 163 ESGKTILIETDAVYGTNDAFYEAKGSYSFMHTCNTWAN 200
E K + ++TD VYG N A Y + + T + W+
Sbjct: 472 EKTKQLRLDTDQVYG-NIATYNFENGRVRVDTGSNWSE 508


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits       Score  E-value  Comments
CG09_RS08380  RTXTOXIND  29     0.015    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.4 bits (66), Expect = 0.015
Identities = 9/57 (15%), Positives = 22/57 (38%), Gaps = 11/57 (19%)

Query: 189 DIKSAAAGTILFAGEKSGYGKCVIISHGNGLATLYGHLSQVLVKANDKIKAGETIAK 245
+I + A G + +G I + +++VK + ++ G+ + K
Sbjct: 81 EIVATANGKLTHSGRS------KEIKPIEN-----SIVKEIIVKEGESVRKGDVLLK 126



 
Contact Sachin Pundhir for Bugs/Comments.