PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: akk.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.01
Genome (non-pathogenic):
 
 
C) Potential GIs and PAIs in NZ_AP021898
S.No  Start          End            Bias  Virulence  Insertion elements  Prediction
1     GOZ73_RS00030  GOZ73_RS00075  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
GOZ73_RS00030  -2      22       3.942923    Nif3-like dinuclear metal center hexameric
GOZ73_RS00035  -2      17       4.169652    RluA family pseudouridine synthase
GOZ73_RS00040  -1      22       5.608744    prephenate dehydratase
GOZ73_RS00045  0       23       6.506583    alpha-L-fucosidase
GOZ73_RS00050  1       25       7.253361    NAD-dependent DNA ligase LigA
GOZ73_RS00055  2       25       7.185562    CDP-diacylglycerol--glycerol-3-phosphate
GOZ73_RS00060  2       25       6.986408    phospholipase D-like domain-containing protein
GOZ73_RS00065  2       23       6.924734    DUF1156 domain-containing protein
GOZ73_RS00070  1       14       4.806248    DUF3780 domain-containing protein
GOZ73_RS00075  0       13       4.001480    DUF499 domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
GOZ73_RS00040  ALARACEMASE  29     0.042    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 28.6 bits (64), Expect = 0.042
Identities = 18/110 (16%), Positives = 37/110 (33%), Gaps = 34/110 (30%)

Query: 83 LNRLNAGVLPEASLQAIYREIISCSFFLEGGLTIAYLGPKGTWSH----------QAALK 132
+NRL G P+ + +++++ + + +G SH A+
Sbjct: 128 MNRL--GFQPD-RVLTVWQQLRAMAN----------VGEMTLMSHFAEAEHPDGISGAMA 174

Query: 133 QFGKSCELIPCQ----------SFKDV-FDMVDRGKAQYGVVPVENSSEG 171
+ ++ E + C+ + FD V G YG P +
Sbjct: 175 RIEQAAEGLECRRSLSNSAATLWHPEAHFDWVRPGIILYGASPSGQWRDI 224
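
The Score/Expect and Identities lines that open each alignment in this report follow the usual BLAST text layout, so they can be extracted with a small parser. The following is a minimal Python sketch and is not part of PredictBias; the names parse_hit and parse_evalue are hypothetical. It assumes that an Expect value printed as "e-165" stands for 1e-165.

```python
import re

# Matches e.g. "Score = 28.6 bits (64), Expect = 0.042"
STATS_RE = re.compile(
    r"Score\s*=\s*(?P<bits>[\d.]+)\s*bits\s*\((?P<raw>\d+)\),\s*"
    r"Expect\s*=\s*(?P<evalue>\S+)"
)
# Matches e.g. "Identities = 18/110 (16%)"
IDENT_RE = re.compile(r"Identities\s*=\s*(?P<num>\d+)/(?P<den>\d+)")


def parse_evalue(text):
    """Convert an Expect string to float (assumption: 'e-165' means 1e-165)."""
    return float("1" + text if text.startswith("e") else text)


def parse_hit(block):
    """Extract bit score, raw score, E-value and identity fraction from one hit."""
    stats = STATS_RE.search(block)
    ident = IDENT_RE.search(block)
    if stats is None or ident is None:
        raise ValueError("no Score/Identities lines found")
    return {
        "bits": float(stats["bits"]),
        "raw_score": int(stats["raw"]),
        "evalue": parse_evalue(stats["evalue"]),
        "identity": int(ident["num"]) / int(ident["den"]),
    }


# Statistics lines copied from the ALARACEMASE hit of GOZ73_RS00040.
example = """Score = 28.6 bits (64), Expect = 0.042
Identities = 18/110 (16%), Positives = 37/110 (33%), Gaps = 34/110 (30%)"""
hit = parse_hit(example)
print(hit)
```

Keeping the identity as a fraction instead of the rounded percentage printed in the report avoids losing precision when hits are compared or filtered later.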


2     GOZ73_RS00235  GOZ73_RS00315  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
GOZ73_RS00235  1       17       -3.714496   diaminopimelate epimerase
GOZ73_RS00240  2       16       -4.082901   hypothetical protein
GOZ73_RS00250  2       16       -2.833970   *ubiquinone/menaquinone biosynthesis
GOZ73_RS00255  2       18       -3.618093   MATE family efflux transporter
GOZ73_RS00265  5       21       -5.209814   *toll/interleukin-1 receptor domain-containing
GOZ73_RS00270  4       18       -3.473609   FRG domain-containing protein
GOZ73_RS00275  1       15       0.170204    hypothetical protein
GOZ73_RS00280  1       15       1.229079    hypothetical protein
GOZ73_RS00285  0       15       1.745282    hypothetical protein
GOZ73_RS00290  1       15       2.263820    hypothetical protein
GOZ73_RS00295  0       15       2.938869    2-isopropylmalate synthase
GOZ73_RS00300  0       14       2.988269    single-stranded-DNA-specific exonuclease RecJ
GOZ73_RS00305  1       15       2.850852    beta-N-acetylglucosaminidase domain-containing
GOZ73_RS00310  1       20       3.494708    metal ABC transporter substrate-binding protein
GOZ73_RS00315  1       21       3.233030    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits      Score  E-value  Comments
GOZ73_RS00310  adhesinb  186    2e-59    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 186 bits (474), Expect = 2e-59
Identities = 81/317 (25%), Positives = 144/317 (45%), Gaps = 27/317 (8%)

Query: 4 RILIP-FLCLLGLF-CSLQAQDAPV----LKIAALHPVLGDMARAIGGSRVQVADLLRPN 57
R L+ L +GL CS Q L + A + ++ D+ + I G ++ + ++
Sbjct: 5 RFLVLLLLAFVGLAACSSQKSSTETGSSKLNVVATNSIIADITKNIAGDKINLHSIVPVG 64

Query: 58 GNLHSFEPSPQDIAAAGRARLVLASGKNLE----PYLPKLKDAL--GGSAQILDLGASVP 111
+ H +EP P+D+ +A L+ +G NLE + KL + + + V
Sbjct: 65 QDPHEYEPLPEDVKKTSQADLIFYNGINLETGGNAWFTKLVENAKKKENKDYYAVSEGVD 124

Query: 112 DVPVEEDTADHDHEHDGDCCAHGPNDPHWWHTPANMKRAARTLAATLTRLDPAHEQDYKA 171
+ +E + D PH W N A+ +A L+ DPA+++ Y+
Sbjct: 125 VIYLEG----QSEKGKED--------PHAWLNLENGIIYAQNIAKRLSEKDPANKETYEK 172

Query: 172 GLSRWNRKMDQLSSWARKELADIPETDRVLVTGHAAMNHFCREFGFRSISIQGISREDEG 231
L + K+ L A+++ +IP +++VT +F + + S I I+ E+EG
Sbjct: 173 NLKAYVEKLSALDKEAKEKFNNIPGEKKMIVTSEGCFKYFSKAYNVPSAYIWEINTEEEG 232

Query: 232 NSAQLASTLKKLRSAGVKALFPEYSSNPKSLTEIAKSLNIPVARPINTDGLAPEGH---T 288
Q+ + ++KLR V +LF E S + + + ++K NIP+ I TD +A +G +
Sbjct: 233 TPDQIKTLVEKLRKTKVPSLFVESSVDDRPMKTVSKDTNIPIYAKIFTDSVAEKGEEGDS 292

Query: 289 FESMFKQNTGIIKEALS 305
+ SM K N I E LS
Sbjct: 293 YYSMMKYNLEKIAEGLS 309


3     GOZ73_RS12150  GOZ73_RS12155  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
GOZ73_RS12150  5       38       -11.374729  hypothetical protein
GOZ73_RS00455  5       52       -16.010909  site-specific integrase
GOZ73_RS00460  4       58       -17.045762  sigma-54 dependent transcriptional regulator
GOZ73_RS00470  3       54       -16.257540  metallophosphoesterase
GOZ73_RS00475  3       45       -13.614045  ATP-binding protein
GOZ73_RS12155  -1      18       -3.354129   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
GOZ73_RS00465  HTHFIS  468    e-165    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 468 bits (1207), Expect = e-165
Identities = 156/480 (32%), Positives = 248/480 (51%), Gaps = 46/480 (9%)

Query: 1 MDKTRIIVVEDNIVYCEYICNMLAREGYDTVKAYHLSTAKKQLQQAADDDIVVSDLRLPD 60
M I+V +D+ + L+R GYD + +T + + A D D+VV+D+ +PD
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIA-AGDGDLVVTDVVMPD 59

Query: 61 GDGIELLRWMRKEGKMQPFIIMTDYAEVHTAVESMKLGSIDYIPKKLIEDKLVPLIRTIQ 120
+ +LL ++K P ++M+ TA+++ + G+ DY+PK +L+ +I
Sbjct: 60 ENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119

Query: 121 KERQAGLNRIPVFAREG-------SAFQKIMHRIKLVATTDMSVMIFGENGTGKEHIAHH 173
E + +++ +++G +A Q+I + + TD+++MI GE+GTGKE +A
Sbjct: 120 AEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARA 179

Query: 174 LHEKSKRAGKPFVAVDCGSLSRELAPSAFFGHVKGAFTGADSTKKGYFHEAEGGTLFLDE 233
LH+ KR PFVA++ ++ R+L S FGH KGAFTGA + G F +AEGGTLFLDE
Sbjct: 180 LHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDE 239

Query: 234 VGNLALETQQMLLRAIQERRYRPVGDKSDRSFNVRIIATTNEDLEAAVSEKRFRQDLLYR 293
+G++ ++ Q LLR +Q+ Y VG ++ +VRI+A TN+DL+ ++++ FR+DL YR
Sbjct: 240 IGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYR 299

Query: 294 LRDFVITMPPLRDCQEDIMPLAEFFREIANKELECNASGFSSEARKALLTHAWPGNVREL 353
L + +PPLRD EDI L F + A KE + F EA + + H WPGNVREL
Sbjct: 300 LNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKE-GLDVKRFDQEALELMKAHPWPGNVREL 358

Query: 354 RQKIMGAVLQAQEGVVTKEHLELAVPK--PTSPVSFALRNDAE----------------- 394
+ + V+T+E +E + P SP+ A
Sbjct: 359 ENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFAS 418

Query: 395 ------------------DKERIMRALKQTNGNRCAAAELLGISRTTLYGKLEEYGLKYK 436
+ I+ AL T GN+ AA+LLG++R TL K+ E G+
Sbjct: 419 FGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVSVY 478


4     GOZ73_RS12340  GOZ73_RS00625  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
GOZ73_RS12340  0       17       3.042700    hypothetical protein
GOZ73_RS12345  0       17       2.839347    ROK family protein
GOZ73_RS00565  -1      18       2.445910    RsmB/NOP family class I SAM-dependent RNA
GOZ73_RS00575  -1      22       2.851529    metallophosphatase family protein
GOZ73_RS00580  -1      18       2.835195    NUDIX domain-containing protein
GOZ73_RS00590  -1      17       2.670404    efflux RND transporter periplasmic adaptor
GOZ73_RS00595  0       15       3.185478    ABC transporter ATP-binding protein
GOZ73_RS00600  0       14       4.135391    ABC transporter permease
GOZ73_RS00605  -1      14       3.980675    efflux transporter outer membrane subunit
GOZ73_RS00610  0       17       3.414298    serine hydrolase
GOZ73_RS00615  -1      16       3.640286    hypothetical protein
GOZ73_RS00620  -2      18       2.232667    tRNA
GOZ73_RS00625  -2      20       3.208605    dihydroorotase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
GOZ73_RS00590  RTXTOXIND  71     2e-15    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 71.4 bits (175), Expect = 2e-15
Identities = 43/251 (17%), Positives = 92/251 (36%), Gaps = 45/251 (17%)

Query: 88 VKAGQMLAEIDKLPVQLDVQRAEASKAQAIAGIARSKADIQ----QAKAKHQQARLDRER 143
V ++L + Q + + K Q + + +A+ + +R+++ R
Sbjct: 179 VSEEEVLRLTSLIKEQFSTWQNQ--KYQKELNLDKKRAERLTVLARINRYENLSRVEKSR 236

Query: 144 --------AEKLGPGDAL--SKSSYDQYIADEETARANVAVAEASLQEAEASLKQAEAAL 193
++ A+ ++ Y + + + ++ + E+ + A+ +
Sbjct: 237 LDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLF 296

Query: 194 KKEMR----------------------NLEYTTIKSPVDGVVVKRLVNIGQTVVSSMSAS 231
K E+ + + I++PV V + V+ VV+ +A
Sbjct: 297 KNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVT--TAE 354

Query: 232 SLFYIATDLSKLKIWAAVNEADIGSIRKGQEVLFTVDAFAGRKF---KGTVDKIRLDATM 288
+L I + L++ A V DIG I GQ + V+AF ++ G V I LDA
Sbjct: 355 TLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIE 414

Query: 289 --TSNVVTYIV 297
+V ++
Sbjct: 415 DQRLGLVFNVI 425



Score = 46.0 bits (109), Expect = 2e-07
Identities = 25/159 (15%), Positives = 53/159 (33%), Gaps = 17/159 (10%)

Query: 60 VDVGAQVSGIIMEFGKD------LNGKV----VDYSSPVKAGQMLAEIDKLPVQLDVQRA 109
V++ A +G + G+ N V V V+ G +L ++ L + D +
Sbjct: 80 VEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKT 139

Query: 110 EASKAQAIAGIARSKADIQQAKAKHQ-------QARLDRERAEKLGPGDALSKSSYDQYI 162
++S QA R + + + + E++ +L K + +
Sbjct: 140 QSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQ 199

Query: 163 ADEETARANVAVAEASLQEAEASLKQAEAALKKEMRNLE 201
+ N+ A A + + E + E L+
Sbjct: 200 NQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLD 238


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS00605  FLGMRINGFLIF  29     0.035    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 29.2 bits (65), Expect = 0.035
Identities = 41/171 (23%), Positives = 63/171 (36%), Gaps = 26/171 (15%)

Query: 225 RIPQLEAQVASSKNQLAVLLGTYNSRVELTLPKASVF---EKTPTVPVGLPSELLRRRPD 281
LE ++A + L ++RV L +PK S+F +K+P+ V + P
Sbjct: 131 YQRALEGELARTIETLG---PVKSARVHLAMPKPSLFVREQKSPSASV-----TVTLEPG 182

Query: 282 VIAAEADLHAAVANVGVAVADLYP----------RFSLTGSLSSRGGDFGQLFRENNNAW 331
E + A V V AVA L P + S R + QL N+
Sbjct: 183 RALDEGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDLNDAQLKFANDVES 242

Query: 332 SLGG---NLLQPLFRGGALRATVRAQ--KAAAEQAAETYRKTLITAVSEVE 377
+ +L P+ G + A V AQ A EQ E Y + + +
Sbjct: 243 RIQRRIEAILSPIVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLR 293


5     GOZ73_RS00695  GOZ73_RS00730  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
GOZ73_RS00695  2       9        0.194371    GDSL-type esterase/lipase family protein
GOZ73_RS00700  3       10       0.822321    hypothetical protein
GOZ73_RS00705  2       11       -0.074179   hypothetical protein
GOZ73_RS00710  0       12       0.228931    DNA gyrase/topoisomerase IV subunit A
GOZ73_RS00720  1       13       3.127817    *hypothetical protein
GOZ73_RS00725  0       16       3.000992    hypothetical protein
GOZ73_RS00730  0       12       3.029216    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
GOZ73_RS00700  IGASERPTASE  62     9e-12    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 61.6 bits (149), Expect = 9e-12
Identities = 45/264 (17%), Positives = 86/264 (32%), Gaps = 15/264 (5%)

Query: 15 KLRRRLVTAAPAAAPKVKFVSPAAE---GVEVVKPR--FLTPELQAKLAQEQEEAARRAA 69
+ R + V P + E+ + + P A ++ E A +
Sbjct: 986 EKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSK 1045

Query: 70 EQAEADRIAAEKAAAEQEAARIAAEKEAARIAAEQEAARI-KAEQEAAEAARIAAEKEAA 128
++++ + A R A++ + + A + + ++ E E KE A
Sbjct: 1046 QESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTET-KETA 1104

Query: 129 RVAAEEERARL----AAELEKKEAASAPQPEPSVDAQIQDALAKVAEAQKALAEAQAAAL 184
V +EE+A++ E+ K + +P+ E S Q Q A+ + + E Q+
Sbjct: 1105 TVE-KEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTN 1163

Query: 185 AQAAGVAPAAVQTPEPSVPEPAPAGGAPKLKVFKTAAAPAAAEKSPVAAPESA-VPKLKV 243
A PA + P V + A P ES+ PK +
Sbjct: 1164 TTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRH 1223

Query: 244 LKS--PVGAAPASSVAPAAVPQPV 265
+S V + + V
Sbjct: 1224 RRSVRSVPHNVEPATTSSNDRSTV 1247



Score = 53.1 bits (127), Expect = 4e-09
Identities = 42/296 (14%), Positives = 85/296 (28%), Gaps = 54/296 (18%)

Query: 10 SPEPPKLRRRLVTAAPAAAPKVKFVSPAAEGVEVVKPRFLTPELQAKLAQEQEEAARRAA 69
+PE K + + T ++ P+ R + E A
Sbjct: 982 NPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVA 1041

Query: 70 EQAEADRIAAEKAAAEQEAARIAAEKEAARIAAEQEAARIKAEQEAAEAARIAAEKEAAR 129
E ++ + + EQ+A A+ R A++ + +KA + E A+ +E + +
Sbjct: 1042 ENSK--QESKTVEKNEQDATETTAQ---NREVAKEAKSNVKANTQTNEVAQSGSETKETQ 1096

Query: 130 VAAEEERARLAAELEKKEAASAPQPEPSVDAQIQDALAKVAEAQKALAEAQAAALAQAAG 189
+E A + E + K Q P V +Q+ K +++ +A+ A
Sbjct: 1097 TTETKETATVEKEEKAKVETEKTQEVPKVTSQVS---PKQEQSETVQPQAEPA------- 1146

Query: 190 VAPAAVQTPEPSVPEPAPAGGAPKLKVFKTAAAPAAAEKSPVAAPESAVPKLKVLKSPVG 249
E P ++ + A E
Sbjct: 1147 -------------RENDPT----------VNIKEPQSQTNTTADTEQP------------ 1171

Query: 250 AAPASSVAPAAVPQPVVAEGVSPAAGAPPAGDGGIQPPAGTESARDMAASVDKSRS 305
+ + V QPV + P + +++ K+R
Sbjct: 1172 ----AKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRH 1223


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits  Score  E-value  Comments
GOZ73_RS00705  SECA  31     8e-04    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 31.4 bits (71), Expect = 8e-04
Identities = 13/42 (30%), Positives = 21/42 (50%), Gaps = 2/42 (4%)

Query: 86 DLRRLAAVLEEIAELEAQTAKL--DTLRAEAEELRNRLTAKQ 125
LRR+ V+ I +E + KL + L+ + E R RL +
Sbjct: 17 TLRRMRKVVNIINAMEPEMEKLSDEELKGKTAEFRARLEKGE 58


6     GOZ73_RS01240  GOZ73_RS01485  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
GOZ73_RS01240  -1      10       -3.183260   hypothetical protein
GOZ73_RS01245  0       13       -2.932839   peptide deformylase
GOZ73_RS01250  0       16       -3.281854   L-threonine 3-dehydrogenase
GOZ73_RS01255  0       17       -2.673442   alpha-glucan family phosphorylase
GOZ73_RS01260  0       29       -2.511808   ion transporter
GOZ73_RS01265  1       30       -1.991971   arsenate reductase family protein
GOZ73_RS01270  0       29       -1.070738   MBL fold metallo-hydrolase
GOZ73_RS12350  0       24       -0.566723   sel1 repeat family protein
GOZ73_RS01280  0       25       -0.768168   pseudouridine synthase
GOZ73_RS01285  2       21       -1.445429   small basic protein
GOZ73_RS01295  1       18       -1.955620   tRNA (guanosine(37)-N1)-methyltransferase TrmD
GOZ73_RS01300  1       21       -3.190679   peptide chain release factor 2
GOZ73_RS01305  1       21       -4.246475   metal-dependent transcriptional regulator
GOZ73_RS01310  2       22       -4.766034   30S ribosomal protein S1
GOZ73_RS01315  1       30       -6.322418   SagB/ThcOx family dehydrogenase
GOZ73_RS01320  1       34       -6.489763   GNAT family N-acetyltransferase
GOZ73_RS01330  0       33       -6.495074   *hypothetical protein
GOZ73_RS12355  2       33       -5.552881   type VI secretion system amidase effector
GOZ73_RS12360  3       33       -5.163011   PA14 domain-containing protein
GOZ73_RS01340  3       32       -4.236644   GYF domain-containing protein
GOZ73_RS01345  2       33       -3.750150   M23 family metallopeptidase
GOZ73_RS01350  2       34       -4.242001   TIM barrel protein
GOZ73_RS01355  2       36       -5.833905   3-methyl-2-oxobutanoate
GOZ73_RS01360  1       36       -6.784120   2-amino-4-hydroxy-6-
GOZ73_RS01365  0       35       -7.443837   4-hydroxy-tetrahydrodipicolinate reductase
GOZ73_RS01370  0       37       -8.564421   4-hydroxy-tetrahydrodipicolinate synthase
GOZ73_RS01375  0       28       -7.108364   hypothetical protein
GOZ73_RS01405  0       23       -5.145576   **redoxin domain-containing protein
GOZ73_RS01410  1       17       -0.404805   hypothetical protein
GOZ73_RS01415  2       17       0.770377    helix-turn-helix domain-containing protein
GOZ73_RS01425  2       19       1.677507    *DUF4339 domain-containing protein
GOZ73_RS01430  3       20       2.776601    hypothetical protein
GOZ73_RS01435  5       25       5.535322    hypothetical protein
GOZ73_RS01440  5       24       4.287402    hypothetical protein
GOZ73_RS01445  5       23       1.950643    hypothetical protein
GOZ73_RS01450  3       19       0.024882    hypothetical protein
GOZ73_RS01455  1       18       0.111003    hypothetical protein
GOZ73_RS01460  0       20       -3.065141   hypothetical protein
GOZ73_RS01465  0       19       -2.703808   hypothetical protein
GOZ73_RS01470  0       20       -3.142305   hypothetical protein
GOZ73_RS01475  1       18       -3.204205   hypothetical protein
GOZ73_RS01480  2       17       -2.763720   hypothetical protein
GOZ73_RS01485  2       20       -3.733607   copper resistance protein NlpE
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
GOZ73_RS01300  NEISSPPORIN  29     0.040    Neisseria sp. porin signature.
		>NEISSPPORIN#Neisseria sp. porin signature.

Length = 348

Score = 28.8 bits (64), Expect = 0.040
Identities = 23/82 (28%), Positives = 35/82 (42%), Gaps = 18/82 (21%)

Query: 244 VDTYRSGGKGGQNVNKVETAVRITHIPSGVIVACQNERSQLRNKEEAMNMLRAKLYQIEE 303
V TYRS V+KVET I S + + +E+ N L+A ++Q+E+
Sbjct: 31 VQTYRSVEHTDGKVSKVETGSEIADFGSKI---------GFKGQEDLGNGLKA-VWQLEQ 80

Query: 304 DKKQAEADRQYSEKGDIGWGNQ 325
A + GWGN+
Sbjct: 81 GASVA--------GTNTGWGNK 94


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
GOZ73_RS01410  HTHFIS  27     0.024    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 27.5 bits (61), Expect = 0.024
Identities = 9/27 (33%), Positives = 15/27 (55%)

Query: 117 EKFRNEAERHGLTVPEWASEALDALSS 143
F +AE+ GL V + EAL+ + +
Sbjct: 322 RHFVQQAEKEGLDVKRFDQEALELMKA 348


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
GOZ73_RS01440  PF00577  35     0.001    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 35.2 bits (81), Expect = 0.001
Identities = 50/341 (14%), Positives = 95/341 (27%), Gaps = 65/341 (19%)

Query: 253 SLSQPYVVTAEVDVPGGVKVSDTAGQYSPAETGSLGYDAPRMIVRGDKFPTGTAQWAARV 312
+ PY + G + S TAG+Y G+ + PR F T
Sbjct: 357 IFTVPYSSVPLLQREGHTRYSITAGEYRS---GNAQQEKPR-------FFQSTLLHGLP- 405

Query: 313 KRWAPALEDCAGLEVAASPKITSITPADAEHRGYSSAAVTHELTSGQINGKSARIKWGKV 372
+ G ++A Y + G + S +
Sbjct: 406 --AGWTIY--GGTQLA---------------DRYRAFNFGIGKNMGALGALSVDMTQANS 446

Query: 373 RVDLRVRATDPPDTVKQYFPEYGGKSGTGDRWIGTLTFEVTTTNVGYASYRVDRAGTVES 432
+ P+ G R++ + + TN+ YR +G +
Sbjct: 447 TL-----------------PDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYF-N 488

Query: 433 VSDDGGSPGDDETSGSYDTSALYKNFLKSYYEATRALPYDGSATVHDDFDQVCGGRLSIT 492
+D S + + D K YY TV + L ++
Sbjct: 489 FADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGR--TSTLYLS 546

Query: 493 GGLKEWEAMRSVIQEISLDLKTGVSDVTV-------------GAPEQISLQDSIDRSRQL 539
G + + +V ++ L T D+ G + ++L +I S L
Sbjct: 547 GSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWL 606

Query: 540 AEALRRTAWADSSTSAGGGSSGGGSGSGGG--GSSGADDEV 578
+ S++ + G + G G+ D+ +
Sbjct: 607 RSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNL 647


7     GOZ73_RS01535  GOZ73_RS01660  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
GOZ73_RS01535  2       21       -1.152455   hypothetical protein
GOZ73_RS01540  2       22       -0.709937   hypothetical protein
GOZ73_RS01545  2       22       -1.342435   hypothetical protein
GOZ73_RS01550  2       26       0.116135    hypothetical protein
GOZ73_RS01555  3       28       1.227126    replicative DNA helicase
GOZ73_RS01560  5       31       0.475128    hypothetical protein
GOZ73_RS01565  4       33       1.793977    class I SAM-dependent methyltransferase
GOZ73_RS01570  3       34       1.394531    hypothetical protein
GOZ73_RS01580  3       34       1.509337    RusA family crossover junction
GOZ73_RS01585  3       33       1.505647    hypothetical protein
GOZ73_RS01590  0       27       -0.394999   DUF2800 domain-containing protein
GOZ73_RS01595  0       29       -4.729676   ATP-binding protein
GOZ73_RS01600  5       48       -12.776885  hypothetical protein
GOZ73_RS01605  7       54       -14.586594  hypothetical protein
GOZ73_RS01610  7       57       -16.200555  hypothetical protein
GOZ73_RS01615  7       62       -18.007529  S24 family peptidase
GOZ73_RS01620  7       65       -18.245955  hypothetical protein
GOZ73_RS01625  6       61       -16.710313  hypothetical protein
GOZ73_RS01630  2       43       -11.801136  protein-export chaperone SecB
GOZ73_RS01640  0       36       -9.300086   sel1 repeat family protein
GOZ73_RS01645  -1      31       -7.063091   SEL1-like repeat protein
GOZ73_RS01650  -1      25       -4.439423   site-specific integrase
GOZ73_RS01660  -1      23       -3.556018   *rRNA maturation RNase YbeY
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS01545  BINARYTOXINA  34     0.002    Clostridial binary toxin A signature.
		>BINARYTOXINA#Clostridial binary toxin A signature.

Length = 454

Score = 33.9 bits (77), Expect = 0.002
Identities = 44/225 (19%), Positives = 74/225 (32%), Gaps = 70/225 (31%)

Query: 344 GEVWAKA------SRLEKN---ALFSYTGNGYARINNDLRKGKS--------NAKAKQIA 386
G++W K ++L N + Y GY INN L ++K I
Sbjct: 261 GDLWGKENYSDWSNKLTPNELADVNDYMRGGYTAINNYLISNGPLNNPNPELDSKVNNIE 320

Query: 387 KVIDRCKVPQDMVVFRGCGVY---------------KELKDAL--NWKGEEITDELVDML 429
+ +P +++V+R G E DA W+G+ IT
Sbjct: 321 NALKLTPIPSNLIVYRRSGPQEFGLTLTSPEYDFNKIENIDAFKEKWEGKVIT------- 373

Query: 430 NLSVVGNPLKDEGFMSAAVAEGKGFMNRPVLFRILLKKKTRAIYAEPFSRFGAGAGKDWD 489
+ + + MSA F R ++ RI + K + Y
Sbjct: 374 YPNFISTSIGSVN-MSA-------FAKRKIILRINIPKDSPGAYLSAIPG---------- 415

Query: 490 GLSPQTYFSSEDEIIIQKGGTLKFLQFHNQNG----KLIIDCELI 530
++ E E+++ G K + + KLI+D LI
Sbjct: 416 -------YAGEYEVLLNHGSKFKINKVDSYKDGTVTKLILDATLI 453


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
GOZ73_RS01595  PF05272  29     0.025    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.025
Identities = 12/28 (42%), Positives = 15/28 (53%), Gaps = 1/28 (3%)

Query: 18 VIIYGPEGVGKSTLAAGL-PAPLFLDTE 44
V++ G G+GKSTL L F DT
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTH 626


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS01630  SECBCHAPRONE  36     2e-05    Bacterial protein-transport SecB chaperone protein ...
		>SECBCHAPRONE#Bacterial protein-transport SecB chaperone protein signature.

Length = 170

Score = 36.0 bits (83), Expect = 2e-05
Identities = 27/141 (19%), Positives = 62/141 (43%), Gaps = 16/141 (11%)

Query: 15 FQLRAYYVSEFSIRTNKRFDNKKEPDLKFNSLKFELDFEELDSEKGAQKSEHLKGFNLWE 74
Q++ YV + S F+ P + + +L D A++ +L+E
Sbjct: 20 LQIQRIYVKDVS------FEAPNLPHIFQQDWEPKLS---FDLSTEAKQVGD----DLYE 66

Query: 75 VRLNVSMDEKKVEENNIPYSFFIALSGLFSFPQKNPTPKDDQLRFVRINGPSILYGFVRE 134
V LN+S++ ++ + + +G+F+ + + P++L+ + RE
Sbjct: 67 VCLNISVETTMESSGDVAFICEVKQAGVFTISG---LEEMQMAHCLTSQCPNMLFPYARE 123

Query: 135 IVNSFYDKGPYPSPVLPTISF 155
+V+S ++G +P+ L ++F
Sbjct: 124 LVSSLVNRGTFPALNLSPVNF 144


8     GOZ73_RS02005  GOZ73_RS02175  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
GOZ73_RS02005  -1      18       3.159583    thioredoxin family protein
GOZ73_RS02010  -2      24       3.509347    GDSL-type esterase/lipase family protein
GOZ73_RS02015  -1      25       3.593004    3'-5' exonuclease
GOZ73_RS02020  -1      23       3.628802    DNA mismatch repair protein MutS
GOZ73_RS02025  -1      18       3.472071    ribosome silencing factor
GOZ73_RS02030  0       17       3.207751    ATP-dependent zinc metalloprotease FtsH
GOZ73_RS02035  -1      20       1.800830    hypothetical protein
GOZ73_RS02040  -1      18       0.814972    AAA family ATPase
GOZ73_RS02045  -2      13       0.025748    1,4-dihydroxy-2-naphthoyl-CoA synthase
GOZ73_RS02050  -1      14       -1.312814   hypothetical protein
GOZ73_RS02055  -1      19       -2.376342   hypothetical protein
GOZ73_RS02060  0       17       -3.391268   hypothetical protein
GOZ73_RS02065  -1      18       -3.754556   hypothetical protein
GOZ73_RS02070  -2      14       0.337630    hypothetical protein
GOZ73_RS02075  -1      15       0.795354    hypothetical protein
GOZ73_RS02080  -1      14       1.864568    ATP-binding cassette domain-containing protein
GOZ73_RS02085  -1      13       2.640987    hypothetical protein
GOZ73_RS02090  -1      12       3.735006    hypothetical protein
GOZ73_RS02095  0       12       3.554683    ATP-binding cassette domain-containing protein
GOZ73_RS02100  -2      11       3.043607    hypothetical protein
GOZ73_RS02105  -1      12       2.563402    SDR family NAD(P)-dependent oxidoreductase
GOZ73_RS02110  -1      11       1.082274    glycoside hydrolase family 75 protein
GOZ73_RS02115  1       14       -0.287197   squalene/phytoene synthase family protein
GOZ73_RS02120  1       13       -1.765613   sigma-54-dependent Fis family transcriptional
GOZ73_RS02125  1       13       -2.453316   chloride channel protein
GOZ73_RS02130  0       17       -4.302657   hypothetical protein
GOZ73_RS02135  0       16       -4.442125   beta-N-acetylhexosaminidase
GOZ73_RS02140  0       19       -4.484797   hypothetical protein
GOZ73_RS02145  -1      19       -3.958494   hypothetical protein
GOZ73_RS02150  0       19       -4.328051   glutamate decarboxylase
GOZ73_RS02155  0       20       -4.505359   GNAT family N-acetyltransferase
GOZ73_RS02160  0       22       -3.697153   DNA polymerase III subunit alpha
GOZ73_RS02165  2       27       -2.937100   quinone-dependent dihydroorotate dehydrogenase
GOZ73_RS02170  3       26       -3.582014   thiamine phosphate synthase
GOZ73_RS02175  2       27       -3.705619   TetR/AcrR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
GOZ73_RS02030  HTHFIS  41     2e-05    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 40.6 bits (95), Expect = 2e-05
Identities = 57/282 (20%), Positives = 91/282 (32%), Gaps = 80/282 (28%)

Query: 279 RARLISPDKNKVTFKDVAGISEAKEEVWELVEFLRNPEKFRDLGATIPRGVLMVGAPGTG 338
+ R + + + G S A +E++ ++ +++ G GTG
Sbjct: 123 KRRPSKLEDDSQDGMPLVGRSAAMQEIYRVL----------ARLMQTDLTLMITGESGTG 172

Query: 339 KTLLARAIAGES---NASFYS-----ISGSDFVEMFVGV------GASRVRD-MFEQA-K 382
K L+ARA+ N F + I G GA FEQA
Sbjct: 173 KELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEG 232

Query: 383 RTAPSLIFIDEIDA-----------VGRQRGYGMGGGNDEREQTLNALLVEMDGFENNSN 431
T +F+DEI V +Q Y GG S+
Sbjct: 233 GT----LFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPI----------------RSD 272

Query: 432 VIVIAATNRADILDPALLRPGRFDRQ--------VVVNLPDVRGR--------EQILQVH 475
V ++AATN+ D+ + G F R+ V + LP +R R +Q
Sbjct: 273 VRIVAATNK-DL--KQSINQGLF-REDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQA 328

Query: 476 ARKVKMAPGVSFERIARGTS-GFSG--AQLANLVNEAALLAA 514
++ E + + + G +L NLV L
Sbjct: 329 EKEGLDVKRFDQEALELMKAHPWPGNVRELENLVRRLTALYP 370


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
GOZ73_RS02080  PF05272  30     0.005    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.005
Identities = 14/42 (33%), Positives = 20/42 (47%), Gaps = 5/42 (11%)

Query: 14 GKQWIFENYNRVFTSGIT-----ILKGYSGCGKTTLLKILAG 50
GK + + RV G +L+G G GK+TL+ L G
Sbjct: 577 GKYILMGHVARVMEPGCKFDYSVVLEGTGGIGKSTLINTLVG 618


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
GOZ73_RS02095  PF05272  34     0.004    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 34.3 bits (78), Expect = 0.004
Identities = 23/94 (24%), Positives = 33/94 (35%), Gaps = 18/94 (19%)

Query: 418 VVHLNAYHALRCRFSA-----------------GVLDEEYHAIRTLSVEGVTKEFLRSGR 460
+N H R A G ++Y R ++ V K L G
Sbjct: 526 AADMNRVHPFRDWVKAQQWDEVPRLEKWLVHVLGKTPDDYKPRRLRYLQLVGKYIL-MGH 584

Query: 461 VLDNIDLTVKRGEMVCILGPSGSGKSTLLSMLAG 494
V ++ K V + G G GKSTL++ L G
Sbjct: 585 VARVMEPGCKFDYSVVLEGTGGIGKSTLINTLVG 618


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS02105  DHBDHDRGNASE  70     2e-16    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 70.5 bits (172), Expect = 2e-16
Identities = 57/197 (28%), Positives = 91/197 (46%), Gaps = 9/197 (4%)

Query: 1 MHTARLQYDRIIITGASSGFGEAFAGTLAPHTAELVLIARNEAALRQLAAALEKRHSGLH 60
M+ ++ ITGA+ G GEA A TLA A + + N L ++ ++L+ H
Sbjct: 1 MNAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAE--ARH 58

Query: 61 ASVLPCDLADEASLNMLISRLDSLPPGRTLLINNAG---AGDYGDFSDGRWEKIRFLLRL 117
A P D+ D A+++ + +R++ +L+N AG G SD WE +
Sbjct: 59 AEAFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEAT---FSV 115

Query: 118 NVESLTRLCHALVPSMK-RNGGDIINLSSLGALLPIPDFAVYAATKAYVSSLSEALRLEL 176
N + ++ M R G I+ + S A +P A YA++KA ++ L LEL
Sbjct: 116 NSTGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLEL 175

Query: 177 REHGIRVLVVCPGPVST 193
E+ IR +V PG T
Sbjct: 176 AEYNIRCNIVSPGSTET 192


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
GOZ73_RS02110  PF05616  34     0.002    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 33.6 bits (76), Expect = 0.002
Identities = 18/47 (38%), Positives = 21/47 (44%), Gaps = 1/47 (2%)

Query: 471 APKPPPAPKPESGGTAETPADKRKKEEAPGNKPAPAPAPPASPQTAP 517
A P P PE AE PA+ E PG +P P P P +P P
Sbjct: 320 AEAPNAQPLPEVS-PAENPANNPAPNENPGTRPNPEPDPDLNPDANP 365



Score = 31.3 bits (70), Expect = 0.012
Identities = 16/41 (39%), Positives = 23/41 (56%), Gaps = 1/41 (2%)

Query: 478 PKPE-SGGTAETPADKRKKEEAPGNKPAPAPAPPASPQTAP 517
P+P+ + G+AE P + E +P PA PAP +P T P
Sbjct: 311 PRPDLTPGSAEAPNAQPLPEVSPAENPANNPAPNENPGTRP 351


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
GOZ73_RS02120  HTHFIS  401    e-138    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 401 bits (1033), Expect = e-138
Identities = 132/364 (36%), Positives = 194/364 (53%), Gaps = 30/364 (8%)

Query: 172 EERAALVAENEKLRELLTTN--PGELIGNCSTMLQVYEQIRQVAPSDATVLIRGSSGTGK 229
AL + +L + L+G + M ++Y + ++ +D T++I G SGTGK
Sbjct: 114 IIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGK 173

Query: 230 ELVARAVVNLSGRKDKPLVTLNCAAMPENLLESELFGHEKGSFTGATSRRIGRAEAADGG 289
ELVARA+ + R++ P V +N AA+P +L+ESELFGHEKG+FTGA +R GR E A+GG
Sbjct: 174 ELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGG 233

Query: 290 TLFLDEIGDLSLQMQVKLLRFLQEKTFSRVGSNRELHADVRFIAATSRNLEELMAASKFR 349
TLFLDEIGD+ + Q +LLR LQ+ ++ VG + +DVR +AAT+++L++ + FR
Sbjct: 234 TLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFR 293

Query: 350 EDLYYRLNIFPIVMPDLSKRKGDVMLLAEHFLSKFNLKYGKDIKRLSTPAINMLMAYHWP 409
EDLYYRLN+ P+ +P L R D+ L HF+ + K G D+KR A+ ++ A+ WP
Sbjct: 294 EDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQAE-KEGLDVKRFDQEALELMKAHPWP 352

Query: 410 GNVRELENCMERAVITAQDDCIYGYNLPASLQMPSHDVPYSRDG---------------- 453
GNVRELEN + R D I + L+ D P +
Sbjct: 353 GNVRELENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENM 412

Query: 454 -----------EAPADLPTMVDSFERELIVAALKRSPGNMSAAARELGISPRVLHYKMHR 502
++ E LI+AAL + GN AA LG++ L K+
Sbjct: 413 RQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRE 472

Query: 503 LGLQ 506
LG+
Sbjct: 473 LGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
GOZ73_RS02175  HTHTETR  69     1e-16    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 69.3 bits (169), Expect = 1e-16
Identities = 27/148 (18%), Positives = 60/148 (40%), Gaps = 9/148 (6%)

Query: 6 KSQSDRSAETSGKILRAAQKLFALRGFNGVTMRAVASEAGVNLASIVYYFENKEGLYLAV 65
+ + ET IL A +LF+ +G + ++ +A AGV +I ++F++K L+ +
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 66 YRQYAEPLMKARMKMLKEADLHPSLRAYARAFIEPAFRLFLDESLGGPDHVRLLWRLP-Q 124
+ + +++ +A + R + + E + RLL +
Sbjct: 63 WELSESNI--GELELEYQAKFPGDPLSVLREILIHVLESTVTE-----ERRRLLMEIIFH 115

Query: 125 EPEYLG-RKIYDEFYAPSIQEMIARIRR 151
+ E++G + + E RI +
Sbjct: 116 KCEFVGEMAVVQQAQRNLCLESYDRIEQ 143


9     GOZ73_RS02405  GOZ73_RS02435  Y     N          Y                   Genomic Island
LocusTag       DNBias  CDNBias  %GCBias     Product
GOZ73_RS02405  -1      24       -5.151504   transcription antitermination factor NusB
GOZ73_RS02410  0       29       -8.872677   signal recognition particle-docking protein
GOZ73_RS02415  3       43       -13.166010  23S rRNA (adenine(2503)-C(2))-methyltransferase
GOZ73_RS02420  4       45       -13.656493  replicative DNA helicase
GOZ73_RS02425  6       53       -16.051761  hypothetical protein
GOZ73_RS02430  3       36       -9.629233   hypothetical protein
GOZ73_RS02435  1       23       -5.470396   IS1595 family transposase
10    GOZ73_RS02615  GOZ73_RS02640  Y     N          N                   Genomic Island
LocusTag       DNBias  CDNBias  %GCBias     Product
GOZ73_RS02615  4       17       0.021668    PDZ domain-containing protein
GOZ73_RS02620  4       16       -1.237524   hypothetical protein
GOZ73_RS02625  3       15       -1.267955   phosphopantothenoylcysteine decarboxylase
GOZ73_RS02630  3       14       -1.338966   hypothetical protein
GOZ73_RS02635  2       11       -0.667616   M23 family metallopeptidase
GOZ73_RS02640  2       12       -0.886393   tetratricopeptide repeat protein
11    GOZ73_RS03320  GOZ73_RS03390  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
GOZ73_RS03320  3       23       -4.353839   hypothetical protein
GOZ73_RS03325  2       23       -4.218396   hypothetical protein
GOZ73_RS03330  3       24       -5.766823   DUF1348 family protein
GOZ73_RS03335  3       22       -6.367530   hypothetical protein
GOZ73_RS03340  4       20       -5.730545   DUF4339 domain-containing protein
GOZ73_RS03345  3       23       -6.462100   DUF4339 domain-containing protein
GOZ73_RS03350  1       18       -4.646985   DUF1311 domain-containing protein
GOZ73_RS03355  1       17       -3.784030   DUF1311 domain-containing protein
GOZ73_RS03360  -1      11       -0.656543   UvrB/UvrC motif-containing protein
GOZ73_RS03365  0       16       1.346633    hypothetical protein
GOZ73_RS03370  0       14       2.275680    endonuclease/exonuclease/phosphatase family
GOZ73_RS03375  0       13       3.219931    16S rRNA
GOZ73_RS03380  0       14       4.070127    LPS assembly lipoprotein LptE
GOZ73_RS03385  -1      23       3.441581    tetratricopeptide repeat protein
GOZ73_RS03390  -1      19       3.023591    phosphoribosyl-AMP cyclohydrolase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS03330  TONBPROTEIN   37     5e-05    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 37.3 bits (86), Expect = 5e-05
Identities = 14/60 (23%), Positives = 19/60 (31%), Gaps = 1/60 (1%)

Query: 1 MDIPLPPLEAPPLPPPMDIPLPPLPEVPLPPPVDIPLPPPAGIPPLPPLSGRPTVPRQKK 60
D+ P PP P ++ P P P + + P P P V Q K
Sbjct: 53 ADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKP-KPVKKVQEQPK 111



Score = 30.0 bits (67), Expect = 0.010
Identities = 15/52 (28%), Positives = 19/52 (36%), Gaps = 1/52 (1%)

Query: 10 APPLPPPMDIPLPPLPEV-PLPPPVDIPLPPPAGIPPLPPLSGRPTVPRQKK 60
L PP + PP P V P P P IP PP + +P +
Sbjct: 52 PADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPV 103


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS03355  60KDINNERMP   28     0.041    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 28.4 bits (63), Expect = 0.041
Identities = 11/36 (30%), Positives = 18/36 (50%)

Query: 108 RRYYLSGEKSYEKAKGENPKPATITFTDKGGYCYYK 143
R Y + +Y A+G+N +T+TD G + K
Sbjct: 128 RPLYNVEKDAYVLAEGQNELQVPMTYTDAAGNTFTK 163


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS03370  cdtoxinb      31     0.002    Cytolethal distending toxin B signature.
		>cdtoxinb#Cytolethal distending toxin B signature.

Length = 269

Score = 31.1 bits (70), Expect = 0.002
Identities = 26/123 (21%), Positives = 44/123 (35%), Gaps = 13/123 (10%)

Query: 63 YAKAIDYSGGSYGVGILSREKPLSVKRIPLPGREEARVLLMAEF-RDYWFCVTHLSLTKE 121
Y A+D GG + ++S + V + P R+ R LL D +F +++
Sbjct: 103 YFSAVDALGGRVNLALVSNRRADEVFVLS-PVRQGGRPLLGIRIGNDAFFTAHAIAMRNN 161

Query: 122 DSSASIDMIAALAAKCSKP------FFIAGDFNLTPDS-----EPITRMKKYFILLSDPA 170
D+ A ++ + P + I GDFN P R I +
Sbjct: 162 DAPALVEEVYNFFRDSRDPVHQALNWMILGDFNREPADLEMNLTVPVRRASEIISPAAAT 221

Query: 171 QKT 173
Q +
Sbjct: 222 QTS 224


12  GOZ73_RS03435  GOZ73_RS12365  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
GOZ73_RS03435  1       21       -5.605244   multidrug efflux SMR transporter
GOZ73_RS03440  2       27       -8.335610   glycoside hydrolase
GOZ73_RS12175  3       40       -12.532959  hypothetical protein
GOZ73_RS03445  4       46       -14.329154  M60 family metallopeptidase
GOZ73_RS03450  5       60       -17.126265  helix-turn-helix domain-containing protein
GOZ73_RS03455  4       48       -13.730484  autotransporter outer membrane beta-barrel
GOZ73_RS03460  6       38       -11.959009  hypothetical protein
GOZ73_RS03465  3       24       -6.330882   hypothetical protein
GOZ73_RS03470  2       16       -1.725433   hypothetical protein
GOZ73_RS03480  0       11       1.258227    *hypothetical protein
GOZ73_RS03485  0       10       2.007785    Holliday junction branch migration DNA helicase
GOZ73_RS03490  0       11       0.632229    ketoacyl-ACP synthase III
GOZ73_RS03495  0       15       -1.585909   sulfite reductase subunit alpha
GOZ73_RS03500  1       20       -2.195846   glycosyltransferase family 2 protein
GOZ73_RS03505  0       17       -1.605835   hypothetical protein
GOZ73_RS03510  2       20       -1.043123   tRNA (adenosine(37)-N6)-dimethylallyltransferase
GOZ73_RS03515  2       18       -3.564506   GDSL-type esterase/lipase family protein
GOZ73_RS03520  1       16       -3.418238   acyltransferase
GOZ73_RS03525  1       14       -3.114696   acyltransferase
GOZ73_RS03530  1       14       -3.739016   glycosyltransferase family 4 protein
GOZ73_RS03535  1       15       -4.442021   glycosyltransferase family 4 protein
GOZ73_RS03540  1       17       -4.660020   sugar transporter
GOZ73_RS03545  2       15       -2.574060   hypothetical protein
GOZ73_RS03550  3       14       -2.092704   glycosyltransferase family 8 protein
GOZ73_RS03555  1       15       -0.884507   glycosyltransferase family 2 protein
GOZ73_RS03560  1       24       1.506009    hypothetical protein
GOZ73_RS12180  0       30       2.440134    DUF1232 domain-containing protein
GOZ73_RS03570  4       23       3.936855    TatD family hydrolase
GOZ73_RS03575  4       23       2.825281    hypothetical protein
GOZ73_RS03580  4       22       0.326486    succinate dehydrogenase cytochrome b subunit
GOZ73_RS03585  4       21       -0.580388   fumarate reductase/succinate dehydrogenase
GOZ73_RS03590  5       25       -2.622412   succinate dehydrogenase/fumarate reductase
GOZ73_RS03595  6       28       -4.208883   sugar-binding protein
GOZ73_RS03600  6       59       -18.422393  hypothetical protein
GOZ73_RS03605  5       50       -14.985389  hypothetical protein
GOZ73_RS03610  3       32       -8.552130   RHS repeat-associated core domain-containing
GOZ73_RS03615  1       28       -6.850037   hypothetical protein
GOZ73_RS12365  1       22       -4.779776   RHS repeat-associated core domain-containing
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS03435  PF06580       27     0.017    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 26.8 bits (59), Expect = 0.017
Identities = 17/86 (19%), Positives = 32/86 (37%), Gaps = 5/86 (5%)

Query: 14 IGWPLGFKLSQTPGTGPGKFAACIACSVFSMVLSGFLLWLAQRNIPIGTAYAVW-----T 68
IGW + + G ++ L G +L A R+ +
Sbjct: 18 IGWGVYTLTGFGFASLYGSPKLHSMIFNIAISLMGLVLTHAYRSFIKRQGWLKLNMGQII 77

Query: 69 GIGALGTFLVGILWFGDSSNIWRMLA 94
++G++WF +++IWR+LA
Sbjct: 78 LRVLPACVVIGMVWFVANTSIWRLLA 103


13  GOZ73_RS04040  GOZ73_RS04165  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
GOZ73_RS04040  -2      13       3.550206    hypothetical protein
GOZ73_RS04045  -2      13       3.780322    uroporphyrinogen-III C-methyltransferase
GOZ73_RS04050  -2      13       3.113316    PepSY-like domain-containing protein
GOZ73_RS04055  -2      13       3.300422    zinc metallopeptidase
GOZ73_RS04060  -2      13       3.770431    autotransporter outer membrane beta-barrel
GOZ73_RS04065  -1      17       4.255902    multidrug efflux RND transporter permease
GOZ73_RS04070  1       17       4.466112    efflux RND transporter periplasmic adaptor
GOZ73_RS04075  1       19       3.900420    MerR family DNA-binding transcriptional
GOZ73_RS04080  1       20       4.971819    thiamine pyrophosphate-dependent dehydrogenase
GOZ73_RS04085  0       20       5.121947    transketolase
GOZ73_RS04090  1       16       3.616653    2-oxo acid dehydrogenase subunit E2
GOZ73_RS04095  1       18       2.119504    UDP-2,3-diacylglucosamine diphosphatase LpxI
GOZ73_RS04100  0       24       2.309502    hypothetical protein
GOZ73_RS04105  1       18       3.943236    hypothetical protein
GOZ73_RS04110  -1      19       4.263952    hypothetical protein
GOZ73_RS04115  -2      16       3.371836    MmcQ/YjbR family DNA-binding protein
GOZ73_RS04120  -2      15       3.140409    WecB/TagA/CpsF family glycosyltransferase
GOZ73_RS04125  -1      15       2.928190    MBL fold metallo-hydrolase
GOZ73_RS04130  -1      15       3.114003    DNA repair protein RecN
GOZ73_RS04135  -2      14       1.892000    heavy metal translocating P-type ATPase
GOZ73_RS04140  -2      15       0.805367    glycoside hydrolase family 16 protein
GOZ73_RS04145  0       15       2.035970    elongation factor P
GOZ73_RS04150  0       15       2.714164    amino acid-binding protein
GOZ73_RS04155  1       18       2.529380    acetylornithine transaminase
GOZ73_RS04160  1       17       2.053148    large conductance mechanosensitive channel
GOZ73_RS04165  2       15       2.017176    alkaline phosphatase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS04060  PRTACTNFAMLY  37     3e-04    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 37.3 bits (86), Expect = 3e-04
Identities = 63/226 (27%), Positives = 88/226 (38%), Gaps = 21/226 (9%)

Query: 706 AGGLAVNS-LWSSASNAAALGGNVLSRLSVPRLTDRHARNLWGMGLGDFARQRSRGGVDG 764
GG+ + S LW + SNA + RL RL A WG G + +R G
Sbjct: 616 TGGVGLASTLWYAESNA------LSKRLGELRLNP-DAGGAWGRGFAQRQQLDNRAG-RR 667

Query: 765 YDYNGGGYSVGADSGMGGEGEGIWGIAFGQLCGHARS-RDFQGRNTQDTLMGSLYWGRLM 823
+D G+ +GAD + G G W + G L G+ R R F G T S++ G
Sbjct: 668 FDQKVAGFELGADHAVAVAG-GRWHL--GGLAGYTRGDRGFTGDGGGHT--DSVHVGGYA 722

Query: 824 EESNRACWIFKGSLTWAETRNNMTSRLGGAPASTGKWNNETWLAQAEVSRTADCAGGWRL 883
+ + +L + N+ A GK+ A E R A GW L
Sbjct: 723 TYIADSGFYLDATLRASRLENDFKVAGSDGYAVKGKYRTHGVGASLEAGRRFTHADGWFL 782

Query: 884 TPFVRMEFTHGRQDAFREQGGYG-RDFGG-AALKRLSIPVGLEIGR 927
P + A+R G RD GG + L RL GLE+G+
Sbjct: 783 EPQAELAVFRAGGGAYRAANGLRVRDEGGSSVLGRL----GLEVGK 824


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS04065  ACRIFLAVINRP  1097   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1097 bits (2839), Expect = 0.0
Identities = 519/1027 (50%), Positives = 715/1027 (69%), Gaps = 6/1027 (0%)

Query: 1 MPDFFIRRPIFAWVVALLISLMGLLAIPSLPIAQYPDIAPPVITLRTTYPGASAQDTEEA 60
M +FFIRRPIFAWV+A+++ + G LAI LP+AQYP IAPP +++ YPGA AQ ++
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTAVIEKEINGAPGLMYMSATSSSDGTVEIAATFTQGTDPDIAAVEVQNRLKIVESRLPE 120
VT VIE+ +NG LMYMS+TS S G+V I TF GTDPDIA V+VQN+L++ LP+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 SVRREGVFVEKSSNNIQAMISLSST-GSLNDTELGEWASARIIPELKRVKGVGRVELFGA 179
V+++G+ VEKSS++ + S ++ ++ ++ + L R+ GVG V+LFGA
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 180 ETSMRIWPDPEKLEALGLTPTDIVNAVAAQSERIIIGDIGGAAVPESAPINAGVVAEDAF 239
+ +MRIW D + L LTP D++N + Q+++I G +GG +NA ++A+ F
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 240 RTPEEFASIALRTLSDGSSVKLGDVARVELGANSYAFFSRCNGQTATGMAIKMAPGSNAV 299
+ PEEF + LR SDGS V+L DVARVELG +Y +R NG+ A G+ IK+A G+NA+
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 300 ETMGLLKAKMDELSRDFPPGVSYQIPYETTHFVEISIRKVISTLLEAVLLVFLVMFLFLQ 359
+T +KAK+ EL FP G+ PY+TT FV++SI +V+ TL EA++LVFLVM+LFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 360 NFRATLIPTLVVPVALLGTFAVMWLSGFSINMLTMFGMVLSIGILVDDAIVVVENVERIM 419
N RATLIPT+ VPV LLGTFA++ G+SIN LTMFGMVL+IG+LVDDAIVVVENVER+M
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 420 MEEGLDARRATVKAMKQIGGAIVGITAVLVAVFVPMAFFSGAVGNIYRQFAVTLIVSISF 479
ME+ L + AT K+M QI GA+VGI VL AVF+PMAFF G+ G IYRQF++T++ +++
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 480 SAFLALSLTPALCAAMLKPSSVEHQE-KKGFFGWFNRTIHRSTDRYGRAVSGMIRRPVRS 538
S +AL LTPALCA +LKP S EH E K GFFGWFN T S + Y +V ++ R
Sbjct: 481 SVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY 540

Query: 539 LFLYALIIAGAGYLYLTLPTSFLPEEDQGNFMAMVSLPAGTLQKETSVRLREVEDYLMRH 598
L +YALI+AG L+L LP+SFLPEEDQG F+ M+ LPAG Q+ T L +V DY +++
Sbjct: 541 LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN 600

Query: 599 EP--VEHVYSVGGFSFMGSGTNMAMLFVGLKNWDERPGENSSVQAIIDRVNARFAGDGQM 656
E VE V++V GFSF G N M FV LK W+ER G+ +S +A+I R
Sbjct: 601 EKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDG 660

Query: 657 MVMALNMPALPELGNSSGFDFRLQDKGGLGYARMAEARDELLARANANP-NLAGVYFSGQ 715
V+ NMPA+ ELG ++GFDF L D+ GLG+ + +AR++LL A +P +L V +G
Sbjct: 661 FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGL 720

Query: 716 ADTPRLHVSIDREKAFSMGVPIAEISNSLAVMFGSSYVGDFMHDSQVRRIMVQADGKSRL 775
DT + + +D+EKA ++GV +++I+ +++ G +YV DF+ +V+++ VQAD K R+
Sbjct: 721 EDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRM 780

Query: 776 NGNDIEDLHVRNDQGKLLPLSSFVRLDWTAGPPQLTRYNNYPSFTINGSSAPGKSSGDAM 835
D++ L+VR+ G+++P S+F W G P+L RYN PS I G +APG SSGDAM
Sbjct: 781 LPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAM 840

Query: 836 NALEQITASLPQGVGFEWTGQSYEEKLAGSQATLLFGLSILIVFLVLAALYESWSIPLAV 895
+E + + LP G+G++WTG SY+E+L+G+QA L +S ++VFL LAALYESWSIP++V
Sbjct: 841 ALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV 900

Query: 896 ILVVPLGLLGALLAVFLRDMPNDIYFKVGLITTIGLSAKNAILIVEVAREFYR-EGMNAA 954
+LVVPLG++G LLA L + ND+YF VGL+TTIGLSAKNAILIVE A++ EG
Sbjct: 901 MLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVV 960

Query: 955 EAAVAAAKLRLRPILMTSLAFGAGVIPLAFAHGAAAGAQRAVGTGVLGGIITATLLAIFL 1014
EA + A ++RLRPILMTSLAF GV+PLA ++GA +GAQ AVG GV+GG+++ATLLAIF
Sbjct: 961 EATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFF 1020

Query: 1015 VPLFFRL 1021
VP+FF +
Sbjct: 1021 VPVFFVV 1027



Score = 89.5 bits (222), Expect = 4e-20
Identities = 91/526 (17%), Positives = 177/526 (33%), Gaps = 50/526 (9%)

Query: 532 IRRPVRSLFLYALIIAGAGYLYLTLPTSFLPEEDQGNFMAMVSLP-AGTLQKETSVRLRE 590
IRRP+ + L +++ L LP + P + P A + +V +
Sbjct: 6 IRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV-TQV 64

Query: 591 VEDYLMRHEPVEHVYSVGGFSFMGSGTNMAMLFVGLKNWDERPGENSSVQAIIDR----- 645
+E + + + ++ S + GS T G + + +Q
Sbjct: 65 IEQNMNGIDNLMYMSSTSDSA--GSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEV 122

Query: 646 ----VNARFAGDGQMMVMALNMPALPELGNSSGFDFRLQDKGGLGYARMAEARDELLARA 701
++ + +MV S D + + + L+R
Sbjct: 123 QQQGISVEKSSSSYLMVAGFV---------SDNPGTTQDDISDYVASNVKDT----LSRL 169

Query: 702 NANPNLAGVYFSGQADTPRLHVSIDREKAFSMGVPIAEISNSLAV----MFGSSYVGDFM 757
+ V G + + +D + + ++ N L V + G
Sbjct: 170 ---NGVGDVQLFGAQY--AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPA 224

Query: 758 HDSQVRRIMVQADGKSRLNGNDIEDLHVR-NDQGKLLPLSSF--VRLDWTAGPPQLTRYN 814
Q + A + + N + + +R N G ++ L V L + R N
Sbjct: 225 LPGQQLNASIIAQTRFK-NPEEFGKVTLRVNSDGSVVRLKDVARVELG-GENYNVIARIN 282

Query: 815 NYPSFTINGSSAPGKSSGDAMNA----LEQITASLPQGVGFEW---TGQSYEEKLAGSQA 867
P+ + A G ++ D A L ++ PQG+ + T + +
Sbjct: 283 GKPAAGLGIKLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVK 342

Query: 868 TLLFGLSILIVFLVLAALYESWSIPLAVILVVPLGLLGALLAVFLRDMPNDIYFKVGLIT 927
TL I++VFLV+ ++ L + VP+ LLG + + G++
Sbjct: 343 TLFEA--IMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVL 400

Query: 928 TIGLSAKNAILIVE-VAREFYREGMNAAEAAVAAAKLRLRPILMTSLAFGAGVIPLAFAH 986
IGL +AI++VE V R + + EA + ++ ++ A IP+AF
Sbjct: 401 AIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFG 460

Query: 987 GAAAGAQRAVGTGVLGGIITATLLAIFLVPLFFRLTAGKRGPDHPE 1032
G+ R ++ + + L+A+ L P +H E
Sbjct: 461 GSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHE 506


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS04070  RTXTOXIND     45     3e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 44.8 bits (106), Expect = 3e-07
Identities = 22/139 (15%), Positives = 44/139 (31%), Gaps = 7/139 (5%)

Query: 65 EVRARVTGIIQERCYQEGQTVRPGDLLFKIDP-------APLQAVLDECKAAVARARAVL 117
E++ I++E +EG++VR GD+L K+ Q+ L + + R + +
Sbjct: 98 EIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILS 157

Query: 118 SDAEDKAARYSSLVAKGAVSIREHKQARAEEERARAEYAAAAASLEQARLNLEYTRVEAP 177
E L + ++ + +++ Q LNL+ R E
Sbjct: 158 RSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERL 217

Query: 178 ISGRVRRALVTEGAFANQN 196

Sbjct: 218 TVLARINRYENLSRVEKSR 236



Score = 31.0 bits (70), Expect = 0.010
Identities = 12/94 (12%), Positives = 32/94 (34%), Gaps = 10/94 (10%)

Query: 99 LQAVLDECKAAVARARAVLSDAEDKAARYSSLVAKGAVSIREHKQARAEEERARAEYAAA 158
L K+ + + + + A+++ + L + + +
Sbjct: 264 AVNELRVYKSQLEQIESEILSAKEEYQLVTQLF---------KNEILDKLRQTTDNIGLL 314

Query: 159 AASLEQARLNLEYTRVEAPISGRV-RRALVTEGA 191
L + + + + AP+S +V + + TEG
Sbjct: 315 TLELAKNEERQQASVIRAPVSVKVQQLKVHTEGG 348


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS04130  GPOSANCHOR    31     0.012    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 31.2 bits (70), Expect = 0.012
Identities = 43/250 (17%), Positives = 64/250 (25%), Gaps = 35/250 (14%)

Query: 148 DAFGEHAPLVHAYSDAWRQWQDARRAYDDLEHAEAATAREIELLRHQVDEIDSAAFTPEE 207
+ E + + + +LE +A + +E + + T E
Sbjct: 89 ELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEA 148

Query: 208 VL--------TLEERWQRARNGTRLREQVSKMLS---MLEETDVPGLGTQLREL------ 250
LE+ + A N + K L E L L
Sbjct: 149 EKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTA 208

Query: 251 -TRAAHELERMDASTAAWLAPLAGVNLELKEIEGRLADYSAELDCDPRELFQLEERINLL 309
+ LE A+ AA A L + L+ E LE R L
Sbjct: 209 DSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLE---AEKAALEARQAEL 265

Query: 310 ESLKMKYGPSFEDVCSRREEAASRLDRIEHRTERLEELRASIAVLRKQMDAAGQALTRAR 369
E + + +I+ L A A L Q Q L R
Sbjct: 266 EKALEGA----------MNFSTADSAKIKTLEAEKAALEAEKADLEHQS----QVLNANR 311

Query: 370 QDSAPRLAAS 379
Q L AS
Sbjct: 312 QSLRRDLDAS 321


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS04160  MECHCHANNEL   126    2e-40    Bacterial mechano-sensitive ion channel signature.
		>MECHCHANNEL#Bacterial mechano-sensitive ion channel signature.

Length = 136

Score = 126 bits (317), Expect = 2e-40
Identities = 63/132 (47%), Positives = 86/132 (65%), Gaps = 8/132 (6%)

Query: 9 GFLEEFKAFALKGNVIDLAVAVVIGGAFNKIVSVFVSSIITPIIGYLTSGVNLAYLVLKL 68
++EF+ FA++GNV+DLAV V+IG AF KIVS V+ II P +G L G++ + L
Sbjct: 2 SIIKEFREFAMRGNVVDLAVGVIIGAAFGKIVSSLVADIIMPPLGLLIGGIDFKQFAVTL 61

Query: 69 GD-------VELKYGELLQATIDFLIIAFSVFVAIKLIGALRRKSEEAPAAPAPKPDDVV 121
D V + YG +Q DFLI+AF++F+AIKLI L RK EE AAPAP + V
Sbjct: 62 RDAQGDIPAVVMHYGVFIQNVFDFLIVAFAIFMAIKLINKLNRKKEEPAAAPAPT-KEEV 120

Query: 122 LLEQIRDILQKQ 133
LL +IRD+L++Q
Sbjct: 121 LLTEIRDLLKEQ 132


14  GOZ73_RS04435  GOZ73_RS04475  Y  N  N  Genomic Island
LocusTag       DNBias  CDNBias  %GCBias     Product
GOZ73_RS04435  0       23       -3.156627   TatD family hydrolase
GOZ73_RS04440  -1      25       -4.707463   LysM peptidoglycan-binding domain-containing
GOZ73_RS04445  -1      26       -5.246912   L,D-transpeptidase family protein
GOZ73_RS04450  -1      28       -5.484317   MATE family efflux transporter
GOZ73_RS04455  -1      29       -4.823294   hypothetical protein
GOZ73_RS04460  -1      27       -5.110131   hypothetical protein
GOZ73_RS04465  -1      25       -4.999919   DNA polymerase III subunit beta
GOZ73_RS04470  -1      25       -4.854017   LamG domain-containing protein
GOZ73_RS04475  -2      25       -4.833563   hypothetical protein
15  GOZ73_RS04805  GOZ73_RS04880  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
GOZ73_RS04805  2       16       -0.352093   aspartate 1-decarboxylase
GOZ73_RS04810  1       16       -4.013420   preprotein translocase subunit SecG
GOZ73_RS04815  0       16       -4.917090   prepilin-type N-terminal cleavage/methylation
GOZ73_RS04820  0       17       -5.067677   3-deoxy-7-phosphoheptulonate synthase
GOZ73_RS04825  -1      17       -3.628362   hypothetical protein
GOZ73_RS04830  1       17       -1.821983   AEC family transporter
GOZ73_RS04835  -1      15       -2.787727   NYN domain-containing protein
GOZ73_RS04840  1       12       0.621916    HAD hydrolase-like protein
GOZ73_RS04845  2       11       -0.692278   sugar O-acetyltransferase
GOZ73_RS04850  1       11       -0.634456   glutamine-hydrolyzing GMP synthase
GOZ73_RS04855  1       14       -1.474629   IMP dehydrogenase
GOZ73_RS04860  2       15       -2.459260   low molecular weight phosphotyrosine protein
GOZ73_RS04865  1       14       -2.058154   RHS repeat-associated core domain-containing
GOZ73_RS04870  0       18       -5.476417   hypothetical protein
GOZ73_RS04875  -2      17       -2.220757   glycosyltransferase
GOZ73_RS04880  -1      16       -3.087609   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS04810  SECGEXPORT    49     1e-10    Protein-export SecG membrane protein signature.
		>SECGEXPORT#Protein-export SecG membrane protein signature.

Length = 110

Score = 49.2 bits (117), Expect = 1e-10
Identities = 26/107 (24%), Positives = 52/107 (48%), Gaps = 2/107 (1%)

Query: 14 QLLFAVLVIVSLLLLGVVLMQRPKQEGLGAAFGAAITDQAFGAR-TTDVLKKATVYFGTA 72
+ L V +IV++ L+G++++Q+ K +GA+FGA + FG+ + + + + T T
Sbjct: 3 EALLVVFLIVAIGLVGLIMLQQGKGADMGASFGAGASATLFGSSGSGNFMTRMTALLATL 62

Query: 73 FMVLCLGLGMLINRQHVKSSESLLSPEMMKAAARQEASVPAKTPEEL 119
F ++ L LG IN + + + + PAK ++
Sbjct: 63 FFIISLVLGN-INSNKTNKGSEWENLSAPAKTEQTQPAAPAKPTSDI 108


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS04815  BCTERIALGSPG  61     4e-14    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 60.7 bits (147), Expect = 4e-14
Identities = 26/63 (41%), Positives = 37/63 (58%)

Query: 8 RKSKGFTLIELLVVIAIILTLAGIAMTVVNGVMEKSKVTTARSDCNGLQTAVENYMLDNG 67
K +GFTL+E++VVI II LA + + + G EK+ A SD L+ A++ Y LDN
Sbjct: 5 DKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKLDNH 64

Query: 68 RVP 70
P
Sbjct: 65 HYP 67


16  GOZ73_RS05165  GOZ73_RS05285  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
GOZ73_RS05165  2       21       1.931357    trypsin-like peptidase domain-containing
GOZ73_RS05170  1       22       2.235904    hypothetical protein
GOZ73_RS05175  0       21       1.736810    3-isopropylmalate dehydratase small subunit
GOZ73_RS05180  0       20       1.871668    3-isopropylmalate dehydratase large subunit
GOZ73_RS05185  -1      19       2.134957    lipoyl(octanoyl) transferase LipB
GOZ73_RS05190  0       19       2.329360    N-acetylmuramoyl-L-alanine amidase
GOZ73_RS05195  -1      20       2.602399    recombinase RecA
GOZ73_RS05200  0       19       3.032420    FtsQ-type POTRA domain-containing protein
GOZ73_RS05205  0       19       3.783160    D-alanine--D-alanine ligase
GOZ73_RS05210  0       20       3.879288    UDP-N-acetylmuramate--L-alanine ligase
GOZ73_RS05215  0       20       3.970341    undecaprenyldiphospho-muramoylpentapeptide
GOZ73_RS05220  0       22       4.261204    putative lipid II flippase FtsW
GOZ73_RS05225  -1      21       3.774718    LysM peptidoglycan-binding domain-containing
GOZ73_RS05230  -1      21       3.510040    UDP-N-acetylmuramoyl-L-alanine--D-glutamate
GOZ73_RS05235  -1      21       4.190837    phospho-N-acetylmuramoyl-pentapeptide-
GOZ73_RS05240  -1      19       3.758051    UDP-N-acetylmuramoyl-tripeptide--D-alanyl-D-
GOZ73_RS05245  0       18       3.399445    UDP-N-acetylmuramoyl-L-alanyl-D-glutamate--2,
GOZ73_RS05250  -1      15       2.990684    penicillin-binding protein 2
GOZ73_RS05255  -1      10       1.839457    hypothetical protein
GOZ73_RS05260  -1      11       2.925695    16S rRNA (cytosine(1402)-N(4))-methyltransferase
GOZ73_RS05265  -2      14       1.863160    MraZ protein
GOZ73_RS05270  -2      14       1.628515    MBL fold metallo-hydrolase
GOZ73_RS05275  -1      14       1.266002    trimeric intracellular cation channel family
GOZ73_RS05280  -1      16       1.973786    IdeS/Mac family cysteine endopeptidase
GOZ73_RS05285  -1      17       3.027872    tRNA uridine-5-carboxymethylaminomethyl(34)
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS05165  V8PROTEASE    77     2e-17    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 77.0 bits (189), Expect = 2e-17
Identities = 42/212 (19%), Positives = 81/212 (38%), Gaps = 43/212 (20%)

Query: 50 YKGVVKIEMDSLTPDYATPWNTGRYQGGIGTGFLIGENAFMTNAHVVSNAERIYI----- 104
Y V I++++ T + I +G ++G++ +TN HVV
Sbjct: 87 YAPVTYIQVEAPTGTF------------IASGVVVGKDTLLTNKHVVDATHGDPHALKAF 134

Query: 105 ------SMYGDSRKIPARVKFIAHDADLALLE----ADDPRPFKGIRPFEFSKN-LPHLE 153
Y + ++ + + DLA+++ + + ++P S N +
Sbjct: 135 PSAINQDNYPNGGFTAEQITKYSGEGDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVN 194

Query: 154 DEVRVIGYPIGGNR--LSVTRGVVSRIDFTTYAHPRNTEHLTIQVDAAINPGNSGGPVLM 211
+ V GYP + ++G ++ + + +Q D + GNSG PV
Sbjct: 195 QNITVTGYPGDKPVATMWESKGKITYL-----------KGEAMQYDLSTTGGNSGSPVFN 243

Query: 212 GNK-VIGVAFQGLNNANNTGYVIPTPVIRHFL 242
VIG+ + G+ N N I +R+FL
Sbjct: 244 EKNEVIGIHWGGVPNEFNGAVFI-NENVRNFL 274


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS05225  PF03544       30     0.006    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 30.3 bits (68), Expect = 0.006
Identities = 24/102 (23%), Positives = 36/102 (35%), Gaps = 10/102 (9%)

Query: 52 LVVLLLLHLLVIGGVYVRSTWFRNTAETVEMAAALPAPPTVPQRPAPAAPPAALPQVPQA 111
++ + +H V+ G+ S V LPAP A PQ Q
Sbjct: 18 TLLSVCIHGAVVAGLLYTS---------VHQVIELPAPAQPISVTMVAPADLEPPQAVQP 68

Query: 112 PPAAPVTITQPEQVVDARPNRQPAPAVEEHIPDAAPVAGPAR 153
PP PV +PE P ++ +E+ P P P +
Sbjct: 69 PPE-PVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVK 109



Score = 28.0 bits (62), Expect = 0.039
Identities = 12/85 (14%), Positives = 20/85 (23%)

Query: 85 ALPAPPTVPQRPAPAAPPAALPQVPQAPPAAPVTITQPEQVVDARPNRQPAPAVEEHIPD 144
A +P P P + P+ + A + A
Sbjct: 90 APVVIEKPKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKP 149

Query: 145 AAPVAGPARHIVRTGDTWERVARDN 169
VA R + R + A+
Sbjct: 150 VTSVASGPRALSRNQPQYPARAQAL 174


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS05260  NUCEPIMERASE  30     0.029    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 29.8 bits (67), Expect = 0.029
Identities = 20/79 (25%), Positives = 31/79 (39%), Gaps = 11/79 (13%)

Query: 247 GGH-TELLLEQGATVWGIDR-----DADARRAAMLRLARFGSRFKVLAGNFQD---VESI 297
G H ++ LLE G V GID D ++A + LA+ F+ + D + +
Sbjct: 13 GFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQ--PGFQFHKIDLADREGMTDL 70

Query: 298 LSQQGVSRVDGLLADLGVS 316
+ RV L V
Sbjct: 71 FASGHFERVFISPHRLAVR 89


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS05265  PF07520       28     0.014    Virulence protein SrfB
		>PF07520#Virulence protein SrfB

Length = 1041

Score = 28.4 bits (63), Expect = 0.014
Identities = 12/38 (31%), Positives = 19/38 (50%)

Query: 33 GCALLLLSGRRLDLPTVKAYTREKFQQLIDKIETTPGY 70
GC ++LL+GR LP V+A E ++ + Y
Sbjct: 803 GCDVVLLTGRPSRLPAVRAIVEEMLVVPPHRLISMHRY 840


17  GOZ73_RS05850  GOZ73_RS05875  Y  N  N  Genomic Island
LocusTag       DNBias  CDNBias  %GCBias     Product
GOZ73_RS05850  -2      13       -3.269022   prolipoprotein diacylglyceryl transferase
GOZ73_RS05855  -1      18       -7.368513   DUF3313 family protein
GOZ73_RS05860  0       21       -7.302482   aminopeptidase P N-terminal domain-containing
GOZ73_RS05865  1       18       -6.687193   ATP phosphoribosyltransferase
GOZ73_RS05870  0       17       -6.305756   hypothetical protein
GOZ73_RS05875  -1      14       -4.024661   hypothetical protein
18  GOZ73_RS05990  GOZ73_RS06220  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
GOZ73_RS05990  0       15       -3.046195   MATE family efflux transporter
GOZ73_RS05995  -1      18       -3.498546   hypothetical protein
GOZ73_RS06005  1       26       -4.906923   *hypothetical protein
GOZ73_RS06010  2       27       -4.096840   M15 family metallopeptidase
GOZ73_RS06015  2       24       -3.057112   hypothetical protein
GOZ73_RS06020  2       22       -1.799356   hypothetical protein
GOZ73_RS06025  1       19       1.795889    hypothetical protein
GOZ73_RS06030  1       14       2.561531    hypothetical protein
GOZ73_RS06035  1       14       1.816832    hypothetical protein
GOZ73_RS06040  2       14       0.823257    glycosyltransferase
GOZ73_RS06045  3       17       -0.819945   alpha-1,2-fucosyltransferase
GOZ73_RS06050  3       19       -2.474559   hypothetical protein
GOZ73_RS06055  5       26       -5.982375   RHS repeat-associated core domain-containing
GOZ73_RS06060  5       57       -16.676069  hypothetical protein
GOZ73_RS06065  5       57       -16.596880  hypothetical protein
GOZ73_RS06070  5       55       -15.282639  RHS repeat-associated core domain-containing
GOZ73_RS06075  6       39       -11.933252  hypothetical protein
GOZ73_RS06080  6       39       -12.157405  hypothetical protein
GOZ73_RS06085  7       34       -8.915598   RHS repeat-associated core domain-containing
GOZ73_RS06090  7       39       -10.303536  hypothetical protein
GOZ73_RS06095  5       42       -12.200555  RHS repeat protein
GOZ73_RS06100  5       47       -13.421556  DUF2778 domain-containing protein
GOZ73_RS12385  3       68       -18.966936  hypothetical protein
GOZ73_RS06110  3       59       -15.712762  hypothetical protein
GOZ73_RS06115  2       44       -13.341877  hypothetical protein
GOZ73_RS06120  2       33       -10.243484  hypothetical protein
GOZ73_RS06125  1       27       -7.226716   hypothetical protein
GOZ73_RS06130  1       23       -6.134795   hypothetical protein
GOZ73_RS06140  1       23       -4.796542   hypothetical protein
GOZ73_RS06150  1       23       -5.322248   potassium-transporting ATPase subunit KdpA
GOZ73_RS06155  2       23       -4.216801   potassium-transporting ATPase subunit KdpB
GOZ73_RS06160  3       26       -3.627357   K(+)-transporting ATPase subunit C
GOZ73_RS06165  3       29       -3.076527   sensor protein KdpD
GOZ73_RS06170  2       33       -2.188551   SUMF1/EgtB/PvdO family nonheme iron enzyme
GOZ73_RS06175  2       32       -2.720527   hypothetical protein
GOZ73_RS06180  1       29       -1.375371   MBL fold metallo-hydrolase
GOZ73_RS06185  0       28       0.454423    sigma-70 family RNA polymerase sigma factor
GOZ73_RS06190  -1      18       1.876202    serine/threonine protein kinase
GOZ73_RS06195  0       14       3.530330    MerR family transcriptional regulator
GOZ73_RS06200  2       13       4.091817    aminotransferase class I/II-fold pyridoxal
GOZ73_RS06205  2       14       3.826381    hypothetical protein
GOZ73_RS06210  2       17       3.754912    VWA domain-containing protein
GOZ73_RS06215  3       23       3.014400    BatA and WFA domain-containing protein
GOZ73_RS06220  3       22       2.493944    DUF58 domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS06050  PF05704       40     2e-05    Capsular polysaccharide synthesis protein
		>PF05704#Capsular polysaccharide synthesis protein

Length = 307

Score = 39.9 bits (93), Expect = 2e-05
Identities = 23/118 (19%), Positives = 40/118 (33%), Gaps = 22/118 (18%)

Query: 1 MNTRTVTSLWVGGE--LPLMSVLCIKSFLDH--GHAFQLFT---YRNYDNIPAGTLVRDA 53
M + + W+ G P + C+ S + + Y+ + +IP
Sbjct: 66 MRQKYIFICWLQGIEKAPYIVQQCVASVKKNSGDFKVIIIDGNNYKEWVDIP-------- 117

Query: 54 RDILPEEAIFHDSHNSLAPFSDWFRMKFLSQEGGFWVDMDVICLGDELPTSPLWFCRE 111
D L + + + A FSD R+ L + GG W+D V P +
Sbjct: 118 -DFLIKR--WQEGKMLDAWFSDILRLFLLCKYGGLWIDATVYMFDKV----PNYIVES 168


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS06115  RTXTOXINA     32     6e-05    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 32.2 bits (73), Expect = 6e-05
Identities = 15/33 (45%), Positives = 21/33 (63%), Gaps = 1/33 (3%)

Query: 3 SGAIGSAGAGVGAAP-GSVVGARIGALIGDVTG 34
S + S +G+ AA S+VGA + AL+G VTG
Sbjct: 372 STVLASVSSGISAAATTSLVGAPVSALVGAVTG 404


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS06130  60KDINNERMP   27     0.018    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 26.8 bits (59), Expect = 0.018
Identities = 16/70 (22%), Positives = 28/70 (40%), Gaps = 6/70 (8%)

Query: 8 IKDTGSWSPFYYW-IVGVVTMCALMWMLPNNYPREPWENIIRVLLGALFIFLFFVCPVWM 66
I D + P+Y I+ VTM + M P +P + I + +F F P +
Sbjct: 455 IHDLSAQDPYYILPILMGVTMFFIQKMSPTTVT-DPMQQKIMTFMPVIFTVFFLWFPSGL 513

Query: 67 IISAIRYWKT 76
++ Y+
Sbjct: 514 VL----YYIV 519


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS06160  PF07201       31     0.004    Hypersensitivity response secretion protein HrpJ
		>PF07201#Hypersensitivity response secretion protein HrpJ

Length = 293

Score = 30.6 bits (69), Expect = 0.004
Identities = 18/113 (15%), Positives = 41/113 (36%), Gaps = 20/113 (17%)

Query: 96 RIDSFLKKHPYLERKDVPSEMVTASASGLDPHITPESAY---------------VQVKRV 140
+++ +L K P LE+K SE+++ ++ + ++ AY ++
Sbjct: 86 QVNQYLSKVPELEQKQNVSELLSLLSNSPNISLSQLKAYLEGKSEEPSEQFKMLCGLRDA 145

Query: 141 AKARGMEEGAVRSLVDQAVEKPLLGMFGTE---KVNVLKLNIALEKADPSGKH 190
K R E + LV+QA+ + G + ++ +
Sbjct: 146 LKGRP-ELAHLSHLVEQALVS-MAEEQGETIVLGARITPEAYRESQSGVNPLQ 196


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS06170  ACRIFLAVINRP  33     0.003    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 32.5 bits (74), Expect = 0.003
Identities = 15/57 (26%), Positives = 22/57 (38%), Gaps = 2/57 (3%)

Query: 44 QLNEETKRLIASYRRDPSEANRLALRKQVGINYDKVLDRKKAKLEELKRTARHASKI 100
+L E +IA P+ L ++ G N KAKL EL+ K+
Sbjct: 269 ELGGENYNVIARINGKPAAG--LGIKLATGANALDTAKAIKAKLAELQPFFPQGMKV 323


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS06190  YERSSTKINASE  32     0.009    Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 32.4 bits (73), Expect = 0.009
Identities = 30/104 (28%), Positives = 46/104 (44%), Gaps = 10/104 (9%)

Query: 188 RCGILHRDIKPANLLLDEA-GEVHVSDFGLAFLLQGRNEVIEKEGAQSGTLRYMSPERLV 246
+ G++H DIKP N++ D A GE V D G L R+ E + T + +PE V
Sbjct: 263 KAGVVHNDIKPGNVVFDRASGEPVVIDLG----LHSRS----GEQPKGFTESFKAPELGV 314

Query: 247 HGVN-TFSSDQYAFGVTLYELVAKSPPFPGLAPEELMERICREP 289
+ + SD + TL + P + P + + I EP
Sbjct: 315 GNLGASEKSDVFLVVSTLLHCIEGFEKNPEIKPNQGLRFITSEP 358


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS06220  SHIGARICIN    29     0.045    Ribosome inactivating protein family signature.
		>SHIGARICIN#Ribosome inactivating protein family signature.

Length = 289

Score = 29.0 bits (65), Expect = 0.045
Identities = 18/77 (23%), Positives = 27/77 (35%), Gaps = 14/77 (18%)

Query: 67 ILAVLLLTWVLAAPVWPGKGSGQTVVFILD--DSANMAPFRQEAVNAVSEDMDTILRCGI 124
+ ++L+LT L AP G V F L S++ F A+ I
Sbjct: 6 VFSLLILTLFLTAPAVEG-----DVSFRLSGATSSSYGVFISNLRKALP---YERKLYDI 57

Query: 125 PVTWVLMGSRASGQPLY 141
P+ + S G Y
Sbjct: 58 PL----LRSTLPGSQRY 70


19  GOZ73_RS06470  GOZ73_RS06700  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
GOZ73_RS06470  1       27       -4.015058   Gfo/Idh/MocA family oxidoreductase
GOZ73_RS06475  1       27       -4.785230   hypothetical protein
GOZ73_RS06480  1       28       -4.763060   3-dehydroquinate synthase
GOZ73_RS06485  1       32       -5.541531   hypothetical protein
GOZ73_RS06515  1       35       -5.699762   **alpha-N-acetylglucosaminidase
GOZ73_RS06520  2       32       -5.889124   PEP-CTERM sorting domain-containing protein
GOZ73_RS06525  2       31       -5.053239   OPT/YSL family transporter
GOZ73_RS06530  2       36       -6.048162   orotidine-5'-phosphate decarboxylase
GOZ73_RS06535  3       35       -6.482239   energy transducer TonB
GOZ73_RS06540  3       34       -6.516299   putative manganese-dependent inorganic
GOZ73_RS06545  3       35       -6.907171   hypothetical protein
GOZ73_RS06550  1       28       -4.925933   YbaB/EbfC family nucleoid-associated protein
GOZ73_RS06555  1       25       -3.624751   DNA polymerase III subunit gamma/tau
GOZ73_RS12390  1       25       -2.484636   hypothetical protein
GOZ73_RS06560  1       26       -1.578184   hypothetical protein
GOZ73_RS06565  2       28       -0.847447   hypothetical protein
GOZ73_RS06575  2       25       0.742864    *tyrosine-type recombinase/integrase
GOZ73_RS06580  3       31       -0.551510   hypothetical protein
GOZ73_RS06585  3       28       -2.132128   hypothetical protein
GOZ73_RS06590  3       26       -1.717992   portal protein
GOZ73_RS06595  4       23       -2.037156   hypothetical protein
GOZ73_RS06600  4       25       -1.467256   N-acetylmuramoyl-L-alanine amidase
GOZ73_RS06605  3       27       -2.142258   hypothetical protein
GOZ73_RS06610  1       25       -1.533081   hypothetical protein
GOZ73_RS06615  0       27       -2.091363   hypothetical protein
GOZ73_RS06620  0       27       -2.271876   hypothetical protein
GOZ73_RS06625  0       28       -2.396998   hypothetical protein
GOZ73_RS06630  2       31       -3.555568   hypothetical protein
GOZ73_RS06640  2       29       -3.358069   replicative DNA helicase
GOZ73_RS06645  4       34       -4.903158   hypothetical protein
GOZ73_RS06650  4       36       -4.678707   hypothetical protein
GOZ73_RS06655  5       35       -4.561666   hypothetical protein
GOZ73_RS06660  2       30       -4.920861   hypothetical protein
GOZ73_RS06665  5       34       -5.278879   hypothetical protein
GOZ73_RS06670  7       37       -5.327476   hypothetical protein
GOZ73_RS06675  6       40       -6.410257   hypothetical protein
GOZ73_RS06680  7       37       -7.426249   hypothetical protein
GOZ73_RS06685  7       44       -11.382559  hypothetical protein
GOZ73_RS06690  7       45       -11.250262  hypothetical protein
GOZ73_RS06695  1       35       -0.973059   hypothetical protein
GOZ73_RS06700  2       35       -0.881701   DUF1778 domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS06550  BCTERIALGSPD  27  0.015  Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 26.8 bits (59), Expect = 0.015
Identities = 16/75 (21%), Positives = 26/75 (34%), Gaps = 8/75 (10%)

Query: 27 QTVTTEGAGGKLKVTATC-DGNLTELVIDPSIIDPSDSEFLQELLLQTINAAIAKGKETA 85
TV + G KLKV +G+ L I+ + +D+ + A
Sbjct: 477 NTVERKTVGIKLKVKPQINEGDSVLLEIEQEVSSVADA---ASSTSSDLGATFNTRTVNN 533

Query: 86 AAEMKK----LTGGL 96
A + + GGL
Sbjct: 534 AVLVGSGETVVVGGL 548


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS06555  TONBPROTEIN  31  0.014  Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 30.7 bits (69), Expect = 0.014
Identities = 11/43 (25%), Positives = 17/43 (39%)

Query: 651 PEPVQEELAPLPAPPPSASTIPAPKPHTPQKKTAEPVQEASKE 693
PEP E + P P P PKP K + ++ ++
Sbjct: 71 PEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRD 113


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS12390  ANTHRAXTOXNA  26  0.016  Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 25.9 bits (56), Expect = 0.016
Identities = 10/21 (47%), Positives = 16/21 (76%)

Query: 29 TVFSGKDLVTHGSERLEDEFP 49
T ++G D+V HG+E+ +EFP
Sbjct: 567 TGYTGGDVVNHGTEQDNEEFP 587


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS06560  GPOSANCHOR  31  0.006  Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 30.8 bits (69), Expect = 0.006
Identities = 27/151 (17%), Positives = 61/151 (40%), Gaps = 4/151 (2%)

Query: 23 DNEEALTKERRECASRYKNLNGLRGEFTEEKTKYVDFLVKNEALTNDINALTKEKDELEV 82
+ + E A+ L K + L + AL EK +LE
Sbjct: 243 ADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEH 302

Query: 83 KNQELASANEAKKSDLQTQQTALAELQSKSKDMESIQAIADR-IKGLEEESKQLEVVKQA 141
++Q L + ++ + DL + A +L+++ + +E I++ + L + K+
Sbjct: 303 QSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQ 362

Query: 142 EQGKHDAIVAETEQLVVNNAALRQLKADQDA 172
+ +H + + + ++ A+ + L+ D DA
Sbjct: 363 LEAEHQKLEEQNK---ISEASRQSLRRDLDA 390



Score = 30.8 bits (69), Expect = 0.006
Identities = 20/108 (18%), Positives = 33/108 (30%), Gaps = 1/108 (0%)

Query: 62 KNEALTNDINALTKEKDELEVKNQELASANEAKKSDLQTQQTALAELQSKSKDM-ESIQA 120
+ L + K + L + A + + AL + S I+
Sbjct: 191 RQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKT 250

Query: 121 IADRIKGLEEESKQLEVVKQAEQGKHDAIVAETEQLVVNNAALRQLKA 168
+ LE +LE + A A+ + L AAL KA
Sbjct: 251 LEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKA 298



Score = 27.7 bits (61), Expect = 0.047
Identities = 20/104 (19%), Positives = 39/104 (37%), Gaps = 1/104 (0%)

Query: 62 KNEALTNDINALTKEKDELEVKNQELASANEAKKSDLQTQQTALAELQSKSKDME-SIQA 120
KN L+ + AL DEL + L + + + EL+++ D+E +++
Sbjct: 72 KNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEG 131

Query: 121 IADRIKGLEEESKQLEVVKQAEQGKHDAIVAETEQLVVNNAALR 164
+ + K LE K A + + E + + A
Sbjct: 132 AMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADS 175


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS06565  GPOSANCHOR  33  0.001  Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 33.1 bits (75), Expect = 0.001
Identities = 25/148 (16%), Positives = 57/148 (38%), Gaps = 6/148 (4%)

Query: 48 EIRTWTSLRNDMIKNRGLAAEENLSLISNKASVTQQRDEQLSKEAELKEEEVQLNADLSK 107
+ + +N L N ++ DE + + KE+ + + LS+
Sbjct: 51 TLEKVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSE 110

Query: 108 INKQIEELKSKLQDFG------VSSVQEIQEKMESLEASNKKLQEDIDQIKSATEVASKR 161
+I+EL+++ D ++ K+++LEA L ++ A E A
Sbjct: 111 KASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNF 170

Query: 162 RAEQASELASRQKEQAEYRAALAKNGEE 189
++++ + + E+A A A+ +
Sbjct: 171 STADSAKIKTLEAEKAALEARQAELEKA 198


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS06580  PF05616  31  0.003  Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 31.3 bits (70), Expect = 0.003
Identities = 16/42 (38%), Positives = 25/42 (59%), Gaps = 4/42 (9%)

Query: 30 PSASGRPSLANPAPEPTPADDEQPNPPPPSDPGS---PQGDP 68
P ++ P+ A P PE +PA++ NP P +PG+ P+ DP
Sbjct: 317 PGSAEAPN-AQPLPEVSPAENPANNPAPNENPGTRPNPEPDP 357


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS06600  STREPKINASE  29  0.011  Streptococcus streptokinase protein signature.
		>STREPKINASE#Streptococcus streptokinase protein signature.

Length = 440

Score = 28.9 bits (64), Expect = 0.011
Identities = 16/57 (28%), Positives = 24/57 (42%)

Query: 13 TGARGNGLEEHDVACVIARHLFAQLKDMGHTVHVLDFPDKGNTEDLNATIKVANADG 69
+GA + LE+ D+ I L A + V+DF D N + A+ DG
Sbjct: 93 SGAMSHKLEKADLLKAIQEQLIANVHSNDDYFEVIDFASDATITDRNGKVYFADKDG 149


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS06625  IGASERPTASE  32  0.010  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 32.0 bits (72), Expect = 0.010
Identities = 28/228 (12%), Positives = 61/228 (26%), Gaps = 11/228 (4%)

Query: 135 LSNTASIIAAQQAQQNANTSSTNAETASQAAKTATDA------AATAAARAEEAEGYAGS 188
+ A ++ + A S ++T + + AT+ A A +A
Sbjct: 1025 VPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNE 1084

Query: 189 AWASKSAAADSATAAG--TSATNAARDAKSANDAKTDVESLAATWP--ETVSNGEKKIVD 244
S S ++ T T+ AK + +V + + + S + +
Sbjct: 1085 VAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAE 1144

Query: 245 AGNEAVTAIQDKQADSVLAVGRASQTAQQNIAGARTDAVAAVQTAQERAVGAITPLVQRA 304
E + K+ S ++ + + V T P
Sbjct: 1145 PARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTP 1204

Query: 305 ETAKEAIDQA-EGRINTAATNAATSATSAANSATEAQQALEAIPQVDA 351
T + ++ + + S AT + + D
Sbjct: 1205 ATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDL 1252


20  GOZ73_RS07045  GOZ73_RS07105  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GOZ73_RS070450173.335899DNA gyrase subunit A
GOZ73_RS070500164.035036lactate utilization protein
GOZ73_RS07055-1173.915833U32 family peptidase
GOZ73_RS07060-2141.267026hypothetical protein
GOZ73_RS12395-2130.539713SUMF1/EgtB/PvdO family nonheme iron enzyme
GOZ73_RS07070-218-1.443300cell surface protein
GOZ73_RS07075125-4.308713Rid family hydrolase
GOZ73_RS07080133-6.212405hypothetical protein
GOZ73_RS07085235-6.986321NADPH-dependent assimilatory sulfite reductase
GOZ73_RS07090234-6.729344ABC transporter permease
GOZ73_RS07095134-6.529830ABC transporter ATP-binding protein
GOZ73_RS07100229-5.293194ABC transporter substrate-binding protein
GOZ73_RS07105024-4.254037sulfate adenylyltransferase subunit CysN
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS07100  SECA  30  0.013  SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 30.2 bits (68), Expect = 0.013
Identities = 16/49 (32%), Positives = 22/49 (44%), Gaps = 9/49 (18%)

Query: 18 LVPAFLASCREASERQDQIRFGLFPNVTHVQGLVARHFS-----RTGEG 61
L+P A REAS+R FG+ + G + + RTGEG
Sbjct: 63 LIPEAFAVVREASKR----VFGMRHFDVQLLGGMVLNERCIAEMRTGEG 107


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS07105  TCRTETOQM  57  1e-10  Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 56.8 bits (137), Expect = 1e-10
Identities = 45/137 (32%), Positives = 60/137 (43%), Gaps = 17/137 (12%)

Query: 24 VDDGKSTLIGRLLYDSKLIFDDQLAELRKASEKNGTAGAGKIDYALLLDGLRAEREQGIT 83
VD GK+TL LLY+S I +L + K GT D ER++GIT
Sbjct: 12 VDAGKTTLTESLLYNSGAI--TELGSVDK-----GTT---------RTDNTLLERQRGIT 55

Query: 84 IDVAYRYFTTPRRKFIIADCPGHEQYTRNMATGASTADAAIILIDARHGVLTQTKRHAFI 143
I F K I D PGH + + S D AI+LI A+ GV QT+
Sbjct: 56 IQTGITSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHA 115

Query: 144 VSLLKIRHLIVAVNKMD 160
+ + I I +NK+D
Sbjct: 116 LRKMGIP-TIFFINKID 131


21  GOZ73_RS07180  GOZ73_RS07325  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GOZ73_RS07180321-4.965243HDOD domain-containing protein
GOZ73_RS07185319-4.868579sugar O-acetyltransferase
GOZ73_RS07195319-4.150994DMT family transporter
GOZ73_RS07200017-3.708429nitroreductase family protein
GOZ73_RS07205020-1.743038radical SAM mobile pair protein B
GOZ73_RS07210024-1.329684DUF1848 domain-containing protein
GOZ73_RS072151290.004973winged helix DNA-binding protein
GOZ73_RS072200221.430177polymer-forming cytoskeletal protein
GOZ73_RS07225-2172.095340ribosome recycling factor
GOZ73_RS07230-2172.232854UMP kinase
GOZ73_RS07235-2172.515442mechanosensitive ion channel family protein
GOZ73_RS07240-1163.551988peroxiredoxin
GOZ73_RS07245-1164.282267sodium:calcium symporter
GOZ73_RS07250-1164.939842hypothetical protein
GOZ73_RS072550214.714271translation initiation factor IF-1
GOZ73_RS072600164.146026recombination mediator RecR
GOZ73_RS072650122.780037hypothetical protein
GOZ73_RS072700152.812705beta-ketoacyl-ACP synthase II
GOZ73_RS07275-2162.610568peptide-methionine (R)-S-oxide reductase
GOZ73_RS07280-2193.379908DNA polymerase IV
GOZ73_RS07285-2203.815252MFS transporter
GOZ73_RS07290-1173.995628PEP-CTERM sorting domain-containing protein
GOZ73_RS07300-1204.443012*ABC transporter permease
GOZ73_RS07305-1193.868585ABC transporter permease
GOZ73_RS073100173.991388ATP-binding cassette domain-containing protein
GOZ73_RS073150203.575291efflux RND transporter periplasmic adaptor
GOZ73_RS073200263.237480alanine--tRNA ligase
GOZ73_RS073250283.254750CDP-diacylglycerol--serine
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS07220  RTXTOXIND  29  0.003  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.0 bits (65), Expect = 0.003
Identities = 12/35 (34%), Positives = 18/35 (51%), Gaps = 2/35 (5%)

Query: 37 SDKGKVTIGETAQIKGDITAGDVRIYGNVEGKVTS 71
D G + +G+ A IK + A YG + GKV +
Sbjct: 375 KDIGFINVGQNAIIK--VEAFPYTRYGYLVGKVKN 407


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS07230  CARBMTKINASE  32  0.002  Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 31.7 bits (72), Expect = 0.002
Identities = 15/62 (24%), Positives = 23/62 (37%), Gaps = 13/62 (20%)

Query: 124 RSYLDNNKIVIFAAGTGNPFFST-------------DTTAALRASEINAEVVMKATSVDG 170
+ ++ IVI + G G P D A E+NA++ M T V+G
Sbjct: 180 KKLVERGVIVIASGGGGVPVILEDGEIKGVEAVIDKDLAGEKLAEEVNADIFMILTDVNG 239

Query: 171 VY 172

Sbjct: 240 AA 241


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS07285  TCRTETA  36  2e-04  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 36.3 bits (84), Expect = 2e-04
Identities = 44/223 (19%), Positives = 81/223 (36%), Gaps = 18/223 (8%)

Query: 133 TVSVAWSVLSDVMKPGQMKRLFALVASGSSLGAMAGPAVTAALAGVAGYLWLFLAAAVFL 192
T +VA + ++D+ + R F +++ G +AGP + + G + + F AAA+
Sbjct: 112 TGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNG 171

Query: 193 ALAMLAGMYLHRWRDGNSPEDEETGVLLPADCRERPLGGNPFAGASAVFRSPFLMG-IGL 251
+ L G + A R G A AVF L+G +
Sbjct: 172 LNFLTGCFLLPESHKGERRPLRREALNPLASFRWA-RGMTVVAALMAVFFIMQLVGQVPA 230

Query: 252 FIILLAGTNTFLYFELMSVVASSFPDPVRQTQVFGALDVVVQGCTMLLQVFLAGRIVRKF 311
+ ++ G + F + ++ + FG L L Q + G + +
Sbjct: 231 ALWVIFGEDRFHWDATTIGISLA---------AFGILHS-------LAQAMITGPVAARL 274

Query: 312 GLAALLAAVPVLISLGFVWMAFAPVFAVVAVVMAVRRIGEYGM 354
G L + G++ +AFA + +M + G GM
Sbjct: 275 GERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGM 317


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS07300  ABC2TRNSPORT  56  3e-11  ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 56.5 bits (136), Expect = 3e-11
Identities = 36/190 (18%), Positives = 79/190 (41%), Gaps = 7/190 (3%)

Query: 176 FIVPGLIAVLVLINSILSG-SLSIAREREEGTFDQLLVAPYTPGEILLGKGTASVVTGIM 234
F+ G++A + + + R + T++ +L G+I+LG+ + +
Sbjct: 68 FLAAGMVATSAMTAATFETIYAAFGRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAAL 127

Query: 235 QAVFVVLVAMFWFRIPFQGSIWLLSAALLLFIVTAAAIGLCISSFAQSLQQAIVGTFLLL 294
+ +VA + ++ L L + A+ +G+ +++ A S I L++
Sbjct: 128 AGAGIGVVAAALGYTQWLSLLYALPVIALTGLAFAS-LGMVVTALAPSYDYFIFYQTLVI 186

Query: 295 VPMVMLSGFATPISSMPEIFQDLTLLNPMRYGLELIQRIFLEGAG-----FLDLWPLFAA 349
P++ LSG P+ +P +FQ P+ + ++LI+ I L + ++
Sbjct: 187 TPILFLSGAVFPVDQLPIVFQTAARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIV 246

Query: 350 IMAVTVAAVL 359
I A+L
Sbjct: 247 IPFFLSTALL 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS07305  ABC2TRNSPORT  36  2e-04  ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 35.7 bits (82), Expect = 2e-04
Identities = 32/145 (22%), Positives = 59/145 (40%), Gaps = 2/145 (1%)

Query: 196 AREYERGTLESLFVTPVGSGEILAAKAATNFLLGMVSLAISMLFAAFVFGIPIRGSLTLL 255
R + T E++ T + G+I+ + A ++ A + AA + SL
Sbjct: 92 GRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGYTQ-WLSLLYA 150

Query: 256 LAVSALFLIVALGLGLVISTATKNQFLACQFAIMGTFMPALMLSGFLYDILNMPPAVRAI 315
L V AL + LG+V+ TA + F P L LSG ++ + +P +
Sbjct: 151 LPVIALTGLAFASLGMVV-TALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQTA 209

Query: 316 TYLIPARYYVTLLQTLFLAGDIPSV 340
+P + + L++ + L + V
Sbjct: 210 ARFLPLSHSIDLIRPIMLGHPVVDV 234


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS07315  RTXTOXIND  75  4e-17  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 75.3 bits (185), Expect = 4e-17
Identities = 46/287 (16%), Positives = 98/287 (34%), Gaps = 31/287 (10%)

Query: 67 PGQKLATLETVRLRQAADEARQIAEAARQNYLRVKNGPRAEEIAQARANVQAAEATLNNA 126
P + + E V + + + ++ + + E A + E
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 127 GMRSKRLEALAETKSISRQEADDAVASRQVAAANLDVARKQLELLLAGSRE--------- 177
R +L ++I++ + A L V + QLE + +
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 178 --------EDVAQALAQYNQAKASLTIREQNLKDAVLYAPGNGVVRN-RILEKGDMASPQ 228
+ + Q L E+ + +V+ AP + V+ ++ +G + +
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 229 KPVYNIS-LNHTKWVRAYLTESQLGKVKPGFSATVRNDSFPDTGFKGTVGFISSVAEFTP 287
+ + I + T V A + +G + G +A ++ ++FP T + VG + ++
Sbjct: 354 ETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDA- 412

Query: 288 KNVETPDLRTALVYEVRIIVD-------DPDNRLRLGAPATVTIPLG 327
D R LV+ V I ++ + + L G T I G
Sbjct: 413 ----IEDQRLGLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTG 455


22  GOZ73_RS07655  GOZ73_RS07940  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GOZ73_RS07655-227-4.451372zeta toxin family protein
GOZ73_RS07660-227-4.078563amino acid ABC transporter ATP-binding protein
GOZ73_RS07665-230-4.671063ABC transporter substrate-binding
GOZ73_RS07670-236-4.753986hypothetical protein
GOZ73_RS07675-135-4.203799hypothetical protein
GOZ73_RS07680038-9.551486hypothetical protein
GOZ73_RS07685-128-3.452795terminase small subunit
GOZ73_RS07690-120-1.098804MarR family winged helix-turn-helix
GOZ73_RS07695-121-0.349775hypothetical protein
GOZ73_RS07700-122-0.346024hypothetical protein
GOZ73_RS07705-125-0.104915hypothetical protein
GOZ73_RS07710-1240.646668hypothetical protein
GOZ73_RS077150250.521502DUF2800 domain-containing protein
GOZ73_RS07720430-1.896661hypothetical protein
GOZ73_RS07725432-2.353154hypothetical protein
GOZ73_RS07730129-4.059234hypothetical protein
GOZ73_RS07735141-12.157488hypothetical protein
GOZ73_RS07740354-15.532902hypothetical protein
GOZ73_RS07745459-16.639704hypothetical protein
GOZ73_RS07750464-18.376096hypothetical protein
GOZ73_RS07755563-17.940116hypothetical protein
GOZ73_RS07760669-19.168539sce7726 family protein
GOZ73_RS07765570-18.963840hypothetical protein
GOZ73_RS07770-140-1.901932ImmA/IrrE family metallo-endopeptidase
GOZ73_RS07775040-1.453271hypothetical protein
GOZ73_RS07785-139-0.340572thermonuclease family protein
GOZ73_RS07790-1390.594653YHYH domain-containing protein
GOZ73_RS07795-1400.654799hypothetical protein
GOZ73_RS07800-1401.967544hypothetical protein
GOZ73_RS07805342-0.702017hypothetical protein
GOZ73_RS07810-1430.511244hypothetical protein
GOZ73_RS078150440.325129hypothetical protein
GOZ73_RS07820041-0.236617hypothetical protein
GOZ73_RS07825038-0.580742hypothetical protein
GOZ73_RS07830036-1.752175hypothetical protein
GOZ73_RS07835032-1.381074portal protein
GOZ73_RS07840121-0.997994hypothetical protein
GOZ73_RS078452240.437750hypothetical protein
GOZ73_RS07850129-1.515974hypothetical protein
GOZ73_RS07855030-2.411056hypothetical protein
GOZ73_RS07860028-1.371134hypothetical protein
GOZ73_RS07865126-0.796679hypothetical protein
GOZ73_RS07870027-0.036279BACON domain-containing protein
GOZ73_RS07875016-0.362756phage capsid protein
GOZ73_RS078800120.843879site-specific integrase
GOZ73_RS07890-2123.632445*N-acetyl-gamma-glutamyl-phosphate reductase
GOZ73_RS07895-1143.775193bifunctional glutamate
GOZ73_RS07900-1183.983697YkgJ family cysteine cluster protein
GOZ73_RS07905-1162.412502sel1 repeat family protein
GOZ73_RS07910-1142.879957PEP-CTERM sorting domain-containing protein
GOZ73_RS07915-1123.650105excinuclease ABC subunit UvrA
GOZ73_RS079200132.374452polymer-forming cytoskeletal protein
GOZ73_RS079251161.736031polymer-forming cytoskeletal protein
GOZ73_RS079302202.166041hypothetical protein
GOZ73_RS079351194.520784hypothetical protein
GOZ73_RS079401213.319611phosphoribosylformimino-5-aminoimidazole
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS07825  PF05616  31  0.003  Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 31.3 bits (70), Expect = 0.003
Identities = 23/66 (34%), Positives = 31/66 (46%), Gaps = 8/66 (12%)

Query: 13 REEAIPGSEGEGPGGGAPPPASPVDSPPPANPPVPS-NPYDFSGGAEQPDPDPGSPPPLS 71
R + PGS E P P SP ++P AN P P+ NP G P+PDP P +
Sbjct: 312 RPDLTPGS-AEAPNAQPLPEVSPAENP--ANNPAPNENP----GTRPNPEPDPDLNPDAN 364

Query: 72 PQEETE 77
P + +
Sbjct: 365 PDTDGQ 370


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS07870  FLAGELLIN  32  0.008  Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 32.3 bits (73), Expect = 0.008
Identities = 22/217 (10%), Positives = 62/217 (28%), Gaps = 13/217 (5%)

Query: 137 MATTAAQAFAYHALQASKNAHADAETASQAAKTATDAAATAATRTEEAEGYAGSAWASKR 196
A + + ++ + A+ A A + + Y G +
Sbjct: 230 NAANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDT 289

Query: 197 AAADSAAAADTSATNADRDAKSAHDAKTAVESLAATWPETVSNGEKKIVDAGNEAVTAIQ 256
+ ++ N ++ + D ++ A ++ N +V+ +
Sbjct: 290 KTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTK 349

Query: 257 DKQADSVLAVGRASQTAQQNIAGARTDAVAAVQTAQERAVGAITPLVQRAETAKEDIDQA 316
++ A + + I + A + G + + A
Sbjct: 350 NESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTA---------- 399

Query: 317 ERRINTAATNAANSATSAGNSATAAANALAAIPQVDE 353
+ ++A + ++A A+ +A+ +VD
Sbjct: 400 ---SGVSTLINEDAAAAKKSTANPLASIDSALSKVDA 433


23  GOZ73_RS08050  GOZ73_RS08105  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GOZ73_RS08050213-0.077125hypothetical protein
GOZ73_RS08055213-0.440379serine hydrolase
GOZ73_RS08060214-1.1314856-phosphofructokinase
GOZ73_RS08065112-0.752919DUF805 domain-containing protein
GOZ73_RS08070-110-0.159950zinc transporter ZupT
GOZ73_RS08075-110-0.050962thioredoxin family protein
GOZ73_RS122300100.624592hypothetical protein
GOZ73_RS080801111.300345hypothetical protein
GOZ73_RS080901122.063710valine--pyruvate transaminase
GOZ73_RS080952121.770843YicC family protein
GOZ73_RS081003142.082414guanylate kinase
GOZ73_RS081054152.158613SLC13 family permease
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS08055  BLACTAMASEA  35  2e-04  Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 35.2 bits (81), Expect = 2e-04
Identities = 24/99 (24%), Positives = 42/99 (42%), Gaps = 3/99 (3%)

Query: 20 QAQERPSYIAIEANSGKLLFSSNAEEKRPVASLSQVATAMVALDWVARTRLPLDTFIAVP 79
Q R I ++ SG+ L + A+E+ P+ S +V L V L+ I
Sbjct: 35 QLSGRVGMIEMDLASGRTLTAWRADERFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYR 94

Query: 80 QEAFGLRAGNP-MDLQPGDRLTLRDALYSTLLGSDNVSA 117
Q+ L +P + D +T+ + + + SDN +A
Sbjct: 95 QQD--LVDYSPVSEKHLADGMTVGELCAAAITMSDNSAA 131


24  GOZ73_RS08235  GOZ73_RS08260  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
GOZ73_RS08235117-4.832327SpoIIE family protein phosphatase
GOZ73_RS08240423-5.278409sigma-70 family RNA polymerase sigma factor
GOZ73_RS08245324-4.795266endonuclease/exonuclease/phosphatase family
GOZ73_RS08250424-5.120104SNF2-related protein
GOZ73_RS08255222-4.996240DUF4391 domain-containing protein
GOZ73_RS08260120-4.125266site-specific DNA-methyltransferase
25  GOZ73_RS08390  GOZ73_RS08745  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GOZ73_RS08390-126-3.175417ankyrin repeat domain-containing protein
GOZ73_RS08395-130-3.785151Cof-type HAD-IIB family hydrolase
GOZ73_RS08400130-3.578052NADH-quinone oxidoreductase subunit A
GOZ73_RS08405233-3.894017MFS transporter
GOZ73_RS08410229-2.596035tyrosine-type recombinase/integrase
GOZ73_RS084152241.001149hypothetical protein
GOZ73_RS084201182.906604hypothetical protein
GOZ73_RS084251182.850578type II toxin-antitoxin system HipA family
GOZ73_RS124053223.826424replicative DNA helicase
GOZ73_RS084352224.385363hypothetical protein
GOZ73_RS084402224.289056hypothetical protein
GOZ73_RS084452203.160826hypothetical protein
GOZ73_RS084502182.044694hypothetical protein
GOZ73_RS084554172.811876hypothetical protein
GOZ73_RS084604183.203486hypothetical protein
GOZ73_RS084656183.567419hypothetical protein
GOZ73_RS084705173.880632hypothetical protein
GOZ73_RS084806194.228114hypothetical protein
GOZ73_RS084856275.389740hypothetical protein
GOZ73_RS084906296.011429hypothetical protein
GOZ73_RS084955295.885415DUF1320 family protein
GOZ73_RS085005315.439457hypothetical protein
GOZ73_RS085056356.294482hypothetical protein
GOZ73_RS085103356.224199hypothetical protein
GOZ73_RS085152335.058602hypothetical protein
GOZ73_RS085203324.728948DUF935 family protein
GOZ73_RS085253334.270595terminase family protein
GOZ73_RS085303301.809107hypothetical protein
GOZ73_RS08535135-1.767514hypothetical protein
GOZ73_RS08540132-3.942296hypothetical protein
GOZ73_RS08545034-3.478578N-acetylmuramoyl-L-alanine amidase
GOZ73_RS08550036-4.255822hypothetical protein
GOZ73_RS08555229-4.203684PH domain-containing protein
GOZ73_RS08560123-2.184280hypothetical protein
GOZ73_RS08565122-0.596612helix-turn-helix transcriptional regulator
GOZ73_RS085703260.821282hypothetical protein
GOZ73_RS085754275.344797hypothetical protein
GOZ73_RS085803265.358827hypothetical protein
GOZ73_RS085852296.049580host-nuclease inhibitor Gam family protein
GOZ73_RS085902296.221707hypothetical protein
GOZ73_RS085953285.821126hypothetical protein
GOZ73_RS086003275.887444hypothetical protein
GOZ73_RS086053274.141945ATP-binding protein
GOZ73_RS086102232.575391hypothetical protein
GOZ73_RS086151232.677035hypothetical protein
GOZ73_RS086205220.686629hypothetical protein
GOZ73_RS086258201.817600hypothetical protein
GOZ73_RS08630525-6.629778site-specific DNA-methyltransferase
GOZ73_RS08635427-8.334497hypothetical protein
GOZ73_RS08640422-6.072030hypothetical protein
GOZ73_RS08645426-7.800401hypothetical protein
GOZ73_RS08650324-7.022274DUF932 domain-containing protein
GOZ73_RS08655325-7.873299hypothetical protein
GOZ73_RS08660322-4.936841M48 family metallopeptidase
GOZ73_RS08665322-4.801656HsdR family type I site-specific
GOZ73_RS08670226-6.201343restriction endonuclease subunit S
GOZ73_RS08675327-5.086015N-6 DNA methylase
GOZ73_RS08680434-6.089806restriction endonuclease
GOZ73_RS08685538-6.644999PD-(D/E)XK nuclease family protein
GOZ73_RS08690547-9.044565hypothetical protein
GOZ73_RS08700650-10.154365GYF domain-containing protein
GOZ73_RS08705546-8.813703DUF4339 domain-containing protein
GOZ73_RS08710639-7.575884GYF domain-containing protein
GOZ73_RS08715537-6.560397hypothetical protein
GOZ73_RS08720633-4.687928ankyrin repeat domain-containing protein
GOZ73_RS08725729-2.042032hypothetical protein
GOZ73_RS08730827-1.674624hypothetical protein
GOZ73_RS08735321-1.855542hypothetical protein
GOZ73_RS08740321-1.830184serine/threonine protein kinase
GOZ73_RS08745219-0.801378VWA domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS08415  PF07132  29  0.045  Harpin protein (HrpN)
		>PF07132#Harpin protein (HrpN)

Length = 356

Score = 28.5 bits (63), Expect = 0.045
Identities = 27/112 (24%), Positives = 44/112 (39%), Gaps = 13/112 (11%)

Query: 70 WKAQQTVNKPTESQ-----QLDNVIEFPALAKAASASPGLANLLAKALGAGVGGTVGALG 124
+ +Q + N + SQ Q N+ E L+ + + +++ LG G+GG +LG
Sbjct: 24 FPSQSSQNGGSPSQSAFGGQRSNIAE--QLSDIMTTMMFMGSMMGGGLGGGLGGLGSSLG 81

Query: 125 GGAGAAHLARHLGAGDTATALTGILGMGAGGLGGVLIGSSVANKVTPPNEGD 176
G G L G L LG G G G +G ++ + N
Sbjct: 82 GLGGG------LLGGGLGGGLGSSLGSGLGSALGGGLGGALGAGMNAMNPSA 127


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS08480  PYOCINKILLER  35  0.001  Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 34.8 bits (79), Expect = 0.001
Identities = 48/270 (17%), Positives = 97/270 (35%), Gaps = 20/270 (7%)

Query: 109 NPSTHRYFVRLSDAVTVSVDSVDLSWWAYAHALKAQESMESLADAWPETVSEAKQDVHAA 168
N + R V ++ + + + + L E +A ++ +
Sbjct: 109 NLAPLDVINRSLTIVGNALQQKNQKLLLNQKKITSLGAKNFLTRTAEEIGEQAVREGNIN 168

Query: 169 KEDALAAIRDKQADAVLAVGRAQKTATDKIAGTQA--DAVSAVQAAGKEAQGTITPLVQQ 226
+A D++ + + A K T+ I+ Q + ++A +A+ + A
Sbjct: 169 GPEAYMRFLDREMEGLTAAYNV-KLFTEAISSLQIRMNTLTAAKASIEAAAANKAREQAA 227

Query: 227 AETAKDDIDQAERRINTAATNAATSATSAANSATAAQQALAAMPQVDASGNMTLAGGLTA 286
AE + +QA ++ A N + + ATAA + L + Q AS ++ +
Sbjct: 228 AEAKRKAEEQARQQAAIRAANTYAMPANGSVVATAAGRGLIQVAQGAASLAQAISDAIAV 287

Query: 287 AGAINANGGVNIPLAVGAATDTGAV----------NRLYAAGMAGVTGLFTLPAYLDTGA 336
G + A+ + + + T + + YA GM LP ++ A
Sbjct: 288 LGRVLASAPSVMAVGFASLTYSSRTAEQWQDQTPDSVRYALGMDAAK--LGLPPSVNLNA 345

Query: 337 ITATGTAATTVSIPGQYAKTNIPSGTHSTI 366
+ A+ TV +P TN G +T+
Sbjct: 346 VAK---ASGTVDLP--MRLTNEARGNTTTL 370


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS08545  NUCEPIMERASE  27  0.044  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 27.1 bits (60), Expect = 0.044
Identities = 10/25 (40%), Positives = 13/25 (52%)

Query: 25 GVAVQIAAHLARYLRERGQSVSVID 49
G A I H+++ L E G V ID
Sbjct: 7 GAAGFIGFHVSKRLLEAGHQVVGID 31


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS08655  TACYTOLYSIN  28  0.039  Bacterial thiol-activated pore-forming cytolysin sig...
		>TACYTOLYSIN#Bacterial thiol-activated pore-forming cytolysin signature.

Length = 574

Score = 28.0 bits (62), Expect = 0.039
Identities = 15/63 (23%), Positives = 26/63 (41%), Gaps = 8/63 (12%)

Query: 111 KIEKDFLYEVCNYYRNIRNN----FIHKGEK----SPTKPKKKNGEYIVLSKKIKKLEIH 162
K E+D E+ + ++ N GE P + KK ++IV+ +K K +
Sbjct: 103 KSEEDHTEEINDKIYSLNYNELEVLAKNGETIENFVPKEGVKKADKFIVIERKKKNINTT 162

Query: 163 NSD 165
D
Sbjct: 163 PVD 165


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS08675  GPOSANCHOR  32  0.012  Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 32.0 bits (72), Expect = 0.012
Identities = 23/132 (17%), Positives = 44/132 (33%), Gaps = 15/132 (11%)

Query: 631 EELHQKRLQVEQMEAELDSQEEEQRGDNGFFFEWEKVDLAALKSRIKEIKGDLLAAEELA 690
++ + +EA E+ G F + + L++ ++ + E
Sbjct: 246 AKIKTLEAEKAALEARQAELEKALEGAMNFS-TADSAKIKTLEAEKAALEAEKADLEH-- 302

Query: 691 FLRDFLDRKNTVKEWKAEIKTLETKLSIQTRQACDALSEEEGKMLVVDKKWLESLRSGIH 750
++ V A ++L L R+A L E K L K E+ R +
Sbjct: 303 --------QSQVLN--ANRQSLRRDLDAS-REAKKQLEAEHQK-LEEQNKISEASRQSLR 350

Query: 751 EDVERVMHAFAQ 762
D++ A Q
Sbjct: 351 RDLDASREAKKQ 362


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS08715  GPOSANCHOR  31  0.005  Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 30.8 bits (69), Expect = 0.005
Identities = 13/52 (25%), Positives = 24/52 (46%), Gaps = 1/52 (1%)

Query: 89 ENAIAEAERNGNTGRANMLREQLARANSNNQWYLNELKQKADAGDAGAQREL 140
+N I+EA R + RE + + +Q L E + ++A +R+L
Sbjct: 338 QNKISEASRQSLRRDLDASREAKKQLEAEHQ-KLEEQNKISEASRQSLRRDL 388


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS08740  YERSSTKINASE  41  1e-05  Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 40.9 bits (95), Expect = 1e-05
Identities = 28/95 (29%), Positives = 43/95 (45%), Gaps = 6/95 (6%)

Query: 147 VLDGLDYLHGRGIYHRDIKPGNILFDE-RGLPVLIDFGAALNKPEVTCTVTQGEFSYAYA 205
+LD ++L G+ H DIKPGN++FD G PV+ID G E F+ ++
Sbjct: 254 LLDVTNHLAKAGVVHNDIKPGNVVFDRASGEPVVIDLGLHSRSGE-----QPKGFTESFK 308

Query: 206 SPEQITGKGEIGPWTDFYALAATWYGLISSIPVEP 240
+PE G +D + + +T I P
Sbjct: 309 APELGVGNLGASEKSDVFLVVSTLLHCIEGFEKNP 343


26GOZ73_RS09320GOZ73_RS09725Y        Y        YPathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GOZ73_RS09320  3  31  -1.666198  hypothetical protein
GOZ73_RS09325  4  32  -1.434435  substrate-binding domain-containing protein
GOZ73_RS09330  4  32  -0.949888  hypothetical protein
GOZ73_RS09340  3  29  0.158079  *tyrosine-type recombinase/integrase
GOZ73_RS12300  1  23  -0.017932  Eco57I restriction-modification methylase
GOZ73_RS09350  0  21  2.201257  DEAD/DEAH box helicase family protein
GOZ73_RS09355  -1  19  4.629075  bifunctional adenosylcobinamide
GOZ73_RS09360  -1  15  3.310916  TIM barrel protein
GOZ73_RS09365  -2  10  2.446390  adenosylcobinamide-GDP ribazoletransferase
GOZ73_RS09370  -1  10  1.546019  nicotinate-nucleotide--dimethylbenzimidazole
GOZ73_RS09375  -1  8  -0.714127  adenosylcobinamide-phosphate synthase CbiB
GOZ73_RS09380  -1  9  -1.282052  cobyric acid synthase
GOZ73_RS09385  0  13  -2.216166  TonB-dependent receptor
GOZ73_RS09390  -1  12  -1.603713  extracellular solute-binding protein
GOZ73_RS09395  -1  13  -1.969796  beta-galactosidase
GOZ73_RS09400  -1  16  -3.166343  AIPR family protein
GOZ73_RS09405  -1  13  0.943077  porin family protein
GOZ73_RS09410  -2  13  2.244345  amidohydrolase
GOZ73_RS09415  -1  17  1.560566  dihydrolipoyl dehydrogenase
GOZ73_RS09420  0  20  1.336522  nucleotidyltransferase substrate binding
GOZ73_RS09425  0  22  1.695379  nucleotidyltransferase domain-containing
GOZ73_RS09430  0  24  2.219363  dihydrolipoyllysine-residue succinyltransferase
GOZ73_RS09435  1  22  2.027999  2-oxoglutarate dehydrogenase E1 component
GOZ73_RS09440  2  19  1.071372  cytochrome d ubiquinol oxidase subunit II
GOZ73_RS09445  0  16  1.082274  cytochrome ubiquinol oxidase subunit I
GOZ73_RS09450  -1  16  0.692338  L,D-transpeptidase
GOZ73_RS09455  -1  17  -0.010116  low molecular weight protein arginine
GOZ73_RS09460  -2  16  -0.087622  hypothetical protein
GOZ73_RS09465  -2  18  -0.469318  phosphotransferase
GOZ73_RS09470  -2  17  -1.658532  tryptophan synthase subunit beta
GOZ73_RS09475  -1  20  -3.823477  MarR family transcriptional regulator
GOZ73_RS09480  -1  22  -4.279762  ferritin
GOZ73_RS09485  -1  23  -3.605878  hypothetical protein
GOZ73_RS09490  -1  32  -5.970735  RidA family protein
GOZ73_RS09500  -2  27  -5.335914  *YdcF family protein
GOZ73_RS09505  -2  18  -4.320334  alkaline phosphatase family protein
GOZ73_RS09510  -2  16  -2.021634  hypothetical protein
GOZ73_RS09525  -2  16  -1.624175  *ATP-dependent DNA helicase
GOZ73_RS09535  1  13  -0.821706  *hypothetical protein
GOZ73_RS09540  0  14  1.161933  succinate--CoA ligase subunit alpha
GOZ73_RS09545  1  17  0.703453  ADP-forming succinate--CoA ligase subunit beta
GOZ73_RS09550  2  23  0.426714  histidinol dehydrogenase
GOZ73_RS09555  2  20  -0.224638  hypothetical protein
GOZ73_RS09560  0  13  0.125046  na+-dependent transporter-like protein
GOZ73_RS09565  0  11  0.509606  hypothetical protein
GOZ73_RS09570  1  16  1.838464  hypothetical protein
GOZ73_RS09575  1  13  1.198901  hypothetical protein
GOZ73_RS09580  1  13  1.308184  exodeoxyribonuclease VII large subunit
GOZ73_RS09585  1  14  1.115914  AMP-binding protein
GOZ73_RS09590  2  15  1.359409  autotransporter domain-containing protein
GOZ73_RS09595  2  14  1.518068  autotransporter outer membrane beta-barrel
GOZ73_RS09600  0  12  0.307105  autotransporter domain-containing protein
GOZ73_RS09605  0  17  2.031176  thioesterase
GOZ73_RS09610  -1  19  2.500381  SDR family oxidoreductase
GOZ73_RS09615  -1  20  3.927447  HAMP domain-containing histidine kinase
GOZ73_RS09620  0  20  3.834138  response regulator transcription factor
GOZ73_RS09625  -1  11  -1.603886  hypothetical protein
GOZ73_RS09630  0  13  -3.445894  porphobilinogen synthase
GOZ73_RS09635  1  20  -6.107024  hypothetical protein
GOZ73_RS09640  2  26  -8.404547  threonine--tRNA ligase
GOZ73_RS09645  5  41  -12.516836  translation initiation factor IF-3
GOZ73_RS09650  6  49  -14.339216  DEAD/DEAH box helicase family protein
GOZ73_RS09655  5  38  -10.648723  type I restriction-modification system subunit
GOZ73_RS09660  2  30  -7.281703  restriction endonuclease subunit S
GOZ73_RS09665  1  23  -4.600389  tyrosine-type recombinase/integrase
GOZ73_RS09670  1  22  -2.723430  restriction endonuclease subunit S
GOZ73_RS09675  1  18  -0.056294  restriction endonuclease subunit S
GOZ73_RS09680  1  23  5.787951  DUF418 domain-containing protein
GOZ73_RS09685  -1  19  4.656226  1-deoxy-D-xylulose-5-phosphate reductoisomerase
GOZ73_RS09690  -1  15  3.645448  pantetheine-phosphate adenylyltransferase
GOZ73_RS09695  -1  15  3.590871  peptide chain release factor-like protein
GOZ73_RS09700  -1  13  2.063532  hypothetical protein
GOZ73_RS09705  -1  13  0.612074  UTP--glucose-1-phosphate uridylyltransferase
GOZ73_RS09710  -1  16  -1.568629  methyltransferase domain-containing protein
GOZ73_RS09715  1  21  -3.219190  hypothetical protein
GOZ73_RS09720  2  15  -4.212099  tetratricopeptide repeat protein
GOZ73_RS09725  2  15  -4.462007  LicD family protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS09380  ENTEROTOXINA  30  0.039  Heat-labile enterotoxin A chain signature.
		>ENTEROTOXINA#Heat-labile enterotoxin A chain signature.

Length = 258

Score = 30.0 bits (67), Expect = 0.039
Identities = 16/42 (38%), Positives = 21/42 (50%), Gaps = 9/42 (21%)

Query: 268 RNRERRDDLFRRLSSLPGAAVLPSEANFLLFRLAGAPHGLAA 309
RNRE RD +R L+ + P+E +RLAG P A
Sbjct: 159 RNREYRDRYYRNLN------IAPAEDG---YRLAGFPPDHQA 191


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS09440  PF06580  33  0.001  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 33.3 bits (76), Expect = 0.001
Identities = 20/135 (14%), Positives = 48/135 (35%), Gaps = 10/135 (7%)

Query: 64 VWLITGGGALFAAFPYVYASVFSGFYLAFMLLLLTLIFRAVSIEFRSKQPMKWWRRGWDT 123
V+ +TG G S+ ++ M L+LT +R+ F +Q G
Sbjct: 22 VYTLTGFGFASLYGSPKLHSMIFNIAISLMGLVLTHAYRS----FIKRQGWLKLNMGQII 77

Query: 124 TFSI-SSLLAALLIGVAMGNVTKGIPLDDHGNFTGTFLSLLNPYSILLGLTTVALFAMHG 182
+ + ++ ++ VA ++ + + + T P ++ + V + M
Sbjct: 78 LRVLPACVVIGMVWFVANTSIWRLLAFINTKPVAFT-----LPLALSIIFNVVVVTFMWS 132

Query: 183 GIYLLMKTQGSLQEQ 197
+Y + ++
Sbjct: 133 LLYFGWHFFKNYKQA 147


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS09485  IGASERPTASE  31  0.013  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 31.2 bits (70), Expect = 0.013
Identities = 24/97 (24%), Positives = 38/97 (39%), Gaps = 17/97 (17%)

Query: 454 NNTVSRPRTVSAQAPSPK----------KTPKEKRKPEEKG-QDAPTTDSAAPEKQKAAS 502
N ++R P+P + K++ K EK QDA T + E K A
Sbjct: 1014 NEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAK 1073

Query: 503 S------SEAAAAREEASEENSVVTETEESPSVPAEE 533
S A+ + + + TET+E+ +V EE
Sbjct: 1074 SNVKANTQTNEVAQSGSETKETQTTETKETATVEKEE 1110


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS09590  TYPE3OMBPROT  27  0.014  Type III secretion system outer membrane B protein ...
		>TYPE3OMBPROT#Type III secretion system outer membrane B protein family signature.

Length = 538

Score = 26.6 bits (58), Expect = 0.014
Identities = 18/70 (25%), Positives = 33/70 (47%), Gaps = 10/70 (14%)

Query: 3 LVGSNIFGRE-ALAEIRVNAAQDLGDRRGE-TNISLLGNPGFAQSVR--------GAKVG 52
L +++ G E ++ + +VNA + L +RGE T + + + G + V V
Sbjct: 282 LTPTSLTGGEESMLKDQVNALKGLNSKRGEPTKLLIRNSDGLLKEVSVNLKVVTFNFGVN 341

Query: 53 TTALQLGAGL 62
AL++G G
Sbjct: 342 ELALKMGLGW 351


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS09595  BINARYTOXINB  32  0.029  Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 32.0 bits (72), Expect = 0.029
Identities = 19/112 (16%), Positives = 37/112 (33%), Gaps = 26/112 (23%)

Query: 493 ATAEHVIGGNNKGASTTLTGNTNVTVKDNAIVAGAIIGGSTSSHNAVTTITGNTSVLVTN 552
+ E N + T++ NT+ + + + G+ H + I G+ S
Sbjct: 301 SKNEDQSTQNTDSQTRTISKNTS-----TSRTHTSEVHGNAEVHASFFDIGGSVS---AG 352

Query: 553 IQHSNSATVNLGDFGNVTAQNFITGGSAWTANQTSGTTIQGNTSVTINVGDA 604
+SNS+TV + + G W ++ +N D
Sbjct: 353 FSNSNSSTVAIDH------SLSLAGERTW------------AETMGLNTADT 386


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS09600  IGASERPTASE  36  0.002  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 36.2 bits (83), Expect = 0.002
Identities = 48/171 (28%), Positives = 74/171 (43%), Gaps = 17/171 (9%)

Query: 449 TDYVWNAADEGGALEGSGNLEKTGSAKLTINMDNADYAGKVIL-GGGTLEMGNDAALGTG 507
TDY W++ + + G EK+ + L D ++ V G GTL + N+ G G
Sbjct: 347 TDYSWSSNGKTSTITGG---EKSLNVDLADGKDKPNHGKSVTFEGSGTLTLNNNIDQGAG 403

Query: 508 DLAFDGGTLRYGTGVTADISGQISTTSKSSVKVDTNGNNVSWASADHYKALNLEKSGNGV 567
L F+G Y T+D +TT K + G V+W + + L K G G
Sbjct: 404 GLFFEGD---YEVKGTSD-----NTTWKGAGVSVAEGKTVTW-KVHNPQYDRLAKIGKGT 454

Query: 568 LTLGG-AVYTGGLTVDEGGVSI---TSGSVQTKFSAGVTVNAGGTLAISST 614
L + G G L V +G V + T+GS Q F++ V+ TL ++
Sbjct: 455 LIVEGTGDNKGSLKVGDGTVILKQQTNGSGQHAFASVGIVSGRSTLVLNDD 505


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS09610  NUCEPIMERASE  185  1e-58  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 185 bits (471), Expect = 1e-58
Identities = 83/336 (24%), Positives = 148/336 (44%), Gaps = 38/336 (11%)

Query: 4 RILITGGAGFIGSHLSERLLREGHEVICMDNFFTGSKQNI----LHLTDYPGFEVIRHDV 59
+ L+TG AGFIG H+S+RLL GH+V+ +DN ++ L L PGF+ + D+
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 60 T-------VPYVMEVDQIYNLACPASPPHYQFDPIHTMKTSVLGALNMLGLAKRCK-ARI 111
+ ++++ + + +P +++ G LN+L + K +
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHL 121

Query: 112 LQASTSEVYGDPMVHPQPETYWGNVN-PVGVRSCYDEGKRCAETLFMDYRRMNGVDVRII 170
L AS+S VYG P + +V+ PV S Y K+ E + Y + G+ +
Sbjct: 122 LYASSSSVYGLN--RKMPFSTDDSVDHPV---SLYAATKKANELMAHTYSHLYGLPATGL 176

Query: 171 RIFNTYGPRMNPNDGRVVSNFIVQALKGEDITIYGTGKQTRSFQYVDDLVEGMVRMMDTE 230
R F YGP P+ + F L+G+ I +Y GK R F Y+DD+ E ++R+ D
Sbjct: 177 RFFTVYGPWGRPD--MALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVI 234

Query: 231 GFSGP------------------VNLGNPEEFTMLELAEKVIEMTGSSSKTVFRPLPLDD 272
+ N+GN +++ + + + G +K PL D
Sbjct: 235 PHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLPLQPGD 294

Query: 273 PTQRKPDIRLAKEKLGWKPHITLEKGLEKTIAYFRG 308
+ D + E +G+ P T++ G++ + ++R
Sbjct: 295 VLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRD 330


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS09620  HTHFIS  105  1e-28  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 105 bits (264), Expect = 1e-28
Identities = 42/138 (30%), Positives = 69/138 (50%)

Query: 4 SAYTILIAEDDDSIRHALTDVLTASGYEVLALEEGLAAIHAVRERNFDLALLDVAMPGAD 63
+ TIL+A+DD +IR L L+ +GY+V + + DL + DV MP +
Sbjct: 2 TGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 64 GFQVLQVMSEERPGTPVIMLTARGEEEDRVQGLKLGADDYIVKPFSIRELLARIEAVLRR 123
F +L + + RP PV++++A+ ++ + GA DY+ KPF + EL+ I L
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 124 SPERPRQLREIRIPGATL 141
RP +L + G L
Sbjct: 122 PKRRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS09690  LPSBIOSNTHSS  154  1e-50  Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 154 bits (390), Expect = 1e-50
Identities = 60/165 (36%), Positives = 99/165 (60%), Gaps = 4/165 (2%)

Query: 4 IAVYAGSFDPLTNGHLWMIRQGARMFDELIVAMGDNPDKRYTFSHEERMDMLRVALSDMP 63
A+Y GSFDP+T GHL +I +G R+FD++ VA+ NP+K+ FS +ER++ + A++ +P
Sbjct: 2 NAIYPGSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAHLP 61

Query: 64 DVRIAEFHNRFLVDFANEHGATFMLRGIRSTQDYEYERVMRHINADMAPKVCTVFLMPPR 123
+ ++ F V++A + A +LRG+R D+E E M + N +A + TVFL
Sbjct: 62 NAQVDSFEG-LTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDLETVFLTTST 120

Query: 124 DTAELSSSMVKGLIGPEGWEGQVSRYVPHNVFAMLKEKYREVFPR 168
+ + LSSS+VK + + G V +VP +V A L +++ V R
Sbjct: 121 EYSFLSSSLVKEVA---RFGGNVEHFVPSHVAAALYDQFHPVVER 162


27  GOZ73_RS10255  GOZ73_RS10280  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
GOZ73_RS10255  1  21  -6.003755  extracellular solute-binding protein
GOZ73_RS10260  2  24  -6.264847  ABC transporter permease subunit
GOZ73_RS10265  0  22  -5.673221  ABC transporter permease
GOZ73_RS10270  0  25  -6.052933  metallophosphatase family protein
GOZ73_RS10275  0  23  -5.277301  hypothetical protein
GOZ73_RS10280  0  25  -5.037259  DUF2075 domain-containing protein
28  GOZ73_RS10345  GOZ73_RS10370  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GOZ73_RS10345  0  14  4.921486  chorismate-binding protein
GOZ73_RS10350  0  16  4.840241  polysulfide reductase NrfD
GOZ73_RS10355  0  15  4.768455  DUF3341 domain-containing protein
GOZ73_RS10360  -1  13  3.870976  cytochrome c
GOZ73_RS10365  -1  10  3.295976  hypothetical protein
GOZ73_RS10370  -2  11  3.053204  cytochrome c
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS10370  TYPE3OMOPROT  30  0.044  Type III secretion system outer membrane O protein ...
		>TYPE3OMOPROT#Type III secretion system outer membrane O protein family signature.

Length = 303

Score = 29.6 bits (66), Expect = 0.044
Identities = 12/25 (48%), Positives = 16/25 (64%)

Query: 82 SRWMKPAEWLWHTCVALGLATIAAG 106
S W+KP +WL H AL A ++AG
Sbjct: 51 SAWIKPGDWLEHVSPALAGAAVSAG 75


29  GOZ73_RS10440  GOZ73_RS10530  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GOZ73_RS10440  2  15  2.544688  hypothetical protein
GOZ73_RS10445  1  15  2.207561  hypothetical protein
GOZ73_RS10450  1  16  2.937359  YkgJ family cysteine cluster protein
GOZ73_RS10455  1  16  3.154384  3-phosphoshikimate 1-carboxyvinyltransferase
GOZ73_RS10460  1  15  2.584749  UDP-N-acetylglucosamine
GOZ73_RS10470  1  14  1.401366  HU family DNA-binding protein
GOZ73_RS10475  0  14  0.352267  chorismate synthase
GOZ73_RS10485  0  19  0.223133  hypothetical protein
GOZ73_RS10490  -1  18  0.648974  flavodoxin domain-containing protein
GOZ73_RS10495  -2  14  0.262002  hypothetical protein
GOZ73_RS10500  -1  14  -0.434716  metallophosphoesterase
GOZ73_RS10505  0  18  -3.601198  hypothetical protein
GOZ73_RS10510  3  23  -6.132579  GGGtGRT protein
GOZ73_RS10515  5  28  -7.258598  hypothetical protein
GOZ73_RS10520  3  25  -7.023787  DNA adenine methylase
GOZ73_RS10525  3  27  -7.351142  tyrosine recombinase
GOZ73_RS10530  4  24  -6.258574  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS10475  DNABINDINGHU  92  2e-28  Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 92.4 bits (230), Expect = 2e-28
Identities = 40/88 (45%), Positives = 51/88 (57%), Gaps = 2/88 (2%)

Query: 2 NKAQLIELIQSKLGADTTKKHAEEALAAVLESIKEGVKESGKVQIIGFGTFATKTREART 61
NK LI + + TKK + A+ AV ++ + + KVQ+IGFG F + R AR
Sbjct: 3 NKQDLIAKVAEA--TELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARK 60

Query: 62 GRNPKTGNTINIPASKTVAFKASSALKD 89
GRNP+TG I I ASK AFKA ALKD
Sbjct: 61 GRNPQTGEEIKIKASKVPAFKAGKALKD 88


30  GOZ73_RS10600  GOZ73_RS10625  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GOZ73_RS10600  1  22  -3.817716  hypothetical protein
GOZ73_RS12445  3  20  -2.831853  cytochrome c biogenesis protein CcsA
GOZ73_RS10610  3  21  -2.578624  cytochrome c biogenesis protein ResB
GOZ73_RS10615  3  20  -2.349466  siderophore ABC transporter substrate-binding
GOZ73_RS10620  2  19  -1.899365  ATP-binding cassette domain-containing protein
GOZ73_RS10625  2  19  -2.132083  iron chelate uptake ABC transporter family
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS10615  FERRIBNDNGPP  63  1e-13  Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 63.0 bits (153), Expect = 1e-13
Identities = 38/182 (20%), Positives = 77/182 (42%), Gaps = 14/182 (7%)

Query: 41 VNPKRVAVLDYGSLETIGELGVTPA----LALPKKFL--PPYLSKFSGEQYTDLGTVKEF 94
++P R+ L++ +E + LG+ P + ++ PP + D+G E
Sbjct: 33 IDPNRIVALEWLPVELLLALGIVPYGVADTINYRLWVSEPPL-----PDSVIDVGLRTEP 87

Query: 95 NIETINAFKPDLIIISGRQQDYYQKLSSIAPVYLVD-TLAKDQLGEAKKNIRLLGDVFGV 153
N+E + KP ++ S + L+ IAP + + K L A+K++ + D+ +
Sbjct: 88 NLELLTEMKPSFMVWSAGYGPSPEMLARIAPGRGFNFSDGKQPLAMARKSLTEMADLLNL 147

Query: 154 PEKAAEAVKNIDEAVTRTKDKAQASGKKALVL--LTNDGKVSAYGSGSRFGLVHDALGVA 211
A + ++ + K + G + L+L L + + +G S F + D G+
Sbjct: 148 QSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVFGPNSLFQEILDEYGIP 207

Query: 212 QA 213
A
Sbjct: 208 NA 209


31  GOZ73_RS10715  GOZ73_RS10875  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GOZ73_RS10715  0  17  -3.386343  DUF262 domain-containing protein
GOZ73_RS10720  -1  16  1.557101  aminotransferase class IV
GOZ73_RS10725  0  16  1.891586  aminodeoxychorismate synthase component I
GOZ73_RS10730  -2  16  1.640165  aminodeoxychorismate/anthranilate synthase
GOZ73_RS10735  -3  13  1.264606  NADH:flavin oxidoreductase
GOZ73_RS10740  -1  15  0.521316  TIGR02206 family membrane protein
GOZ73_RS10745  -1  20  -2.546638  pyridoxamine 5'-phosphate oxidase family
GOZ73_RS10750  0  20  -2.402996  glutamine amidotransferase
GOZ73_RS10755  0  18  -3.107487  Cys-tRNA(Pro) deacylase
GOZ73_RS10760  -2  9  -0.247085  cytidylate kinase family protein
GOZ73_RS10765  -2  8  -0.323664  pyrimidine dimer DNA glycosylase/endonuclease V
GOZ73_RS10770  -2  8  -0.001541  hypothetical protein
GOZ73_RS10780  -1  17  1.732283  *hypothetical protein
GOZ73_RS10785  -2  14  0.260505  PA14 domain-containing protein
GOZ73_RS10790  -2  13  -2.924491  methionine synthase
GOZ73_RS10795  0  25  -7.443654  threonine/serine exporter family protein
GOZ73_RS10800  0  26  -8.273467  threonine/serine exporter family protein
GOZ73_RS10805  0  31  -9.598073  type I restriction-modification system subunit
GOZ73_RS10810  0  27  -9.756717  restriction endonuclease subunit S
GOZ73_RS10815  0  19  -7.234670  restriction endonuclease subunit S
GOZ73_RS12245  -2  11  -3.322449  transcriptional regulator
GOZ73_RS10820  -2  10  -3.204811  tyrosine-type recombinase/integrase
GOZ73_RS10825  -3  9  -2.300991  hypothetical protein
GOZ73_RS10830  -3  10  -0.868587  type I restriction endonuclease subunit R
GOZ73_RS10840  -2  19  2.740784  *homoserine dehydrogenase
GOZ73_RS10845  -2  19  3.080236  aspartate kinase
GOZ73_RS10850  -2  17  3.236570  thioredoxin domain-containing protein
GOZ73_RS10855  -2  17  3.320763  TIGR02206 family membrane protein
GOZ73_RS10860  -2  23  3.102134  hypothetical protein
GOZ73_RS10865  -1  25  3.463026  adenosylcobalamin-dependent
GOZ73_RS10870  0  21  3.198317  shikimate dehydrogenase
GOZ73_RS10875  -1  20  3.113731  A/G-specific adenine glycosylase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS10735  CHLAMIDIAOMP  28  0.050  Chlamydia major outer membrane protein signature.
		>CHLAMIDIAOMP#Chlamydia major outer membrane protein signature.

Length = 393

Score = 28.4 bits (63), Expect = 0.050
Identities = 9/44 (20%), Positives = 16/44 (36%)

Query: 269 FDCSTRRFWEPEFEGSRLNLAGWTRKLTGKPTISVGSVGLKGDF 312
FD T R +P+ + + + G + + G GD
Sbjct: 297 FDADTIRIAQPKSATAIFDTTTLNPTIAGAGDVKASAEGQLGDT 340


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS10840  RTXTOXINA  31  0.012  Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 31.1 bits (70), Expect = 0.012
Identities = 31/106 (29%), Positives = 48/106 (45%), Gaps = 7/106 (6%)

Query: 108 ALLAEYGAEIFKLSAEMGTPIYFEASAGGGIPIIQ--SLQNSLICNHINSIVGIINGTSN 165
+LLA + E + A + T AS GI SL + + + ++ GII+G
Sbjct: 352 SLLAAFHKETGAIDASLTTISTVLASVSSGISAAATTSLVGAPVSALVGAVTGIISGILE 411

Query: 166 YILSAMGEHGA-DYADALAQAQKLGFAEEDPSLDVNGWDAAHKALI 210
AM EH A AD +A+ +K + + NG+DA H A +
Sbjct: 412 ASKQAMFEHVASKMADVIAEWEK----KHGKNYFENGYDARHAAFL 453


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS10845  CARBMTKINASE  36  2e-04  Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 36.3 bits (84), Expect = 2e-04
Identities = 21/100 (21%), Positives = 44/100 (44%), Gaps = 8/100 (8%)

Query: 112 RIHSIDPTLMNKYLQEGNILICAGFQGV---TEEGMVQTL-GRGGSDLSAIAIAAALKAD 167
+ + K ++ G I+I +G GV E+G ++ + DL+ +A + AD
Sbjct: 172 GHVEAET--IKKLVERGVIVIASGGGGVPVILEDGEIKGVEAVIDKDLAGEKLAEEVNAD 229

Query: 168 VCQIFTDVDGVYTCDPRVVKDAKKIQTLSYDEMLEMASNG 207
+ I TDV+G + + ++ + +E+ + G
Sbjct: 230 IFMILTDVNGAALYYGT--EKEQWLREVKVEELRKYYEEG 267


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS10865  PRTACTNFAMLY  32  0.026  Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 31.6 bits (71), Expect = 0.026
Identities = 19/64 (29%), Positives = 29/64 (45%), Gaps = 2/64 (3%)

Query: 820 AVLTADELAVVNGAAHIESIPWLNKKDLPVFDCAA--TNGNGIRAISPAGHVLMLGALQP 877
A +T A+V+G HI ++ L +DLP TN + A V +LGA +
Sbjct: 169 ANVTVQRSAIVDGGLHIGALQSLQPEDLPPSRVVLRDTNVTAVPASGAPAAVSVLGASEL 228

Query: 878 FISG 881
+ G
Sbjct: 229 TLDG 232


32  GOZ73_RS10960  GOZ73_RS11030  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GOZ73_RS10960  -1  20  4.112995  hypothetical protein
GOZ73_RS10965  -1  22  4.081847  serine/threonine protein kinase
GOZ73_RS10970  0  20  4.133291  tryptophanase
GOZ73_RS10975  -1  19  3.355001  hypothetical protein
GOZ73_RS10980  -1  20  2.998009  DUF4105 domain-containing protein
GOZ73_RS10985  -2  20  2.764038  phenylalanine--tRNA ligase subunit beta
GOZ73_RS10990  -2  19  2.057291  phenylalanine--tRNA ligase subunit alpha
GOZ73_RS10995  -2  18  1.544064  MFS transporter
GOZ73_RS11000  -3  20  1.525733  M60 family metallopeptidase
GOZ73_RS11005  -2  23  2.674665  molecular chaperone HtpG
GOZ73_RS11010  -1  17  3.189965  potassium channel family protein
GOZ73_RS11020  -1  17  4.347500  alanine racemase
GOZ73_RS11025  -1  14  3.504222  RrF2 family transcriptional regulator
GOZ73_RS11030  -2  14  3.158397  O-acetylhomoserine
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS10965  YERSSTKINASE  41  5e-06  Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 41.3 bits (96), Expect = 5e-06
Identities = 25/66 (37%), Positives = 33/66 (50%), Gaps = 4/66 (6%)

Query: 149 RLLDILHYLHSMGVYHCDIKPSN-IFIQPDGTPKLIDFGAVRTKTLQHQGLVQITPGYTP 207
RLLD+ ++L GV H DIKP N +F + G P +ID G Q +G T +
Sbjct: 253 RLLDVTNHLAKAGVVHNDIKPGNVVFDRASGEPVVIDLGLHSRSGEQPKGF---TESFKA 309

Query: 208 PEFYPG 213
PE G
Sbjct: 310 PELGVG 315


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS11020  ALARACEMASE  313  e-107  Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 313 bits (803), Expect = e-107
Identities = 109/362 (30%), Positives = 176/362 (48%), Gaps = 15/362 (4%)

Query: 9 AWAEIDLGAIRHNLNVVKQAAKGEYYMPVVKAGAYGHGLEQVCRTLDSEGIAFFGVANVG 68
A +DL A++ NL++V+QAA VVKA AYGHG+E++ + + F + N+
Sbjct: 5 IQASLDLQALKQNLSIVRQAATHARVWSVVKANAYGHGIERIWSAIGATDG--FALLNLE 62

Query: 69 EARRISQAGCRTRPYILGPAF-PEEREEIVLNGWRSFISTMEEAAHYNSLARLYGKTLPI 127
EA + + G + +L F ++ E + + + + + + L I
Sbjct: 63 EAITLRERGWKGPILMLEGFFHAQDLEIYDQHRLTTCVHSNWQLKALQNARL--KAPLDI 120

Query: 128 HLSVDTGMGRGGFLPDQLEELLSRLGELDSLHLEGLGAHLPCADEDREITLRQISRFEQM 187
+L V++GM R GF PD++ + +L + ++ L +H A+ + ++R EQ
Sbjct: 121 YLKVNSGMNRLGFQPDRVLTVWQQLRAMANVGEMTLMSHFAEAEHP-DGISGAMARIEQA 179

Query: 188 AARIREKLPLKYCHLANSAASLDYEIPSNNMCRPGLVLYGFSPIPS---PWAVQLKPAMA 244
E L + L+NSAA+L + + RPG++LYG SP L+P M
Sbjct: 180 ----AEGLECRR-SLSNSAATLWHPEAHFDWVRPGIILYGASPSGQWRDIANTGLRPVMT 234

Query: 245 LFSRLTVVRTLPEGHGISYGGTFVTDHPTRVATVGIGYADGYLRSLAHKGARVMVDGVSC 304
L S + V+TL G + YGG + R+ V GYADGY R G V+VDGV
Sbjct: 235 LSSEIIGVQTLKAGERVGYGGRYTARDEQRIGIVAAGYADGYPRHAP-TGTPVLVDGVRT 293

Query: 305 PLLGRVTMDQIMVDVSRMPHAEPGMTAEIMGPHIPVTELAEKAGTISWEIFTGIGPRVPR 364
+G V+MD + VD++ P A G E+ G I + ++A AGT+ +E+ + RVP
Sbjct: 294 MTVGTVSMDMLAVDLTPCPQAGIGTPVELWGKEIKIDDVAAAAGTVGYELMCALALRVPV 353

Query: 365 HY 366

Sbjct: 354 VT 355


33  GOZ73_RS11380  GOZ73_RS11485  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GOZ73_RS11380  -2  19  -3.076924  glycosyltransferase
GOZ73_RS11385  0  21  -3.199934  glycosyltransferase
GOZ73_RS11390  -1  20  -3.163237  glycosyltransferase
GOZ73_RS11395  -1  21  -1.672009  glycosyltransferase
GOZ73_RS11400  -1  22  -0.062499  hypothetical protein
GOZ73_RS11405  -1  22  0.986046  O-antigen ligase family protein
GOZ73_RS11410  -1  23  2.536733  alpha-1,2-fucosyltransferase
GOZ73_RS11415  -1  24  3.079859  glycosyltransferase
GOZ73_RS11420  -1  27  3.673219  glycosyltransferase
GOZ73_RS11425  -1  23  1.939387  glycosyltransferase
GOZ73_RS11430  -1  20  0.913046  hypothetical protein
GOZ73_RS11435  -1  19  -2.239245  transferase
GOZ73_RS11440  0  18  -3.545637  glycosyltransferase family 2 protein
GOZ73_RS11445  0  18  -3.903921  glycosyltransferase
GOZ73_RS11450  0  18  -4.934433  glycosyltransferase family 2 protein
GOZ73_RS11455  0  16  -4.519306  polysaccharide pyruvyl transferase family
GOZ73_RS11460  -1  17  -3.975246  oligosaccharide flippase family protein
GOZ73_RS11465  -2  18  -2.704250  nitroreductase family protein
GOZ73_RS11470  -3  19  -1.047515  acyltransferase
GOZ73_RS12450  -1  20  0.125343  hypothetical protein
GOZ73_RS11475  -1  23  2.617461  hypothetical protein
GOZ73_RS11480  -1  23  2.966680  aminoglycoside phosphotransferase family
GOZ73_RS11485  -2  19  3.132103  L-fucose mutarotase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS11460  PF05844  32  0.007  YopD protein
		>PF05844#YopD protein

Length = 295

Score = 31.5 bits (71), Expect = 0.007
Identities = 6/35 (17%), Positives = 14/35 (40%)

Query: 266 AQAVASQVSSGVTSLMGSFSSAMQPQITKSYAAGD 300
A AV + V + ++++GS + +
Sbjct: 122 AMAVIAGVGALASAVVGSLGALKNGKAISQEKTLQ 156


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS11475  OMPADOMAIN  36  2e-04  OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 35.7 bits (82), Expect = 2e-04
Identities = 22/86 (25%), Positives = 36/86 (41%), Gaps = 4/86 (4%)

Query: 42 AQLGAGYATKYISRGLAFQNSASDHVIPLEAVGA-YQLNNQYSLIGGLKYQWLTDNGLEH 100
+LG + ++ + + V P+ A G Y + + + L+YQW + G H
Sbjct: 116 TRLGGMVW-RADTKSNVYGKNHDTGVSPVFAGGVEYAITPEIAT--RLEYQWTNNIGDAH 172

Query: 101 NDSGICDEGSAILGVSRKFGKSTVAA 126
D G LGVS +FG+ A
Sbjct: 173 TIGTRPDNGMLSLGVSYRFGQGEAAP 198


34  GOZ73_RS11615  GOZ73_RS11790  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GOZ73_RS11615  1  18  3.436055  PEP-CTERM sorting domain-containing protein
GOZ73_RS11620  2  23  4.627032  carbohydrate porin
GOZ73_RS11625  4  25  5.161948  hydrogenase small subunit
GOZ73_RS11630  3  23  4.624105  nickel-dependent hydrogenase large subunit
GOZ73_RS11635  2  20  5.331550  cytochrome b/b6 domain-containing protein
GOZ73_RS11640  0  15  3.351211  HyaD/HybD family hydrogenase maturation
GOZ73_RS11645  0  14  2.808431  hydrogenase maturation nickel metallochaperone
GOZ73_RS11650  -1  15  2.467417  hypothetical protein
GOZ73_RS11655  -1  16  2.238250  DUF4870 domain-containing protein
GOZ73_RS11660  -1  16  2.282876  hypothetical protein
GOZ73_RS11665  -1  17  0.600930  family 20 glycosylhydrolase
GOZ73_RS11670  0  14  0.566424  NAD(P)H-dependent oxidoreductase
GOZ73_RS11675  1  14  -0.745876  histidinol-phosphate transaminase
GOZ73_RS11680  2  12  -0.555067  hypothetical protein
GOZ73_RS11685  2  12  -0.787183  hypothetical protein
GOZ73_RS11690  1  11  -0.423240  transcriptional regulator
GOZ73_RS11695  0  9  -0.785105  hypothetical protein
GOZ73_RS11700  -1  10  0.026567  acyltransferase family protein
GOZ73_RS11705  0  12  1.680668  fumarate hydratase
GOZ73_RS11710  1  12  1.503340  hypothetical protein
GOZ73_RS11715  0  14  1.906692  6-carboxytetrahydropterin synthase
GOZ73_RS11720  1  17  2.685926  DUF3313 family protein
GOZ73_RS11725  2  16  1.634460  beta-N-acetylhexosaminidase
GOZ73_RS11730  3  17  -0.625962  signal recognition particle protein
GOZ73_RS11735  5  21  -2.054664  quinolinate synthase NadA
GOZ73_RS11740  6  23  -3.477444  SAM-dependent methyltransferase
GOZ73_RS11745  7  26  -4.717771  RHS repeat-associated core domain-containing
GOZ73_RS11750  6  58  -17.732570  RHS repeat-associated core domain-containing
GOZ73_RS11755  7  58  -18.346003  hypothetical protein
GOZ73_RS12455  9  56  -17.135378  hypothetical protein
GOZ73_RS11765  8  77  -26.177427  signal peptidase I
GOZ73_RS11770  9  71  -23.922166  RHS repeat-associated core domain-containing
GOZ73_RS11775  10  56  -20.585519  hypothetical protein
GOZ73_RS11780  8  44  -14.839598  hypothetical protein
GOZ73_RS11785  4  29  -8.871344  hypothetical protein
GOZ73_RS11790  3  25  -6.690330  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS11730  PF07132  30  0.022  Harpin protein (HrpN)
		>PF07132#Harpin protein (HrpN)

Length = 356

Score = 30.0 bits (67), Expect = 0.022
Identities = 23/47 (48%), Positives = 25/47 (53%), Gaps = 5/47 (10%)

Query: 431 MGAMMKQMASMKGKGGKMPGLPGMGGGLGGLKGLGGLGGGLGGLFGG 477
MG+MM G GG + GL GGLGG GGLGGGLG G
Sbjct: 61 MGSMM-----GGGLGGGLGGLGSSLGGLGGGLLGGGLGGGLGSSLGS 102



Score = 30.0 bits (67), Expect = 0.023
Identities = 21/40 (52%), Positives = 23/40 (57%), Gaps = 4/40 (10%)

Query: 438 MASMKGKGGKMPGLPGMGGGLGGLKGLGGLGGGLGGLFGG 477
M +M G M G G+GGGLGGL LGG GGL GG
Sbjct: 55 MTTMMFMGSMMGG--GLGGGLGGL--GSSLGGLGGGLLGG 90


35  GOZ73_RS02080  GOZ73_RS02120  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GOZ73_RS02080  -1  14  1.864568  ATP-binding cassette domain-containing protein
GOZ73_RS02085  -1  13  2.640987  hypothetical protein
GOZ73_RS02090  -1  12  3.735006  hypothetical protein
GOZ73_RS02095  0  12  3.554683  ATP-binding cassette domain-containing protein
GOZ73_RS02100  -2  11  3.043607  hypothetical protein
GOZ73_RS02105  -1  12  2.563402  SDR family NAD(P)-dependent oxidoreductase
GOZ73_RS02110  -1  11  1.082274  glycoside hydrolase family 75 protein
GOZ73_RS02115  1  14  -0.287197  squalene/phytoene synthase family protein
GOZ73_RS02120  1  13  -1.765613  sigma-54-dependent Fis family transcriptional
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS02080  PF05272  30  0.005  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.005
Identities = 14/42 (33%), Positives = 20/42 (47%), Gaps = 5/42 (11%)

Query: 14 GKQWIFENYNRVFTSGIT-----ILKGYSGCGKTTLLKILAG 50
GK + + RV G +L+G G GK+TL+ L G
Sbjct: 577 GKYILMGHVARVMEPGCKFDYSVVLEGTGGIGKSTLINTLVG 618


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS02095  PF05272  34  0.004  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 34.3 bits (78), Expect = 0.004
Identities = 23/94 (24%), Positives = 33/94 (35%), Gaps = 18/94 (19%)

Query: 418 VVHLNAYHALRCRFSA-----------------GVLDEEYHAIRTLSVEGVTKEFLRSGR 460
+N H R A G ++Y R ++ V K L G
Sbjct: 526 AADMNRVHPFRDWVKAQQWDEVPRLEKWLVHVLGKTPDDYKPRRLRYLQLVGKYIL-MGH 584

Query: 461 VLDNIDLTVKRGEMVCILGPSGSGKSTLLSMLAG 494
V ++ K V + G G GKSTL++ L G
Sbjct: 585 VARVMEPGCKFDYSVVLEGTGGIGKSTLINTLVG 618


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS02105  DHBDHDRGNASE  70  2e-16  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 70.5 bits (172), Expect = 2e-16
Identities = 57/197 (28%), Positives = 91/197 (46%), Gaps = 9/197 (4%)

Query: 1 MHTARLQYDRIIITGASSGFGEAFAGTLAPHTAELVLIARNEAALRQLAAALEKRHSGLH 60
M+ ++ ITGA+ G GEA A TLA A + + N L ++ ++L+ H
Sbjct: 1 MNAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAE--ARH 58

Query: 61 ASVLPCDLADEASLNMLISRLDSLPPGRTLLINNAG---AGDYGDFSDGRWEKIRFLLRL 117
A P D+ D A+++ + +R++ +L+N AG G SD WE +
Sbjct: 59 AEAFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEAT---FSV 115

Query: 118 NVESLTRLCHALVPSMK-RNGGDIINLSSLGALLPIPDFAVYAATKAYVSSLSEALRLEL 176
N + ++ M R G I+ + S A +P A YA++KA ++ L LEL
Sbjct: 116 NSTGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLEL 175

Query: 177 REHGIRVLVVCPGPVST 193
E+ IR +V PG T
Sbjct: 176 AEYNIRCNIVSPGSTET 192


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GOZ73_RS02110  PF05616  34  0.002  Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 33.6 bits (76), Expect = 0.002
Identities = 18/47 (38%), Positives = 21/47 (44%), Gaps = 1/47 (2%)

Query: 471 APKPPPAPKPESGGTAETPADKRKKEEAPGNKPAPAPAPPASPQTAP 517
A P P PE AE PA+ E PG +P P P P +P P
Sbjct: 320 AEAPNAQPLPEVS-PAENPANNPAPNENPGTRPNPEPDPDLNPDANP 365



Score = 31.3 bits (70), Expect = 0.012
Identities = 16/41 (39%), Positives = 23/41 (56%), Gaps = 1/41 (2%)

Query: 478 PKPE-SGGTAETPADKRKKEEAPGNKPAPAPAPPASPQTAP 517
P+P+ + G+AE P + E +P PA PAP +P T P
Sbjct: 311 PRPDLTPGSAEAPNAQPLPEVSPAENPANNPAPNENPGTRP 351


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
GOZ73_RS02120  HTHFIS  401    e-138    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 401 bits (1033), Expect = e-138
Identities = 132/364 (36%), Positives = 194/364 (53%), Gaps = 30/364 (8%)

Query: 172 EERAALVAENEKLRELLTTN--PGELIGNCSTMLQVYEQIRQVAPSDATVLIRGSSGTGK 229
AL + +L + L+G + M ++Y + ++ +D T++I G SGTGK
Sbjct: 114 IIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGK 173

Query: 230 ELVARAVVNLSGRKDKPLVTLNCAAMPENLLESELFGHEKGSFTGATSRRIGRAEAADGG 289
ELVARA+ + R++ P V +N AA+P +L+ESELFGHEKG+FTGA +R GR E A+GG
Sbjct: 174 ELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGG 233

Query: 290 TLFLDEIGDLSLQMQVKLLRFLQEKTFSRVGSNRELHADVRFIAATSRNLEELMAASKFR 349
TLFLDEIGD+ + Q +LLR LQ+ ++ VG + +DVR +AAT+++L++ + FR
Sbjct: 234 TLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFR 293

Query: 350 EDLYYRLNIFPIVMPDLSKRKGDVMLLAEHFLSKFNLKYGKDIKRLSTPAINMLMAYHWP 409
EDLYYRLN+ P+ +P L R D+ L HF+ + K G D+KR A+ ++ A+ WP
Sbjct: 294 EDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQAE-KEGLDVKRFDQEALELMKAHPWP 352

Query: 410 GNVRELENCMERAVITAQDDCIYGYNLPASLQMPSHDVPYSRDG---------------- 453
GNVRELEN + R D I + L+ D P +
Sbjct: 353 GNVRELENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENM 412

Query: 454 -----------EAPADLPTMVDSFERELIVAALKRSPGNMSAAARELGISPRVLHYKMHR 502
++ E LI+AAL + GN AA LG++ L K+
Sbjct: 413 RQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRE 472

Query: 503 LGLQ 506
LG+
Sbjct: 473 LGVS 476


36    GOZ73_RS04220    GOZ73_RS04260    N        Y        N        Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
GOZ73_RS04220  -2      16       1.035822   SLC13 family permease
GOZ73_RS04225  -2      14       1.960841   ATP-binding protein
GOZ73_RS04230  -2      14       2.651564   response regulator
GOZ73_RS04235  -2      13       3.118897   YidE/YbjL duplication
GOZ73_RS04240  -1      12       2.489701   alpha-galactosidase
GOZ73_RS04245  -1      12       2.422883   HU family DNA-binding protein
GOZ73_RS04250  -1      12       3.098137   hypothetical protein
GOZ73_RS12370  -1      13       2.867580   hypothetical protein
GOZ73_RS04260  -2      14       2.955391   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
GOZ73_RS04220  PF01540  31     0.013    Adhesin lipoprotein
		>PF01540#Adhesin lipoprotein

Length = 475

Score = 31.2 bits (70), Expect = 0.013
Identities = 26/104 (25%), Positives = 49/104 (47%), Gaps = 10/104 (9%)

Query: 105 PEVVGKLKENVENRLNKKTKLDAEEVRMTLGFQLMDAYEKIDLNAQELSR-EGKTEEAAG 163
P ++ KL VEN +++ K+D ++ D KI A+EL + K + A
Sbjct: 89 PAIISKLSAAVENAKSEQQKVDQANK------KIADENLKIKEGAKELLKLSEKIQSFA- 141

Query: 164 QESIAAQLKTAAGRLYSKEITARIQGLQFVNTMQQKST-MATFA 206
++IA + G+ + + T + Q + + + +KS + TFA
Sbjct: 142 -DTIALTITKLEGKKFQIDETFKKQLISTIELLNKKSAEVKTFA 184


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
GOZ73_RS04225  PF06580  34     0.001    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 34.1 bits (78), Expect = 0.001
Identities = 17/73 (23%), Positives = 26/73 (35%), Gaps = 19/73 (26%)

Query: 403 ILVSVTDNGPGVPKDLHEKIFRPNFTTKKKGLSFGLGLGLTIVR-RIVDSYGG--RIRLE 459
+ + V + G K+ E G GL VR R+ YG +I+L
Sbjct: 292 VTLEVENTGSLALKNTKE----------------STGTGLQNVRERLQMLYGTEAQIKLS 335

Query: 460 SVPGHTVFTIVIP 472
G ++IP
Sbjct: 336 EKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
GOZ73_RS04230  HTHFIS  69     2e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 68.7 bits (168), Expect = 2e-16
Identities = 27/131 (20%), Positives = 51/131 (38%), Gaps = 10/131 (7%)

Query: 1 MEKINIICVDDQPEVLDSVLRDLRPLNSCFRLEGVESASECRELLEEFDQDGELTGLIIS 60
M I+ DD + + + L + + +A+ + D D L+++
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAG--YDVRITSNAATLWRWIAAGDGD-----LVVT 53

Query: 61 DHVMPGGSGVELLGGIAKDDRFAGTRKILLTGQATHADTIQAVNDAHIDNYLEKPWDPAV 120
D VMP + +LL I K ++++ Q T T ++ +YL KP+D
Sbjct: 54 DVVMPDENAFDLLPRIKKAR--PDLPVLVMSAQNT-FMTAIKASEKGAYDYLPKPFDLTE 110

Query: 121 LLATARKLLTR 131
L+ + L
Sbjct: 111 LIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS04245  DNABINDINGHU  64     2e-17    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 64.3 bits (157), Expect = 2e-17
Identities = 23/86 (26%), Positives = 42/86 (48%), Gaps = 2/86 (2%)

Query: 2 NKTQFIRELQRELGNDVTSRQAEHALNAVLETIARSVRRRSLVKFRGFGTFRVKRRRARI 61
NK I ++ ++T + + A++AV ++ + + V+ GFG F V+ R AR
Sbjct: 3 NKQDLIAKVAEA--TELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARK 60

Query: 62 VRHPENGGRLLVHESKTMKFRPSSLL 87
R+P+ G + + SK F+ L
Sbjct: 61 GRNPQTGEEIKIKASKVPAFKAGKAL 86


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
GOZ73_RS04260  STREPKINASE  27     0.020    Streptococcus streptokinase protein signature.
		>STREPKINASE#Streptococcus streptokinase protein signature.

Length = 440

Score = 26.6 bits (58), Expect = 0.020
Identities = 14/44 (31%), Positives = 19/44 (43%)

Query: 50 PWAPAPAQQEILPEDENEESAPGPAVLPGDFRPGLRTPRLPETL 93
P+ P Q + D P DFRPGL+ +L +TL
Sbjct: 171 PYKEKPIQNQAKSVDVEYTVQFTPLNPDDDFRPGLKDTKLLKTL 214


37    GOZ73_RS05600    GOZ73_RS05660    N        Y        N        Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
GOZ73_RS05600  -1      15       -0.792989  cell division protein ZapB
GOZ73_RS05605  1       13       0.526908   methylenetetrahydrofolate reductase [NAD(P)H]
GOZ73_RS05615  -1      15       1.004728   tetratricopeptide repeat protein
GOZ73_RS05620  0       16       1.215324   hypothetical protein
GOZ73_RS05625  0       17       0.816278   TetR/AcrR family transcriptional regulator
GOZ73_RS05630  0       17       0.483241   efflux RND transporter permease subunit
GOZ73_RS05635  -1      18       0.725265   hypothetical protein
GOZ73_RS05640  0       16       0.505656   TolC family protein
GOZ73_RS05645  0       15       0.845747   hypothetical protein
GOZ73_RS05650  -1      14       0.665240   ankyrin repeat domain-containing protein
GOZ73_RS05655  -2      15       1.485277   MFS transporter
GOZ73_RS05660  -3      11       -0.131490  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS05605  SECYTRNLCASE  31     0.003    Preprotein translocase SecY subunit signature.
		>SECYTRNLCASE#Preprotein translocase SecY subunit signature.

Length = 437

Score = 31.3 bits (71), Expect = 0.003
Identities = 13/68 (19%), Positives = 29/68 (42%), Gaps = 2/68 (2%)

Query: 2 SNFRILILAFLVVVLAVLGMIYVNISSRMDVIQQQQQMASAPAPVNPSPYLPMPTQAPAS 61
+ V ++ V +++V + R +Q ++M + S Y+P+ + +
Sbjct: 217 GGWIEFGTVIAVGLIMVALVVFVEQAQRRIPVQYAKRMIGRRSYGGTSTYIPL--KVNQA 274

Query: 62 AVQPVVVA 69
V PV+ A
Sbjct: 275 GVIPVIFA 282


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
GOZ73_RS05620  PF07472  25     0.037    Fucose-binding lectin II
		>PF07472#Fucose-binding lectin II

Length = 245

Score = 25.4 bits (55), Expect = 0.037
Identities = 16/65 (24%), Positives = 20/65 (30%), Gaps = 20/65 (30%)

Query: 23 SCTTDSYGNSSVTPGGAAAIGVGALAAGAAIGAAVQHDRDKDRYRNDWRHHNPPPPPPRP 82
+CT G V PG AA GVGA+ ++ P P P
Sbjct: 83 ACTVTWAGAPGVLPGAAAKFGVGAVV--------------------NYFSKATPQPEPTQ 122

Query: 83 PHRPQ 87
P
Sbjct: 123 PGTTT 127


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
GOZ73_RS05625  HTHTETR  49     2e-09    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 48.9 bits (116), Expect = 2e-09
Identities = 19/152 (12%), Positives = 54/152 (35%), Gaps = 12/152 (7%)

Query: 2 PKRKSTREHLLNTGAMIVGESGLRALTVRGLSLRAGTNTGSFVYHFGNREAFLTELLERW 61
+ + TR+H+L+ + + G+ + ++ ++ AG G+ +HF ++ +E+ E
Sbjct: 7 QEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELS 66

Query: 62 YEPLFTGIRTHANIHLPAFEKLEAILGDVMEFLL--------RHGQFIAQLVLDALAGEQ 113
+ ++L +++ +L R GE
Sbjct: 67 ESNI---GELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEM 123

Query: 114 AAI-RFISGIHTRHPLVLLETIREAQQEGSVL 144
A + + + + +T++ + +
Sbjct: 124 AVVQQAQRNLCLESYDRIEQTLKHCIEAKMLP 155


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS05630  ACRIFLAVINRP  569    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 569 bits (1468), Expect = 0.0
Identities = 233/1066 (21%), Positives = 437/1066 (40%), Gaps = 72/1066 (6%)

Query: 7 FCLNNRITVILLTIALAIISIFVVKNIPVDVFPELKVPRVTIQTEAPGLTAEEVEQYITI 66
F + I +L I L + + +PV +P + P V++ PG A+ V+ +T
Sbjct: 4 FFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQ 63

Query: 67 PIESAMNGTAGVKGVRSSSGS-GLSFVWVDFDWNKDIYQARQIVTERLGAVRESLPEGTS 125
IE MNG + + S+S S G + + F D A+ V +L LP+
Sbjct: 64 VIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQ 123

Query: 126 PELAPIVSITGE-IMLIALTGD-KDTSQLDMRQLAEYKLRTRLLAIPGVGQVTVLGGRLP 183
+ + + +M+ D T+Q D+ ++ L + GVG V + G +
Sbjct: 124 QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQY- 182

Query: 184 EYQVIYDPNKLKLAGVDLSSLKTAIQESQSSVPAGYLED---VAGQEL--PIQQDTRTAN 238
++ D + L + + ++ + AG L + GQ+L I TR N
Sbjct: 183 AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKN 242

Query: 239 IEQLNRALVPDHASG-ILRLQDVAEVKIDGAPRRGDAGFMGEDAVVLSVQKVPGANTLAL 297
E+ + + ++ G ++RL+DVA V++ G A G+ A L ++ GAN L
Sbjct: 243 PEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALDT 302

Query: 298 TQAVDAAVREFSQSQLPKGMKLHTAAYRQADFIEMSLDNGTETLLIAGAVVMIVIFLTLL 357
+A+ A + E P+GMK+ Y F+++S+ +TL A +V +V++L L
Sbjct: 303 AKAIKAKLAELQPF-FPQGMKV-LYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 358 NLRTAIITLISMPLSILFGLMMFPVFGLAINIMTLGGLAVAVGDVVDNAIIFVEIAWRHL 417
N+R +I I++P+ +L + FG +IN +T+ G+ +A+G +VD+AI+ VE R +
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 418 NRNAALPKDKRKSKYEVLMKAKSEIVGSISFSSVIILLVFTPVLFLSGLEGQFFRPLGIS 477
+ PK E K+ S+I G++ ++++ VF P+ F G G +R I+
Sbjct: 421 MEDKLPPK-------EATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSIT 473

Query: 478 YMLALLSSLIVAVTITPVLCYIWFKKSKN---------AATLESGDSFSSRLIKRIYAPI 528
+ A+ S++VA+ +TP LC K + S I
Sbjct: 474 IVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKI 533

Query: 529 LEFCLRFSKTVCAIMTAITLLALWLGSTFGTSFLPPFNEDCYTVFVSTVPGTSLDETERI 588
L R+ I+ + L +SFLP ++ + + G + + T+++
Sbjct: 534 LGSTGRYLLIYALIV----AGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKV 589

Query: 589 SRKVMKDI--EQIPGVLSVTQRTGRAENDEHAEPVSASELLV-------RVDLKKDQKEL 639
+V + V SV G + + + +A V R + + +
Sbjct: 590 LDQVTDYYLKNEKANVESVFTVNGFSFSG---QAQNAGMAFVSLKPWEERNGDENSAEAV 646

Query: 640 RAAIKKCIDGIPGTSSMIGYPLAHRISSALSGSNSEIAINIYGTELPQLRLAARK--AKE 697
K + I + I + + + + AR
Sbjct: 647 IHRAKMELGKIRDGFVIPFNMP--AIVELGTATGFDFELIDQAGLGHDALTQARNQLLGM 704

Query: 698 ILENMPEVADARANREIMVDTIRVQYNQEALASYGLTMANAAEQVSTAMNGQKLGEVIKN 757
++ + R N +++ +QE + G+++++ + +STA+ G + + I
Sbjct: 705 AAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDR 764

Query: 758 QDHWNIVLRIDPRLKTSMEDVKNLELISPNKKTVRLDDVAQVYREEVSNLILRDNTMRKA 817
+ ++ D + + EDV L + S N + V + S + R N +
Sbjct: 765 GRVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSM 824

Query: 818 MISCNPSPNSNLGD---LAKACREQLDPVMNAMGCTVDYDGTIKARESASQRLYVLGAIV 874
I +P ++ GD L + +L G D+ G + + L AI
Sbjct: 825 EIQGEAAPGTSSGDAMALMENLASKLPA-----GIGYDWTGMSYQERLSGNQAPALVAIS 879

Query: 875 MVLIVLLLSSALGSVRRAMLTLVNIPLCLVGGIVAVFLASPGTLSSLFGGTYIPPILSVA 934
V++ L L++ S + ++ +PL +VG ++A L V
Sbjct: 880 FVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLF---------NQK-----NDVY 925

Query: 935 SIVGFVTVIGFAIRSGLILLNRYR-ALEHQGMEPAEAIREGSLERVVPIIMTSLTTVLGL 993
+VG +T IG + ++ ++++ + +E +G EA R+ PI+MTSL +LG+
Sbjct: 926 FMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGV 985

Query: 994 LPLIWAIDQPGGELLGPLAIVQFGGLVTATILNLLIIPATAKLFSR 1039
LPL + G + I GG+V+AT+L + +P + R
Sbjct: 986 LPLAISNG-AGSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVVIRR 1030



Score = 96.8 bits (241), Expect = 2e-22
Identities = 91/551 (16%), Positives = 195/551 (35%), Gaps = 72/551 (13%)

Query: 528 ILEFCLRFSKTVCAIMTAITLLALWLGSTFGTSFLPPFNEDCYTVFVSTVPGTSLDETE- 586
+ F +R + + + + P +V + PG +
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVS-ANYPGADAQTVQD 59

Query: 587 RISRKVMKDIEQIPGVLSVTQRTGRAENDEHAEPVSASELLVR----VDLKKDQKELRAA 642
+++ + +++ I ++ ++ + A + + + D Q +++
Sbjct: 60 TVTQVIEQNMNGIDNLMYMSSTSDSAGS---------VTITLTFQSGTDPDIAQVQVQNK 110

Query: 643 IKKCIDGIP----------GTSSMIGYPLAHRISSALSGSNSEIAINIYGTELPQLRLAA 692
++ +P SS +A +S + +I+ A
Sbjct: 111 LQLATPLLPQEVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDIS-----------DYVA 159

Query: 693 RKAKEILENMPEVADAR---ANREIMVDTIRVQYNQEALASYGLTMANAAEQVSTAMN-- 747
K+ L + V D + A + R+ + + L Y LT + Q+ +
Sbjct: 160 SNVKDTLSRLNGVGDVQLFGAQYAM-----RIWLDADLLNKYKLTPVDVINQLKVQNDQI 214

Query: 748 --GQKLGEVIKNQDHWNIVLRIDPRLKTSMEDVKNLELISPNKKTVRLDDVAQVYR-EEV 804
GQ G N + R K E K ++ + VRL DVA+V E
Sbjct: 215 AAGQLGGTPALPGQQLNASIIAQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGEN 274

Query: 805 SNLILRDNTMRKAMISCNPSPNSNLGDLAKACRE---QLDPVM-NAMGCTVDYDGTIKAR 860
N+I R N A + + +N D AKA + +L P M YD T +
Sbjct: 275 YNVIARINGKPAAGLGIKLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQ 334

Query: 861 ESASQRLYVL-GAIVMVLIVLLLSSALGSVRRAMLTLVNIPLCLVGGIVAVFLASPGTLS 919
S + + L AI++V +V+ L L ++R ++ + +P+ L+G +
Sbjct: 335 LSIHEVVKTLFEAIMLVFLVMYLF--LQNMRATLIPTIAVPVVLLGTFAILAA------- 385

Query: 920 SLFGGTYIPPILSVASIVGFVTVIGFAIRSGLILL-NRYRALEHQGMEPAEAIREGSLER 978
FG + ++ ++ G V IG + ++++ N R + + P EA + +
Sbjct: 386 --FGYS-----INTLTMFGMVLAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQI 438

Query: 979 VVPIIMTSLTTVLGLLPLIWAIDQPGGELLGPLAIVQFGGLVTATILNLLIIPATAKLFS 1038
++ ++ +P + G + +I + + ++ L++ PA
Sbjct: 439 QGALVGIAMVLSAVFIP-MAFFGGSTGAIYRQFSITIVSAMALSVLVALILTPALCATLL 497

Query: 1039 RWIASRRKELK 1049
+ +++ E K
Sbjct: 498 KPVSAEHHENK 508


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
GOZ73_RS05635  RTXTOXIND  33     0.002    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 33.3 bits (76), Expect = 0.002
Identities = 15/94 (15%), Positives = 33/94 (35%), Gaps = 10/94 (10%)

Query: 167 VKSAQQVRKGELLYTLASPDIVEMEGNASQARAALDRAAAELGTLRERRAQLEKIGTRNS 226
VK + VRKG++L L + E + + +++L +A E + +E
Sbjct: 112 VKEGESVRKGDVLLKL---TALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPEL 168

Query: 227 ELNTSI-------QFKEAEIHSLRAALNASNNKL 253
+L + ++ + N+
Sbjct: 169 KLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQK 202


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
GOZ73_RS05660  GPOSANCHOR  28     0.031    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 28.1 bits (62), Expect = 0.031
Identities = 13/34 (38%), Positives = 15/34 (44%)

Query: 106 PEAAKARKTASARATMIPFFTALILLGMLAAGLL 139
P R+ S T PFFTA L M AG+
Sbjct: 497 PMKETKRQLPSTGETANPFFTAAALTVMATAGVA 530


38    GOZ73_RS06550    GOZ73_RS06600    N        Y        Y        Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
GOZ73_RS06550  1       28       -4.925933  YbaB/EbfC family nucleoid-associated protein
GOZ73_RS06555  1       25       -3.624751  DNA polymerase III subunit gamma/tau
GOZ73_RS12390  1       25       -2.484636  hypothetical protein
GOZ73_RS06560  1       26       -1.578184  hypothetical protein
GOZ73_RS06565  2       28       -0.847447  hypothetical protein
GOZ73_RS06575  2       25       0.742864   *tyrosine-type recombinase/integrase
GOZ73_RS06580  3       31       -0.551510  hypothetical protein
GOZ73_RS06585  3       28       -2.132128  hypothetical protein
GOZ73_RS06590  3       26       -1.717992  portal protein
GOZ73_RS06595  4       23       -2.037156  hypothetical protein
GOZ73_RS06600  4       25       -1.467256  N-acetylmuramoyl-L-alanine amidase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS06550  BCTERIALGSPD  27     0.015    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 26.8 bits (59), Expect = 0.015
Identities = 16/75 (21%), Positives = 26/75 (34%), Gaps = 8/75 (10%)

Query: 27 QTVTTEGAGGKLKVTATC-DGNLTELVIDPSIIDPSDSEFLQELLLQTINAAIAKGKETA 85
TV + G KLKV +G+ L I+ + +D+ + A
Sbjct: 477 NTVERKTVGIKLKVKPQINEGDSVLLEIEQEVSSVADA---ASSTSSDLGATFNTRTVNN 533

Query: 86 AAEMKK----LTGGL 96
A + + GGL
Sbjct: 534 AVLVGSGETVVVGGL 548


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
GOZ73_RS06555  TONBPROTEIN  31     0.014    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 30.7 bits (69), Expect = 0.014
Identities = 11/43 (25%), Positives = 17/43 (39%)

Query: 651 PEPVQEELAPLPAPPPSASTIPAPKPHTPQKKTAEPVQEASKE 693
PEP E + P P P PKP K + ++ ++
Sbjct: 71 PEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRD 113


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS12390  ANTHRAXTOXNA  26     0.016    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 25.9 bits (56), Expect = 0.016
Identities = 10/21 (47%), Positives = 16/21 (76%)

Query: 29 TVFSGKDLVTHGSERLEDEFP 49
T ++G D+V HG+E+ +EFP
Sbjct: 567 TGYTGGDVVNHGTEQDNEEFP 587


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
GOZ73_RS06560  GPOSANCHOR  31     0.006    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 30.8 bits (69), Expect = 0.006
Identities = 27/151 (17%), Positives = 61/151 (40%), Gaps = 4/151 (2%)

Query: 23 DNEEALTKERRECASRYKNLNGLRGEFTEEKTKYVDFLVKNEALTNDINALTKEKDELEV 82
+ + E A+ L K + L + AL EK +LE
Sbjct: 243 ADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEH 302

Query: 83 KNQELASANEAKKSDLQTQQTALAELQSKSKDMESIQAIADR-IKGLEEESKQLEVVKQA 141
++Q L + ++ + DL + A +L+++ + +E I++ + L + K+
Sbjct: 303 QSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQ 362

Query: 142 EQGKHDAIVAETEQLVVNNAALRQLKADQDA 172
+ +H + + + ++ A+ + L+ D DA
Sbjct: 363 LEAEHQKLEEQNK---ISEASRQSLRRDLDA 390



Score = 30.8 bits (69), Expect = 0.006
Identities = 20/108 (18%), Positives = 33/108 (30%), Gaps = 1/108 (0%)

Query: 62 KNEALTNDINALTKEKDELEVKNQELASANEAKKSDLQTQQTALAELQSKSKDM-ESIQA 120
+ L + K + L + A + + AL + S I+
Sbjct: 191 RQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKT 250

Query: 121 IADRIKGLEEESKQLEVVKQAEQGKHDAIVAETEQLVVNNAALRQLKA 168
+ LE +LE + A A+ + L AAL KA
Sbjct: 251 LEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKA 298



Score = 27.7 bits (61), Expect = 0.047
Identities = 20/104 (19%), Positives = 39/104 (37%), Gaps = 1/104 (0%)

Query: 62 KNEALTNDINALTKEKDELEVKNQELASANEAKKSDLQTQQTALAELQSKSKDME-SIQA 120
KN L+ + AL DEL + L + + + EL+++ D+E +++
Sbjct: 72 KNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEG 131

Query: 121 IADRIKGLEEESKQLEVVKQAEQGKHDAIVAETEQLVVNNAALR 164
+ + K LE K A + + E + + A
Sbjct: 132 AMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADS 175


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
GOZ73_RS06565  GPOSANCHOR  33     0.001    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 33.1 bits (75), Expect = 0.001
Identities = 25/148 (16%), Positives = 57/148 (38%), Gaps = 6/148 (4%)

Query: 48 EIRTWTSLRNDMIKNRGLAAEENLSLISNKASVTQQRDEQLSKEAELKEEEVQLNADLSK 107
+ + +N L N ++ DE + + KE+ + + LS+
Sbjct: 51 TLEKVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSE 110

Query: 108 INKQIEELKSKLQDFG------VSSVQEIQEKMESLEASNKKLQEDIDQIKSATEVASKR 161
+I+EL+++ D ++ K+++LEA L ++ A E A
Sbjct: 111 KASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNF 170

Query: 162 RAEQASELASRQKEQAEYRAALAKNGEE 189
++++ + + E+A A A+ +
Sbjct: 171 STADSAKIKTLEAEKAALEARQAELEKA 198


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
GOZ73_RS06580  PF05616  31     0.003    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 31.3 bits (70), Expect = 0.003
Identities = 16/42 (38%), Positives = 25/42 (59%), Gaps = 4/42 (9%)

Query: 30 PSASGRPSLANPAPEPTPADDEQPNPPPPSDPGS---PQGDP 68
P ++ P+ A P PE +PA++ NP P +PG+ P+ DP
Sbjct: 317 PGSAEAPN-AQPLPEVSPAENPANNPAPNENPGTRPNPEPDP 357


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
GOZ73_RS06600  STREPKINASE  29     0.011    Streptococcus streptokinase protein signature.
		>STREPKINASE#Streptococcus streptokinase protein signature.

Length = 440

Score = 28.9 bits (64), Expect = 0.011
Identities = 16/57 (28%), Positives = 24/57 (42%)

Query: 13 TGARGNGLEEHDVACVIARHLFAQLKDMGHTVHVLDFPDKGNTEDLNATIKVANADG 69
+GA + LE+ D+ I L A + V+DF D N + A+ DG
Sbjct: 93 SGAMSHKLEKADLLKAIQEQLIANVHSNDDYFEVIDFASDATITDRNGKVYFADKDG 149


39    GOZ73_RS07285    GOZ73_RS07315    N        Y        Y        Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
GOZ73_RS07285  -2      20       3.815252   MFS transporter
GOZ73_RS07290  -1      17       3.995628   PEP-CTERM sorting domain-containing protein
GOZ73_RS07300  -1      20       4.443012   *ABC transporter permease
GOZ73_RS07305  -1      19       3.868585   ABC transporter permease
GOZ73_RS07310  0       17       3.991388   ATP-binding cassette domain-containing protein
GOZ73_RS07315  0       20       3.575291   efflux RND transporter periplasmic adaptor
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
GOZ73_RS07285  TCRTETA  36     2e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 36.3 bits (84), Expect = 2e-04
Identities = 44/223 (19%), Positives = 81/223 (36%), Gaps = 18/223 (8%)

Query: 133 TVSVAWSVLSDVMKPGQMKRLFALVASGSSLGAMAGPAVTAALAGVAGYLWLFLAAAVFL 192
T +VA + ++D+ + R F +++ G +AGP + + G + + F AAA+
Sbjct: 112 TGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNG 171

Query: 193 ALAMLAGMYLHRWRDGNSPEDEETGVLLPADCRERPLGGNPFAGASAVFRSPFLMG-IGL 251
+ L G + A R G A AVF L+G +
Sbjct: 172 LNFLTGCFLLPESHKGERRPLRREALNPLASFRWA-RGMTVVAALMAVFFIMQLVGQVPA 230

Query: 252 FIILLAGTNTFLYFELMSVVASSFPDPVRQTQVFGALDVVVQGCTMLLQVFLAGRIVRKF 311
+ ++ G + F + ++ + FG L L Q + G + +
Sbjct: 231 ALWVIFGEDRFHWDATTIGISLA---------AFGILHS-------LAQAMITGPVAARL 274

Query: 312 GLAALLAAVPVLISLGFVWMAFAPVFAVVAVVMAVRRIGEYGM 354
G L + G++ +AFA + +M + G GM
Sbjct: 275 GERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGM 317


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS07300  ABC2TRNSPORT  56     3e-11    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 56.5 bits (136), Expect = 3e-11
Identities = 36/190 (18%), Positives = 79/190 (41%), Gaps = 7/190 (3%)

Query: 176 FIVPGLIAVLVLINSILSG-SLSIAREREEGTFDQLLVAPYTPGEILLGKGTASVVTGIM 234
F+ G++A + + + R + T++ +L G+I+LG+ + +
Sbjct: 68 FLAAGMVATSAMTAATFETIYAAFGRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAAL 127

Query: 235 QAVFVVLVAMFWFRIPFQGSIWLLSAALLLFIVTAAAIGLCISSFAQSLQQAIVGTFLLL 294
+ +VA + ++ L L + A+ +G+ +++ A S I L++
Sbjct: 128 AGAGIGVVAAALGYTQWLSLLYALPVIALTGLAFAS-LGMVVTALAPSYDYFIFYQTLVI 186

Query: 295 VPMVMLSGFATPISSMPEIFQDLTLLNPMRYGLELIQRIFLEGAG-----FLDLWPLFAA 349
P++ LSG P+ +P +FQ P+ + ++LI+ I L + ++
Sbjct: 187 TPILFLSGAVFPVDQLPIVFQTAARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIV 246

Query: 350 IMAVTVAAVL 359
I A+L
Sbjct: 247 IPFFLSTALL 256


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS07305  ABC2TRNSPORT  36     2e-04    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 35.7 bits (82), Expect = 2e-04
Identities = 32/145 (22%), Positives = 59/145 (40%), Gaps = 2/145 (1%)

Query: 196 AREYERGTLESLFVTPVGSGEILAAKAATNFLLGMVSLAISMLFAAFVFGIPIRGSLTLL 255
R + T E++ T + G+I+ + A ++ A + AA + SL
Sbjct: 92 GRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGYTQ-WLSLLYA 150

Query: 256 LAVSALFLIVALGLGLVISTATKNQFLACQFAIMGTFMPALMLSGFLYDILNMPPAVRAI 315
L V AL + LG+V+ TA + F P L LSG ++ + +P +
Sbjct: 151 LPVIALTGLAFASLGMVV-TALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQTA 209

Query: 316 TYLIPARYYVTLLQTLFLAGDIPSV 340
+P + + L++ + L + V
Sbjct: 210 ARFLPLSHSIDLIRPIMLGHPVVDV 234


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
GOZ73_RS07315  RTXTOXIND  75     4e-17    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 75.3 bits (185), Expect = 4e-17
Identities = 46/287 (16%), Positives = 98/287 (34%), Gaps = 31/287 (10%)

Query: 67 PGQKLATLETVRLRQAADEARQIAEAARQNYLRVKNGPRAEEIAQARANVQAAEATLNNA 126
P + + E V + + + ++ + + E A + E
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 127 GMRSKRLEALAETKSISRQEADDAVASRQVAAANLDVARKQLELLLAGSRE--------- 177
R +L ++I++ + A L V + QLE + +
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 178 --------EDVAQALAQYNQAKASLTIREQNLKDAVLYAPGNGVVRN-RILEKGDMASPQ 228
+ + Q L E+ + +V+ AP + V+ ++ +G + +
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 229 KPVYNIS-LNHTKWVRAYLTESQLGKVKPGFSATVRNDSFPDTGFKGTVGFISSVAEFTP 287
+ + I + T V A + +G + G +A ++ ++FP T + VG + ++
Sbjct: 354 ETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDA- 412

Query: 288 KNVETPDLRTALVYEVRIIVD-------DPDNRLRLGAPATVTIPLG 327
D R LV+ V I ++ + + L G T I G
Sbjct: 413 ----IEDQRLGLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTG 455


40    GOZ73_RS08150    GOZ73_RS08170    N        Y        N        Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
GOZ73_RS08150  -1      16       0.397485   outer membrane lipoprotein-sorting protein
GOZ73_RS08155  -1      14       0.380173   polysaccharide deacetylase family protein
GOZ73_RS08160  0       16       -0.324711  protein translocase subunit SecDF
GOZ73_RS08165  2       19       1.145550   EF-hand domain-containing protein
GOZ73_RS08170  0       16       0.797069   prepilin peptidase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
GOZ73_RS08150  PF03627  28     0.033    PapG
		>PF03627#PapG

Length = 336

Score = 28.0 bits (62), Expect = 0.033
Identities = 16/50 (32%), Positives = 23/50 (46%), Gaps = 6/50 (12%)

Query: 52 FSMSLRGNLIAFQ----YQLNNV--WNRFDLKFKDRGQEILSWKDGKAGV 95
FS+ + G A+ Y L NV + ++ R Q I SW+ G A V
Sbjct: 10 FSLCVSGESSAWNNIVFYSLGNVNSYQGGNVVITQRPQFITSWRPGIATV 59


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS08160  SECFTRNLCASE  231    2e-71    Bacterial translocase SecF protein signature.
		>SECFTRNLCASE#Bacterial translocase SecF protein signature.

Length = 333

Score = 231 bits (590), Expect = 2e-71
Identities = 94/311 (30%), Positives = 156/311 (50%), Gaps = 16/311 (5%)

Query: 480 LNLFKKKTNIDFMSQKIWSCSLSAILLMICIVFGAVKGKDCLGMDFTGGSSISYLVNKG- 538
L L +KTN DF + + + ++++ ++ V G + G+DF GG++I
Sbjct: 5 LKLVPEKTNFDFFRWQWATFGAAIVMMIASVILPLVIGLN-FGIDFKGGTTIRTESTTAI 63

Query: 539 --DISFREVEGVVNKLSLTQKATVQEVAESTDSSKVNILINFSDHAQDKEA--------- 587
+ +E + + + E + + I + + +
Sbjct: 64 DVGVYRAALEPLELGDVIISEVRDPSFREDQHVAMIRIQMQEDGQGAEGQGAQGQELVNK 123

Query: 588 ITTALDKAFPVLDNAPFSEETIGQSMGYDTLITSAWALFFGILGIMVYLTVRFEWTFAIG 647
+ TAL P L F E++G + + + T+ W+L + IM Y+ VRFEW FA+G
Sbjct: 124 VETALTAVDPALKITSF--ESVGPKVSGELVWTAVWSLLAATVVIMFYIWVRFEWQFALG 181

Query: 648 AVIALTHDVLLVLGLVIVSGTELNVIHIGALLTVAGYSINDKIIVFDRIREFLRFSDPNE 707
AV+AL HDVLL +GL V + ++ + ALLT+ GYSIND ++VFDR+RE L
Sbjct: 182 AVVALVHDVLLTVGLFAVLQLKFDLTTVAALLTITGYSINDTVVVFDRLRENLI-KYKTM 240

Query: 708 SASQIMNEAINQTLSRTILTSLSTLAVLVCLYFFGGPSMEDFAWTISAGILIGTYSSIFI 767
+MN ++N+TLSRT++T ++TL LV + +GG + F + + G+ GTYSS+++
Sbjct: 241 PLRDVMNLSVNETLSRTVMTGMTTLLALVPMLIWGGDVIRGFVFAMVWGVFTGTYSSVYV 300

Query: 768 ASPAALFFSRK 778
A LF
Sbjct: 301 AKNIVLFIGLD 311



Score = 78.0 bits (192), Expect = 1e-17
Identities = 31/174 (17%), Positives = 75/174 (43%), Gaps = 3/174 (1%)

Query: 297 KLNFESASEISASLGHTALLQGEYAGITGLILCFIMMIIYYRFA-GLVAIMGLTINALLL 355
L S + + + ++ + ++ + + + + L A++ L + LL
Sbjct: 134 ALKITSFESVGPKVSGELVWTAVWSLLAATVVIMFYIWVRFEWQFALGAVVALVHDVLLT 193

Query: 356 LGAMSIFGFELTLPGIAGIVLTLGVAVDSNVLIYERLRE--EKEMGRPFRVAIRNSFDKA 413
+G ++ + L +A ++ G +++ V++++RLRE K P R + S ++
Sbjct: 194 VGLFAVLQLKFDLTTVAALLTITGYSINDTVVVFDRLRENLIKYKTMPLRDVMNLSVNET 253

Query: 414 FSAIFDSNITSLITAVILYWMATGTIKGFAVTLTVGVITSMIGAILVTRVLFYW 467
S + +T+L+ V + I+GF + GV T ++ V + + +
Sbjct: 254 LSRTVMTGMTTLLALVPMLIWGGDVIRGFVFAMVWGVFTGTYSSVYVAKNIVLF 307


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
GOZ73_RS08165  TYPE4SSCAGX  31     0.008    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 30.9 bits (69), Expect = 0.008
Identities = 35/157 (22%), Positives = 65/157 (41%), Gaps = 16/157 (10%)

Query: 61 VQMKRIDPGIMIVGEELVLAKYDTDKDGKLSDEEIAVLQADVKKAQEARKAAILKKFDKD 120
+Q K + +M E + A D L +++ V D K+ +E +KA +K K+
Sbjct: 97 IQPKSVKSNLMFEKEAVNFALMTRDYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKE 156

Query: 121 GDGKLSKEERKAMQEEWLKDHPEAAKRMEEMKARQEARKAEILKKYDKDGDGKLSDEEKK 180
K K++R+ +EE K+ M Q + L + K
Sbjct: 157 QAQKAQKDKREKRKEERAKNRANLENLTNAMSNPQNLSNNKNLSELIKQ----------- 205

Query: 181 AMREDWAKNHPEAAKRMEEMKARQEAHAAAMIKKFDK 217
RE N + +R+E+M+ + +A+A I++ +K
Sbjct: 206 -QRE----NELDQMERLEDMQEQAQANALKQIEELNK 237


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS08170  PREPILNPTASE  51     1e-09    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 51.0 bits (122), Expect = 1e-09
Identities = 26/79 (32%), Positives = 34/79 (43%), Gaps = 4/79 (5%)

Query: 170 REAMGSGDAWILMMAGCACGWQGALFSLFLGSLLGIVQAAASRM----GFGRNLPFGPAL 225
+E MG GD +L G GWQ L L SL+G + + +PFGP L
Sbjct: 208 KEGMGYGDFKLLAALGAWLGWQALPIVLLLSSLVGAFMGIGLILLRNHHQSKPIPFGPYL 267

Query: 226 LSAALIWLLGGSQWWQAYI 244
A I LL G + Y+
Sbjct: 268 AIAGWIALLWGDSITRWYL 286


41    GOZ73_RS09590    GOZ73_RS09620    N        Y        N        Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
GOZ73_RS09590  2       15       1.359409   autotransporter domain-containing protein
GOZ73_RS09595  2       14       1.518068   autotransporter outer membrane beta-barrel
GOZ73_RS09600  0       12       0.307105   autotransporter domain-containing protein
GOZ73_RS09605  0       17       2.031176   thioesterase
GOZ73_RS09610  -1      19       2.500381   SDR family oxidoreductase
GOZ73_RS09615  -1      20       3.927447   HAMP domain-containing histidine kinase
GOZ73_RS09620  0       20       3.834138   response regulator transcription factor
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS09590  TYPE3OMBPROT  27     0.014    Type III secretion system outer membrane B protein ...
		>TYPE3OMBPROT#Type III secretion system outer membrane B protein family signature.

Length = 538

Score = 26.6 bits (58), Expect = 0.014
Identities = 18/70 (25%), Positives = 33/70 (47%), Gaps = 10/70 (14%)

Query: 3 LVGSNIFGRE-ALAEIRVNAAQDLGDRRGE-TNISLLGNPGFAQSVR--------GAKVG 52
L +++ G E ++ + +VNA + L +RGE T + + + G + V V
Sbjct: 282 LTPTSLTGGEESMLKDQVNALKGLNSKRGEPTKLLIRNSDGLLKEVSVNLKVVTFNFGVN 341

Query: 53 TTALQLGAGL 62
AL++G G
Sbjct: 342 ELALKMGLGW 351


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS09595  BINARYTOXINB  32     0.029    Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 32.0 bits (72), Expect = 0.029
Identities = 19/112 (16%), Positives = 37/112 (33%), Gaps = 26/112 (23%)

Query: 493 ATAEHVIGGNNKGASTTLTGNTNVTVKDNAIVAGAIIGGSTSSHNAVTTITGNTSVLVTN 552
+ E N + T++ NT+ + + + G+ H + I G+ S
Sbjct: 301 SKNEDQSTQNTDSQTRTISKNTS-----TSRTHTSEVHGNAEVHASFFDIGGSVS---AG 352

Query: 553 IQHSNSATVNLGDFGNVTAQNFITGGSAWTANQTSGTTIQGNTSVTINVGDA 604
+SNS+TV + + G W ++ +N D
Sbjct: 353 FSNSNSSTVAIDH------SLSLAGERTW------------AETMGLNTADT 386


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
GOZ73_RS09600  IGASERPTASE  36     0.002    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 36.2 bits (83), Expect = 0.002
Identities = 48/171 (28%), Positives = 74/171 (43%), Gaps = 17/171 (9%)

Query: 449 TDYVWNAADEGGALEGSGNLEKTGSAKLTINMDNADYAGKVIL-GGGTLEMGNDAALGTG 507
TDY W++ + + G EK+ + L D ++ V G GTL + N+ G G
Sbjct: 347 TDYSWSSNGKTSTITGG---EKSLNVDLADGKDKPNHGKSVTFEGSGTLTLNNNIDQGAG 403

Query: 508 DLAFDGGTLRYGTGVTADISGQISTTSKSSVKVDTNGNNVSWASADHYKALNLEKSGNGV 567
L F+G Y T+D +TT K + G V+W + + L K G G
Sbjct: 404 GLFFEGD---YEVKGTSD-----NTTWKGAGVSVAEGKTVTW-KVHNPQYDRLAKIGKGT 454

Query: 568 LTLGG-AVYTGGLTVDEGGVSI---TSGSVQTKFSAGVTVNAGGTLAISST 614
L + G G L V +G V + T+GS Q F++ V+ TL ++
Sbjct: 455 LIVEGTGDNKGSLKVGDGTVILKQQTNGSGQHAFASVGIVSGRSTLVLNDD 505


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
GOZ73_RS09610  NUCEPIMERASE  185    1e-58    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 185 bits (471), Expect = 1e-58
Identities = 83/336 (24%), Positives = 148/336 (44%), Gaps = 38/336 (11%)

Query: 4 RILITGGAGFIGSHLSERLLREGHEVICMDNFFTGSKQNI----LHLTDYPGFEVIRHDV 59
+ L+TG AGFIG H+S+RLL GH+V+ +DN ++ L L PGF+ + D+
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 60 T-------VPYVMEVDQIYNLACPASPPHYQFDPIHTMKTSVLGALNMLGLAKRCK-ARI 111
+ ++++ + + +P +++ G LN+L + K +
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHL 121

Query: 112 LQASTSEVYGDPMVHPQPETYWGNVN-PVGVRSCYDEGKRCAETLFMDYRRMNGVDVRII 170
L AS+S VYG P + +V+ PV S Y K+ E + Y + G+ +
Sbjct: 122 LYASSSSVYGLN--RKMPFSTDDSVDHPV---SLYAATKKANELMAHTYSHLYGLPATGL 176

Query: 171 RIFNTYGPRMNPNDGRVVSNFIVQALKGEDITIYGTGKQTRSFQYVDDLVEGMVRMMDTE 230
R F YGP P+ + F L+G+ I +Y GK R F Y+DD+ E ++R+ D
Sbjct: 177 RFFTVYGPWGRPD--MALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVI 234

Query: 231 GFSGP------------------VNLGNPEEFTMLELAEKVIEMTGSSSKTVFRPLPLDD 272
+ N+GN +++ + + + G +K PL D
Sbjct: 235 PHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLPLQPGD 294

Query: 273 PTQRKPDIRLAKEKLGWKPHITLEKGLEKTIAYFRG 308
+ D + E +G+ P T++ G++ + ++R
Sbjct: 295 VLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRD 330


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
GOZ73_RS09620  HTHFIS  105    1e-28    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 105 bits (264), Expect = 1e-28
Identities = 42/138 (30%), Positives = 69/138 (50%)

Query: 4 SAYTILIAEDDDSIRHALTDVLTASGYEVLALEEGLAAIHAVRERNFDLALLDVAMPGAD 63
+ TIL+A+DD +IR L L+ +GY+V + + DL + DV MP +
Sbjct: 2 TGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 64 GFQVLQVMSEERPGTPVIMLTARGEEEDRVQGLKLGADDYIVKPFSIRELLARIEAVLRR 123
F +L + + RP PV++++A+ ++ + GA DY+ KPF + EL+ I L
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 124 SPERPRQLREIRIPGATL 141
RP +L + G L
Sbjct: 122 PKRRPSKLEDDSQDGMPL 139


42  GOZ73_RS09935  GOZ73_RS09965  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
GOZ73_RS09935  -2       9        0.121000  SUMF1/EgtB/PvdO family nonheme iron enzyme
GOZ73_RS09940  -1      12       -0.717947  hypothetical protein
GOZ73_RS12435  -2      12       -1.035734  PDZ domain-containing protein
GOZ73_RS09950  -2      14       -0.850008  serine protease
GOZ73_RS09960   0      15       -1.175959  *acyltransferase
GOZ73_RS09965   1      19       -0.648160  S24 family peptidase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
GOZ73_RS09935  BLACTAMASEA  29     0.029    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 29.4 bits (66), Expect = 0.029
Identities = 14/52 (26%), Positives = 23/52 (44%), Gaps = 3/52 (5%)

Query: 150 MNTAVPGDFHEETSRDQMKRLGIDLPPLVTPDVPGTTRMRTLLEWMREGQNG 201
+N A+PGD + T+ + L L+T R LL+WM + +
Sbjct: 165 LNEALPGDARDTTT---PASMAATLRKLLTSQRLSARSQRQLLQWMVDDRVA 213


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
GOZ73_RS12435  V8PROTEASE  29     0.025    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 29.2 bits (65), Expect = 0.025
Identities = 14/42 (33%), Positives = 23/42 (54%), Gaps = 3/42 (7%)

Query: 292 GGPLLNLKGELIGMNIARVNRAENYALPISVVQKSVQNILKN 333
G P+ N K E+IG++ V N A+ + ++V+N LK
Sbjct: 238 GSPVFNEKNEVIGIHWGGVPNEFNGAV---FINENVRNFLKQ 276


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
GOZ73_RS09950  V8PROTEASE  55     7e-11    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 54.6 bits (131), Expect = 7e-11
Identities = 35/154 (22%), Positives = 54/154 (35%), Gaps = 19/154 (12%)

Query: 71 GSGSGILVSEKGLVFSAAHVVD---KKGTTLKIIL--------PDGTRLPGKTTAQNSNS 119
SG++V K + + HVVD LK P+G + T +
Sbjct: 102 FIASGVVVG-KDTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEG 160

Query: 120 DAGMAKITSQLNKKL------PCVEKAEKMPRVGDWVFALGHGGGLDRKRGPMVRLGRVV 173
D + K + K P +V + G+ G G++
Sbjct: 161 DLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGDKP-VATMWESKGKIT 219

Query: 174 SLKNGVIQTDCKLIRGDSGGPLFNLDGKLIGIHS 207
LK +Q D G+SG P+FN ++IGIH
Sbjct: 220 YLKGEAMQYDLSTTGGNSGSPVFNEKNEVIGIHW 253


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
GOZ73_RS09965  TCRTETOQM  31     0.007    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 30.6 bits (69), Expect = 0.007
Identities = 28/108 (25%), Positives = 43/108 (39%), Gaps = 26/108 (24%)

Query: 106 EAAEERQLRPMHSPLVPLVRQFPSHAGHARREYSTQIIGN--------IAAGSLAESDTV 157
E ++ +Q + L+ + P + + +II + + L E V
Sbjct: 351 EPSKPQQREMLLDALLEISDSDPLLR-YYVDSATHEIILSFLGKVQMEVTCALLQEKYHV 409

Query: 158 ------PSTIYMERPLGKNEYVVRVE----------GKSMEPLIPDGS 189
P+ IYMERPL K EY + +E G S+ PL P GS
Sbjct: 410 EIEIKEPTVIYMERPLKKAEYTIHIEVPPNPFWASIGLSVSPL-PLGS 456



 
Contact Sachin Pundhir for Bugs/Comments.