PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: CP017720-2.gb
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic): (none given)
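
The E-value (RPSBlast) setting of 0.05 appears to act as a reporting cutoff: every virulence-factor hit listed in the results has an E-value below it. A minimal sketch of such a filter follows, assuming hits are kept when their E-value is at or below the cutoff. The `Hit` record and `filter_hits` function are hypothetical names for illustration, not part of PredictBias, and the third record is made up to show a rejected hit.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One RPS-BLAST hit (hypothetical record layout)."""
    locus_tag: str
    hit: str
    score: float
    evalue: float


def filter_hits(hits, cutoff=0.05):
    """Keep hits whose E-value does not exceed the cutoff (assumed rule)."""
    return [h for h in hits if h.evalue <= cutoff]


hits = [
    Hit("GX95_00095", "PF06872", 28.5, 0.021),     # reported on this page
    Hit("GX95_00380", "PERTACTIN", 120.0, 8e-30),  # reported on this page
    Hit("EXAMPLE_0001", "MADEUP", 20.0, 0.3),      # invented, above the cutoff
]
kept = filter_hits(hits)
```

A lower E-value means a hit is less likely to arise by chance, so tightening the cutoff (for example to 1e-05) would drop the weakest hits first.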
 
C) Potential GIs and PAIs in CP017720
S.No  Start       End         Bias  Virulence  Insertion elements  Prediction
1     GX95_00090  GX95_00140  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_00090  0       17       4.110488    trimethylamine N-oxide reductase I catalytic
GX95_00095  3       23       5.617480    molecular chaperone TorD
GX95_00100  1       31       8.129984    cytochrome-c peroxidase
GX95_00105  7       56       15.227372   heme ABC transporter ATP-binding protein CcmA
GX95_00110  6       55       14.008501   heme exporter protein CcmB
GX95_00115  6       54       13.670813   heme ABC transporter permease
GX95_00120  5       45       11.894662   heme exporter protein CcmD
GX95_00125  4       43       10.955878   cytochrome c biogenesis protein CcmE
GX95_00130  3       39       9.765985    c-type cytochrome biogenesis protein CcmF
GX95_00135  0       27       5.872404    thiol:disulfide interchange protein
GX95_00140  -1      20       4.225696    cytochrome c biogenesis protein CcmH
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_00095  PF06872       29     0.021    EspG protein
		>PF06872#EspG protein

Length = 398

Score = 28.5 bits (63), Expect = 0.021
Identities = 14/54 (25%), Positives = 27/54 (50%)

Query: 111 LLLEAGMEVNDDFKEPTDHLAIYLELLSHLHFSLGESFQQRRMNKLRQKTLSSL 164
L+L+A +++N D+K+P + + +LL L L + + Q L+ L
Sbjct: 29 LVLDATIKINSDYKKPWNEMTCAEKLLKILTLGLWNPKYSQDERQQFQGLLTVL 82
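
Each alignment on this page opens with the same two statistics lines (`Score = ... bits (...), Expect = ...` and `Identities = ..., Positives = ...`, with an optional `Gaps = ...` field). A minimal sketch of pulling those numbers out with regular expressions follows. The function name `parse_stats` and the returned dictionary keys are hypothetical, and the pattern assumes the E-value is written in a form Python's `float()` accepts, as it is throughout this page.

```python
import re

# Patterns for the two statistics lines that precede each alignment.
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)
IDENT_RE = re.compile(
    r"Identities\s*=\s*(\d+)/(\d+)\s*\((\d+)%\),\s*"
    r"Positives\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)"
    r"(?:,\s*Gaps\s*=\s*(\d+)/(\d+)\s*\((\d+)%\))?"
)


def parse_stats(block):
    """Extract score, E-value and identity counts from one alignment header."""
    s = SCORE_RE.search(block)
    i = IDENT_RE.search(block)
    if s is None or i is None:
        raise ValueError("no BLAST statistics found")
    return {
        "bits": float(s.group(1)),
        "raw_score": int(s.group(2)),
        "evalue": float(s.group(3)),
        "identities": int(i.group(1)),
        "aligned_length": int(i.group(2)),
        "positives": int(i.group(4)),
        # The Gaps field is omitted when the alignment has no gaps.
        "gaps": int(i.group(7)) if i.group(7) else 0,
    }


example = """Score = 28.5 bits (63), Expect = 0.021
Identities = 14/54 (25%), Positives = 27/54 (50%)"""
stats = parse_stats(example)
```

Dividing `identities` by `aligned_length` reproduces the percentage printed in parentheses, which gives a quick consistency check on a parsed block.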


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_00120  PF06580       25     0.031    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 24.8 bits (54), Expect = 0.031
Identities = 11/66 (16%), Positives = 20/66 (30%), Gaps = 3/66 (4%)

Query: 1 MSPAFSSWSDFFAMGGYAFFVWLAVAMTVAPLVLLALHTVLQRRAILRGVAQQRAREARM 60
P + F V + M ++ I + A+EA++
Sbjct: 107 TKPVAFTLPLA---LSIIFNVVVVTFMWSLLYFGWHFFKNYKQAEIDQWKMASMAQEAQL 163

Query: 61 RAAQAQ 66
A +AQ
Sbjct: 164 MALKAQ 169


2     GX95_00185  GX95_00260  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_00185  -2      13       3.168553    D-serine ammonia-lyase
GX95_00190  -1      12       2.934460    D-serine transporter DsdX
GX95_00195  -1      14       3.070022    colanic acid biosynthesis glycosyltransferase
GX95_00200  0       13       3.595179    hypothetical protein
GX95_00205  -1      13       3.140093    multidrug transporter EmrD
GX95_00210  -1      16       1.833925    hypothetical protein
GX95_00215  0       14       -0.411556   type I toxin-antitoxin system toxin TisB
GX95_00220  -1      14       -0.527268   leader peptide IlvB
GX95_00225  -1      12       -0.105019   acetolactate synthase, large subunit,
GX95_00230  -2      11       -0.431378   acetolactate synthase isozyme 1 small subunit
GX95_00235  -2      13       1.839994    DeoR family transcriptional regulator
GX95_00240  -2      16       3.115850    ribokinase
GX95_00245  -2      20       3.814303    L-fucose:H+ symporter permease
GX95_00250  -1      20       4.612554    DUF4432 domain-containing protein
GX95_00255  0       21       4.364673    DNA-binding response regulator
GX95_00260  1       20       3.401919    two-component system sensor histidine kinase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_00205  TCRTETB       58     2e-11    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 58.0 bits (140), Expect = 2e-11
Identities = 48/189 (25%), Positives = 85/189 (44%), Gaps = 11/189 (5%)

Query: 5 RNVNLLLMLVLLVAVGQMAQTIYIPAIADMAQALNVREGAVQSVMAAYLLTYGVSQLFYG 64
R+ +L+ L +L + + + ++ D+A N + V A++LT+ + YG
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 65 PLSDRVGRRPVILVGMSIFMVATLFAMTTHS-LTVLIAASAMQGMGTG-----VGGVMAR 118
LSD++G + ++L G+ I ++ HS ++LI A +QG G V V+AR
Sbjct: 71 KLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVAR 130

Query: 119 TLPRDLYEGTQLRHANSLLNMGILVSPLLAPLIGGLLDTLWNWRACYAFLLVLCAGVTFS 178
+P++ G S++ MG V P IGG++ +W ++ V F
Sbjct: 131 YIPKE-NRGKAFGLIGSIVAMGEGV----GPAIGGMIAHYIHWSYLLLIPMITIITVPFL 185

Query: 179 MARWMPETR 187
M E R
Sbjct: 186 MKLLKKEVR 194


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_00255  HTHFIS        61     2e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 61.0 bits (148), Expect = 2e-13
Identities = 23/116 (19%), Positives = 45/116 (38%), Gaps = 5/116 (4%)

Query: 2 ITVALIDDHLIVRSGFAQLLGLEPDLQVVAEFGSGREALAGLPGRGVQVCICDISMPDIS 61
T+ + DD +R+ Q L V + + + + D+ MPD +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRA-GYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GLELLSQLPK---GMATIMLSVHDSPALVEQALNAGARGFLSKRCSPDELIAAVHT 114
+LL ++ K + +++S ++ +A GA +L K ELI +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_00260  PF06580       39     4e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 38.7 bits (90), Expect = 4e-05
Identities = 30/142 (21%), Positives = 56/142 (39%), Gaps = 11/142 (7%)

Query: 365 LRPRQLDDLTLAQAIRSLLREMELESRGIVSHLDWRIDETALSESQRVTLFRVCQEGLNN 424
LR ++LA + + ++L S L + +V + Q + N
Sbjct: 208 LRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPM-LVQTLVEN 266

Query: 425 IVKHA-----NASAVTLQGWQQDERLMLVIEDDGSGLPPGSQQ-QGFGLTGMRERVSALG 478
+KH + L+G + + + L +E+ GS +++ G GL +RER+ L
Sbjct: 267 GIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGTGLQNVRERLQMLY 326

Query: 479 G---TLTISCTHG-TRVSVSLP 496
G + +S G V +P
Sbjct: 327 GTEAQIKLSEKQGKVNAMVLIP 348


3     GX95_00360  GX95_00420  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_00360  4       19       -3.837716   hypothetical protein
GX95_00365  3       19       -3.720572   hydrolase
GX95_00370  3       19       -4.075679   transcriptional regulator
GX95_00375  4       19       -3.734863   hypothetical protein
GX95_00380  3       20       -3.990271   autotransporter outer membrane beta-barrel
GX95_00385  0       16       -4.411217   transcriptional regulator
GX95_00390  1       11       -0.806762   hydroxyacid dehydrogenase
GX95_00400  -1      11       1.027415    hypothetical protein
GX95_00415  -1      15       3.091940    *glycoside-pentoside-hexuronide family
GX95_00420  -1      13       3.170037    alpha-xylosidase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_00360  cloacin       29     0.009    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 28.9 bits (64), Expect = 0.009
Identities = 13/47 (27%), Positives = 21/47 (44%)

Query: 30 NGNGGGHGNNAANQGNNGNGHKGNAGQKTEHRKNGGKPDHVESDISY 76
N GGG G+ G +G+G+ G G GG V + +++
Sbjct: 44 NPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGNLSAVAAPVAF 90


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_00365  ISCHRISMTASE  42     6e-07    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 42.3 bits (99), Expect = 6e-07
Identities = 43/180 (23%), Positives = 63/180 (35%), Gaps = 22/180 (12%)

Query: 1 MSTPANF--NGQRPAIDANDAVMLLIDHQSGLFQTVGD--MPMPELRARAAALAKIATLC 56
M T ++ N D N AV+L+ D Q+ P+ EL A L
Sbjct: 11 MPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANIRKLKNQCVQL 70

Query: 57 NMPVITTASVPQ-------------GPNGPLIPE----IHANAPYA-QYVARKGEINAWD 98
+PV+ TA GP P I AP V K +A+
Sbjct: 71 GIPVVYTAQPGSQNPDDRALLTDFWGPGLNSGPYEEKIITELAPEDDDLVLTKWRYSAFK 130

Query: 99 NADFVQAVKATGRKTLIIAGTITSVCMAFPAISAVAEGYKVFAVIDASGTYSKMAQEITM 158
+ ++ ++ GR LII G + A A E K F V DA +S ++ +
Sbjct: 131 RTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVADFSLEKHQMAL 190


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_00380  PERTACTIN     120    8e-30    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 120 bits (302), Expect = 8e-30
Identities = 164/749 (21%), Positives = 289/749 (38%), Gaps = 90/749 (12%)

Query: 230 TGDSSEGLRTGQSGSLIRLGDDATIETSGASSTGIYAASSSRTELGNNATITVNGASAHA 289
TG + G+ G+++ L ATI A + G + +
Sbjct: 236 TGGRAAGV-AAMDGAIVHL-QRATIRRGDAPAGGAVPGGAVPGGAVPGG-FGPLLDGWYG 292

Query: 290 VYATNATVNLGDNATISVNSASKAASYSKAPAGLYALSRGAINLAGGAAITMAGDNSSES 349
V +++TV+L A V + A+ +S G+++ G I G
Sbjct: 293 VDVSDSTVDL---AQSIVEAPQLGAAIRAGRGARVTVSGGSLSAPHGNVIETGGGARRFP 349

Query: 350 YAISTETGGIVDGS--SGGRFVIDGDIRAAGATAASGTLPQ--------------QNSTI 393
S + + G+ G + T A G Q + +
Sbjct: 350 PPASPLSITLQAGARAQGRALLYRVLPEPVKLTLAGGAQGQGDIVATELPPIPGASSGPL 409

Query: 394 KLNMTDNSRWDGASYITSATAGTGVISVQMSDATWNMTSSSTLTDLTLNSGAIINFSH-- 451
+ + +RW GA+ V S+ + +ATW MT +S + L L S ++F
Sbjct: 410 DVALASQARWTGATRA--------VDSLSIDNATWVMTDNSNVGALRLASDGSVDFQQPA 461

Query: 452 EDGEPWQTLTINEDYVGNGGKLVFNTVLNDDDSETDRLQVLGNTSGNTFVAVNNIGGAGA 511
E G ++ L ++ G+G +F + D +D+L V+ + SG + V N G A
Sbjct: 462 EAGR-FKVLMVDT-LAGSG---LFRMNVFADLGLSDKLVVMRDASGQHRLWVRNSGSEPA 516

Query: 512 QTIEGIEIVNVAGNSNGTFEKASR---IVAGAYDYNVVQKGKNWYLTSYIEPDEPIIPDP 568
+ + +V S TF A++ + G Y Y + G + S + P P P
Sbjct: 517 -SGNTMLLVQTPRGSAATFTLANKDGKVDIGTYRYRLAANGNGQW--SLVGAKAPPAPKP 573

Query: 569 VDPVIPDPVDPDPVDPVIPDPVIPDPVDPDPVDPEPVDPVIPDPVIPDIGQSDTPPITEH 628
P P P P P P P P P +P P P ++ + +
Sbjct: 574 APQPGPQPGPQPPQPPQPPQP----PQPPQPPQRQPEAPAPQPPAGRELSAAANAAVNTG 629

Query: 629 QFRPEVGSYLANNYAANTLFMTRLHDRLGETQYTDMLTGEKKVTSLWMRNVGAHTRFNDG 688
+ A + A L RLGE + G W R + ++
Sbjct: 630 GVGLASTLWYAESNA--------LSKRLGELRLNPDAGG------AWGRGFAQRQQLDNR 675

Query: 689 SGQLKTRINSYVLQLGGDLAQWSTDGLDRWHIGTMAGYANSQNRTQSSVSDYHSRGQVTG 748
+G+ + +LG D A + G RWH+G +AGY + D G
Sbjct: 676 AGRRFDQ-KVAGFELGADHA-VAVAG-GRWHLGGLAGYTRGD---RGFTGD--GGGHTDS 727

Query: 749 YSVGLYGTWYANNIDRSGAYVDTWMLYNWFDN--KVMGQDQAA--EKYKSKGITASVEAG 804
VG Y T+ AN+ G Y+D + + +N KV G D A KY++ G+ S+EAG
Sbjct: 728 VHVGGYATYIANS----GFYLDATLRASRLENDFKVAGSDGYAVKGKYRTHGVGVSLEAG 783

Query: 805 YSFRLGESAHQSYWLQPKAQVVWMGVQANDHREANGTLVKDDTAGNLLTRMGVKAYINGH 864
F ++L+P+A++ V +R ANG V+D+ ++L R+G++
Sbjct: 784 RRFAH----ADGWFLEPQAELAVFRVGGGAYRAANGLRVRDEGGSSVLGRLGLEV----G 835

Query: 865 NAIDDNKSREFQPFVEANWIHNTQPA-SVKMDDVS--SDMRGTKNIGELKVGIEGQITPR 921
I+ R+ QP+++A+ + A +V+ + ++ +++RGT+ EL +G+ +
Sbjct: 836 KRIELAGGRQVQPYIKASVLQEFDGAGTVRTNGIAHRTELRGTR--AELGLGMAAALGRG 893

Query: 922 LNVWGNVAQQVGDQGYSNTQGLLGVKYSF 950
+++ + G + G +YS+
Sbjct: 894 HSLYASYEYSKGPKLAMPWTFHAGYRYSW 922


4     GX95_00535  GX95_00760  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_00535  -2      21       -6.277083   pantetheine-phosphate adenylyltransferase
GX95_00540  -1      26       -8.124457   3-deoxy-D-manno-octulosonic acid transferase
GX95_00545  0       34       -11.486640  lipopolysaccharide core heptosyltransferase
GX95_00550  2       40       -13.931296  glucosyltransferase I RfaG
GX95_00555  4       43       -16.473770  lipopolysaccharide core heptose(I) kinase RfaP
GX95_00560  5       46       -17.633392  UDP-glucose--glucosyl LPS a 1,
GX95_00565  5       45       -15.878532  glycosyl transferase
GX95_00570  6       43       -15.736462  lipopolysaccharide 1,3-galactosyltransferase
GX95_00575  3       35       -12.830117  lipopolysaccharide 1,2-glucosyltransferase
GX95_00580  2       24       -8.616141   heptose kinase
GX95_00585  2       17       -6.449785   3-deoxy-D-manno-oct-2-ulosonate III transferase
GX95_00590  0       10       -3.594182   UDP-glucose--(glucosyl)LPS
GX95_00595  0       12       -1.448212   O-antigen ligase
GX95_00600  0       14       0.352088    lipopolysaccharide heptosyltransferase 1
GX95_00605  1       14       0.516001    ADP-heptose--LPS heptosyltransferase
GX95_00610  0       14       0.892981    ADP-glyceromanno-heptose 6-epimerase
GX95_00615  0       13       1.515085    glycine C-acetyltransferase
GX95_00620  -1      13       0.925366    L-threonine 3-dehydrogenase
GX95_00625  1       11       0.876594    glycosyltransferase
GX95_00630  2       17       1.819688    hypothetical protein
GX95_00635  1       16       2.090374    murein hydrolase activator EnvC
GX95_00640  0       19       1.148897    phosphoglycerate mutase
GX95_00645  0       15       1.263913    hypothetical protein
GX95_00650  0       14       3.162750    glutaredoxin 3
GX95_00655  0       15       3.650088    protein-export chaperone SecB
GX95_00660  -1      18       4.129886    glycerol-3-phosphate dehydrogenase
GX95_00665  0       16       1.743472    serine O-acetyltransferase
GX95_00670  0       16       0.303956    tRNA
GX95_00675  0       17       0.067459    alpha-hydroxy-acid oxidizing enzyme
GX95_00680  1       18       -0.956880   transcriptional regulator LldR
GX95_00685  1       17       -1.343538   L-lactate permease
GX95_00690  2       19       -2.195300   hypothetical protein
GX95_00695  -2      18       -1.785917   hypothetical protein
GX95_00700  -3      23       0.721180    hypothetical protein
GX95_00705  -3      18       2.154359    hypothetical protein
GX95_00710  -2      15       2.735237    MltR family transcriptional regulator
GX95_00715  -2      14       2.391412    mannitol-1-phosphate 5-dehydrogenase
GX95_00720  -1      14       2.869065    PTS mannitol transporter subunit IICBA
GX95_00725  0       12       2.890828    glutathione S-transferase
GX95_00730  -1      12       3.210615    L-seryl-tRNA(Sec) selenium transferase
GX95_00735  -2      11       2.514587    selenocysteinyl-tRNA-specific translation
GX95_00740  -3      14       2.451606    L-threonine dehydrogenase
GX95_00745  -3      17       3.214061    sugar kinase
GX95_00750  -3      19       3.578056    aldehyde dehydrogenase
GX95_00755  -2      22       4.796549    hypothetical protein
GX95_00760  -1      21       3.665767    AraC family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_00535  LPSBIOSNTHSS  242    9e-86    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 242 bits (620), Expect = 9e-86
Identities = 82/154 (53%), Positives = 110/154 (71%)

Query: 5 AIYPGTFDPITNGHLDIVTRATQMFDHVILAIAASPGKKPMFTLDERVALAQKATAHLGN 64
AIYPG+FDPIT GHLDI+ R ++FD V +A+ +P K+PMF++ ER+ KA AHL N
Sbjct: 3 AIYPGSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAHLPN 62

Query: 65 VEVVGFSDLMANFARDRQANILIRGLRAVADFEYEMQLAHMNRHLMPQLESVFLMPSKEW 124
+V F L N+AR RQA ++RGLR ++DFE E+Q+A+ N+ L LE+VFL S E+
Sbjct: 63 AQVDSFEGLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDLETVFLTTSTEY 122

Query: 125 SFISSSLVKEVARHQGDVTHFLPDNVHQALMDKL 158
SF+SSSLVKEVAR G+V HF+P +V AL D+
Sbjct: 123 SFLSSSLVKEVARFGGNVEHFVPSHVAAALYDQF 156


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_00610  NUCEPIMERASE  99     3e-26    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 99.5 bits (248), Expect = 3e-26
Identities = 75/348 (21%), Positives = 125/348 (35%), Gaps = 67/348 (19%)

Query: 2 IIVTGGAGFIGSNIVKALNDKGITDILVVDNLKD--------------GTKFVNLVDLNI 47
+VTG AGFIG ++ K L + G ++ +DNL D +++
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAG-HQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 48 ADYMDKEDFLIQIMSGEELGDIEAIFHEGACSSTTEWDGKYMMDNNYQYSK-------EL 100
AD + + + G E +F + +Y ++N + Y+ +
Sbjct: 62 ADR----EGMTDLF---ASGHFERVFISPHRLAV-----RYSLENPHAYADSNLTGFLNI 109

Query: 101 LHYCLEREIP-FLYASSAATYGGRTSD-FIESREYEKPLNVYGYSKFLFDEYVRQILPEA 158
L C +I LYASS++ YG F + P+++Y +K +
Sbjct: 110 LEGCRHNKIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLY 169

Query: 159 NSQIVGFRYFNVYGPREGHKGSMASVAFHLNTQLNNGESPKLFEGSENFKRDFVYVGDVA 218
G R+F VYGP + MA F + G+S ++ KRDF Y+ D+A
Sbjct: 170 GLPATGLRFFTVYGPWG--RPDMA--LFKFTKAMLEGKSIDVY-NYGKMKRDFTYIDDIA 224

Query: 219 AVNL------------WFLESGKSG-------IFNLGTGRAESFQAVADATLAY-HKKGS 258
+ W +E+G ++N+G A +
Sbjct: 225 EAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAK 284

Query: 259 IEYIPFPDKLKGRYQAFTQADLTNLRNA-GYDKPFKTVAEGVTEYMAW 305
+P G T AD L G+ P TV +GV ++ W
Sbjct: 285 KNMLPLQ---PGDVL-ETSADTKALYEVIGF-TPETTVKDGVKNFVNW 327


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_00635  RTXTOXIND     47     7e-08    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 47.1 bits (112), Expect = 7e-08
Identities = 25/196 (12%), Positives = 62/196 (31%), Gaps = 21/196 (10%)

Query: 45 RDQLKSIQADIAAKERDVRQQQQQRASLLAQLKAQEEAISAAARKLRETQSTLDQLNAQI 104
++ + + Q +L +E S + + + LD+ A+
Sbjct: 157 SRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAER 216

Query: 105 DEMNASIAKLEQQKASQERNLAAQLDAAFRQGEHTGIQLILSGEESQRGQRLQAYFGYLN 164
+ A I + E ++ L + + ++ Q + ++A
Sbjct: 217 LTVLARINRYENLSRVEKSRLDD-FSSLLHKQAIAKHAVL-----EQENKYVEA-----V 265

Query: 165 QARQETIAELKQTREQVATQKAELEEKQSQQQTLLYEQRAQ-QAKLEQARNERKKTLAGL 223
+ ++L+Q ++ + K E + + + ++ Q + E K
Sbjct: 266 NELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEE-- 323

Query: 224 ESSIQQGQQQLSELRA 239
+QQ S +RA
Sbjct: 324 -------RQQASVIRA 332


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_00655  SECBCHAPRONE  234    2e-82    Bacterial protein-transport SecB chaperone protein ...
		>SECBCHAPRONE#Bacterial protein-transport SecB chaperone protein signature.

Length = 170

Score = 234 bits (598), Expect = 2e-82
Identities = 91/153 (59%), Positives = 118/153 (77%), Gaps = 4/153 (2%)

Query: 3 EQNNTEMAFQIQRIYTKDVSFEAPNAPHVFQKDWQPEVKLDLDTASSQLADDVYEVVLRV 62
Q + QIQRIY KDVSFEAPN PH+FQ+DW+P++ DL T + Q+ DD+YEV L +
Sbjct: 12 TQATQQPVLQIQRIYVKDVSFEAPNLPHIFQQDWEPKLSFDLSTEAKQVGDDLYEVCLNI 71

Query: 63 TVTASLGEE--TAFLCEVQQAGIFSISGIEGTQMAHCLGAYCPNILFPYARECITSLVSR 120
+V ++ AF+CEV+QAG+F+ISG+E QMAHCL + CPN+LFPYARE ++SLV+R
Sbjct: 72 SVETTMESSGDVAFICEVKQAGVFTISGLEEMQMAHCLTSQCPNMLFPYARELVSSLVNR 131

Query: 121 GTFPQLNLAPVNFDALFMNYL--QQQAGEGTEE 151
GTFP LNL+PVNFDALFM+YL Q+QA + TEE
Sbjct: 132 GTFPALNLSPVNFDALFMDYLQRQEQAEQTTEE 164


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_00660  NUCEPIMERASE  29     0.028    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 29.0 bits (65), Expect = 0.028
Identities = 21/87 (24%), Positives = 30/87 (34%), Gaps = 13/87 (14%)

Query: 8 MTVI---GAGSYGTALAITLARNGHQVVLWGHD---PKHIATLEHDRCNVAFLPDVPFPD 61
M + AG G ++ L GHQVV G D + +L+ R + P F
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVV--GIDNLNDYYDVSLKQARLELLAQPGFQF-- 56

Query: 62 TLHLESDLATALAASRNILVVVPSHVF 88
+ DLA + VF
Sbjct: 57 ---HKIDLADREGMTDLFASGHFERVF 80


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_00690  PF03895       70     6e-17    Serum resistance protein DsrA.
		>PF03895#Serum resistance protein DsrA.

Length = 79

Score = 70.2 bits (172), Expect = 6e-17
Identities = 17/80 (21%), Positives = 38/80 (47%), Gaps = 2/80 (2%)

Query: 1382 VENKMSGGIASAMAMAGLPQAYAPGANMTSIAGGTFNGESAVAIGI-SMVSESGGWVYKL 1440
+ ++ G+A+ A++ L Q G S A G + ++A+AIG+ S +++ +
Sbjct: 1 LSKELQTGLANQSALSMLVQPNGVGKTSVSAAVGGYRDKTALAIGVGSRITDRFTAKAGV 60

Query: 1441 QGTSNSQGDYSAAIGAGFQW 1460
+ + G S G+++
Sbjct: 61 AFNTYN-GGMSYGASVGYEF 79


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_00735  TCRTETOQM     53     2e-09    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 52.9 bits (127), Expect = 2e-09
Identities = 35/106 (33%), Positives = 53/106 (50%), Gaps = 16/106 (15%)

Query: 3 IATAGHVDHGKTTLLQAI---TGV------------NADRLPEEKKRGMTIDLGYAYWPQ 47
I HVD GKTTL +++ +G D E++RG+TI G +
Sbjct: 6 IGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQW 65

Query: 48 PDGRVLGFIDVPGHEKFLSNMLAGVGGIDHALLVVACDDGVMAQTR 93
+ +V ID PGH FL+ + + +D A+L+++ DGV AQTR
Sbjct: 66 ENTKV-NIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTR 110


5     GX95_00860  GX95_00980  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_00860  -2      15       -3.736772   hypothetical protein
GX95_00865  -1      18       -5.928313   acetyltransferase
GX95_00870  1       21       -6.113481   hypothetical protein
GX95_00875  1       23       -6.068996   glycine--tRNA ligase subunit alpha
GX95_00880  0       26       -7.305134   glycine--tRNA ligase subunit beta
GX95_00885  5       47       -14.078120  AAA family ATPase
GX95_00890  4       37       -9.988427   hypothetical protein
GX95_00895  2       20       -3.219042   hypothetical protein
GX95_00900  0       17       -0.922064   GNAT family N-acetyltransferase
GX95_00905  -1      16       0.939181    hypothetical protein
GX95_00910  0       13       2.877493    cold-shock protein
GX95_00915  0       12       2.909557    transcriptional regulator
GX95_00920  -1      11       3.018243    hypothetical protein
GX95_00925  0       11       2.937257    bifunctional glyoxylate/hydroxypyruvate
GX95_00930  -1      10       1.740683    OmpA family lipoprotein
GX95_00935  -2      10       2.137587    trimethylamine N-oxide reductase I catalytic
GX95_00940  -2      13       1.636775    N-acetyltransferase
GX95_00945  -1      14       1.856013    DNA-3-methyladenine glycosylase I
GX95_00950  -2      14       1.498210    autotransporter outer membrane beta-barrel
GX95_00955  -2      15       1.795655    lipid A phosphoethanolamine transferase
GX95_00980  -2      16       3.116461    *LacI family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_00930  OMPADOMAIN    116    1e-33    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 116 bits (293), Expect = 1e-33
Identities = 43/124 (34%), Positives = 64/124 (51%), Gaps = 11/124 (8%)

Query: 108 LNMPNNVTFDSSSATLKPAGANTLTGVAMVLKEY--PKTAVNVVGYTDSTGSHDLNMRLS 165
+ ++V F+ + ATLKP G L + L +V V+GYTD GS N LS
Sbjct: 215 FTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLS 274

Query: 166 QQRADSVASSLITQGVDASRIRTSGMGPANPIASNSTAEGK---------AQNRRVEITL 216
++RA SV LI++G+ A +I GMG +NP+ N+ K A +RRVEI +
Sbjct: 275 ERRAQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334

Query: 217 SPLQ 220
++
Sbjct: 335 KGIK 338


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_00940  SACTRNSFRASE  34     8e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 34.1 bits (78), Expect = 8e-05
Identities = 20/52 (38%), Positives = 26/52 (50%), Gaps = 5/52 (9%)

Query: 76 VAPDALRHGIGKALL----EYVQQR-FPLLSLEVYQKNQSAVNFYHALGFRI 122
VA D + G+G ALL E+ ++ F L LE N SA +FY F I
Sbjct: 97 VAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFII 148


6     GX95_01200  GX95_01290  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_01200  -2      13       3.253770    universal stress global response regulator UspA
GX95_01205  -2      14       3.294412    universal stress protein UspB
GX95_01210  -2      14       3.476156    anion permease
GX95_01215  -2      16       4.206172    hypothetical protein
GX95_01220  -2      14       3.411591    hypothetical protein
GX95_01225  -2      14       3.156038    ABC transporter ATP-binding protein
GX95_01230  0       14       1.994542    hypothetical protein
GX95_01235  1       14       1.461620    nickel responsive regulator
GX95_01240  2       15       1.381718    ACP synthase
GX95_01245  0       13       1.229420    permease
GX95_01250  1       13       3.810260    MFS transporter
GX95_01255  1       14       3.506990    hypothetical protein
GX95_01260  2       14       3.660050    hypothetical protein
GX95_01265  2       15       3.974958    sulfurtransferase TusA
GX95_01270  2       15       4.092128    methyl-accepting chemotaxis protein II
GX95_01275  2       13       3.957689    zinc/cadmium/mercury/lead-transporting ATPase
GX95_01280  2       15       1.919399    hypothetical protein
GX95_01285  3       16       2.229576    hypothetical protein
GX95_01290  2       16       2.116085    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_01210  TYPE3IMSPROT  30     0.022    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 30.1 bits (68), Expect = 0.022
Identities = 23/194 (11%), Positives = 57/194 (29%), Gaps = 40/194 (20%)

Query: 12 TGLLLLLALAFVLFYEAINGFHDTANAVATVIY------TRAMRSQLAVVMAAVFNFFGV 65
L++AL+ +L + F + + ++A+ + V+ F
Sbjct: 30 VSTALIVALSAMLMGLSDYYFEHFSKLMLIPAEQSYLPFSQALSYVVDNVLLEFFYLCFP 89

Query: 66 LLGGLSVAYAIVHML-------------------PTDLLLNMGSAHGLAMVFSMLLAAII 106
LL ++ H++ P + + S L +L ++
Sbjct: 90 LLTVAALMAIASHVVQYGFLISGEAIKPDIKKINPIEGAKRIFSIKSLVEFLKSILKVVL 149

Query: 107 WNLGTWYFGLPASSSHTLIGAIIGIGLTNAMMTGTSVVDALNIPKVINIFGSLIISPIVG 166
++ W + ++ + T + T ++ + L++ VG
Sbjct: 150 LSILIWIIIKG------NLVTLLQLP-TCGIECITPLLGQI--------LRQLMVICTVG 194

Query: 167 LVFAGGLIFLLRRY 180
V + Y
Sbjct: 195 FVVISIADYAFEYY 208


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_01220  RTXTOXIND     78     4e-18    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 78.3 bits (193), Expect = 4e-18
Identities = 70/413 (16%), Positives = 136/413 (32%), Gaps = 82/413 (19%)

Query: 3 KMKRHLVWWGAGILVAVAAIAWWMLRPAGIPEGFAASNGRI--EATEVDIATKIAGRIDT 60
+ LV + + V A +L E A +NG++ +I +
Sbjct: 54 SRRPRLVAY-FIMGFLVIAFILSVLGQV---EIVATANGKLTHSGRSKEIKPIENSIVKE 109

Query: 61 ILVSEGQFVRQGEVLAKMDTRV----------------LQEQRLEAI------------- 91
I+V EG+ VR+G+VL K+ L++ R + +
Sbjct: 110 IIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELK 169

Query: 92 -----------------------AQIKEAESAVAAARALLEQRQSEMRAAQSVVKQREAE 128
Q ++ L+++++E + + + E
Sbjct: 170 LPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENL 229

Query: 129 LDSVSKRHVRSRSLSQRGAVSVQQLDDDRAAAESARAALETAKAQVSAAKAAIEAARTSI 188
R SL + A++ + + A L K+Q+ ++ I +A+
Sbjct: 230 SRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEY 289

Query: 189 IQ-------------AQTRVEAAQATERRIVADID--DSELKAPRDGRV-QYRVAEPGEV 232
QT T + S ++AP +V Q +V G V
Sbjct: 290 QLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGV 349

Query: 233 LSAGGRVLNMVDLSDVY-MTFFLPTEQAGLLKIGGEARLVLDAAPDLRIPATISFVASVA 291
++ ++ +V D +T + + G + +G A + ++A P R V V
Sbjct: 350 VTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYG---YLVGKVK 406

Query: 292 QFTPKTVETHDERLKLMFRVKARIPPELLRQHLEYV--KTGLPGMAWVRLDER 342
+E D+RL L+F V I L + + +G+ A ++ R
Sbjct: 407 NINLDAIE--DQRLGLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGMR 457


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_01230  ABC2TRNSPORT  48     2e-08    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 48.0 bits (114), Expect = 2e-08
Identities = 43/171 (25%), Positives = 73/171 (42%), Gaps = 7/171 (4%)

Query: 200 REREHGTVEHLLVMPVTPFEIMMAKV-WSMGLVVLVVSGLSLMLMVKGVLGVPIEGSIPL 258
R T E +L + +I++ ++ W+ L +G + +V LG + L
Sbjct: 93 RMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAG---IGVVAAALGY-TQWLSLL 148

Query: 259 FMLGV-ALSLFATTSIGIFMGTIARSMPQLGLLMILVLLPLQMLSGGSTPRESMPQAVQD 317
+ L V AL+ A S+G+ + +A S LV+ P+ LSG P + +P Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 318 IMLTMPTTHFVSLAQAILYRGAGLSIVWPQFLTLLAIGGVFFL-IALLRFR 367
+P +H + L + I+ + + + I FFL ALLR R
Sbjct: 209 AARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIVIPFFLSTALLRRR 259


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_01240  ENTSNTHTASED  33     6e-04    Enterobactin synthetase component D signature.
		>ENTSNTHTASED#Enterobactin synthetase component D signature.

Length = 234

Score = 32.7 bits (74), Expect = 6e-04
Identities = 25/93 (26%), Positives = 44/93 (47%), Gaps = 6/93 (6%)

Query: 30 RRASWLAGRVLLSRALSPL---PEMVYGEQGKPAFSAGTPLWFNLSHSGDTIALLLSDEG 86
R+A LAGR+ AL + G++ +P + G L+ ++SH T ++S +
Sbjct: 46 RKAEHLAGRIAAVHALREVGVRTVPGMGDKRQPLWPDG--LFGSISHCATTALAVISRQ- 102

Query: 87 EVGCDIEVIRPRDNWRSLANAVFSLGEHAEMEA 119
+G DIE I + LA ++ E ++A
Sbjct: 103 RIGIDIEKIMSQHTATELAPSIIDSDERQILQA 135


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_01250  TCRTETA       49     2e-08    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 48.7 bits (116), Expect = 2e-08
Identities = 75/403 (18%), Positives = 138/403 (34%), Gaps = 42/403 (10%)

Query: 13 LRLNLRIVSIVMFNFASYLTIGLPLAVLPGYVHD--AMGFSAFWAGLIISLQYFATLLSR 70
++ N ++ I+ + IGL + VLPG + D G++++L
Sbjct: 1 MKPNRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACA 60

Query: 71 PHAGRYADVLGPKKIVVFGLCGCFLSGLGYLLADIASAWPLISLLLLGLGRVILGI-GQS 129
P G +D G + +++ L G + + Y + A L +L +GR++ GI G +
Sbjct: 61 PVLGALSDRFGRRPVLLVSLAG---AAVDYAIMATAPF-----LWVLYIGRIVAGITGAT 112

Query: 130 FAGTGSTLWGVGVVGSLHIGRVISWNGIVTYGAMAMGAPLGVLCYAWGGLQGLALTVMGV 189
A G+ + + R + M G LG L G
Sbjct: 113 GAVAGAYIADITDGDER--ARHFGFMSACFGFGMVAGPVLGGLM----GGFSPHAPFFAA 166

Query: 190 ALLAILLAL----------PRPSVKANKGKPLPFRAVLGRVWLYGMALALA-----SAGF 234
A L L L + P + + +A +A
Sbjct: 167 AALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVG 226

Query: 235 GVIATFITLFYDAK-GWDGAAFALTLFSVAFVGT---RLLFPNGINRLGGLNVAMICFGV 290
V A +F + + WD ++L + + + ++ RLG M+
Sbjct: 227 QVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIA 286

Query: 291 EIIGLLLVGTAAMPWMAKIGVLLTGMGFSLVFPALGVVAVKAVPPQNQGAALATYTVFMD 350
+ G +L+ A WMA ++L + PAL + + V + QG +
Sbjct: 287 DGTGYILLAFATRGWMAFPIMVLLA-SGGIGMPALQAMLSRQVDEERQGQLQGSLAALTS 345

Query: 351 MSLGVTGPLAGLVMTWAGVPV----IYLAAAGLVAMALLLTWR 389
++ + GPL + A + ++A A L + L R
Sbjct: 346 LT-SIVGPLLFTAIYAASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_01260  PF04183       28     0.038    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 27.9 bits (62), Expect = 0.038
Identities = 17/91 (18%), Positives = 28/91 (30%), Gaps = 14/91 (15%)

Query: 121 LGQILDVHVFNRLRQNRRWWLAPTASTLFGNISDTLAFFFIAFWRSPDAFMAEHWMEIAL 180
LG I + L+ + +TL + + AE W+
Sbjct: 347 LGVIWRENPCRWLKPDES---PVLMATLMECDENNQPL--AGAYIDRSGLDAETWLT--- 398

Query: 181 VDYCFKVLISIIFFLPMYGVLL-----NMLL 206
V++ + L YGV L N+ L
Sbjct: 399 -QLFRVVVVPLYHLLCRYGVALIAHGQNITL 428


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_01265  PF01206       101    2e-32    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 101 bits (254), Expect = 2e-32
Identities = 28/72 (38%), Positives = 42/72 (58%)

Query: 9 DHTLDALGLRCPEPVMMVRKTVRNMQTGETLLIIADDPATTRDIPGFCTFMEHDLLAQET 68
D +LDA GL CP P++ +KT+ M GE L ++A DP + +D F H+LL Q+
Sbjct: 5 DQSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKE 64

Query: 69 EGLPYRYLLRKA 80
E Y + L++A
Sbjct: 65 EDGTYHFRLKRA 76


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_01275  ACRIFLAVINRP  30     0.040    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 30.2 bits (68), Expect = 0.040
Identities = 17/78 (21%), Positives = 34/78 (43%), Gaps = 3/78 (3%)

Query: 336 AEERRAPIERFIDRFSRIYTPVIMVIALLVTLIPPLMFDGGWQEWIYKGLTLLLIGCPCA 395
E++ P E S+I ++ + +L + P+ F GG IY+ ++ ++ A
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVS---A 477

Query: 396 LVISTPAAITSGLAAAAR 413
+ +S A+ A A
Sbjct: 478 MALSVLVALILTPALCAT 495


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_01285  SHIGARICIN    27     0.026    Ribosome inactivating protein family signature.
		>SHIGARICIN#Ribosome inactivating protein family signature.

Length = 289

Score = 26.7 bits (59), Expect = 0.026
Identities = 6/29 (20%), Positives = 16/29 (55%)

Query: 7 FFIIIIALIVVAASFRFVQQRREKAANEA 35
+++I AA ++F++Q+ K ++
Sbjct: 173 ALMVLIQSTSEAARYKFIEQQIGKRVDKT 201


7     GX95_01400  GX95_01445  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_01400  1       15       -3.752881   hypothetical protein
GX95_01405  1       15       -3.652020   gamma-glutamyltransferase
GX95_01410  1       19       -5.329176   phosphotriesterase-related protein
GX95_01415  0       17       -5.021997   hypothetical protein
GX95_01420  0       15       -3.878917   cytoplasmic protein
GX95_01425  0       14       -2.573379   ribokinase
GX95_01430  0       16       1.489941    N-acetyltransferase
GX95_01435  0       18       1.873569    oxidoreductase
GX95_01440  -1      17       2.764887    quercetin 2,3-dioxygenase
GX95_01445  -1      16       3.118643    transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_01405  NAFLGMOTY     32     0.004    Sodium-type flagellar protein MotY precursor signature.
		>NAFLGMOTY#Sodium-type flagellar protein MotY precursor signature.

Length = 293

Score = 32.4 bits (73), Expect = 0.004
Identities = 27/82 (32%), Positives = 37/82 (45%), Gaps = 17/82 (20%)

Query: 275 RTPISGDYRGYQVFSMPPPSSGGIHIVQILNI--LENFDMKKYGF-GSADAMQIMAEAEK 331
R P+ G+ R + SMPPP G H +I N+ + FD G+ G A I++E EK
Sbjct: 77 RRPM-GETRNVSLISMPPPWRPGEHADRITNLKFFKQFD----GYVGGQTAWGILSELEK 131

Query: 332 YAYADRSEYLGDPDFVKVPWQA 353
Y P F WQ+
Sbjct: 132 GRY---------PTFSYQDWQS 144


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_01430  SACTRNSFRASE  35     5e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 35.3 bits (81), Expect = 5e-05
Identities = 18/92 (19%), Positives = 34/92 (36%), Gaps = 16/92 (17%)

Query: 55 VACIDDIVVGHLSIQVTQRPRRSHVADFGICVDARWHNRGIASALIRTMID------MCD 108
+ +++ +G + I+ + + D + D R G+ +AL+ I+ C
Sbjct: 69 LYYLENNCIGRIKIRSNWN-GYALIEDIAVAKDYRKK--GVGTALLHKAIEWAKENHFCG 125

Query: 109 NWLRVERIELTVFVDNEPAVAVYKKYGFEIEG 140
L + I N A Y K+ F I
Sbjct: 126 LMLETQDI-------NISACHFYAKHHFIIGA 150


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_01435  MICOLLPTASE  30  0.013  Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 30.5 bits (68), Expect = 0.013
Identities = 15/50 (30%), Positives = 25/50 (50%), Gaps = 3/50 (6%)

Query: 268 FAADESVGVLEYVNDDGVTVKEEVKPETGDYGRVYDALYQTLTVGTPNYV 317
++AD+ ++Y N DG + K + G+ Y +YQ GT NY+
Sbjct: 1052 YSADDLSNYVDYANADGNKLSNTCK---LNPGKYYLCVYQFENSGTGNYI 1098


8  GX95_02290  GX95_02350  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_02290  3       16       -2.851106   hypothetical protein
GX95_02295  3       17       -2.436302   hypothetical protein
GX95_02300  1       17       -1.786175   hypothetical protein
GX95_02305  0       15       -1.686224   arginine repressor
GX95_02310  -1      15       -2.143964   malate dehydrogenase
GX95_02315  -1      13       -2.084034   GntR family transcriptional regulator
GX95_02320  0       18       2.250123    LysR family transcriptional regulator
GX95_02325  0       25       4.725594    cation transporter
GX95_02330  1       25       5.343028    L(+)-tartrate dehydratase subunit alpha
GX95_02335  -1      25       5.563489    L(+)-tartrate dehydratase subunit beta
GX95_02340  -1      24       5.717303    oxaloacetate decarboxylase subunit gamma
GX95_02345  -1      22       5.260924    oxaloacetate decarboxylase subunit alpha
GX95_02350  -2      17       3.278074    oxaloacetate decarboxylase subunit beta
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_02305  ARGREPRESSOR  164  3e-55  Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 164 bits (417), Expect = 3e-55
Identities = 43/141 (30%), Positives = 71/141 (50%), Gaps = 5/141 (3%)

Query: 15 KALLKEEKFSSQGEIVLALQDQGFENINQSKVSRMLTKFGAVRTRNAKMEMVYCLPAELG 74
+ ++ + +Q E+V L+ G+ N+ Q+ VSR + + V+ Y LPA+
Sbjct: 11 REIITANEIETQDELVDILKKDGY-NVTQATVSRDIKELHLVKVPTNNGSYKYSLPADQR 69

Query: 75 VPTTSSPLKNLV---LDIDYNDAVVVIHTSPGAAQLIARLLDSLGKAEGILGTIAGDDTI 131
S ++L+ + ID ++V+ T PG AQ I L+D+L E I+GTI GDDTI
Sbjct: 70 FNPLSKLKRSLMDAFVKIDSASHLIVLKTMPGNAQAIGALMDNLDWEE-IMGTICGDDTI 128

Query: 132 FTTPASGFSVRDLYEAILELF 152
+ + + + ILEL
Sbjct: 129 LIICRTHDDTKVVQKKILELL 149


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_02310  DHBDHDRGNASE  28  0.031  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 28.5 bits (63), Expect = 0.031
Identities = 37/167 (22%), Positives = 61/167 (36%), Gaps = 27/167 (16%)

Query: 3 VAVLGAAGGIGQALALLLKNQLPSGSELSLYDIAPVTPGVAVDLSHIPTAVKIKGFSGED 62
+ GAA GIG+A+A L G+ ++ D P V S A + F +
Sbjct: 11 AFITGAAQGIGEAVARTL---ASQGAHIAAVDYNP-EKLEKVVSSLKAEARHAEAFPADV 66

Query: 63 ATPA------------LEGADVVLISAGVARK------PGMDRSDLFNVNAGIVKNLVQQ 104
A + D+++ AGV R + F+VN+ V N +
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 105 IAKTCPK----ACVGIITNPVNTT-VAIAAEVLKKAGVYDKNKLFGV 146
++K + V + +NP ++AA KA K G+
Sbjct: 127 VSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGL 173


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_02345  RTXTOXIND  32  0.008  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.7 bits (72), Expect = 0.008
Identities = 15/60 (25%), Positives = 27/60 (45%), Gaps = 1/60 (1%)

Query: 509 SSAPVQA-VAPAGAGTPVTAPLAGNIWKVIATEGQTVAEGDVLLILEAMKMETEIRAAQA 567
A + +G + + ++I EG++V +GDVLL L A+ E + Q+
Sbjct: 82 IVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQS 141


9  GX95_03440  GX95_03530  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_03440  1       18       -3.362063   MFS transporter
GX95_03445  1       21       -3.604955   beta-glucuronidase
GX95_03450  2       26       -4.578027   TetR family transcriptional regulator
GX95_03455  1       24       -4.633800   amidohydrolase
GX95_03460  1       22       -3.630904   xylanase deacetylase
GX95_03465  0       20       -2.808809   type VI secretion effector protein (Hcp)
GX95_03470  1       21       -3.173212   DNA recombination protein
GX95_03475  1       24       -3.869724   NAD-dependent phenylacetaldehyde dehydrogenase
GX95_03480  2       23       -4.252907   FAD-dependent oxidoreductase
GX95_03485  3       25       -4.470886   cytoplasmic protein
GX95_03490  2       24       -4.081251   APC family amino acid permease
GX95_03495  1       26       -4.626622   cytoplasmic protein
GX95_03500  1       27       -4.308725   helix-turn-helix transcriptional regulator
GX95_03505  1       29       -4.563512   anaerobic sulfatase maturase
GX95_03510  2       32       -5.380639   arylsulfatase
GX95_03515  0       32       -5.030184   LysR family transcriptional regulator
GX95_03520  -1      32       -5.906117   CoA ester lyase
GX95_03525  -2      16       -3.339996   molybdenum cofactor biosynthesis protein MoeC
GX95_03530  -2      11       -3.099769   4-hydroxybutyrate--acetyl-CoA CoA transferase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_03450  HTHTETR  96  6e-27  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 95.8 bits (238), Expect = 6e-27
Identities = 38/205 (18%), Positives = 68/205 (33%), Gaps = 20/205 (9%)

Query: 7 ESLPTRMRILRAARQCFAENGFHSTSMKTICKASDMSPGTLYHHFPSKEALIEAIILEDQ 66
E+ TR IL A + F++ G STS+ I KA+ ++ G +Y HF K L I +
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 67 ERALTHFREPLEGVG-----LVDYLVESTIAVTREDCAQRALVVEIMAEGMR---NPQVA 118
E ++ ++ + T + +R L+ I + V
Sbjct: 68 SNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQ 127

Query: 119 EMLTNKYHTIIASLVARFNDAQAKGEIGADVDKEMAARLLLATTYGVL-------SDSSS 171
+ N + + AD+ AA ++ G++
Sbjct: 128 QAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFAPQSFDL 187

Query: 172 AENARHVSFATTLRTMLTGLLKCNS 196
+ AR +L L C +
Sbjct: 188 KKEARDYV-----AILLEMYLLCPT 207


10  GX95_03925  GX95_04110  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_03925  0       18       -3.743776   lysine--tRNA ligase
GX95_03930  1       26       -4.838956   isopentenyl-diphosphate delta-isomerase
GX95_03935  2       28       -5.168614   hypothetical protein
GX95_03945  1       29       -5.653884   *integrase
GX95_03950  2       25       -5.308357   hypothetical protein
GX95_03955  2       20       -1.576730   dipicolinate synthase
GX95_03960  2       20       -1.060651   antA/AntB antirepressor family protein
GX95_03965  1       22       -0.870328   transporter
GX95_03970  0       25       -0.090168   hypothetical protein
GX95_03975  1       33       -0.628366   hypothetical protein
GX95_03980  4       21       0.271977    hypothetical protein
GX95_03985  3       22       -0.175685   hypothetical protein
GX95_03990  4       24       0.683880    hypothetical protein
GX95_03995  4       25       1.028291    hypothetical protein
GX95_04000  4       27       1.429892    DNA primase
GX95_04005  5       27       1.217288    hypothetical protein
GX95_04015  5       35       -0.765141   proQ/FINO family protein
GX95_04020  3       33       -1.701250   PerC family transcriptional regulator
GX95_04025  4       36       -2.606235   hypothetical protein
GX95_04030  3       33       -3.235675   hypothetical protein
GX95_04035  4       31       -3.182454   hypothetical protein
GX95_04040  3       29       -0.982018   hypothetical protein
GX95_04050  3       30       -3.498092   hypothetical protein
GX95_04055  4       30       -3.282747   hypothetical protein
GX95_04060  4       28       -3.954773   hypothetical protein
GX95_04065  2       28       -4.799423   hypothetical protein
GX95_04070  4       27       -4.230611   hypothetical protein
GX95_04075  6       24       1.577457    hypothetical protein
GX95_04080  6       26       2.899278    hypothetical protein
GX95_04085  6       24       0.284019    hypothetical protein
GX95_04090  6       24       0.566877    fimbrial protein StdA
GX95_04095  7       27       1.539634    fimbrial protein
GX95_04100  6       26       1.763971    fimbrial chaperone protein StdC
GX95_04105  5       27       -2.234394   adhesion protein
GX95_04110  2       24       -4.553135   CaiF/GrlA family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_03935  RTXTOXIND  37  5e-05  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 37.1 bits (86), Expect = 5e-05
Identities = 17/94 (18%), Positives = 32/94 (34%), Gaps = 7/94 (7%)

Query: 149 IAGARGTPVYAAGA--GKVVYVGNQLRGYGNLIMIKHNEDYITAYAHNDTMLVNNGQSVK 206
+ + + V +L G IK E+ I ++V G+SV+
Sbjct: 65 MGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIV-----KEIIVKEGESVR 119

Query: 207 AGQKIATMGSTDAASVRLHFQIRYRATAIDPLRY 240
G + + + A + L Q ++ RY
Sbjct: 120 KGDVLLKLTALGAEADTLKTQSSLLQARLEQTRY 153


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_04085  ENTEROVIROMP  97  4e-28  Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 97.3 bits (242), Expect = 4e-28
Identities = 53/183 (28%), Positives = 78/183 (42%), Gaps = 17/183 (9%)

Query: 1 MNKMLLAGSAGIVLLSAAASPVWADDNASTFSLGYAQSH-TNHAGTLRGVRLANNYEMSP 59
M K+ SA +L+ A A ST + GYAQS + G L YE
Sbjct: 1 MKKIACL-SALAAVLAFTAGTSVAA--TSTVTGGYAQSDAQGQMNKMGGFNLKYRYEEDN 57

Query: 60 D-WGLTTSFAWLNGSQRYSDESSNGRVTTRYYSLLAGPSWKINNQLSLYSQVGPVLLHQR 118
G+ SF + S SS +YY + AGP+++IN+ S+Y VG +
Sbjct: 58 SPLGVIGSFTYTEKS---RTASSGDYNKNQYYGITAGPAYRINDWASIYGVVGVGYGKFQ 114

Query: 119 DH---GINESDSKVGYGYSAGVAYTPVSNVAINLGYEGADFDATHNSGSLNSNGFNLGVG 175
S G+ Y AG+ + P+ NVA++ YE + S++ + GVG
Sbjct: 115 TTEYPTYKHDTSDYGFSYGAGLQFNPMENVALDFSYEQS------RIRSVDVGTWIAGVG 168

Query: 176 YRF 178
YRF
Sbjct: 169 YRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_04100  PF00577  633  0.0  Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 633 bits (1635), Expect = 0.0
Identities = 230/856 (26%), Positives = 380/856 (44%), Gaps = 66/856 (7%)

Query: 19 SQATEFNASLLDSGNLSNVDLTAFSREGYVAPGNYILDIWLNDQPVREQYPVRVVPVAGR 78
S FN L + DL+ F + PG Y +DI+LN+ + + V
Sbjct: 44 SAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNNGYMATRD-VTFNTGDSE 102

Query: 79 DAAVICVTTDMVAMLGLKDKIIHGLKPVTGIPDGQCLELRSA--DSQVRYSAENQRLTFI 136
V C+T +A +GL + + + D C+ L S D+ + QRL
Sbjct: 103 QGIVPCLTRAQLASMGLNTA---SVSGMNLLADDACVPLTSMIHDATAQLDVGQQRLNLT 159

Query: 137 IPQAWMRYQDPDWVPPSRWSDGVTAGLLDYNLMVNRYMPQQGETSTSYSLYGTAGFNLGA 196
IPQA+M + ++PP W G+ AGLL+YN N + G S L +G N+GA
Sbjct: 160 IPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGNSHYAYLNLQSGLNIGA 219

Query: 197 WRLRSDYQYSRFDS-GQGASQSDFYLPQTYLFRALPALRSKLTLGQTYLSSAIFDSFRFA 255
WRLR + +S S S++ + T+L R + LRS+LTLG Y IFD F
Sbjct: 220 WRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLGDGYTQGDIFDGINFR 279

Query: 256 GLTLASDERMLPPSLQGYAPKISGIANSNAQVTVSQNGRILYQTRVSPGPFELPDLSQ-N 314
G LASD+ MLP S +G+AP I GIA AQVT+ QNG +Y + V PGPF + D+
Sbjct: 280 GAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAG 339

Query: 315 ISGNLDVSVRESDGSVRTWQVNTASVPFMARQGQVRYKVAAGRPLYGGTHNNSTVSPDFL 374
SG+L V+++E+DGS + + V +SVP + R+G RY + AG G N P F
Sbjct: 340 NSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYRSG---NAQQEKPRFF 396

Query: 375 LGEATWGAFNNTSLYGGLIASTGDYQSAALGIGQNMGLLGALSADVTRSDARLPHGQKQS 434
G ++YGG + Y++ GIG+NMG LGALS D+T++++ LP +
Sbjct: 397 QSTLLHGLPAGWTIYGGTQLA-DRYRAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHD 455

Query: 435 GYSYRINYAKTFDKTGSTLAFVGYRFSDRHFLSMPEYLQRRATDGGD------------- 481
G S R Y K+ +++G+ + VGYR+S + + + R
Sbjct: 456 GQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIETQDGVIQVKPKF 515

Query: 482 ------AWHEKQSYTVTYSQSVPVLNMSAALSVSRLNYWNAQ-SNNNYMLSFNKVFSLGE 534
A++++ +T +Q + + LS S YW + + N F
Sbjct: 516 TDYYNLAYNKRGKLQLTVTQQLG-RTSTLYLSGSHQTYWGTSNVDEQFQAGLNTAF---- 570

Query: 535 LQGLSASVSFARNQYTGG-GSQNQVYATISIPWGDSR-----------QVSYSVQKDNRG 582
+ ++ ++S++ + G + ++IP+ SYS+ D G
Sbjct: 571 -EDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLNG 629

Query: 583 GLQQTVNYSD--FHNPDTTWNISAGHNRYDTGSN-SSFSGSVQSRLPWGQAAADATLQPG 639
+ + + ++++ G+ G++ S+ ++ R +G A +
Sbjct: 630 RMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIGYSHSDD 689

Query: 640 QYRSLGLSWYGSVTATAHGAAFSQSMAGNEPRMMIDTGDVAGVPVNGNSGV-TNRFGVGV 698
+ L G V A A+G Q + N+ +++ V +GV T+ G V
Sbjct: 690 -IKQLYYGVSGGVLAHANGVTLGQPL--NDTVVLVKAPGAKDAKVENQTGVRTDWRGYAV 746

Query: 699 VSAGSSYRRSDISVDVAALPEDVDVSSSVISQVLTEGAVGYRKIDASQGEQVLGHIRLAD 758
+ + YR + +++D L ++VD+ ++V + V T GA+ + A G ++L + +
Sbjct: 747 LPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIKLLMTLT-HN 805

Query: 759 GASPPFGALVVSGKTGRTAGMVGDGGLAYLTGLSGEDRRTLNVSW--DGRVQCRLTLPET 816
PFGA+V S +++G+V D G YL+G+ + V W + C
Sbjct: 806 NKPLPFGAMVTSES-SQSSGIVADNGQVYLSGM--PLAGKVQVKWGEEENAHCVANYQLP 862

Query: 817 VTLSRGPL---LLPCR 829
+ L CR
Sbjct: 863 PESQQQLLTQLSAECR 878


11  GX95_04645  GX95_05030  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_04645  -1      13       3.862591    phenolic acid decarboxylase
GX95_04650  -1      15       4.333886    phenolic acid decarboxylase
GX95_04655  -1      15       4.341979    transcriptional regulator
GX95_04660  -1      14       4.376685    DeoR family transcriptional regulator
GX95_04665  -2      13       4.925880    3-hydroxyisobutyrate dehydrogenase
GX95_04670  -3      14       4.213785    hypothetical protein
GX95_04675  -3      13       3.062571    aldolase
GX95_04680  -2      12       2.933673    hypothetical protein
GX95_04685  -2      10       3.099255    hypothetical protein
GX95_04690  -3      10       2.228747    gluconate permease
GX95_04695  -2      12       -0.025996   LysR family transcriptional regulator
GX95_04700  -2      11       -0.275210   hypothetical protein
GX95_04705  -2      12       -1.683617   DUF4440 domain-containing protein
GX95_04710  -1      15       -2.682750   DNA mismatch repair protein MutS
GX95_04715  9       39       -10.897181  cytoplasmic protein
GX95_04720  5       35       -10.063217  serine/threonine protein phosphatase
GX95_04725  1       31       -8.408740   transposase
GX95_04730  0       31       -8.655439   hypothetical protein
GX95_04735  -1      28       -8.032910   hypothetical protein
GX95_04740  -1      27       -7.939447   invasion lipoprotein InvH
GX95_04745  -2      24       -6.151756   transcriptional regulator InvF
GX95_04750  -2      24       -6.272744   EscC/YscC/HrcC family type III secretion system
GX95_04755  -1      24       -5.840653   SepL/TyeA/HrpJ family type III secretion system
GX95_04760  -2      23       -4.577749   EscV/YscV/HrcV family type III secretion system
GX95_04765  -2      24       -4.200277   type III secretion system chaperone SpaK
GX95_04770  -2      24       -3.911257   EscN/YscN/HrcN family type III secretion system
GX95_04775  -1      26       -5.344344   type III secretion system protein SpaM
GX95_04780  -1      26       -5.983360   antigen presentation protein SpaN
GX95_04785  -1      27       -6.811652   type III secretion system protein SpaO
GX95_04790  1       22       -6.007002   EscR/YscR/HrcR family type III secretion system
GX95_04795  1       22       -5.206718   EscS/YscS/HrcS family type III secretion system
GX95_04800  1       20       -5.317244   EscT/YscT/HrcT family type III secretion system
GX95_04805  1       23       -5.571171   EscU/YscU/HrcU family type III secretion system
GX95_04810  0       25       -5.235257   CesD/SycD/LcrH family type III secretion system
GX95_04815  0       25       -5.122098   pathogenicity island 1 effector protein SipB
GX95_04820  0       28       -7.144465   pathogenicity island 1 effector protein SipC
GX95_04825  2       31       -8.199321   cell invasion protein SipD
GX95_04830  2       32       -9.033294   pathogenicity island 1 effector protein SipA
GX95_04835  2       32       -10.936672  acyl carrier protein
GX95_04840  2       32       -10.993555  hypothetical protein
GX95_04845  2       31       -9.597675   chaperone protein SicP
GX95_04850  2       31       -9.227773   pathogenicity island 1 effector protein StpP
GX95_04855  3       30       -8.927147   invasion protein IagB
GX95_04860  3       31       -8.330294   transcriptional regulator
GX95_04865  3       34       -7.157311   AraC family transcriptional regulator
GX95_04870  3       34       -6.107264   type III secretion system protein PrgH
GX95_04875  3       37       -7.130192   EscF/YscF/HrpA family type III secretion system
GX95_04880  3       39       -9.353375   type III secretion system protein PrgJ
GX95_04885  2       38       -9.380480   EscJ/YscJ/HrcJ family type III secretion inner
GX95_04890  2       40       -10.104511  invasion protein OrgA
GX95_04895  -1      33       -9.342858   invasion protein OrgB
GX95_04900  -1      27       -7.464968   type III secretion system effector protein OrgC
GX95_04905  -2      21       -6.065425   AraC family transcriptional regulator
GX95_04910  -2      16       -3.112417   transcriptional regulator
GX95_04915  -2      14       -1.984281   effector protein YopJ
GX95_04920  -1      14       0.622149    iron ABC transporter permease
GX95_04925  0       15       1.989261    hypothetical protein
GX95_04930  -1      15       2.568744    manganese/iron transporter ATP-binding protein
GX95_04935  0       14       2.762735    iron ABC transporter substrate-binding protein
GX95_04940  0       14       3.381068    hypothetical protein
GX95_04945  0       13       2.999421    transcriptional regulator FhlA
GX95_04950  0       15       3.088037    hydrogenase expression/formation protein HypE
GX95_04955  1       17       2.492680    hydrogenase formation protein HypD
GX95_04960  1       18       4.108403    hydrogenase assembly chaperone
GX95_04965  2       18       4.420360    hydrogenase accessory protein HypB
GX95_04970  1       24       4.458113    hydrogenase maturation nickel metallochaperone
GX95_04975  1       27       4.893073    transcriptional regulator
GX95_04980  1       27       5.562073    formate hydrogenlyase
GX95_04985  1       29       5.573433    formate hydrogenlyase subunit 3
GX95_04990  1       29       4.483494    hydrogenase 3 membrane subunit
GX95_04995  -1      24       3.358154    hydrogenase 3 large subunit
GX95_05000  0       19       2.806309    formate hydrogenlyase complex iron-sulfur
GX95_05005  1       14       4.521220    formate hydrogenlyase
GX95_05010  1       13       4.023927    formate hydrogenlyase maturation protein HycH
GX95_05015  1       12       3.215867    hydrogenase maturation peptidase HycI
GX95_05020  0       12       3.758753    hypothetical protein
GX95_05025  0       13       4.292212    formate dehydrogenase
GX95_05030  -1      13       3.642270    carbamoyltransferase HypF
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_04685  NUCEPIMERASE  85  6e-21  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 85.2 bits (211), Expect = 6e-21
Identities = 68/297 (22%), Positives = 116/297 (39%), Gaps = 48/297 (16%)

Query: 1 MQIIITGGGGFLGQKLASALLNSSL------AFNELLLVDLKMPARLS--DSPRLRCLEA 52
M+ ++TG GF+G ++ LL + N+ V LK ARL P + +
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQ-ARLELLAQPGFQFHKI 59

Query: 53 DLT-QPGVLENVITANTSVVYHLAA-------IVSSHAEDDFDLGWKVNLDLTRQLLEAC 104
DL + G+ + + + V+ + + HA D NL +LE C
Sbjct: 60 DLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYAD------SNLTGFLNILEGC 113

Query: 105 RRQPQKIRFVFSSSLAVYGG--TLPECVTDTTALTPRSSYGAQKAACELLVNDYTRKGYV 162
R + +++SS +VYG +P D+ P S Y A K A EL+ + Y+ +
Sbjct: 114 RHNKIQ-HLLYASSSSVYGLNRKMPFSTDDSVD-HPVSLYAATKKANELMAHTYSHLYGL 171

Query: 163 DGLALRLPTICVRPGKPNRAASSFVSAIIREPLQGE--------------TTICPVSESL 208
LR T+ G+P+ A F A+ L+G+ T I ++E++
Sbjct: 172 PATGLRFFTVYGPWGRPDMALFKFTKAM----LEGKSIDVYNYGKMKRDFTYIDDIAEAI 227

Query: 209 RLWISSPATVIHNLSLAATLPAPGEA--SSINLPGIS-VTVGEMLETLRQAGGQAAR 262
++ PA A N+ S V + + ++ L A G A+
Sbjct: 228 IRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAK 284


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_04700  TCRTETB  80  1e-18  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 80.3 bits (198), Expect = 1e-18
Identities = 67/387 (17%), Positives = 144/387 (37%), Gaps = 48/387 (12%)

Query: 16 FLDLINLFIASVAFPAMSVDLHTSISALAWVSNGYIAGLTLIIPFSAFLSRYLGARRLII 75
F ++N + +V+ P ++ D + ++ WV+ ++ ++ LS LG +RL++
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 76 FSLILFSVAAAAAGFADSLHS-LVFWRIVQGVGGGLLIPVGQALTWQQFKPHERAGVSSV 134
F +I+ + S S L+ R +QG G + + + R +
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 135 VMMVALLAPACSPAIGGLLVETCGWRWIFF-----------------------ATLPVAV 171
+ + + PAIGG++ W ++ +
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKG 203

Query: 172 LTLLLAYCWLNVASTTMASARLLHL-----------------PLLTDRLLRFAMIVYLCV 214
+ L+ + TT S L + P + L + + +
Sbjct: 204 IILMSVGIVFFMLFTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPFMIGVL 263

Query: 215 PGMFIGISVVGM-----FYLQNVAQLSPAAAGS-LMLPWSIASFVAIMFTGRYFNRLGPR 268
G I +V G + +++V QLS A GS ++ P +++ + G +R GP
Sbjct: 264 CGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPL 323

Query: 269 PLIIVGCLLQAAGILLLTNVTPATSHRVLMMIFALMGAGGSLCSSTAQSGAFLTIARRDM 328
++ +G + L + + TS + ++I ++G G S + + ++ +++
Sbjct: 324 YVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLG-GLSFTKTVISTIVSSSLKQQEA 382

Query: 329 PDASALWNLNRQLSFFLGATLLTLLLN 355
+L N LS G ++ LL+
Sbjct: 383 GAGMSLLNFTSFLSEGTGIAIVGGLLS 409


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_04750  TYPE3OMGPROT  576  0.0  Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 576 bits (1485), Expect = 0.0
Identities = 170/540 (31%), Positives = 271/540 (50%), Gaps = 57/540 (10%)

Query: 4 HILLARVLACAALVLVAPGYSSE----KIPVTGSGFVAKDDSLRTFFDAMALQLKEPVIV 59
H RVL L+L + ++ E IP +VAK +SLR V+V
Sbjct: 6 HSFFKRVLTGTLLLLSSYSWAQELDWLPIPYV---YVAKGESLRDLLTDFGANYDATVVV 62

Query: 60 SKMAARKKITGNFEFHDPNALLEKLSLQLGLIWYFDGQAIYIYDASEMRNAVVSLRNVSL 119
S K++G FE +P L+ ++ L+WY+DG +YI+ SE+ + ++ L+
Sbjct: 63 SD-KINDKVSGQFEHDNPQDFLQHIASLYNLVWYYDGNVLYIFKNSEVASRLIRLQESEA 121

Query: 120 NEFNNFLKRSGLYNKNYPLRGDNRKGTFYVSGPPVYVDMVVNAATMMDKQND--GIELGR 177
E L+RSG++ + R D YVSGPP Y+++V A +++Q + G
Sbjct: 122 AELKQALQRSGIWEPRFGWRPDASNRLVYVSGPPRYLELVEQTAAALEQQTQIRSEKTGA 181

Query: 178 QKIGVMRLNNTFVGDRTYNLRDQKMVIPGIATAIERLLQGEEQPLGNIVSSEPPAMPAFS 237
I + L DRT + RD ++ PG+AT ++R+L + + P
Sbjct: 182 LAIEIFPLKYASASDRTIHYRDDEVAAPGVATILQRVLSDATIQQVTVDNQRIP------ 235

Query: 238 ANGEKGKAANYAGGMSLQEALKQNAAAGNIKIVAYPDTNSLLVKGTAEQVHFIEMLVKAL 297
Q A + +A A ++ A P N+++V+ + E++ + L+ AL
Sbjct: 236 -----------------QAATRASAQA---RVEADPSLNAIIVRDSPERMPMYQRLIHAL 275

Query: 298 DVAKRHVELSLWIVDLNKSDLERLGTSWSGSI-----------TIGDKLGVSLNQASIST 346
D +E++L IVD+N L LG W I T GD+ ++ N A S
Sbjct: 276 DKPSARIEVALSIVDINADQLTELGVDWRVGIRTGNNHQVVIKTTGDQSNIASNGALGSL 335

Query: 347 LDG---SRFIAAVNALEEKKQATVVSRPVLLTQENVPAIFDNNRTFYTKLIGERNVALEH 403
+D +A VN LE + A VVSRP LLTQEN A+ D++ T+Y K+ G+ L+
Sbjct: 336 VDARGLDYLLARVNLLENEGSAQVVSRPTLLTQENAQAVIDHSETYYVKVTGKEVAELKG 395

Query: 404 VTYGTMIRVLPRFSADG---QIEMSLDIEDGNDKTPQSDTTTSVDALPEVGRTLISTIAR 460
+TYGTM+R+ PR G +I ++L IEDGN Q ++ ++ +P + RT++ T+AR
Sbjct: 396 ITYGTMLRMTPRVLTQGDKSEISLNLHIEDGN----QKPNSSGIEGIPTISRTVVDTVAR 451

Query: 461 VPHGKSLLVGGYTRDANTDTVQSIPFLGKLPLIGSLFRYSSKNKSNVVRVFMIEPKEIVD 520
V HG+SL++GG RD + + +P LG +P IG+LFR S+ VR+F+IEP+ I +
Sbjct: 452 VGHGQSLIIGGIYRDELSVALSKVPLLGDIPYIGALFRRKSELTRRTVRLFIIEPRIIDE 511


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_04755  INVEPROTEIN  604  0.0  Salmonella/Shigella invasion protein E (InvE) signat...
		>INVEPROTEIN#Salmonella/Shigella invasion protein E (InvE) signature.

Length = 372

Score = 604 bits (1558), Expect = 0.0
Identities = 371/372 (99%), Positives = 371/372 (99%)

Query: 1 MIPGSTSGISFSRILSRQTSHQDATQHTDAQQAEIQQAAEDSSPGAEVQKFVQSTDEMSA 60
MIPGSTSGISFSRILSRQ SHQDATQHTDAQQAEIQQAAEDSSPGAEVQKFVQSTDEMSA
Sbjct: 1 MIPGSTSGISFSRILSRQASHQDATQHTDAQQAEIQQAAEDSSPGAEVQKFVQSTDEMSA 60

Query: 61 ALAQFRNRRDYEKKSSNLSNSFERVLEDEALPKAKQILKLISVHGGALEDFLRQARSLFP 120
ALAQFRNRRDYEKKSSNLSNSFERVLEDEALPKAKQILKLISVHGGALEDFLRQARSLFP
Sbjct: 61 ALAQFRNRRDYEKKSSNLSNSFERVLEDEALPKAKQILKLISVHGGALEDFLRQARSLFP 120

Query: 121 DPSDLVLVLRELLRRKDLEEIVRKKLESLLKHVEEQTDPKTLKAGINCALKARLFGKTLS 180
DPSDLVLVLRELLRRKDLEEIVRKKLESLLKHVEEQTDPKTLKAGINCALKARLFGKTLS
Sbjct: 121 DPSDLVLVLRELLRRKDLEEIVRKKLESLLKHVEEQTDPKTLKAGINCALKARLFGKTLS 180

Query: 181 LKPGLLRASYRQFIQSESHEVEIYSDWIASYGYQRRLVVLDFIEGSLLTDIDANDASCSR 240
LKPGLLRASYRQFIQSESHEVEIYSDWIASYGYQRRLVVLDFIEGSLLTDIDANDASCSR
Sbjct: 181 LKPGLLRASYRQFIQSESHEVEIYSDWIASYGYQRRLVVLDFIEGSLLTDIDANDASCSR 240

Query: 241 LEFGQLLRRLTQLKMLRSADLLFVSTLLSYSFTKAFNAEESSWLLLMLSLLQQPHEVDSL 300
LEFGQLLRRLTQLKMLRSADLLFVSTLLSYSFTKAFNAEESSWLLLMLSLLQQPHEVDSL
Sbjct: 241 LEFGQLLRRLTQLKMLRSADLLFVSTLLSYSFTKAFNAEESSWLLLMLSLLQQPHEVDSL 300

Query: 301 LADIIGLNALLLSHKEHASFLQIFYQVCKAIPSSLFYEEYWQEELLMALRSMTDIAYKHE 360
LADIIGLNALLLSHKEHASFLQIFYQVCKAIPSSLFYEEYWQEELLMALRSMTDIAYKHE
Sbjct: 301 LADIIGLNALLLSHKEHASFLQIFYQVCKAIPSSLFYEEYWQEELLMALRSMTDIAYKHE 360

Query: 361 MAEQRRTIEKLS 372
MAEQRRTIEKLS
Sbjct: 361 MAEQRRTIEKLS 372


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_04765  SSPAKPROTEIN  206  3e-72  Invasion protein B family signature.
		>SSPAKPROTEIN#Invasion protein B family signature.

Length = 133

Score = 206 bits (525), Expect = 3e-72
Identities = 43/133 (32%), Positives = 76/133 (57%)

Query: 1 MQHLDIAELVRSALEVSGCDPSLIGGIDSHSTIVLDLFALPSICISVKDDDVWIWAQLGA 60
M ++++ +LVR +L GC PS+I +DSHS I + L ++P+I I++ ++ V +WA A
Sbjct: 1 MSNINLVQLVRDSLFTIGCPPSIITDLDSHSAITISLDSMPAINIALVNEQVMLWANFDA 60

Query: 61 DSMVVLQQRAYEILMTIMEGCHFARGGQLLLGEQNGELTLKALVHPDFLSDGEKFSTALN 120
S V LQ AY IL ++ ++ + L + L L+ ++ D++ DG F+ L+
Sbjct: 61 PSDVKLQSSAYNILNLMLMNFSYSINELVELHRSDEYLQLRVVIKDDYVHDGIVFAEILH 120

Query: 121 GFYNYLEVFSRSL 133
FY +E+ + L
Sbjct: 121 EFYQRMEILNGVL 133


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_04775  SSPAMPROTEIN  167  2e-56  Salmonella surface presentation of antigen gene typ...
		>SSPAMPROTEIN#Salmonella surface presentation of antigen gene type M signature.

Length = 147

Score = 167 bits (423), Expect = 2e-56
Identities = 141/147 (95%), Positives = 143/147 (97%)

Query: 1 MHSLTRIKVLQRRCTVFHSQCESILLRYQDEDRGLQAEEEAILEQIAGLKLLLDTLRAEN 60
MHSLTRIKVLQRRCTVFHSQCESILLRYQDEDR LQ EEEAI+EQIAGLKLLLDTLRAEN
Sbjct: 1 MHSLTRIKVLQRRCTVFHSQCESILLRYQDEDRRLQVEEEAIVEQIAGLKLLLDTLRAEN 60

Query: 61 RQLSREEIYTLLRKQSIVRRQIKDLELQIIQIQEKRSELEKKREEFQKKSKYWLRKEGNY 120
RQLSREEIY LLRKQSIVRRQIKDLELQIIQIQEKRSELEKKREEFQ+KSKYWLRKEGNY
Sbjct: 61 RQLSREEIYALLRKQSIVRRQIKDLELQIIQIQEKRSELEKKREEFQEKSKYWLRKEGNY 120

Query: 121 QRWIIRQKRHYIQREIQQEEAESEEII 147
QRWIIRQKR YIQREIQQEEAESEEII
Sbjct: 121 QRWIIRQKRLYIQREIQQEEAESEEII 147


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_04780  SSPANPROTEIN  598  0.0  Salmonella invasion protein InvJ signature.
		>SSPANPROTEIN#Salmonella invasion protein InvJ signature.

Length = 336

Score = 598 bits (1542), Expect = 0.0
Identities = 331/336 (98%), Positives = 332/336 (98%)

Query: 1 MGDVSAVSSSGNILLPQQDEVGGLSEALKKAVEKHKTEYSDHKKDRDYGDAFVMHKETAL 60
MGDVSAVSSSGNILLPQQDEVGGLSEALKKAVEKHKTEYS KKDRDYGDAFVMHKETAL
Sbjct: 1 MGDVSAVSSSGNILLPQQDEVGGLSEALKKAVEKHKTEYSGDKKDRDYGDAFVMHKETAL 60

Query: 61 PVLLAAWRHGAPAKSEHHNGNVSGLHHNGKGELRIAEKLLKVTAEKSVGLISAEAKVDKS 120
P+LLAAWRHGAPAKSEHHNGNVSGLHHNGK ELRIAEKLLKVTAEKSVGLISAEAKVDKS
Sbjct: 61 PLLLAAWRHGAPAKSEHHNGNVSGLHHNGKSELRIAEKLLKVTAEKSVGLISAEAKVDKS 120

Query: 121 AALLSPKNRPLESVSGKKLSADLKAVESVSEVTDNATGISDDNIKALPGDNKAIAGEGVR 180
AALLS KNRPLESVSGKKLSADLKAVESVSEVTDNATGISDDNIKALPGDNKAIAGEGVR
Sbjct: 121 AALLSSKNRPLESVSGKKLSADLKAVESVSEVTDNATGISDDNIKALPGDNKAIAGEGVR 180

Query: 181 KEGAPLARDVAPARMAAANTGKPEDKDHKKVKDVSQLPLQPTTIADLSQLTGGDEKMPLA 240
KEGAPLARDVAPARMAAANTGKPEDKDHKKVKDVSQLPLQPTTIADLSQLTGGDEKMPLA
Sbjct: 181 KEGAPLARDVAPARMAAANTGKPEDKDHKKVKDVSQLPLQPTTIADLSQLTGGDEKMPLA 240

Query: 241 AQSKPMMTIFPTADGVKGEDSSLTYRFQRWGNDYSVNIQARQAGEFSLIPSNTQVEHRLH 300
AQSKPMMTIFPTADGVKGEDSSLTYRFQRWGNDYSVNIQARQAGEFSLIPSNTQVEHRLH
Sbjct: 241 AQSKPMMTIFPTADGVKGEDSSLTYRFQRWGNDYSVNIQARQAGEFSLIPSNTQVEHRLH 300

Query: 301 DQWQNGNPQRWHLTRDDQQNPQQQQHRQQSGEEDDA 336
DQWQNGNPQRWHLTRDDQQNPQQQQHRQQSGEEDDA
Sbjct: 301 DQWQNGNPQRWHLTRDDQQNPQQQQHRQQSGEEDDA 336


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_04785  TYPE3OMOPROT  538  0.0  Type III secretion system outer membrane O protein ...
		>TYPE3OMOPROT#Type III secretion system outer membrane O protein family signature.

Length = 303

Score = 538 bits (1386), Expect = 0.0
Identities = 302/303 (99%), Positives = 302/303 (99%)

Query: 1 MSLRVRQIDRREWLLAQTATECQRHGREATLEYPTRQGMWVRLSDAEKRWSAWIKPGDWL 60
MSLRVRQIDRREWLLAQTATECQRHGREATLEYPTRQGMWVRLSDAEKRWSAWIKPGDWL
Sbjct: 1 MSLRVRQIDRREWLLAQTATECQRHGREATLEYPTRQGMWVRLSDAEKRWSAWIKPGDWL 60

Query: 61 EHVSPALAGAAVSAGAEHLVVPWLAATERPFELPVPHLSCRRLCVENPVPGSALPEGKLL 120
EHVSPALAGAAVSAGAEHLVVPWLAATERPFELPVPHLSCRRLCVENPVPGSALPEGKLL
Sbjct: 61 EHVSPALAGAAVSAGAEHLVVPWLAATERPFELPVPHLSCRRLCVENPVPGSALPEGKLL 120

Query: 121 HIMSDRGGLWFEHLPELPAVGGGRPKMLRWPLRFVIGSSDTQRSLLGRIGIGDVLLIRTS 180
HIMSDRGGLWFEHLPELPAVGGGRPKMLRWPLRFVIGSSDTQRSLLGRIGIGDVLLIRTS
Sbjct: 121 HIMSDRGGLWFEHLPELPAVGGGRPKMLRWPLRFVIGSSDTQRSLLGRIGIGDVLLIRTS 180

Query: 181 RAEVYCYAKKLGHFNRVEGGIIVETLDIQHIEEENNTTETAETLPGLNQLPVKLEFVLYR 240
RAEVYCYAKKLGHFNRVEGGIIVETLDIQHIEEENNTTETAETLPGLNQLPVKLEFVLYR
Sbjct: 181 RAEVYCYAKKLGHFNRVEGGIIVETLDIQHIEEENNTTETAETLPGLNQLPVKLEFVLYR 240

Query: 241 KNVTLAELETMGQQQLLSLPTNAELNVEIMANGVLLGNGELVQMNDTLGVEIHEWLSESG 300
KNVTLAELE MGQQQLLSLPTNAELNVEIMANGVLLGNGELVQMNDTLGVEIHEWLSESG
Sbjct: 241 KNVTLAELEAMGQQQLLSLPTNAELNVEIMANGVLLGNGELVQMNDTLGVEIHEWLSESG 300

Query: 301 NGE 303
NGE
Sbjct: 301 NGE 303


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_04790  TYPE3IMPPROT  303  e-107  Type III secretion system inner membrane P protein ...
		>TYPE3IMPPROT#Type III secretion system inner membrane P protein family signature.

Length = 224

Score = 303 bits (777), Expect = e-107
Identities = 224/224 (100%), Positives = 224/224 (100%)

Query: 1 MGNDISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLS 60
MGNDISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLS
Sbjct: 1 MGNDISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLS 60

Query: 61 MFVMWPIMHDAYVYFEDEDVTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFFENAQL 120
MFVMWPIMHDAYVYFEDEDVTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFFENAQL
Sbjct: 61 MFVMWPIMHDAYVYFEDEDVTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFFENAQL 120

Query: 121 KRQYGEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLVVSSVL 180
KRQYGEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLVVSSVL
Sbjct: 121 KRQYGEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLVVSSVL 180

Query: 181 LALGMMMMSPVTISTPIKLVLFVALDGWTLLSKGLILQYMDIAT 224
LALGMMMMSPVTISTPIKLVLFVALDGWTLLSKGLILQYMDIAT
Sbjct: 181 LALGMMMMSPVTISTPIKLVLFVALDGWTLLSKGLILQYMDIAT 224


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_04795  TYPE3IMQPROT  89  4e-27  Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 88.7 bits (220), Expect = 4e-27
Identities = 86/86 (100%), Positives = 86/86 (100%)

Query: 1 MDDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCL 60
MDDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCL
Sbjct: 1 MDDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCL 60

Query: 61 FLLSGWYGEVLLSYGRQVIFLALAKG 86
FLLSGWYGEVLLSYGRQVIFLALAKG
Sbjct: 61 FLLSGWYGEVLLSYGRQVIFLALAKG 86


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_04800  TYPE3IMRPROT  188  3e-61  Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 188 bits (478), Expect = 3e-61
Identities = 48/237 (20%), Positives = 104/237 (43%), Gaps = 4/237 (1%)

Query: 12 LVASAALGFARVAPIFFFLPFLNSGVLSGAPRNAIIILVALGVWPHALNEAPPFLSVAMI 71
+ RV + P L+ + + + +++ + P P S +
Sbjct: 12 WLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPANDVPVFSFFAL 71

Query: 72 PLVLQEAAVGVMLGCLLSWPFWVMHALGCIIDNQRGATLSSSIDPANGIDTSEMANFLNM 131
L +Q+ +G+ LG + + F + G II Q G + ++ +DPA+ ++ +A ++M
Sbjct: 72 WLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHLNMPVLARIMDM 131

Query: 132 FAAVVYLQNGGLVTMVDVLNKSYQLCDPMNEC--TPSLPPLLTFINQVAQNALVLASPVV 189
A +++L G + ++ +L ++ E + + L + + N L+LA P++
Sbjct: 132 LALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIFLNGLMLALPLI 191

Query: 190 LVLLLSEVFLGLLSRFAPQMNAFAISLTVKSGIAVLIMLLYFS--PVLPDNVLRLSF 244
+LL + LGLL+R APQ++ F I + + + +M +++ F
Sbjct: 192 TLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFCEHLFSEIF 248


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_04805  TYPE3IMSPROT  340  e-118  Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 340 bits (875), Expect = e-118
Identities = 120/360 (33%), Positives = 205/360 (56%), Gaps = 19/360 (5%)

Query: 1 MSSNKTEKPTKKRLEDSAKKGQSFKSKDLIIACLTLGGIAYLVSYGSFN-EFMGIIKIII 59
MS KTE+PT K++ D+ KKGQ KSK+++ L + A L+ + E + +I
Sbjct: 1 MSGEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLIP 60

Query: 60 ADNFDQSMADYSLAVFGIGLKYLIPFMLLCL---VCSALPAL----LQAGFVLATEALKP 112
+QS +S A+ + L+ F LC +AL A+ +Q GF+++ EA+KP
Sbjct: 61 ---AEQSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKP 117

Query: 113 NLSALNPVEGAKKLFSMRTVKDTVKTLLYLSSFVVAAIICWKKYKVEIFSQLNGNIVGIA 172
++ +NP+EGAK++FS++++ + +K++L + V+ +I+ W K + + L GI
Sbjct: 118 DIKKINPIEGAKRIFSIKSLVEFLKSILKV---VLLSILIWIIIKGNLVTLLQLPTCGIE 174

Query: 173 VIWRELLLALVLTCLACA---LIVLLLDAIAEYFLTMKDMKMDKEEVKREMKEQEGNPEV 229
I L L + C +++ + D EY+ +K++KM K+E+KRE KE EG+PE+
Sbjct: 175 CITPLLGQILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEI 234

Query: 230 KSKRREVHMEILSEQVKSDIENSRLIVANPTHITIGIYFKPELMPIPMISVYETNQRALA 289
KSKRR+ H EI S ++ +++ S ++VANPTHI IGI +K P+P+++ T+ +
Sbjct: 235 KSKRRQFHQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQT 294

Query: 290 VRAYAEKVGVPVIVDIKLARSLFKTHRRYDLVSLEEIDEVLRLLVWLE--EVENAGKDVI 347
VR AE+ GVP++ I LAR+L+ + E+I+ +L WLE +E +++
Sbjct: 295 VRKIAEEEGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQNIEKQHSEML 354


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_04810  SYCDCHAPRONE  128    2e-40    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 128 bits (322), Expect = 2e-40
Identities = 39/160 (24%), Positives = 72/160 (45%), Gaps = 4/160 (2%)

Query: 4 QNNVSEERVAEMIWDAVSEGATLKDVHGIPQDMMDGLYAHAYEFYNQGRLDEAETFFRFL 63
Q + + + G T+ ++ I D ++ LY+ A+ Y G+ ++A F+ L
Sbjct: 3 QETTDTQEYQLAMESFLKGGGTIAMLNEISSDTLEQLYSLAFNQYQSGKYEDAHKVFQAL 62

Query: 64 CIYDFYNPDYTMGLAAVCQLKKQFQKACDLYAVAFTLLKNDYRPVFFTGQCQLLMRKAAK 123
C+ D Y+ + +GL A Q Q+ A Y+ + + R F +C L + A+
Sbjct: 63 CVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMDIKEPRFPFHAAECLLQKGELAE 122

Query: 124 ARQCF----ELVNERTEDESLRAKALVYLEALKTAETEQH 159
A EL+ ++TE + L + LEA+K + +H
Sbjct: 123 AESGLFLAQELIADKTEFKELSTRVSSMLEAIKLKKEMEH 162


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
GX95_04815  BACINVASINB  841    0.0      Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 841 bits (2173), Expect = 0.0
Identities = 592/593 (99%), Positives = 592/593 (99%)

Query: 1 MVNDASSISRSGYTQNPRLAEAAFEGVRKNTDFLKAADKAFKDVVATKAGDLKAGTKSGE 60
MVNDASSISRSGYTQNPRLAEAAFEGVRKNTDFLKAADKAFKDVVATKAGDLKAGTKSGE
Sbjct: 1 MVNDASSISRSGYTQNPRLAEAAFEGVRKNTDFLKAADKAFKDVVATKAGDLKAGTKSGE 60

Query: 61 SAINTVGLKPPTDAAREKLSSEGQLTLLLGKLMTLLGDVSLSQLESRLAVWQAMIESQKE 120
SAINTVGLKPPTDAAREKLSSEGQLTLLLGKLMTLLGDVSLSQLESRLAVWQAMIESQKE
Sbjct: 61 SAINTVGLKPPTDAAREKLSSEGQLTLLLGKLMTLLGDVSLSQLESRLAVWQAMIESQKE 120

Query: 121 MGIQVSKEFQTALGEAQEATDLYEASIKKTDTAKSVYDAAAKKLTQAQNKLQSLDPADPG 180
MGIQVSKEFQTALGEAQEATDLYEASIKKTDTAKSVYDAA KKLTQAQNKLQSLDPADPG
Sbjct: 121 MGIQVSKEFQTALGEAQEATDLYEASIKKTDTAKSVYDAATKKLTQAQNKLQSLDPADPG 180

Query: 181 YAQAEAAVEQAGKEATEAKEALDKATDATVKAGTDAKAKAEKADNILTKFQGTANAASQN 240
YAQAEAAVEQAGKEATEAKEALDKATDATVKAGTDAKAKAEKADNILTKFQGTANAASQN
Sbjct: 181 YAQAEAAVEQAGKEATEAKEALDKATDATVKAGTDAKAKAEKADNILTKFQGTANAASQN 240

Query: 241 QVSQGEQDNLSNVARLTMLMAMFIEIVGKNTEESLQNDLALFNALQEGRQAEMEKKSAEF 300
QVSQGEQDNLSNVARLTMLMAMFIEIVGKNTEESLQNDLALFNALQEGRQAEMEKKSAEF
Sbjct: 241 QVSQGEQDNLSNVARLTMLMAMFIEIVGKNTEESLQNDLALFNALQEGRQAEMEKKSAEF 300

Query: 301 QEETRKAEETNRIMGCIGKVLGALLTIVSVVAAVFTGGASLALAAVGLAVMVADEIVKAA 360
QEETRKAEETNRIMGCIGKVLGALLTIVSVVAAVFTGGASLALAAVGLAVMVADEIVKAA
Sbjct: 301 QEETRKAEETNRIMGCIGKVLGALLTIVSVVAAVFTGGASLALAAVGLAVMVADEIVKAA 360

Query: 361 TGVSFIQQALNPIMEHVLKPLMELIGKAITKALEGLGVDKKTAEMAGSIVGAIVAAIAMV 420
TGVSFIQQALNPIMEHVLKPLMELIGKAITKALEGLGVDKKTAEMAGSIVGAIVAAIAMV
Sbjct: 361 TGVSFIQQALNPIMEHVLKPLMELIGKAITKALEGLGVDKKTAEMAGSIVGAIVAAIAMV 420

Query: 421 AVIVVVAVVGKGAAAKLGNALSKMMGETIKKLVPNVLKQLAQNGSKLFTQGMQRITSGLG 480
AVIVVVAVVGKGAAAKLGNALSKMMGETIKKLVPNVLKQLAQNGSKLFTQGMQRITSGLG
Sbjct: 421 AVIVVVAVVGKGAAAKLGNALSKMMGETIKKLVPNVLKQLAQNGSKLFTQGMQRITSGLG 480

Query: 481 NVGSKMGLQTNALSKELVGNTLNKVALGMEVTNTAAQSAGGVAEGVFIKNASEALADFML 540
NVGSKMGLQTNALSKELVGNTLNKVALGMEVTNTAAQSAGGVAEGVFIKNASEALADFML
Sbjct: 481 NVGSKMGLQTNALSKELVGNTLNKVALGMEVTNTAAQSAGGVAEGVFIKNASEALADFML 540

Query: 541 ARFAMDQIQQWLKQSVEIFGENQKVTAELQKAMSSAVQQNADASRFILRQSRA 593
ARFAMDQIQQWLKQSVEIFGENQKVTAELQKAMSSAVQQNADASRFILRQSRA
Sbjct: 541 ARFAMDQIQQWLKQSVEIFGENQKVTAELQKAMSSAVQQNADASRFILRQSRA 593


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
GX95_04820  BACINVASINC  514    0.0      Salmonella/Shigella invasin protein C signature.
		>BACINVASINC#Salmonella/Shigella invasin protein C signature.

Length = 409

Score = 514 bits (1325), Expect = 0.0
Identities = 406/409 (99%), Positives = 408/409 (99%)

Query: 1 MLISNVGVNPAAYLNNHSVENSSQTASQSVSAKDILNSIGISSSKVSDLGLSPTLSAPAP 60
MLISNVG+NPAAYLNNHSVENSSQTASQSVSAKDILNSIGISSSKVSDLGLSPTLSAPAP
Sbjct: 1 MLISNVGINPAAYLNNHSVENSSQTASQSVSAKDILNSIGISSSKVSDLGLSPTLSAPAP 60

Query: 61 GVLTQTPGTITSFLKASIQNTDMNQDLNALANNVTTKANEVVQTQLREQQAEVGKFFDIS 120
GVLTQTPGTITSFLKASIQNTDMNQDLNALANNVTTKANEVVQTQLREQQAEVGKFFDIS
Sbjct: 61 GVLTQTPGTITSFLKASIQNTDMNQDLNALANNVTTKANEVVQTQLREQQAEVGKFFDIS 120

Query: 121 GMSSSAVALLAAANTLMLTLNQADSKLSGKLSLVSFDAAKTTASSMMREGMNALSGSISQ 180
GMSSSAVALLAAANTLMLTLNQADSKLSGKLSLVSFDAAKTTASSMMREGMNALSGSISQ
Sbjct: 121 GMSSSAVALLAAANTLMLTLNQADSKLSGKLSLVSFDAAKTTASSMMREGMNALSGSISQ 180

Query: 181 SALQLGITGVGAKLEYKGLQNERGALKHNAAKIDKLTTESHSIKNVLNGQNSVKLGAEGV 240
SALQLGITGVGAKLEYKGLQNERGALKHNAAKIDKLTTESHSIKNVLNGQNSVKLGAEGV
Sbjct: 181 SALQLGITGVGAKLEYKGLQNERGALKHNAAKIDKLTTESHSIKNVLNGQNSVKLGAEGV 240

Query: 241 DSLKSLNMKKTGTDATKNLNDATLKSNAGTSATESLGIKDSNKQISPEHQAILSKRLESV 300
DSLKSLNMKKTGTDATKNLNDATLKSNAGTSATESLGIK+SNKQISPEHQAILSKRLESV
Sbjct: 241 DSLKSLNMKKTGTDATKNLNDATLKSNAGTSATESLGIKNSNKQISPEHQAILSKRLESV 300

Query: 301 ESDIRLEQNTMDMTRIDARKMQMTGDLIMKNSVTVGGIAGASGQYAATQERSEQQISQVN 360
ESDIRLEQNTMDMTRIDARKMQMTGDLIMKNSVTVGGIAGAS QYAATQERSEQQISQVN
Sbjct: 301 ESDIRLEQNTMDMTRIDARKMQMTGDLIMKNSVTVGGIAGASRQYAATQERSEQQISQVN 360

Query: 361 NRVASTASDEARESSRKSTSLIQEMLKTMESINQSKASALAAIAGNIRA 409
NRVASTASDEARESSRKSTSLIQEMLKTMESINQSKASALAAIAGNIRA
Sbjct: 361 NRVASTASDEARESSRKSTSLIQEMLKTMESINQSKASALAAIAGNIRA 409


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_04845  PF05932  43     2e-08    Tir chaperone protein (CesT)
		>PF05932#Tir chaperone protein (CesT)

Length = 127

Score = 43.3 bits (102), Expect = 2e-08
Identities = 18/128 (14%), Positives = 46/128 (35%), Gaps = 8/128 (6%)

Query: 2 QAHQDIIANIGEKLGL-PLTFDDNNQCLLLLDSDIFTSIEAK--DDIWLLNGMIIPLSPV 58
++ ++ + L + PL FDD+ C +++D+ ++ + LL G++ P
Sbjct: 4 LFYKTLLDDFSRSLEMQPLVFDDHGTCNMIIDNTFALTLSCDYARERLLLIGLLEPH--- 60

Query: 59 CGDSIWRQIMVINGELAANNEGTLAYIEAAETLLFIHAI-TDLTNTYHIISQLESFVNQQ 117
D + ++ N L E + +I + + + ++ +
Sbjct: 61 -KDIPQQCLLAGALNPLLNAGPGLGLDEKSGLYHAYQSIPREKLSVPTLKREMAGLLEWM 119

Query: 118 EALKNILQ 125
+ Q
Sbjct: 120 RGWREASQ 127


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_04850  BACYPHPHTASE  300    1e-98    Salmonella/Yersinia modular tyrosine phosphatase si...
		>BACYPHPHTASE#Salmonella/Yersinia modular tyrosine phosphatase signature.

Length = 468

Score = 300 bits (770), Expect = 1e-98
Identities = 67/212 (31%), Positives = 101/212 (47%), Gaps = 17/212 (8%)

Query: 340 GKPVALAGSYPKNTPDALEAHMKMLLEKECSCLVVLTSEDQMQAKQ--LPAYFRGSYTFG 397
G +A YP LE+H +ML E L VL S ++ ++ +P YFR S T+G
Sbjct: 252 GNTRTIACQYP--LQSQLESHFRMLAENRTPVLAVLASSSEIANQRFGMPDYFRQSGTYG 309

Query: 398 EVHTNSQKVSSASQGGAI--DQYNMHL-SCGEKQYTIPVLHVKNWPDHQPLPS--TDQLE 452
+ S+ G I D Y + + G+K ++PV+HV NWPD + S T L
Sbjct: 310 SITVESKMTQQVGLGDGIMADMYTLTIREAGQKTISVPVVHVGNWPDQTAVSSEVTKALA 369

Query: 453 YLADRVKNSNQNGAPGRSSS-----DKHLPMIHCLGGVGRTGTMAAALVLKDNPHSNL-- 505
L D+ + +N + SS K P+IHC GVGRT + A+ + D+ +S L
Sbjct: 370 SLVDQTAETKRNMYESKGSSAVGDDSKLRPVIHCRAGVGRTAQLIGAMCMNDSRNSQLSV 429

Query: 506 EQVRADFRDSRNNRMLEDASQF-VQLKAMQAQ 536
E + + R RN M++ Q V +K + Q
Sbjct: 430 EDMVSQMRVQRNGIMVQKDEQLDVLIKLAEGQ 461


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_04865  PF07212  28     0.044    Hyaluronoglucosaminidase
		>PF07212#Hyaluronoglucosaminidase

Length = 336

Score = 28.1 bits (62), Expect = 0.044
Identities = 12/39 (30%), Positives = 21/39 (53%)

Query: 234 MSTSTLKRKLAEEGTSFSDIYLSARMNQAAKLLRIGNHN 272
+S +K++ +GT+ IY+++ KLLRI N
Sbjct: 241 LSIDIVKKQKGGKGTAAQGIYINSTSGTTGKLLRIRNLG 279


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_04885  FLGMRINGFLIF  43     7e-07    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 42.6 bits (100), Expect = 7e-07
Identities = 33/167 (19%), Positives = 63/167 (37%), Gaps = 10/167 (5%)

Query: 23 LLKGLDQEQANEVIAVLQMHNIEANKIDSGKLGYSITVAEPDFTAAVYWIKTYQLPPRPR 82
L L + ++A L NI + +G +I V + LP
Sbjct: 53 LFSNLSDQDGGAIVAQLTQMNI-PYRFANG--SGAIEVPADKVHELRLRLAQQGLPKGGA 109

Query: 83 VEIAQMFPADSLVSSPRAEKARLYSAIEQRLEQSLQTMEGVLSARVHISYDIDA---GEN 139
V + + S +E+ A+E L ++++T+ V SARVH++ + E
Sbjct: 110 VGFE-LLDQEKFGISQFSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQ 168

Query: 140 GRPPKPVHLSALAVYERGSPLAHQISDIKRFLKNSFADVDYDNISVV 186
P V ++ QIS + + ++ A + N+++V
Sbjct: 169 KSPSASVTVTLEPGRALDEG---QISAVVHLVSSAVAGLPPGNVTLV 212


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
GX95_04905  BORPETOXINA  31     0.007    Bordetella pertussis toxin A subunit signature.
		>BORPETOXINA#Bordetella pertussis toxin A subunit signature.

Length = 269

Score = 30.5 bits (68), Expect = 0.007
Identities = 16/57 (28%), Positives = 30/57 (52%), Gaps = 8/57 (14%)

Query: 201 IISDLTRKWSQAEVAGKLFMSVSSLKRKLAAEEVSFSKIYLDARMNQAIKLLRMGAG 257
++ LT + Q + F+S SS +R ++++YL+ RM +A++ R G G
Sbjct: 66 VLDHLTGRSCQVGSSNSAFVSTSSSRR--------YTEVYLEHRMQEAVEAERAGRG 114


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits      Score  E-value  Comments
GX95_04935  adhesinb  321    e-112    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 321 bits (824), Expect = e-112
Identities = 89/309 (28%), Positives = 164/309 (53%), Gaps = 14/309 (4%)

Query: 4 LHRLKTLLIAGIVAILAL-------SPAYAKEKFKVITTFTVIADMAKNVAGDAAEVSSI 56
+ + + L++ + + S K V+ T ++IAD+ KN+AGD + SI
Sbjct: 1 MKKCRFLVLLLLAFVGLAACSSQKSSTETGSSKLNVVATNSIIADITKNIAGDKINLHSI 60

Query: 57 TKPGAEIHEYQPTPGDIKRAQGAQLILANGLNLER----WFARFYQHLSGVPE---VVVS 109
G + HEY+P P D+K+ A LI NG+NLE WF + ++ VS
Sbjct: 61 VPVGQDPHEYEPLPEDVKKTSQADLIFYNGINLETGGNAWFTKLVENAKKKENKDYYAVS 120

Query: 110 TGVKPMGITEGPYNGKPNPHAWMSAENALIYVDNIRDALVKYDPDNAQIYKQNAERYKAK 169
GV + + GK +PHAW++ EN +IY NI L + DP N + Y++N + Y K
Sbjct: 121 EGVDVIYLEGQSEKGKEDPHAWLNLENGIIYAQNIAKRLSEKDPANKETYEKNLKAYVEK 180

Query: 170 IRQMADPLRAELEKIPADQRWLVTSEGAFSYLARDNDMKELYLWPINADQQGTPKQVRKV 229
+ + + + IP +++ +VTSEG F Y ++ ++ Y+W IN +++GTP Q++ +
Sbjct: 181 LSALDKEAKEKFNNIPGEKKMIVTSEGCFKYFSKAYNVPSAYIWEINTEEEGTPDQIKTL 240

Query: 230 IDTIKKHHIPAIFSESTVSDKPARQVARESGAHYGGVLYVDSLSAADGPVPTYLDLLRVT 289
++ ++K +P++F ES+V D+P + V++++ ++ DS++ +Y +++
Sbjct: 241 VEKLRKTKVPSLFVESSVDDRPMKTVSKDTNIPIYAKIFTDSVAEKGEEGDSYYSMMKYN 300

Query: 290 TETIVNGIN 298
E I G++
Sbjct: 301 LEKIAEGLS 309


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
GX95_04945  HTHFIS  382    e-128    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 382 bits (982), Expect = e-128
Identities = 143/373 (38%), Positives = 206/373 (55%), Gaps = 39/373 (10%)

Query: 350 YQEIHRLKERLVDENLALTEQLNNVDSEFGEIIGRSEAMYNVLKQVEMVAQSDSTVLILG 409
E+ + R + E +L + + ++GRS AM + + + + Q+D T++I G
Sbjct: 108 LTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITG 167

Query: 410 ETGTGKELIARAIHNLSGRSGRRMVKMNCAAMPAGLLESDLFGHERGAFTGASAQRIGRF 469
E+GTGKEL+ARA+H+ R V +N AA+P L+ES+LFGHE+GAFTGA + GRF
Sbjct: 168 ESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRF 227

Query: 470 ELADKSSLFLDEVGDMPLELQPKLLRVLQEQEFERLGSNKLIQTDVRLIAATNRDLKKMV 529
E A+ +LFLDE+GDMP++ Q +LLRVLQ+ E+ +G I++DVR++AATN+DLK+ +
Sbjct: 228 EQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSI 287

Query: 530 ADREFRNDLYYRLNVFPIQLPPLRERPEDIPLLVKAFTFKIARRMGRNIDSIPAETLRTL 589
FR DLYYRLNV P++LPPLR+R EDIP LV+ F + A + G ++ E L +
Sbjct: 288 NQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFV-QQAEKEGLDVKRFDQEALELM 346

Query: 590 SGMEWPGNVRELENVVERAVLLTRGNVLQLS-LPDITAVTPDTSPVATESAKEG------ 642
WPGNVRELEN+V R L +V+ + + SP+ +A+ G
Sbjct: 347 KAHPWPGNVRELENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQ 406

Query: 643 ----------------------------EDEYQLIIRVLKETNGVVAGPKGAAQRLGLKR 674
E EY LI+ L T G AA LGL R
Sbjct: 407 AVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIK---AADLLGLNR 463

Query: 675 TTLLSRMKRLGID 687
TL +++ LG+
Sbjct: 464 NTLRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
GX95_04960  TYPE4SSCAGA  27     0.011    Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 27.0 bits (59), Expect = 0.011
Identities = 19/75 (25%), Positives = 37/75 (49%), Gaps = 8/75 (10%)

Query: 12 IDGNQAKVD--VCGIQRDVDLTLVGSCDENGQPRLGQWVLVHVGFAMSVINEAEARDTLD 69
I GNQ + D G+ D L ++NG+P G W+ + + F + ++ ++ D +
Sbjct: 171 IIGNQIRTDQKFMGV-FDESLKERQEAEKNGEPTGGDWLDIFLSF---IFDKKQSSDVKE 226

Query: 70 ALQN--MFDVEPDVG 82
A+ + V+PD+
Sbjct: 227 AINQEPVPHVQPDIA 241


12  GX95_05255  GX95_05440  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
GX95_05255  2       19       -2.660355  L-alanine exporter AlaE
GX95_05260  3       22       -3.948121  DNA-binding protein
GX95_05265  1       23       -1.352388  hypothetical protein
GX95_05270  3       18       -1.472957  hypothetical protein
GX95_05275  4       14       -0.218420  transcriptional regulator
GX95_05280  4       13       1.150153   proteolipid membrane potential modulator
GX95_05285  3       15       2.355659   peptidoglycan-binding protein LysM
GX95_05290  2       17       2.890061   transcriptional regulator
GX95_05295  1       17       3.042441   GABA permease
GX95_05300  0       21       4.077798   4-aminobutyrate--2-oxoglutarate transaminase
GX95_05305  -1      21       3.772956   succinate-semialdehyde dehydrogenase I
GX95_05310  -2      19       3.216328   hydroxyglutarate oxidase
GX95_05315  -2      16       2.347309   carbon starvation induced protein
GX95_05320  -3      14       2.754646   tripartite tricarboxylate transporter TctA
GX95_05325  -2      13       1.184575   hypothetical protein
GX95_05330  -3      13       0.269835   tricarboxylic transporter
GX95_05335  -1      16       -2.859174  hypothetical protein
GX95_05340  0       17       -3.735250  DNA-binding response regulator
GX95_05345  1       23       -5.503812  histidine kinase
GX95_05350  2       32       -8.594447  nickel transporter
GX95_05355  3       26       -7.003366  hypothetical protein
GX95_05360  3       22       -5.647164  antimicrobial resistance protein Mig-14
GX95_05365  2       17       -2.022574  VirG localization protein VirK
GX95_05370  0       14       1.741294   effector protein PipB
GX95_05375  0       14       3.140870   hypothetical protein
GX95_05380  0       14       2.754224   outer membrane receptor protein
GX95_05385  -1      13       3.047904   hydrolase
GX95_05390  -1      13       2.393148   enterochelin esterase
GX95_05395  -1      11       1.217763   multidrug ABC transporter ATP-binding protein
GX95_05400  -2      13       -0.543299  glycosyl transferase
GX95_05405  -1      12       -0.358670  DNA-invertase
GX95_05410  1       13       3.295237   flagellin
GX95_05415  1       14       3.600635   repressor
GX95_05425  2       14       3.907871   secretion protein HlyD
GX95_05430  2       14       4.095268   ATP-binding protein
GX95_05435  3       15       4.141330   type I secretion protein TolC
GX95_05440  2       15       3.863269   Ig-like domain repeat protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_05285  INTIMIN  27     0.030    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 27.3 bits (60), Expect = 0.030
Identities = 20/69 (28%), Positives = 33/69 (47%), Gaps = 6/69 (8%)

Query: 82 SVDDQVKTTTPAAESQFYTVKSGDTLSAISKQVYGNANLYNKIFEANKPMLKSPE---KI 138
D ++ T FYT+K+G+T++ +SK N + I+ NK + S K
Sbjct: 48 GSDSKLLTHNSYQNRLFYTLKTGETVADLSKSQDINLST---IWSLNKHLYSSESEMMKA 104

Query: 139 YPGQVLRIP 147
PGQ + +P
Sbjct: 105 EPGQQIILP 113


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
GX95_05340  HTHFIS  97     2e-25    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 96.8 bits (241), Expect = 2e-25
Identities = 35/122 (28%), Positives = 61/122 (50%), Gaps = 1/122 (0%)

Query: 2 RLLLAEDNRELAHWLEKALVQNGFAVDCVFDGLAADHLLHSEMYALAVLDINMPGMDGLE 61
+L+A+D+ + L +AL + G+ V + + + L V D+ MP + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 VVQRLRKRGQTLPVLLLTARSAVADRVKGLNVGADDYLPKPFELEE-LDARLRALLRRSA 120
++ R++K LPVL+++A++ +K GA DYLPKPF+L E + RAL
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 121 GQ 122

Sbjct: 125 RP 126


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_05345  PF06580  32     0.006    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.8 bits (72), Expect = 0.006
Identities = 23/101 (22%), Positives = 39/101 (38%), Gaps = 21/101 (20%)

Query: 370 LLDNALKY----TPEQGIVTARLERDSDAVTLVVEDSGPGIDDEHIHLALQPFHRLDNVG 425
L++N +K+ P+ G + + +D+ VTL VE++G L N
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLA--------------LKNTK 308

Query: 426 NVAGAGIGLALVND-IARLHRTHPHFSRSEALGGLYVRIRF 465
G GL V + + L+ T SE G + +
Sbjct: 309 E--STGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLI 347


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_05395  PF05272  31     0.032    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.2 bits (70), Expect = 0.032
Identities = 44/217 (20%), Positives = 64/217 (29%), Gaps = 49/217 (22%)

Query: 992 PPG----TVVAVVGRSGVGKSTLIKLLAGLYSPGSGQIRVGER-----------LIDAAS 1036
PG V + G G+GKSTLI L GL +G + +
Sbjct: 590 EPGCKFDYSVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYEQIAGIVAYELSE 649

Query: 1037 LSDYRRQTGLVTQDVALFSGDIAENI-RYSRPDSSDTEVEIAARRAGLFETV---QHL-- 1090
++ +RR D + RY V+ R+ ++ T Q+L
Sbjct: 650 MTAFRR------ADAEAVKAFFSSRKDRYRGA--YGRYVQDHPRQVVIWCTTNKRQYLFD 701

Query: 1091 PLGFRT--PVNNGG----TDLSAGQRQLIALARAQLAQ----------AHILLLDEATAR 1134
G R PV G L + QL A A I E R
Sbjct: 702 ITGNRRFWPVLVPGRANLVWLQKFRGQLFAEALHLYLAGERYFPSPEDEEIYFRPEQELR 761

Query: 1135 -IDRSAEERLITSLTGVTHTEKRIALIVAHRLTTARR 1170
++ + RL LT A A + +
Sbjct: 762 LVETGVQGRLWALLTREG---APAAEGAAQKGYSVNT 795


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
GX95_05410  FLAGELLIN  280    8e-91    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 280 bits (718), Expect = 8e-91
Identities = 267/510 (52%), Positives = 316/510 (61%), Gaps = 13/510 (2%)

Query: 2 AQVINTNSLSLLTQNNLNKSQSALGTAIERLSSGLRINSAKDDAAGQAIANRFTANIKGL 61
AQVINTNSLSLLTQNNLNKSQS+L +AIERLSSGLRINSAKDDAAGQAIANRFT+NIKGL
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 TQASRNANDGISIAQTTEGALNEINNNLQRVRELAVQSANSTNSQSDLDSIQAEITQRLN 121
TQASRNANDGISIAQTTEGALNEINNNLQRVREL+VQ+ N TNS SDL SIQ EI QRL
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 EIDRVSGQTQFNGVKVLAQDNTLTIQVGANDGETIDIDLKQINSQTLGLDSLNVQKAYDV 181
EIDRVS QTQFNGVKVL+QDN + IQVGANDGETI IDL++I+ ++LGLD NV +
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEA 180

Query: 182 SATDVISSTYSDGTQALTAPTATDIKAALGNPTVTGDTLTAAVSFKDGKYYATVSGYTDA 241
+ D+ SS + A A + + + V DT V D Y +G
Sbjct: 181 TVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVP--DKVYVNAANGQLTT 238

Query: 242 GDTAKNGKYEVTVDSATGAVSFGATPTKSTVTGDTAVTKVQVNAPVAADAATKKALQDGG 301
D N ++ + + A + A + G V TK G
Sbjct: 239 DDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKG-VTFTIDTKTGNDGNG 297

Query: 302 VSSADANAATLVKMSYTDKNGKTIEGGYALKAGDKYYAA------DYDEATGAIKAKTTS 355
S N + G L++ Y + +D+ T AK +
Sbjct: 298 KVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSD 357

Query: 356 YTAADGTTKTAANQLGGVDG----KTEVVTIDGKTYNASKAAGHDFKAQPELAEAAAKTT 411
A + + + G + + VT+ GKT K A E A AA K+T
Sbjct: 358 LEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAAAAKKST 417

Query: 412 ENPLQKIDAALAQVDALRSDLGAVQNRFNSAITNLGNTVNNLSEARSRIEDSDYATEVSN 471
NPL ID+AL++VDA+RS LGA+QNRF+SAITNLGNTV NL+ ARSRIED+DYATEVSN
Sbjct: 418 ANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYATEVSN 477

Query: 472 MSRAQILQQAGTSVLAQANQVPQNVLSLLR 501
MS+AQILQQAGTSVLAQANQVPQNVLSLLR
Sbjct: 478 MSKAQILQQAGTSVLAQANQVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
GX95_05425  RTXTOXIND  243    3e-78    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 243 bits (621), Expect = 3e-78
Identities = 95/432 (21%), Positives = 176/432 (40%), Gaps = 56/432 (12%)

Query: 18 ERAFSGAGRIVLICSLLFLILGI-WAWFGRLDEVSTGNGKVIPSSREQVLQSLDGGILAQ 76
E S R+V + FL++ + G+++ V+T NGK+ S R + ++ ++ I+ +
Sbjct: 50 ETPVSRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKE 109

Query: 77 LTVREGDRVQANQIVARLDPTRLASNVGESAAKYRASLASSARLTAEVSDLPL------- 129
+ V+EG+ V+ ++ +L ++ ++ + + R + L
Sbjct: 110 IIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELK 169

Query: 130 --AFPAELNGWPDLIAAETRLYKSR-----------RAQLADTEAELRDALASVNK---- 172
P N + + T L K + L AE LA +N+
Sbjct: 170 LPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENL 229

Query: 173 ------ELAITQRLEKSGAASHVEVLRLQRQKSDLG---------------------LKI 205
L L A + VL + + + +
Sbjct: 230 SRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEY 289

Query: 206 TDLRSQYYVQAREALSKANAEVDMLSAILKGREDSVTRLTVRSPVRGIVKNIQVTTIGGV 265
+ + + + L + + +L+ L E+ +R+PV V+ ++V T GGV
Sbjct: 290 QLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGV 349

Query: 266 IPPNGEMMEIVPVDDRLLIETRLSPRDIAFIHPGQRALVKITAYDYAIYGGLDGVVETIS 325
+ +M IVP DD L + + +DI FI+ GQ A++K+ A+ Y YG L G V+ I+
Sbjct: 350 VTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNIN 409

Query: 326 PDTIQDKVKPEIFYYRVFIRTHQDYLQNKSGRRFSIVPGMIATVDIKTGEKTIVDYLIKP 385
D I+D+ + V I ++ L + + GM T +IKTG ++++ YL+ P
Sbjct: 410 LDAIEDQRLGL--VFNVIISIEENCLSTG-NKNIPLSSGMAVTAEIKTGMRSVISYLLSP 466

Query: 386 F-NRAKEALRER 396
E+LRER
Sbjct: 467 LEESVTESLRER 478


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
GX95_05435  RTXTOXIND  35     5e-04    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 35.2 bits (81), Expect = 5e-04
Identities = 32/224 (14%), Positives = 63/224 (28%), Gaps = 32/224 (14%)

Query: 209 DVVQTEARIESARSQLAQYQANLDSAKASLMSWLGWNSLNGINNDFPAKLARSCETATPD 268
+ EA +S L Q + + S D P S E
Sbjct: 128 TALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRL 187

Query: 269 DRLVPAVLAAW-AQANVARANLDYASAQ---MTPTISLEPSVQHYLNDKYPSHEVLDKTQ 324
L+ + W Q NLD A+ + I+ ++ + L Q
Sbjct: 188 TSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQ 247

Query: 325 YSTWVKVEMPLYQGGGLTARRNAASHAVDAAQSTIQRTRLDVRQKLMEARSQAMSLASAL 384
V + ++ +L +SQ + S
Sbjct: 248 AIAKHAVL-------------------------EQENKYVEAVNELRVYKSQLEQIES-- 280

Query: 385 QILRRQQQLSERTRELYQQQYLDLGSRPLLDVLNAEQEVYQARF 428
+IL +++ T +L++ + LD + ++ E+ +
Sbjct: 281 EILSAKEEYQLVT-QLFKNEILDKLRQTTDNIGLLTLELAKNEE 323


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_05440  INTIMIN  43     3e-05    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 43.1 bits (101), Expect = 3e-05
Identities = 66/311 (21%), Positives = 102/311 (32%), Gaps = 30/311 (9%)

Query: 2724 TPAQTNGQPLLAFAQDKAGNTGIAAGFTAPDTRVPEAPIITNVVDDVGIYTGAIANGQ-- 2781
+N + A A D+ GN+ T + V D T A A+G
Sbjct: 518 VQGGSNVYKVTARAYDRNGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEA 577

Query: 2782 VTNDAQPTLNGTAQAGATVS--IYNNGALLGTTTANASGNWSFTPTGNLTEGSHAFT-AT 2838
+T A NG AQA VS I + A+L +AN +G+ T T L +
Sbjct: 578 ITYTATVKKNGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVT--LKSDKPGQVVVS 635

Query: 2839 ATNANGTGSVSTAATVIVDTLAPGTPSGTLSADGGSLSGQAEANSTVTVTLAGG------ 2892
A A T +++ A + VD +GQ TV V
Sbjct: 636 AKTAEMTSALNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQE 695

Query: 2893 VTLTTTAG----------SNGAWSLTLPTKQIEGQLINVTATDAAGN-ASGALGITAPVL 2941
VT TTT G +NG +TL + L++ +D A + + + +
Sbjct: 696 VTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLT 755

Query: 2942 PLAARDNITSLDLTSTAVTSTQNYSDYGLLLVGALGNVASVLGN------DTAQVEFIIA 2995
I + T Y L G G N D + + +
Sbjct: 756 IDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLK 815

Query: 2996 EGGTGDVTIDA 3006
E GT +++ +
Sbjct: 816 EKGTTTISVIS 826



Score = 39.3 bits (91), Expect = 4e-04
Identities = 79/414 (19%), Positives = 146/414 (35%), Gaps = 39/414 (9%)

Query: 2197 IYNGSALVGTA-QVQANGSWSFT-------PSTSLGAGVWNLTATATDAAGNTSAASEIR 2248
+++ SAL Q+Q +GS S G+ V+ +TA A D GN+S + +
Sbjct: 486 VWDDSALRSQGGQIQHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDRNGNSS-NNVLL 544

Query: 2249 SFTIDTTAPAAPVIDTVYDGTGPITGNLSSGQ--ITDEARPVISGTREAN--TTIRLYDN 2304
+ T+ + V D T T + G IT A +G +AN + +
Sbjct: 545 TITV-LSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQANVPVSFNIVSG 603

Query: 2305 GTLLAEIPADNSNSWRYTPDASLATGNHVITVIAVDAAGNASPV-SDSVNFVVDTTPPLT 2363
+L+ A+ + S + T +L + V++ A S + +++V FV T +T
Sbjct: 604 TAVLSANSANTNGSGKAT--VTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASIT 661

Query: 2364 PVITSVSDDQAPGLGTIANGQN--TNDPTPTFSGTAEAGATITLYENGTVIGTTTAQ--P 2419
+ A +ANGQ+ T + +T + +T +
Sbjct: 662 EI--KADKTTA-----VANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDT 714

Query: 2420 DGAWSVSTSKLTSGTHVITAVATDAAGNSSPNSTAFTLTVDTTAPQTPILTSVVDDVAGG 2479
+G V+ + T G +++A +D A + F T+ I+ + V
Sbjct: 715 NGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPT 774

Query: 2480 VTGNLANGQITNDNRPTLNGTAEAGSVISIYDGNTLLGVTSANAGGAWSFTPTTGLNDGT 2539
V + A I+ D ++ G + G + + + N
Sbjct: 775 VWLQYGQVNLKASGGNGKYTWRSANPAIASVDASS--GQVTLKEKGTTTISVISSDN--- 829

Query: 2540 RTLTVTATDPAGNVSPATSGFTIVVD------TLAPTVPLITSIVDDVPNNTGA 2587
+T T T P + P S D +P + +++V GA
Sbjct: 830 QTATYTIATPNSLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGA 883



Score = 36.6 bits (84), Expect = 0.003
Identities = 63/279 (22%), Positives = 90/279 (32%), Gaps = 36/279 (12%)

Query: 1508 TLPVTSALPDGVYTLTAIAADAAGNSSGVSNSFTFTVDTVPLLPPVVN--EILDDVAPVT 1565
LP VY +TA A D GNS SN+ T+ TV VV+ + D A T
Sbjct: 513 ILPAYVQGGSNVYKVTARAYDRNGNS---SNNVLLTI-TVLSNGQVVDQVGVTDFTADKT 568

Query: 1566 GPLTDG--AFTNDRTLTINGSGENGSTVTIYDNGVAIGTALVTDGVWTFN-----TPELS 1618
DG A T T+ NG + V+ + GTA+++ N T L
Sbjct: 569 SAKADGTEAITYTATVKKNGVAQANVPVSF---NIVSGTAVLSANSANTNGSGKATVTLK 625

Query: 1619 EVSHALTFSATDDAGNTTAQTQPITITVDITAPPAPTIQTVDDDGTRVAGRADPYA-TVE 1677
+ A T+A I VD T I+ T VA D TV+
Sbjct: 626 SDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASITEIKADKT--TAVANGQDAITYTVK 683

Query: 1678 IHHADGTLVGSAVANGTGEFVVTLSPAQTDG---------GTLTAIAIDRAGNNGPATNF 1728
+ D + V T ++ S +TD T ++ A + A +
Sbjct: 684 VMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDV 743

Query: 1729 PASDSGLPAVPAITAIEDNVGSVQGNIAAGGATDDTTPT 1767
A + I + G PT
Sbjct: 744 KAPEVEFFTTLTIDDGNIEI--------VGTGVKGKLPT 774


13  GX95_05710  GX95_05780  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
GX95_05710  -1      14       -3.272074  DNA repair protein RecO
GX95_05715  0       16       -4.420199  pyridoxine 5'-phosphate synthase
GX95_05720  0       17       -4.316101  holo-ACP synthase
GX95_05725  -1      17       -3.477620  ferredoxin
GX95_05730  1       16       -2.052682  transcriptional regulator
GX95_05735  0       18       -0.424297  MFS transporter
GX95_05740  0       18       1.409417   2-dehydropantoate 2-reductase
GX95_05745  0       13       1.760214   RpiR family transcriptional regulator
GX95_05750  -1      12       1.065912   N-acetylmuramic acid 6-phosphate etherase
GX95_05755  -2      14       3.159087   PTS sugar transporter subunit IIC
GX95_05760  -3      14       3.018652   acid phosphatase AphA
GX95_05765  -2      13       3.307306   tRNA adenosine(34) deaminase TadA
GX95_05770  -2      13       3.115117   lytic transglycosylase F
GX95_05775  -2      14       3.611122   hypothetical protein
GX95_05780  -2      13       3.808012   phosphoribosylformylglycinamidine synthase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_05735  TCRTETB  33     0.002    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 33.3 bits (76), Expect = 0.002
Identities = 33/177 (18%), Positives = 69/177 (38%), Gaps = 3/177 (1%)

Query: 213 FWLLFMILALGVFSGMVISSSSAQIGMTQYGLLSGAL-VVSLVSIFNSIGRLFWGGLTDK 271
WL + V + MV++ S I + V + + SIG +G L+D+
Sbjct: 17 IWLCILSF-FSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQ 75

Query: 272 LGGYNTLVIVYLFTCVCMLLLLFFNGNTSVFYFSALGVGFAYAGILVIFPGLTSQNFGMR 331
LG L+ + C ++ + S+ + G A + + ++
Sbjct: 76 LGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKE 135

Query: 332 NQGLNYGFMYFGFAVGAVIAPYVTSAIAKYTGSYNTVFILTTVLLLIGVVLTLITKK 388
N+G +G + A+G + P + IA Y ++ + ++ + ++ L + KK
Sbjct: 136 NRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIH-WSYLLLIPMITIITVPFLMKLLKK 191


14  GX95_05895  GX95_05955  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias   Product
GX95_05895  2       16       2.278453  Fe-S cluster assembly scaffold IscU
GX95_05900  -1      15       2.982284  iron-sulfur cluster assembly protein IscA
GX95_05905  0       14       3.016231  co-chaperone HscB
GX95_05910  -1      16       3.669843  Fe-S protein assembly chaperone HscA
GX95_05915  -1      15       3.951936  ferredoxin, 2Fe-2S type, ISC system
GX95_05920  -1      14       4.916176  Fe-S assembly protein IscX
GX95_05925  -1      14       4.384353  aminopeptidase PepB
GX95_05930  -1      13       4.357158  enhanced serine sensitivity protein SseB
GX95_05935  -1      13       4.734454  3-mercaptopyruvate sulfurtransferase
GX95_05940  -1      14       4.749506  hypothetical protein
GX95_05945  0       13       4.500623  penicillin-binding protein 1C
GX95_05950  0       15       2.368046  dimethyl sulfoxide reductase subunit A
GX95_05955  2       16       3.129016  dimethylsulfoxide reductase, chain B
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_05910  SHAPEPROTEIN  119    1e-31    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 119 bits (299), Expect = 1e-31
Identities = 84/368 (22%), Positives = 148/368 (40%), Gaps = 68/368 (18%)

Query: 23 GIDLGTTNSLVATVRSGQAETLPDHEGRHLLPSVVHYQQQGHTVGYAARDNAAQDTANTI 82
IDLGT N+L+ G + +E PSVV A + ++
Sbjct: 14 SIDLGTANTLIYVKGQG----IVLNE-----PSVV------------AIRQDRAGSPKSV 52

Query: 83 SSV----KRMMGRSLADIQARYPHLPYRFKASVNGLPMIDTAAGLLNPVRVSADILKALA 138
++V K+M+GR+ +I A P M D G++ V+ +L+
Sbjct: 53 AAVGHDAKQMLGRTPGNIAAIRP--------------MKD---GVIADFFVTEKMLQHFI 95

Query: 139 ARA-SESLSGELDGVVITVPAYFDDAQRQGTKDAARLAGLHVLRLLNEPTAAAIAYGLDS 197
+ S S V++ VP +R+ +++A+ AG + L+ EP AAAI GL
Sbjct: 96 KQVHSNSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGAREVFLIEEPMAAAIGAGLPV 155

Query: 198 GKEGVIAVYDLGGGTFDISILRLSRGVFEVLATGGDSALGGDDFDHLLADYIREQAG--I 255
+ V D+GGGT +++++ L+ V +GGD FD + +Y+R G I
Sbjct: 156 SEATGSMVVDIGGGTTEVAVISLNGVV-----YSSSVRIGGDRFDEAIINYVRRNYGSLI 210

Query: 256 ADRSDNRVQRELLDAAIAAKIALSDADTVRVNVAG---WQG-----EITREQFNDLISAL 307
+ + R++ E+ A + + V G +G + + + +
Sbjct: 211 GEATAERIKHEI-------GSAYPGDEVREIEVRGRNLAEGVPRGFTLNSNEILEALQEP 263

Query: 308 VKRTLLACRRALKDAGVD-PQDVLE--VVMVGGSTRVPLVRERVGEFFGRTPLTAIDPDK 364
+ + A AL+ + D+ E +V+ GG + + + E G + A DP
Sbjct: 264 LTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNLDRLLMEETGIPVVVAEDPLT 323

Query: 365 VVAIGAAI 372
VA G
Sbjct: 324 CVARGGGK 331


15  GX95_06005  GX95_06070  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
GX95_06005  0       16       3.310657   ribosome biogenesis GTPase Der
GX95_06010  1       15       3.285383   hypothetical protein
GX95_06015  1       15       3.450353   intimin-like inverse autotransporter protein
GX95_06020  1       16       3.916794   hypothetical protein
GX95_06025  1       17       3.996002   hypothetical protein
GX95_06030  1       17       3.893884   hypothetical protein
GX95_06035  4       18       2.328970   AIDA autotransporter
GX95_06040  3       18       1.348119   exodeoxyribonuclease VII large subunit
GX95_06045  5       24       -0.045199  IMP dehydrogenase
GX95_06050  4       21       -3.504386  glutamine-hydrolyzing GMP synthase
GX95_06055  1       25       -8.134166  cytoplasmic protein
GX95_06060  1       18       -5.397432  hypothetical protein
GX95_06065  0       12       -4.517733  DUF2633 domain-containing protein
GX95_06070  0       11       -3.555210  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_06015  INTIMIN  342    e-107    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 342 bits (879), Expect = e-107
Identities = 176/578 (30%), Positives = 269/578 (46%), Gaps = 41/578 (7%)

Query: 53 KGKTFKEQGANYFINSATQGFDNLTPEALES-QARGYIQGQVTSSAQSYLEGMLSPYGKV 111
K ++ NY A L +L A+ G + A S L+ L YG
Sbjct: 154 KSNMTDDKALNYAAQQAASLGSQLQSRSLNGDYAKDTALGIAGNQASSQLQAWLQHYGTA 213

Query: 112 RTSLSIGEGGDLDGSSLDYFIPWYDNQSTLFFSQISAQRKEDRTIGNVGLGVRQNVGNWL 171
+L G + DGSSLD+ +P+YD++ L F Q+ A+ + R N+G G R + +
Sbjct: 214 EVNLQSGN--NFDGSSLDFLLPFYDSEKMLAFGQVGARYIDSRFTANLGAGQRFFLPENM 271

Query: 172 LGGNAFYDYDFTRGHRRLGLGTEAWTDYLKFSGNYYHPLSDWKDSKDFDFYEERPARGWD 231
LG N F D DF+ + RLG+G E W DY K S N Y +S W +S + Y+ERPA G+D
Sbjct: 272 LGYNVFIDQDFSGDNTRLGIGGEYWRDYFKSSVNGYFRMSGWHESYNKKDYDERPANGFD 331

Query: 232 IRMESWLPFYPQLGAKLVYEQYYGDEVALFGTDNLQKDPHAVTVGLNYTPVPLVTVGTDY 291
IR +LP YP LGAKL+YEQYYGD VALF +D LQ +P A TVG+NYTP+PLVT+G DY
Sbjct: 332 IRFNGYLPSYPALGAKLMYEQYYGDNVALFNSDKLQSNPGAATVGVNYTPIPLVTMGIDY 391

Query: 292 KAGTGDSNDFSVNATVNYQIGTPLAAQLDPENVKIQHSLMGSRTDFVDRNNFIVLEYREK 351
+ GTG+ ND + YQ P + Q++P+ V +L GSR D V RNN I+LEY+++
Sbjct: 392 RHGTGNENDLLYSMQFRYQFDKPWSQQIEPQYVNELRTLSGSRYDLVQRNNNIILEYKKQ 451

Query: 352 DPLDVTLWLKADATNEHPECVIEDTPEAAVGLEKCKWTVNALINHHYKIISASWQAKNNA 411
D L + + + T E I+ ++ GL++ W +AL + +I + Q+ +
Sbjct: 452 DILSLNIPHDINGT-ERSTQKIQLIVKSKYGLDRIVWDDSALRSQGGQIQHSGSQSAQD- 509

Query: 412 ARTLVMPVVKADALTEGNNNRWNLVLPAWVNADTEEQRTALNTWKVRMTLEDEKGNKQNS 471
+ +LPA+V + N +KV D GN N+
Sbjct: 510 ---------------------YQAILPAYV-------QGGSNVYKVTARAYDRNGNSSNN 541

Query: 472 GVVEITVQQDRKIELIVDNIADTDRSDHSHEASALADGEDGVVMDLLITDSFGD-ATDRN 530
++ ITV + +VD + TD + + + SA ADG + + + + A
Sbjct: 542 VLLTITVLSN---GQVVDQVGVTDFT--ADKTSAKADGTEAITYTATVKKNGVAQANVPV 596

Query: 531 GNELVDDAMTPVLYDSNDKKVTLAQTPCTTETPCVFIASRDKEAGTVTLSSTLPGTFRWK 590
+V +N A ++ P + S T L++
Sbjct: 597 SFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNA--NAVIFVD 654

Query: 591 AKADAYGDSNYVDVTFIGDNLSALNAVIYQVKAANPVN 628
+ + T + + A+ + +K PV+
Sbjct: 655 QTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVS 692


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_06035  PRTACTNFAMLY  89     1e-19    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 89.0 bits (220), Expect = 1e-19
Identities = 124/532 (23%), Positives = 196/532 (36%), Gaps = 47/532 (8%)

Query: 1318 VNVASGATFGGSNGTTVNGKVTNEGTLVFGDSEETGAIFTLNGDLINMGTMTSGSSASTP 1377
V +AS A + G+ V+ + T V D+ GA+ L + G++ A
Sbjct: 415 VALASQARWTGAT-RAVDSLSIDNATWVMTDNSNVGAL-----RLASDGSVDFQQPAEAG 468

Query: 1378 GNTLYVDGDYTGNGGSLYLNTVLGDDDSATDKLVITGDASGTTDLYINGIGNGAQTTNGI 1437
+ G+G +N D +DKLV+ DASG L++ G+ + N +
Sbjct: 469 RFKVLTVNTLAGSG-LFRMNVFA--DLGLSDKLVVMQDASGQHRLWVRNSGSEPASANTL 525

Query: 1438 EVVDVGGTSTSDAFVLKN---EVNASLYTYRLYWNESDNDWYLASKAQSDDDDSGSDVTP 1494
+V S + F L N +V+ Y YRL N + + +KA +
Sbjct: 526 LLVQTPLGSAA-TFTLANKDGKVDIGTYRYRLAANGNGQWSLVGAKAPPAPKPAPQ---- 580

Query: 1495 PDDDSDVTPSDDGDDGGNVTPPDDGGDVAPQYRADIGAYMGNQ--WMARNLQMQTLYDRE 1552
P P + P A + G W A + L R
Sbjct: 581 PGPQPPQPPQPQPEAPAPQPPAGRELSAAANAAVNTGGVGLASTLWYA---ESNALSKRL 637

Query: 1553 GSQYRNAD-GSVWARFKAGKAESEAVSGNIDMDSNYSQFQLGGDILAWGNGQQSVTVGVM 1611
G N D G W R A + + + +G D + F+LG D A +G +
Sbjct: 638 GELRLNPDAGGAWGRGFAQRQQLDNRAGR-RFDQKVAGFELGAD-HAVAVAGGRWHLGGL 695

Query: 1612 ASYINADTDSTGNRGADGSQFTSSSNVDGYNLGVYATWFADAQTHSGAYVDSWYQYGFYN 1671
A Y D TG+ G D ++G YAT+ AD SG Y+D+ +
Sbjct: 696 AGYTRGDRGFTGDGGGH---------TDSVHVGGYATYIAD----SGFYLDATLRASRLE 742

Query: 1672 N--SVESGDAGSESYDSTANAV--SLETGYRYDIALSNGNTVSLTPQAQVVWQNYSADSV 1727
N V D + + V SLE G R+ A + L PQA++ +
Sbjct: 743 NDFKVAGSDGYAVKGKYRTHGVGASLEAGRRFTHA----DGWFLEPQAELAVFRAGGGAY 798

Query: 1728 KDNYGTRIDGQDSDSWTTRLGLRVDGKLYKGSRTVIQPFAEANWLHTSD-DVSVSFDDAT 1786
+ G R+ + S RLGL V ++ +QP+ +A+ L D +V +
Sbjct: 799 RAANGLRVRDEGGSSVLGRLGLEVGKRIELAGGRQVQPYIKASVLQEFDGAGTVHTNGIA 858

Query: 1787 VKQDLPANRAELKVGLQADIDKQWSVRAQVAGQTGSNDFGDLNGSLNLRYNW 1838
+ +L RAEL +G+ A + + S+ A G RY+W
Sbjct: 859 HRTELRGTRAELGLGMAAALGRGHSLYASYEYSKGPKLAMPWTFHAGYRYSW 910


16  GX95_06125  GX95_06180  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_06125  2       14       1.515407    hypothetical protein
GX95_06130  1       17       0.431242    AI-2E family transporter
GX95_06135  1       15       0.317120    XRE family transcriptional regulator
GX95_06140  -1      12       2.052808    gluconate:proton symporter
GX95_06145  -1      13       1.234934    glycerate 2-kinase
GX95_06150  0       11       0.545630    thioredoxin-dependent thiol peroxidase
GX95_06155  -1      10       4.034354    glycine cleavage system transcriptional
GX95_06160  -1      12       4.324549    4-hydroxy-tetrahydrodipicolinate synthase
GX95_06165  -2      10       4.262803    outer membrane protein assembly factor BamC
GX95_06170  -1      11       3.823815    phosphoribosylaminoimidazolesuccinocarboxamide
GX95_06175  -2      10       3.695528    hypothetical protein
GX95_06180  -2      10       3.135553    tRNA cytosine(34) acetyltransferase TmcA
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_06125  SYCDCHAPRONE  38     4e-05    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 37.6 bits (87), Expect = 4e-05
Identities = 31/149 (20%), Positives = 53/149 (35%), Gaps = 28/149 (18%)

Query: 290 NQLTSDLLDQWSKGNVRQQHAAQYGRALQAMEASKYDEARKTLQPLLSAEPNNAWYLDLA 349
N+++SD L+Q Y A ++ KY++A K Q L + ++ +
Sbjct: 29 NEISSDTLEQL------------YSLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGL 76

Query: 350 TDIDLGQKRANDAINRLKNARDLRVN-PVLQLNLANAYLQGGQPKAAETILNRYTFSHKD 408
+ + AI+ + + P + A LQ G+ AE+ L
Sbjct: 77 GACRQAMGQYDLAIHSYSYGAIMDIKEPRFPFHAAECLLQKGELAEAESGLF-------- 128

Query: 409 DGNGWDLLAQAEAALNNRDQELAARAESY 437
LAQ A +EL+ R S
Sbjct: 129 -------LAQELIADKTEFKELSTRVSSM 150


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
GX95_06135  HTHFIS  32     0.005    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.7 bits (72), Expect = 0.005
Identities = 9/70 (12%), Positives = 25/70 (35%), Gaps = 6/70 (8%)

Query: 283 DHALPALLSGLSESWQVQELSRLWLQLAQHDAKGVLQQTLRTWFEHNCDLTQTAKALHIH 342
+ + + ++ L L ++ ++ L + + A L ++
Sbjct: 409 EENMRQYFASFGDALPPSGLYDRVLAEMEYP---LILAALT---ATRGNQIKAADLLGLN 462

Query: 343 VNTLRYRLQR 352
NTLR +++
Sbjct: 463 RNTLRKKIRE 472


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_06180  AUTOINDCRSYN  31     0.007    Autoinducer synthesis protein signature.
		>AUTOINDCRSYN#Autoinducer synthesis protein signature.

Length = 216

Score = 31.4 bits (71), Expect = 0.007
Identities = 25/122 (20%), Positives = 50/122 (40%), Gaps = 13/122 (10%)

Query: 459 SRIAVHPARQREGIGQQLIACACMQVAQCDYLSVSFGYT-------PELWRFWQRCGFVL 511
SR V +R ++ +G + + + ++ +Y S GY + +R G+
Sbjct: 100 SRFFVDKSRAKDILGNEYPISSMLFLSMINY-SKDKGYDGIYTIVSHPMLTILKRSGWG- 157

Query: 512 VRMGNHREASSGCYTAMALLPLSDAGQ-RLAQQEHRRLRRDADILTQWNGEAIPLAALRE 570
+R+ + + LP+ D Q LA++ +R ++ L QW + + A
Sbjct: 158 IRVVEQGLSEKEERVYLVFLPVDDENQEALARRINRSGTFMSNELKQW---PLRVPAAIA 214

Query: 571 QA 572
QA
Sbjct: 215 QA 216


17  GX95_06245  GX95_06335  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_06245  0       17       3.799995    NADP-dependent malic enzyme
GX95_06250  1       21       4.609307    ethanolamine utilization protein EutS
GX95_06255  1       21       5.136874    ethanolamine utilization protein EutP
GX95_06260  3       24       6.456804    ethanolamine utilization protein EutQ
GX95_06265  3       22       7.043518    cobalamin adenosyltransferase
GX95_06270  4       22       7.838652    phosphate acetyltransferase
GX95_06275  3       23       6.606195    ethanolamine utilization protein EutM
GX95_06280  2       22       6.974168    ethanolamine utilization protein EutN
GX95_06285  1       24       6.379044    aldehyde dehydrogenase EutE
GX95_06290  0       24       6.436443    ethanolamine utilization protein EutJ
GX95_06295  0       23       6.268247    alcohol dehydrogenase EutG
GX95_06300  0       23       5.711625    ethanolamine utilization protein EutH
GX95_06305  -1      20       5.384881    reactivating factor for ethanolamine ammonia
GX95_06310  -2      18       3.469498    ethanolamine ammonia-lyase
GX95_06315  -2      14       3.668758    ethanolamine ammonia-lyase
GX95_06320  -1      13       1.588970    microcompartment protein EutL
GX95_06325  0       13       0.548812    ethanolamine utilization protein EutK
GX95_06330  0       12       -0.015415   ethanolamine utilization protein EutR
GX95_06335  2       13       0.112459    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_06290  SHAPEPROTEIN  49     5e-09    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 49.4 bits (118), Expect = 5e-09
Identities = 33/116 (28%), Positives = 51/116 (43%), Gaps = 9/116 (7%)

Query: 64 VRDGIVWDFFGAVTLVRRHIDTLEQQLGCRFT-HAATSFPPGTDP---RISINVLESAGL 119
++DG++ DFF +++ I + R + P G R + AG
Sbjct: 76 MKDGVIADFFVTEKMLQHFIKQVHSNSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGA 135

Query: 120 EVSHVLDEPTAVA---DLLALDNAG--VVDIGGGTTGIAIVKQGKVTYSADEATGG 170
+++EP A A L + G VVDIGGGTT +A++ V YS+ GG
Sbjct: 136 REVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNGVVYSSSVRIGG 191


18  GX95_06385  GX95_06425  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_06385  -3      15       3.153128    sulfate ABC transporter permease subunit CysW
GX95_06390  -2      15       3.516706    sulfate ABC transporter ATP-binding protein
GX95_06395  -1      16       3.201614    cysteine synthase B
GX95_06400  -2      14       1.909689    hypothetical protein
GX95_06405  0       13       1.026097    GMP synthase
GX95_06410  1       15       0.808560    transcriptional regulator PtsJ
GX95_06415  3       20       -1.192959   bifunctional pyridoxal
GX95_06420  3       26       -1.891834   cytoplasmic protein
GX95_06425  2       23       -1.932251   PTS glucose transporter subunit IIA
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_06390  PF05272  34     9e-04    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 34.3 bits (78), Expect = 9e-04
Identities = 12/33 (36%), Positives = 16/33 (48%)

Query: 30 MVALLGPSGSGKTTLLRIIAGLEHQSSGHIRFH 62
V L G G GK+TL+ + GL+ S H
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIG 630


19  GX95_06690  GX95_06765  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_06690  -1      15       3.444976    chorismate synthase
GX95_06695  -3      13       3.087289    penicillin-insensitive murein endopeptidase
GX95_06700  -3      12       0.797212    hypothetical protein
GX95_06705  -3      14       0.579682    elongation factor P hydroxylase
GX95_06710  -3      15       0.343922    hypothetical protein
GX95_06715  -2      15       0.134442    bifunctional tRNA
GX95_06720  -1      21       -2.610210   beta-ketoacyl-[acyl-carrier-protein] synthase I
GX95_06725  2       23       -1.317600   hypothetical protein
GX95_06730  -1      18       1.725719    hypothetical protein
GX95_06735  -2      15       2.432185    hypothetical protein
GX95_06740  -2      14       3.219531    DNA-binding protein
GX95_06745  -2      15       3.378714    hypothetical protein
GX95_06750  -1      14       3.090030    arabinose transporter
GX95_06755  -1      13       2.315386    flagella biosynthesis regulator
GX95_06760  0       14       3.009209    4-phosphoerythronate dehydrogenase PdxB
GX95_06765  -1      15       3.340687    aspartate-semialdehyde dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_06750  TCRTETA  44     7e-07    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 44.0 bits (104), Expect = 7e-07
Identities = 80/362 (22%), Positives = 135/362 (37%), Gaps = 30/362 (8%)

Query: 14 NFSLFRIAFAVFLTYMTVGLPLPVIPLFVHHELGYSNTMV---GIAVGIQFFATVLTRGY 70
N L I V L + +GL +PV+P + +L +SN + GI + +
Sbjct: 4 NRPLIVILSTVALDAVGIGLIMPVLPGLLR-DLVHSNDVTAHYGILLALYALMQFACAPV 62

Query: 71 AGRLADQYGAKRSALQGMFACGLAGAAWLLAALLPVSAPVKFALLIVGRLILGFGESQLL 130
G L+D++G + + LAGAA + + +AP +L +GR++ G +
Sbjct: 63 LGALSDRFGRRP-----VLLVSLAGAA--VDYAIMATAPF-LWVLYIGRIVAGITGATGA 114

Query: 131 TGTLTWGLGLVGPTRSGKVMSWNGMAIYGALAAGAPLGLL---IHSHFGFAALAGTTMVL 187
G R+ + + + AG LG L H F A A +
Sbjct: 115 VAGAYIADITDGDERA-RHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLN 173

Query: 188 PLLAWAFNGTVRKVPAYTGERPSLWSVVGLIWKPGL-----------GLALQGVGFAVIG 236
L K R +L + W G+ + L G A +
Sbjct: 174 FLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALW 233

Query: 237 TFISLYFVSNGWTMAGFTLTAFGGAFVLMR-MLFGWMPDRFGGVKVAVVSLLVETAGLLL 295
T G +L AFG L + M+ G + R G + ++ ++ + G +L
Sbjct: 234 VIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYIL 293

Query: 296 LWLAPTAWIALVGAALTGAGCSLIFPALGVEVVKRVPAQVRGTALGGYAAFQDISYGVTG 355
L A W+A L +G + PAL + ++V + +G G AA ++ + G
Sbjct: 294 LAFATRGWMAFPIMVLLASG-GIGMPALQAMLSRQVDEERQGQLQGSLAALTSLT-SIVG 351

Query: 356 PL 357
PL
Sbjct: 352 PL 353



Score = 29.4 bits (66), Expect = 0.027
Identities = 32/137 (23%), Positives = 46/137 (33%), Gaps = 12/137 (8%)

Query: 261 AFVLMRMLF----GWMPDRFGGVKVAVVSLLVETAGLLLLWLAPTAWIALVG---AALTG 313
+ LM+ G + DRFG V +VSL ++ AP W+ +G A +TG
Sbjct: 51 LYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITG 110

Query: 314 AGCSLIFPALGVEVVKRVPAQVRGTALGGYAAFQDISYGVTGPLAGMLATSCGYPSVFLA 373
A G + R G +A V GP+ G L + F A
Sbjct: 111 A----TGAVAGAYIADITDGDERARHFGFMSACFGFGM-VAGPVLGGLMGGFSPHAPFFA 165

Query: 374 GAISAVVGILVTILSFR 390
A + L
Sbjct: 166 AAALNGLNFLTGCFLLP 182


20  GX95_06975  GX95_07080  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_06975  -2      26       3.857958    NADH-quinone oxidoreductase subunit A
GX95_06980  -1      27       3.787510    NADH dehydrogenase
GX95_06985  -1      28       3.820454    NADH-quinone oxidoreductase subunit C/D
GX95_06990  -1      29       3.923485    NADH-quinone oxidoreductase subunit E
GX95_06995  -1      29       3.868520    NADH-quinone oxidoreductase subunit F
GX95_07000  0       28       3.817783    NADH-quinone oxidoreductase subunit G
GX95_07005  2       29       2.976884    NADH-quinone oxidoreductase subunit H
GX95_07010  3       30       4.011935    NADH-quinone oxidoreductase subunit I
GX95_07015  2       27       3.427086    NADH:ubiquinone oxidoreductase subunit J
GX95_07020  1       22       3.254121    NADH-quinone oxidoreductase subunit K
GX95_07025  1       21       3.156465    NADH-quinone oxidoreductase subunit L
GX95_07030  0       16       2.423546    NADH-quinone oxidoreductase subunit M
GX95_07035  -1      11       2.507602    NADH-quinone oxidoreductase subunit N
GX95_07040  -1      12       2.577636    chemotaxis protein CheV
GX95_07045  -1      13       3.846154    ribonuclease Z
GX95_07050  -1      13       4.020571    GNAT family N-acetyltransferase
GX95_07055  -1      12       4.796763    protein ElaB
GX95_07060  0       13       5.507947    isochorismate synthase MenF
GX95_07065  0       14       5.529074    2-succinyl-5-enolpyruvyl-6-hydroxy-3-
GX95_07070  -1      15       4.845726    2-succinyl-6-hydroxy-2,
GX95_07075  0       14       4.660401    1,4-dihydroxy-2-naphthoyl-CoA synthase
GX95_07080  -1      16       3.348741    o-succinylbenzoate synthase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_06980  FLGBIOSNFLIP  28     0.019    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 28.3 bits (63), Expect = 0.019
Identities = 18/56 (32%), Positives = 25/56 (44%), Gaps = 3/56 (5%)

Query: 68 MVTSFT---AVHDVARFGAEVLRASPRQADLMVVAGTCFTKMAPVIQRLYDQMLEP 120
M+TSFT V + R A P Q L + F M+PVI ++Y +P
Sbjct: 60 MMTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQP 115


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
GX95_07040  HTHFIS  49     1e-08    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 49.1 bits (117), Expect = 1e-08
Identities = 33/148 (22%), Positives = 59/148 (39%), Gaps = 16/148 (10%)

Query: 185 PGAVAIVAEDSKVARAMLEKGLNAMGIPHQMHVTGKDAWERIQQLAQEAEAEGKPISEKI 244
GA +VA+D R +L + L+ G ++ W I +
Sbjct: 2 TGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIA-------------AGDG 48

Query: 245 ALVLTDLEMPEMDGFTLTRKIKTDERLKKIPVVIHSSLSGSANEDHVRKVKADGYVAK-F 303
LV+TD+ MP+ + F L +IK + +PV++ S+ + + A Y+ K F
Sbjct: 49 DLVVTDVVMPDENAFDLLPRIK--KARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPF 106

Query: 304 EINELSSVIQEVMERAAQNVSGPLVSRQ 331
++ EL +I + + S Q
Sbjct: 107 DLTELIGIIGRALAEPKRRPSKLEDDSQ 134


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_07050  AUTOINDCRSYN  30     0.002    Autoinducer synthesis protein signature.
		>AUTOINDCRSYN#Autoinducer synthesis protein signature.

Length = 216

Score = 30.2 bits (68), Expect = 0.002
Identities = 11/74 (14%), Positives = 27/74 (36%), Gaps = 12/74 (16%)

Query: 1 MIDWQDLHHSELTVPQLYALLKLRCAVFV--------VEQRCPYLDVDGDDLVGDNRHIL 52
M++ D++H+ L+ + L LR F + D + + ++
Sbjct: 1 MLEIFDVNHTLLSETKSGELFTLRKETFKDRLNWAVQCTDGMEFDQYDNN----NTTYLF 56

Query: 53 GWHQDELVAYARIL 66
G + ++ R +
Sbjct: 57 GIKDNTVICSLRFI 70


21  GX95_07195  GX95_07360  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_07195  -1      14       -4.071355   2Fe-2S ferredoxin
GX95_07200  -1      13       -5.325351   ribonucleotide-diphosphate reductase subunit
GX95_07205  -1      14       -3.440490   ribonucleoside-diphosphate reductase subunit
GX95_07210  -2      10       -3.134836   bifunctional 3-demethylubiquinol
GX95_07215  -1      9        -3.517421   hypothetical protein
GX95_07220  -1      9        -2.416824   MFS transporter
GX95_07225  0       13       -1.166778   MR-MLE family protein
GX95_07230  0       15       0.282223    DNA gyrase subunit A
GX95_07235  -1      13       0.496203    two-component system sensor histidine
GX95_07240  -2      13       1.064990    DNA-binding response regulator
GX95_07245  -3      11       1.519849    phosphotransferase RcsD
GX95_07250  -2      15       2.927327    porin OmpC
GX95_07255  -1      16       3.745796    FAD:protein FMN transferase ApbE
GX95_07260  -1      15       3.674419    bifunctional transcriptional
GX95_07265  -2      16       2.895295    alpha-ketoglutarate-dependent dioxygenase AlkB
GX95_07270  -1      18       3.153222    multidrug ABC transporter permease/ATP-binding
GX95_07275  0       16       3.239115    ecotin
GX95_07280  0       18       3.514413    ferredoxin-type protein NapF
GX95_07285  0       19       2.971561    nitrate reductase
GX95_07290  0       22       4.762628    nitrate reductase catalytic subunit
GX95_07295  2       29       7.587056    ferredoxin-type protein NapG
GX95_07300  2       33       8.435492    quinol dehydrogenase ferredoxin subunit NapH
GX95_07305  2       40       11.000852   nitrate reductase
GX95_07310  3       47       12.230413   cytochrome c-type protein NapC
GX95_07315  7       56       15.293309   heme ABC transporter ATP-binding protein CcmA
GX95_07320  6       55       14.052661   heme exporter protein CcmB
GX95_07325  6       54       13.711095   heme ABC transporter permease
GX95_07330  4       48       11.975314   heme exporter protein CcmD
GX95_07335  2       40       9.016104    cytochrome c biogenesis protein CcmE
GX95_07340  2       35       7.641835    c-type cytochrome biogenesis protein CcmF
GX95_07345  0       18       1.278945    thiol:disulfide interchange protein
GX95_07350  1       16       0.188090    cytochrome c biogenesis protein CcmH
GX95_07355  1       14       -3.107162   DNA-binding response regulator
GX95_07360  1       15       -3.415487   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_07215  NUCEPIMERASE  28     0.031    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 27.8 bits (62), Expect = 0.031
Identities = 16/75 (21%), Positives = 29/75 (38%), Gaps = 15/75 (20%)

Query: 133 AERDTQAYLKLDHDFHYVFVKYADNKYISQAHLLISARLLAIRYRLDFTAEYITSSNRGH 192
A+R+ L F VF+ + A+RY L+ Y S+ G
Sbjct: 62 ADREGMTDLFASGHFERVFISPH------RL---------AVRYSLENPHAYADSNLTGF 106

Query: 193 ATILDMLKNNNVEGV 207
IL+ ++N ++ +
Sbjct: 107 LNILEGCRHNKIQHL 121


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_07220  TCRTETB  31     0.012    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 30.6 bits (69), Expect = 0.012
Identities = 36/179 (20%), Positives = 68/179 (37%), Gaps = 12/179 (6%)

Query: 25 ILYFFNYMDRVNIGFAALRMNESLGITPEDFANISSIFFISYLIFQIPSSIGLQKLGARK 84
IL FF+ ++ + + + + P +++ F +++ I +LG ++
Sbjct: 21 ILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKR 80

Query: 85 W--ISSIIIGWGAVTGLIFFAKDTQHIL-LARIFLGVFEAGFFPGMVYYLACWFPARERG 141
II +G+V G F +L +AR G A F ++ +A + P RG
Sbjct: 81 LLLFGIIINCFGSVIG--FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRG 138

Query: 142 KVNSFFMLSIAVASVLAAPMSGWIIEHLNTPDYEGWRWLFAIEGIPTVFLGILTFYLLP 200
K +A+ + + G I ++ W +L I I T+ LL
Sbjct: 139 KAFGLIGSIVAMGEGVGPAIGGMIAHYI------HWSYLLLIPMI-TIITVPFLMKLLK 190


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
GX95_07235  HTHFIS  80     1e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.9 bits (197), Expect = 1e-17
Identities = 29/104 (27%), Positives = 47/104 (45%)

Query: 827 ILVVDDHPINRRLLADQLGSLGYQCKTANDGVDALNVLSKNAIDIVLSDVNMPNMDGYRL 886
ILV DD R +L L GY + ++ ++ D+V++DV MP+ + + L
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 887 TQRIRQLGLTLPVVGVTANALAEEKQRCLESGMDSCLSKPVTLD 930
RI++ LPV+ ++A + E G L KP L
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLT 109


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
GX95_07240  HTHFIS  48     8e-09    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 47.9 bits (114), Expect = 8e-09
Identities = 26/145 (17%), Positives = 60/145 (41%), Gaps = 20/145 (13%)

Query: 1 MNNMNVIIADDHPIVLFGIRKSLEQIEWVNVVGEFEDSTALINNLPKLDAHVLITDLSMP 60
M +++ADD + + ++L + + + ++ L + D +++TD+ MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRI--TSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 61 GDKYGDGITLIKYIKRHFPSLSIIVLTMNNNPAILSAVLDLDIEGIVLKQGA------PT 114
+ L+ IK+ P L ++V++ N +A+ ++GA P
Sbjct: 59 D---ENAFDLLPRIKKARPDLPVLVMSAQNTFM--TAIKA-------SEKGAYDYLPKPF 106

Query: 115 DLPKALAALQKGKKFTPESVSRLLE 139
DL + + + + S+L +
Sbjct: 107 DLTELIGIIGRALAEPKRRPSKLED 131


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
GX95_07250  ECOLIPORIN  534    0.0      E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 534 bits (1378), Expect = 0.0
Identities = 260/389 (66%), Positives = 297/389 (76%), Gaps = 17/389 (4%)

Query: 1 MKVKVLSLLVPALLVAGAANAAEIYNKDGNKLDLFGKVDGLHYFSDDKGSDGDQTYMRIG 60
MK KVL+L++PALL AGAA+AAEIYNKDGNKLDL+GKVDGLHYFSDD DGDQTYMR+G
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMRVG 60

Query: 61 FKGETQVNDQLTGYGQWEYQIQGNQTEG-SNDSWTRVAFAGLKFADAGSFDYGRNYGVTY 119
FKGETQ+NDQLTGYGQWEY +Q N TEG +SWTR+AFAGLKF D GSFDYGRNYGV Y
Sbjct: 61 FKGETQINDQLTGYGQWEYNVQANTTEGEGANSWTRLAFAGLKFGDYGSFDYGRNYGVLY 120

Query: 120 DVTSWTDVLPEFGGDTYG-ADNFMQQRGNGYATYRNTDFFGLVDGLDFALQYQGKNGSVS 178
DV WTD+LPEFGGD+Y ADN+M R NG ATYRNTDFFGLVDGL+FALQYQGKN S S
Sbjct: 121 DVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNESQS 180

Query: 179 GENDG--------GRSLLNQNGDGYGGSLTYAIGEGFSVGGAITTSKRTADQNNTADARL 230
++ G + NGDG+G S TY IG GFS G A TTS RT +Q N
Sbjct: 181 ADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNA--GGT 238

Query: 231 YGNGDRATVYTGGLKYDANNIYLAAQYSQTYNATRFGTSNGNNKSDSYGFANKAQNFEVV 290
GD+A +T GLKYDANNIYLA YS+T N T +G ++ G ANK QNFEV
Sbjct: 239 IAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGY---DGGVANKTQNFEVT 295

Query: 291 AQYQFDFGLRPSVAYLQSKGKDISNGYGASYGDQDIVKYVDVGATYYFNKNMSTYVDYKI 350
AQYQFDFGLRP+V++L SKGKD++ + D+D+VKY DVGATYYFNKN STYVDYKI
Sbjct: 296 AQYQFDFGLRPAVSFLMSKGKDLTYN-NVNGDDKDLVKYADVGATYYFNKNFSTYVDYKI 354

Query: 351 NLLDKND-FTRDAGINTDDIVALGLVYQF 378
NLLD +D F +DAGI+TDDIVALG+VYQF
Sbjct: 355 NLLDDDDPFYKDAGISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_07330  PF06580  25     0.031    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 24.8 bits (54), Expect = 0.031
Identities = 11/66 (16%), Positives = 20/66 (30%), Gaps = 3/66 (4%)

Query: 1 MSPAFSSWSDFFAMGGYAFFVWLAVAMTVAPLVLLALHTVLQRRAILRGVAQQRAREARM 60
P + F V + M ++ I + A+EA++
Sbjct: 107 TKPVAFTLPLA---LSIIFNVVVVTFMWSLLYFGWHFFKNYKQAEIDQWKMASMAQEAQL 163

Query: 61 RAAQAQ 66
A +AQ
Sbjct: 164 MALKAQ 169


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
GX95_07355  HTHFIS  68     2e-15    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 67.9 bits (166), Expect = 2e-15
Identities = 24/114 (21%), Positives = 48/114 (42%), Gaps = 2/114 (1%)

Query: 9 VLIVDDHPLMRRGIRQLLELDPAFYVVAEAGDGASAIDLANRIEPDLILLDLNMKGLSGL 68
+L+ DD +R + Q L A Y V + A+ + DL++ D+ M +
Sbjct: 6 ILVADDDAAIRTVLNQALSR--AGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 69 DTLNALRRDGVTAQIIILTVSDSASDIYALIDAGADGYLLKDSDPEVLLEAIRK 122
D L +++ +++++ ++ + GA YL K D L+ I +
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


22  GX95_07565  GX95_07945  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_07565  3       17       -3.850608   methyl-galactoside ABC transporter
GX95_07570  3       17       -3.847544   galactose/methyl galactoside ABC transporter
GX95_07575  4       15       -1.784981   galactoside ABC transporter permease MglC
GX95_07580  4       17       -1.037428   dihydropyrimidine dehydrogenase subunit B
GX95_07585  4       19       -1.250364   dihydropyrimidine dehydrogenase
GX95_07590  2       18       -0.169906   hypothetical protein
GX95_07595  2       17       1.975735    vancomycin high temperature exclusion protein
GX95_07600  1       16       2.656277    cytidine deaminase
GX95_07605  -1      14       2.412179    CidB/LrgB family autolysis modulator
GX95_07610  -2      14       2.401357    hypothetical protein
GX95_07615  -2      15       3.625060    transcriptional regulator
GX95_07620  -1      16       4.031711    4-hydroxybenzoate transporter
GX95_07625  0       18       3.917858    gentisate 1,2-dioxygenase
GX95_07630  1       18       3.673766    5-carboxymethyl-2-hydroxymuconate isomerase
GX95_07635  1       18       4.112003    maleylacetoacetate isomerase
GX95_07640  1       19       3.980302    salicylate hydroxylase
GX95_07645  2       17       3.105068    tRNA dihydrouridine(16) synthase DusC
GX95_07650  1       15       2.304039    multidrug transporter permease
GX95_07655  0       15       1.369909    SDR family oxidoreductase
GX95_07660  0       14       1.406666    hypothetical protein
GX95_07665  -2      13       1.841643    hypothetical protein
GX95_07670  -1      12       2.131265    D-alanyl-D-alanine endopeptidase
GX95_07675  -2      13       2.643015    D-lactate dehydrogenase
GX95_07680  -2      15       3.132644    beta-glucosidase
GX95_07685  -2      16       3.848030    ABC transporter substrate-binding protein
GX95_07690  -1      17       2.945873    osmoprotectant uptake system permease
GX95_07695  -3      16       1.452974    ATP-binding protein
GX95_07700  -3      17       0.400145    osmoprotectant uptake system permease
GX95_07705  -1      14       -2.212454   hypothetical protein
GX95_07710  -1      13       -3.052161   transcriptional regulator
GX95_07715  -1      12       -1.903456   sensor histidine kinase
GX95_07720  0       13       -1.499539   two-component system response regulator YehT
GX95_07725  -1      14       -1.576730   hypothetical protein
GX95_07730  -1      15       -2.752269   hypothetical protein
GX95_07735  -2      14       -3.143238   hypothetical protein
GX95_07740  -1      12       -3.827466   methionine--tRNA ligase
GX95_07745  1       27       -7.114023   Fe-S-binding ATPase
GX95_07750  3       34       -9.053090   hypothetical protein
GX95_07755  1       29       -6.958525   fimbrial chaperone protein
GX95_07760  1       24       -4.566083   fimbrial assembly protein
GX95_07765  1       22       -3.094947   fimbrial assembly protein
GX95_07770  1       19       0.858027    fimbrial assembly protein
GX95_07775  2       18       4.674145    heavy metal resistance protein
GX95_07780  1       15       3.787330    hydroxyethylthiazole kinase
GX95_07785  0       15       2.524959    bifunctional hydroxymethylpyrimidine
GX95_07790  -1      15       1.862929    GntR family transcriptional regulator
GX95_07795  -1      15       2.013902    sugar kinase
GX95_07800  -1      16       0.475835    hypothetical protein
GX95_07805  -2      16       -2.804661   MFS transporter
GX95_07810  -1      26       -6.628336   fructose-bisphosphate aldolase
GX95_07815  1       33       -7.855713   lipid kinase YegS
GX95_07820  5       40       -10.126170  hypothetical protein
GX95_07825  1       25       -6.645341   hypothetical protein
GX95_07830  1       22       -6.685145   Tir chaperone family protein
GX95_07835  0       21       -6.081551   non-LEE encoded effector protein NleB
GX95_07840  0       20       -3.200173   integrase
GX95_07845  -1      20       -5.184984   transposase
GX95_07850  0       21       -7.254008   U32 family peptidase
GX95_07855  6       49       -15.238196  hypothetical protein
GX95_07860  8       58       -19.873377  hypothetical protein
GX95_07865  9       59       -20.478611  hypothetical protein
GX95_07870  11      61       -21.347014  hypothetical protein
GX95_07875  11      60       -21.614203  hypothetical protein
GX95_07880  5       47       -17.392679  hypothetical protein
GX95_07885  2       26       -9.558081   hypothetical protein
GX95_07890  1       20       -6.536827   hypothetical protein
GX95_07895  1       18       -4.266510   hypothetical protein
GX95_07900  0       15       -2.222605   hypothetical protein
GX95_07905  -1      14       0.611874    hypothetical protein
GX95_07910  -1      11       0.675879    cytoplasmic protein
GX95_07915  0       13       1.837485    two-component system response regulator BaeR
GX95_07925  -2      14       3.593978    two-component system sensor histidine kinase
GX95_07930  -1      15       4.085311    multidrug transporter subunit MdtD
GX95_07935  -1      15       4.408871    multidrug transporter subunit MdtC
GX95_07945  -1      13       3.289113    multidrug transporter subunit MdtB
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_07570  PF05272  32     0.009    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.6 bits (71), Expect = 0.009
Identities = 21/74 (28%), Positives = 28/74 (37%), Gaps = 17/74 (22%)

Query: 24 PGVKALDNVNLNVRPHSIHALMGENGAGKSTLLKCLFGIYQKDSGSIIFQGKEVDFHSAK 83
PG K D + L G G GKSTL+ L G+ F D + K
Sbjct: 591 PGCKF-DYSVV---------LEGTGGIGKSTLINTLVGLD-------FFSDTHFDIGTGK 633

Query: 84 EALENGISMVHQEL 97
++ E +V EL
Sbjct: 634 DSYEQIAGIVAYEL 647


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_07620  TCRTETB  52     3e-09    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 51.8 bits (124), Expect = 3e-09
Identities = 64/402 (15%), Positives = 140/402 (34%), Gaps = 19/402 (4%)

Query: 22 RVIICCFLVVMLDGFDTAAIGFIAPDIRTHWQLSASELAPLFGAGLLGLTAGALLCGPLA 81
+++I ++ + + PDI + + + A +L + G + G L+
Sbjct: 14 QILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLS 73

Query: 82 DRFGRKRVIELCVALFGALSLLSAFS-PDIETLVLLRFLTGLGLGGAMPNTIT-MTSEYL 139
D+ G KR++ + + S++ L++ RF+ G G A P + + + Y+
Sbjct: 74 DQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAG-AAAFPALVMVVVARYI 132

Query: 140 PARRRGALVTLMFCGFTLGSAMGGIVSAQLVPLIGWHGILALGGILPLMLFFGLLFALPE 199
P RG L+ +G +G + + I W +L + I + + F L+ L +
Sbjct: 133 PKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPF-LMKLLKK 191

Query: 200 SPRWQVRRQLPQAV---VARTVSAITGERYHDTH----------FFLHETAAVAKGSIRQ 246
R + + + V + Y + F H
Sbjct: 192 EVRIKGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPG 251

Query: 247 LFAGRQLVITLMLWVVFFMSLLIIYLLSSWMPTLLNHRGIDLQQASWVTAAFQVGGTLGA 306
L +I ++ + F ++ + +M ++ + + G
Sbjct: 252 LGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFG- 310

Query: 307 LLLGVLMDRLNPFRVLAVSYALGAVCIVMIGLSENG-LWLMALAIFGTGIGISGSQVGLN 365
+ G+L+DR P VL + +V + W M + I G+S ++ ++
Sbjct: 311 YIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVIS 370

Query: 366 ALTATLYPTQSRATGVSWSNAIGRCGAIVGSLSGGMMMALNF 407
+ ++ Q G+S N G G ++++
Sbjct: 371 TIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPL 412



Score = 41.8 bits (98), Expect = 4e-06
Identities = 40/169 (23%), Positives = 73/169 (43%), Gaps = 1/169 (0%)

Query: 251 RQLVITLMLWVVFFMSLLIIYLLSSWMPTLLNHRGIDLQQASWVTAAFQVGGTLGALLLG 310
R I + L ++ F S+L +L+ +P + N +WV AF + ++G + G
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 311 VLMDRLNPFRVLAVSYALGAVCIVMIGLSENGLWLMALAIFGTGIGISGSQVGLNALTAT 370
L D+L R+L + V+ + + L+ +A F G G + + + A
Sbjct: 71 KLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVAR 130

Query: 371 LYPTQSRATGVSWSNAIGRCGAIVGSLSGGMMM-ALNFSFDTLFFVIAI 418
P ++R +I G VG GGM+ +++S+ L +I I
Sbjct: 131 YIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITI 179


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_07655  DHBDHDRGNASE  110    2e-31    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 110 bits (276), Expect = 2e-31
Identities = 66/253 (26%), Positives = 115/253 (45%), Gaps = 12/253 (4%)

Query: 3 KVAIVTASDSGIGKACALLLAQNGFDIGITWHSDERGAQETAKKAAQFGVRVETIHLDLS 62
K+A +T + GIG+A A LA G I ++ E+ + + A+ E D+
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAE-ARHAEAFPADVR 67

Query: 63 QLPEGAQAIEHLIQRLGRVDVLVNNAGAMTKSAFIDMPFTQWRQIFTVDVDGAFLCAQIA 122
+ + + +G +D+LVN AG + + +W F+V+ G F ++
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 123 ARHMIKQGEGGRIINITSVHEHTPLPQASSYTAAKHALGGLTKSMALELIEHHILVNAVA 182
+++M+ + G I+ + S P ++Y ++K A TK + LEL E++I N V+
Sbjct: 128 SKYMMDR-RSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVS 186

Query: 183 PGAIATPM-------NDMDDSDIEPGSEP---SIPIARPGSTHEIASLVAWLCSEGASYT 232
PG+ T M + + I+ E IP+ + +IA V +L S A +
Sbjct: 187 PGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHI 246

Query: 233 TGQSLIVDGGFML 245
T +L VDGG L
Sbjct: 247 TMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_07660  BCTERIALGSPF  28     0.029    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 27.9 bits (62), Expect = 0.029
Identities = 8/39 (20%), Positives = 16/39 (41%), Gaps = 1/39 (2%)

Query: 152 WLHDLDQHLRH-GVWLILAIVLVVGVRWWLKRRGKAEAR 189
L + +R G W++LA++ + R+ K
Sbjct: 215 VLMGMSDAVRTFGPWMLLALLAGFMAFRVMLRQEKRRVS 253


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_07670  BLACTAMASEA   36     9e-05    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 36.3 bits (84), Expect = 9e-05
Identities = 33/146 (22%), Positives = 60/146 (41%), Gaps = 6/146 (4%)

Query: 11 LALMLAVPFAPQAVAKTAATTAASQPEIASGSAMI-VDLNTNKVIYSNHPDLVRPIASIT 69
++L+ +P A A + S+ +++ MI +DL + + + + D P+ S
Sbjct: 9 ISLLATLPLAVHASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRADERFPMMSTF 68

Query: 70 KLMTAMVVLDARLPLDEILKVDISQTPEMKGVYSRV---RLNSEISRKNMLLLALMSSEN 126
K++ VL DE L+ I + YS V L ++ + A+ S+N
Sbjct: 69 KVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGELCAAAITMSDN 128

Query: 127 RAAASLAHYY--PGGYNAFIKAMNAK 150
AA L P G AF++ +
Sbjct: 129 SAANLLLATVGGPAGLTAFLRQIGDN 154


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_07715  PF06580       217    4e-68    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 217 bits (555), Expect = 4e-68
Identities = 60/216 (27%), Positives = 116/216 (53%), Gaps = 3/216 (1%)

Query: 343 LGEGIAQLLSAQILAGQYERQKALLTQSEIKLLHAQVNPHFLFNALNTIKAVIRHDSEQA 402
L G + + + ++ ++++ L AQ+NPHF+FNALN I+A+I D +A
Sbjct: 134 LYFGWHFFKNYKQAEIDQWKMASMAQEAQLMALKAQINPHFMFNALNNIRALILEDPTKA 193

Query: 403 SQLVQYLSTFFRKNLKR-PSEIVTLADEIEHVNAYLQIEKARFQSRLQVQLDVPSTLSRQ 461
+++ LS R +L+ + V+LADE+ V++YLQ+ +F+ RLQ + + +
Sbjct: 194 REMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDV 253

Query: 462 KLPAFTLQPIVENAIKHGTSQLLDTGNVAIRARREGQHLMLDIEDNAGLYQPSAG-SSGL 520
++P +Q +VEN IKHG +QL G + ++ ++ + L++E+ L + S+G
Sbjct: 254 QVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGT 313

Query: 521 GMSLVDKRLREHFGDDYGISVACEPDCFTRITLRLP 556
G+ V +RL+ +G + I ++ + + +P
Sbjct: 314 GLQNVRERLQMLYGTEAQIKLSEKQGKVN-AMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_07720  HTHFIS        76     6e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 75.6 bits (186), Expect = 6e-18
Identities = 49/215 (22%), Positives = 87/215 (40%), Gaps = 19/215 (8%)

Query: 2 IKVLIVDDEPLARENLRILLQGQDDIEIVGECANAVEAIGAVHKLRPDVLFLDIQMPRIS 61
+L+ DD+ R L L V +NA + D++ D+ MP +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAG--YDVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GLEMVGMLDPEHRPYI--VFLTAFD--EYAIKAFEEHAFDYLLKPIEEKRLEKTLHRLRQ 117
+++ + + RP + + ++A + AIKA E+ A+DYL KP + L + R
Sbjct: 62 AFDLLPRIK-KARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 118 ERSKQDVSLLPENQQALKFIPCTGHSRIYLLQMDDVAFVSSRMSGVYVT--SSEGKEGFT 175
E ++ L ++Q + + G S +A + + +T S GK
Sbjct: 121 EPKRRPSKLEDDSQDGMPLV---GRSAAMQEIYRVLARLMQTDLTLMITGESGTGK---- 173

Query: 176 ELTLRTLESRTPLLRCHRQFL-VNMAHLQEIRLED 209
EL R L R + F+ +NMA + +E
Sbjct: 174 ELVARALHDYGK--RRNGPFVAINMAAIPRDLIES 206


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_07730  PF06291       28     0.010    Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 27.7 bits (61), Expect = 0.010
Identities = 12/32 (37%), Positives = 19/32 (59%)

Query: 7 MALPLFALSLSVSITGCDQKNDTLQGKQNNMT 38
M LF+ +L++ ITGC Q+ T+ K +T
Sbjct: 6 MKKMLFSAALAMLITGCAQQTFTVGNKPTAVT 37


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_07765  PF00577       679    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 679 bits (1754), Expect = 0.0
Identities = 242/829 (29%), Positives = 390/829 (47%), Gaps = 28/829 (3%)

Query: 13 AALLTSPAWAEEETFDTNFMFG-GLKGEKVSRYQIDSTKPMAGVYEMDVYVNKEWRGTYE 71
A +P + E F+ F+ +SR++ + + G Y +D+Y+N + T +
Sbjct: 35 AFAAQAPLSSAELYFNPRFLADDPQAVADLSRFE-NGQELPPGTYRVDIYLNNGYMATRD 93

Query: 72 VNIQDDPDST----CISPDLIASLGIKFTPQSTTV---ENECIALKTVVHGGSVSYDTAA 124
V C++ +AS+G+ S ++ C+ L +++H + D
Sbjct: 94 VTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIHDATAQLDVGQ 153

Query: 125 FNLYLSVPQAYVLEYEAGYASPETWDRGINAFYTSYYASEYYSHYKSGGSEKNTYANFVS 184
L L++PQA++ GY PE WD GINA +Y S + GG+ Y N S
Sbjct: 154 QRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGNSHYAYLNLQS 213

Query: 185 GLNLLGWQLHSNANFSKSDN-----SAGKWQSNTLYLERDFPAVLGTMRLGEQYTSGDMF 239
GLN+ W+L N +S + + S KWQ +LERD + + LG+ YT GD+F
Sbjct: 214 GLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLGDGYTQGDIF 273

Query: 240 DTVRFRGVRFWRDMQMLPHSKQNFAPVVRGVAQSNALVTVEQNGFIVYQKEVPPGPFVFE 299
D + FRG + D MLP S++ FAPV+ G+A+ A VT++QNG+ +Y VPPGPF
Sbjct: 274 DGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNSTVPPGPFTIN 333

Query: 300 DLQLAGGGADLDVSVKEADGTVSRFIVPYSSVPNMVQPGVAKYDFAAGRSRIEGASQQTD 359
D+ AG DL V++KEADG+ F VPYSSVP + + G +Y AG R A Q+
Sbjct: 334 DIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYRSGNAQQEKP 393

Query: 360 -FLQGTYQYGVNNLLTLYGGTMLASNYRSFTLGTGWNT-LIGAVSVDGTLSHSKQDNGDV 417
F Q T +G+ T+YGGT LA YR+F G G N +GA+SVD T ++S +
Sbjct: 394 RFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQANSTLPDDSQ 453

Query: 418 FDGESYQIAWNKYLPQSATHFSLAAYRYSSRDYRTFNDHVWANNRDNYRRDDDDIYDI-- 475
DG+S + +NK L +S T+ L YRYS+ Y F D ++ D + +
Sbjct: 454 HDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIETQDGVIQVKP 513

Query: 476 --ADYYENDFGRKNTFTLNINQTLPDGWGYFTASALWRDYWGRSGTGKDYQLSYSNTWQR 533
DYY + ++ L + Q L S + YWG S + +Q + ++
Sbjct: 514 KFTDYYNLAYNKRGKLQLTVTQQLGR-TSTLYLSGSHQTYWGTSNVDEQFQAGLNTAFED 572

Query: 534 LSFTLSATQTYDSDNRE-DKRFNIYLSIPL--TWGVKENGGNRDIHLSNSTTFDDQGYEA 590
+++TLS + T ++ + D+ + ++IP R S S + D G
Sbjct: 573 INWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLNGRMT 632

Query: 591 NNTSLSGTFGNRDQFNYTTNLS---QQRQEHQTTFGGSVTWNAPLATVGGSYSQSNQYHQ 647
N + GT + +Y+ +T ++ + YS S+ Q
Sbjct: 633 NLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIGYSHSDDIKQ 692

Query: 648 VGGNIQGGLVAWADGVHLASRLNDTIAIINAPYLEGAAVQGRPYLRTNAKGYAVFEALTP 707
+ + GG++A A+GV L LNDT+ ++ AP + A V+ + +RT+ +GYAV T
Sbjct: 693 LYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDWRGYAVLPYATE 752

Query: 708 YRQNFISLDVSESESDVALLGNRKVTVPYRGAVVVVDFETETSKPFYFLARRADGEPLTF 767
YR+N ++LD + +V L VP RGA+V +F+ + +PL F
Sbjct: 753 YRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIKLLMTLTH-NNKPLPF 811

Query: 768 GYEVEDDEGNNVGLVGQGSRVFIRTEKVPVSVKVATDKQQGLFCKITFD 816
G V + + G+V +V++ + V+V +++ C +
Sbjct: 812 GAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCVANYQ 860


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_07775  TYPE3OMGPROT  27     0.019    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 26.8 bits (59), Expect = 0.019
Identities = 10/43 (23%), Positives = 17/43 (39%), Gaps = 7/43 (16%)

Query: 3 SKLLPCALLLATSFAWAAPA-------TTGIDQYELKSFIADF 38
++L LLL +S++WA L+ + DF
Sbjct: 10 KRVLTGTLLLLSSYSWAQELDWLPIPYVYVAKGESLRDLLTDF 52


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_07780  ABC2TRNSPORT  28     0.041    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 28.0 bits (62), Expect = 0.041
Identities = 14/54 (25%), Positives = 23/54 (42%)

Query: 41 LLALGASPAMVIDPVEARPFAAIANALLVNVGTLTASRAEAMRAAVESAYDAKT 94
L LGA +++ V + A A +V +TA+ E + AA +T
Sbjct: 46 LFGLGAGLGVMVGRVGGVSYTAFLAAGMVATSAMTAATFETIYAAFGRMEGQRT 99


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_07805  TCRTETA       37     9e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 37.5 bits (87), Expect = 9e-05
Identities = 33/153 (21%), Positives = 53/153 (34%), Gaps = 20/153 (13%)

Query: 253 FSEIFFMLALPFFTKRFGIKKVLLLGLITAAIRYGFFVYGGAETYFTYALLFLGILLHGV 312
+ L + RFG + VLL+ L AA+ Y +L++G ++ G+
Sbjct: 54 LMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAP-----FLWVLYIGRIVAGI 108

Query: 313 SYDFYYVTAYIYVDKKAPVHMRTAAQGLITLCCQGFGSLLGYRLGGVMMEKMFAYPQPVN 372
+ V D R G ++ C GFG + G LGG+M P
Sbjct: 109 TGATGAVAGAYIADI-TDGDERARHFGFMS-ACFGFGMVAGPVLGGLMGGFSPHAP---- 162

Query: 373 GLTFNWAGMWTFGAVMIAVIALLFMIFFRESNK 405
+ A + + L ES+K
Sbjct: 163 ---------FFAAAALNGLNFLTGCFLLPESHK 186



Score = 34.8 bits (80), Expect = 6e-04
Identities = 49/285 (17%), Positives = 84/285 (29%), Gaps = 15/285 (5%)

Query: 29 LNKSGFSAGEIGWSYACTAIAAILSPILVGSVTDRFFSAQKVLAVLMFAGAVLMYFAAQQ 88
L S G A A+ ++G+++DRF +L L A A
Sbjct: 35 LVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAP 94

Query: 89 TTFVGFFPLLLAYSLTYMPTIALTNSIA-FANVPDVERDFPRIRVMGTIGWIASGLACGF 147
+V + ++A +T IA + + R F + G +A + G
Sbjct: 95 FLWVLYIGRIVA-GITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGL 153

Query: 148 LPQMLGYNDISPTNIPLLITAASSALLGVFAFCLPDTPPKSTGKMDIKVMLGLDALVLLR 207
+ SP + P AA + L + L K + + L A
Sbjct: 154 M------GGFSP-HAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWA 206

Query: 208 DKN------FLVFFFCSFLFAMPLAFYYIFANGYLTEVGMKNATGWMTLGQFSEIFFMLA 261
VFF + +P A + IF G + +
Sbjct: 207 RGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMI 266

Query: 262 LPFFTKRFGIKKVLLLGLITAAIRYGFFVYGGAETYFTYALLFLG 306
R G ++ L+LG+I Y + ++ L
Sbjct: 267 TGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLA 311


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_07825  DHBDHDRGNASE  26     0.046    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 25.8 bits (56), Expect = 0.046
Identities = 12/46 (26%), Positives = 19/46 (41%)

Query: 60 SGAVASVSSGAAYTTALTVLGASFGMGGIGMMGICAGLYLSANGIR 105
SG++ +V S A ++ + M C GL L+ IR
Sbjct: 136 SGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIR 181


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_07830  PF05932       86     1e-24    Tir chaperone protein (CesT)
		>PF05932#Tir chaperone protein (CesT)

Length = 127

Score = 85.6 bits (212), Expect = 1e-24
Identities = 30/118 (25%), Positives = 46/118 (38%), Gaps = 3/118 (2%)

Query: 6 DRLLRQFSLKLNTDSIVFDENRLCSFIIDNRYRI-LLTSTNSEYIMIYGFCGRPPDNNNL 64
LL FS L +VFD++ C+ IIDN + + L E +++ G P +
Sbjct: 7 KTLLDDFSRSLEMQPLVFDDHGTCNMIIDNTFALTLSCDYARERLLLIGLLE--PHKDIP 64

Query: 65 AFEFLNANLWFAENNGPHLCYDNNSQSLLLALNFSLNEGSVEKLECEIEVVIRSMENL 122
L L N GP L D S + + SV L+ E+ ++ M
Sbjct: 65 QQCLLAGALNPLLNAGPGLGLDEKSGLYHAYQSIPREKLSVPTLKREMAGLLEWMRGW 122


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_07925  HTHFIS        75     7e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 75.3 bits (185), Expect = 7e-18
Identities = 27/140 (19%), Positives = 65/140 (46%), Gaps = 2/140 (1%)

Query: 11 PRILIVEDEPKLGQLLIDYLRAASYAPTLINHGDKVLPYVRQTPPDLILLDLMLPGTDGL 70
IL+ +D+ + +L L A Y + ++ + ++ DL++ D+++P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 71 TLCREIR-RFSDIPIVMVTAKIEEIDRLLGLEIGADDYICKPYSPREVVARVKTIL-RRC 128
L I+ D+P+++++A+ + + E GA DY+ KP+ E++ + L
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 129 KPQRELQQQDAESPLMIDES 148
+ +L+ + ++ S
Sbjct: 124 RRPSKLEDDSQDGMPLVGRS 143


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_07930  BCTERIALGSPF  31     0.010    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 31.0 bits (70), Expect = 0.010
Identities = 20/66 (30%), Positives = 26/66 (39%), Gaps = 14/66 (21%)

Query: 187 RGLLAPVKRLVEGTHRLAAGDFTTRVTPTSADEL-----------GKLAQDFNQLASTLE 235
L+A V+ V H LA + P S + L G L N+LA E
Sbjct: 104 SQLMAAVRSKVMEGHSLAD---AMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTE 160

Query: 236 KNQQMR 241
+ QQMR
Sbjct: 161 QRQQMR 166


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_07935  TCRTETB       124    3e-33    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 124 bits (313), Expect = 3e-33
Identities = 95/450 (21%), Positives = 199/450 (44%), Gaps = 25/450 (5%)

Query: 20 FMQSLDTTIVNTALPSMAKSLGESPLHMHMVVVSYVLTVAVMLPASGWLADKIGVRNIFF 79
F L+ ++N +LP +A + P + V +++LT ++ G L+D++G++ +
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 80 AAIVLFTLGSLFCALSGT-LNQLVLARVLQGVGGAMMVPVGRLTVMKIVPRAQYMAAMTF 138
I++ GS+ + + + L++AR +QG G A + + V + +P+ A
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 139 VTLPGQIGPLLGPALGGVLVEYASWHWIFLINIPVGIVGAIATFM-LMPNYTIETRRFDL 197
+ +G +GPA+GG++ Y HW +L+ IP+ + + M L+ FD+
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHY--IHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201

Query: 198 PGFLLLAIGMAVLTLALDGSKSMGISPWTLAGLAAGGAAAILLYLLHAKKNSGALFSLRL 257
G +L+++G+ L + L + L+++ H +K + L
Sbjct: 202 KGIILMSVGIVFFMLFTTSYSISFLIVSVL---------SFLIFVKHIRKVTDPFVDPGL 252

Query: 258 FRTPTFSLGLLGSFAGRIGSGMLPFMTPVFLQIGLGFSPFHAG-LMMIPMVLGSMGMKRI 316
+ F +G+L M P ++ S G +++ P + + I
Sbjct: 253 GKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYI 312

Query: 317 VVQIVNRFGYRRVLVATTLGLALVSLLFMSVALL----GWYYLLPLVLLLQGMVNSARFS 372
+V+R G VL +G+ +S+ F++ + L W+ + +V +L G+ S +
Sbjct: 313 GGILVDRRGPLYVL---NIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGL--SFTKT 367

Query: 373 SMNTLTLKDLPDTLASSGNSLLSMIMQLSMSIGVTIAGMLL--GMFGQQHIGIDSSATHH 430
++T+ L A +G SLL+ LS G+ I G LL + Q+ + ++ + +
Sbjct: 368 VISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLDQRLLPMEVDQSTY 427

Query: 431 VFMYTWLCMAVIIALPAIIFARVPNDTQQN 460
++ L + II + ++ V +Q++
Sbjct: 428 LYSNLLLLFSGIIVISWLVTLNVYKHSQRD 457


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_07940  ACRIFLAVINRP  881    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 881 bits (2277), Expect = 0.0
Identities = 282/1035 (27%), Positives = 502/1035 (48%), Gaps = 36/1035 (3%)

Query: 6 LFIYRPVATILIAAAITLCGILGFRLLPVAPLPQVDFPVIMVSASLPGASPETMASSVAT 65
FI RP+ ++A + + G L LPVA P + P + VSA+ PGA +T+ +V
Sbjct: 4 FFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQ 63

Query: 66 PLERSLGRIAGVNEMTSSS-SLGSTRIILEFNFDRDINGAARDVQAAINAAQSLLPGGMP 124
+E+++ I + M+S+S S GS I L F D + A VQ + A LLP +
Sbjct: 64 VIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQ 123

Query: 125 SRPTYRKANPSDAPIMILTLTSES--WSQGKLYDFASTQLAQTIAQIDGVGDVDVGGSSL 182
+ S + +M+ S++ +Q + D+ ++ + T+++++GVGDV + G+
Sbjct: 124 -QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQY 182

Query: 183 PAVRVGLNPQALFNQGVSLDEVREAIDSANVRRPQGAIEDSV------HRWQIQTNDELK 236
A+R+ L+ L ++ +V + N + G + + I K
Sbjct: 183 -AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 237 TAAEYQPLIIHYN-NGAAVRLGDVASVTDSVQDVRNAGMTNANPAILLMIRKLPEANIIQ 295
E+ + + N +G+ VRL DVA V ++ N PA L I+ AN +
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 296 TVDGIRAKLPELRAMIPAAIDLQIAQDRSPTIRASLQEVEETLAISVALVILVVFLFLRS 355
T I+AKL EL+ P + + D +P ++ S+ EV +TL ++ LV LV++LFL++
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 356 GRATLIPAVAVPVSLIGTFAAMYLCGFSLNNLSLMALTIATGFVVDDAIVVLENIARHL- 414
RATLIP +AVPV L+GTFA + G+S+N L++ + +A G +VDDAIVV+EN+ R +
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 415 EAGMKPLQAALQGTREVGFTVISMSLSLVAVFLPLLLMGGLPGRLLREFAVTLSVAIGIS 474
E + P +A + ++ ++ +++ L AVF+P+ GG G + R+F++T+ A+ +S
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 475 LVVSLTLTPMMCGWMLKSSKPRTQPRKRGVG----RLLVALQQGYGTSLKWVLNHTRLVG 530
++V+L LTP +C +LK K G Y S+ +L T
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYL 541

Query: 531 VVFLGTVALNIWLYIAIPKTFFPEQDTGVLMGGIQADQSISFQ----AMRGKLQDFMKII 586
+++ VA + L++ +P +F PE+D GV + IQ + + + ++K
Sbjct: 542 LIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNE 601

Query: 587 RD-DPAVNNVTGFT-GGSRVNSGMMFITLKPRGER---KETAQQIIDRLRVKLAKEPGAR 641
+ +V V GF+ G N+GM F++LKP ER + +A+ +I R +++L K
Sbjct: 602 KANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGF 661

Query: 642 LFLMAVQDIRVGGRQANASYQYTLLSDSLAALREWEPKIRKALSAL-----PQLADVNSD 696
+ + I G ++ L D + + R L + L V +
Sbjct: 662 VIPFNMPAIVELGTATGFDFE---LIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 697 QQDNGAEMNLIYDRDTMSRLGIDVQAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRY 756
++ A+ L D++ LG+ + N ++ A G ++ K+ ++ D ++
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 757 SQDISALEKMFVINRDGKAIPLSYFAQWRPANAPLSVNHQGLSAASTIAFNLPTGTSLSQ 816
++K++V + +G+ +P S F + + I GTS
Sbjct: 779 RMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGD 838

Query: 817 ATEAINRTMTQLGVPPTVRGSFSGTAQVFQQTMNSQLILIVAAIATVYIVLGILYESYVH 876
A + ++L P + ++G + + + N L+ + V++ L LYES+
Sbjct: 839 AMALMENLASKL--PAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSI 896

Query: 877 PLTILSTLPSAGVGALLALELFNAPFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRSGG 936
P++++ +P VG LLA LFN + ++G++ IG+ KNAI++V+FA + G
Sbjct: 897 PVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEG 956

Query: 937 LTPEQAIFQACLLRFRPIMMTTLAALFGALPLVLSGGDGSELRQPLGITIVGGLVMSQLL 996
+A A +R RPI+MT+LA + G LPL +S G GS + +GI ++GG+V + LL
Sbjct: 957 KGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLL 1016

Query: 997 TLYTTPVVYLFFDRL 1011
++ PV ++ R
Sbjct: 1017 AIFFVPVFFVVIRRC 1031


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_07945  ACRIFLAVINRP  890    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 890 bits (2302), Expect = 0.0
Identities = 292/1036 (28%), Positives = 504/1036 (48%), Gaps = 29/1036 (2%)

Query: 13 SRLFILRPVATTLLMAAILLAGIIGYRFLPVAALPEVDYPTIQVVTLYPGASPDVMTSAV 72
+ FI RP+ +L +++AG + LPVA P + P + V YPGA + V
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 73 TAPLERQFGQMSGLKQMSSQS-SGGASVVTLQFQLTLPLDVAEQEVQAAINAATNLLPSD 131
T +E+ + L MSS S S G+ +TL FQ D+A+ +VQ + AT LLP +
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE 121

Query: 132 LPNPPIYSKVNPADPPIMTLAVTSNAMPMTQVE--DMVETRVAQKISQVSGVGLVTLAGG 189
+ I S + +M S+ TQ + D V + V +S+++GVG V L G
Sbjct: 122 VQQQGI-SVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 190 QRPAVRVKLNAQAVAALGLTSETVRTAITGANVNSAKGSLDGP------ERAVTLSANDQ 243
Q A+R+ L+A + LT V + N A G L G + ++ A +
Sbjct: 181 QY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 244 MQSADEYRRLII-AYQNGAPVRLGDVATVEQGAENSWLGAWANQAPAIVMNVQRQPGANI 302
++ +E+ ++ + +G+ VRL DVA VE G EN + A N PA + ++ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 303 IATADSIRQMLPQLTESLPKSVKVTVLSDRTTNIRASVRDTQFELMLAIALVVMIIYLFL 362
+ TA +I+ L +L P+ +KV D T ++ S+ + L AI LV +++YLFL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 363 RNIPATIIPGVAVPLSLIGTFAVMVFLDFSINNLTLMALTIATGFVVDDAIVVIENISRY 422
+N+ AT+IP +AVP+ L+GTFA++ +SIN LT+ + +A G +VDDAIVV+EN+ R
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 423 I-EKGEKPLAAALKGAGEIGFTIISLTFSLIAVLIPLLFMGDIVGRLFREFAVTLAVAIL 481
+ E P A K +I ++ + L AV IP+ F G G ++R+F++T+ A+
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 482 ISAVVSLTLTPMMCARML---SQQSLRKQNRFSRACERMFDRVIASYGRGLAKVLNHPWL 538
+S +V+L LTP +CA +L S + + F FD + Y + K+L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 539 TLSVAFATLLLSVMLWIVIPKGFFPVQDNGIIQGTLQAPQSSSYASMAQRQRQVAERILQ 598
L + + V+L++ +P F P +D G+ +Q P ++ + QV + L+
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 599 DPA--VQSLTTFVGVDGANPTLNSARLQINLKPLDARDDR---VQQVISRLQTAVATIPG 653
+ V+S+ T G + N+ ++LKP + R+ + VI R + + I
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRD 659

Query: 654 VALYLQPTQDLTIDTQVSRTQYQFTLQ---ATTLDALSHWVPKL-QNALQSLPQLSEVSS 709
++ P I + T + F L DAL+ +L A Q L V
Sbjct: 660 G--FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRP 717

Query: 710 DWQDRGLAAWVNVDRDSASRLGISMADVDNALYNAFGQRLISTIYTQANQYRVVLEHNTA 769
+ + + VD++ A LG+S++D++ + A G ++ + ++ ++ +
Sbjct: 718 NGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAK 777

Query: 770 STPGLAALETIRLTSRDGGTVPLSAIARIEQRFAPLSINHLDQFPVTTFSFNVPEGYSLG 829
++ + + S +G VP SA + + + P G S G
Sbjct: 778 FRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSG 837

Query: 830 DAVQAILDTEKTLALPADITTQFQGSTLAFQAALGSTVWLIVAAVVAMYIVLGVLYESFI 889
DA+ + + LPA I + G + + + L+ + V +++ L LYES+
Sbjct: 838 DAMALMENLAS--KLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWS 895

Query: 890 HPITILSTLPTAGVGALLALIIAGSELDIIAIIGIILLIGIVKKNAIMMIDFALAAEREQ 949
P++++ +P VG LLA + + D+ ++G++ IG+ KNAI++++FA ++
Sbjct: 896 IPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKE 955

Query: 950 GMSPRDAIFQACLLRFRPILMTTLAALLGALPLMLSTGVGAELRRPLGIAMVGGLLVSQV 1009
G +A A +R RPILMT+LA +LG LPL +S G G+ + +GI ++GG++ + +
Sbjct: 956 GKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATL 1015

Query: 1010 LTLFTTPVIYLLFDRL 1025
L +F PV +++ R
Sbjct: 1016 LAIFFVPVFFVVIRRC 1031


23  GX95_08005  GX95_08135  Y        Y        Y        Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_08005  2       16       -2.954218   colanic acid biosynthesis glycosyltransferase
GX95_08010  1       16       -1.751147   colanic acid biosynthesis acetyltransferase
GX95_08015  0       16       -0.811127   colanic acid biosynthesis glycosyltransferase
GX95_08020  0       17       -0.907573   putative colanic acid polymerase WcaD
GX95_08025  0       20       2.576657    colanic acid biosynthesis glycosyltransferase
GX95_08030  -1      25       4.600007    colanic acid biosynthesis acetyltransferase
GX95_08035  -1      32       5.917798    GDP-mannose 4,6-dehydratase
GX95_08040  -1      28       5.220565    GDP-fucose synthetase
GX95_08045  -1      25       3.653988    GDP-mannose mannosyl hydrolase
GX95_08050  -1      21       2.869178    colanic acid biosynthesis glycosyltransferase
GX95_08055  -1      21       2.730950    mannose-1-phosphate
GX95_08060  0       16       1.056507    phosphomannomutase
GX95_08065  -1      11       -1.642421   undecaprenyl-phosphate glucose
GX95_08070  -1      11       -1.983343   colanic acid exporter
GX95_08075  0       18       -6.406892   colanic acid biosynthesis pyruvyl transferase
GX95_08080  1       27       -10.359720  colanic acid biosynthesis glycosyltransferase
GX95_08085  3       40       -15.249036  colanic acid biosynthesis protein WcaM
GX95_08090  6       52       -19.271504  UDP-N-acetylglucosamine 4-epimerase
GX95_08095  9       58       -21.398509  GalU regulator GalF
GX95_08100  12      69       -24.569125  hypothetical protein
GX95_08105  13      67       -23.106251  hypothetical protein
GX95_08110  12      63       -21.181942  hypothetical protein
GX95_08115  8       49       -16.188056  hypothetical protein
GX95_08120  6       40       -14.027743  hypothetical protein
GX95_08125  4       32       -10.883136  amylovoran biosynthesis protein AmsE
GX95_08130  2       24       -8.253492   UDP-glucose 4-epimerase GalE
GX95_08135  -1      14       -5.137243   ISAs1 family transposase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_08035  NUCEPIMERASE  107    2e-28    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 107 bits (268), Expect = 2e-28
Identities = 81/361 (22%), Positives = 127/361 (35%), Gaps = 58/361 (16%)

Query: 6 LITGVTGQDGSYLAEFLLEKGYEVHGIKRRASSFNTERVDHIYQDPH--------SCNPK 57
L+TG G G ++++ LLE G++V GI + N Y D P
Sbjct: 4 LVTGAAGFIGFHVSKRLLEAGHQVVGI----DNLND------YYDVSLKQARLELLAQPG 53

Query: 58 FHLHYGDLTDASNLTRILQEVQPDEVYNLGAMSHVAVSFESPEYTADVDAMGTLRLLEAI 117
F H DL D +T + + V+ V S E+P AD + G L +LE
Sbjct: 54 FQFHKIDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGC 113

Query: 118 RFLGLEKKTRFYQASTSELYGLVQEIPQKETTPF-YPRSPYAVAKLYAYWITVNYRESYG 176
R ++ AS+S +YGL +++P +P S YA K + Y YG
Sbjct: 114 RHNKIQ---HLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYG 170

Query: 177 IYACNGILFNHESPRRGETFVTRKITRAIANIAQGLESCLYLGNMDSLRDWGHAKDYVRM 236
+ A F P K T+A+ G +Y RD+ + D
Sbjct: 171 LPATGLRFFTVYGPWGRPDMALFKFTKAMLE---GKSIDVY-NYGKMKRDFTYIDD---- 222

Query: 237 QWMMLQQEQPEDFVIATGVQYSVRQFVELAAAQLGIKLRFEGEGINEKGIVVSVTGHDAP 296
IA + +R + A + + V G+ +P
Sbjct: 223 --------------IAEAI---IRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSP 265

Query: 297 GVKPGDVIVAV--------DPRY--FRPAEVETLLGDPSKAHEKLGWKPEITLSEMVSEM 346
V+ D I A+ +P +V D +E +G+ PE T+ + V
Sbjct: 266 -VELMDYIQALEDALGIEAKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNF 324

Query: 347 V 347
V
Sbjct: 325 V 325


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_08040  NUCEPIMERASE  88     7e-22    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 87.9 bits (218), Expect = 7e-22
Identities = 64/344 (18%), Positives = 128/344 (37%), Gaps = 47/344 (13%)

Query: 5 RIFVAGHRGMVGSAIVRQLAQRG-------------DVEL------VLRTRD----ELDL 41
+ V G G +G + ++L + G DV L +L ++DL
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 42 LDGRAVQAFFAGAGIDQVYLAAAKVGGIVANNTYPADFIYENMMIESNIIHAAHLHNVNK 101
D + FA ++V+++ + + + P + N+ NI+ + +
Sbjct: 62 ADREGMTDLFASGHFERVFISPHR-LAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 102 LLFLGSSCIYPKLARQPMAESELLQGTLEPTNEPYAIAKIAGIKLCESYNRQYGRDYRSV 161
LL+ SS +Y + P + P + YA K A + +Y+ YG +
Sbjct: 121 LLYASSSSVYGLNRKMPFSTD---DSVDHPVS-LYAATKKANELMAHTYSHLYGLPATGL 176

Query: 162 MPTNLYGPHDNFHPDNSHVIPALLRRFHEAAQSHAPEVVVWGSGTPMREFLHVDDMAAAS 221
+YGP PD AL + + + +V + G R+F ++DD+A A
Sbjct: 177 RFFTVYGPWGR--PDM-----ALFKFTKAMLEGKSIDV--YNYGKMKRDFTYIDDIAEAI 227

Query: 222 IHVMELA----REVWQENTAPMLSH-----INVGTGVDCTIRELAQTIAKVVGYQGRVVF 272
I + ++ + E P S N+G + + Q + +G + +
Sbjct: 228 IRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNM 287

Query: 273 DAAKPDGTPRKLLDVTRLHQ-LGWYHEISLEAGLAGTYQWFLEN 315
+P D L++ +G+ E +++ G+ W+ +
Sbjct: 288 LPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDF 331


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_08090  NUCEPIMERASE  94     4e-24    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 94.1 bits (234), Expect = 4e-24
Identities = 73/333 (21%), Positives = 123/333 (36%), Gaps = 60/333 (18%)

Query: 4 NVLLIGASGFVGT----RLLE-----TAVDDFN-----------IKNLDKQQSHFYPEIT 43
L+ GA+GF+G RLLE +D+ N ++ L + F+
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHK--- 58

Query: 44 RIGDVRDQQILDQTLA--GFDTVVLLAAEH--RDDVSPTSLYYDVNVQGTRNVLAAMEKN 99
D+ D++ + A F+ V + R + Y D N+ G N+L N
Sbjct: 59 --IDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHN 116

Query: 100 GVKNIIFTSSVAVYGLNKKNP-DETHPHD-PFNHYGKSKWQAEEVLREWHA--KAPNERS 155
++++++ SS +VYGLN+K P D P + Y +K E + + P
Sbjct: 117 KIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLP---- 172

Query: 156 LTIIRPTVIFGERNRGN--VYNLLKQIAGGKFAMV-GPGTNYKSMAYVGNIVEFIKFKLK 212
T +R ++G R + ++ K + GK V G + Y+ +I E I
Sbjct: 173 ATGLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQD 232

Query: 213 N-----------------VTAGYEVYNYVDKPDLNMNQLVAEVEQSLGKKIPSMHLPYPL 255
A Y VYN + + + + +E +LG + LP
Sbjct: 233 VIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLPLQP 292

Query: 256 GMLGGYCFDI--LSKVTGKKYAVS-SVRVKKFC 285
G + D L +V G + VK F
Sbjct: 293 GDVLETSADTKALYEVIGFTPETTVKDGVKNFV 325


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_08130  NUCEPIMERASE  183    2e-57    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 183 bits (465), Expect = 2e-57
Identities = 81/351 (23%), Positives = 153/351 (43%), Gaps = 39/351 (11%)

Query: 1 MAILVTGGAGYIGTHTIISLLDKGYDIVVIDNFSNSSKDAL---TQVEKISAKKINFYHG 57
M LVTG AG+IG H LL+ G+ +V IDN N D ++E ++ F+
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNL-NDYYDVSLKQARLELLAQPGFQFHKI 59

Query: 58 DIRDRHVLKDIFSQNSISDVIHFAGLKAVGESVTSPLKYYDNNISGTLCLLNEMLLFNVK 117
D+ DR + D+F+ V AV S+ +P Y D+N++G L +L ++
Sbjct: 60 DLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQ 119

Query: 118 SIIFSSSATVYGVPKTLPIKESDPLGQITNPYGRTKIMAENILKDLTKAIPDFRATILRY 177
++++SS++VYG+ + +P D + + Y TK E + + + AT LR+
Sbjct: 120 HLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSH-LYGLPATGLRF 178

Query: 178 FNPVGAHPSGLIGENPKGIPN-NLFPVISQTIAGRNQSVNIFGSDYATPDGTGIRDYIHV 236
F G P G P+ LF + G+ S++++ G RD+ ++
Sbjct: 179 FTVYG----------PWGRPDMALFKFTKAMLEGK--SIDVYN------YGKMKRDFTYI 220

Query: 237 MDLAAGHFSALDKQREGKN---------------FKVYNLGTGKGYSVLQIIKEFENQIN 281
D+A D ++VYN+G ++ I+ E+ +
Sbjct: 221 DDIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALG 280

Query: 282 KKIKIDLCPRREGDIAQCWSSPDLALEELSWKANLSLEDMIRDTLNWLSKY 332
+ K ++ P + GD+ + + E + + +++D +++ +NW +
Sbjct: 281 IEAKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDF 331


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_08135  adhesinmafb   29     0.028    Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 29.3 bits (65), Expect = 0.028
Identities = 11/49 (22%), Positives = 19/49 (38%), Gaps = 6/49 (12%)

Query: 296 ASAMRSHWDVENKLHWRLDVAMNEDDCRIRRGNSAELFSGIRHIAVNIL 344
S D N+ + + ++ R GNS E +G+ A+N
Sbjct: 197 GSNFSDRADEANRKMFEHNAKLD------RWGNSMEFINGVAAGALNPF 239


24  GX95_08255  GX95_08540  Y        Y        Y        Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_08255  -2      18       3.708164    GHMP kinase
GX95_08260  -2      22       4.662317    acetate kinase
GX95_08265  -1      26       6.012623    ethanolamine utilization protein EutP
GX95_08270  0       27       6.628387    propanediol utilization protein
GX95_08275  1       27       6.908885    propanediol utilization protein
GX95_08280  2       26       7.103481    NADH dehydrogenase
GX95_08285  3       25       6.702769    propanediol utilization protein
GX95_08290  3       23       6.167029    aldehyde dehydrogenase EutE
GX95_08295  2       24       6.272262    ATP:cob(I)alamin adenosyltransferase
GX95_08300  3       23       5.296167    ethanolamine utilization protein EutN
GX95_08305  1       22       5.702269    microcompartment protein PduM
GX95_08310  0       21       5.095836    propanediol utilization protein
GX95_08315  -1      24       4.718093    microcompartment protein
GX95_08320  0       29       4.513238    ethanolamine utilization protein EutM
GX95_08325  1       29       4.942001    propanediol dehydratase
GX95_08330  0       29       4.760601    diol dehydratase reactivase subunit alpha
GX95_08335  -1      23       2.835979    propanediol dehydratase
GX95_08340  -2      17       1.043237    propanediol dehydratase
GX95_08345  -2      14       0.929160    propanediol dehydratase
GX95_08350  -1      13       1.062673    microcompartment protein PduB
GX95_08355  0       17       0.196493    ethanolamine utilization protein EutM
GX95_08360  0       16       0.590540    aquaporin
GX95_08365  0       15       1.442992    hypothetical protein
GX95_08370  0       16       3.076734    cobyrinic acid a,c-diamide synthase
GX95_08375  -1      15       3.548152    adenosylcobinamide-phosphate synthase
GX95_08380  -2      16       2.740933    precorrin-8X methylmutase
GX95_08385  -2      14       2.941894    cobalt-precorrin-5B (C(1))-methyltransferase
GX95_08390  -3      15       3.515357    precorrin-6y C5,15-methyltransferase
GX95_08395  -2      14       3.250015    precorrin-6Y C5,15-methyltransferase
GX95_08400  -2      13       3.467258    precorrin-4 C(11)-methyltransferase
GX95_08405  -1      15       2.979070    cobalamin biosynthesis protein CbiG
GX95_08410  0       15       2.761819    precorrin-3B C(17)-methyltransferase
GX95_08415  0       17       2.385409    cobalt-precorrin-6A reductase
GX95_08420  0       16       1.282169    sirohydrochlorin cobaltochelatase
GX95_08425  1       18       1.369836    precorrin-2 C(20)-methyltransferase
GX95_08430  2       21       0.823607    cobalamin biosynthesis protein CbiM
GX95_08435  2       21       1.188723    cobalt ABC transporter substrate-binding protein
GX95_08440  1       19       2.499444    cobalt ECF transporter T component CbiQ
GX95_08445  1       18       2.023032    energy-coupling factor ABC transporter
GX95_08450  1       17       2.139056    cobyric acid synthase CobQ
GX95_08455  1       18       1.568536    bifunctional adenosylcobinamide
GX95_08460  1       19       0.822627    adenosylcobinamide-GDP ribazoletransferase
GX95_08465  0       20       -0.558600   nicotinate-nucleotide--dimethylbenzimidazole
GX95_08470  2       24       -3.264442   L,D-transpeptidase
GX95_08475  0       23       -2.299395   *FMN/FAD transporter
GX95_08485  0       20       -2.178361   *acyl carrier protein
GX95_08495  -1      24       -3.289511   cytoplasmic protein
GX95_08500  -1      22       -2.716679   AMP nucleosidase
GX95_08505  -1      21       -0.800004   **protein MtfA
GX95_08510  0       20       -0.865213   *hypothetical protein
GX95_08525  2       20       -2.323058   DNA polymerase V subunit UmuD
GX95_08535  2       23       -2.009636   DNA polymerase V subunit UmuC
GX95_08540  2       20       -1.281402   cold-shock protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_08260  ACETATEKNASE  582  0.0  Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 582 bits (1502), Expect = 0.0
Identities = 201/395 (50%), Positives = 279/395 (70%), Gaps = 5/395 (1%)

Query: 4 KIMAINAGSSSLKFQLLEMPQGDMLCQGLIERIGMADAQVTIKTHSQKWQETVPVADHRD 63
KI+ IN GSSSLK+QL+E G++L +GL ERIG+ D+ +T + +K + + DH+D
Sbjct: 2 KILVINCGSSSLKYQLIESKDGNVLAKGLAERIGINDSLLTHNANGEKIKIKKDMKDHKD 61

Query: 64 AVTLLLEKLLG--YQIINSLRDIDGVGHRVAHGGEFFKDSTLVTDETLAQIERLAELAPL 121
A+ L+L+ L+ Y +I + +ID VGHRV HGGE+F S L+TD+ L I ELAPL
Sbjct: 62 AIKLVLDALVNSDYGVIKDMSEIDAVGHRVVHGGEYFTSSVLITDDVLKAITDCIELAPL 121

Query: 122 HNPVNALGIHVFRQLLPDAPSVAVFDTAFHQTLDEPAYIYPLPWHYYAELGIRRYGFHGT 181
HNP N GI Q++PD P VAVFDTAFHQT+ + AY+YP+P+ YY + IR+YGFHGT
Sbjct: 122 HNPANIEGIKACTQIMPDVPMVAVFDTAFHQTMPDYAYLYPIPYEYYTKYKIRKYGFHGT 181

Query: 182 SHKYVSGVLAEKLGVPLSALRVICCHLGNGSSICAIKNGRSVNTSMGFTPQSGVMMGTRS 241
SHKYVS AE L P+ +L++I CHLGNGSSI A+KNG+S++TSMGFTP G+ MGTRS
Sbjct: 182 SHKYVSQRAAEILNKPIESLKIITCHLGNGSSIAAVKNGKSIDTSMGFTPLEGLAMGTRS 241

Query: 242 GDIDPSILPWIAQRENKTPQQLNQLLNNESGLLGVSGVSSDYRDVEQAA-NTGNRQAKLA 300
G IDPSI+ ++ ++EN + +++ +LN +SG+ G+SG+SSD+RD+E AA G+++A+LA
Sbjct: 242 GSIDPSIISYLMEKENISAEEVVNILNKKSGVYGISGISSDFRDLEDAAFKNGDKRAQLA 301

Query: 301 LTLFAERIRATIGSYIMQMGGLDALVFTGGIGENSARARSAVCHNLQFLGLAVDEEKNQR 360
L +FA R++ TIGSY MGG+D +VFT GIGEN R + L+FLG +D+EKN+
Sbjct: 302 LNVFAYRVKKTIGSYAAAMGGVDVIVFTAGIGENGPEIREFILDGLEFLGFKLDKEKNKV 361

Query: 361 NA--TFIQTENALVKVAVINTNEELMIAQDVMRVA 393
I T ++ V V V+ TNEE MIA+D ++
Sbjct: 362 RGEEAIISTADSKVNVMVVPTNEEYMIAKDTEKIV 396


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_08285  BONTOXILYSIN  31  0.010  Bontoxilysin signature.
		>BONTOXILYSIN#Bontoxilysin signature.

Length = 1196

Score = 30.6 bits (69), Expect = 0.010
Identities = 8/39 (20%), Positives = 17/39 (43%)

Query: 190 SDFTDALAEKAAKLVFQYLPTAVEKGDCVATRGKMHNAS 228
SDF+ ++ K LV+ +L + + + G +
Sbjct: 518 SDFSKVVSSKDKSLVYSFLDNLMSYLETIKNDGPIDTDK 556


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_08435  ANTHRAXTOXNA  27  0.017  Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 26.6 bits (58), Expect = 0.017
Identities = 14/32 (43%), Positives = 21/32 (65%), Gaps = 1/32 (3%)

Query: 39 QAIAPQYKPWFQPLYEPASGEIESLLFTLQGS 70
+ IAP+YK +FQ L E + +++ LL T Q S
Sbjct: 739 KQIAPEYKNYFQYLKERITNQVQ-LLLTHQKS 769


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_08510  PF03627  37  1e-04  PapG
		>PF03627#PapG

Length = 336

Score = 36.9 bits (85), Expect = 1e-04
Identities = 22/93 (23%), Positives = 34/93 (36%), Gaps = 8/93 (8%)

Query: 327 DDHVLDAVLPPDIP-------IPSIAEVQRALYDATKAVSGMPGEEVKQRLRTGTVVTTD 379
DD + LP D+P IP + +QR A +P K R ++
Sbjct: 152 DDIIFKVALPADLPLGDYSVTIPYTSGMQRHFASYLGARFKIPYNVAKTLPRENEMLFLF 211

Query: 380 DRNWELRYSASALRFNLSRAVAIDMESATIAAQ 412
R SA +L ++I+ + AAQ
Sbjct: 212 KNIGGCRPSAQSLEIKHGD-LSINSANNHYAAQ 243


25  GX95_08600  GX95_08665  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_08600  0  16  -3.788368  mannosyl-3-phosphoglycerate phosphatase
GX95_08605  0  16  -4.466251  hypothetical protein
GX95_08610  1  16  -2.827981  hypothetical protein
GX95_08615  0  15  -2.507885  helix-turn-helix transcriptional regulator
GX95_08620  -1  13  0.471899  flagellar biosynthetic protein FliR
GX95_08625  -2  15  1.269786  flagellar export apparatus protein FliQ
GX95_08630  -2  16  3.321279  flagellar biosynthetic protein FliP
GX95_08635  0  15  3.262689  flagellar biosynthetic protein FliO
GX95_08640  -2  15  3.767225  flagellar motor switch protein FliN
GX95_08645  -1  16  4.437560  flagellar motor switch protein FliM
GX95_08650  1  15  4.748507  flagellar basal body-associated protein FliL
GX95_08655  0  12  4.682470  flagellar hook-length control protein FliK
GX95_08660  -1  13  3.953351  flagellar biosynthesis chaperone FliJ
GX95_08665  -2  12  3.503723  flagellum-specific ATP synthase FliI
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_08620  TYPE3IMRPROT  213  5e-71  Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 213 bits (543), Expect = 5e-71
Identities = 231/260 (88%), Positives = 246/260 (94%)

Query: 1 MIQVTSEQWLYWLHLYFWPLLRVLALISTAPILSERAIPKRVKLGLGIMITLVIAPSLPA 60
M+QVTSEQWL WL+LYFWPLLRVLALISTAPILSER++PKRVKLGL +MIT IAPSLPA
Sbjct: 1 MLQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60

Query: 61 NDTPLFSIAALWLAMQQILIGIALGFTMQFAFAAVRTAGEFIGLQMGLSFATFVDPGSHL 120
ND P+FS ALWLA+QQILIGIALGFTMQFAFAAVRTAGE IGLQMGLSFATFVDP SHL
Sbjct: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120

Query: 121 NMPVLARIMDMLAMLLFLTFNGHLWLISLLVDTFHTLPIGSNPVNSNAFMALARAGGLIF 180
NMPVLARIMDMLA+LLFLTFNGHLWLISLLVDTFHTLPIG P+NSNAF+AL +AG LIF
Sbjct: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180

Query: 181 LNGLMLALPVITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGIMLMAALMPLIAPFC 240
LNGLMLALP+ITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGI LMAALMPLIAPFC
Sbjct: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240

Query: 241 EHLFSEIFNLLADIVSEMPI 260
EHLFSEIFNLLADI+SE+P+
Sbjct: 241 EHLFSEIFNLLADIISELPL 260


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_08625  TYPE3IMQPROT  67  1e-18  Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 67.5 bits (165), Expect = 1e-18
Identities = 23/78 (29%), Positives = 42/78 (53%)

Query: 4 ESVMMMGTEAMKVALALAAPLLLVALITGLIISILQAATQINEMTLSFIPKIVAVFIAII 63
+ ++ G +A+ + L L+ +VA I GL++ + Q TQ+ E TL F K++ V + +
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 VAGPWMLNLLLDYVRTLF 81
+ W +LL Y R +
Sbjct: 62 LLSGWYGEVLLSYGRQVI 79


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_08630  FLGBIOSNFLIP  330  e-117  Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 330 bits (847), Expect = e-117
Identities = 225/245 (91%), Positives = 233/245 (95%)

Query: 1 MRRLLFLSLAGLWLFSPAAAAQLPGLISQPLAGGGQSWSLSVQTLVFITSLTFLPAILLM 60
MRRLL ++ LWL +P A AQLPG+ SQPL GGGQSWSL VQTLVFITSLTF+PAILLM
Sbjct: 1 MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM 60

Query: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEQK 120
MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSE+K
Sbjct: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK 120

Query: 121 ISMQEALDKGAQPLRAFMLRQTREADLALFARLANSGPLQGPEAVPMRILLPAYVTSELK 180
ISMQEAL+KGAQPLR FMLRQTREADL LFARLAN+GPLQGPEAVPMRILLPAYVTSELK
Sbjct: 121 ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK 180

Query: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240
TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA
Sbjct: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240

Query: 241 QSFYS 245
QSFYS
Sbjct: 241 QSFYS 245


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_08640  FLGMOTORFLIN  209  2e-73  Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 209 bits (534), Expect = 2e-73
Identities = 136/137 (99%), Positives = 136/137 (99%)

Query: 1 MSDMNNPSDENTGALDDLWADALNEQKATTNKSAADAVFQQLGGGDVSGAMQDIDLIMDI 60
MSDMNNPSDENTGALDDLWADALNEQKATT KSAADAVFQQLGGGDVSGAMQDIDLIMDI
Sbjct: 1 MSDMNNPSDENTGALDDLWADALNEQKATTTKSAADAVFQQLGGGDVSGAMQDIDLIMDI 60

Query: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120
PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV
Sbjct: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120

Query: 121 RITDIITPSERMRRLSR 137
RITDIITPSERMRRLSR
Sbjct: 121 RITDIITPSERMRRLSR 137


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_08645  FLGMOTORFLIM  384  e-136  Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 384 bits (987), Expect = e-136
Identities = 86/324 (26%), Positives = 148/324 (45%), Gaps = 10/324 (3%)

Query: 5 ILSQAEIDALLNGDS--DTKDEPTPGIASDSDIRPYDPNTQRRVVRERLQALEIINERFA 62
+LSQ EID LL S D E I+ I YD + +E+++ L +++E FA
Sbjct: 4 VLSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHETFA 63

Query: 63 RQFRMGLFNLLRRSPDITVGAIRIQPYHEFARNLPVPTNLNLIHLKPLRGTGLVVFSPSL 122
R L LR + V ++ Y EF R++P P+ L +I + PL+G ++ PS+
Sbjct: 64 RLTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEVDPSI 123

Query: 123 VFIAVDNLFGGDGRFPTKVEGREFTHTEQRVINRMLKLALEGYSDAWKAINPLEVEYVRS 182
F +D LFGG G+ KV+ R+ T E V+ ++ L ++W + L +
Sbjct: 124 TFSIIDRLFGGTGQ-AAKVQ-RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRLGQI 181

Query: 183 EMQVKFTNITTSPNDIVVNTPFHVEIGNLTGEFNICLPFSMIEPLRELLVNPPLENS--R 240
E +F I P+++VV ++G G N C+P+ IEP+ L + +S R
Sbjct: 182 ETNPQFAQI-VPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSVRR 240

Query: 241 HEDQNWRDNLVRQVQHSELELVANFADIPLRLSQILKLKPGDVLPIEKP---DRIIAHVD 297
+ L ++ ++++VA + L + IL L+ GD++ + D + +
Sbjct: 241 SSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPFVLSIG 300

Query: 298 GVPVLTSQYGTVNGQYALRVEHLI 321
Q G V + A ++ I
Sbjct: 301 NRKKFLCQPGVVGKKIAAQILERI 324


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_08655  FLGHOOKFLIK  405  e-142  Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 405 bits (1041), Expect = e-142
Identities = 193/411 (46%), Positives = 232/411 (56%), Gaps = 40/411 (9%)

Query: 1 MITLPQLITTDTDMTAGLTSGKTTGSAEDFLALLAGALGADGAQGKDARITLADLQAAGG 60
MI L LIT D D T L GK + +A+DFLALL+ AL + K A L
Sbjct: 1 MIRLAPLITADVDTTT-LPGGKASDAAQDFLALLSEALAGETTTDKAAPQLL-------- 51

Query: 61 KLSKGLLTQHGEPGQAVKLADLLAQKAN---ATDETLTDLTQAQHLLSTLTPSLKTSALA 117
++ T GEP + ++D AQ+AN DET + Q + LT + + A
Sbjct: 52 -VATDKPTTKGEPLISDIVSD--AQQANLLIPVDETPPVINDEQSTSTPLTTAQTMALAA 108

Query: 118 ALSKTAQHDEKTPALSDEDLASLSALFAMLPGQPVATPVAGETPAENHIALPSLLRGDMP 177
K DEK L+++ ASLSALFAMLPG V D P
Sbjct: 109 VADKNTTKDEKADDLNEDVTASLSALFAMLPGFDNTPKVT-----------------DAP 151

Query: 178 SAPQEETHTLSFSEHEKGKTEASLTRASDDRATGPVLTPLVVAAAATSAKVEVDSPSAPV 237
S F++ T LT A D A G PL A +K EV S +PV
Sbjct: 152 STVLPTEKPTLFTK----LTSEQLTTAQPDDAPGTPAQPLTPLVAEAQSKAEVISTPSPV 207

Query: 238 THGAAMPTLSSATAQPQPLPVASAPVLSAPLGSHEWQQTFSQQVMLFTRQGQQSAQLRLH 297
T AA P ++ QP LP +APVLSAPLGSHEWQQ+ SQ + LFTRQGQQSA+LRLH
Sbjct: 208 T-AAASPLITPHQTQP--LPTVAAPVLSAPLGSHEWQQSLSQHISLFTRQGQQSAELRLH 264

Query: 298 PEELGQVHISLKLDDNQAQLQMVSPHSHVRAALEAALPMLRTQLAESGIQLGQSSISSES 357
P++LG+V ISLK+DDNQAQ+QMVSPH HVRAALEAALP+LRTQLAESGIQLGQS+IS ES
Sbjct: 265 PQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAALPVLRTQLAESGIQLGQSNISGES 324

Query: 358 FAGQQQ-SSSQQQSSRAQHTDAFGAEDDIALAAPASLQAAARGNGAVDIFA 407
F+GQQQ +S QQQS R + + EDD L P SLQ GN VDIFA
Sbjct: 325 FSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVSLQGRVTGNSGVDIFA 375


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_08660  FLGFLIJ  206  4e-72  Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 206 bits (526), Expect = 4e-72
Identities = 130/147 (88%), Positives = 138/147 (93%)

Query: 1 MAQHGALETLKDLAEKEVDDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRSNLNTDMGNG 60
MA+HGAL TLKDLAEKEV+DAARLLGEMRRGCQQAEEQLKMLIDYQNEYR+NLN+DM G
Sbjct: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60

Query: 61 IASNRWINYQQFIQTLEKAIEQHRLQLTQWTQKVDLALKSWREKKQRLQAWQTLQDRQTA 120
I SNRWINYQQFIQTLEKAI QHR QL QWTQKVD+AL SWREKKQRLQAWQTLQ+RQ+
Sbjct: 61 ITSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120

Query: 121 AALLAENRMDQKKMDEFAQRAAMRKPE 147
AALLAENR+DQKKMDEFAQRAAMRKPE
Sbjct: 121 AALLAENRLDQKKMDEFAQRAAMRKPE 147


26  GX95_09150  GX95_09295  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_09150  2  23  -2.362272  hypothetical protein
GX95_09155  2  23  -2.291252  hypothetical protein
GX95_09160  4  25  -2.682707  hypothetical protein
GX95_09165  5  27  -3.766950  hypothetical protein
GX95_09170  4  28  -3.965143  integrase
GX95_09175  6  24  -3.075502  lytic enzyme
GX95_09180  4  25  -3.514033  hypothetical protein
GX95_09185  4  24  -3.067973  lytic enzyme
GX95_09190  4  22  -1.945651  hypothetical protein
GX95_09195  4  24  -0.555217  DNA breaking-rejoining protein
GX95_09200  3  26  -0.267421  phage tail protein
GX95_09205  3  33  -1.566383  hypothetical protein
GX95_09210  4  29  -1.777482  phage tail protein
GX95_09215  3  37  -4.893684  PagK
GX95_09220  4  40  -8.035825  arsenic transporter
GX95_09225  7  47  -11.188946  disulfide bond formation protein B
GX95_09230  5  43  -9.711865  hypothetical protein
GX95_09235  4  40  -8.612873  multidrug DMT transporter permease
GX95_09240  2  37  -8.155080  pilus assembly protein
GX95_09245  2  36  -7.577005  hypothetical protein
GX95_09250  1  38  -7.436725  hypothetical protein
GX95_09260  4  31  -7.339316  hypothetical protein
GX95_09265  5  29  -6.555990  hypothetical protein
GX95_09270  5  34  -10.000305  N-acetyltransferase
GX95_09275  4  35  -10.231324  hypothetical protein
GX95_09280  4  36  -9.843687  cytoplasmic protein
GX95_09285  2  33  -8.145481  type III secretion protein SopE2
GX95_09290  0  32  -7.449193  hypothetical protein
GX95_09295  0  31  -7.835651  serine/threonine-protein phosphatase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_09240  PilS_PF08805  27  0.032  PilS N terminal
		>PilS_PF08805#PilS N terminal

Length = 185

Score = 27.2 bits (60), Expect = 0.032
Identities = 5/34 (14%), Positives = 14/34 (41%), Gaps = 2/34 (5%)

Query: 112 WTLITSI--LIIIAVAVVLAISSMNVAFRSLNIN 143
TL+ + + +I V A ++ ++ +
Sbjct: 28 ATLMEVLLVVGVIVVLAASAYKLYSMVQSNIQSS 61


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_09285  SACTRNSFRASE  28  0.011  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 28.0 bits (62), Expect = 0.011
Identities = 13/60 (21%), Positives = 27/60 (45%), Gaps = 2/60 (3%)

Query: 60 WLCIDYLWVSESARSNGLGSKLMEMAEKEGLRKGCVHGLVDTFSFQ--ALPFYEKQGYIL 117
+ I+ + V++ R G+G+ L+ A + +++T A FY K +I+
Sbjct: 89 YALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFII 148


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_09300  SOPEPROTEIN  397  e-144  Salmonella type III secretion SopE effector protein ...
		>SOPEPROTEIN#Salmonella type III secretion SopE effector protein signature.

Length = 239

Score = 397 bits (1020), Expect = e-144
Identities = 163/237 (68%), Positives = 192/237 (81%)

Query: 2 TNITLSTQHYRIHRSDVEPVKEKTTEKDIFAKSITAVRNSFISLSTSLSDRFSLHLQTDI 61
T ITLS Q++RI + + +KEK+TEK+ AKSI AV+N FI L + LS+RF H T+
Sbjct: 1 TKITLSPQNFRIQKQETTLLKEKSTEKNSLAKSILAVKNHFIELRSKLSERFISHKNTES 60

Query: 62 PTTHFHRGSASEGRAVLTSKTVKDFMLQKLNSLDIKGNASKDPAYARQTCEAILAAVYSN 121
THFHRGSASEGRAVLT+K VKDFMLQ LN +DI+G+ASKDPAYA QT EAIL+AVYS
Sbjct: 61 SATHFHRGSASEGRAVLTNKVVKDFMLQTLNDIDIRGSASKDPAYASQTREAILSAVYSK 120

Query: 122 NKDQCCKLLISKGNSITPFLKEIGEAAQNAGLPGEMKNGVFTPGGAGANPFVVPLIAAAS 181
NKDQCC LLISKG +I PFL+EIGEAA+NAGLPG KN VFTP GAGANPF+ PLI++A+
Sbjct: 121 NKDQCCNLLISKGINIAPFLQEIGEAAKNAGLPGTTKNDVFTPSGAGANPFITPLISSAN 180

Query: 182 IKYPHMFINHNQQVSFKAHAEKIVMKEVTPLFNKGTMPTPQQFQLTIENIANKYLQN 238
KYP MFIN +QQ SFK +AEKI+M EV PLFN+ MPTPQQFQL +ENIANKY+QN
Sbjct: 181 SKYPRMFINQHQQASFKIYAEKIIMTEVAPLFNECAMPTPQQFQLILENIANKYIQN 237


27  GX95_09600  GX95_09685  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_09600  3  13  3.101377  glutamate dehydrogenase
GX95_09605  3  15  3.143207  YbgT family membrane protein
GX95_09610  2  16  3.183671  cytochrome d ubiquinol oxidase subunit II
GX95_09615  1  15  3.156422  cytochrome d terminal oxidase subunit 1
GX95_09620  2  17  3.371832  hydrogenase expression/formation protein
GX95_09625  2  15  -0.917852  hydrogenase-1 operon protein HyaE
GX95_09630  3  18  -3.426333  hydrogenase expression/formation protein
GX95_09635  2  18  -3.622218  Ni/Fe-hydrogenase, b-type cytochrome subunit
GX95_09640  2  17  -4.072976  hydrogenase
GX95_09645  1  23  -6.128581  hydrogenase
GX95_09650  -1  21  -5.175572  cytoplasmic protein
GX95_09655  -3  18  -1.754792  hypothetical protein
GX95_09660  -2  16  1.160241  redox-regulated ATPase YchF
GX95_09665  -2  13  1.687813  aminoacyl-tRNA hydrolase
GX95_09670  -2  14  2.430891  hypothetical protein
GX95_09675  0  15  2.947049  C4-dicarboxylic acid transporter DauA
GX95_09680  0  15  3.195244  ribose-phosphate pyrophosphokinase
GX95_09685  -1  16  3.208770  4-(cytidine
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_09675  RTXTOXINA  31  0.016  Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 31.1 bits (70), Expect = 0.016
Identities = 23/81 (28%), Positives = 37/81 (45%), Gaps = 16/81 (19%)

Query: 288 LGAIESLLCAV----VL---DGMTGTKHKANSELIGQGLGNM---VAPFF------GGIT 331
L + +L A+ +L D T TK A EL + LGN+ ++ + G++
Sbjct: 242 LDTVSGILSAISASFILSNADADTRTKAAAGVELTTKVLGNVGKGISQYIIAQRAAQGLS 301

Query: 332 ATAAIARSAANVRAGATSPVS 352
+AA A A+ A SP+S
Sbjct: 302 TSAAAAGLIASAVTLAISPLS 322


28  GX95_09930  GX95_09985  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_09930  -1  15  -3.028505  septation protein A
GX95_09935  0  20  -3.677608  hypothetical protein
GX95_09940  0  24  -3.939969  hypothetical protein
GX95_09945  0  21  -2.341306  outer membrane protein OmpW
GX95_09950  2  17  -1.169401  Mn-containing catalase
GX95_09955  2  16  -0.611734  hypothetical protein
GX95_09960  1  14  1.521344  hypothetical protein
GX95_09965  -1  13  3.171751  stress-induced bacterial acidophilic repeat
GX95_09970  -1  12  3.262462  tryptophan synthase subunit alpha
GX95_09975  -2  12  2.920844  tryptophan synthase subunit beta
GX95_09980  -2  10  2.950307  bifunctional indole-3-glycerol phosphate
GX95_09985  -2  11  3.041765  bifunctional glutamine
29  GX95_10100  GX95_10195  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_10100  -1  17  -3.089717  exoribonuclease II
GX95_10105  -2  26  -5.563106  cytoplasmic protein
GX95_10110  -1  24  -6.468217  enoyl-[acyl-carrier-protein] reductase
GX95_10115  0  25  -6.415433  hypothetical protein
GX95_10120  0  20  -4.660583  Secreted effector kinase SteC
GX95_10125  0  19  -3.768996  diguanylate phosphodiesterase
GX95_10130  -2  18  -2.393953  peptide ABC transporter ATP-binding protein
GX95_10135  0  17  -1.915117  peptide ABC transporter ATP-binding protein
GX95_10140  0  23  -4.179378  antimicrobial peptide ABC transporter permease
GX95_10145  2  29  -4.674828  hypothetical protein
GX95_10150  6  45  -9.516180  hypothetical protein
GX95_10155  2  29  -6.782787  subtilase cytotoxin subunit B-like protein
GX95_10160  -1  19  -2.110851  hypothetical protein
GX95_10165  -1  16  -1.475342  antimicrobial peptide ABC transporter permease
GX95_10170  0  13  -0.308357  peptide ABC transporter substrate-binding
GX95_10175  1  13  -0.364922  phage shock protein operon transcriptional
GX95_10180  1  13  -0.049322  phage shock protein PspA
GX95_10185  3  13  0.178973  phage shock protein B
GX95_10190  3  17  -1.193628  DNA-binding transcriptional activator PspC
GX95_10195  2  17  -0.804772  phage shock protein D
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_10110  DHBDHDRGNASE  52  4e-10  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 52.0 bits (124), Expect = 4e-10
Identities = 52/260 (20%), Positives = 99/260 (38%), Gaps = 22/260 (8%)

Query: 4 LSGKRILVTGVASKLSIAYGIAQAMHREGAEL-AFTYQNDKLKGRVEEFAAQLGSSIVLP 62
+ GK +TG A I +A+ + +GA + A Y +KL+ V A+ + P
Sbjct: 6 IEGKIAFITGAAQ--GIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 63 CDVAEDASIDAMFAELGNVWPKFDGFVHSIGF---APGDQLDGDYVNAVTREGFKIAHDI 119
DV + A+ID + A + D V+ G L + A F +
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEAT----FSVN--- 116

Query: 120 SSYSFVAMAKACRTMLNP-GSALLTLSYLGAERAIPNYNVMGLAKASLEANVRYMANAMG 178
S+ F A + M++ +++T+ A + +KA+ + + +
Sbjct: 117 STGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELA 176

Query: 179 PEGVRVNAISAGPIRTLAASGI--------KDFRKMLAHCEAVTPIRRTVTIEDVGNSAA 230
+R N +S G T + + + L + P+++ D+ ++
Sbjct: 177 EYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVL 236

Query: 231 FLCSDLSAGISGEVVHVDGG 250
FL S + I+ + VDGG
Sbjct: 237 FLVSGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_10140  HTHFIS  31  0.007  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.0 bits (70), Expect = 0.007
Identities = 9/16 (56%), Positives = 14/16 (87%)

Query: 38 LVGESGSGKSLIAKAI 53
+ GESG+GK L+A+A+
Sbjct: 165 ITGESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_10190  HTHFIS  344  e-118  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 344 bits (884), Expect = e-118
Identities = 125/345 (36%), Positives = 176/345 (51%), Gaps = 22/345 (6%)

Query: 2 AEFKDNLLGEANRFLEVLEQVSRLAPLDKPVLIIGERGTGKELIANRLHYLSSRWQGPLI 61
++ L+G + E+ ++RL D ++I GE GTGKEL+A LH R GP +
Sbjct: 133 SQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFV 192

Query: 62 SLNCAALNENLLDSELFGHEAGAFTGAQKRHPGRFERADGGTLFLDELATAPMLVQEKLL 121
++N AA+ +L++SELFGHE GAFTGAQ R GRFE+A+GGTLFLDE+ PM Q +LL
Sbjct: 193 AINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLL 252

Query: 122 RVIEYGELERVGGSQPLQVNVRLVCATNADLPAMVKEGTFRADLLDRLAFDVVQLPPLRE 181
RV++ GE VGG P++ +VR+V ATN DL + +G FR DL RL ++LPPLR+
Sbjct: 253 RVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRD 312

Query: 182 RQSDIMLMAEHFAIQMCRELRLPLFPGFTDRAKETLLHYAWPGNVRELKNVVERSVYRHG 241
R DI + HF Q +E F A E + + WPGNVREL+N+V R +
Sbjct: 313 RAEDIPDLVRHFVQQAEKEGLDVK--RFDQEALELMKAHPWPGNVRELENLVRRLTALYP 370

Query: 242 SSE--------HPLDEIVIDPFQRHPA------------EPPAPALPSATATPDLPLKLR 281
EI P ++ A E S
Sbjct: 371 QDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYD 430

Query: 282 EFQLQQEKALLQRSLQQAKFNQKRAADLLALTYHQFRALLKKHQL 326
+ E L+ +L + NQ +AADLL L + R +++ +
Sbjct: 431 RVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGV 475


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_10195  RTXTOXIND  29  0.019  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 28.6 bits (64), Expect = 0.019
Identities = 19/104 (18%), Positives = 43/104 (41%), Gaps = 5/104 (4%)

Query: 40 LVEVRSNSARALAEKKQLSRRIEQATAQQTEWQEKAELA-LRKDKDDLARAALIEKQKLT 98
+ + R + +L K+ +++ + + EL + + + L K++
Sbjct: 232 VEKSRLDDFSSLLHKQAIAK-HAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQ 290

Query: 99 DLIATLEQEVTLVDDTLARMKKEIGELENKLSETRARQQALMLR 142
+ + E+ D L + IG L +L++ RQQA ++R
Sbjct: 291 LVTQLFKNEIL---DKLRQTTDNIGLLTLELAKNEERQQASVIR 331


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_10200  MPTASEINHBTR  26  0.015  Metalloprotease inhibitor signature.
		>MPTASEINHBTR#Metalloprotease inhibitor signature.

Length = 122

Score = 25.7 bits (56), Expect = 0.015
Identities = 6/43 (13%), Positives = 14/43 (32%)

Query: 30 AGRGELSQSEQQRLLQLTDDAQRMRERIQALEDILDAEHPNWR 72
AG+ + + + A + + E L + +W
Sbjct: 37 AGQLGIEATGSGVCAGPAEQANALAGDVACAEQWLGDKPVSWS 79


30  GX95_10260  GX95_10380  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_10260  -1  22  -3.152696  aldo/keto reductase
GX95_10265  0  28  -6.510676  oxidoreductase
GX95_10270  2  30  -6.114164  TetR family transcriptional regulator
GX95_10275  4  30  -5.967961  hypothetical protein
GX95_10285  5  34  -7.836560  hypothetical protein
GX95_10290  4  32  -7.312164  hypothetical protein
GX95_10295  2  29  -4.907672  hypothetical protein
GX95_10300  1  27  -4.548831  invasin
GX95_10305  0  27  -6.786772  hypothetical protein
GX95_10310  -1  18  -5.168925  peroxiredoxin
GX95_10315  -1  16  -4.077290  hypothetical protein
GX95_10320  -1  14  -3.448658  hypothetical protein
GX95_10325  0  14  -3.561147  XRE family transcriptional regulator
GX95_10330  -1  14  -3.353950  mechanosensitive ion channel protein MscS
GX95_10335  0  14  -2.292872  hypothetical protein
GX95_10340  -1  14  -1.752516  universal stress protein UspE
GX95_10345  -1  15  -0.909396  transcriptional regulator FNR
GX95_10350  -2  14  -0.891645  methylated-DNA--protein-cysteine
GX95_10355  1  28  -6.306460  DNA endonuclease
GX95_10360  1  28  -6.916469  chemoreceptor protein
GX95_10365  1  30  -7.296567  zinc transporter ZntB
GX95_10370  2  33  -8.760399  ATP-dependent RNA helicase DbpA
GX95_10375  5  38  -10.837192  tRNA 2-thiocytidine(32) synthetase TtcA
GX95_10380  7  42  -11.252189  DUF4765 domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_10270  DHBDHDRGNASE  86  2e-22  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 86.3 bits (213), Expect = 2e-22
Identities = 68/249 (27%), Positives = 112/249 (44%), Gaps = 24/249 (9%)

Query: 7 KSVLVLGGSRGIGAAIVRRFSADGASVV-FSYAGSR----EAAEKLAAETGSTAIQTDSA 61
K + G ++GIG A+ R ++ GA + Y + ++ K A + A D
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARH-AEAFPADVR 67

Query: 62 DRDAIISLV----REYGPLDILVVNAGVALFGDALEQDSDAIDRLFRINIHAPYHASVEA 117
D AI + RE GP+DILV AGV G + + F +N ++AS
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 118 ARNMP--EGGRIIIIGSVNGDRMPVPGMAAYAASKSALQGLARGLARDFGPRGITINVVQ 175
++ M G I+ +GS N +P MAAYA+SK+A + L + I N+V
Sbjct: 128 SKYMMDRRSGSIVTVGS-NPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVS 186

Query: 176 PGPIDTDI--------NPEDGPMKELMHSF---MAIKRHGRPEEVAGMVAWLAGPEASFV 224
PG +TD+ N + +K + +F + +K+ +P ++A V +L +A +
Sbjct: 187 PGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHI 246

Query: 225 TGAMHTIDG 233
T +DG
Sbjct: 247 TMHNLCVDG 255


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_10275  HTHTETR  45  3e-08  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 45.4 bits (107), Expect = 3e-08
Identities = 14/115 (12%), Positives = 40/115 (34%), Gaps = 5/115 (4%)

Query: 6 SRTPGRPRQFDPEQAIETAQHLFHSRGYDAVSVADLTKAFGINPPSFYAAFGSKLGLYTR 65
+T ++ + ++ A LF +G + S+ ++ KA G+ + Y F K L++
Sbjct: 3 RKTKQEAQE-TRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE 61

Query: 66 VLK----RYRMTDAIPLGALLRHDRPTAKCLIDVLMEAARRYAADPDATGCLVLE 116
+ + + + ++ ++E+ + +
Sbjct: 62 IWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHK 116


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_10280  adhesinb  28  0.002  Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 28.3 bits (63), Expect = 0.002
Identities = 10/48 (20%), Positives = 18/48 (37%), Gaps = 6/48 (12%)

Query: 1 MQKCSLITVLSLSVLMLAGCTTTYTMTTRTGDIIETQGKPEVDTATGM 48
M+KC + +L L+ + LA C++ K V +
Sbjct: 1 MKKCRFLVLLLLAFVGLAACSSQ------KSSTETGSSKLNVVATNSI 42


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_10300  INTIMIN  217  2e-62  Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 217 bits (554), Expect = 2e-62
Identities = 117/409 (28%), Positives = 187/409 (45%), Gaps = 21/409 (5%)

Query: 29 SDNEIQSWIAGTASSISPHLQEGTLE-DYAKGKIKALPGQAANHLVNEGMKSAFPAIIFR 87
+D++ ++ A A+S+ LQ +L DYAK + G A+ + ++ A
Sbjct: 158 TDDKALNYAAQQAASLGSQLQSRSLNGDYAKDTALGIAGNQASSQLQAWLQHYGTA---- 213

Query: 88 GGVNLEDGAKYRSSEFDMFIPVEETTSSLLFGQLGFRDHDSSSFDGRTYVNVGVGYRQEV 147
VNL+ G + S D +P ++ L FGQ+G R DS R N+G G R +
Sbjct: 214 -EVNLQSGNNFDGSSLDFLLPFYDSEKMLAFGQVGARYIDS-----RFTANLGAGQRFFL 267

Query: 148 NGWLLGVNTFLDADIRYSHLRGGIGGEVYKDSLAFSGNYYFPLTGWKTSAVHELHDERPA 207
+LG N F+D D + R GIGGE ++D S N YF ++GW S + +DERPA
Sbjct: 268 PENMLGYNVFIDQDFSGDNTRLGIGGEYWRDYFKSSVNGYFRMSGWHESYNKKDYDERPA 327

Query: 208 YGFDLRTKGTLPDFPWFSGELTYEQYYGDKVDLLGNGTLSRNPRAAGAALVWNPVPLLEV 267
GFD+R G LP +P +L YEQYYGD V L + L NP AA + + P+PL+ +
Sbjct: 328 NGFDIRFNGYLPSYPALGAKLMYEQYYGDNVALFNSDKLQSNPGAATVGVNYTPIPLVTM 387

Query: 268 RAGYRDAGNGGSQAEGGLRVNYSFGTPLHEQLDYRNV-GAPSNTTNRRAFVDRNYDIVMA 326
YR + ++ Y F P +Q++ + V + + +R V RN +I++
Sbjct: 388 GIDYRHGTGNENDLLYSMQFRYQFDKPWSQQIEPQYVNELRTLSGSRYDLVQRNNNIILE 447

Query: 327 YREQAS-KIRITAMPVSGLSGTLVTLMATVDSRYPIEKVEWSGDAELLAGLQLQGSLGSG 385
Y++Q + I ++G + + V S+Y ++++ W A G Q+Q S
Sbjct: 448 YKKQDILSLNIPH-DINGTERSTQKIQLIVKSKYGLDRIVWDDSALRSQGGQIQHSGSQS 506

Query: 386 -----LILPQLPLTATDGQEYSLYLTVTDSRGTRVTSERIPVRVTQDET 429
ILP Y + D G + + + V +
Sbjct: 507 AQDYQAILP--AYVQGGSNVYKVTARAYDRNGNSSNNVLLTITVLSNGQ 553


31  GX95_10460  GX95_10510  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_10460  -3  11  -3.157130  ATP-dependent RNA helicase HrpA
GX95_10465  1  28  -6.937717  hypothetical protein
GX95_10470  2  32  -8.642785  cytochrome B
GX95_10475  2  34  -9.860565  SAM-dependent methyltransferase
GX95_10480  2  34  -10.064271  hypothetical protein
GX95_10485  4  36  -11.672047  amino acid ABC transporter permease
GX95_10490  5  38  -12.665375  amino acid ABC transporter ATP-binding protein
GX95_10495  5  37  -12.221299  amino acid ABC transporter ATP-binding protein
GX95_10500  2  29  -9.383930  amino acid-binding protein
GX95_10505  -1  23  -5.732581  cyanate transporter
GX95_10510  -1  21  -4.592660  pathogenicity island 2 effector protein SseJ
32  GX95_10905  GX95_11100  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_10905  1  23  -3.685752  transcriptional regulator
GX95_10910  5  29  -5.618761  diguanylate cyclase
GX95_10915  7  30  -6.948503  LysR family transcriptional regulator
GX95_10920  10  38  -9.902655  hypothetical protein
GX95_10925  11  40  -12.509935  acyl carrier protein
GX95_10930  5  39  -10.679417  plasmid stabilization protein
GX95_10935  1  31  -6.516392  addiction module toxin RelE
GX95_10940  1  31  -5.806313  S-adenosylmethionine tRNA ribosyltransferase
GX95_10945  1  33  -7.001006  MarR family transcriptional regulator
GX95_10950  1  31  -7.101036  hypothetical protein
GX95_10955  1  27  -5.664165  MFS transporter
GX95_10960  1  26  -5.779252  PhoPQ-regulated protein
GX95_10965  1  26  -7.034920  MFS transporter
GX95_10970  1  25  -6.570978  alcohol dehydrogenase
GX95_10975  0  18  -4.008543  GntR family transcriptional regulator
GX95_10980  0  16  -3.293142  hydrolase
GX95_10985  0  15  -2.648120  uptake hydrogenase small subunit
GX95_10990  0  15  -1.892332  hydrogenase
GX95_10995  -1  14  -0.914239  Ni/Fe-hydrogenase, b-type cytochrome subunit
GX95_11000  0  13  -1.258427  hydrogenase expression/formation protein
GX95_11005  -1  18  -0.912164  hydrogenase formation protein
GX95_11010  -1  20  -0.776096  hydrogenase
GX95_11015  0  21  -2.602210  hydrogenase-1 operon protein HyaF2
GX95_11020  1  22  -3.229620  ATP/GTP-binding protein
GX95_11025  0  25  -4.189787  hydrogenase maturation nickel metallochaperone
GX95_11030  0  25  -4.014481  phosphoporin PhoE
GX95_11035  1  21  -3.877819  DUF4440 domain-containing protein
GX95_11040  0  18  -1.941258  hypothetical protein
GX95_11050  0  13  0.520672  DUF4186 domain-containing protein
GX95_11055  0  12  1.726906  glutaminase
GX95_11060  -1  12  1.173557  succinate-semialdehyde dehydrogenase
GX95_11065  -1  10  0.525190  LysR family transcriptional regulator
GX95_11075  2  16  -1.509127  sugar transporter
GX95_11080  0  18  -1.889731  stress protection protein MarC
GX95_11085  0  19  -2.711865  transcriptional regulator
GX95_11090  1  19  -2.671652  MDR efflux pump AcrAB transcriptional activator
GX95_11095  1  20  -2.460134  multiple antibiotic resistance regulatory
GX95_11100  2  19  -2.405296  O-acetylserine/cysteine exporter
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
GX95_10965	TCRTETA	148	9e-43	Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 148 bits (375), Expect = 9e-43
Identities = 95/369 (25%), Positives = 167/369 (45%), Gaps = 6/369 (1%)

Query: 20 RRILPVFLLVGLYAASTAAVMSVLPFYIREMGGSPLII---GIIIATEAFSQFCAAPLIG 76
R ++ + V L A +M VLP +R++ S + GI++A A QF AP++G
Sbjct: 5 RPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLG 64

Query: 77 HLSDRVGRKRILIITLTIAAISLLLLANAQCILFILLARTLFGISAGNLSAAVAYIADCT 136
LSDR GR+ +L+++L AA+ ++A A + + + R + GI+ + A AYIAD T
Sbjct: 65 ALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADIT 124

Query: 137 HVRNRRQAIGILTGCIGLGGIVGAGVSGWLSRISLSAPIYAAFILVLGSALVAIWGLKDP 196
R + G ++ C G G + G + G + S AP +AA L + L + L +
Sbjct: 125 DGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPES 184

Query: 197 STTSRTADKIAAFSARAILKMPVLRVLIIVMLCHFFAYGMYSSQLPVFLSDTFIWNGLPF 256
R + A + A + ++ ++ FF + Q+P L F + +
Sbjct: 185 HKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLV-GQVPAALWVIFGEDRFHW 243

Query: 257 GPKALSYLLMADGVINIFVQLFLLGWVSQYFSERKLIILIFALLCTGFLTAGIATTIPVL 316
+ L A G+++ Q + G V+ ER+ ++L TG++ AT +
Sbjct: 244 DATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMA 303

Query: 317 IFAIVCISIADALAKPTYLAALSVHVSPARQGIVIGTAQALIAIADFISPVLGGFVLGYA 376
+V + + + P A LS V RQG + G+ AL ++ + P+L + +
Sbjct: 304 FPIMVLL-ASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAAS 362

Query: 377 LYGVWIGIA 385
+ W G A
Sbjct: 363 I-TTWNGWA 370


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
GX95_10975	TCRTETA	55	2e-10	Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 55.2 bits (133), Expect = 2e-10
Identities = 67/396 (16%), Positives = 138/396 (34%), Gaps = 30/396 (7%)

Query: 10 IVFLLFIVYMLNYMDRSALSITAPLIEKELGFN---AAEMGMIFSAFFIGYALFNFIGGW 66
++ +L V L+ + + P + ++L + A G++ + + + + G
Sbjct: 7 LIVILSTVA-LDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGA 65

Query: 67 ASDKVGPKTVFLIAALLWSVFCGLTGLVTGLWTMLIVRVLFGMAEGPVSAAGNKIINNWI 126
SD+ G + V L++ +V + LW + I R++ G+ + AG I +
Sbjct: 66 LSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGA-YIADIT 124

Query: 127 SRKESATAIGIFSAGSPLGGAVSGPIVGLLALSLGWRPAFGIIFLFGLVWVLLWYFIVSD 186
E A G SA G V+GP++G L F + L F++ +
Sbjct: 125 DGDERARHFGFMSA-CFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPE 183

Query: 187 KPTMSKRLAPEERIDFENHEDVILSDDGRATPSLGYYMKQPMVWATTLAFFSYNYILFFF 246
+R E + + + FF +
Sbjct: 184 SHKGERRPLRREAL-------------NPLASFRWARGMTVVAALMAV-FFIMQLVGQVP 229

Query: 247 LTWFPSYLNHSLHLDIKEISIATVIPWVIGAIGMVLGGVCSDVIYRITGNALLSRRLILG 306
+ + H D I I+ G + + + + + L R L
Sbjct: 230 AALWVIFGEDRFHWDATTIGISLA---AFGILHSLAQAMITGPVAA-----RLGERRALM 281

Query: 307 VCLAGAAVCVAVSGTVSTIGSAITLMSVSLFLLYLTGPIYWAVIQDVVHKDKVGSVGGAM 366
+ + + + A +M V L + P A++ V +++ G + G++
Sbjct: 282 LGMIADGTGYILLAFATRGWMAFPIM-VLLASGGIGMPALQAMLSRQVDEERQGQLQGSL 340

Query: 367 HGLANISGIIGPLVTGFIVQFS-GKYDYAFYLAGAI 401
L +++ I+GPL+ I S ++ ++AGA
Sbjct: 341 AALTSLTSIVGPLLFTAIYAASITTWNGWAWIAGAA 376



Score = 32.9 bits (75), Expect = 0.002
Identities = 31/121 (25%), Positives = 49/121 (40%), Gaps = 13/121 (10%)

Query: 299 LSRRLILGVCLAGAAVCVAVSGTVST-----IGSAITLMSVSLFLLYLTGPIYWAVIQDV 353
RR +L V LAGAAV A+ T IG + ++ + TG + A I D+
Sbjct: 70 FGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGA------TGAVAGAYIADI 123

Query: 354 VHKDKVGSVGGAMHGLANISGIIGPLVTGFIVQFSGKYDYAFYLAGAIAIVSSLLVFVFV 413
D+ G M + GP++ G + FS F+ A A+ ++ L +
Sbjct: 124 TDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHA--PFFAAAALNGLNFLTGCFLL 181

Query: 414 K 414

Sbjct: 182 P 182


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
GX95_11040	ECOLIPORIN	470	e-169	E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 470 bits (1211), Expect = e-169
Identities = 235/386 (60%), Positives = 275/386 (71%), Gaps = 20/386 (5%)

Query: 3 KVVVLSAVAAAVMMAGAANAAEIYNKDGNKLDLYGKVDGLHYFSSNHSTDGDQSYIRMGI 62
K VL+ V A++ AGAA+AAEIYNKDGNKLDLYGKVDGLHYFS + S DGDQ+Y+R+G
Sbjct: 2 KRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMRVGF 61

Query: 63 KGETQITDQLTGFGQWEYQVNANRPEDGDSSGSPQSWTRLGFAGLAFADMGSVDYGRNYG 122
KGETQI DQLTG+GQWEY V AN E SWTRL FAGL F D GS DYGRNYG
Sbjct: 62 KGETQINDQLTGYGQWEYNVQANTTE----GEGANSWTRLAFAGLKFGDYGSFDYGRNYG 117

Query: 123 VLYDIGSWTDVLPEFGNDSYEASDNFMTGRANGVLTYRNNDFFGLVDGLNIALQYQGKND 182
VLYD+ WTD+LPEFG DSY +DN+MTGRANGV TYRN DFFGLVDGLN ALQYQGKN+
Sbjct: 118 VLYDVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNE 177

Query: 183 GLSKEGDPLSNNAR---KSIAYQNGDGFGASATYDLGMGVSLGAAYTSSKRTLDQMTQDK 239
S + + N R I Y NGDGFG S TYD+GMG S GAAYT+S RT +Q+
Sbjct: 178 SQSADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNAGG 237

Query: 240 YD-NGDRAEAWTGGVKYDANNIYLAANYTRTYDMTYMGDTL----GGFAHKTDNWEMVGQ 294
GD+A+AWT G+KYDANNIYLA Y+ T +MT G T GG A+KT N+E+ Q
Sbjct: 238 TIAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVTAQ 297

Query: 295 YQFDNGLRPSLAFLQSRANDVD----GLGSFDLVKYIDVGSYYYFNKNMSAYVDYKINLL 350
YQFD GLRP+++FL S+ D+ DLVKY DVG+ YYFNKN S YVDYKINLL
Sbjct: 298 YQFDFGLRPAVSFLMSKGKDLTYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKINLL 357

Query: 351 KDGNP----SNPNTDNTVALGLVYEF 372
D +P + +TD+ VALG+VY+F
Sbjct: 358 DDDDPFYKDAGISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
GX95_11060	BLACTAMASEA	31	0.008	Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 30.5 bits (69), Expect = 0.008
Identities = 12/51 (23%), Positives = 21/51 (41%), Gaps = 1/51 (1%)

Query: 22 GQGKVADYIPALASVEGSKLGI-AICTVDGQHYQAGDAHERFSIQSISKVL 71
+ + I S ++G+ + G+ A A ERF + S KV+
Sbjct: 21 ASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRADERFPMMSTFKVV 71


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
GX95_11075	TCRTETB	57	5e-11	Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 56.8 bits (137), Expect = 5e-11
Identities = 44/192 (22%), Positives = 85/192 (44%), Gaps = 8/192 (4%)

Query: 36 LSDIAESFHMQTAQVGIMLTIYAWVVAVMSLPFMLLTSQMERRKLLICLFVLFIASHVLS 95
L DIA F+ A + T + ++ + + L+ Q+ ++LL+ ++ V+
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 96 FLAWN-FTVLVISRIGIAFAHAIFWSITASLAIRLAPAGKRAQALSLIATGTALAMVLGL 154
F+ + F++L+++R A F ++ + R P R +A LI + A+ +G
Sbjct: 97 FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGP 156

Query: 155 PIGRVVGQYFGWRTTFFAIGMGALITLLCLIKLLPKLPSEHSGSLKSLPLLFRRPALMSL 214
IG ++ Y W + I M +IT+ L+KLL K + LMS+
Sbjct: 157 AIGGMIAHYIHW-SYLLLIPMITIITVPFLMKLLKK------EVRIKGHFDIKGIILMSV 209

Query: 215 YVLTVVVVTAHY 226
++ ++ T Y
Sbjct: 210 GIVFFMLFTTSY 221


33	GX95_11555	GX95_11785	Y	Y	Y	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
GX95_11555	0	19	-4.779498	Bcr/CflA family multidrug efflux transporter
GX95_11560	0	23	-6.575269	cyclopropane-fatty-acyl-phospholipid synthase
GX95_11565	1	27	-7.487547	riboflavin synthase subunit alpha
GX95_11570	3	34	-8.234330	MATE family efflux transporter
GX95_11585	5	45	-11.284828	**EscU/YscU/HrcU family type III secretion system
GX95_11590	4	42	-10.299808	EscT/YscT/HrcT family type III secretion system
GX95_11595	2	40	-6.818491	EscS/YscS/HrcS family type III secretion system
GX95_11600	1	35	-6.648906	EscR/YscR/HrcR family type III secretion system
GX95_11605	2	35	-6.568457	type III secretion system protein SsaQ
GX95_11610	1	34	-6.358664	type III secretion system protein SsaP
GX95_11615	0	34	-6.743246	type III secretion system protein SsaO
GX95_11620	1	36	-6.501945	EscN/YscN/HrcN family type III secretion system
GX95_11625	2	37	-8.261819	EscV/YscV/HrcV family type III secretion system
GX95_11630	2	42	-9.252619	type III secretion system protein SsaM
GX95_11635	2	41	-9.068187	SepL/TyeA/HrpJ family type III secretion system
GX95_11640	3	40	-10.448129	type III secretion system protein SsaK
GX95_11645	5	42	-9.079807	cytoplasmic protein
GX95_11650	6	43	-8.195585	EscJ/YscJ/HrcJ family type III secretion inner
GX95_11655	8	43	-6.831985	EscI/YscI/HrpB family type III secretion system
GX95_11660	7	42	-6.662430	EscG/YscG/SsaH family type III secretion system
GX95_11665	7	41	-5.887321	EscF/YscF/HrpA family type III secretion system
GX95_11670	4	40	-5.805272	pathogenicity island 2 effector protein SseG
GX95_11675	3	37	-5.497993	pathogenicity island 2 effector protein SseF
GX95_11680	3	33	-6.509845	CesD/SycD/LcrH family type III secretion system
GX95_11685	4	37	-7.116850	pathogenicity island 2 effector protein SseE
GX95_11690	4	36	-7.266515	pathogenicity island 2 effector protein SseD
GX95_11695	4	38	-8.469188	pathogenicity island 2 effector protein SseC
GX95_11700	4	40	-9.794104	CesD/SycD/LcrH family type III secretion system
GX95_11705	6	43	-10.815871	pathogenicity island 2 effector protein SseB
GX95_11710	4	42	-10.900656	type III secretion system chaperone SseA
GX95_11715	4	41	-11.041793	EscE/YscE/SsaE family type III secretion system
GX95_11720	3	39	-9.878049	EscD/YscD/HrpQ family type III secretion system
GX95_11725	2	33	-8.223244	EscC/YscC/HrcC family type III secretion system
GX95_11730	2	31	-7.234123	pathogenicity island chaperone protein SpiC
GX95_11735	0	28	-5.969249	hybrid sensor histidine kinase/response
GX95_11740	1	18	0.093266	DNA-binding response regulator
GX95_11745	0	15	2.323485	helix-turn-helix-type transcriptional regulator
GX95_11750	0	13	2.554311	hypothetical protein
GX95_11755	-1	12	3.280031	hypothetical protein
GX95_11760	-1	12	1.848363	DNA-binding response regulator
GX95_11765	0	14	0.763237	sensor histidine kinase
GX95_11770	0	17	-0.149365	tetrathionate reductase subunit B
GX95_11775	0	17	-0.935535	tetrathionate reductase subunit C
GX95_11780	-1	14	-1.321843	tetrathionate reductase subunit A
GX95_11785	-2	17	-4.207167	hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
GX95_11555	TCRTETB	76	3e-17	Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 76.5 bits (188), Expect = 3e-17
Identities = 48/194 (24%), Positives = 84/194 (43%), Gaps = 3/194 (1%)

Query: 8 LVWLAGLSVLGFLATDMYLPAFAAIQADLQTPAAAVSASLSLFLAGFAVAQLLWGPLSDR 67
L+WL LS L + + I D P A+ + + F+ F++ ++G LSD+
Sbjct: 16 LIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQ 75

Query: 68 YGRKPILLLGLSIFALGSLGMLWVESAAALLTL-RFVQAVGVCAATVIWQALVTDYYPSQ 126
G K +LL G+ I GS+ S +LL + RF+Q G A + +V Y P +
Sbjct: 76 LGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKE 135

Query: 127 KINRIFATIMPLVGLSPALAPLLGSWILTHFSWQAIFATLFVITLLLMLPALRLKPSVKA 186
+ F I +V + + P +G I + W + L + ++ +P L +
Sbjct: 136 NRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLL--LIPMITIITVPFLMKLLKKEV 193

Query: 187 RTEGQDKLTFATLL 200
R +G + L+
Sbjct: 194 RIKGHFDIKGIILM 207


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
GX95_11585	TYPE3IMSPROT	385	e-136	Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 385 bits (991), Expect = e-136
Identities = 125/350 (35%), Positives = 202/350 (57%), Gaps = 4/350 (1%)

Query: 2 SEKTEQPTEKKLRDGRKEGQVVKSIEITSLFQLIALYLYFHFFTEKMILILIASITFTLQ 61
EKTEQPT KK+RD RK+GQV KS E+ S ++AL ++ + +
Sbjct: 3 GEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLIPAE 62

Query: 62 LVNKPFSYALTQLT-HALIESLTSALLFLGAGVIVATVGSVFLQVGVVIASKAIGFKSEH 120
PFS AL+ + + L+E L ++A + S +Q G +I+ +AI +
Sbjct: 63 QSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMA-IASHVVQYGFLISGEAIKPDIKK 121

Query: 121 INPVSNFKQIFSLHSVVELCKSSLKVIMLSLIFAFFFYYYASTFRALPYCGLACGLLVVS 180
INP+ K+IFS+ S+VE KS LKV++LS++ T LP CG+ C ++
Sbjct: 122 INPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLLG 181

Query: 181 SLIKWLWVGVMAFYIVVGILDYSFQYYKIRKDLKMSKDDVKQEHKDLEGDPQMKTRRREM 240
+++ L V ++V+ I DY+F+YY+ K+LKMSKD++K+E+K++EG P++K++RR+
Sbjct: 182 QILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQF 241

Query: 241 QSEIQSGSLAQSVKQSVAVVRNPTHIAVCLGYHPTDMPIPRVLEKGSDAQANYIVNIAER 300
EIQS ++ ++VK+S VV NPTHIA+ + Y + P+P V K +DAQ + IAE
Sbjct: 242 HQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIAEE 301

Query: 301 NCIPVVENVELARSLFFEVERGDKIPETLFEPVAALLRMVMK--IDYAHS 348
+P+++ + LAR+L+++ IP E A +LR + + I+ HS
Sbjct: 302 EGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQNIEKQHS 351


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
GX95_11590	TYPE3IMRPROT	164	3e-52	Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 164 bits (418), Expect = 3e-52
Identities = 54/229 (23%), Positives = 100/229 (43%), Gaps = 5/229 (2%)

Query: 8 WLIALAVAFIRPLSLSLLLPLLKSGSLGSALLRNGVLMSLTFPILPIIYQQKIMMHIGKD 67
WL +R L+L P+L S+ ++ G+ M +TF I P + + +
Sbjct: 12 WLNLYFWPLLRVLALISTAPILSERSVPK-RVKLGLAMMITFAIAPSLPANDVPVF---S 67

Query: 68 YSWLGLVTGEVIIGFLIGFCAAVPFWAVDMAGFLLDTLRGATMGTIFNSTMEAETSLFGL 127
+ L L +++IG +GF F AV AG ++ G + T + +
Sbjct: 68 FFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHLNMPVLAR 127

Query: 128 LFSQFLCVIFFISGGMEFILNILYESYQYLPPGRTLLFDRQFLKYIQAEWRTLYQLCISF 187
+ ++F G +++++L +++ LP G L FL +A ++ +
Sbjct: 128 IMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSL-IFLNGLML 186

Query: 188 SVPAIICMVLADLALGLLNRSAQQLNVFFFSMPLKSILVLLTLLISFPY 236
++P I ++ +LALGLLNR A QL++F PL + + + P
Sbjct: 187 ALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPL 235


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
GX95_11595	TYPE3IMQPROT	72	9e-21	Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 72.5 bits (178), Expect = 9e-21
Identities = 30/85 (35%), Positives = 50/85 (58%)

Query: 4 SELTQFVTQLLWIVLFTSMPVVLVASVVGVIVSLVQALTQIQDQTLQFMIKLLAIAITLM 63
+L + L++VL S +VA+++G++V L Q +TQ+Q+QTL F IKLL + + L
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 VSYPWLSGILLNYTRQIMLRIGEHG 88
+ W +LL+Y RQ++ G
Sbjct: 62 LLSGWYGEVLLSYGRQVIFLALAKG 86


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
GX95_11600	TYPE3IMPPROT	231	9e-80	Type III secretion system inner membrane P protein ...
		>TYPE3IMPPROT#Type III secretion system inner membrane P protein family signature.

Length = 224

Score = 231 bits (592), Expect = 9e-80
Identities = 79/215 (36%), Positives = 130/215 (60%), Gaps = 8/215 (3%)

Query: 8 LQLIGILFLLSILPLIIVMGTSFLKLAVVFSILRNALGIQQVPPNIALYGLALVLSLFIM 67
+ LI +L ++LP II GT F+K ++VF ++RNALG+QQ+P N+ L G+AL+LS+F+M
Sbjct: 5 ISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLSMFVM 64

Query: 68 GPTLLAVKERWHPVQVAGAPFWT-SEWDSKALAPYRQFLQKNSEEKEANYFRNLIKRTWP 126
P + + V + S+ + L YR +L K S+ + +F N +
Sbjct: 65 WPIMHDAYVYFEDEDVTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFFENAQLKRQY 124

Query: 127 ED-------IKRKIKPDSLLILIPAFTVSQLTQAFRIGLLIYLPFLAIDLLISNILLAMG 179
+ K +I+ S+ L+PA+ +S++ AF+IG +YLPF+ +DL++S++LLA+G
Sbjct: 125 GEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLVVSSVLLALG 184

Query: 180 MMMVSPMTISLPFKLLIFLLAGGWDLTLAQLVQSF 214
MMM+SP+TIS P KL++F+ GW L L+ +
Sbjct: 185 MMMMSPVTISTPIKLVLFVALDGWTLLSKGLILQY 219


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
GX95_11605	FLGMOTORFLIN	51	3e-10	Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 51.1 bits (122), Expect = 3e-10
Identities = 21/67 (31%), Positives = 38/67 (56%)

Query: 247 LEQIPQQVLFEIGRASLEIGQLRQLKTGDVLPVGGCFAPEVTIRVNDRIIGQGELIACGN 306
+ IP ++ E+GR + I +L +L G V+ + G + I +N +I QGE++ +
Sbjct: 57 IMDIPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVAD 116

Query: 307 EFMVRIT 313
++ VRIT
Sbjct: 117 KYGVRIT 123


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
GX95_11650	FLGMRINGFLIF	52	5e-10	Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 52.3 bits (125), Expect = 5e-10
Identities = 28/170 (16%), Positives = 67/170 (39%), Gaps = 15/170 (8%)

Query: 23 LYRSLPEDEANQMLALLMQHHIDAEKKQEEDGVTLRVEQSQFINAVELLRLNGYPHRQFT 82
L+ +L + + ++A L Q +I + V + L G P +
Sbjct: 53 LFSNLSDQDGGAIVAQLTQMNIPYRFA--NGSGAIEVPADKVHELRLRLAQQGLP-KGGA 109

Query: 83 TADKMFPANQLVVSPQEEQQKINFLK--EQRIEGMLSQMEGVINAKVTIALPTYDEGS-- 138
++ + +S EQ +N+ + E + + + V +A+V +A+P + S
Sbjct: 110 VGFELLDQEKFGISQFSEQ--VNYQRALEGELARTIETLGPVKSARVHLAMP---KPSLF 164

Query: 139 --NASPSSVAVFIKYSPQVNMEAFRVK-IKDLIEMSIPGLQYSKISILMQ 185
S +V + P ++ ++ + L+ ++ GL ++++ Q
Sbjct: 165 VREQKSPSASVTVTLEPGRALDEGQISAVVHLVSSAVAGLPPGNVTLVDQ 214


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
GX95_11680	SYCDCHAPRONE	79	1e-21	Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 79.2 bits (195), Expect = 1e-21
Identities = 26/127 (20%), Positives = 49/127 (38%)

Query: 16 LKQLLSVDPETVYASGYASWQEGDYSRAVIDFSWLVMAQPWSWRAHIALAGTWMMLKEYT 75
L ++ S E +Y+ + +Q G Y A F L + + R + L + +Y
Sbjct: 28 LNEISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYD 87

Query: 76 TAINFYGHALMLDASHPEPVYQTGVCLKMMGEPGLAREAFQTAIKMSYADASWSEIRQNA 135
AI+ Y + ++D P + CL GE A A ++ + E+
Sbjct: 88 LAIHSYSYGAIMDIKEPRFPFHAAECLLQKGELAEAESGLFLAQELIADKTEFKELSTRV 147

Query: 136 QIMVDTL 142
M++ +
Sbjct: 148 SSMLEAI 154


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
GX95_11690	PF07132	29	0.011	Harpin protein (HrpN)
		>PF07132#Harpin protein (HrpN)

Length = 356

Score = 29.3 bits (65), Expect = 0.011
Identities = 21/87 (24%), Positives = 35/87 (40%), Gaps = 2/87 (2%)

Query: 95 GAMLSGVLTIGLGAVGGETGLIAGQAVGHTAGGVMGLGAGVAQRQSDQDKAIADLQQNGA 154
G+M+ G L GLG +G G + G +G GG +G G + + G
Sbjct: 62 GSMMGGGLGGGLGGLGSSLGGLGGGLLGGGLGGGLGSSLGSGLGSAL-GGGLGGALGAGM 120

Query: 155 QSYNKSLTEIMEKATEIMQQIIGVGSS 181
+ N S + ++ ++G G S
Sbjct: 121 NAMNPS-AMMGSLLFSALEDLLGGGMS 146


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
GX95_11700	SYCDCHAPRONE	90	2e-25	Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 89.6 bits (222), Expect = 2e-25
Identities = 40/154 (25%), Positives = 66/154 (42%), Gaps = 7/154 (4%)

Query: 6 TLQQAHDTMRFFRRGGSLRMLL---DDDVTQPLNTLYRYAMQLMDVKEFAGAARLFQLLT 62
T + F + GG++ ML D L LY A ++ A ++FQ L
Sbjct: 8 TQEYQLAMESFLKGGGTIAMLNEISSDT----LEQLYSLAFNQYQSGKYEDAHKVFQALC 63

Query: 63 IYDAWSFDYWFRLGECCQAQKHWGEAIYAYGRAAQIKIDAPQAPWAAAECYLACDNVCYA 122
+ D + ++ LG C QA + AI++Y A + I P+ P+ AAEC L + A
Sbjct: 64 VLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMDIKEPRFPFHAAECLLQKGELAEA 123

Query: 123 IKALKAVVRICGEVSEHQILRLRAEKMLQQLSDR 156
L + + +E + L R ML+ + +
Sbjct: 124 ESGLFLAQELIADKTEFKELSTRVSSMLEAIKLK 157


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
GX95_11725	TYPE3OMGPROT	581	0.0	Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 581 bits (1500), Expect = 0.0
Identities = 158/500 (31%), Positives = 261/500 (52%), Gaps = 15/500 (3%)

Query: 11 LLFILNTAKSDELSWKGNDFTLYARQMPLAEVLHLLSENYDTAITISPLITATFSGKIPP 70
LL + + + + EL W + A+ L ++L NYD + +S I SG+
Sbjct: 17 LLLLSSYSWAQELDWLPIPYVYVAKGESLRDLLTDFGANYDATVVVSDKINDKVSGQFEH 76

Query: 71 GPPVDILNNLAAQYDLLTWFDGSMLYVYPASLLKHQVITFNILSTGRFIHYLRSQNILSS 130
P D L ++A+ Y+L+ ++DG++LY++ S + ++I L+ I
Sbjct: 77 DNPQDFLQHIASLYNLVWYYDGNVLYIFKNSEVASRLIRLQESEAAELKQALQRSGIWE- 135

Query: 131 PGCEVKEITGTRAVEVSGVPSCLTRISQLASVLDNALIKR--KDSAVSVSIYTLKYATAM 188
P + R V VSG P L + Q A+ L+ R K A+++ I+ LKYA+A
Sbjct: 136 PRFGWRPDASNRLVYVSGPPRYLELVEQTAAALEQQTQIRSEKTGALAIEIFPLKYASAS 195

Query: 189 DTQYQYRDQSVVVPGVVSVL-REMSKTSVPASSTNN-----GSPATQALPMFAADPRQNA 242
D YRD V PGV ++L R +S ++ + +N + A ADP NA
Sbjct: 196 DRTIHYRDDEVAAPGVATILQRVLSDATIQQVTVDNQRIPQAATRASAQARVEADPSLNA 255

Query: 243 VIVRDYAANMAGYRKLITELDQRQQMIEISVKIIDVNAGDINQLGIDWGTAVSLGG---- 298
+IVRD M Y++LI LD+ IE+++ I+D+NA + +LG+DW + G
Sbjct: 256 IIVRDSPERMPMYQRLIHALDKPSARIEVALSIVDINADQLTELGVDWRVGIRTGNNHQV 315

Query: 299 --KKIAFNTGLNDGGASGFSTVISDTSNFMVRLNALEKSSQAYVLSQPSVVTLNNIQAVL 356
K + + GA G + R+N LE A V+S+P+++T N QAV+
Sbjct: 316 VIKTTGDQSNIASNGALGSLVDARGLDYLLARVNLLENEGSAQVVSRPTLLTQENAQAVI 375

Query: 357 DKNITFYTKLQGEKVAKLESITTGSLLRVTPRLLNDNGTQKIMLNLNIQDGQQSDTQSET 416
D + T+Y K+ G++VA+L+ IT G++LR+TPR+L +I LNL+I+DG Q S
Sbjct: 376 DHSETYYVKVTGKEVAELKGITYGTMLRMTPRVLTQGDKSEISLNLHIEDGNQKPNSSGI 435

Query: 417 DSLPEVQNSEIASQATLLAGQSLLLGGFKQGKQIHSQNKIPLLGDIPVVGHLFRNDTTQV 476
+ +P + + + + A + GQSL++GG + + + +K+PLLGDIP +G LFR +
Sbjct: 436 EGIPTISRTVVDTVARVGHGQSLIIGGIYRDELSVALSKVPLLGDIPYIGALFRRKSELT 495

Query: 477 HSVIRLFLIKASVVNNGISH 496
+RLF+I+ +++ GI+H
Sbjct: 496 RRTVRLFIIEPRIIDEGIAH 515


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
GX95_11735	HTHFIS	68	6e-14	FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 68.3 bits (167), Expect = 6e-14
Identities = 31/156 (19%), Positives = 57/156 (36%), Gaps = 13/156 (8%)

Query: 691 ILLVDDADINRDIIGKMLVSLGQHVTIAASSNEALTLSQQQRFDLVLIDIRMPEIDGIEC 750
IL+ DD R ++ + L G V I +++ DLV+ D+ MP+ + +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 751 VQLWHDEPNNLDPDCMFVALSASVATEDIHRCKKNGIHHYITKPVTLATLARYISIAAEY 810
+ PD + +SA + + G + Y+ KP L L
Sbjct: 66 LP----RIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIG-------- 113

Query: 811 QLLRNIELQEQDPSRCSALLAT-DDMVINSKIFQSL 845
+ R + ++ PS+ +V S Q +
Sbjct: 114 IIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEI 149


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
GX95_11740	HTHFIS	66	6e-15	FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 66.0 bits (161), Expect = 6e-15
Identities = 28/119 (23%), Positives = 50/119 (42%), Gaps = 2/119 (1%)

Query: 1 MKEYKILLVDDHEIIINGIMNALLPWPHFKIVEHVKNGLEVYNACCAYEPDILILDLSLP 60
M IL+ DD I + AL + V N ++ A + D+++ D+ +P
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGY--DVRITSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 61 GINGLDIIPQLHQRWPAMNILVYTAYQQEYMTIKTLAAGANGYVLKSSSQQVLLAALQT 119
N D++P++ + P + +LV +A IK GA Y+ K L+ +
Sbjct: 59 DENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
GX95_11760	HTHFIS	84	2e-21	FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 84.5 bits (209), Expect = 2e-21
Identities = 31/127 (24%), Positives = 56/127 (44%)

Query: 2 ATIHLLDDDTAVTNACAFLLESLGYDVKCWTQGADFLAQASLYQAGVVLLDMRMPVLDGQ 61
ATI + DDD A+ L GYDV+ + A + +V+ D+ MP +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 GVHDALRQCGSTLAVVFLTGHGDVPMAVEQMKRGAVDFLQKPVSVKPLQAALERALTVSS 121
+ +++ L V+ ++ A++ ++GA D+L KP + L + RAL
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 122 AAVARRE 128
++ E
Sbjct: 124 RRPSKLE 130


34	GX95_11870	GX95_11900	Y	Y	N	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
GX95_11870	1	21	-4.171325	hypothetical protein
GX95_11880	0	23	-5.394460	hypothetical protein
GX95_11885	0	21	-4.756427	MFS transporter
GX95_11890	0	20	-5.355151	MFS transporter
GX95_11895	-1	19	-4.257471	quinate/shikimate dehydrogenase
GX95_11900	0	18	-3.419522	3-dehydroquinate dehydratase
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
GX95_11885	TCRTETA	30	0.015	Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 30.2 bits (68), Expect = 0.015
Identities = 23/103 (22%), Positives = 43/103 (41%), Gaps = 3/103 (2%)

Query: 8 TAVGLYFNYFVHGMGVILMSLNMSSLEQQWHTSAAGVSIVISSLGIGRLSVLLIA---GM 64
+ + + +G+ L+ + L + S + L + L A G
Sbjct: 6 PLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGA 65

Query: 65 LSDRFGRRPFIILGTACYLIFFIGILYAQTIFVAYACGFLAGM 107
LSDRFGRRP +++ A + + + A ++V Y +AG+
Sbjct: 66 LSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGI 108


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
GX95_11890	TCRTETB	34	8e-04	Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 34.5 bits (79), Expect = 8e-04
Identities = 32/166 (19%), Positives = 71/166 (42%), Gaps = 7/166 (4%)

Query: 23 FLHGMSVITLAQNMTSLAQKFSTDSAGIAYLISGIGLGRLVSILFFGVLSDKFGRRAIIL 82
F ++ + L ++ +A F+ A ++ + L + +G LSD+ G + ++L
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 83 LGAVLYML----FFFGIPASPNLMIAFILAVCVGVANSALDTGGYPALMECFPKASGSAV 138
G ++ F G L++A + A AL + G A
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARY--IPKENRGKAF 141

Query: 139 ILVKAMVSFGQMIYPLIVSALLVNHIWYGYAVVIPGILFVLITLML 184
L+ ++V+ G+ + P I ++ ++I + Y ++IP I + + ++
Sbjct: 142 GLIGSIVAMGEGVGPAI-GGMIAHYIHWSYLLLIPMITIITVPFLM 186


35	GX95_11970	GX95_12285	Y	Y	Y	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
GX95_11970	2	21	-0.989735	EAL domain-containing protein
GX95_11975	2	19	-1.171530	endopeptidase
GX95_11980	2	19	-0.354285	vitamin B12 ABC transporter ATP-binding protein
GX95_11985	1	20	-1.824988	glutathione peroxidase
GX95_11990	1	23	-2.063608	vitamin B12 ABC transporter permease BtuC
GX95_11995	2	32	-7.352579	hypothetical protein
GX95_12000	3	32	-7.725554	hypothetical protein
GX95_12005	2	32	-7.295583	integrase
GX95_12010	7	42	-11.104630	transcriptional regulator
GX95_12015	7	38	-11.577889	DNA-binding protein
GX95_12020	7	41	-11.669707	hypothetical protein
GX95_12025	6	40	-8.879082	hypothetical protein
GX95_12030	5	40	-7.538739	hypothetical protein
GX95_12035	4	38	-5.543373	hypothetical protein
GX95_12040	4	32	-4.074588	hypothetical protein
GX95_12045	3	33	-3.344097	hypothetical protein
GX95_12050	1	30	-2.392137	hypothetical protein
GX95_12055	2	27	-0.004672	hypothetical protein
GX95_12060	2	26	0.502747	hypothetical protein
GX95_12065	2	25	-0.695324	hypothetical protein
GX95_12070	1	27	-0.979135	hypothetical protein
GX95_12075	1	26	-1.676081	hypothetical protein
GX95_12080	1	23	-1.530766	replication protein
GX95_12085	-1	20	-2.280095	hypothetical protein
GX95_12090	0	19	-1.845742	hypothetical protein
GX95_12095	0	19	-0.504355	toxin YafO
GX95_12100	2	21	0.769579	hypothetical protein
GX95_12105	2	24	2.033369	phage portal protein
GX95_12110	3	26	2.318648	oxidoreductase
GX95_12115	5	31	2.476609	phage capsid protein
GX95_12120	4	34	3.111144	phage major capsid protein, P2 family
GX95_12125	4	32	4.174392	terminase
GX95_12130	5	28	4.653169	capsid assembly protein
GX95_12135	5	26	3.428310	phage tail protein
GX95_12140	3	25	3.727411	phage holin, lambda family
GX95_12145	4	25	3.790586	muraminidase
GX95_12150	4	24	3.601348	lysis protein
GX95_12155	3	24	3.469606	sialate O-acetylesterase
GX95_12160	1	25	1.910204	phage tail protein
GX95_12165	3	25	2.615431	phage virion morphogenesis protein
GX95_12170	3	24	1.599274	baseplate assembly protein
GX95_12175	3	24	1.125681	baseplate assembly protein
GX95_12180	1	20	2.422190	baseplate assembly protein
GX95_12185	2	20	2.218252	phage tail protein I
GX95_12190	1	19	2.150505	hypothetical protein
GX95_12195	1	19	1.958276	phage tail protein
GX95_12200	1	22	2.578915	oxidoreductase
GX95_12205	1	21	2.616386	phage tail tape measure protein
GX95_12210	2	25	0.791575	phage tail protein
GX95_12215	2	23	0.591039	phage tail protein
GX95_12220	1	22	0.300741	phage major tail tube protein
GX95_12225	2	22	-0.032691	phage tail protein
GX95_12230	0	13	-0.399905	hypothetical protein
GX95_12235	0	17	-0.687314	hypothetical protein
GX95_12240	0	20	-0.722536	hok/gef family protein
GX95_12245	0	22	-0.654173	hok/gef family protein
GX95_12250	1	21	-1.051488	integration host factor subunit alpha
GX95_12255	0	19	-1.479798	phenylalanine--tRNA ligase subunit beta
GX95_12260	0	19	-2.535951	phenylalanine--tRNA ligase subunit alpha
GX95_12265	0	22	-4.383131	50S ribosomal protein L20
GX95_12270	0	20	-4.751578	50S ribosomal protein L35
GX95_12275	0	21	-4.521119	translation initiation factor IF-3
GX95_12280	-1	19	-3.070704	threonine--tRNA ligase
GX95_12285	0	22	-3.217919	endonuclease
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
GX95_11980	PF05272	30	0.015	Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.7 bits (66), Expect = 0.015
Identities = 10/22 (45%), Positives = 13/22 (59%)

Query: 28 ILHLVGPNGAGKSTLLARMAGL 49
+ L G G GKSTL+ + GL
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
GX95_12200	PF03944	29	0.011	delta endotoxin
		>PF03944#delta endotoxin

Length = 633

Score = 28.9 bits (64), Expect = 0.011
Identities = 10/31 (32%), Positives = 17/31 (54%)

Query: 72 TQAYTGRPWPLIDGVGQIYGMYVITGLKTTR 102
TQ++T + WP + + Q+ YV+ G R
Sbjct: 285 TQSFTSQDWPFLYSLFQVNSNYVLNGFSGAR 315


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
GX95_12205	GPOSANCHOR	34	0.004	Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 33.9 bits (77), Expect = 0.004
Identities = 34/267 (12%), Positives = 69/267 (25%), Gaps = 39/267 (14%)

Query: 5 DIRVSFSAIDKLTRPVETARQSVGSLADSLKKTQADIKSLGTQSRAFSR----LRENFTR 60
+ I L L +L+ + + + L
Sbjct: 170 FSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKAD 229

Query: 61 TTEKIQKTQRTLNGLRQSQQAGNAMTDKQREHIAQLAAKLDRLNEVRTREKEKLREASRE 120
+ ++ + A A+L L+ T + K++ E
Sbjct: 230 LEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAE 289

Query: 121 MVRHGITLSGSDRTIQSAIRRTEQYTHTLDAERQMLAR-VTKARAQYDRMQQVAGRLRGG 179
+ + Q L+A RQ L R + +R +++ +L
Sbjct: 290 KAALEAEKADLEHQSQV-----------LNANRQSLRRDLDASREAKKQLEAEHQKL--- 335

Query: 180 GAVALGAATAAGYGAGQFLAPAVSFDREVSRVGALTRLDKSDPQFAALREQAKKLGAETQ 239
+ A SR LD S L + +KL + +
Sbjct: 336 -EEQNKISEA-------------------SRQSLRRDLDASREAKKQLEAEHQKLEEQNK 375

Query: 240 FTSRDAASGQAFLAMAGFTPQAIQAAL 266
+ S + L + + ++ AL
Sbjct: 376 ISEASRQSLRRDLDASREAKKQVEKAL 402


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
GX95_12240	HOKGEFTOXIC	37	1e-07	Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 36.7 bits (85), Expect = 1e-07
Identities = 15/47 (31%), Positives = 26/47 (55%), Gaps = 2/47 (4%)

Query: 1 MSQKPLKTA--IICITAVLIIWMLHGSLCELRMRLGGAEFAAFLQCK 45
+ + L I+C+T ++ ++ SLCE+R R G E AAF+ +
Sbjct: 3 LPRSSLVWCVLIVCLTLLIFTYLTRKSLCEIRYRDGYREVAAFMAYE 49


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
GX95_12245	HOKGEFTOXIC	39	2e-08	Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 38.7 bits (90), Expect = 2e-08
Identities = 14/47 (29%), Positives = 27/47 (57%), Gaps = 2/47 (4%)

Query: 1 MSQKSL--STIAICIAVVLIIWMLRGSLCELHMRLGGAEFAAFLQCK 45
+ + SL + +C+ +++ ++ R SLCE+ R G E AAF+ +
Sbjct: 3 LPRSSLVWCVLIVCLTLLIFTYLTRKSLCEIRYRDGYREVAAFMAYE 49


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
GX95_12250	DNABINDINGHU	119	6e-39	Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 119 bits (299), Expect = 6e-39
Identities = 34/89 (38%), Positives = 55/89 (61%)

Query: 4 TKAEMSEYLFDKLGLSKRDAKELVELFFEEIRRALENGEQVKLSGFGNFDLRDKNQRPGR 63
K ++ + + L+K+D+ V+ F + L GE+V+L GFGNF++R++ R GR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 64 NPKTGEDIPITARRVVTFRPGQKLKSRVE 92
NP+TGE+I I A +V F+ G+ LK V+
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAVK 91


36  GX95_12350  GX95_12435  Y  N  N  Genomic Island
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_12350  -1      15       -4.592660   carbohydrate deacetylase
GX95_12355  -1      14       -5.349623   6-phospho-beta-glucosidase
GX95_12360  0       14       -4.171004   transcriptional regulator ChbR
GX95_12365  1       16       -1.417748   PTS N,N'-diacetylchitobiose transporter subunit
GX95_12370  2       15       -1.481899   PTS N,N'-diacetylchitobiose transporter subunit
GX95_12375  2       15       0.461593    PTS sugar transporter subunit IIB
GX95_12380  1       17       1.579209    transcriptional regulator
GX95_12385  1       17       2.984193    NAD(+) synthase
GX95_12390  0       17       3.280064    endonuclease
GX95_12395  0       16       3.242263    ATP-independent periplasmic protein-refolding
GX95_12400  0       16       3.289592    succinylglutamate desuccinylase
GX95_12405  0       14       3.112196    N-succinylarginine dihydrolase
GX95_12410  0       14       3.156480    succinylglutamate-semialdehyde dehydrogenase
GX95_12415  0       13       2.516493    arginine N-succinyltransferase
GX95_12420  0       11       2.658671    aspartate aminotransferase family protein
GX95_12425  0       12       2.993513    exodeoxyribonuclease III
GX95_12430  0       12       3.544269    NTP pyrophosphohydrolase
GX95_12435  0       14       3.278167    hypothetical protein
37  GX95_12495  GX95_12785  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_12495  -2      11       -3.238700   aldo/keto reductase
GX95_12500  -1      11       -3.587213   anaerobic sulfatase maturase
GX95_12505  0       10       -3.632672   MltA-interacting protein MipA
GX95_12510  0       12       -3.249199   PrkA family serine protein kinase
GX95_12515  -2      15       -2.530711   hypothetical protein
GX95_12520  -1      17       -0.756193   diguanylate cyclase
GX95_12525  0       17       2.323608    hypothetical protein
GX95_12530  0       20       2.552610    hypothetical protein
GX95_12535  1       21       2.955700    hypothetical protein
GX95_12540  2       20       2.049676    AraC family transcriptional regulator
GX95_12545  2       22       0.649904    hypothetical protein
GX95_12550  1       27       -3.051735   MarR family transcriptional regulator
GX95_12555  3       27       -3.378532   hypothetical protein
GX95_12560  2       26       -4.365199   hypothetical protein
GX95_12565  2       26       -5.677383   hypothetical protein
GX95_12570  1       27       -5.703659   hypothetical protein
GX95_12575  0       26       -5.857556   hypothetical protein
GX95_12580  -1      25       -6.824980   DUF1869 domain-containing protein
GX95_12585  -2      29       -8.721580   hypothetical protein
GX95_12590  0       28       -7.249500   leucine efflux protein LeuE
GX95_12595  0       28       -7.451415   chorismate mutase
GX95_12600  1       27       -7.036374   histidine kinase
GX95_12605  0       23       -5.902406   transcriptional regulator
GX95_12610  -2      21       -4.320561   hypothetical protein
GX95_12615  0       18       -1.630280   aminoglycoside resistance protein
GX95_12620  1       19       -1.176399   metal-binding protein ZinT
GX95_12630  1       17       0.006548    *four-helix bundle copper-binding protein
GX95_12635  1       15       0.249817    mechanosensitive ion channel protein MscS
GX95_12640  2       17       0.549407    peptide ABC transporter ATP-binding protein
GX95_12645  3       17       -0.579170   peptide ABC transporter ATP-binding protein
GX95_12650  2       17       -1.260773   peptide ABC transporter permease
GX95_12655  1       18       -2.161670   peptide ABC transporter permease
GX95_12660  0       20       -3.366322   nickel ABC transporter substrate-binding
GX95_12665  1       29       -6.301843   hypothetical protein
GX95_12670  1       25       -5.806926   cytochrome b
GX95_12675  3       25       -6.020808   hypothetical protein
GX95_12680  5       31       -9.958516   hypothetical protein
GX95_12685  7       33       -10.156224  heat-shock protein
GX95_12690  8       36       -10.945512  hypothetical protein
GX95_12695  8       35       -10.077274  hypothetical protein
GX95_12700  7       36       -11.878188  hypothetical protein
GX95_12705  6       36       -11.399429  lysozyme inhibitor
GX95_12715  4       38       -10.636057  *hypothetical protein
GX95_12720  5       38       -10.397502  hypothetical protein
GX95_12725  3       39       -9.960064   hypothetical protein
GX95_12730  4       37       -10.458733  virulence factor
GX95_12735  6       43       -7.985578   cold-shock protein
GX95_12740  6       42       -8.919769   lipoprotein EnvE
GX95_12745  7       41       -8.909006   DinI family protein
GX95_12750  7       46       -10.411691  transposase
GX95_12755  8       49       -11.292319  cytolethal distending toxin subunit CdtB
GX95_12760  7       53       -12.293261  phage tail protein
GX95_12765  7       49       -12.780944  hypothetical protein
GX95_12770  4       36       -9.494436   pertussis toxin-like subunit ArtA
GX95_12775  1       31       -8.395759   subtilase cytotoxin subunit B
GX95_12780  -1      24       -6.876765   hypothetical protein
GX95_12785  -1      19       -5.556496   IS200/IS605 family transposase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_12535  PRTACTNFAMLY  28     0.012    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 28.5 bits (63), Expect = 0.012
Identities = 17/59 (28%), Positives = 25/59 (42%)

Query: 49 QGLTVGIIILTIGVMAPIASGTLPPSTLIHSFVNWKSLVAIAVGVFVSWLGGRGITLMG 107
Q + L IG + + LPPS ++ N ++ A VS LG +TL G
Sbjct: 174 QRSAIVDGGLHIGALQSLQPEDLPPSRVVLRDTNVTAVPASGAPAAVSVLGASELTLDG 232


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_12565  HTHTETR       28     0.002    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 28.4 bits (63), Expect = 0.002
Identities = 8/37 (21%), Positives = 17/37 (45%), Gaps = 5/37 (13%)

Query: 4 LSWIIFGLIAGILAKWIMPG-----KDGGGFFMTIIL 35
+ I+ G I+G++ W+ K ++ I+L
Sbjct: 163 AAIIMRGYISGLMENWLFAPQSFDLKKEARDYVAILL 199


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_12670  TYPE3IMSPROT  27     0.047    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 27.0 bits (60), Expect = 0.047
Identities = 17/61 (27%), Positives = 32/61 (52%), Gaps = 5/61 (8%)

Query: 77 PHWQHVAAKLMHIALYFTFLALPLLGVAMMASGGKSWSFFGFTVPVFLTPDSTLKSDIKR 136
P Q ++ + ++ L F +L PLL VA + + +GF ++ ++ +K DIK+
Sbjct: 67 PFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFL----ISGEA-IKPDIKK 121

Query: 137 I 137
I
Sbjct: 122 I 122


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_12720  ENTEROVIROMP  190    9e-65    Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 190 bits (485), Expect = 9e-65
Identities = 66/187 (35%), Positives = 92/187 (49%), Gaps = 18/187 (9%)

Query: 1 MKNIILSTLVITTGVLVVNVAQADTNAFSVGYAQSKVQDFKN-IRGVNVKYRYE-DDSPV 58
MK I + + + A T+ + GYAQS Q N + G N+KYRYE D+SP+
Sbjct: 1 MKKIACLSALAAVLAFTAGTSVAATSTVTGGYAQSDAQGQMNKMGGFNLKYRYEEDNSPL 60

Query: 59 SFISSLSYLYGDSQASGSVESEGIHYHDKFEVKYGSFMVGPAYRLSDNFSLYALAGVGTV 118
I S +Y AS D + +Y GPAYR++D S+Y + GVG
Sbjct: 61 GVIGSFTYTEKSRTASSG---------DYNKNQYYGITAGPAYRINDWASIYGVVGVGYG 111

Query: 119 KATFKEHSTQDGDSFSNKISSRKTGFAWGAGVQMNPLENIVVDVGYEGSNISSTKINGFN 178
K E+ T D+ GF++GAG+Q NP+EN+ +D YE S I S + +
Sbjct: 112 KFQTTEYPTYKHDT-------SDYGFSYGAGLQFNPMENVALDFSYEQSRIRSVDVGTWI 164

Query: 179 VGVGYRF 185
GVGYRF
Sbjct: 165 AGVGYRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_12760  cdtoxinb      294    e-103    Cytolethal distending toxin B signature.
		>cdtoxinb#Cytolethal distending toxin B signature.

Length = 269

Score = 294 bits (755), Expect = e-103
Identities = 125/276 (45%), Positives = 165/276 (59%), Gaps = 16/276 (5%)

Query: 1 MKKPVFFLLTMIICSYISFACANISDYKVMTWNLQGSSASTESKWNVNVRQLLSGTAGVD 60
MKK + L+ + S+ + A +++D++V TWNLQG+SA+TESKWN+NVRQL+SG VD
Sbjct: 1 MKKYIISLI--VFLSFYAQA--DLTDFRVATWNLQGASATTESKWNINVRQLISGENAVD 56

Query: 61 ILMVQEAGAVPTSAVPTGRHIQPFGVGIPIDEYTWNLGTTSRQDIRYIYHSAIDVGARRV 120
IL VQEAG+ P++AV TG I GIP+ E WNL T SR YIY SA+D RV
Sbjct: 57 ILAVQEAGSPPSTAVDTGTLIP--SPGIPVRELIWNLSTNSRPQQVYIYFSAVDALGGRV 114

Query: 121 NLAIVSRQRADNVYVLRPTTVASRPVIGIGLGNDVFLTAHALASGGPDAAAIVRVTINFF 180
NLA+VS +RAD V+VL P RP++GI +GND F TAHA+A DA A+V NFF
Sbjct: 115 NLALVSNRRADEVFVLSPVRQGGRPLLGIRIGNDAFFTAHAIAMRNNDAPALVEEVYNFF 174

Query: 181 RQ---PQMRHLSWFLAGDFNRSPDRLENDLMTEHLERVVAVLAPTEPTQIGGGILDYGVI 237
R P + L+W + GDFNR P LE +L T + R +++P TQ LDY V
Sbjct: 175 RDSRDPVHQALNWMILGDFNREPADLEMNL-TVPVRRASEIISPAAATQTSQRTLDYAVA 233

Query: 238 VDRAPYSQR------VEALRNPQLASDHYPVAFLAR 267
+ + V R Q++SDH+PV R
Sbjct: 234 GNSVAFRPSPLQAGIVYGARRTQISSDHFPVGVSRR 269


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_12775  BORPETOXINA   80     4e-20    Bordetella pertussis toxin A subunit signature.
		>BORPETOXINA#Bordetella pertussis toxin A subunit signature.

Length = 269

Score = 80.2 bits (197), Expect = 4e-20
Identities = 56/170 (32%), Positives = 85/170 (50%), Gaps = 12/170 (7%)

Query: 22 VYRVDSTPPDVIFRDGFSLLGYNRNFQQFISGRSCSGGSSDSRYIATTSSVNQT------ 75
VYR DS PP+ +F++GF+ G N N ++GRSC GSS+S +++T+SS T
Sbjct: 41 VYRYDSRPPEDVFQNGFTAWGNNDNVLDHLTGRSCQVGSSNSAFVSTSSSRRYTEVYLEH 100

Query: 76 ---YAIARAYYSRSTFKGNLYRYQIRADNNFYSLLPS-ITYLETQGGHFN-AYEKTMMRL 130
A+ R T Y Y++RADNNFY S Y++T G + +
Sbjct: 101 RMQEAVEAERAGRGTGHFIGYIYEVRADNNFYGAASSYFEYVDTYGDNAGRILAGALATY 160

Query: 131 QREYVSTLSILPENIQKAVALVYDSATGLVKDGVSTMNSSYLGLSTTSNP 180
Q EY++ I PENI++ + ++ TG N+ Y+ T +NP
Sbjct: 161 QSEYLAHRRIPPENIRRVTRVYHNGITGETTT-TEYSNARYVSQQTRANP 209


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_12780  BORPETOXINB   32     4e-04    Bordetella pertussis toxin B subunit signature.
		>BORPETOXINB#Bordetella pertussis toxin B subunit signature.

Length = 226

Score = 31.9 bits (72), Expect = 4e-04
Identities = 29/101 (28%), Positives = 41/101 (40%), Gaps = 7/101 (6%)

Query: 30 TNAYYSDEVISEIHVGQIDTSPYFCIKTVKANGSGTPVV-ACAVSKQSIWAPSFKELLDQ 88
T+ YYS+ + + T+ C V+ SG PV+ AC + + L
Sbjct: 126 TDHYYSNVTATRLLS---STNSRLCAVFVR---SGQPVIGACTSPYDGKYWSMYSRLRKM 179

Query: 89 ARYFYSTGQSVRIHVQKNIWTYPLFVNAFSANALVGLSSCS 129
Y G SVR+HV K Y F AL G+S C+
Sbjct: 180 LYLIYVAGISVRVHVSKEEQYYDYEDATFETYALTGISICN 220


38  GX95_13065  GX95_13130  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_13065  2       15       -1.302943   hypothetical protein
GX95_13070  2       15       -1.789068   septum formation inhibitor Maf
GX95_13075  4       13       1.675770    hypothetical protein
GX95_13080  4       13       1.484546    hypothetical protein
GX95_13085  3       13       1.004731    23S rRNA pseudouridine(955/2504/2580) synthase
GX95_13090  2       14       1.972345    hypothetical protein
GX95_13095  1       15       2.312525    ribonuclease E
GX95_13100  1       15       2.492150    flagellar hook-filament junction protein FlgL
GX95_13110  -2      14       2.294690    flagellar hook-associated protein FlgK
GX95_13115  -1      16       3.304501    flagellar rod assembly protein/muramidase FlgJ
GX95_13120  1       15       3.168066    flagellar biosynthesis protein FlgA
GX95_13125  2       14       2.660115    flagellar basal body L-ring protein
GX95_13130  2       15       2.567161    flagellar basal-body rod protein FlgG
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_13100  IGASERPTASE   55     2e-09    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 54.7 bits (131), Expect = 2e-09
Identities = 49/259 (18%), Positives = 92/259 (35%), Gaps = 26/259 (10%)

Query: 513 PSEEEYAERKRPEQPALATFAMPDVPPAPTPVEPAVSVATAKKDNVAAAQPAQPGLFSRF 572
P E+ + DVP P+ + A+ D PA P S
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSN-----NEEIARVDEAPVPPPA-PATPSET 1036

Query: 573 LNALKQLFSGEETKTVETAAPKAEEKAERQQDRRKPRQNNRRDRNERRDTRDNRAGRDGG 632
S +E+KTVE A E + ++ K ++N + +T+ N + G
Sbjct: 1037 TE-TVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKA-----NTQTNEVAQSGS 1090

Query: 633 ESRDDNRRNRRQAQQQNAEAR---DTRQQETAEKVKTGDEQQQTPRRERSRRRNDDKRQA 689
E+++ ++ E + +T + + KV + Q +P++E+S A
Sbjct: 1091 ETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTS----QVSPKQEQSETVQPQAEPA 1146

Query: 690 QQEVKALNREELPVQETEQEERVQQVQPRRKQRQLNQKVRFTNSAVVETVDTPVVVDEPR 749
++ +N +E Q + QP ++ N + T S V T ++ V E
Sbjct: 1147 RENDPTVNIKEPQSQTNTTAD---TEQPAKETSS-NVEQPVTESTTVNTGNSVVENPENT 1202

Query: 750 PVENVEQPVPAPRTELAKV 768
+ P +E +
Sbjct: 1203 TPATTQ---PTVNSESSNK 1218



Score = 39.3 bits (91), Expect = 1e-04
Identities = 51/372 (13%), Positives = 88/372 (23%), Gaps = 47/372 (12%)

Query: 630 DGGESRDDNRRNRRQAQQQNAEARDTRQQETAEKVKTGDEQQQTPRRERSRRRNDDKRQA 689
D G + R + N E Q + T + Q S +
Sbjct: 963 DLGAWKYKLRNVNGRYDLYNPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDE 1022

Query: 690 QQEVKALNREELPVQETEQEERVQQVQPRRKQRQLNQKVRFTNSAVVETVDTPVVVDEPR 749
ET E Q+ + K Q + N V + V +
Sbjct: 1023 APVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKE-AKSNVKANTQ 1081

Query: 750 PVENVEQPVPAPRTELAKVDLPVVADIAPEQDDSVEPRDNTGMPRRSRRSPRHLRVSGQR 809
E + T+ + A + E+ VE +++ P+ +
Sbjct: 1082 TNEVAQSGSETKETQTTETKET--ATVEKEEKAKVETE-------KTQEVPKVTSQVSPK 1132

Query: 810 RRRYRDERYPTQSPMPLTVACASPEMASGKVWIRYPIVRPQETQVVEEQREADLALPQPV 869
+ + + + P V +E Q + AD P
Sbjct: 1133 QEQSETVQPQAEPARE-----------------NDPTVNIKEPQS-QTNTTADTEQPAKE 1174

Query: 870 VAEPQVIAATVALEPQASVQAVENVAVEPQTVAEPQAPEVVEVETTHPEVIAAPVDEQPQ 929
Q V + + PE TT P V + ++
Sbjct: 1175 T-------------SSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKN 1221

Query: 930 LIAESDTPEAQEVIA------DAEPVAETADASITVAENVADVVVVEPEEETKAEAAVVE 983
S V D VA S ++D AV +
Sbjct: 1222 RHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAVSQ 1281

Query: 984 HTAEETVIATAQ 995
H ++ + Q
Sbjct: 1282 HISQLEMNNEGQ 1293


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_13105  FLAGELLIN     41     4e-06    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 41.2 bits (96), Expect = 4e-06
Identities = 30/138 (21%), Positives = 59/138 (42%)

Query: 1 MRISTQMMYEQNMSGITNSQAEWMKLGEQMSTGKRVTNPSDDPIAASQAVVLSQAQAQNS 60
I+T + + + SQ+ E++S+G R+ + DD + A + +
Sbjct: 2 QVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLT 61

Query: 61 QYALARTFATQKVSLEESVLSQVTTAIQTAQEKIVYAGNGTLSDDDRASLATDLQGIRDQ 120
Q + E L+++ +Q +E V A NGT SD D S+ ++Q ++
Sbjct: 62 QASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEE 121

Query: 121 LMNLANSTDGNGRYIFAG 138
+ ++N T NG + +
Sbjct: 122 IDRVSNQTQFNGVKVLSQ 139


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_13110  FLGHOOKAP1    664    0.0      Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 664 bits (1714), Expect = 0.0
Identities = 438/553 (79%), Positives = 487/553 (88%), Gaps = 8/553 (1%)

Query: 2 SSLINHAMSGLNAAQAALNTVSNNINNYNVAGYTRQTTILAQANSTLGAGGWIGNGVYVS 61
SSLIN+AMSGLNAAQAALNT SNNI++YNVAGYTRQTTI+AQANSTLGAGGW+GNGVYVS
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 60

Query: 62 GVQREYDAFITNQLRGAQNQSSGLTTRYEQMSKIDNLLADKSSSLSGSLQSFFTSLQTLV 121
GVQREYDAFITNQLR AQ QSSGLT RYEQMSKIDN+L+ +SSL+ +Q FFTSLQTLV
Sbjct: 61 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 120

Query: 122 SNAEDPAARQALIGKAEGLVNQFKTTDQYLRDQDKQVNIAIGSSVAQINNYAKQIANLND 181
SNAEDPAARQALIGK+EGLVNQFKTTDQYLRDQDKQVNIAIG+SV QINNYAKQIA+LND
Sbjct: 121 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 180

Query: 182 QISRMTGVGAGASPNDLLDQRDQLVSELNKIVGVEVSVQDGGTYNLTMANGYTLVQGSTA 241
QISR+TGVGAGASPN+LLDQRDQLVSELN+IVGVEVSVQDGGTYN+TMANGY+LVQGSTA
Sbjct: 181 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 240

Query: 242 RQLAAVPSSADPTRTTVAYVDEAAGNIEIPEKLLNTGSLGGLLTFRSQDLDQTRNTLGQL 301
RQLAAVPSSADP+RTTVAYVD AGNIEIPEKLLNTGSLGG+LTFRSQDLDQTRNTLGQL
Sbjct: 241 RQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL 300

Query: 302 ALAFADAFNAQHTKGYDADGNKGKDFFSIGSPVVYSNSNNADKTVSLTAKVVDSTKVQAT 361
ALAFA+AFN QH G+DA+G+ G+DFF+IG P V N+ N V++ A V D++ V AT
Sbjct: 301 ALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGD-VAIGATVTDASAVLAT 359

Query: 362 DYKIVFDGTDWQVTRTADNTTFTATKDADGKLEIDGLKVTVGTGAQKNDSFLLKPVSNAI 421
DYKI FD WQVTR A NTTFT T DA+GK+ DGL++T NDSF LKPVS+AI
Sbjct: 360 DYKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAI 419

Query: 422 VDMNVKVTNEAEIAMASESKLDPDVDTGDSDNRNGQALLDLQ-NSNVVGGNKTFNDAYAT 480
V+M+V +T+EA+IAMASE D GDSDNRNGQALLDLQ NS VGG K+FNDAYA+
Sbjct: 420 VNMDVLITDEAKIAMASEE------DAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYAS 473

Query: 481 LVSDVGNKTSTLKTSSTTQANVVKQLYKQQQSVSGVNLDEEYGNLQRYQQYYLANAQVLQ 540
LVSD+GNKT+TLKTSS TQ NVV QL QQQS+SGVNLDEEYGNLQR+QQYYLANAQVLQ
Sbjct: 474 LVSDIGNKTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQ 533

Query: 541 TANALFDALLNIR 553
TANA+FDAL+NIR
Sbjct: 534 TANAIFDALINIR 546


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_13115  FLGFLGJ       499    0.0      Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 499 bits (1285), Expect = 0.0
Identities = 263/316 (83%), Positives = 289/316 (91%), Gaps = 3/316 (0%)

Query: 1 MIGDGKLLASAAWDAQSLNELKAKAGQDPAANIRPVARQVEGMFVQMMLKSMREALPKDG 60
MI D KLLASAAWDAQSLNELKAKAG+DPAANIRPVARQVEGMFVQMMLKSMR+ALPKDG
Sbjct: 1 MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG 60

Query: 61 LFSSDQTRLYTSMYDQQIAQQMTAGKGLGLADMMVKQMTGGQTMPADDAPQVPLKFSLET 120
LFSS+ TRLYTSMYDQQIAQQMTAGKGLGLA+MMVKQMT Q +P + P P+KF LET
Sbjct: 61 LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET 120

Query: 121 VNSYQNQALTQLVRKAIPKTPDSSDAPLSGDSKDFLARLSLPARLASEQSGVPHHLILAQ 180
V YQNQAL+QLV+KA+P+ D S L GDSK FLA+LSLPA+LAS+QSGVPHHLILAQ
Sbjct: 121 VVRYQNQALSQLVQKAVPRNYDDS---LPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQ 177

Query: 181 AALESGWGQRQILRENGEPSYNVFGVKATASWKGPVTEITTTEYENGEAKKVKAKFRVYS 240
AALESGWGQRQI RENGEPSYN+FGVKA+ +WKGPVTEITTTEYENGEAKKVKAKFRVYS
Sbjct: 178 AALESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYS 237

Query: 241 SYLEALSDYVALLTRNPRYAAVTTAATAEQGAVALQNAGYATDPNYARKLTSMIQQLKAM 300
SYLEALSDYV LLTRNPRYAAVTTAA+AEQGA ALQ+AGYATDP+YARKLT+MIQQ+K++
Sbjct: 238 SYLEALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSI 297

Query: 301 SEKVSKTYSANLDNLF 316
S+KVSKTYS N+DNLF
Sbjct: 298 SDKVSKTYSMNIDNLF 313


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_13120  FLGPRINGFLGI  429    e-153    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 429 bits (1104), Expect = e-153
Identities = 153/362 (42%), Positives = 215/362 (59%), Gaps = 9/362 (2%)

Query: 5 LAGIVLALVTTLAHAERIRDLTSVQGVRENSLIGYGLVVGLDGTGDQTTQTPFTTQTLNN 64
A L+ A RI+D+ S+Q R+N LIGYGLVVGL GTGD +PFT Q++
Sbjct: 14 SALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMRA 73

Query: 65 MLSQLGITVPTGTNMQLKNVAAVMVTASYPPFARQGQTIDVVVSSMGNAKSLRGGTLLMT 124
ML LGIT G + KN+AAVMVTA+ PPFA G +DV VSS+G+A SLRGG L+MT
Sbjct: 74 MLQNLGITTQGGQS-NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLIMT 132

Query: 125 PLKGVDSQVYALAQGNILVGGAGASAGGSSVQVNQLNGGRITNGAIIERELPTQFGAGNT 184
L G D Q+YA+AQG ++V G A +++ R+ NGAIIERELP++F
Sbjct: 133 SLSGADGQIYAVAQGALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDSVN 192

Query: 185 INLQLNDEDFTMAQQITDAINRAR----GYGSATALDARTVQVRVPSGNSSQVRFLADIQ 240
+ LQL + DF+ A ++ D +N G A D++ + V+ P + R +A+I+
Sbjct: 193 LVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKPRV-ADLTRLMAEIE 251

Query: 241 NMEVNVTPQDAKVVINSRTGSVVMNREVTLDSCAVAQGNLSVTVNRQLNVNQPNTPFGGG 300
N+ V T AKVVIN RTG++V+ +V + AV+ G L+V V V QP PF G
Sbjct: 252 NLTVE-TDTPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQP-APFSRG 309

Query: 301 QTVVTPQTQIDLRQSGGSLQSVRSSANLNSVVRALNALGATPMDLMSILQSMQSAGCLRA 360
QT V PQT I Q G + ++ +L ++V LN++G +++ILQ ++SAG L+A
Sbjct: 310 QTAVQPQTDIMAMQEGSKV-AIVEGPDLRTLVAGLNSIGLKADGIIAILQGIKSAGALQA 368

Query: 361 KL 362
+L
Sbjct: 369 EL 370


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_13125  FLGLRINGFLGH  353    e-127    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 353 bits (908), Expect = e-127
Identities = 211/232 (90%), Positives = 223/232 (96%)

Query: 1 MQKYALHAYPVMALMVATLTGCAWIPAKPLVQGATTAQPIPGPVPVANGSIFQSAQPINY 60
MQK A H Y + +L+V +LTGCAWIP+ PLVQGAT+AQP+PGP PVANGSIFQSAQPINY
Sbjct: 1 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY 60

Query: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTSFGFDTVPRYLQGLFGNS 120
GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKT+FGFDTVPRYLQGLFGN+
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA 120

Query: 121 RADMEASGGNSFNGKGGANASNTFSGTLTVTVDQVLANGNLHVVGEKQIAINQGTEFIRF 180
RAD+EASGGN+FNGKGGANASNTFSGTLTVTVDQVL NGNLHVVGEKQIAINQGTEFIRF
Sbjct: 121 RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF 180

Query: 181 SGVVNPRTISGSNSVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 232
SGVVNPRTISGSN+VPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM
Sbjct: 181 SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 232


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_13130  FLGHOOKAP1    44     4e-07    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 43.8 bits (103), Expect = 4e-07
Identities = 18/81 (22%), Positives = 36/81 (44%), Gaps = 14/81 (17%)

Query: 3 SSLWIAKTGLDAQQTNMDVIANNLANVSTNGFKRQRAVFEDLLYQTIRQPGAQSSEQTTL 62
S + A +GL+A Q ++ +NN+++ + G+ RQ + + +TL
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI--------------MAQANSTL 47

Query: 63 PSGLQIGTGVRPVATERLHSQ 83
+G +G GV +R +
Sbjct: 48 GAGGWVGNGVYVSGVQREYDA 68



Score = 41.1 bits (96), Expect = 3e-06
Identities = 11/41 (26%), Positives = 21/41 (51%)

Query: 220 ETSNVNVAEELVNMIQVQRAYEINSKAVSTTDQMLQKLTQL 260
S VN+ EE N+ + Q+ Y N++ + T + + L +
Sbjct: 505 SISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


39  GX95_13290  GX95_13455  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_13290  0       31       -6.232040   O-acetyl-ADP-ribose deacetylase
GX95_13295  0       31       -8.345927   hypothetical protein
GX95_13300  -1      33       -8.724444   Curli assembly protein CsgC
GX95_13305  0       27       -7.646445   curlin
GX95_13310  0       26       -7.440935   curlin subunit CsgB
GX95_13315  -1      22       -5.926494   transcriptional regulator CsgD
GX95_13320  -2      17       -3.626852   curli assembly protein CsgE
GX95_13325  0       14       -1.865673   curli assembly protein CsgF
GX95_13330  0       15       -1.439921   curli production assembly/transport protein
GX95_13335  0       17       -2.173713   hypothetical protein
GX95_13340  0       20       -3.901326   molecular chaperone
GX95_13345  0       25       -6.183410   phosphatase
GX95_13350  1       33       -8.315763   bifunctional glyoxylate/hydroxypyruvate
GX95_13355  2       38       -9.235984   hypothetical protein
GX95_13370  2       33       -8.892492   *oxidoreductase
GX95_13375  3       33       -8.654021   MFS transporter
GX95_13380  1       30       -7.916008   hypothetical protein
GX95_13385  1       27       -6.576042   N-acetylneuraminic acid mutarotase
GX95_13390  0       16       -3.316640   N-acetylmannosamine-6-phosphate 2-epimerase
GX95_13395  -3      8        0.532176    acetylneuraminate ABC transporter
GX95_13400  -2      9        2.325909    N-acetylmannosamine kinase
GX95_13405  -1      11       3.055025    phosphate starvation protein PhoH
GX95_13410  -1      11       3.577865    phosphate starvation-inducible protein PhoH
GX95_13415  -1      11       4.005956    sodium/proline symporter
GX95_13420  1       13       4.441531    trifunctional transcriptional regulator/proline
GX95_13425  0       12       1.973150    hypothetical protein
GX95_13430  -2      13       2.640397    pyrimidine utilization regulatory protein R
GX95_13435  -2      14       2.117606    stress-induced protein
GX95_13440  -1      14       3.081645    NAD(P)H:quinone oxidoreductase, type IV
GX95_13445  -1      16       3.125282    hypothetical protein
GX95_13450  -1      14       2.952637    bifunctional glucose-1-phosphatase/inositol
GX95_13455  -1      15       3.175287    protein disulfide oxidoreductase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_13375  TCRTETA       51     5e-09    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 50.6 bits (121), Expect = 5e-09
Identities = 53/253 (20%), Positives = 92/253 (36%), Gaps = 24/253 (9%)

Query: 56 AFLATAAFIGRPFGGALFGLLADKFGRKPLMMWSIVAYSVGTGLSGLASGVIMLTLSRFI 115
A A F P GAL +D+FGR+P+++ S+ +V + A + +L + R +
Sbjct: 50 ALYALMQFACAPVLGAL----SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIV 105

Query: 116 VGMGMAGEYACASTYAVESWPKHLKSKASAFLVSGFGIGNIIAAYFMPSFAEAYGWRAAF 175
G+ A + A + +++ F+ + FG G ++A + + A F
Sbjct: 106 AGITGATGAVAGAYIA-DITDGDERARHFGFMSACFGFG-MVAGPVLGGLMGGFSPHAPF 163

Query: 176 FV-GLLPVLLVIYIRARAPESKEWEE--AKLSGLGKHSQSAWSVFSLSMKGLFNRA---- 228
F L L + PES + E + L + W+ + L
Sbjct: 164 FAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQ 223

Query: 229 ---QFPLTLCVFIVLFSIFGANWPIFGLLPTYLAGEGFDTGVVSNLMTAAAFGTVLGN-- 283
Q P L V+F +W + LA G ++ M LG
Sbjct: 224 LVGQVPAAL---WVIFGEDRFHWDA-TTIGISLAAFGI-LHSLAQAMITGPVAARLGERR 278

Query: 284 -IVWGLCADRIGL 295
++ G+ AD G
Sbjct: 279 ALMLGMIADGTGY 291


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_13430  HTHTETR       62     4e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 62.3 bits (151), Expect = 4e-14
Identities = 30/158 (18%), Positives = 58/158 (36%), Gaps = 8/158 (5%)

Query: 20 RQLILTAALAVFSQYGIHGARLEQVAERAGVSKTNLLYYYPSKEALYVAVMRQILDVWLA 79
RQ IL AL +FSQ G+ L ++A+ AGV++ + +++ K L+ +
Sbjct: 13 RQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGE 72

Query: 80 PLKAFRAEF--SPLEAIKEYIRLKLEVSRDYPQASRLF-CMEMLAGAPLLMEELTGDLKA 136
++A+F PL ++E + LE + + L + M + +
Sbjct: 73 LELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQRN 132

Query: 137 LIDEKSALIAGWVHSG-----KLAPVSPHHLIFMIWAA 169
L E I + A + ++
Sbjct: 133 LCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGY 170


40  GX95_13500  GX95_13600  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_13500  0       23       3.381560    4-hydroxyphenylacetate permease
GX95_13505  -1      22       3.961246    2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase
GX95_13510  -2      20       3.011153    2-oxo-hepta-3-ene-1,7-dioic acid hydratase
GX95_13515  -3      18       1.793307    5-carboxymethyl-2-hydroxymuconate
GX95_13520  -3      16       1.910400    3,4-dihydroxyphenylacetate 2,3-dioxygenase
GX95_13525  -2      13       1.625866    5-carboxymethyl-2-hydroxymuconate semialdehyde
GX95_13530  -2      12       -0.306476   4-hydroxyphenylacetate degradation protein
GX95_13535  -1      15       -1.428391   homoprotocatechuate degradation operon
GX95_13540  0       17       -2.237436   4-hydroxyphenylacetate 3-monooxygenase,
GX95_13545  0       23       -3.015974   4-hydroxyphenylacetate 3-monooxygenase,
GX95_13550  0       25       -3.818066   hydroxyisourate hydrolase
GX95_13555  0       25       -4.582033   DNA-binding response regulator
GX95_13560  0       25       -5.215267   two-component sensor histidine kinase
GX95_13565  1       26       -6.990991   dipeptidase
GX95_13570  4       30       -8.332770   hypothetical protein
GX95_13575  2       37       -9.784877   DUF3950 domain-containing protein
GX95_13580  3       38       -10.015951  inositol phosphatase
GX95_13585  4       35       -9.900271   type III secretion chaperone protein SigE
GX95_13590  3       33       -8.929634   effector protein PipB
GX95_13595  3       29       -6.863380   *BAX inhibitor protein
GX95_13600  1       21       -5.048180   sulfurtransferase TusE
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_13555  HTHFIS        82     2e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 82.2 bits (203), Expect = 2e-20
Identities = 30/117 (25%), Positives = 56/117 (47%), Gaps = 1/117 (0%)

Query: 2 KILLIEDNQKTIEWVRQGLTEAGYVVDYACDGRDGLHLALQEHYSLIILDIMLPGLDGWQ 61
IL+ +D+ + Q L+ AGY V + L++ D+++P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 VLRALRTAHQS-PVICLTARDSVEDRVKGLEAGANDYLVKPFSFAELLARVRAQLRQ 117
+L ++ A PV+ ++A+++ +K E GA DYL KPF EL+ + L +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_13560  PF06580       34     0.001    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 33.7 bits (77), Expect = 0.001
Identities = 18/102 (17%), Positives = 38/102 (37%), Gaps = 15/102 (14%)

Query: 348 ILLQRVLSNLLTNAIRYSDENAVIRIESAYDDNVAEIRVANPGSHPADADKLFRRFWRGD 407
+L+Q ++ N + + I + I ++ D+ + V N GS K
Sbjct: 258 MLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKE-------- 309

Query: 408 NARHTAGFGLGLSLVNA-IALLHGGSASYRYADEHNIFSVRL 448
G GL V + +L+G A + +++ + +
Sbjct: 310 ------STGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_13580  TYPE3OMBPROT  656    0.0      Type III secretion system outer membrane B protein ...
		>TYPE3OMBPROT#Type III secretion system outer membrane B protein family signature.

Length = 538

Score = 656 bits (1693), Expect = 0.0
Identities = 187/396 (47%), Positives = 254/396 (64%), Gaps = 5/396 (1%)

Query: 166 LNNQPWQTIKNTLTHNGHHYTNTQLPAAEMKIGAKDIFPSAYQGKGVCSWDTRNIHHANN 225
LNN+ W + ++H+G +Y PA+ MKIG K+IF Y GKG+C TR H N
Sbjct: 146 LNNKNWGPVNKNISHHGKNYGFQLTPASHMKIGNKNIFVKEYNGKGICCASTRESDHIAN 205

Query: 226 LWMSTVSVHEDGKDKTLFCGIRHGVLSPYH-EKDPLLRQVGAENKAKEVLTAALFSKPEL 284
+W+S V V ++GK+ +F GIRHGV+S Y +K+ R V A NKA+E+++AAL+S+PEL
Sbjct: 206 MWLSKV-VDDEGKE--IFSGIRHGVISAYGLKKNSSERAVAARNKAEELVSAALYSRPEL 262

Query: 285 LNRALEGEAVSLKLVSVGLLTASNIFGKEGTMVEDQMRAWQSL-TQPGKMIHLKIRNKDG 343
L++AL G+ V LK+VS LLT +++ G E +M++DQ+ A + L ++ G+ L IRN DG
Sbjct: 263 LSQALSGKTVDLKIVSTSLLTPTSLTGGEESMLKDQVNALKGLNSKRGEPTKLLIRNSDG 322

Query: 344 DLQTVKIKPDVAAFNVGVNELALKLGFGLKASDRYNAEALHQLLGNDLRPEARPGGWVGE 403
L+ V + V FN GVNELALK+G G + D+ N E++ LLG++ GGW E
Sbjct: 323 LLKEVSVNLKVVTFNFGVNELALKMGLGWRNVDKLNDESICSLLGDNFLKNGVIGGWAAE 382

Query: 404 WLAQYPDNYEVVNTLARQIKDIWKNNLHHKDGGEPYKLAQRLAMLAHEIDAVPAWNCKSG 463
+ + P V LA QIK+I L D GEPYKL+QR+ +LA+ I AVP WNCKSG
Sbjct: 383 AIEKNPPCKNDVIYLANQIKEIINKKLQKNDNGEPYKLSQRMTLLAYTIGAVPCWNCKSG 442

Query: 464 KDRTGMMDSEIKREHISLHQTHMLSAPGSLPDSGGQKIFQKVLLNSGNLEIQKQNTGGAG 523
KDRTGM D+EIKRE I H+T S S S +++F +L+NSGN+EIQ+ NTG G
Sbjct: 443 KDRTGMQDAEIKREIIRKHETGQFSQLNSKLSSEEKRLFSTILMNSGNMEIQEMNTGVPG 502

Query: 524 NKVMKNLSPEVLNLSYQKRVGDENIWQSVKGISSLI 559
NKVMK L L LSY +R+GD IW VKG SS +
Sbjct: 503 NKVMKKLPLSSLELSYSERIGDSKIWNMVKGYSSFV 538


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_13585  PF07824       165    1e-56    Type III secretion chaperone
		>PF07824#Type III secretion chaperone

Length = 120

Score = 165 bits (419), Expect = 1e-56
Identities = 33/114 (28%), Positives = 63/114 (55%), Gaps = 1/114 (0%)

Query: 1 MESLLNRLYDALGLDAPE-DEPLLIIDDGIQVYFNESDHTLEMCCPFMPLPDDILTLQHF 59
ME L + + ALG+ + + D+ +++DD + +Y + ++ + CPF LP++I L +
Sbjct: 1 MEDLADVICRALGIPSIDTDDQAIMLDDDVLIYIEKEGDSINLLCPFCALPENINDLIYA 60

Query: 60 LRLNYTSAVTIGADADNTALVALYRLPQTSTEEEALTGFELFISNVKQLKEHYA 113
L LNY+ + + D + +L+A L + E+ E +IS V+ LK+ +A
Sbjct: 61 LSLNYSEKICLATDDEGGSLIARLDLTGINEFEDIYVNTEYYISRVRWLKDEFA 114


41  GX95_13900  GX95_13940  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_13900  1       21       -3.836407   phosphoserine transaminase
GX95_13905  2       23       -4.656416   hypothetical protein
GX95_13910  2       23       -4.876932   ribosomal protein S12 methylthiotransferase
GX95_13915  3       23       -6.247698   formate transporter FocA
GX95_13920  2       20       -5.634612   formate C-acetyltransferase
GX95_13925  5       27       -7.667932   pathogenicity island 1 protein SopD2
GX95_13930  5       16       -2.840662   pyruvate formate lyase-activating protein
GX95_13935  4       15       -1.634229   transporter
GX95_13940  2       12       -0.578048   MFS transporter
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_13945  TCRTETB       34     7e-04    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 34.5 bits (79), Expect = 7e-04
Identities = 39/158 (24%), Positives = 62/158 (39%), Gaps = 6/158 (3%)

Query: 8 VMLLLCGLLLLT-LAIAVLNTLVPLWLAQANLPTWQVGMVSSSYFTGNLVGTLFTGYLIK 66
+++ LC L + L VLN +P N P V++++ +GT G L
Sbjct: 15 ILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSD 74

Query: 67 RIGFNRSYYLASLIFAAGCVGLGVMVGFWSWMSW-RFIAGIGCAMIWVVVESALMCSGTS 125
++G R +I G V V F+S + RFI G G A +V +
Sbjct: 75 QLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPK 134

Query: 126 HNRGRLLAAYMMVYYMGTFLGQLLVSKVSGELLHVLPW 163
NRG+ + MG +G + G + H + W
Sbjct: 135 ENRGKAFGLIGSIVAMGEGVGPA----IGGMIAHYIHW 168


42  GX95_14085  GX95_14425  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_14085  -2      17       3.401460    hydroxylamine reductase
GX95_14090  -1      19       3.879843    NADH oxidoreductase
GX95_14100  2       16       4.037660    pyruvate oxidase
GX95_14105  3       16       3.589531    low-specificity L-threonine aldolase
GX95_14110  3       18       -4.216086   NAD(P)-dependent oxidoreductase
GX95_14115  3       31       -9.492772   hypothetical protein
GX95_14120  2       36       -11.114131  N-acetylmuramoyl-L-alanine amidase
GX95_14125  3       41       -12.513787  hypothetical protein
GX95_14130  4       40       -10.633728  hypothetical protein
GX95_14135  4       41       -10.686629  MFS transporter
GX95_14140  4       38       -7.049005   phage tail protein
GX95_14145  4       33       -4.833993   phage tail protein
GX95_14150  5       36       -5.193048   phage tail protein I
GX95_14155  6       38       -5.869760   baseplate J protein
GX95_14160  8       39       -7.987945   hypothetical protein
GX95_14165  7       37       -5.671345   hypothetical protein
GX95_14170  6       34       -5.163553   baseplate assembly protein
GX95_14175  6       35       -5.957007   late control D family protein
GX95_14180  5       34       -6.274346   phage tail tape measure protein
GX95_14185  4       35       -6.436712   hypothetical protein
GX95_14190  5       37       -6.595860   phage tail protein
GX95_14195  5       41       -9.854303   phage tail protein
GX95_14200  3       41       -9.427708   hypothetical protein
GX95_14205  3       37       -7.609207   hypothetical protein
GX95_14210  3       38       -7.454079   sugar transporter
GX95_14215  3       32       -6.352362   hypothetical protein
GX95_14220  2       32       -5.065775   capsid protein
GX95_14225  2       32       -4.461602   hypothetical protein
GX95_14230  3       29       -3.987397   peptidase S14
GX95_14235  3       29       -4.023294   phage portal protein
GX95_14240  4       28       -4.239572   hypothetical protein
GX95_14245  4       31       -3.450996   terminase
GX95_14250  3       27       -3.319722   hypothetical protein
GX95_14255  3       26       -3.009157   nuclease
GX95_14260  3       26       -1.625823   hypothetical protein
GX95_14265  4       24       -0.836159   hypothetical protein
GX95_14270  4       26       -2.060367   hypothetical protein
GX95_14275  4       29       -6.784593   lysozyme
GX95_14280  3       29       -6.857611   hypothetical protein
GX95_14285  3       27       -6.586079   restriction endonuclease subunit S
GX95_14295  3       25       -6.069417   restriction endonuclease subunit M
GX95_14300  4       23       -4.977903   antitermination protein
GX95_14305  5       24       -2.414537   DNA primase
GX95_14310  6       26       -2.706981   helicase
GX95_14315  7       25       -3.213515   hypothetical protein
GX95_14325  7       39       -7.731958   hypothetical protein
GX95_14330  6       33       -7.484104   phage repressor
GX95_14335  3       31       -6.347209   hypothetical protein
GX95_14340  3       26       -5.364092   hypothetical protein
GX95_14345  4       27       -5.335827   hypothetical protein
GX95_14350  3       29       -3.792097   hypothetical protein
GX95_14355  3       27       -3.598093   hypothetical protein
GX95_14360  3       28       -4.314361   hypothetical protein
GX95_14365  5       31       -5.166306   hypothetical protein
GX95_14370  6       34       -5.689968   hypothetical protein
GX95_14375328-3.744029hypothetical protein
GX95_14380118-2.803819hypothetical protein
GX95_14385014-2.478368hypothetical protein
GX95_14390011-1.726030excisionase
GX95_14395011-1.555280integrase
GX95_14400-214-0.704539lipoprotein
GX95_14405-313-1.395110arginine ABC transporter ATP-binding protein
GX95_14410-212-2.530711arginine ABC transporter substrate-binding
GX95_14415014-3.881283arginine transporter permease subunit ArtQ
GX95_14420015-3.290535arginine transporter permease subunit ArtM
GX95_14425115-3.515219arginine ABC transporter substrate-binding
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GX95_14110    NUCEPIMERASE    55    2e-10    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 55.2 bits (133), Expect = 2e-10
Identities = 30/125 (24%), Positives = 49/125 (39%), Gaps = 17/125 (13%)

Query: 4 RILVLGASGYIGQHLVFALSQQGHQVRA---------AARRVERLEKQRLANVSCHKVDL 54
+ LV GA+G+IG H+ L + GHQV + + RLE HK+DL
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 55 HWPENLPALLRD--IDTVYYLVH------GMGEGGDFIAHERQAALNVRDALRQTPVKQL 106
E + L + V+ H + + LN+ + R ++ L
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHL 121

Query: 107 IFLSS 111
++ SS
Sbjct: 122 LYASS 126


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GX95_14115    NUCEPIMERASE    68    8e-15    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 67.5 bits (165), Expect = 8e-15
Identities = 70/370 (18%), Positives = 122/370 (32%), Gaps = 71/370 (19%)

Query: 1 MKVLVTGATSGLGRNAVEFLRNKGISVRA---------TGRNEAMGKLLEKMGAEFVHAD 51
MK LVTGA +G + + L G V +A +LL + G +F D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 52 LTELVSSQAKVMLAGIDTLWHCS-------SFTSPWGTQQAFDLANVRATRRLGEWAVAW 104
L + + ++ S +P A+ +N+ + E
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPH----AYADSNLTGFLNILEGCRHN 116

Query: 105 GVRNFIHISSPSLYFDYHHHRDIKEDFRPHRFANEFARSKAAGEEVINLLAQANPQT--- 161
+++ ++ SS S+Y + D + +A +K A E L+A
Sbjct: 117 KIQHLLYASSSSVYGL-NRKMPFSTDDSVDHPVSLYAATKKANE----LMAHTYSHLYGL 171

Query: 162 RFTVLRPQSLFGPHDK--VFIPRLAHMMHHYGSVLLPHGGSALVDMTYYENAIHAMWLAS 219
T LR +++GP + + + + M S+ + + G D TY ++ A+
Sbjct: 172 PATGLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQ 231

Query: 220 QPGCDHLPS--------------GRAYNITNGENRTLRSIVQKLIDELTIDCRIRSVPYP 265
R YNI N L +Q L D L I+ + +P
Sbjct: 232 DVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLPLQ 291

Query: 266 MLDMIARSMERFGKKSAKEPPLTHYGVSKLNFDFTLDTTRAQEELGYQPIVTLDEGIERT 325
D+ T DT E +G+ P T+ +G++
Sbjct: 292 PGDV----------------LETS-----------ADTKALYEVIGFTPETTVKDGVKNF 324

Query: 326 AAWLRDHGNL 335
W RD +
Sbjct: 325 VNWYRDFYKV 334


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GX95_14140    TCRTETB    33    0.002    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 33.3 bits (76), Expect = 0.002
Identities = 26/160 (16%), Positives = 62/160 (38%), Gaps = 3/160 (1%)

Query: 31 FAAAAPLLMHDVGMTAVQFGTASSIYYILYGVSKFGSGLLADRINPRVFLAPVLLVIAMI 90
+ P + +D ++ + + + + G L+D++ + L +++
Sbjct: 33 LNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFG 92

Query: 91 NVGIGMCNNVTILLTLYCFTAIAQGCGFPP-IAKAISQWYSKSERGGWYSLWNTSHNVGG 149
+V + ++ LL + F A FP + ++++ K RG + L + +G
Sbjct: 93 SVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGE 152

Query: 150 ALAPLIASAVITYTGNWRYAFFVPAFITLVQAIISAIFMR 189
+ P I + Y W Y +P IT++ ++
Sbjct: 153 GVGPAIGGMIAHYIH-WSYLLLIP-MITIITVPFLMKLLK 190


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GX95_14160    RTXTOXINC    28    0.016    Gram-negative bacterial RTX toxin-activating protein C...
		>RTXTOXINC#Gram-negative bacterial RTX toxin-activating protein C signature.

Length = 170

Score = 27.9 bits (62), Expect = 0.016
Identities = 11/23 (47%), Positives = 14/23 (60%)

Query: 56 AVLDHLAWQWNSDTWRDNWPVSL 78
+L H++W W S NWPVSL
Sbjct: 8 EILGHVSWLWASSPLHRNWPVSL 30


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GX95_14175    BCTERIALGSPF    30    0.005    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 29.8 bits (67), Expect = 0.005
Identities = 21/82 (25%), Positives = 35/82 (42%), Gaps = 4/82 (4%)

Query: 3 VGMYGSMPFVASSMVVNTFANFKRSSKRRLARHEVIGLKPVLEDIGPDLDEVSFTMRLDT 62
V +G +A F R KRR++ H + P+ IG + T R
Sbjct: 223 VRTFGPWMLLALLAGFMAFRVMLRQEKRRVSFHRRLLHLPL---IGR-IARGLNTARYAR 278

Query: 63 TLGVVPLAALSLLRVMHSAQEV 84
TL ++ +A+ LL+ M + +V
Sbjct: 279 TLSILNASAVPLLQAMRISGDV 300


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GX95_14190    TYPE4SSCAGA    35    0.002    Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 34.7 bits (79), Expect = 0.002
Identities = 34/123 (27%), Positives = 58/123 (47%), Gaps = 17/123 (13%)

Query: 129 AKGLTARE--KELKLSIDRNREAQSRSVEQVIRYKTALAQARTT--ILDAKRAQDELNRS 184
KGL+ +E K +K + N+E V + + + A+A A+ T + K+AQ +L +S
Sbjct: 562 TKGLSPQEANKLIKDFLSSNKEL----VGKTLNFNKAVADAKNTGNYDEVKKAQKDLEKS 617

Query: 185 LEKRREL------KMEQLGEAKGQLVRSGVQTTAVAAGVFAAANNTANFNRENKMIGLTA 238
L KR L K+E K ++ + Q + +FA N A NR+ + I
Sbjct: 618 LRKREHLEKEVEKKLESKSGNKNKM-EAKAQANSQKDEIFALINKEA--NRDARAIAYAQ 674

Query: 239 DMK 241
++K
Sbjct: 675 NLK 677


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GX95_14405    PF05272    30    0.007    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.007
Identities = 16/50 (32%), Positives = 22/50 (44%), Gaps = 1/50 (2%)

Query: 31 LVLLGPSGAGKSSLLRVLNLLEMPRSGTLTIAGNHFDFTKTPSDKAIREL 80
+VL G G GKS+L+ L L+ S T G D + + EL
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDF-FSDTHFDIGTGKDSYEQIAGIVAYEL 647


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GX95_14410    FLGFLIH    31    0.004    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 30.5 bits (68), Expect = 0.004
Identities = 34/119 (28%), Positives = 50/119 (42%), Gaps = 31/119 (26%)

Query: 81 FDAVMAG--MDITPEREKQVLFTTPYYDNSALFVGQQGKYTSVDQLKGKKVGVQNGTTHQ 138
D+V+A M + E +QV+ TP DNSAL + QL Q
Sbjct: 112 LDSVIASRLMQMALEAARQVIGQTPTVDNSALI-------KQIQQL-----------LQQ 153

Query: 139 KFIMDKHPEITTVPYDSYQNAKLDLQNGRIDAVFGDTAVVTEW-LKANPKLAPVGDKVT 196
+ + P++ P DLQ R+D + G T + W L+ +P L P G KV+
Sbjct: 154 EPLFSGKPQLRVHPD--------DLQ--RVDDMLGATLSLHGWRLRGDPTLHPGGCKVS 202


43    GX95_14480    GX95_14695    Y    Y    Y    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
GX95_144802150.072587ribosomal protein S6 modification protein
GX95_14485015-0.950000nitroreductase A
GX95_14490-117-0.837012hypothetical protein
GX95_14495018-0.636324glutaredoxin
GX95_145001181.275001hypothetical protein
GX95_145051200.526351transporter
GX95_14510324-0.261781late control protein B
GX95_145153230.035205phage tail protein
GX95_14520323-0.030262hypothetical protein
GX95_14525222-0.982939hypothetical protein
GX95_14530328-5.003532hypothetical protein
GX95_14535430-4.652245hypothetical protein
GX95_14540633-6.503532hypothetical protein
GX95_14545547-13.745173regulator
GX95_14550544-12.212078phage repressor protein
GX95_14555341-11.264711hypothetical protein
GX95_14560032-8.839222integrase
GX95_14565-128-8.043912DNA-binding transcriptional regulator
GX95_14570-120-5.682411hypothetical protein
GX95_14575-1110.126397sugar-phosphatase
GX95_145801120.713515multidrug transporter MdfA
GX95_14585-1110.753841undecaprenyl-diphosphate phosphatase
GX95_14590-211-0.052822DNA-binding transcriptional repressor DeoR
GX95_14595-2100.215388serine-type D-Ala-D-Ala carboxypeptidase
GX95_14600-210-0.329436glutathione S-transferase
GX95_14605013-1.629142sugar dehydrogenase
GX95_14610018-4.412834glucose dehydrogenase
GX95_14615429-7.725347hypothetical protein
GX95_14620537-9.402504LysR family transcriptional regulator
GX95_14625638-9.475103electron transfer flavoprotein-ubiquinone
GX95_14630844-10.704437acyl-CoA dehydrogenase
GX95_14635848-11.462259acyl dehydratase
GX95_14640849-11.458772electron transfer flavoprotein subunit alpha
GX95_14645649-11.315040electron transfer flavoprotein subunit beta
GX95_14650233-7.884970CoA ester lyase
GX95_14655-126-6.104678transcriptional regulator
GX95_14660-317-3.534490ribosomal protein S12 methylthiotransferase
GX95_14665-212-0.859775glutathione ABC transporter permease GsiD
GX95_14670-1141.381318glutathione ABC transporter permease GsiC
GX95_14675-1152.279082glutathione ABC transporter substrate-binding
GX95_146800152.953993glutathione ABC transporter ATP-binding protein
GX95_146900133.314381beta-aspartyl-peptidase
GX95_14695-2123.660691molybdopterin molybdotransferase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GX95_14505    TCRTETA    31    0.011    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 31.3 bits (71), Expect = 0.011
Identities = 21/106 (19%), Positives = 34/106 (32%), Gaps = 6/106 (5%)

Query: 394 LMIGMITFQFSNFSFGIGNAAGLLFAGIML-GFLRANHPTFG-YIPQ--GALNMVKEFGL 449
L++ + +L+ G ++ G A G YI + FG
Sbjct: 76 LLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDGDERARHFGF 135

Query: 450 MVFMAGVGLSAGSGISNGLGAVGGQM--LIAGLVVSLVPVVICFLF 493
M G G+ AG + +G A + L + CFL
Sbjct: 136 MSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLL 181


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GX95_14580    HTHTETR    46    1e-08    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 46.2 bits (109), Expect = 1e-08
Identities = 17/80 (21%), Positives = 33/80 (41%)

Query: 7 RRANDPKRREKIIQATLEAVKTYGVHAVTHRKIAAIAQVPLGSMTYYFAGMDALLSEAFT 66
+ + R+ I+ L GV + + +IA A V G++ ++F L SE +
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 67 LFTENMSRQYQDFFAQVTDA 86
L N+ ++ A+
Sbjct: 65 LSESNIGELELEYQAKFPGD 84


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GX95_14585    TCRTETB    33    0.002    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 33.3 bits (76), Expect = 0.002
Identities = 33/150 (22%), Positives = 65/150 (43%), Gaps = 6/150 (4%)

Query: 218 LLIGVVVLAMAFAEGSANDWL-PLLMVDGHGFSP-TSGSLIYAGFTLGMTVGRFTGGWFI 275
+IGV+ + F + + P +M D H S GS+I T+ + + + GG +
Sbjct: 258 FMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILV 317

Query: 276 DRYSRVTVVR-ASALM--GALGIGLIIFVDSDWVA-GVSVILWGLGASLGFPLTISAASD 331
DR + V+ + L ++ S ++ + +L GL + TI ++S
Sbjct: 318 DRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVISTIVSSSL 377

Query: 332 TGPDAPTRVSVVATTGYLAFLVGPPLLGYL 361
+A +S++ T +L+ G ++G L
Sbjct: 378 KQQEAGAGMSLLNFTSFLSEGTGIAIVGGL 407



Score = 28.7 bits (64), Expect = 0.042
Identities = 32/135 (23%), Positives = 56/135 (41%), Gaps = 11/135 (8%)

Query: 38 IRDILSVSTAEMGAVLFGLSIGSMSGILCS---AWLVKRFGTRKVIRTTMTCAVTGMVIL 94
++D+ +STAE+G+V+ + G+MS I+ LV R G V+ +T +
Sbjct: 283 MKDVHQLSTAEIGSVI--IFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTA 340

Query: 95 SVALWCASPLIFALGLAVFGASFGAAEVAINVEGAAVERELNKTVLPMMHGFYSFGTLAG 154
S L S + + + V G + I+ V L + +F +
Sbjct: 341 SFLLETTSWFMTIIIVFVLG-GLSFTKTVIS---TIVSSSLKQQEAGAGMSLLNFTSFLS 396

Query: 155 AGVGMALTA--LSVP 167
G G+A+ LS+P
Sbjct: 397 EGTGIAIVGGLLSIP 411


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GX95_14595    TCRTETB    44    6e-07    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 44.1 bits (104), Expect = 6e-07
Identities = 66/356 (18%), Positives = 127/356 (35%), Gaps = 51/356 (14%)

Query: 48 QAGLDWVPTSMTAYLAGGMFLQWLLGPLSDRIGRRPVMLAGVVWFIVTCLATLLAKNIEQ 107
A +WV T+ + G + G LSD++G + ++L G++ + + +
Sbjct: 48 PASTNWVNTAFMLTFSIGTAV---YGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFS 104

Query: 108 FT-FLRFLQGISLCFIGAVGYAAIQESFEEAVCIKITALMANVALIAPLLGPLVGAAWVH 166
RF+QG A+ + + K L+ ++ + +GP +G H
Sbjct: 105 LLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAH 164

Query: 167 VLPWEGMFILFAALAAIAFFGLQRAMPETATRRGE------------------------- 201
+ W +L + I L + + + +G
Sbjct: 165 YIHW-SYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFMLFTTSYSI 223

Query: 202 ------TLSFKALGRDYRLV---------IKNRRFVAGALALGFVSLPLLAWIAQSPIII 246
LSF + R V KN F+ G L G + + +++ P ++
Sbjct: 224 SFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMM 283

Query: 247 ISGEQLSSYEYG-LLQVPVFGALIAGNLVLARLTSRRTVRSLIVMGGWPIVAGLIIAAAA 305
QLS+ E G ++ P ++I + L RR ++ +G + + A+
Sbjct: 284 KDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTAS-- 341

Query: 306 TVVSSHAYLWMTAGLSVYAFGIGLANAGLVRLTLFSSDMSKGTVSAAMGMLQMLIF 361
+ +MT + V+ G GL+ V T+ SS + + A M +L F
Sbjct: 342 -FLLETTSWFMTIII-VFVLG-GLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSF 394


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GX95_14610    BLACTAMASEA    47    5e-08    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 47.1 bits (112), Expect = 5e-08
Identities = 49/207 (23%), Positives = 78/207 (37%), Gaps = 25/207 (12%)

Query: 1 MTQYASSLRSLAAGSVLLFLFASPVKAEEQTIAPPGVDAR-AWILMDYASGKVLAEGNAD 59
M + SL A ++ L + ASP E+ ++ + R I MD ASG+ L AD
Sbjct: 1 MRYIRLCIISLLA-TLPLAVHASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRAD 59

Query: 60 EKLDPASLTKIMTSYVVGQALKAGKIKLTDMVTVGKDAWATGNPALRGSSVMFLKPGDQV 119
E+ S K++ V + AG +L + + +P V D +
Sbjct: 60 ERFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYSP------VSEKHLADGM 113

Query: 120 SVADLNKGIIIQSGNDACIALADYVAGSQESFIGLMNAYAKRLGLTNTT---FQTVHGLD 176
+V +L I S N A L V G + A+ +++G T ++T
Sbjct: 114 TVGELCAAAITMSDNSAANLLLATVGGPAG-----LTAFLRQIGDNVTRLDRWETELNEA 168

Query: 177 APGQF---STARDMA------LLGKAL 194
PG +T MA L + L
Sbjct: 169 LPGDARDTTTPASMAATLRKLLTSQRL 195


44    GX95_14910    GX95_15270    Y    Y    Y    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
GX95_14910222-4.161047molybdenum cofactor biosynthesis protein B
GX95_14915224-4.750858cyclic pyranopterin phosphate synthase
GX95_14920224-3.947252hypothetical protein
GX95_14925326-4.673869E3 ubiquitin--protein ligase
GX95_14930124-3.216720hypothetical protein
GX95_14935123-2.813943hypothetical protein
GX95_14940023-1.622807DNA-binding protein
GX95_14945022-1.896191transposase
GX95_14950228-3.840937hypothetical protein
GX95_14955326-1.970874hypothetical protein
GX95_14960426-3.062984hypothetical protein
GX95_14965327-2.374516hypothetical protein
GX95_149701250.864027hypothetical protein
GX95_149750232.225671sulfate transporter
GX95_149801273.434773hypothetical protein
GX95_149851262.698227hypothetical protein
GX95_149900262.515574hypothetical protein
GX95_149952272.677152hypothetical protein
GX95_15000125-0.023728hypothetical protein
GX95_15005128-3.562694hypothetical protein
GX95_15010025-1.701513hypothetical protein
GX95_15015124-2.196788hypothetical protein
GX95_15020121-1.493356hypothetical protein
GX95_15025120-0.102529mor transcription activator family protein
GX95_150303211.336253hypothetical protein
GX95_150355245.265085hypothetical protein
GX95_150405274.856505hypothetical protein
GX95_150456305.495342hypothetical protein
GX95_150505295.561865hypothetical protein
GX95_150555304.017301DNA-binding protein
GX95_150605262.458763hypothetical protein
GX95_150654283.480674hypothetical protein
GX95_150706254.636040hypothetical protein
GX95_150756286.254839hypothetical protein
GX95_150806296.338533hypothetical protein
GX95_150856296.977748hypothetical protein
GX95_150907317.681234hypothetical protein
GX95_150955297.547587phage virion morphogenesis protein
GX95_151004317.587525hypothetical protein
GX95_151054296.538394hypothetical protein
GX95_151104296.944945head protein
GX95_151155316.860037hypothetical protein
GX95_151205358.012001hypothetical protein
GX95_151257358.798223hypothetical protein
GX95_151307368.986140hypothetical protein
GX95_151353296.700076phage tail protein
GX95_151403256.244728phage tail protein
GX95_151454256.306817hypothetical protein
GX95_151504225.159978hypothetical protein
GX95_151554225.239972hypothetical protein
GX95_151604214.807974phage tail protein
GX95_151656204.779910phage baseplate protein
GX95_151706213.448652hypothetical protein
GX95_151757220.788135phage tail protein
GX95_15185324-2.610210phage tail protein
GX95_15190-115-2.277356phage tail protein
GX95_15195-114-1.542924phage tail protein
GX95_15200-212-0.579869phage tail protein
GX95_15205-3111.387882UDP-glucose--(glucosyl)LPS
GX95_15215-2154.214220excinuclease ABC subunit B
GX95_15220-1145.282958dethiobiotin synthase
GX95_15225-1155.567488malonyl-ACP O-methyltransferase BioC
GX95_15230-1165.5507868-amino-7-oxononanoate synthase
GX95_15235-1165.069769biotin synthase
GX95_152400175.971061adenosylmethionine--8-amino-7-oxononanoate
GX95_15245-1175.912831kinase inhibitor
GX95_15250-1165.581240histidine ammonia-lyase
GX95_152550164.787114urocanate hydratase
GX95_152600143.723788histidine utilization repressor
GX95_152650123.484213formimidoylglutamase
GX95_152700143.110269imidazolonepropionase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GX95_15270    PRTACTNFAMLY    31    0.013    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 30.8 bits (69), Expect = 0.013
Identities = 17/55 (30%), Positives = 24/55 (43%), Gaps = 5/55 (9%)

Query: 230 VLQTAKALGIPVKGHVEQLSLLGGAQLVSRYQGLSADHIEYLDEAGVAAMRDGGT 284
VL+ +P G +S+LG ++L L HI AGVAAM+
Sbjct: 202 VLRDTNVTAVPASGAPAAVSVLGASELT-----LDGGHITGGRAAGVAAMQGAVV 251


45    GX95_15375    GX95_15440    Y    Y    Y    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
GX95_153752161.666513nicotinamide riboside transporter PnuC
GX95_153802171.936167quinolinate synthase
GX95_154252181.530633*****cell division protein CpoB
GX95_154302201.549486peptidoglycan-associated lipoprotein
GX95_154352171.500086Tol-Pal system beta propeller repeat protein
GX95_154407180.985397cell envelope integrity protein TolA
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GX95_15425    RTXTOXIND    29    0.022    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.0 bits (65), Expect = 0.022
Identities = 2/39 (5%), Positives = 19/39 (48%)

Query: 56 LTQLQQQLSDNQSDIDSLRGQIQENQYQLNQVMERQKQI 94
+ + + + + +++ + Q+++ + ++ E + +
Sbjct: 254 VLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLV 292


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GX95_15430    OMPADOMAIN    115    2e-33    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 115 bits (289), Expect = 2e-33
Identities = 36/119 (30%), Positives = 55/119 (46%), Gaps = 4/119 (3%)

Query: 56 EEQARLQMQQLQQNNIVYFDLDKYDIRSDFAAMLDAHANFLRSN--PSYKVTVEGHADER 113
+Q + + V F+ +K ++ + A LD + L + V V G+ D
Sbjct: 205 APAPEVQTKHFTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRI 264

Query: 114 GTPEYNISLGERRANAVKMYLQGKGVSADQISIVSYGKEKPAVLGHDEAAYAKNRRAVL 172
G+ YN L ERRA +V YL KG+ AD+IS G+ P V G+ K R A++
Sbjct: 265 GSDAYNQGLSERRAQSVVDYLISKGIPADKISARGMGESNP-VTGN-TCDNVKQRAALI 321


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GX95_15440    IGASERPTASE    61    5e-12    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 60.8 bits (147), Expect = 5e-12
Identities = 28/199 (14%), Positives = 66/199 (33%), Gaps = 6/199 (3%)

Query: 64 YNRQQDQQASARRAEEERKKLQQQQAEELQQKQAAEQERLKQLEKERLAAQEQQKQAEEA 123
YN + +++ Q E R+ + A + E
Sbjct: 981 YNPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETV 1040

Query: 124 AKLAQQQQQQAEEAAKAAADAKKKAEAEAAKAAADAKKKAEAEAVKAAADAKKKAEAEAA 183
A+ ++Q+ + E+ + A + + A +A ++ K + V A+ +E +
Sbjct: 1041 AENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEV-----AQSGSETKET 1095

Query: 184 KAAADAKKKAEAEAAKAAAEAKKKAEAEAAKAAAEAKKKADAAAAKAAAEAKKKADAAAA 243
+ + + KA E +K E + + K+ + + AE ++ D
Sbjct: 1096 QTTETKETATVEKEEKAKVETEKTQE-VPKVTSQVSPKQEQSETVQPQAEPARENDPTVN 1154

Query: 244 KAAADAKKKAAAEKAAAAE 262
++ A+ A+
Sbjct: 1155 IKEPQSQTNTTADTEQPAK 1173



Score = 53.5 bits (128), Expect = 8e-10
Identities = 29/184 (15%), Positives = 62/184 (33%), Gaps = 4/184 (2%)

Query: 55 VDPGAVVQQYNRQQDQQASARRAEEERKKLQQQQAEELQQKQAAEQERLKQLEKERLAAQ 114
VD + N Q D S EE ++ + +E E + ++
Sbjct: 992 VDTTNITTPNNIQADVP-SVPSNNEEIARVDEAPVPPPAPATPSETTETVA-ENSKQESK 1049

Query: 115 EQQKQAEEAAKLAQQQQQQAEEAAKAAADAKKKAEAEAAKAAADAKKKAEAEAVKAAADA 174
+K ++A + Q ++ A+EA + E + + + E + A +
Sbjct: 1050 TVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKE-TATVEK 1108

Query: 175 KKKAEAEAAKAAADAKKKAEAEAAKAAAEA-KKKAEAEAAKAAAEAKKKADAAAAKAAAE 233
++KA+ E K K ++ + +E + +AE K+ + A
Sbjct: 1109 EEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADT 1168

Query: 234 AKKK 237
+
Sbjct: 1169 EQPA 1172



Score = 53.1 bits (127), Expect = 1e-09
Identities = 24/219 (10%), Positives = 71/219 (32%), Gaps = 21/219 (9%)

Query: 66 RQQDQQASARRAEEERKKLQQQQAEE--LQQKQAAEQER------LKQLEKERLAAQEQQ 117
+ A + E + + +Q A E Q ++ A++ + + E + ++ ++
Sbjct: 1035 ETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKE 1094

Query: 118 KQ-----------AEEAAKLAQQQQQQAEEAAKAAADAKKKAEAEAAKAAADAKKKAEAE 166
Q EE AK+ ++ Q E + + K+ ++E + A+ ++ +
Sbjct: 1095 TQTTETKETATVEKEEKAKVETEKTQ--EVPKVTSQVSPKQEQSETVQPQAEPARENDPT 1152

Query: 167 AVKAAADAKKKAEAEAAKAAADAKKKAEAEAAKAAAEAKKKAEAEAAKAAAEAKKKADAA 226
++ A+ + A + E ++ + E + A +
Sbjct: 1153 VNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVN 1212

Query: 227 AAKAAAEAKKKADAAAAKAAADAKKKAAAEKAAAAEGVD 265
+ + + + + ++ + D
Sbjct: 1213 SESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCD 1251



Score = 46.6 bits (110), Expect = 2e-07
Identities = 24/215 (11%), Positives = 60/215 (27%), Gaps = 6/215 (2%)

Query: 59 AVVQQYNRQQDQQASARRAEEERKKLQQQQAEELQQKQAAEQERLKQLEKERLAAQEQQK 118
A Q Q + E K+ + EE + + + + E ++ +Q K
Sbjct: 1078 ANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQ-----EVPKVTSQVSPK 1132

Query: 119 QAEEAAKLAQQQQQQAEEAAKAAADAKKKAEAEAAKAAADAKKKAEAEAVKAAADAKKKA 178
Q + Q + + + + + + A + + E +
Sbjct: 1133 QEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTG 1192

Query: 179 EAEAAKAAADAKKKAEAEAAKAAAEAKKKAEAEAAKAAAEAKKKADAAAAKAAAEAKKKA 238
+ + ++ K + ++ + A ++ + A
Sbjct: 1193 NSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDL 1252

Query: 239 DAAAAKAA-ADAKKKAAAEKAAAAEGVDDLLGDLS 272
+ A +DA+ KA + V + L
Sbjct: 1253 TSTNTNAVLSDARAKAQFVALNVGKAVSQHISQLE 1287


46    GX95_15495    GX95_15655    Y    Y    N    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
GX95_154952221.905684dihydrolipoamide succinyltransferase
GX95_155002201.0876992-oxoglutarate dehydrogenase E1 component
GX95_15505221-0.057424succinate dehydrogenase iron-sulfur subunit
GX95_155101200.917001succinate dehydrogenase flavoprotein subunit
GX95_15515017-0.903749succinate dehydrogenase, hydrophobic membrane
GX95_15520-117-0.699083succinate dehydrogenase cytochrome b556 large
GX95_15525-122-6.299800hypothetical protein
GX95_15530-124-8.204468citrate (Si)-synthase
GX95_15535239-13.155835AbrB family transcriptional regulator
GX95_15540445-15.918725endonuclease VIII
GX95_15545851-18.147043hypothetical protein
GX95_15550753-18.434242glycosyl transferase
GX95_15555852-18.363017glycosyl transferase
GX95_15560652-17.281653sugar ABC transporter ATP-binding protein
GX95_15565547-13.437758ABC transporter permease
GX95_15570446-12.800591glycosyltransferase family 1 protein
GX95_15575342-10.724604glycosyl transferase
GX95_15580337-8.772275UDP-galactopyranose mutase
GX95_15585-126-3.730744transport protein
GX95_15590-1212.267114hypothetical protein
GX95_155950183.183390DNA recombinase
GX95_15600-2173.070513hypothetical protein
GX95_15605-1153.251833lactam utilization protein LamB
GX95_15610-1143.661481hypothetical protein
GX95_15615-2143.076932hypothetical protein
GX95_15620-1162.640590Nif3-like dinuclear metal center protein
GX95_15625-1142.265912MFS transporter
GX95_15630-1153.100544deoxyribodipyrimidine photo-lyase
GX95_156350163.773797hypothetical protein
GX95_15640-1154.420262potassium-transporting ATPase subunit F
GX95_156450164.481646potassium-transporting ATPase subunit KdpA
GX95_156500164.239232potassium-transporting ATPase subunit B
GX95_15655-1153.480674potassium-transporting ATPase subunit C
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GX95_15505    TCRTETOQM    31    0.004    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 31.0 bits (70), Expect = 0.004
Identities = 12/41 (29%), Positives = 23/41 (56%), Gaps = 1/41 (2%)

Query: 14 VDNAPRMQDYTLEGEEGRDM-MLLDALIQLKEKDPSLSFRR 53
++N + T+E + + MLLDAL+++ + DP L +
Sbjct: 339 IENPLPLLQTTVEPSKPQQREMLLDALLEISDSDPLLRYYV 379


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GX95_15565    PF05272    30    0.006    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.006
Identities = 11/33 (33%), Positives = 16/33 (48%), Gaps = 6/33 (18%)

Query: 50 KIDFTLTEGNRLALIGHNGSGKTTLLRVLAGAY 82
K D+++ L G G GK+TL+ L G
Sbjct: 594 KFDYSVV------LEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GX95_15610    V8PROTEASE    30    0.008    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 30.0 bits (67), Expect = 0.008
Identities = 14/51 (27%), Positives = 22/51 (43%), Gaps = 4/51 (7%)

Query: 20 LTLVSSANIACGFHAGDAQTMLT---CVREALKNGVAIGAHPSFPDRDNFG 67
+ + IA G G T+LT V + A+ A PS ++DN+
Sbjct: 95 VEAPTGTFIASGVVVGK-DTLLTNKHVVDATHGDPHALKAFPSAINQDNYP 144


47    GX95_16100    GX95_16210    Y    Y    N    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
GX95_16100-215-4.579249molecular chaperone
GX95_16105-114-3.813160alkyl hydroperoxide reductase subunit F
GX95_16110-117-4.315076peroxiredoxin
GX95_16115017-2.779262thiol:disulfide interchange protein DsbG
GX95_16120016-2.860099LysR family transcriptional regulator
GX95_16125-1101.022217phosphoadenosine phosphosulfate reductase
GX95_16130-1142.564898hypothetical protein
GX95_161350153.365004methionine aminotransferase
GX95_161400163.751149oxidoreductase
GX95_16145-1163.814805hypothetical protein
GX95_16150-2154.156506carbon starvation protein A
GX95_16155-3134.469701thioesterase
GX95_16160-2114.4739592,3-dihydro-2,3-dihydroxybenzoate dehydrogenase
GX95_16165-1125.016800isochorismatase
GX95_161700125.5055882,3-dihydroxybenzoate-AMP ligase
GX95_161751145.513596isochorismate synthase EntC
GX95_161801153.768982Fe2+-enterobactin ABC transporter
GX95_161852164.549306MFS transporter
GX95_161902164.329568iron-enterobactin transporter
GX95_161951164.190609iron-enterobactin transporter permease
GX95_16200-1133.137719iron-enterobactin transporter ATP-binding
GX95_16205-1123.104424LPS O-antigen length regulator
GX95_16210-2113.712648non-ribosomal peptide synthetase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GX95_16105    STREPTOPAIN    31    0.011    Streptopain (C10) cysteine protease family signature.
		>STREPTOPAIN#Streptopain (C10) cysteine protease family signature.

Length = 398

Score = 31.2 bits (70), Expect = 0.011
Identities = 17/73 (23%), Positives = 33/73 (45%), Gaps = 1/73 (1%)

Query: 2 LDTNMKTQLRAYLEKLTKPVELIATLDDS-AKSAEIKELLAEIAELSDKVTFKEDNTLPV 60
D N K + +++E + ++ LD + A +AEIK+ + + S + + + N +
Sbjct: 109 FDANGKENIASFMESYVEQIKENKKLDTTYAGTAEIKQPVVKSLLDSKGIHYNQGNPYNL 168

Query: 61 RKPSFLITNPGSQ 73
P PG Q
Sbjct: 169 LTPVIEKVKPGEQ 181


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GX95_16115    BCTLIPOCALIN    28    0.019    Bacterial lipocalin signature.
		>BCTLIPOCALIN#Bacterial lipocalin signature.

Length = 171

Score = 28.4 bits (63), Expect = 0.019
Identities = 18/98 (18%), Positives = 41/98 (41%), Gaps = 13/98 (13%)

Query: 30 QGITILKSFEAPGGMKGYLGKYQDMGVTIYLTPDGKHAISG--YMYNEKGENLSNALIEK 87
+ + + FE YLGK+ ++ + G ++ + N+ G ++ N
Sbjct: 21 ESVKPVSDFEL----NNYLGKWYEVARLDHSFERGLSQVTAEYRVRNDGGISVLN----- 71

Query: 88 EIYAPAGREMWQKMEKASWILDGKKDAPVVLYVFADPF 125
Y+ + W++ E ++ ++G D + + F PF
Sbjct: 72 RGYSEE-KGEWKEAEGKAYFVNGSTDGYLKVSFFG-PF 107


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GX95_16160    DHBDHDRGNASE    337    e-120    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 337 bits (866), Expect = e-120
Identities = 105/257 (40%), Positives = 148/257 (57%), Gaps = 20/257 (7%)

Query: 9 KTVWVTGAGKGIGYATALAFVDAGARVFGFDRE---------------FTQENYPFATEV 53
K ++TGA +GIG A A GA + D E +P
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP----- 63

Query: 54 MDVADAAQVAQVCQRVLQKTPRLDVLVNAAGILRMGATDALSVDDWQQTFAVNVGGAFNL 113
DV D+A + ++ R+ ++ +D+LVN AG+LR G +LS ++W+ TF+VN G FN
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 114 FSQTMAQFRRQQGGAIVTVASDAAHTPRIGMSAYGASKAALKSLALTVGLELAGCGVRCN 173
++ G+IVTV S+ A PR M+AY +SKAA +GLELA +RCN
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCN 183

Query: 174 VVSPGSTDTDMQRTLWVSEDAEQQRIRGFGEQFKLGIPLGKIARPQEIANTILFLASDLA 233
+VSPGST+TDMQ +LW E+ +Q I+G E FK GIPL K+A+P +IA+ +LFL S A
Sbjct: 184 IVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQA 243

Query: 234 SHITLQDIVVDGGSTLG 250
HIT+ ++ VDGG+TLG
Sbjct: 244 GHITMHNLCVDGGATLG 260


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_16165  ISCHRISMTASE  424    e-153    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 424 bits (1092), Expect = e-153
Identities = 147/299 (49%), Positives = 192/299 (64%), Gaps = 18/299 (6%)

Query: 1 MAIPKLQSYALPTALDIPTNKVNWAFEPERAALLIHDMQDYFVSFWGRNCPMMDQVIANI 60
MAIP +Q Y +PTA D+P NKV+W +P RA LLIHDMQ+YFV + + ++ ANI
Sbjct: 1 MAIPAIQPYQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANI 60

Query: 61 AALRQYCKEHHIPVYYTAQPKEQSDEDRALLNDMWGPGLTRSPEQQKVVEALTPDEADTV 120
L+ C + IPV YTAQP Q+ +DRALL D WGPGL P ++K++ L P++ D V
Sbjct: 61 RKLKNQCVQLGIPVVYTAQPGSQNPDDRALLTDFWGPGLNSGPYEEKIITELAPEDDDLV 120

Query: 121 LVKWRYSAFHRSPLEQMLKDTGRNQLIITGVYAHIGCMTTATDAFMRDIKPFMVADALAD 180
L KWRYSAF R+ L +M++ GR+QLIITG+YAHIGC+ TA +AFM DIK F V DA+AD
Sbjct: 121 LTKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVAD 180

Query: 181 FSRKEHLMALNYVAGRSGRVVMTESLL------PTPVPASKA-----------ALRALIL 223
FS ++H MAL Y AGR VMT+SLL P V + A +R I
Sbjct: 181 FSLEKHQMALEYAAGRCAFTVMTDSLLDQLQNAPADVQKTSANTGKKNVFTCENIRKQIA 240

Query: 224 PLLDETDEPLD-DENLIDYGLDSVRMMGLAARWRKVHGDIDFVMLAKNPTIDAWWALLS 281
LL ET E + E+L+D GLDSVR+M L +WR+ ++ FV LA+ PTI+ W LL+
Sbjct: 241 ELLQETPEDITDQEDLLDRGLDSVRIMTLVEQWRREGAEVTFVELAERPTIEEWQKLLT 299


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_16180  FERRIBNDNGPP  60     3e-12    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 59.6 bits (144), Expect = 3e-12
Identities = 46/210 (21%), Positives = 80/210 (38%), Gaps = 21/210 (10%)

Query: 105 EPNAETVAAQMPDLILISATGGDSALALYDQLSAIAPTLVINYDDKS-----WQSLLTQL 159
EPN E + P ++ SA G S + L+ IAP N+ D + LT++
Sbjct: 86 EPNLELLTEMKPSFMVWSAGYGPS----PEMLARIAPGRGFNFSDGKQPLAMARKSLTEM 141

Query: 160 GEITGQEKQAAARIAEFEAQLTTVKQRIALPPQPVSALVYTPAAHSANLWTPESAQGKLL 219
++ + A +A++E + ++K R L ++ P S ++L
Sbjct: 142 ADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVFGPNSLFQEIL 201

Query: 220 TQLGFTLATLPRGLQTSKSQGKRHDIIQLGGENLAAGLNGQSLFLFAGDNKDVAALYTNP 279
+ G A + + + + LAA + L ++KD+ AL P
Sbjct: 202 DEYGIPNAW--------QGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNSKDMDALMATP 253

Query: 280 LLAHLPAVQNKRVYALGTETFRLDYYSATL 309
L +P V+ R + F Y ATL
Sbjct: 254 LWQAMPFVRAGRFQRVPAVWF----YGATL 279


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_16185  TCRTETB  29     0.048    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 28.7 bits (64), Expect = 0.048
Identities = 69/397 (17%), Positives = 130/397 (32%), Gaps = 66/397 (16%)

Query: 27 FISIVSLGLLGVAVPVQIQMMTHSTWQVGLSVTLTGGAMFIGLMVGGVLADRYERKKVIL 86
F S+++ +L V++P T IG V G L+D+ K+++L
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 87 LARGTCGIGFIGLCVNALLPEPSLLAIYLLGLWDGFFASLGVTALLAATPALVGRENLMQ 146
G + V ++A ++ G F +L ++ + +EN +
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPAL----VMVVVARYIPKENRGK 139

Query: 147 AGAITMLTVRLGSVISPMLGGILLASGGVAWNYGLAAAGTFITLLPLLTLPRLPVPPQPR 206
A + V +G + P +GG++ + W+Y L IT++ + L +L
Sbjct: 140 AFGLIGSIVAMGEGVGPAIGGMIAHY--IHWSYLLLIP--MITIITVPFLMKLLKKEVRI 195

Query: 207 ------------------------------------------------ENPFIAL-LAAF 217
+PF+ L
Sbjct: 196 KGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKN 255

Query: 218 RFLLASPLIGGIALLGGLVTMASAVRVLYPALAMSWQMSAAQIGLLYAAI-PLGAAIGAL 276
+ L GGI T+A V ++ + Q+S A+IG + + I
Sbjct: 256 IPFMIGVLCGGIIF----GTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGY 311

Query: 277 TSGQLAHSVRPGLIMLVSTVG---SFLAVGLFAIMPVWIAGVICLALFGWLSAISSLLQY 333
G L P ++ + SFL W +I + + G LS +++
Sbjct: 312 IGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVIST 371

Query: 334 TLLQTQTPENMLGRMNGLWTAQNVTGDAIGAALLGGL 370
+ + + M L + + G A++GGL
Sbjct: 372 IVSSSLKQQEAGAGM-SLLNFTSFLSEGTGIAIVGGL 407


48  GX95_16300  GX95_16340  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
GX95_16300  0       25       -5.332980  AraC family transcriptional regulator
GX95_16305  1       28       -7.446503  cation transporter
GX95_16315  1       23       -6.173891  *helix-turn-helix transcriptional regulator
GX95_16320  1       17       -1.879049  diguanylate cyclase
GX95_16325  2       16       -1.607547  fimbriae Y protein
GX95_16330  2       17       -0.859346  DNA-binding response regulator
GX95_16335  2       16       0.515707   adhesin
GX95_16340  2       14       0.618596   fimbrial adhesin FimH
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_16305  ACRIFLAVINRP  33     1e-05    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 32.9 bits (75), Expect = 1e-05
Identities = 8/24 (33%), Positives = 17/24 (70%)

Query: 8 FAMIGGMITAPLLSLFIIPAAYKL 31
++GGM++A LL++F +P + +
Sbjct: 1004 IGVMGGMVSATLLAIFFVPVFFVV 1027


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
GX95_16330  HTHFIS  70     2e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 70.2 bits (172), Expect = 2e-16
Identities = 29/122 (23%), Positives = 58/122 (47%), Gaps = 2/122 (1%)

Query: 1 MKPASVIIMDEHPIVRMSIEVLLGKNSNIQVVLKTDDSRTAIEYLRTYPVDLVILDIELP 60
M A++++ D+ +R + L + V T ++ T ++ DLV+ D+ +P
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYD--VRITSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 61 GTDGFTLLKRIKSIQEHTRILFLSSKSEAFYAGRAIRAGANGFVSKRKDLNDIYNAVKMI 120
+ F LL RIK + +L +S+++ A +A GA ++ K DL ++ +
Sbjct: 59 DENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA 118

Query: 121 LS 122
L+
Sbjct: 119 LA 120


49  GX95_16720  GX95_16825  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
GX95_16720  4       24       -0.824926  DNA-binding protein HU
GX95_16725  5       26       -0.813803  endopeptidase La
GX95_16730  4       24       -0.780424  ATP-dependent protease ATP-binding subunit ClpX
GX95_16735  3       21       -0.055975  ATP-dependent Clp endopeptidase, proteolytic
GX95_16740  1       17       -0.494466  trigger factor
GX95_16745  0       21       -0.018012  BolA family transcriptional regulator
GX95_16750  -1      20       0.301778   hypothetical protein
GX95_16755  1       22       0.412021   muropeptide transporter AmpG
GX95_16760  1       23       0.600375   cytochrome o ubiquinol oxidase subunit II
GX95_16765  2       17       -3.450268  cytochrome o ubiquinol oxidase subunit I
GX95_16770  0       14       -4.974900  cytochrome o ubiquinol oxidase subunit III
GX95_16775  0       13       -5.225008  cytochrome o ubiquinol oxidase subunit IV
GX95_16780  0       15       -5.533846  protoheme IX farnesyltransferase
GX95_16785  -1      16       -5.060214  hypothetical protein
GX95_16795  -1      12       -1.231401  hypothetical protein
GX95_16800  -1      14       2.214945   hypothetical protein
GX95_16805  0       18       2.810710   YajQ family cyclic di-GMP-binding protein
GX95_16810  1       18       3.290076   2-dehydropantoate 2-reductase
GX95_16815  0       17       4.351885   DJ-1 family protein
GX95_16820  1       18       4.318567   phosphonoacetaldehyde hydrolase
GX95_16825  2       19       3.889169   2-aminoethylphosphonate--pyruvate transaminase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_16725  DNABINDINGHU  115    8e-38    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 115 bits (291), Expect = 8e-38
Identities = 49/88 (55%), Positives = 67/88 (76%)

Query: 2 NKSQLIEKIAAGADISKAAAGRALDAIIASVTESLKEGDDVALVGFGTFAVKERAARTGR 61
NK LI K+A +++K + A+DA+ ++V+ L +G+ V L+GFG F V+ERAAR GR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 62 NPQTGKEITIAAAKVPSFRAGKALKDAV 89
NPQTG+EI I A+KVP+F+AGKALKDAV
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAV 90


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
GX95_16730  GPOSANCHOR  34     0.002    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 34.3 bits (78), Expect = 0.002
Identities = 34/133 (25%), Positives = 68/133 (51%), Gaps = 15/133 (11%)

Query: 191 ERLEYLMAMMESEIDLLQVEKRIRNRVKKQMEKSQREYYLNEQMKAIQKELGEMDDAPD- 249
LE A +E + +L R +++ ++ S+ +Q++A ++L E + +
Sbjct: 291 AALEAEKADLEHQSQVLNAN---RQSLRRDLDASREAK---KQLEAEHQKLEEQNKISEA 344

Query: 250 ENEALKRKIDAAKMPKEAKEKAEAELQKLKMMSPMS-AEATVVRGYIDWMVQVPWNARSK 308
++L+R +DA++ EAK++ EAE QKL+ + +S A +R +D + A+ +
Sbjct: 345 SRQSLRRDLDASR---EAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASRE----AKKQ 397

Query: 309 VKKDLRQAQEILD 321
V+K L +A L
Sbjct: 398 VEKALEEANSKLA 410


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_16760  TCRTETB  41     9e-06    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 41.0 bits (96), Expect = 9e-06
Identities = 40/190 (21%), Positives = 76/190 (40%), Gaps = 15/190 (7%)

Query: 221 RNNAWLI-LLLIVLYKLGDAFAMSLTTTFLIRGVGFDAGEVGVVNKTLGLLATIVGALYG 279
R+N LI L ++ + + + ++++ + VN L +I A+YG
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 280 GILMQRLSLFRALLIFGILQGASNAGYWLLSITDKNMFSMGAAVFFENLCGGMGTAAFVA 339
L +L + R LL I+ + ++ + FS+ + G G AAF A
Sbjct: 71 K-LSDQLGIKRLLLFGIIINCFGS----VIGFVGHSFFSL---LIMARFIQGAGAAAFPA 122

Query: 340 LLM----TLCNKSFSATQFALLSALSAVGRVYVGPVAGWFVEAH-GWPTFYLFSVVAAVP 394
L+M K F L+ ++ A+G VGP G + + W L ++ +
Sbjct: 123 LVMVVVARYIPKENRGKAFGLIGSIVAMG-EGVGPAIGGMIAHYIHWSYLLLIPMITIIT 181

Query: 395 GLLLLLVCRQ 404
L+ + ++
Sbjct: 182 VPFLMKLLKK 191


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_16800  TCRTETA  85     4e-20    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 85.3 bits (211), Expect = 4e-20
Identities = 84/383 (21%), Positives = 149/383 (38%), Gaps = 26/383 (6%)

Query: 16 GLGTVFSLRMLGMFMVLPVLTTY--GMALQGASEALIGIAIGIYGLAQAIFQIPFGLLSD 73
L TV L +G+ +++PVL + A GI + +Y L Q G LSD
Sbjct: 10 ILSTVA-LDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSD 68

Query: 74 RIGRKPLIVGGLAVFVAGSVIAALSHSIWGIILGRALQG-SGAIAAAVMALLSDLTREQN 132
R GR+P+++ LA I A + +W + +GR + G +GA A A ++D+T
Sbjct: 69 RFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDGDE 128

Query: 133 RTKAMAFIGVSFGITFAIAMVLGPIVTHSLGLNALFWMIAALATLGILLTIWVVPNSTNH 192
R + F+ FG VLG ++ +A F+ AAL L L +++P S
Sbjct: 129 RARHFGFMSACFGFGMVAGPVLGGLMG-GFSPHAPFFAAAALNGLNFLTGCFLLPESHKG 187

Query: 193 VLNRESGMVKGSFSKVLAEPRLLKLNFGIMCLHILLMSTFVA-LPGQLADAGFPAAEHWK 251
+ + G+ + L+ F+ L GQ+ A + +
Sbjct: 188 ERRPLRREALNPLAS-------FRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDR 240

Query: 252 VYLATMVIAFA--------AVVPFIIYAEVKRRMKQVFLFCVGLII-VAEIVLWGAGQHF 302
+ I + ++ +I V R+ + +G+I +L
Sbjct: 241 FHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRG 300

Query: 303 WELVIGVQLFFLAFNLMEAL---LPSLISKESPAGYKGTAMGVYSTSQFLGVALGGSLGG 359
W + L M AL L + +E +G+ + S + +G L ++
Sbjct: 301 WMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYA 360

Query: 360 WIDGTFDGQTVFLAGAVLAMVWL 382
T++G ++AGA L ++ L
Sbjct: 361 ASITTWNG-WAWIAGAALYLLCL 382


50  GX95_17175  GX95_17590  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_17175  0       12       -3.128628   cytochrome ubiquinol oxidase subunit I
GX95_17180  0       13       -4.045805   hypothetical protein
GX95_17185  0       13       -3.716321   type III restriction endonuclease subunit R
GX95_17190  0       10       0.381825    restriction endonuclease
GX95_17195  0       14       4.215263    MFS transporter
GX95_17200  0       13       4.425169    hypothetical protein
GX95_17205  -1      13       4.949121    heavy metal transport/detoxification protein
GX95_17210  -1      12       4.611877    Cu(I)-responsive transcriptional regulator
GX95_17215  -1      13       4.212851    copper-translocating P-type ATPase
GX95_17220  -1      16       1.819208    efflux transporter periplasmic adaptor subunit
GX95_17225  -1      16       -0.195716   multidrug efflux RND transporter permease
GX95_17230  1       25       -5.122247   multidrug transporter
GX95_17235  2       36       -12.306505  hypothetical protein
GX95_17240  2       34       -10.083513  hypothetical protein
GX95_17245  2       34       -10.229386  transcriptional regulator
GX95_17250  3       35       -10.870281  hypothetical protein
GX95_17255  3       34       -10.170281  hypothetical protein
GX95_17260  3       32       -9.382673   response regulator
GX95_17265  3       22       -5.153502   cyclic diguanylate phosphodiesterase
GX95_17270  3       21       -4.522857   hypothetical protein
GX95_17275  3       21       -3.954793   hypothetical protein
GX95_17280  3       19       -2.833018   fimbrial protein
GX95_17285  3       18       -2.475824   fimbrial chaperone protein StbB
GX95_17290  3       14       0.051849    fimbrial assembly protein
GX95_17295  3       17       -0.731894   fimbrial protein
GX95_17300  3       18       -1.231172   fimbrial assembly protein
GX95_17305  4       18       -0.949454   hypothetical protein
GX95_17310  4       18       -1.122274   phage tail protein
GX95_17315  5       18       -1.496812   sialate O-acetylesterase
GX95_17320  4       23       -4.500111   phage tail protein
GX95_17325  5       24       -4.009163   phage tail protein
GX95_17330  6       25       -3.656666   phage tail protein
GX95_17335  7       25       -4.109163   hypothetical protein
GX95_17340  6       25       -2.853915   hypothetical protein
GX95_17345  4       23       -1.870691   hypothetical protein
GX95_17350  3       21       -1.609551   hypothetical protein
GX95_17355  3       20       -1.641158   hypothetical protein
GX95_17360  2       22       -1.784493   hypothetical protein
GX95_17365  2       21       -1.493162   hypothetical protein
GX95_17370  3       21       -1.329280   transglycosylase
GX95_17375  4       21       -1.497765   hypothetical protein
GX95_17380  4       24       -2.165529   hypothetical protein
GX95_17385  5       25       -1.130864   hypothetical protein
GX95_17390  7       28       0.686509    hypothetical protein
GX95_17395  6       29       1.470140    hypothetical protein
GX95_17400  5       29       1.738678    hypothetical protein
GX95_17405  5       27       1.140699    hypothetical protein
GX95_17410  4       27       2.101790    DUF2184 domain-containing protein
GX95_17415  3       26       0.980009    hypothetical protein
GX95_17420  4       24       -0.237564   NUDIX hydrolase
GX95_17425  3       24       -0.738446   hypothetical protein
GX95_17430  3       23       -1.267989   hypothetical protein
GX95_17435  3       24       -1.510716   terminase
GX95_17440  4       28       -3.969549   terminase small subunit
GX95_17445  4       26       -3.444801   hypothetical protein
GX95_17450  3       27       -2.340281   hypothetical protein
GX95_17455  1       23       -2.258084   lysozyme
GX95_17460  0       24       -2.447101   hypothetical protein
GX95_17465  -1      25       -1.368571   hypothetical protein
GX95_17470  -2      22       -1.362564   antiterminator
GX95_17475  0       25       -2.539822   hypothetical protein
GX95_17480  1       25       -2.659412   hypothetical protein
GX95_17485  3       26       -3.403670   hypothetical protein
GX95_17490  3       26       -3.776855   hypothetical protein
GX95_17495  4       28       -4.310209   DNA damage-inducible protein I
GX95_17500  4       28       -2.757671   N-acetylglucosamine-6-phosphate deacetylase
GX95_17505  1       25       -1.848117   hypothetical protein
GX95_17510  0       23       -0.889844   hypothetical protein
GX95_17515  1       25       -1.480970   hypothetical protein
GX95_17520  2       28       -0.894585   hypothetical protein
GX95_17525  3       30       -1.322732   hypothetical protein
GX95_17530  4       31       -2.350632   phage replication protein
GX95_17535  6       35       -4.123091   repressor
GX95_17540  10      43       -7.988461   Rha family transcriptional regulator
GX95_17545  8       41       -7.747464   transcriptional regulator
GX95_17550  7       41       -7.927852   hypothetical protein
GX95_17555  8       40       -7.766772   hypothetical protein
GX95_17560  7       39       -8.355701   hypothetical protein
GX95_17565  8       35       -6.048309   hypothetical protein
GX95_17570  4       34       -3.080913   hypothetical protein
GX95_17575  5       33       -3.879247   hypothetical protein
GX95_17580  3       31       -1.787242   host-nuclease inhibitor protein Gam
GX95_17585  2       32       -1.755694   hypothetical protein
GX95_17590  2       29       -1.089800   YqaJ-like viral recombinase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_17195  TCRTETA  56     7e-11    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 56.4 bits (136), Expect = 7e-11
Identities = 56/306 (18%), Positives = 108/306 (35%), Gaps = 17/306 (5%)

Query: 19 FTSWMLDAFDFFILVFVLSDLAEWFHAS---VSDVSIAIMLTLAVRPIGALLFGRMAEKY 75
++ LDA +++ VL L S + I + L ++ A + G +++++
Sbjct: 11 LSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRF 70

Query: 76 GRRPILMLNILFFTVFELLSAWSPTFMAFLIFRVMYGVAMGGIWGVASSLAMETIPDRSR 135
GRRP+L++++ V + A +P I R++ G+ G VA + + R
Sbjct: 71 GRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGIT-GATGAVAGAYIADITDGDER 129

Query: 136 ----GLMSGIFQAGYPCGYLFASVIFGLFYSMVGWRGMFLIGA---LPVVLLPYIWFKVP 188
G MS F G + A + G F A L
Sbjct: 130 ARHFGFMSACFGFG-----MVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPES 184

Query: 189 ESPVWLAARARKENTALLPVLRKQWKLCLYLVLVMAFFNFFSHGTQDLYPTFLKMQHGFD 248
R N + + L+ V L+ F + + +D
Sbjct: 185 HKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWD 244

Query: 249 PHLISI-IAIFYNIAAMLGGIFYGTLSERIGRKKAIMIAAFLALPVLPLWAFSSGSFTIG 307
I I +A F + ++ + G ++ R+G ++A+M+ L AF++ +
Sbjct: 245 ATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAF 304

Query: 308 LGAFLM 313
L+
Sbjct: 305 PIMVLL 310



Score = 33.6 bits (77), Expect = 0.001
Identities = 37/186 (19%), Positives = 77/186 (41%), Gaps = 10/186 (5%)

Query: 3 TPLNWTTTQRHVAFASFTSWMLDAF-DFFILVFVLSDLAEWFHASVSDVSIAIMLTLAVR 61
W VA +++ ++V+ + FH + + I++ +
Sbjct: 201 ASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG-EDRFHWDATTIGISLAAFGILH 259

Query: 62 PIG-ALLFGRMAEKYGRRPILMLNILF-FTVFELLSAWSPTFMAFLIFRVMYGVAMG--G 117
+ A++ G +A + G R LML ++ T + LL+ + +MAF I ++ +G
Sbjct: 260 SLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGMPA 319

Query: 118 IWGVASSLAMETIPDRSRGLMSGIFQAGYPCGYLFASVIFGLFYSMVGWRG-MFLIGA-L 175
+ + S E + +G ++ + G L + I+ S+ W G ++ GA L
Sbjct: 320 LQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYA--ASITTWNGWAWIAGAAL 377

Query: 176 PVVLLP 181
++ LP
Sbjct: 378 YLLCLP 383


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
GX95_17220  RTXTOXIND  48     4e-08    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 48.3 bits (115), Expect = 4e-08
Identities = 18/112 (16%), Positives = 37/112 (33%), Gaps = 7/112 (6%)

Query: 74 ELRSRVGGTLDAVSVPEGRLVSRGQLLFQIDPRPFEVALDTAVAQLRQAEVLARQAQADF 133
E++ + + V EG V +G +L ++ E + L QA + + Q
Sbjct: 98 EIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILS 157

Query: 134 DRIQR-------LVASGAVSRKNADDVTATRNARQAQMQSAKAAVAAARLEL 178
I+ L + ++V + + Q + + L L
Sbjct: 158 RSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNL 209



Score = 34.4 bits (79), Expect = 7e-04
Identities = 20/106 (18%), Positives = 37/106 (34%), Gaps = 13/106 (12%)

Query: 112 LDTAVAQLRQAEVLARQAQADFDRIQRLVASGAVSRKNADDVTATRNARQAQMQSAKAAV 171
L +QL Q E A+ ++ + +L + ++ + +
Sbjct: 268 LRVYKSQLEQIESEILSAKEEYQLVTQLFKN---------EILDKLRQTTDNIGLLTLEL 318

Query: 172 AAARLELSWTRITAPIAGRVDRILVTRGNLVSGGVAGNATLLTTIV 217
A + I AP++ +V ++ V GGV A L IV
Sbjct: 319 AKNEERQQASVIRAPVSVKVQQLKVHT----EGGVVTTAETLMVIV 360


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_17225  ACRIFLAVINRP  1046   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1046 bits (2706), Expect = 0.0
Identities = 435/1040 (41%), Positives = 660/1040 (63%), Gaps = 19/1040 (1%)

Query: 6 FFIARPIFAIVLSLLMLLAGAIAFLKLPLSEYPAVTPPTVQVSASYPGANPQVIADTVAA 65
FFI RPIFA VL++++++AGA+A L+LP+++YP + PP V VSA+YPGA+ Q + DTV
Sbjct: 4 FFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQ 63

Query: 66 PLEQVINGVDGMLYMNTQMAIDGRMVISIAFEQGTDPDMAQIQVQNRVSRALPRLPEEVQ 125
+EQ +NG+D ++YM++ G + I++ F+ GTDPD+AQ+QVQN++ A P LP+EVQ
Sbjct: 64 VIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQ 123

Query: 126 RIGVVTEKTSPDMLMVVHLVSPQKRYDSLYLSNFAIRQVRDELARLPGVGDVLVWGAGEY 185
+ G+ EK+S LMV VS +S++ V+D L+RL GVGDV ++G +Y
Sbjct: 124 QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG-AQY 182

Query: 186 AMRVWLDPAKIANRGLTASDIVTALREQNVQVAAGSVGQQPEASA-AFQMTVNTLGRLTS 244
AMR+WLD + LT D++ L+ QN Q+AAG +G P ++ R +
Sbjct: 183 AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKN 242

Query: 245 EEQFGEIVVKIGADGEVTRLRDVARVTLGADAYTLRSLLNGEAAPALQIIQSPGANAIDV 304
E+FG++ +++ +DG V RL+DVARV LG + Y + + +NG+ A L I + GANA+D
Sbjct: 243 PEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALDT 302

Query: 305 SNAIRGKMDELQQNFPQDIEYRIAYDPTVFVRASLQSVAITLLEALVLVVLVVVLFLQTW 364
+ AI+ K+ ELQ FPQ ++ YD T FV+ S+ V TL EA++LV LV+ LFLQ
Sbjct: 303 AKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQNM 362

Query: 365 RASIIPLVAVPVSLVGTFALMHLFGFSLNTLSLFGLVLSIGIVVDDAIVVVENVERHISQ 424
RA++IP +AVPV L+GTFA++ FG+S+NTL++FG+VL+IG++VDDAIVVVENVER + +
Sbjct: 363 RATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMME 422

Query: 425 GKSPG-EAAKKAMDEVTGPILSITSVLTAVFIPSAFLAGLQGEFYRQFALTIAISTILSA 483
K P EA +K+M ++ G ++ I VL+AVFIP AF G G YRQF++TI + LS
Sbjct: 423 DKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSV 482

Query: 484 INSLTLSPALAAILLRPHHDTAKADWLTRLMGTVTGGFFHRFNRFFDSASNRYVSAVRRA 543
+ +L L+PAL A LL+P ++ GGFF FN FD + N Y ++V +
Sbjct: 483 LVALILTPALCATLLKP---------VSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKI 533

Query: 544 VRGSVIVMVLYAGFVGLTWLGFHQVPNGFVPAQDKYYLVGIAQLPSGASLDRTEAVVKQM 603
+ + +++YA V + F ++P+ F+P +D+ + + QLP+GA+ +RT+ V+ Q+
Sbjct: 534 LGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQV 593

Query: 604 SAIALA--EPGVESVVVFPGLSVNGPVNVPNSALMFAMLKPFDEREDPSLSANAIAGKLM 661
+ L + VESV G S +G N+ + F LKP++ER SA A+ +
Sbjct: 594 TDYYLKNEKANVESVFTVNGFSFSG--QAQNAGMAFVSLKPWEERNGDENSAEAVIHRAK 651

Query: 662 HKFSHIPDGFIGIFPPPPVPGLGATGGFKLQIEDRAELGFEAMTKVQSEIMSKAMQTP-E 720
+ I DGF+ F P + LG GF ++ D+A LG +A+T+ +++++ A Q P
Sbjct: 652 MELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPAS 711

Query: 721 LANMLASFQTNAPQLQVDIDRVKAKSMGVSLTDIFETLQINLGSLYVNDFNRFGRTWRVM 780
L ++ + + Q ++++D+ KA+++GVSL+DI +T+ LG YVNDF GR ++
Sbjct: 712 LVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLY 771

Query: 781 AQADAPFRMQQEDIGLLKVRNAKGEMIPLSAFVTIMRQSGPDRIIHYNGFPSVDISGGPA 840
QADA FRM ED+ L VR+A GEM+P SAF T G R+ YNG PS++I G A
Sbjct: 772 VQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAA 831

Query: 841 PGFSSGQATDAIEKIVRETLPEGMVFEWTDLVYQEKQAGNSALAIFALAVLLAFLILAAQ 900
PG SSG A +E + + LP G+ ++WT + YQE+ +GN A A+ A++ ++ FL LAA
Sbjct: 832 PGTSSGDAMALMENLASK-LPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAAL 890

Query: 901 YNSWSLPFAVLLIAPMSLLSAIVGVWVSGGDNNIFTQIGFVVLVGLAAKNAILIVEFAR- 959
Y SWS+P +V+L+ P+ ++ ++ + N+++ +G + +GL+AKNAILIVEFA+
Sbjct: 891 YESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKD 950

Query: 960 AKEHDGADPLTAVLEASRLRLRPILMTSFAFIAGVVPLVLATGAGAEMRHAMGIAVFAGM 1019
E +G + A L A R+RLRPILMTS AFI GV+PL ++ GAG+ ++A+GI V GM
Sbjct: 951 LMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGM 1010

Query: 1020 LGVTLFGLLLTPVFYVVVRR 1039
+ TL + PVF+VV+RR
Sbjct: 1011 VSATLLAIFFVPVFFVVIRR 1030



Score = 89.5 bits (222), Expect = 4e-20
Identities = 68/427 (15%), Positives = 143/427 (33%), Gaps = 36/427 (8%)

Query: 643 FDEREDPSLSANAIAGKLMHKFSHIPDGFIGIFPPPPVPGLGATGGFKLQIEDRAELGFE 702
F DP ++ + KL +P + ++ + + ++
Sbjct: 94 FQSGTDPDIAQVQVQNKLQLATPLLPQEV----QQQGISVEKSSSSYLMVAGFVSDNP-- 147

Query: 703 AMTKVQSEIMSKAMQTPELANM--LASFQTNAPQLQVDI--DRVKAKSMGVSLTDIFETL 758
T+ + L+ + + Q Q + I D ++ D+ L
Sbjct: 148 GTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQYAMRIWLDADLLNKYKLTPVDVINQL 207

Query: 759 QIN--------LGSLYVNDFNRFGRTWRVMAQADAPFRMQQEDIGLLKVR-NAKGEMIPL 809
++ LG + + + P E+ G + +R N+ G ++ L
Sbjct: 208 KVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKNP-----EEFGKVTLRVNSDGSVVRL 262

Query: 810 SAFVTIMRQSGPDRII-HYNGFPSVDISGGPAPGFSSGQATDAIEKIV---RETLPEGM- 864
+ +I NG P+ + A G ++ AI+ + + P+GM
Sbjct: 263 KDVARVELGGENYNVIARINGKPAAGLGIKLATGANALDTAKAIKAKLAELQPFFPQGMK 322

Query: 865 ---VFEWTDLVYQEKQAGNSALAIFALAVLLAFLILAAQYNSWSLPFAVLLIAPMSLLSA 921
++ T V + + + + A++L FL++ + + P+ LL
Sbjct: 323 VLYPYDTTPFV---QLSIHEVVKTLFEAIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGT 379

Query: 922 IVGVWVSGGDNNIFTQIGFVVLVGLAAKNAILIVE-FARAKEHDGADPLTAVLEASRLRL 980
+ G N T G V+ +GL +AI++VE R D P A ++
Sbjct: 380 FAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQ 439

Query: 981 RPILMTSFAFIAGVVPLVLATGAGAEMRHAMGIAVFAGMLGVTLFGLLLTPVFYVVVRRM 1040
++ + A +P+ G+ + I + + M L L+LTP + +
Sbjct: 440 GALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLKP 499

Query: 1041 ALKRENR 1047
+
Sbjct: 500 VSAEHHE 506


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits      Score  E-value  Comments
GX95_17230  cdtoxinb  29     0.042    Cytolethal distending toxin B signature.
		>cdtoxinb#Cytolethal distending toxin B signature.

Length = 269

Score = 28.8 bits (64), Expect = 0.042
Identities = 21/70 (30%), Positives = 27/70 (38%), Gaps = 4/70 (5%)

Query: 75 AIALRNNRDLRKAGLNVEAARALYRIQRAEMLPTLGIATAMDAGRTPADLSVMDEPEINR 134
AIA+RNN A VE +R R + L D R PADL + + R
Sbjct: 155 AIAMRNN----DAPALVEEVYNFFRDSRDPVHQALNWMILGDFNREPADLEMNLTVPVRR 210

Query: 135 RYEMAGATTA 144
E+ A
Sbjct: 211 ASEIISPAAA 220


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_17250  ENTEROVIROMP  134    7e-43    Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 134 bits (339), Expect = 7e-43
Identities = 59/183 (32%), Positives = 88/183 (48%), Gaps = 21/183 (11%)

Query: 1 MKRRSSFLVFLGLLLASPLALANDQHTVSFGYAQTHLSSLKNSDSKDLRGFNFKYRYEFN 60
MK+ + +L + TV+ GYAQ+ N + GFN KYRYE +
Sbjct: 1 MKKIACLSALAAVLAFTAGTSVAATSTVTGGYAQSDAQGQMN----KMGGFNLKYRYEED 56

Query: 61 ET-WGMLGSFTATRNEMENYTWKEGKLHKNGSDSVDYGSLMFGPTYRFNDYVSLYGNAGI 119
+ G++GSFT T + K Y + GP YR ND+ S+YG G+
Sbjct: 57 NSPLGVIGSFTYTEKSRTASSGDYNK--------NQYYGITAGPAYRINDWASIYGVVGV 108

Query: 120 ATMKF--------NKHSKEDSFAYGAGVIFNPVKSISIDASWEASRFFAVDTNTFGVSVG 171
KF + + F+YGAG+ FNP++++++D S+E SR +VD T+ VG
Sbjct: 109 GYGKFQTTEYPTYKHDTSDYGFSYGAGLQFNPMENVALDFSYEQSRIRSVDVGTWIAGVG 168

Query: 172 YRF 174
YRF
Sbjct: 169 YRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_17290  PF00577  759    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 759 bits (1962), Expect = 0.0
Identities = 262/880 (29%), Positives = 417/880 (47%), Gaps = 63/880 (7%)

Query: 4 TINLNRKS-LALLIAIVCSGSAQG----EEYYFDPALLQGATYGQ-NIARFNE-QQTPSG 56
I +R + + + + C+ +AQ E YF+P L +++RF Q+ P G
Sbjct: 17 HIRKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPG 76

Query: 57 DYLADVYVNGTLVTASTNIRFDAVKEGQQTEPCLPLSVMKAAQIKSLPATDAA----TEC 112
Y D+Y+N + A+ ++ F+ Q PCL + + + + + + C
Sbjct: 77 TYRVDIYLNNGYM-ATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDAC 135

Query: 113 RPLREWVPHAGWQFDSATLRLLLTIPMTELTHKPRGYISPSEWDSGALALFLRHNTNWTH 172
PL + A Q D RL LTIP ++++ RGYI P WD G A L +N +
Sbjct: 136 VPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNS 195

Query: 173 TENTDSHYRYQYLWSGLNMGVNLGLWQVRHQSNLRYANSNQS-GSAWRYNSVRTWVQRPV 231
+N Y + L G+N+G W++R + Y +S+ S GS ++ + TW++R +
Sbjct: 196 VQN-RIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDI 254

Query: 232 ASINSILSLGDSYTDSSLFGSLSFNGAKLVTDERMRPQGKRGYAPEVRGVAASSAHVVVK 291
+ S L+LGD YT +F ++F GA+L +D+ M P +RG+AP + G+A +A V +K
Sbjct: 255 IPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIK 314

Query: 292 QLGKVIYETNVPPGPFYIDDLYNTRYQGDLEVEVIEASGKTSRFTVPYSSVPDSVRPGNW 351
Q G IY + VPPGPF I+D+Y GDL+V + EA G T FTVPYSSVP R G+
Sbjct: 315 QNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHT 374

Query: 352 HYSLAFGRVRQYY--DIENRFFEGTFQHGVNNTITLNLGSRIAQRYQAWLAGGVWATGM- 408
YS+ G R + RFF+ T HG+ T+ G+++A RY+A+ G G
Sbjct: 375 RYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGAL 434

Query: 409 GAFGLNATWSNARAEHNARQQGWRAELSYSKTFT-TGTNLVLAAYRYSTNGFRDLQDVLG 467
GA ++ T +N+ +++ G Y+K+ +GTN+ L YRYST+G+ + D
Sbjct: 435 GALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTY 494

Query: 468 VRREAKTGI-------------DYYSDTLHQRNRLSATVSQPLGRLGTLNLSASTADYYN 514
R DYY+ ++R +L TV+Q LGR TL LS S Y+
Sbjct: 495 SRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYWG 554

Query: 515 NQSRITQLQMGYSNQWRNISYGVNIARQRTTWDYDRFYHGVNEPLDVSSRQKYTETTMSF 574
+ Q Q G + + +I++ ++ + + W + ++
Sbjct: 555 TSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKG------------------RDQMLAL 596

Query: 575 NVSIPLDWGENRTSVA------MNYNQSSQSRSSTISMTGSSG---ENSDLSWSVYGGYE 625
NV+IP S + +Y+ S ++ G G E+++LS+SV GY
Sbjct: 597 NVNIPFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYA 656

Query: 626 RYRNSNSDSSAPTTFGGNLQQNTRFGALRANYDQGDNYRQEGLGASGTLVLHSGGLTAGP 685
+ NS S+ L +G Y D+ +Q G SG ++ H+ G+T G
Sbjct: 657 GGGDGNSGSTG----YATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQ 712

Query: 686 YTSDTFALIHADGAQGAIVQNGQGAVVDRFGYAILPSLSPYRVNNVTLDTRKMRSDAELT 745
+DT L+ A GA+ A V+N G D GYA+LP + YR N V LDT + + +L
Sbjct: 713 PLNDTVVLVKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLD 772

Query: 746 GGSQQIVPYAGAIARVNFATISGKAVLISVKMPDGGIPPMGADVFNGEGTNIGMVGQSGQ 805
+VP GAI R F G +L+++ + P GA V + + G+V +GQ
Sbjct: 773 NAVANVVPTRGAIVRAEFKARVGIKLLMTLT-HNNKPLPFGAMVTSESSQSSGIVADNGQ 831

Query: 806 IYARIAHPSGSLLVRWGTGANQRCRVAYQLDLHTKEPFLY 845
+Y +G + V+WG N C YQL +++ L
Sbjct: 832 VYLSGMPLAGKVQVKWGEEENAHCVANYQLPPESQQQLLT 871


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
GX95_17420  IGASERPTASE  46     4e-07    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 45.8 bits (108), Expect = 4e-07
Identities = 29/137 (21%), Positives = 55/137 (40%), Gaps = 14/137 (10%)

Query: 291 DSIPNEAEKMDEEKIVALINKAIDARMAKAD---EETDAKAKADAEEEARKEKADAEEKE 347
P A + + VA +K + K + ET A+ + A+E KA+ + E
Sbjct: 1025 VPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNE 1084

Query: 348 AEEAKAKA-DAEEKAAKEKADAEAKEKADAE-----EAERMAKEKADSQLRQEIAEL--- 398
++ ++ + + KE A E +EKA E E ++ + + Q + E +
Sbjct: 1085 VAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAE 1144

Query: 399 --RSRIPTELSDEERNE 413
R PT E +++
Sbjct: 1145 PARENDPTVNIKEPQSQ 1161



Score = 45.1 bits (106), Expect = 6e-07
Identities = 38/200 (19%), Positives = 69/200 (34%), Gaps = 12/200 (6%)

Query: 315 ARMAKADEETDAKAKADAEEEARKEKADAE-EKEAEEAKAKADAEEKAAKEKADAEAKEK 373
A+ +ET + ++EKA E EK E K + K + + E
Sbjct: 1086 AQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEP 1145

Query: 374 ADAEEAERMAKEKADSQLR-----QEIAELRSRIPTELSDEERNEIADAQVKADSVFSCF 428
A + KE Q E S + +++ ++ V+ +
Sbjct: 1146 ARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPA 1205

Query: 429 GKRAPVPLSGEKPLAYRRRLMIQLQEHSPDFKTV---DLSTIADSALLSVAEKTIYADAQ 485
+ V R R ++ H+ + T D ST+A L S + +DA+
Sbjct: 1206 TTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDAR 1265

Query: 486 KSA---SLSVGPGMLREIKR 502
A +L+VG + + I +
Sbjct: 1266 AKAQFVALNVGKAVSQHISQ 1285


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_17450  PYOCINKILLER  32     0.001    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 32.1 bits (72), Expect = 0.001
Identities = 31/130 (23%), Positives = 47/130 (36%), Gaps = 10/130 (7%)

Query: 26 YSRGYQRADTSWKLQWAQRDLTDATITLQREVTERAKEQRRQHAADEERKRADEELAKIQ 85
Y R R + + T+A +LQ + AA + A A+ Q
Sbjct: 173 YMRFLDREMEGLTAAYNVKLFTEAISSLQIRMN-------TLTAAKASIEAAAANKAREQ 225

Query: 86 ADADAAERARGGLQQQLAAVQ-RQLAGSETGRLSALAAASQ--AKAETGILLAQLLGEAD 142
A A+A +A +QQ A A G + A AA A+ LAQ + +A
Sbjct: 226 AAAEAKRKAEEQARQQAAIRAANTYAMPANGSVVATAAGRGLIQVAQGAASLAQAISDAI 285

Query: 143 DLAGKFAKEA 152
+ G+ A
Sbjct: 286 AVLGRVLASA 295


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_17525  PYOCINKILLER  26     0.042    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 26.3 bits (57), Expect = 0.042
Identities = 22/89 (24%), Positives = 35/89 (39%), Gaps = 3/89 (3%)

Query: 3 IDKQALREEFRYMQVHYSDPADRA--RQVIYIAAEALLDENLQLQREKDATEAVALALRD 60
I +QA+RE Y DR + + LQ + A ++
Sbjct: 157 IGEQAVREGNINGPEAYMRFLDREMEGLTAAYNVKLFTEAISSLQIRMNTLTAAKASIEA 216

Query: 61 DM-RQAREQLAAAERRIAELEAKLETADK 88
+AREQ AA +R AE +A+ + A +
Sbjct: 217 AAANKAREQAAAEAKRKAEEQARQQAAIR 245


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
GX95_17550  HTHFIS  28     0.003    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 27.9 bits (62), Expect = 0.003
Identities = 11/43 (25%), Positives = 18/43 (41%), Gaps = 3/43 (6%)

Query: 1 MNALEKAIQVAGNSSKLAEKLGVSSMTISHWKKRYGGVVPKGR 43
+ AL GN K A+ LG++ T+ + G V +
Sbjct: 442 LAALTAT---RGNQIKAADLLGLNRNTLRKKIRELGVSVYRSS 481


51  GX95_17685  GX95_17980  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_17685-215-3.234076amidohydrolase
GX95_17690-116-4.551201hypothetical protein
GX95_17695435-9.547686adhesin
GX95_177001031-6.493010TioA protein
GX95_177051031-6.281758transcriptional regulator
GX95_177101032-5.664814fimbrial protein TcfD
GX95_177151129-3.965143fimbrial assembly protein
GX95_177201229-2.507072fimbrial protein
GX95_177251227-1.546071fimbrial protein
GX95_17735837-8.301426transposase
GX95_17740936-8.308787transposase
GX95_17745939-7.322250cytoplasmic protein
GX95_17750738-6.174003transcriptional regulator
GX95_177601030-0.969014hypothetical protein
GX95_177651032-1.398894PerC family transcriptional regulator
GX95_177701032-1.717593carbohydrate transporter
GX95_17775932-1.579659pilin structural protein SafD
GX95_17780931-2.010543pilin outer membrane usher protein SafC
GX95_17785635-7.789595pili assembly chaperone protein SafB
GX95_17790635-8.163375pilus assembly protein
GX95_177953374.312440transposase
GX95_178003364.191189transposase
GX95_178054323.723565hypothetical protein
GX95_178104313.380494hypothetical protein
GX95_178154323.643345hypothetical protein
GX95_178204312.622389type IV secretion protein Rhs
GX95_178254221.689832hypothetical protein
GX95_178304221.664628type VI secretion protein Vgr
GX95_178353221.651124DUF2778 domain-containing protein
GX95_178403211.418335hypothetical protein
GX95_178453200.233062hypothetical protein
GX95_178503222.225012sugar-binding protein
GX95_178552223.435859hypothetical protein
GX95_178602202.627671type VI secretion protein Vgr
GX95_178651221.161690hypothetical protein
GX95_178701212.670511hypothetical protein
GX95_178751244.547795cytoplasmic protein
GX95_178800223.426704hypothetical protein
GX95_17885020-1.251965hypothetical protein
GX95_17890020-0.867116hypothetical protein
GX95_17895119-0.133499type VI secretion system-associated protein
GX95_17900219-2.394970type VI secretion system-associated lipoprotein
GX95_17905325-6.754509Hcp1 family type VI secretion system effector
GX95_17910217-4.382230hypothetical protein
GX95_17915016-3.573692hypothetical protein
GX95_17920016-2.767155Hcp1 family type VI secretion system effector
GX95_17925016-1.526680hypothetical protein
GX95_17930-1131.942916EvpB family type VI secretion protein
GX95_17935-1142.473148type VI secretion system-associated protein
GX95_17940-1162.155151hypothetical protein
GX95_179450162.925547hypothetical protein
GX95_179500173.600511ClpV1 family T6SS ATPase
GX95_179551174.344689hypothetical protein
GX95_179601194.032096cytoplasmic protein
GX95_179652185.169934impE family protein
GX95_179701185.300808cytoplasmic protein
GX95_179750174.405156type VI secretion protein
GX95_179800153.693724sciB domain protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_17705  ENTEROVIROMP  33     5e-04    Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.
Length = 171

Score = 33.0 bits (75), Expect = 5e-04
Identities = 14/62 (22%), Positives = 26/62 (41%), Gaps = 7/62 (11%)

Query: 146 VGLAHVKLSNNTIPVGFGINETLSASKNNFAWGAGIGAKYAVTDNIMIDASYKYINAGKV 205
VG+ + K P S F++GAG+ ++ +N+ +D SY+ V
Sbjct: 106 VGVGYGKFQTTEYPTYKH-----DTSDYGFSYGAGL--QFNPMENVALDFSYEQSRIRSV 158

Query: 206 SI 207
+
Sbjct: 159 DV 160


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_17725  PF00577  72     5e-15    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 72.2 bits (177), Expect = 5e-15
Identities = 99/652 (15%), Positives = 196/652 (30%), Gaps = 80/652 (12%)

Query: 78 PAAERQKALAALSRPLLRNSNLVCGVSEAK-------DSSECGYVATDKEDVAVIFDENN 130
Q + L+R L + G++ A C + + D D
Sbjct: 98 TGDSEQGIVPCLTRAQLASM----GLNTASVSGMNLLADDACVPLTSMIHDATAQLDVGQ 153

Query: 131 GQLSLFLNRDWLPDEERRDKRWLTPT--PEGVSAF-----IHRQTLYLSDDLHSRNMTLN 183
+L+L + + ++ R + ++ P G++A ++ +S LN
Sbjct: 154 QRLNLTIPQAFMS---NRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGNSHYAYLN 210

Query: 184 GSGALGLGDGRYLGGNWAAIWNQSEHYNNSQTWFDNLFVRQDLGNQYYLQAGRMDQRNLS 243
L +G R L N +N S+ + S+ + + +L+ R S
Sbjct: 211 LQSGLNIGAWR-LRDNTTWSYNSSDSSSGSKNKWQH--------INTWLE--RDIIPLRS 259

Query: 244 SATGGDFGFSLLPLS--RFDGLRTGTTQAYVNHEVDQNATPVMVQVTRNARIDIYRGSEL 301
T GD F G + + + A + A++ I +
Sbjct: 260 RLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYD 319

Query: 302 LGSQFLTPGMHTLDTHSLPPGSYPLALRIYEDGILRRTESQPFSKGGNSF-SAQTQWFIQ 360
+ + + PG T++ S L + I E + + P+S T++ I
Sbjct: 320 IYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSIT 379

Query: 361 GGQEDTGDKASHYDGETVMAAGFRTGLRKNISLTEGISLAHE----AWYSETRLNSQHAV 416
G+ +G+ + + + GL ++ G LA + + + A+
Sbjct: 380 AGEYRSGN--AQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGAL 437

Query: 417 -LDGTLDLSAGILHGTDSTSGNTEQVTYNDGFSASLWRNHTESDACSGRHPQSVHASMTC 475
+D T + L G + + YN + S T R+ S + +
Sbjct: 438 SVDMTQ--ANSTLPDDSQHDGQSVRFLYNKSLNES----GTNIQLVGYRYSTSGYFNFAD 491

Query: 476 QTSMNASLSVPVGNWYALLGYSTSRTEGRPVYRGYDDNSDKENVF----------WRQAY 525
T N Y + Y+ +K Y
Sbjct: 492 TTYSRM-------NGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLY 544

Query: 526 IPASHRE-------SAQASATYNLNMAGMNINTHGGVWRTRNDGVNDDGLFMSVSVSYAS 578
+ SH+ Q A N +N + + D L ++V++ ++
Sbjct: 545 LSGSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSH 604

Query: 579 Q-PPTMTGSNGYTSAGTDIHSSRNQKTQTSWNVNHVRSWQQDLYRELSVGFSGYNDDSWS 637
+ SA + N + V +L + G++G D +
Sbjct: 605 WLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSG 664

Query: 638 GSLGGRMS--GRMGELSATISNSHQRNAGSASSLTAGYSSSLALSRNGLFWG 687
+ ++ G G + S+S L G S + NG+ G
Sbjct: 665 STGYATLNYRGGYGNANIGYSHS-----DDIKQLYYGVSGGVLAHANGVTLG 711


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_17775  PF05775  94     2e-27    Enterobacteria AfaD invasin protein
		>PF05775#Enterobacteria AfaD invasin protein

Length = 142

Score = 93.8 bits (233), Expect = 2e-27
Identities = 38/132 (28%), Positives = 67/132 (50%), Gaps = 2/132 (1%)

Query: 14 SVTLLVAASSLMPIANAAEKLQTTLRVGTYFRAGHVPDGMVLAQGRVTYHGSHSGFRVWS 73
S++L + LM + + ++ TL Y + DG+ LA GR+ +HSGFRVW
Sbjct: 4 SISLTLCGILLMLMGSFSQAADITLMNHKYM-GNLLHDGVKLATGRIICQDTHSGFRVWI 62

Query: 74 DEQKAGNTPTVLLLSGQQDPRHHIQVRLEGEGWQPDTVSGRGAILRTAADNAS-FSVVVD 132
+ ++ G ++ + P+H++++R+ G GW G + T ++AS F + VD
Sbjct: 63 NARQEGGGAGKYIVQSTEGPQHNLRIRISGNGWSSFVEKGIQGVFNTIKEDASIFYIEVD 122

Query: 133 GNQEVPADTWTL 144
GNQ+V +
Sbjct: 123 GNQQVQPGKYLF 134


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_17780  PF00577  823    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 823 bits (2127), Expect = 0.0
Identities = 309/872 (35%), Positives = 452/872 (51%), Gaps = 52/872 (5%)

Query: 4 KQPALLLFIAGVVHCANA-------HAYTFDASML-GDAAKGVDMSLFNQG-VQQPGTYR 54
K F+ V CA A F+ L D D+S F G PGTYR
Sbjct: 20 KHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYR 79

Query: 55 VDVMVNGKRVDTRDVVFKLEKDGQGTPFLAPCLTVSQLSRYGVKTEDYPQLWKAAKTPDE 114
VD+ +N + TRDV F QG + PCLT +QL+ G+ T + A D
Sbjct: 80 VDIYLNNGYMATRDVTFNTGDSEQG---IVPCLTRAQLASMGLNTASVSGMNLLAD--DA 134

Query: 115 CADL-SAIPQAKAVLDTNNQQLQLSIPQVALRTKFKGIAPEDLWDDGIPAFLMNYSARTT 173
C L S I A A LD Q+L L+IPQ + + +G P +LWD GI A L+NY+
Sbjct: 135 CVPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGN 194

Query: 174 QTDYKMDMERRDNSSWVQLQPGINIGAWRVRNATSWQR-----SGQQSGKWQAAYTYAER 228
++ + +++ LQ G+NIGAWR+R+ T+W S KWQ T+ ER
Sbjct: 195 SVQNRIG--GNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLER 252

Query: 229 GLYSLKSRLTLGQKTSQGEIFDSVPFTGVMLASDDNMVPYSERQFAPVVRGIARTQARVE 288
+ L+SRLTLG +QG+IFD + F G LASDDNM+P S+R FAPV+ GIAR A+V
Sbjct: 253 DIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVT 312

Query: 289 VKQNGYTIYNTTVAPGPFALRDLSVTDSSGDLHVTVWEADGSTQMFVVPYQTPAIALHQG 348
+KQNGY IYN+TV PGPF + D+ +SGDL VT+ EADGSTQ+F VPY + + +G
Sbjct: 313 IKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREG 372

Query: 349 YLKYSLLAGRYRSSDSATDKAQIAQATLMYGLPWNLTAYGGIQSATHYQAALLGLGGSLG 408
+ +YS+ AG YRS ++ +K + Q+TL++GLP T YGG Q A Y+A G+G ++G
Sbjct: 373 HTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMG 432

Query: 409 RWGSLSVDGSDTHSQRQGEAVQQGASWRLRYSNQLTATGTNLFLTRWQYASQGYNTLSDV 468
G+LSVD + +S ++ G S R Y+ L +GTN+ L ++Y++ GY +D
Sbjct: 433 ALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADT 492

Query: 469 LDSYRHNGNRL-------------WSWRENLQPSSRTTLMLSQSWGRHLGNLSLIGSRTD 515
S + N + + L ++Q GR L L GS
Sbjct: 493 TYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGR-TSTLYLSGSHQT 551

Query: 516 WRNRPGHDDSYGLSWGTSIGVGLLSLNWNQNRTLWRNGEHRKENITSLWFSMPLSRWTGN 575
+ D+ + T+ +L+++ + W+ G ++ + +L ++P S W +
Sbjct: 552 YWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKG---RDQMLALNVNIPFSHWLRS 608

Query: 576 -------NVSASWQMTSPSHGGQTQQVGVNGEAFSQ-QLDWEVRQSYRADAPPGGGNNSA 627
+ SAS+ M+ +G T GV G L + V+ Y G+
Sbjct: 609 DSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGY 668

Query: 628 LHLAWNGDYGLLGGDYSYSRAMRQMGVNIAGGIVIHHHGVTLGQPLQGSVALVEAPGASG 687
L + G YG YS+S ++Q+ ++GG++ H +GVTLGQPL +V LV+APGA
Sbjct: 669 ATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKD 728

Query: 688 VPVGGWPGVKTDFRGDTTVGNLSVYQENTVSLDPSRLPDDAEVTQTDVRVVPTEGAVVEA 747
V GV+TD+RG + + Y+EN V+LD + L D+ ++ VVPT GA+V A
Sbjct: 729 AKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRA 788

Query: 748 KFHTRIGARALMTLKREDGSAIPFGAQVTVNGQDGSAALVDTDSQVYLTGLADKGELTVK 807
+F R+G + LMTL + +PFGA VT + S+ +V + QVYL+G+ G++ VK
Sbjct: 789 EFKARVGIKLLMTLTH-NNKPLPFGAMVT-SESSQSSGIVADNGQVYLSGMPLAGKVQVK 846

Query: 808 WGA---QQCRVNYRLPAHKGIAGLYQMSGLCR 836
WG C NY+LP L Q+S CR
Sbjct: 847 WGEEENAHCVANYQLPPESQQQLLTQLSAECR 878


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
GX95_17895  OMPADOMAIN  70     2e-15    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 70.3 bits (172), Expect = 2e-15
Identities = 36/128 (28%), Positives = 58/128 (45%), Gaps = 16/128 (12%)

Query: 317 QHSRVVFRGDAMFVPGQKTVSDAIRPVINKAAREIARVG---GAVTVTGHTDSQPIHSAE 373
Q + D +F + T+ + +++ +++ + G+V V G+TD I S
Sbjct: 211 QTKHFTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDR--IGSDA 268

Query: 374 FPSNLVLSEKRAAEVAALLTSGGVPAGRVHIVGKGDTVPVADN---------GSKAGRAK 424
+ N LSE+RA V L S G+PA ++ G G++ PV N A
Sbjct: 269 Y--NQGLSERRAQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAP 326

Query: 425 NRRVEILV 432
+RRVEI V
Sbjct: 327 DRRVEIEV 334


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
GX95_17955  HTHFIS  33     0.005    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 32.9 bits (75), Expect = 0.005
Identities = 38/185 (20%), Positives = 58/185 (31%), Gaps = 31/185 (16%)

Query: 529 ALREWQGDAPVV----FPEVSAAVVA--AIVADWTGIP--AGRMVKDEASQVLELPARLA 580
+++ + D PV+ A+ A D+ P ++ + E R +
Sbjct: 68 RIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRRPS 127

Query: 581 QRVTGQDGALAQIGE--RIQTAR---AGLGDPRKPVGVFMLAGPSGVGKTETALALAEAI 635
+ + +G +Q A L + M+ G SG GK A AL +
Sbjct: 128 KLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTL---MITGESGTGKELVARALHDYG 184

Query: 636 YGGEQNLVTINMSEFQEAHTVSTLKGAPPGYVGYGEGGVLTEAVRRHPWSV-------VL 688
V INM+ S L G E G T A R +
Sbjct: 185 KRRNGPFVAINMAAIPRDLIESELFGH--------EKGAFTGAQTRSTGRFEQAEGGTLF 236

Query: 689 LDEIE 693
LDEI
Sbjct: 237 LDEIG 241


52  GX95_18290  GX95_18345  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_182900153.3693415'-methylthioadenosine/S-adenosylhomocysteine
GX95_182950154.713644vitamin B12 ABC transporter substrate-binding
GX95_183001175.147370hypothetical protein
GX95_183051185.429415iron-sulfur cluster insertion protein ErpA
GX95_18310-1123.780861chloride channel protein
GX95_18315-2143.702705glutamate-1-semialdehyde-2,1-aminomutase
GX95_18320-2153.951984Fe3+-hydroxamate ABC transporter permease FhuB
GX95_18325-2133.411723iron-hydroxamate transporter substrate-binding
GX95_18330-1132.699087iron-hydroxamate transporter ATP-binding
GX95_18335-1132.300327penicillin-binding protein 1B
GX95_18340-2123.870440ATP-dependent helicase HrpB
GX95_183450103.8442082'-5' RNA ligase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_18295  FERRIBNDNGPP  46     4e-08    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 46.5 bits (110), Expect = 4e-08
Identities = 32/174 (18%), Positives = 62/174 (35%), Gaps = 9/174 (5%)

Query: 23 APRVITLSPANTELAFAAGITPVGVSSYSDY------PPEAQKIEQVSTWQGMNLERIVA 76
R++ L EL A GI P GV+ +Y PP + V NLE +
Sbjct: 35 PNRIVALEWLPVELLLALGIVPYGVADTINYRLWVSEPPLPDSVIDVGLRTEPNLELLTE 94

Query: 77 LKPDLVVAWRG-GNAERQVNQLTSLGIKVMWVDAVSIEQIADTLRQLAAWSPQPEKAQQA 135
+KP +V G G + + ++ + +L ++A A+
Sbjct: 95 MKPSFMVWSAGYGPSPEMLARIAPGRGFNFSDGKQPLAMARKSLTEMADLLNLQSAAETH 154

Query: 136 AQTLLNEYAALNAEYAGKAKKRVFLQFGMNP--LFTSGKGSIQHQVLTTCGGEN 187
+ ++ + + + + L ++P + G S+ ++L G N
Sbjct: 155 LAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVFGPNSLFQEILDEYGIPN 208


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_18325  FERRIBNDNGPP  498    0.0      Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 498 bits (1283), Expect = 0.0
Identities = 245/296 (82%), Positives = 266/296 (89%)

Query: 1 MRDLYPLTRRRLLTAMALSPLLWQMNTAQAAAIDPRRIVALEWLPVELLLALGITPYGVA 60
M L ++RRRLLTAMALSPLLWQMNTA AAAIDP RIVALEWLPVELLLALGI PYGVA
Sbjct: 1 MSGLPLISRRRLLTAMALSPLLWQMNTAHAAAIDPNRIVALEWLPVELLLALGIVPYGVA 60

Query: 61 DVPNYKLWVSEPPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPEKLARIAPGR 120
D NY+LWVSEPPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPE LARIAPGR
Sbjct: 61 DTINYRLWVSEPPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPEMLARIAPGR 120

Query: 121 GFDFSDGKKPLAVARRSLVELAQTLNLEATAEKHLAQYDRFIASQKPHFIRRGGRPLLMT 180
GF+FSDGK+PLA+AR+SL E+A LNL++ AE HLAQY+ FI S KP F++RG RPLL+T
Sbjct: 121 GFNFSDGKQPLAMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLT 180

Query: 181 TLIDPRHMLVLGPNCLFQEVLDEYGIVNAWQGETNFWGSTAVSIDRLAMYKEADVICFDH 240
TLIDPRHMLV GPN LFQE+LDEYGI NAWQGETNFWGSTAVSIDRLA YK+ DV+CFDH
Sbjct: 181 TLIDPRHMLVFGPNSLFQEILDEYGIPNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDH 240

Query: 241 GNSTDMNALMATPLWQAMPFVRAGRFHRVPAVWFYGATLSTMHFVRILNNVLGGKA 296
NS DM+ALMATPLWQAMPFVRAGRF RVPAVWFYGATLS MHFVR+L+N +GGKA
Sbjct: 241 DNSKDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRVLDNAIGGKA 296


53  GX95_18505  GX95_18550  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_18505-2163.3731614-hydroxythreonine-4-phosphate dehydrogenase
GX95_18510-2133.678178hypothetical protein
GX95_18515-2143.9090242-keto-3-deoxygluconate permease
GX95_18520-2153.945731UPF0231 family protein
GX95_18525-1223.268377bifunctional aconitate hydratase
GX95_185301283.289297hypothetical protein
GX95_185352312.200100hypothetical protein
GX95_185404371.955057hypothetical protein
GX95_185453371.855135dihydrolipoyl dehydrogenase
GX95_185502331.121994pyruvate dehydrogenase complex
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
GX95_18535  IGASERPTASE  31     0.012    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 31.2 bits (70), Expect = 0.012
Identities = 20/141 (14%), Positives = 50/141 (35%), Gaps = 1/141 (0%)

Query: 343 VPYPNNTVAQRFHPTNVSGGLSATQQAPVSRDSQRQAAMAQFQQRSHTSPANLSGETSRD 402
+ PNN A + + ++ +APV + + + + S +
Sbjct: 997 ITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPS-ETTETVAENSKQESKTVEKNE 1055

Query: 403 RQRKAASQQLNQIAQRNNYRGYDGTQNSSRREAAQQTLNKSTTQQHRSELKAKAQQHPVS 462
+ + Q ++A+ TQ + ++ +T TT+ + K ++ V
Sbjct: 1056 QDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVE 1115

Query: 463 QQQRDTARQRIESSTPQQRQA 483
++ + +P+Q Q+
Sbjct: 1116 TEKTQEVPKVTSQVSPKQEQS 1136


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
GX95_18555  RTXTOXIND  36     4e-04    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 36.3 bits (84), Expect = 4e-04
Identities = 43/282 (15%), Positives = 85/282 (30%), Gaps = 38/282 (13%)

Query: 26 DKVEAEQSLITVEGDKASMEVPSPQAGVVKEIKVSVGDKTETGALIMIFDSADGAADAAP 85
+ V +T G S E+ + +VKEI V G+ G +++ + AD
Sbjct: 81 EIVATANGKLTHSGR--SKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLK 138

Query: 86 AKA--------EEKKEAAPAAAPAAAAAKDVHVPDIGSDEVEVTEVMVKVG------DTV 131
++ + + + + + + V EV+ T
Sbjct: 139 TQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTW 198

Query: 132 EAEQSLITVEGDKASMEVPAPFAGTVKEIKVNTGDKVSTGSLIMVFEVAGAAPAKAEAAP 191
+ ++ + DK E A + ++ +K + A K
Sbjct: 199 QNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHK-QAIA-KHAVLE 256

Query: 192 AAAAPAAATGVKDVNVPDIGGDEVEVTEVMVK-----------VGDKVA--------AEQ 232
A V + E E+ + + DK+
Sbjct: 257 QENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTL 316

Query: 233 SLITVEGDKASMEVPAPFAGTVKEIKIST-GDKVKTGSLIMV 273
L E + + + AP + V+++K+ T G V T +MV
Sbjct: 317 ELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMV 358



Score = 29.8 bits (67), Expect = 0.039
Identities = 16/85 (18%), Positives = 32/85 (37%), Gaps = 4/85 (4%)

Query: 226 DKVAAEQSLITVEGDKASMEVPAPFAGTVKEIKISTGDKVKTGSLIMVFEVEGAAPAAAP 285
+ VA +T G S E+ VKEI + G+ V+ G ++ ++ A
Sbjct: 81 EIVATANGKLTHSGR--SKEIKPIENSIVKEIIVKEGESVRKGDVL--LKLTALGAEADT 136

Query: 286 AKQEAAAPAPAAKAEKPAAPAAKAE 310
K +++ + + + E
Sbjct: 137 LKTQSSLLQARLEQTRYQILSRSIE 161


54  GX95_18765  GX95_18795  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_187651183.9366313-isopropylmalate dehydratase large subunit
GX95_187702173.5710373-isopropylmalate dehydratase small subunit
GX95_187751173.993756inhibitor of glucose transporter
GX95_187801163.601395transcriptional regulator SgrR
GX95_187851164.656778thiamine ABC transporter substrate binding
GX95_187900174.667928thiamine/thiamine pyrophosphate ABC transporter
GX95_18795-2173.787252thiamine ABC transporter ATP-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_18790  PF06580  31     0.014    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 30.6 bits (69), Expect = 0.014
Identities = 16/79 (20%), Positives = 27/79 (34%), Gaps = 3/79 (3%)

Query: 4 RRQPLIPGWLIPGLCAAALMITVALAAFLALWLNAPSGAWSTIWRDSYLWHVVRFSFWQA 63
R GWL + L + A +W A + W + +++
Sbjct: 60 RSFIKRQGWLKLNMGQIILRVLPACVVIGMVWFVANTSIWRLL---AFINTKPVAFTLPL 116

Query: 64 FLSAVLSVVPAVFLARALY 82
LS + +VV F+ LY
Sbjct: 117 ALSIIFNVVVVTFMWSLLY 135


55  GX95_18975  GX95_19040  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_189750233.342592carnitinyl-CoA dehydratase
GX95_18980-1203.134635phenylacetic acid degradation protein PaaY
GX95_189850203.126570CaiF/GrlA family transcriptional regulator
GX95_189900203.313804carbamoyl phosphate synthase large subunit
GX95_189950131.988117carbamoyl phosphate synthase small subunit
GX95_190000122.0965084-hydroxy-tetrahydrodipicolinate reductase
GX95_190050111.623399triphosphoribosyl-dephospho-CoA synthase CitG
GX95_190100120.311308holo-ACP synthase CitX
GX95_19015-1130.207727citrate lyase subunit alpha
GX95_19020-1183.300686citrate (pro-3S)-lyase subunit beta
GX95_190251245.133700citrate lyase acyl carrier protein
GX95_190300212.817787[citrate (pro-3S)-lyase] ligase
GX95_19035-1232.472684citrate:sodium symporter
GX95_190401254.079611oxaloacetate decarboxylase subunit gamma
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
GX95_18990  HTHFIS  33     0.008    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 32.9 bits (75), Expect = 0.008
Identities = 20/110 (18%), Positives = 38/110 (34%), Gaps = 18/110 (16%)

Query: 34 CKALREEGYRVILVNS-----------NPATIMTDPEMADATYIEPIHWEVVRKIIEKER 82
+AL GY V + ++ + ++TD M D + + I+K R
Sbjct: 20 NQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD------LLPRIKKAR 73

Query: 83 PDAVLPTMGGQTALNCALELERQGVLEEFGVTM-IGATADAIDKAEDRRR 131
PD + M Q A++ +G + + I +A +
Sbjct: 74 PDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_19030  LPSBIOSNTHSS  38     1e-05    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.
Length = 166

Score = 38.3 bits (89), Expect = 1e-05
Identities = 21/102 (20%), Positives = 43/102 (42%), Gaps = 4/102 (3%)

Query: 154 NPFTLGHRYLVEQAAAACDWLHLFVVKEDAS--FFSYTDRWALIEQGIAGIDNVTLHSGS 211
+P T GH ++E+ D +++ V++ FS +R I + IA + N + S
Sbjct: 10 DPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAHLPNAQVDSFE 69

Query: 212 AYMISRATFPGYFLKEKGV--VDDCHCQIDLQLFREHLAPAL 251
++ A +G+ + D ++ + + LA L
Sbjct: 70 GLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDL 111


56  GX95_19130  GX95_19175  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_19130110-4.173972arylsulfatase
GX95_19135211-4.943771cytoplasmic protein
GX95_19140212-6.012311anaerobic sulfatase maturase
GX95_19145216-7.670377arylsulfatase
GX95_19150223-10.055289hypothetical protein
GX95_19155126-10.304702multifunctional 2',3'-cyclic-nucleotide
GX95_19160130-9.573222arylsulfatase
GX95_19165132-8.603612transcriptional regulator
GX95_19170030-6.119156LysR family transcriptional regulator
GX95_19175028-3.289087transcriptional regulator
57  GX95_19230  GX95_19265  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_19230-113-3.419249chitinase
GX95_19235-118-5.056993chitinase
GX95_19240016-4.560964hypothetical protein
GX95_19245116-3.671593hypothetical protein
GX95_19250217-3.277965hypothetical protein
GX95_19255118-2.315450hypothetical protein
GX95_19260224-0.003954molecular chaperone DnaJ
GX95_19265228-0.037429molecular chaperone DnaK
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_19265  SHAPEPROTEIN  141    3e-39    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.
Length = 347

Score = 141 bits (357), Expect = 3e-39
Identities = 84/388 (21%), Positives = 152/388 (39%), Gaps = 86/388 (22%)

Query: 5 IGIDLGTTNSCVAIMDGTQARVLENAEGDRTTPSIIAYTQDGET------LVGQPAKRQA 58
+ IDLGT N+ + + Q VL PS++A QD VG AK+
Sbjct: 13 LSIDLGTANTLIYVKG--QGIVLNE-------PSVVAIRQDRAGSPKSVAAVGHDAKQML 63

Query: 59 VTNPQNTLFAIKRLIGRRFQDEEVQRDVSIMPYKIIGADNGDAWLDVKGQKMAPPQISAE 118
P N + AI+ +K +A ++ +
Sbjct: 64 GRTPGN-IAAIR---------------------------------PMKDGVIADFFVTEK 89

Query: 119 VLKK-MKKTAEDYLGEPVTEAVITVPAYFNDAQRQATKDAGRIAGLEVKRIINEPTAAAL 177
+L+ +K+ + P ++ VP +R+A +++ + AG +I EP AAA+
Sbjct: 90 MLQHFIKQVHSNSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGAREVFLIEEPMAAAI 149

Query: 178 AYGL--DKEVGNRTIAVYDLGGGTFDISIIEIDEVDGEKTFEVLATNGDTHLGGEDFDTR 235
GL + G+ V D+GGGT ++++I ++ V + +GG+ FD
Sbjct: 150 GAGLPVSEATGS---MVVDIGGGTTEVAVISLNGV---------VYSSSVRIGGDRFDEA 197

Query: 236 LINYLVDEFKKDQGIDLRNDPLAMQRLKEAAEKAKIELSSA----QQTDVNLPYITADAT 291
+INY+ + G + AE+ K E+ SA + ++ +
Sbjct: 198 IINYVRRNYGSLIG-------------EATAERIKHEIGSAYPGDEVREIEVRGRNLAEG 244

Query: 292 GPKHMNIKVTRAKLESLVEDLVNRSIEPLKVALQD-AGLSVSDIND--VILVGGQTRMPM 348
P+ + + LE+L E + + + VAL+ SDI++ ++L GG +
Sbjct: 245 VPRGFTLN-SNEILEALQEP-LTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRN 302

Query: 349 VQKKVAEFFGKEPRKDVNPDEAVAIGAA 376
+ + + E G +P VA G
Sbjct: 303 LDRLLMEETGIPVVVAEDPLTCVARGGG 330


58  GX95_19320  GX95_19450  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_19320120-4.379302thr operon leader peptide
GX95_19325120-4.418168tRNA/rRNA methyltransferase
GX95_19330120-5.703735hypothetical protein
GX95_19335117-5.754263two-component system response regulator ArcA
GX95_19340221-6.135920hypothetical protein
GX95_19345321-5.400228hypothetical protein
GX95_19350218-3.851765fimbrial protein SthA
GX95_19355014-2.272656fimbrial assembly protein
GX95_19360-112-1.201208fimbrial assembly protein
GX95_19365-3140.965011fimbrial protein SthD
GX95_19370-3121.311884fimbrial assembly protein
GX95_19375-2142.403033cell envelope integrity protein CreD
GX95_19380-1163.208100two-component system sensor histidine kinase
GX95_193850152.479624two-component system response regulator CreB
GX95_19390-1152.854445hypothetical protein
GX95_19395-1173.601525MDR efflux pump AcrAB transcriptional activator
GX95_19400-2172.829999phosphoglycerate mutase
GX95_19405-1183.196132inosine/xanthosine triphosphatase
GX95_19410-1183.245570Trp operon repressor
GX95_19415-1183.359600murein transglycosylase
GX95_19420-1173.078446energy-dependent translational throttle protein
GX95_19425-3121.329233trifunctional nicotinamide-nucleotide
GX95_19430-2131.423099DNA repair protein RadA
GX95_19435-116-1.206462phosphoserine phosphatase SerB
GX95_19440120-2.915795hypothetical protein
GX95_19445222-4.241026lipoate--protein ligase
GX95_19450223-4.777496hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
GX95_19335  HTHFIS  82     4e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 81.8 bits (202), Expect = 4e-20
Identities = 30/122 (24%), Positives = 60/122 (49%), Gaps = 1/122 (0%)

Query: 1 MQTPHILIVEDELVTRNTLKSIFEAEGYDVFEATDGAEMHQILSEYDINLVIMDINLPGK 60
M IL+ +D+ R L GYDV ++ A + + ++ D +LV+ D+ +P +
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 NGLLLARELRE-QANVALMFLTGRDNEVDKILGLEIGADDYITKPFNPRELTIRARNLLS 119
N L +++ + ++ ++ ++ ++ + I E GA DY+ KPF+ EL L+
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 120 RT 121

Sbjct: 121 EP 122


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_19360  PF00577  770    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 770 bits (1989), Expect = 0.0
Identities = 335/877 (38%), Positives = 491/877 (55%), Gaps = 70/877 (7%)

Query: 14 LSFLFICCS----IKPALAHDHFNPLSLENDEPGVENVDLSVFEKGGQAE-GTYNVDIYI 68
LF+ C+ + A +FNP L +D DLS FE G + GTY VDIY+
Sbjct: 27 FVRLFVACAFAAQAPLSSAELYFNPRFLADD--PQAVADLSRFENGQELPPGTYRVDIYL 84

Query: 69 NNASVETKNIAFKNKKSADNKLSLQPCLSVAQLKQWGVKTENFPELQN-DPNGCTDL-SL 126
NN + T+++ F + D++ + PCL+ AQL G+ T + + + C L S+
Sbjct: 85 NNGYMATRDVTFN---TGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSM 141

Query: 127 LAGAVAKFNVIGNRLDLAIPQIALIADPREFVPTSEWDEGINAFLLNYSFTGSQDHDIDE 186
+ A A+ +V RL+L IPQ + R ++P WD GINA LLNY+F+G+ +
Sbjct: 142 IHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQN-RI 200

Query: 187 NRTENSEYANLRPGINIGAWRFRNYSTW-----NHDSDGQNSWDSAYTYVSRDIEFLKGQ 241
+ Y NL+ G+NIGAWR R+ +TW + S +N W T++ RDI L+ +
Sbjct: 201 GGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSR 260

Query: 242 LIAGENNTPADVFDSISFKGVQISSDDDMLPDSMKGFAPVIRGVAKSSAQVTVEQNGYTI 301
L G+ T D+FD I+F+G Q++SDD+MLPDS +GFAPVI G+A+ +AQVT++QNGY I
Sbjct: 261 LTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDI 320

Query: 302 YKTNVPAGPFAINDLYPTGGSGDLYVTIKESDGSEQHFIVPYASVPVLQREGHLKYDLTV 361
Y + VP GPF IND+Y G SGDL VTIKE+DGS Q F VPY+SVP+LQREGH +Y +T
Sbjct: 321 YNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITA 380

Query: 362 GRTRSSDSHSAQQNFAELTALYGLAGGITAYGGIESTLSNDIYHAALIGTGLNLGDLGAL 421
G RS ++ + F + T L+GL G T YGG + D Y A G G N+G LGAL
Sbjct: 381 GEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLA---DRYRAFNFGIGKNMGALGAL 437

Query: 422 SLDVTNSWSKIKAGDVVSDTLTGQSWRIRYSKDIQSTGTNFTVAGYRYSTKDYYALEDVL 481
S+D+T + S + GQS R Y+K + +GTN + GYRYST Y+ D
Sbjct: 438 SVDMTQANSTLPDD----SQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTT 493

Query: 482 DTYSD--------------------NSHYDHVRNRTDLSLSQDII-YGSISLTLYNEDYW 520
+ + + + R + L+++Q + ++ L+ ++ YW
Sbjct: 494 YSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYW 553

Query: 521 N-DTHTTSLGIGYNNTWHNVSYGINYSYTLNADNSQDEDDDTEDSNDQQISINISIPLDA 579
G N + ++++ ++YS T NA DQ +++N++IP
Sbjct: 554 GTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKG---------RDQMLALNVNIPFSH 604

Query: 580 FMPS--------TYATYNMNSAKDGDTTHTVGLNGTALAQKNLSWSVQEGYSS---QEKA 628
++ S A+Y+M+ +G T+ G+ GT L NLS+SVQ GY+
Sbjct: 605 WLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSG 664

Query: 629 TSGNVSATYNGTYADINGGYSYDNHMRRLNYGVQGGVLLHRNGLTLSQPMDDTIILVKAP 688
++G + Y G Y + N GYS+ + +++L YGV GGVL H NG+TL QP++DT++LVKAP
Sbjct: 665 STGYATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAP 724

Query: 689 GAAGVPVNNETGVDTDFRGYAVVPYASPYHRNEVSLDTTGIRKNIELIDTSKTLVPTRGA 748
GA V N+TGV TD+RGYAV+PYA+ Y N V+LDT + N++L + +VPTRGA
Sbjct: 725 GAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGA 784

Query: 749 VVRAEYKTNIGYKALMALTRINNLPVPFGATVSSLTKPDNHSSFVGDAGQAWLTGLEKQG 808
+VRAE+K +G K LM LT NN P+PFGA V+S + S V D GQ +L+G+ G
Sbjct: 785 IVRAEFKARVGIKLLMTLTH-NNKPLPFGAMVTSES--SQSSGIVADNGQVYLSGMPLAG 841

Query: 809 RLLVKWGPTAADQCQVSYRIPSSPSASGVEILHEQCQ 845
++ VKWG C +Y++P + L +C+
Sbjct: 842 KVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR 878


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
GX95_19385  HTHFIS  93     6e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 92.6 bits (230), Expect = 6e-24
Identities = 32/139 (23%), Positives = 58/139 (41%)

Query: 1 MQQPQVWLVEDEQGIADTLIYTLQLEGFTVELFARGLPALEKARQQRPDAVILDVGLPDI 60
M + + +D+ I L L G+ V + + D V+ DV +PD
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 SGFELCRQLLERHPALPILFLTARSDEVDRLLGLEIGADDYVAKPFSPREVSARVRTLLR 120
+ F+L ++ + P LP+L ++A++ + + E GA DY+ KPF E+ + L
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 121 RVKKFAAPSPVVRTGHFEL 139
K+ + L
Sbjct: 121 EPKRRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_19425  LPSBIOSNTHSS  34     7e-04    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.
Length = 166

Score = 33.6 bits (77), Expect = 7e-04
Identities = 15/73 (20%), Positives = 35/73 (47%), Gaps = 10/73 (13%)

Query: 71 GKFYPLHTGHIYLIQRACSQVDELHIIMGYDDTRDRGLFEDSAMSQQPTVSDRLRWLLQT 130
G F P+ GH+ +I+R C D++++ + + + +F +V +RL + +
Sbjct: 7 GSFDPITFGHLDIIERGCRLFDQVYVAVL-RNPNKQPMF---------SVQERLEQIAKA 56

Query: 131 FKYQKNIRIHAFN 143
+ N ++ +F
Sbjct: 57 IAHLPNAQVDSFE 69


59  GX95_19625  GX95_19690  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_19625119-3.389432glucosamine--fructose-6-phosphate
GX95_19630-117-1.612528glucosamine--fructose-6-phosphate
GX95_19635-316-0.958480PTS mannose transporter subunit IID
GX95_19640-4150.818460PTS sugar transporter
GX95_19645-3171.016356PTS mannose/fructose/sorbose transporter subunit
GX95_19650-2201.625756PTS sugar transporter
GX95_19655-1202.348443Fis family transcriptional regulator
GX95_196600232.995846methyl-accepting chemotaxis protein II
GX95_196651253.137243carbon starvation protein A
GX95_19670016-3.050125hypothetical protein
GX95_19675117-2.866775GTPase
GX95_19680323-4.806084AraC family transcriptional regulator
GX95_19685017-3.572157threonine transporter RhtB
GX95_19690-118-3.962612hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_19650  SECFTRNLCASE  26     0.048    Bacterial translocase SecF protein signature.
		>SECFTRNLCASE#Bacterial translocase SecF protein signature.

Length = 333

Score = 26.3 bits (58), Expect = 0.048
Identities = 18/57 (31%), Positives = 27/57 (47%), Gaps = 1/57 (1%)

Query: 61 IIALTDIFAGSVNNEFVRYLS-RENFHLLAGINLPLVIDLFMSENDGNTTHTIMTAL 116
+ AL I S+N+ V + REN + L V++L ++E T T MT L
Sbjct: 209 VAALLTITGYSINDTVVVFDRLRENLIKYKTMPLRDVMNLSVNETLSRTVMTGMTTL 265


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
GX95_19655  HTHFIS  171    4e-48    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 171 bits (436), Expect = 4e-48
Identities = 91/381 (23%), Positives = 157/381 (41%), Gaps = 52/381 (13%)

Query: 106 LIGHNESLKRPIEQLKTALFYPDGGLPLLMTGESGTGKSYLAQLMHEYAIAQALLSP--D 163
L+G + +++ L + L L++TGESGTGK +A+ +H+ +
Sbjct: 139 LVGRSAAMQEIYRVLARLM---QTDLTLMITGESGTGKELVARALHD-------YGKRRN 188

Query: 164 APFISFNCAQYASNPELLAANLFGYVKGAFTGAQSDRPGAFEAADGGMLFLDEVHRLSAE 223
PF++ N A A +L+ + LFG+ KGAFTGAQ+ G FE A+GG LFLDE+ + +
Sbjct: 189 GPFVAINMA--AIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMD 246

Query: 224 GQEKLFTWLDRGEIYRVGDTAQGHPVSVRLVFATT----EEIHS-TFLTTFLRRI-PIQV 277
Q +L L +GE VG VR+V AT + I+ F R+ + +
Sbjct: 247 AQTRLLRVLQQGEYTTVGGR-TPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPL 305

Query: 278 NLPDLQHRSRQEKEALILLFFWTEAKKLS-ATLILKPRLLQILNQYVYRGNVGELKNVVK 336
LP L R R E ++ F +A+K L+++ + + GNV EL+N+V+
Sbjct: 306 RLPPL--RDRAEDIPDLVRHFVQQAEKEGLDVKRFDQEALELMKAHPWPGNVRELENLVR 363

Query: 337 YAVATAWAKKPGQETVTVSLHDLPDAMLSALPSLNESLADDTPVSISPDTNLTWLLRARD 396
A ++ + + + S +P A S+S + +R
Sbjct: 364 RLTALY-------PQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYF 416

Query: 397 EMQG-MIHDTQCHVLALYELVRSGKEGWETVQKRMGDEIEALFDRLIFTGDDNVHSQRLL 455
G + + + L E+ E + L T + + + LL
Sbjct: 417 ASFGDALPPSGLYDRVLAEM-----------------EYPLILAALTATRGNQIKAADLL 459

Query: 456 LITSQVREEFYRLEKRFNMQL 476
+ R + + + +
Sbjct: 460 GLN---RNTLRKKIRELGVSV 477


60  GX95_19770  GX95_19880  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_19770  2       23       -5.057730   tryptophan--tRNA ligase
GX95_19775  1       24       -4.615599   GntR family transcriptional regulator
GX95_19780  2       28       -6.551821   dienelactone hydrolase
GX95_19785  0       33       -8.076087   hypothetical protein
GX95_19790  0       26       -5.558872   hypothetical protein
GX95_19795  0       27       -7.064207   hypothetical protein
GX95_19800  -1      22       -4.333603   hypothetical protein
GX95_19805  0       21       -3.849342   hypothetical protein
GX95_19810  1       19       -3.160509   SAM-dependent methyltransferase
GX95_19815  3       13       -3.218788   hypothetical protein
GX95_19820  3       14       -5.544729   molybdopterin-guanine dinucleotide biosynthesis
GX95_19825  3       12       -4.724724   hypothetical protein
GX95_19830  3       13       -4.861502   cytoplasmic protein
GX95_19835  2       13       -4.555522   DNA repair protein
GX95_19840  2       13       -4.286709   type II restriction endonuclease
GX95_19845  2       16       -6.054514   hypothetical protein
GX95_19850  2       17       -5.394197   hypothetical protein
GX95_19855  2       20       -5.872679   TIGR02687 family protein
GX95_19860  2       19       -5.776883   TIGR02688 family protein
GX95_19865  3       25       -6.833598   hypothetical protein
GX95_19870  4       32       -8.122946   restriction endonuclease
GX95_19875  4       28       -5.822506   DNA helicase
GX95_19880  3       25       -4.670045   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
GX95_19850  HTHFIS  30     0.017    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.8 bits (67), Expect = 0.017
Identities = 8/21 (38%), Positives = 14/21 (66%)

Query: 30 ESNVLITGPNGCGKSTLLRAI 50
+ ++ITG +G GK + RA+
Sbjct: 160 DLTLMITGESGTGKELVARAL 180


61  GX95_20375  GX95_20435  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_20375  2       14       0.238135    hypothetical protein
GX95_20380  2       16       0.085734    hypothetical protein
GX95_20385  4       19       1.023829    23S rRNA
GX95_20390  4       21       1.065409    ribonuclease R
GX95_20395  4       21       0.634999    transcriptional repressor NsrR
GX95_20400  4       24       0.788135    adenylosuccinate synthase
GX95_20405  1       19       0.898047    DUF2065 domain-containing protein
GX95_20410  -1      17       2.137002    protease modulator HflC
GX95_20415  -3      13       2.812354    HflK protein
GX95_20420  -3      16       2.479073    GTPase HflX
GX95_20425  -1      15       3.624725    RNA chaperone Hfq
GX95_20430  -1      13       3.615148    tRNA (adenosine(37)-N6)-dimethylallyltransferase
GX95_20435  -1      12       3.101377    DNA mismatch repair protein MutL
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_20375  cloacin  31     0.004    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 30.8 bits (69), Expect = 0.004
Identities = 37/147 (25%), Positives = 58/147 (39%), Gaps = 20/147 (13%)

Query: 54 ARVKLSHDKLNDLRERKASLETRALAAMSKNVDAAL--LNEVAEEIARLENAILAEEQVL 111
+VK D+ N R ++ T + A +N + A LN+ E++AR + QV
Sbjct: 294 DQVKQRQDEEN--RRQQEWDATHPVEAAERNYERARAELNQANEDVARNQERQAKAVQVY 351

Query: 112 TNLEASRDAVEKAVTATGQRIAQFEQQLEVVKATEAMQRAQQAVTTSTVGAASNVSTAAE 171
+ ++ DA K + I QF + A + M G A
Sbjct: 352 NSRKSELDAANKTLADAIAEIKQFNRF-----AHDPM-----------AGGHRMWQMAGL 395

Query: 172 SLKRLQTRQAERQARLDAAAQLEKVAD 198
+R QT +QA DAAA+ + AD
Sbjct: 396 KAQRAQTDVNNKQAAFDAAAKEKSDAD 422


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
GX95_20390  RTXTOXIND  32     0.013    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.7 bits (72), Expect = 0.013
Identities = 12/55 (21%), Positives = 26/55 (47%), Gaps = 1/55 (1%)

Query: 165 VVPDDSRLSFDILIPPEDVMGARMGFVVVVELTQRPTRRTKAV-GKIVEVLGDNM 218
+VP+D L L+ +D+ +G ++++ P R + GK+ + D +
Sbjct: 359 IVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAI 413


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_20410  PYOCINKILLER  29     0.030    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 29.0 bits (64), Expect = 0.030
Identities = 18/65 (27%), Positives = 30/65 (46%), Gaps = 3/65 (4%)

Query: 225 NRMRAEREAVARRHRSQGQEEAEKLRAAADYEVTK---TLAEAERQGRIMRGEGDAEAAK 281
N+ R + A A+R + + +RAA Y + +A A +G I +G A A+
Sbjct: 220 NKAREQAAAEAKRKAEEQARQQAAIRAANTYAMPANGSVVATAAGRGLIQVAQGAASLAQ 279

Query: 282 LFADA 286
+DA
Sbjct: 280 AISDA 284


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits  Score  E-value  Comments
GX95_20420  SECA  33     0.002    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 33.3 bits (76), Expect = 0.002
Identities = 26/144 (18%), Positives = 55/144 (38%), Gaps = 6/144 (4%)

Query: 282 HVVDAADVRVQENIEAVNTVLEEIDAHEIPTLMVMNKIDMLDDFEPRIDRDEENK-PIRV 340
++D +DV N + IDA+ P + ++ + + R+ D + PI
Sbjct: 665 ELLDVSDVSETINSIREDVFKATIDAYIPPQSL--EEMWDIPGLQERLKNDFDLDLPIAE 722

Query: 341 WLSAQSGVGIPQLFQALTERLSGEVAQHTLRLPPQEGRLRSRFYQLQAIEKEWMEEDGSV 400
WL + + L + + + + + + R + LQ ++ W E ++
Sbjct: 723 WLDKEPELHEETLRERILAQSIEVYQRKEEVVGAEMMRHFEKGVMLQTLDSLWKEHLAAM 782

Query: 401 SLQVRMPIVDWRRLCKQEPALIEY 424
+R I R +++P EY
Sbjct: 783 D-YLRQGIH-LRGYAQKDP-KQEY 803


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
GX95_20435  ALARACEMASE  30     0.028    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 30.1 bits (68), Expect = 0.028
Identities = 26/161 (16%), Positives = 57/161 (35%), Gaps = 18/161 (11%)

Query: 31 VENSLDAGATRVDIDIER---GGAKLIR-IRDNGCGIKKEELALALARHATSKIASLDDL 86
++ SLD A + ++ I R A++ ++ N G E + A+ + +L++
Sbjct: 5 IQASLDLQALKQNLSIVRQAATHARVWSVVKANAYGHGIERIWSAIGATDGFALLNLEEA 64

Query: 87 EAIISLGFRGEAL----------ASISSVSRLTLTSRTAEQAEAWQAYAEGRDMDVTVK- 135
+ G++G L I RLT + Q +A Q +D+ +K
Sbjct: 65 ITLRERGWKGPILMLEGFFHAQDLEIYDQHRLTTCVHSNWQLKALQNARLKAPLDIYLKV 124

Query: 136 -PAAHPVGTTLEVLDLFYNTPARRKFMRTEK--TEFNHIDE 173
+ +G + + + + + F +
Sbjct: 125 NSGMNRLGFQPDRVLTVWQQLRAMANVGEMTLMSHFAEAEH 165


62  GX95_20605  GX95_20725  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_20605  2       13       0.489870    C4-dicarboxylate transporter
GX95_20610  2       18       0.436458    divalent-cation tolerance protein CutA
GX95_20615  3       20       1.172750    protein-disulfide reductase DsbD
GX95_20620  7       28       1.201414    transcriptional regulator
GX95_20630  7       32       1.990200    *phosphatase PAP2 family protein
GX95_20635  10      38       6.409828    hypothetical protein
GX95_20640  9       36       7.379451    faeA-like family protein
GX95_20645  9       31       5.204226    hypothetical protein
GX95_20650  8       30       6.029225    fimbrial protein
GX95_20655  7       31       5.198766    hypothetical protein
GX95_20660  9       35       7.080441    fimbrial protein
GX95_20665  8       34       6.652332    fimbrial protein
GX95_20670  8       34       5.682501    hypothetical protein
GX95_20675  8       41       6.409498    fimbrial protein
GX95_20680  8       37       5.058968    Clp protease ClpE
GX95_20685  7       34       5.177024    fimbrial assembly protein
GX95_20690  3       22       -0.002736   fimbrial protein
GX95_20695  2       23       -3.624353   transcriptional regulator
GX95_20700  2       29       -8.094072   hypothetical protein
GX95_20705  2       30       -8.997363   transposase
GX95_20710  3       36       -11.316059  GNAT family N-acetyltransferase
GX95_20715  5       37       -12.484312  CopG family transcriptional regulator
GX95_20720  2       26       -9.314250   hypothetical protein
GX95_20725  1       24       -7.619023   AraC family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_20620  HTHTETR  47     9e-09    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 46.5 bits (110), Expect = 9e-09
Identities = 28/189 (14%), Positives = 54/189 (28%), Gaps = 15/189 (7%)

Query: 3 REDILGEALKLLETQGIADTTLEMVAERVNRPLDTLQRFWPDKEAILYDALRYLSQQVDI 62
R+ IL AL+L QG++ T+L +A+ + + DK + + +
Sbjct: 13 RQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGE 72

Query: 63 WRRQLLLDETLSAEQKLLARYSA-LSECVSNNRYPGCLFIAACTFYPDPTHPIHQLANQQ 121
+ L L V+ R + F+ + Q
Sbjct: 73 LELEYQAKFPGDPLSVLREILIHVLESTVTEERRRL---LMEIIFHKCEFVGEMAVVQQA 129

Query: 122 KRTAHDFTHELLTTL--------EID---DPAMVARQMELVLEGCLSRMLVNRSQADVDT 170
+R +++ + + A M + G + L D+
Sbjct: 130 QRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFAPQSFDLKK 189

Query: 171 AQRLAEDIL 179
R IL
Sbjct: 190 EARDYVAIL 198


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_20685  PF00577  393    e-126    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 393 bits (1012), Expect = e-126
Identities = 200/864 (23%), Positives = 330/864 (38%), Gaps = 88/864 (10%)

Query: 20 LALAIAVALGSGTASAGEKLDMSFIQGGAGINPEVWAALNGDY-APGRYLVDLSLNGKDI 78
L +A A A + +SA + F+ ++ NG PG Y VD+ LN +
Sbjct: 30 LFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNNGYM 89

Query: 79 GKRILDVTPQDSEA---LCLSEAWLAKAGIYVSAEYFRKGYDAMRQCWVLAKA-PAVKVD 134
R + DSE CL+ A LA G+ ++ A C L
Sbjct: 90 ATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTAS-VSGMNLLADDACVPLTSMIHDATAQ 148

Query: 135 FDVATQSLSLAIPQKGLVKMPENV----EWDYGTEAFRMNYNANANSGRYN---TSAFGS 187
DV Q L+L IPQ + WD G A +NYN + NS + S +
Sbjct: 149 LDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGNSHYAY 208

Query: 188 ADLKA--NIGRWVVSSSATASTGDGGSNEATINMFT-----ATRAIRSLSADLAVGKTST 240
+L++ NIG W + + T S S+ + N + R I L + L +G T
Sbjct: 209 LNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLGDGYT 268

Query: 241 GDSLLGSTGTYGVSLSRNNSMKPGSL-GYTPVFSGIANGPSRVTLTQNGRMLYSGMMPSG 299
+ G L+ +++M P S G+ PV GIA G ++VT+ QNG +Y+ +P G
Sbjct: 269 QGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNSTVPPG 328

Query: 300 PFSVTDVPLYS-SGDVTMTVTGDDGREQTQVFPLSVMAGQLSPGQHEFSVAAGIPDDDSD 358
PF++ D+ SGD+ +T+ DG Q P S + G +S+ AG +
Sbjct: 329 PFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYRSGNA 388

Query: 359 LEGG--VFAASYGYGL-DGLTLRTGGVFNQDWQGASAGAVLGLGYLGALSADGAYATAK- 414
+ F ++ +GL G T+ G ++ + G +G LGALS D A +
Sbjct: 389 QQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQANSTL 448

Query: 415 YRDGSRSGNKVKLAWSKQLALTNTGLR-VSWSHQSEEYEDMSSF---------------- 457
D G V+ ++K L + T ++ V + + + Y + +
Sbjct: 449 PDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIETQDGV 508

Query: 458 --NPAETWLQQNQGRRTRDEWNAGISQPVGGLFSLSASGWQRSYYPASTTGSYRYADDNG 515
+ N R + ++Q +G +L SG ++Y+ +
Sbjct: 509 IQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYW-----------GTSN 557

Query: 516 KETGITGTLSTQIKSVSLNLGWSGSRNMNGENTW-SASASVSVPF--------TLFERKY 566
+ L+T + ++ L +S ++N + + +V++PF R
Sbjct: 558 VDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHA 617

Query: 567 SSSTTVSTGKDGGTGVSTGISGAL--NDRFSYGLGGGRDSDGGISS----YLNAAYSGDR 620
S+S ++S +G G+ G L ++ SY + G G +S Y Y G
Sbjct: 618 SASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGY 677

Query: 621 ANLNGTVNHSSGSGTSGSVSASGSVLAVPAARDIMFSRTTGDTVAVVSVKDTPGVKVMSG 680
N N +HS SG VLA + + DTV +V KV +
Sbjct: 678 GNANIGYSHSDDI-KQLYYGVSGGVLAHANG--VTLGQPLNDTVVLVKAPGAKDAKVENQ 734

Query: 681 NETS-DSDGNLVVP-LNSYDWNTVTIDAGTLPLDTELSTTSRKVVPTDSAVVWMPFDALK 738
D G V+P Y N V +D TL + +L VVPT A+V F A
Sbjct: 735 TGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARV 794

Query: 739 VHRYLLQVRMPDGAFVPGGTWARDSNNTPLGFVANNGVLMINAVDRPGDITL-------G 791
+ L+ + + +P G ++ G VA+NG + ++ + G + +
Sbjct: 795 GIKLLMTLT-HNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENA 853

Query: 792 QCRI----PAAKLQETAKLQEITC 811
C P Q+ C
Sbjct: 854 HCVANYQLPPESQQQLLTQLSAEC 877


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_20695  FIMREGULATRY  103    2e-32    Escherichia coli: P pili regulatory PapB protein si...
		>FIMREGULATRY#Escherichia coli: P pili regulatory PapB protein signature.

Length = 104

Score = 103 bits (257), Expect = 2e-32
Identities = 39/80 (48%), Positives = 58/80 (72%)

Query: 14 VRAGELEPGAVSEEQFWLLVDISPIHSEKIILALKDYFVSGYSRKVVCERHGMSGGYLST 73
+R L PG++SE F+LL+ IS IHS+++ILA+KDY V G+SRK VCE++ M+ GY ST
Sbjct: 18 IRESVLLPGSMSEMHFFLLIGISSIHSDRVILAMKDYLVGGHSRKEVCEKYQMNNGYFST 77

Query: 74 SVNRLNFISRNVHKLAGYYS 93
++ RL ++ +LA YY+
Sbjct: 78 TLGRLIRLNALAARLAPYYT 97


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_20710  SACTRNSFRASE  31     0.001    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 31.5 bits (71), Expect = 0.001
Identities = 18/65 (27%), Positives = 30/65 (46%), Gaps = 6/65 (9%)

Query: 89 LAVDKSLHGQGVGRALVRDAGLRVIQVAETIGIRGMLVHALSDE--AREFYQRVGFVPSP 146
+AV K +GVG AL+ A I+ A+ G+++ A FY + F+
Sbjct: 95 IAVAKDYRKKGVGTALLHKA----IEWAKENHFCGLMLETQDINISACHFYAKHHFIIGA 150

Query: 147 MDPMM 151
+D M+
Sbjct: 151 VDTML 155


63  GX95_20935  GX95_21010  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_20935  -3      15       3.025929    cation/acetate symporter ActP
GX95_20940  -3      14       2.837206    murein hydrolase effector protein LrgB
GX95_20945  -3      13       2.465863    murein hydrolase regulator LrgA
GX95_20950  -2      14       2.308183    LysR family transcriptional regulator
GX95_20955  -2      14       0.100232    Na+/H+ antiporter
GX95_20960  -1      15       -0.881733   permease
GX95_20965  0       26       -8.440535   glutathione S-transferase
GX95_20970  3       23       -5.922883   redox-sensitive transcriptional activator SoxR
GX95_20975  3       25       -6.816187   AraC family transcriptional regulator
GX95_20980  3       26       -7.439216   hypothetical protein
GX95_20985  3       27       -8.087072   hypothetical protein
GX95_20990  3       28       -8.480573   antibiotic ABC transporter ATP-binding protein
GX95_20995  3       26       -7.254980   Ig-like domain repeat protein
GX95_21000  5       41       -14.843984  cation transporter
GX95_21005  -1      18       -6.430175   ABC transporter
GX95_21010  -2      15       -3.883583   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
GX95_20995  GPOSANCHOR  50     2e-07    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 50.1 bits (119), Expect = 2e-07
Identities = 49/190 (25%), Positives = 76/190 (40%), Gaps = 30/190 (15%)

Query: 96 DSAQVEKKGNGKRRNKKEEEELKKQLDEAENAKK--EADKAK-EEAEKAKEAAEKALNEA 152
A+ + + + L++ LD + AKK EA+ K EE K EA+ ++L
Sbjct: 293 LEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRD 352

Query: 153 FEVQNSSK-QIEEMLQNFLADNVAKDNLAQQSDASQQNTQA---KATQASKQNDAEKVLP 208
+ +K Q+E Q N + S+AS+Q+ + + +A KQ +
Sbjct: 353 LDASREAKKQLEAEHQKLEEQN-------KISEASRQSLRRDLDASREAKKQVEKALEEA 405

Query: 209 QPI-------NKNTSTGK--SNSSKNEEN-KLDAESVKEPLKVTLALAAES----NSGSK 254
NK K + K E KL+AE+ + LK LA AE +G
Sbjct: 406 NSKLAALEKLNKELEESKKLTEKEKAELQAKLEAEA--KALKEKLAKQAEELAKLRAGKA 463

Query: 255 DDSITNFTKP 264
DS T KP
Sbjct: 464 SDSQTPDAKP 473



Score = 48.1 bits (114), Expect = 9e-07
Identities = 35/136 (25%), Positives = 63/136 (46%), Gaps = 4/136 (2%)

Query: 98 AQVEKKGNGKRRNKKEEEELKKQLDEAENAKKEADKAKEEAEKAKEAAEKALNEAFEVQN 157
A+ +K + ++ + L++ LD + AKK+ +KA EEA A EK E E +
Sbjct: 365 AEHQKLEEQNKISEASRQSLRRDLDASREAKKQVEKALEEANSKLAALEKLNKELEESKK 424

Query: 158 SSKQIEEMLQNFL-ADNVA-KDNLAQQSD--ASQQNTQAKATQASKQNDAEKVLPQPINK 213
+++ + LQ L A+ A K+ LA+Q++ A + +A +Q K +P
Sbjct: 425 LTEKEKAELQAKLEAEAKALKEKLAKQAEELAKLRAGKASDSQTPDAKPGNKAVPGKGQA 484

Query: 214 NTSTGKSNSSKNEENK 229
+ K N +K +
Sbjct: 485 PQAGTKPNQNKAPMKE 500



Score = 43.9 bits (103), Expect = 2e-05
Identities = 17/115 (14%), Positives = 41/115 (35%), Gaps = 19/115 (16%)

Query: 101 EKKGNGKRRNKKEEEELKKQLDEAENAKKEAD-------KAKEEAEKAKEAAEKALNEAF 153
++ ++ + + + + E + + + A ++L
Sbjct: 260 ARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQ-SQVLNANRQSLRRDL 318

Query: 154 EVQNSSK-QIEEMLQNFLADNVAKDNLAQQSDASQQNTQAK---ATQASKQNDAE 204
+ +K Q+E Q N + S+AS+Q+ + + +A KQ +AE
Sbjct: 319 DASREAKKQLEAEHQKLEEQN-------KISEASRQSLRRDLDASREAKKQLEAE 366


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
GX95_21000  RTXTOXIND  266    8e-87    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 266 bits (682), Expect = 8e-87
Identities = 87/425 (20%), Positives = 175/425 (41%), Gaps = 25/425 (5%)

Query: 9 LMMIIISLTILIIILTYFIEINSVVHGQGVITTKDNAQLISLSKGGTIQDIYVAEGDTVK 68
+ I+ ++ IL+ ++ V G +T ++ I + +++I V EG++V+
Sbjct: 60 VAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVR 119

Query: 69 KGELLAKVVNLDLQKEYQRYRTQKGYLDKDVNEI-------SFILDKENESGLITLDGTR 121
KG++L K+ L E +TQ L + + S L+K E L +
Sbjct: 120 KGDVLLKLT--ALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQ 177

Query: 122 SLSNKEVKANIELVHSQIRA-------KELKKTSLDSEISGLQEKLSSKEKELALLAEEI 174
++S +EV L+ Q KEL +E + +++ E + +
Sbjct: 178 NVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRL 237

Query: 175 NILSPLVKKGISPYTNFLNKKQAYIKVKSEINDIESSITLKKDDIELVVNDIEALNNELR 234
+ S L+ K L ++ Y++ +E+ +S + + +I + + + +
Sbjct: 238 DDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFK 297

Query: 235 LSLSKIISKNLQELEVVNSTLKVIEKQINEEDIYSPVDGVIYKINKSATTHGGVIQAADL 294
+ + + + ++ L E++ I +PV + ++ T GGV+ A+
Sbjct: 298 NEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQL--KVHTEGGVVTTAET 355

Query: 295 LFEIKPKVRTMLADVKILPKYRDQIYVDEAVKLDVQSIIQPKIKSYNATIDNISPDSYEE 354
L I P+ T+ + K I V + + V++ + + NI+ D+ E+
Sbjct: 356 LMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIED 415

Query: 355 NTGGTIQRYYKVIIAFDVNE----DDLRWLKPGMTVDASVITGKHSIMEYLLSPLMKGVD 410
G + VII+ + N + L GM V A + TG S++ YLLSPL + V
Sbjct: 416 QRLGL---VFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGMRSVISYLLSPLEESVT 472

Query: 411 KAFSE 415
++ E
Sbjct: 473 ESLRE 477


64  GX95_21845  GX95_21930  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_21845  2       16       1.935647    DNA-binding response regulator
GX95_21850  0       16       1.911141    two-component system sensor histidine kinase
GX95_21855  -1      16       1.434828    hypothetical protein
GX95_21860  -2      17       -0.000902   MOSC domain-containing protein
GX95_21865  -1      16       -0.800987   superoxide dismutase
GX95_21870  -1      14       -3.005725   hypothetical protein
GX95_21875  0       13       -4.202640   TRAP transporter small permease protein
GX95_21880  1       12       -4.301758   hypothetical protein
GX95_21885  0       13       -4.333803   IS200/IS605 family transposase
GX95_21890  -1      14       -4.358924   hypothetical protein
GX95_21895  -1      14       -2.286129   hypothetical protein
GX95_21900  -2      12       0.431760    rhamnose/proton symporter RhaT
GX95_21905  -1      14       1.168370    AraC family transcriptional regulator
GX95_21915  -1      21       2.825598    transcriptional activator RhaS
GX95_21920  -2      22       3.487381    rhamnulokinase
GX95_21925  -2      22       4.046379    L-rhamnose isomerase
GX95_21930  -1      18       3.588788    rhamnulose-1-phosphate aldolase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
GX95_21850  HTHFIS  94     2e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 94.1 bits (234), Expect = 2e-24
Identities = 36/128 (28%), Positives = 64/128 (50%), Gaps = 2/128 (1%)

Query: 3 KILLVDDDRELTSLLKELLEMEGFNVLVAHDGEQALELL-DDSIDLLLLDVMMPKKNGID 61
IL+ DDD + ++L + L G++V + + + DL++ DV+MP +N D
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 TLKALRQTH-QTPVIMLTARGSELDRVLGLELGADDYLPKPFNDRELVARIRAILRRSHW 120
L +++ PV++++A+ + + + E GA DYLPKPF+ EL+ I L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 121 SEQQQSSD 128
+ D
Sbjct: 125 RPSKLEDD 132


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_21855  PF06580  29     0.037    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 29.1 bits (65), Expect = 0.037
Identities = 19/108 (17%), Positives = 37/108 (34%), Gaps = 28/108 (25%)

Query: 354 LENIVRNALRY------SHTKIEVGFSVDKDGITITVDDDGPGVSPEDREQIFRPFYRTD 407
++ +V N +++ KI + + D +T+ V++ G +E
Sbjct: 260 VQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKE---------- 309

Query: 408 EARDRESGGTGLGLAIVESAMQQHRGWVKAD---DSPLGGLRLTLWLP 452
TG GL V +Q G +A G + + +P
Sbjct: 310 --------STGTGLQNVRERLQMLYG-TEAQIKLSEKQGKVNAMVLIP 348


65  GX95_22345  GX95_22440  Y  N  N  Genomic Island
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_22345  -1      12       -3.599025   hypothetical protein
GX95_22350  -2      14       -0.251104   chloramphenical resistance permease RarD
GX95_22355  -2      13       -0.182312   hypothetical protein
GX95_22360  -1      13       0.390869    hypothetical protein
GX95_22365  0       14       1.782038    magnesium and cobalt transport protein CorA
GX95_22370  0       16       3.139738    DNA helicase II
GX95_22375  -1      15       3.523642    flavin mononucleotide phosphatase
GX95_22380  -2      11       1.072120    tyrosine recombinase XerC
GX95_22385  -2      13       0.684092    DUF484 family protein
GX95_22390  -1      15       0.116092    diaminopimelate epimerase
GX95_22395  0       18       -1.519480   hypothetical protein
GX95_22400  -1      21       -4.591681   hypothetical protein
GX95_22405  -1      21       -4.997914   hypothetical protein
GX95_22410  -1      12       -3.035384   iron donor protein CyaY
GX95_22420  -1      13       -0.935951   hypothetical protein
GX95_22425  0       11       0.202538    hypothetical protein
GX95_22430  0       12       2.296910    adenylate cyclase
GX95_22435  -1      12       2.465136    hydroxymethylbilane synthase
GX95_22440  -1      13       3.056631    uroporphyrinogen-III synthase
66  GX95_22760  GX95_22800  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_22760  2       17       -0.113458   FMN-binding protein MioC
GX95_22765  2       20       -0.009016   tRNA uridine-5-carboxymethylaminomethyl(34)
GX95_22770  4       20       -1.388969   16S rRNA (guanine(527)-N(7))-methyltransferase
GX95_22775  5       32       0.060223    ATP F0F1 synthase subunit I
GX95_22780  5       31       0.124938    F0F1 ATP synthase subunit A
GX95_22785  5       38       0.912290    ATP F0F1 synthase subunit C
GX95_22790  5       36       1.013489    F0F1 ATP synthase subunit B
GX95_22795  3       31       0.700851    F0F1 ATP synthase subunit delta
GX95_22800  2       29       0.642809    F0F1 ATP synthase subunit alpha
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_22790  PYOCINKILLER  27     0.043    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 26.7 bits (58), Expect = 0.043
Identities = 15/42 (35%), Positives = 21/42 (50%)

Query: 70 AEAQVIIEQANKRRAQILDEAKTEAEQERTKIVAQAQAEIEA 111
A+A + ANK R Q EAK +AE++ + A A A
Sbjct: 210 AKASIEAAAANKAREQAAAEAKRKAEEQARQQAAIRAANTYA 251


67  GX95_00065  GX95_00095  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_00065  -3      12       1.220253    MFS transporter
GX95_00070  -2      13       0.868725    TMAO reductase system sensor histidine
GX95_00075  -1      11       -0.300994   TMAO reductase system periplasmic protein TorT
GX95_00080  -1      12       1.188742    two-component system response regulator TorR
GX95_00085  -1      13       2.493808    trimethylamine N-oxide reductase cytochrome
GX95_00090  0       17       4.110488    trimethylamine N-oxide reductase I catalytic
GX95_00095  3       23       5.617480    molecular chaperone TorD
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_00065  TCRTETA  47     8e-08    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 47.1 bits (112), Expect = 8e-08
Identities = 65/384 (16%), Positives = 118/384 (30%), Gaps = 36/384 (9%)

Query: 66 AEMGYVFSAFAWLYTLCQIPGGWFLDRIGSRLTYFIAIFGWSVATLLQGFATGLLSLIGL 125
A G + + +A + C G DR G R +++ G +V + A L L
Sbjct: 43 AHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIG 102

Query: 126 RAITGIFEAPAFPANNRMVTSWFPEHERASAVGFYTSGQFVGLAFLTPLLIWIQEMLSWH 185
R + GI A + ERA GF ++ G+ P+L + S H
Sbjct: 103 RIVAGITGAT-GAVAGAYIADITDGDERARHFGFMSACFGFGMV-AGPVLGGLMGGFSPH 160

Query: 186 WVFIVTGGIGIIWSLVWFKVYQPPRLTKSLSQAELEYIRDGGGLVDGDAPAKKEARQPLT 245
F + + L + + P ++EA PL
Sbjct: 161 APFFAAAALNGLNFLTGCFLLPESHKGE-------------------RRPLRREALNPLA 201

Query: 246 KADWKLVFHRKLVGVYLGQFAVNSTLWFFLTWFPNYLTQEKGITALKAGFMTTV-PFLAA 304
W + + F + + + A G L +
Sbjct: 202 SFRWARGM-TVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHS 260

Query: 305 FFGVLLSGWLADKLVKKGFSLGVARKTPIICGLLISTC--IMGANYTNDPLWIMALMAIA 362
+++G +A +L + ++ G++ I+ A T + ++ +A
Sbjct: 261 LAQAMITGPVAARL---------GERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLA 311

Query: 363 FFGNGFASITWSLISSLAPMRLIGLTGGMFNFIGGLGGISVPLVIGYL-AQSYGFAPALV 421
G G ++ +++S G G + L I PL+ + A S
Sbjct: 312 SGGIGMPALQ-AMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAASITTWNGWA 370

Query: 422 YISVVALLGALSYILLVGDVKRVG 445
+I+ AL L G G
Sbjct: 371 WIAGAALYLLCLPALRRGLWSGAG 394


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
GX95_00070  HTHFIS  55     6e-10    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 55.2 bits (133), Expect = 6e-10
Identities = 28/162 (17%), Positives = 64/162 (39%), Gaps = 5/162 (3%)

Query: 681 RLLLIEDNMLTQRITAEMLTGKGVKVSVAESANDALRCLAEGESFDVALVDFDLPDYDGL 740
+L+ +D+ + + + L+ G V + +A R +A G+ D+ + D +PD +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGD-GDLVVTDVVMPDENAF 63

Query: 741 TLAQQLMSLYPAMKRIGFSAH-VIDDNLRQRTAGLFCGIIQKPVPREELYRMIAHYLQGK 799
L ++ P + + SA ++ G + + KP EL +I L
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAY-DYLPKPFDLTELIGIIGRALAEP 122

Query: 800 SHNARAMLNEHQLAGDMASVGP--EKLRQWVALFKDSALPLV 839
+ ++ Q + +++ + +A + L L+
Sbjct: 123 KRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLM 164


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
GX95_00080  HTHFIS  77     2e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 76.8 bits (189), Expect = 2e-18
Identities = 28/115 (24%), Positives = 56/115 (48%), Gaps = 1/115 (0%)

Query: 4 HIVIVEDEPVTQARLQAYFEQEGYRVSVTDSGAGLRDIMEHEHVSLILLDINLPDENGLM 63
I++ +D+ + L + GY V +T + A L + L++ D+ +PDEN
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 64 LTRALRER-STVGIILVTGRCDQIDRIVGLEMGADDYVTKPLELRELVVRVKNLL 117
L +++ + +++++ + + I E GA DY+ KP +L EL+ + L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_00095  PF06872  29     0.021    EspG protein
		>PF06872#EspG protein

Length = 398

Score = 28.5 bits (63), Expect = 0.021
Identities = 14/54 (25%), Positives = 27/54 (50%)

Query: 111 LLLEAGMEVNDDFKEPTDHLAIYLELLSHLHFSLGESFQQRRMNKLRQKTLSSL 164
L+L+A +++N D+K+P + + +LL L L + + Q L+ L
Sbjct: 29 LVLDATIKINSDYKKPWNEMTCAEKLLKILTLGLWNPKYSQDERQQFQGLLTVL 82


68  GX95_00255  GX95_00290  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_00255  0       21       4.364673    DNA-binding response regulator
GX95_00260  1       20       3.401919    two-component system sensor histidine kinase
GX95_00265  -1      15       1.886709    regulatory protein UhpC
GX95_00270  -1      13       0.283826    hexose phosphate transporter
GX95_00275  0       13       -0.359570   hypothetical protein
GX95_00280  -1      13       0.430092    transcriptional regulator
GX95_00285  -2      15       1.082435    addiction module toxin RelE
GX95_00290  -2      16       1.376861    MFS transporter
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
GX95_00255  HTHFIS  61     2e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 61.0 bits (148), Expect = 2e-13
Identities = 23/116 (19%), Positives = 45/116 (38%), Gaps = 5/116 (4%)

Query: 2 ITVALIDDHLIVRSGFAQLLGLEPDLQVVAEFGSGREALAGLPGRGVQVCICDISMPDIS 61
T+ + DD +R+ Q L V + + + + D+ MPD +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRA-GYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GLELLSQLPK---GMATIMLSVHDSPALVEQALNAGARGFLSKRCSPDELIAAVHT 114
+LL ++ K + +++S ++ +A GA +L K ELI +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_00260  PF06580  39     4e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 38.7 bits (90), Expect = 4e-05
Identities = 30/142 (21%), Positives = 56/142 (39%), Gaps = 11/142 (7%)

Query: 365 LRPRQLDDLTLAQAIRSLLREMELESRGIVSHLDWRIDETALSESQRVTLFRVCQEGLNN 424
LR ++LA + + ++L S L + +V + Q + N
Sbjct: 208 LRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPM-LVQTLVEN 266

Query: 425 IVKHA-----NASAVTLQGWQQDERLMLVIEDDGSGLPPGSQQ-QGFGLTGMRERVSALG 478
+KH + L+G + + + L +E+ GS +++ G GL +RER+ L
Sbjct: 267 GIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGTGLQNVRERLQMLY 326

Query: 479 G---TLTISCTHG-TRVSVSLP 496
G + +S G V +P
Sbjct: 327 GTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_00265  TCRTETB  41     8e-06    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 41.0 bits (96), Expect = 8e-06
Identities = 72/408 (17%), Positives = 138/408 (33%), Gaps = 60/408 (14%)

Query: 29 RHILITIWLGYALFY--FTRKSFNAAAPEILASGILTRSDIGLLATLFYITYGVSKFVSG 86
RH I IWL F+ N + P+I + + T F +T+ + V G
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 87 IVSDRSNARYFMGIGLIATGIVNILFGFSTSLWAFALLWALNAFFQGFGS---PVCARLL 143
+SD+ + + G+I +++ S F L + F QG G+ P ++
Sbjct: 71 KLSDQLGIKRLLLFGIIINCFGSVIGFVGHS---FFSLLIMARFIQGAGAAAFPALVMVV 127

Query: 144 TAWY-SRTERGGWWALWNTAHNVGGALIPLVMAAVALHYGWRVGMMVAGLLAIGVGMVLC 202
A Y + RG + L + +G + P + +A + W +++ + I V
Sbjct: 128 VARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITV----- 182

Query: 203 WRLRDRPQAIGLPPVGDWRHDALEVAQQQEGAGLSRKEILAKYVLLNPYIWLLSLCYVLV 262
P + L ++ G L I+ + Y + VL
Sbjct: 183 ------PFLMKLLKKEVRIKGHFDIK----GIILMSVGIVFFMLFTTSYSISFLIVSVLS 232

Query: 263 YVV-----RAAINDWGNLYMSETLGVDLVTANTAVSMFELGGFI-----------GALVA 306
+++ R + + + + + + + + + GF+ A
Sbjct: 233 FLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTA 292

Query: 307 GWGSDKLFNG----------------NRGPMNLIFAAGILLSVGSL---WLMPFASYVMQ 347
GS +F G RGP+ ++ LSV L +L+ S+ M
Sbjct: 293 EIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMT 352

Query: 348 AACFFTTGFFVFGPQMLIGMAAAECSHKEAAGAATGFVGLFAYLGASL 395
F G F + +I + ++ AGA + ++L
Sbjct: 353 IIIVFVLGGLSF-TKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGT 399


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_00270  TCRTETB  36     3e-04    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 36.0 bits (83), Expect = 3e-04
Identities = 27/168 (16%), Positives = 64/168 (38%), Gaps = 16/168 (9%)

Query: 49 FNIAQNDMISTYGLSMTELGMIGLGFSITYGVGKTLVSYYADGKNTKQFLPFMLILSAIC 108
N++ D+ + + + F +T+ +G + +D K+ L F +I++
Sbjct: 33 LNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIIN--- 89

Query: 109 MLGFSASMGAGSTSLFLMIAFYALSGFFQSTGGSCSYSTI----TKWTPRRKRGTFLGFW 164
F + +G S F ++ + F Q G + + + ++ P+ RG G
Sbjct: 90 --CFGSVIGFVGHSFFSLLIM---ARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLI 144

Query: 165 NISHNLGGAGAAGVALFGANYLFDGHVIGMFIFPSIIALIVGFIGLRF 212
+G + A+Y+ + + + P +I +I ++
Sbjct: 145 GSIVAMGEGVGPAIGGMIAHYIHWSY---LLLIP-MITIITVPFLMKL 188


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_00290  TCRTETA  37     1e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 36.7 bits (85), Expect = 1e-04
Identities = 40/208 (19%), Positives = 77/208 (37%), Gaps = 13/208 (6%)

Query: 33 ITVEFLPVSLLTP----MAQDLGISEGVAGQSVTVTAFVAMFSSLFITQIIQATDR--RY 86
+ ++ + + L+ P + +DL S V + A A+ + +DR R
Sbjct: 14 VALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRR 73

Query: 87 IVILFAVLLTA-SCLMVSFANSFTLLLLGRACLGLALGGFWAMSASLTMRLVPARTVPKA 145
V+L ++ A +++ A +L +GR G+ G A++ + + +
Sbjct: 74 PVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGIT-GATGAVAGAYIADITDGDERARH 132

Query: 146 LSVIFGAVSIALVIAAPLGSFLGGIIGWRNVFNAAAVMGVLCVIWVVKSLP-SLPGEPSH 204
+ +V LG +GG F AAA + L + LP S GE
Sbjct: 133 FGFMSACFGFGMVAGPVLGGLMGG-FSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRP 191

Query: 205 QKQ---NMFSLLQRPGVMAGMIAIFMSF 229
++ N + + M + A+ F
Sbjct: 192 LRREALNPLASFRWARGMTVVAALMAVF 219


69  GX95_01210  GX95_01300  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
GX95_01210  -2      14       3.476156   anion permease
GX95_01215  -2      16       4.206172   hypothetical protein
GX95_01220  -2      14       3.411591   hypothetical protein
GX95_01225  -2      14       3.156038   ABC transporter ATP-binding protein
GX95_01230  0       14       1.994542   hypothetical protein
GX95_01235  1       14       1.461620   nickel responsive regulator
GX95_01240  2       15       1.381718   ACP synthase
GX95_01245  0       13       1.229420   permease
GX95_01250  1       13       3.810260   MFS transporter
GX95_01255  1       14       3.506990   hypothetical protein
GX95_01260  2       14       3.660050   hypothetical protein
GX95_01265  2       15       3.974958   sulfurtransferase TusA
GX95_01270  2       15       4.092128   methyl-accepting chemotaxis protein II
GX95_01275  2       13       3.957689   zinc/cadmium/mercury/lead-transporting ATPase
GX95_01280  2       15       1.919399   hypothetical protein
GX95_01285  3       16       2.229576   hypothetical protein
GX95_01290  2       16       2.116085   hypothetical protein
GX95_01295  1       16       1.668416   16S rRNA (guanine(966)-N(2))-methyltransferase
GX95_01300  1       18       1.467048   signal recognition particle-docking protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_01210  TYPE3IMSPROT  30     0.022    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 30.1 bits (68), Expect = 0.022
Identities = 23/194 (11%), Positives = 57/194 (29%), Gaps = 40/194 (20%)

Query: 12 TGLLLLLALAFVLFYEAINGFHDTANAVATVIY------TRAMRSQLAVVMAAVFNFFGV 65
L++AL+ +L + F + + ++A+ + V+ F
Sbjct: 30 VSTALIVALSAMLMGLSDYYFEHFSKLMLIPAEQSYLPFSQALSYVVDNVLLEFFYLCFP 89

Query: 66 LLGGLSVAYAIVHML-------------------PTDLLLNMGSAHGLAMVFSMLLAAII 106
LL ++ H++ P + + S L +L ++
Sbjct: 90 LLTVAALMAIASHVVQYGFLISGEAIKPDIKKINPIEGAKRIFSIKSLVEFLKSILKVVL 149

Query: 107 WNLGTWYFGLPASSSHTLIGAIIGIGLTNAMMTGTSVVDALNIPKVINIFGSLIISPIVG 166
++ W + ++ + T + T ++ + L++ VG
Sbjct: 150 LSILIWIIIKG------NLVTLLQLP-TCGIECITPLLGQI--------LRQLMVICTVG 194

Query: 167 LVFAGGLIFLLRRY 180
V + Y
Sbjct: 195 FVVISIADYAFEYY 208


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
GX95_01220  RTXTOXIND  78     4e-18    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 78.3 bits (193), Expect = 4e-18
Identities = 70/413 (16%), Positives = 136/413 (32%), Gaps = 82/413 (19%)

Query: 3 KMKRHLVWWGAGILVAVAAIAWWMLRPAGIPEGFAASNGRI--EATEVDIATKIAGRIDT 60
+ LV + + V A +L E A +NG++ +I +
Sbjct: 54 SRRPRLVAY-FIMGFLVIAFILSVLGQV---EIVATANGKLTHSGRSKEIKPIENSIVKE 109

Query: 61 ILVSEGQFVRQGEVLAKMDTRV----------------LQEQRLEAI------------- 91
I+V EG+ VR+G+VL K+ L++ R + +
Sbjct: 110 IIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELK 169

Query: 92 -----------------------AQIKEAESAVAAARALLEQRQSEMRAAQSVVKQREAE 128
Q ++ L+++++E + + + E
Sbjct: 170 LPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENL 229

Query: 129 LDSVSKRHVRSRSLSQRGAVSVQQLDDDRAAAESARAALETAKAQVSAAKAAIEAARTSI 188
R SL + A++ + + A L K+Q+ ++ I +A+
Sbjct: 230 SRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEY 289

Query: 189 IQ-------------AQTRVEAAQATERRIVADID--DSELKAPRDGRV-QYRVAEPGEV 232
QT T + S ++AP +V Q +V G V
Sbjct: 290 QLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGV 349

Query: 233 LSAGGRVLNMVDLSDVY-MTFFLPTEQAGLLKIGGEARLVLDAAPDLRIPATISFVASVA 291
++ ++ +V D +T + + G + +G A + ++A P R V V
Sbjct: 350 VTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYG---YLVGKVK 406

Query: 292 QFTPKTVETHDERLKLMFRVKARIPPELLRQHLEYV--KTGLPGMAWVRLDER 342
+E D+RL L+F V I L + + +G+ A ++ R
Sbjct: 407 NINLDAIE--DQRLGLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGMR 457


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_01230  ABC2TRNSPORT  48     2e-08    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 48.0 bits (114), Expect = 2e-08
Identities = 43/171 (25%), Positives = 73/171 (42%), Gaps = 7/171 (4%)

Query: 200 REREHGTVEHLLVMPVTPFEIMMAKV-WSMGLVVLVVSGLSLMLMVKGVLGVPIEGSIPL 258
R T E +L + +I++ ++ W+ L +G + +V LG + L
Sbjct: 93 RMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAG---IGVVAAALGY-TQWLSLL 148

Query: 259 FMLGV-ALSLFATTSIGIFMGTIARSMPQLGLLMILVLLPLQMLSGGSTPRESMPQAVQD 317
+ L V AL+ A S+G+ + +A S LV+ P+ LSG P + +P Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 318 IMLTMPTTHFVSLAQAILYRGAGLSIVWPQFLTLLAIGGVFFL-IALLRFR 367
+P +H + L + I+ + + + I FFL ALLR R
Sbjct: 209 AARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIVIPFFLSTALLRRR 259


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_01240  ENTSNTHTASED  33     6e-04    Enterobactin synthetase component D signature.
		>ENTSNTHTASED#Enterobactin synthetase component D signature.

Length = 234

Score = 32.7 bits (74), Expect = 6e-04
Identities = 25/93 (26%), Positives = 44/93 (47%), Gaps = 6/93 (6%)

Query: 30 RRASWLAGRVLLSRALSPL---PEMVYGEQGKPAFSAGTPLWFNLSHSGDTIALLLSDEG 86
R+A LAGR+ AL + G++ +P + G L+ ++SH T ++S +
Sbjct: 46 RKAEHLAGRIAAVHALREVGVRTVPGMGDKRQPLWPDG--LFGSISHCATTALAVISRQ- 102

Query: 87 EVGCDIEVIRPRDNWRSLANAVFSLGEHAEMEA 119
+G DIE I + LA ++ E ++A
Sbjct: 103 RIGIDIEKIMSQHTATELAPSIIDSDERQILQA 135


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_01250  TCRTETA  49     2e-08    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 48.7 bits (116), Expect = 2e-08
Identities = 75/403 (18%), Positives = 138/403 (34%), Gaps = 42/403 (10%)

Query: 13 LRLNLRIVSIVMFNFASYLTIGLPLAVLPGYVHD--AMGFSAFWAGLIISLQYFATLLSR 70
++ N ++ I+ + IGL + VLPG + D G++++L
Sbjct: 1 MKPNRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACA 60

Query: 71 PHAGRYADVLGPKKIVVFGLCGCFLSGLGYLLADIASAWPLISLLLLGLGRVILGI-GQS 129
P G +D G + +++ L G + + Y + A L +L +GR++ GI G +
Sbjct: 61 PVLGALSDRFGRRPVLLVSLAG---AAVDYAIMATAPF-----LWVLYIGRIVAGITGAT 112

Query: 130 FAGTGSTLWGVGVVGSLHIGRVISWNGIVTYGAMAMGAPLGVLCYAWGGLQGLALTVMGV 189
A G+ + + R + M G LG L G
Sbjct: 113 GAVAGAYIADITDGDER--ARHFGFMSACFGFGMVAGPVLGGLM----GGFSPHAPFFAA 166

Query: 190 ALLAILLAL----------PRPSVKANKGKPLPFRAVLGRVWLYGMALALA-----SAGF 234
A L L L + P + + +A +A
Sbjct: 167 AALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVG 226

Query: 235 GVIATFITLFYDAK-GWDGAAFALTLFSVAFVGT---RLLFPNGINRLGGLNVAMICFGV 290
V A +F + + WD ++L + + + ++ RLG M+
Sbjct: 227 QVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIA 286

Query: 291 EIIGLLLVGTAAMPWMAKIGVLLTGMGFSLVFPALGVVAVKAVPPQNQGAALATYTVFMD 350
+ G +L+ A WMA ++L + PAL + + V + QG +
Sbjct: 287 DGTGYILLAFATRGWMAFPIMVLLA-SGGIGMPALQAMLSRQVDEERQGQLQGSLAALTS 345

Query: 351 MSLGVTGPLAGLVMTWAGVPV----IYLAAAGLVAMALLLTWR 389
++ + GPL + A + ++A A L + L R
Sbjct: 346 LT-SIVGPLLFTAIYAASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_01260  PF04183  28     0.038    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 27.9 bits (62), Expect = 0.038
Identities = 17/91 (18%), Positives = 28/91 (30%), Gaps = 14/91 (15%)

Query: 121 LGQILDVHVFNRLRQNRRWWLAPTASTLFGNISDTLAFFFIAFWRSPDAFMAEHWMEIAL 180
LG I + L+ + +TL + + AE W+
Sbjct: 347 LGVIWRENPCRWLKPDES---PVLMATLMECDENNQPL--AGAYIDRSGLDAETWLT--- 398

Query: 181 VDYCFKVLISIIFFLPMYGVLL-----NMLL 206
V++ + L YGV L N+ L
Sbjct: 399 -QLFRVVVVPLYHLLCRYGVALIAHGQNITL 428


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_01265  PF01206  101    2e-32    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 101 bits (254), Expect = 2e-32
Identities = 28/72 (38%), Positives = 42/72 (58%)

Query: 9 DHTLDALGLRCPEPVMMVRKTVRNMQTGETLLIIADDPATTRDIPGFCTFMEHDLLAQET 68
D +LDA GL CP P++ +KT+ M GE L ++A DP + +D F H+LL Q+
Sbjct: 5 DQSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKE 64

Query: 69 EGLPYRYLLRKA 80
E Y + L++A
Sbjct: 65 EDGTYHFRLKRA 76


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_01275  ACRIFLAVINRP  30     0.040    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 30.2 bits (68), Expect = 0.040
Identities = 17/78 (21%), Positives = 34/78 (43%), Gaps = 3/78 (3%)

Query: 336 AEERRAPIERFIDRFSRIYTPVIMVIALLVTLIPPLMFDGGWQEWIYKGLTLLLIGCPCA 395
E++ P E S+I ++ + +L + P+ F GG IY+ ++ ++ A
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVS---A 477

Query: 396 LVISTPAAITSGLAAAAR 413
+ +S A+ A A
Sbjct: 478 MALSVLVALILTPALCAT 495


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
GX95_01285  SHIGARICIN  27     0.026    Ribosome inactivating protein family signature.
		>SHIGARICIN#Ribosome inactivating protein family signature.

Length = 289

Score = 26.7 bits (59), Expect = 0.026
Identities = 6/29 (20%), Positives = 16/29 (55%)

Query: 7 FFIIIIALIVVAASFRFVQQRREKAANEA 35
+++I AA ++F++Q+ K ++
Sbjct: 173 ALMVLIQSTSEAARYKFIEQQIGKRVDKT 201


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
GX95_01300  IGASERPTASE  31     0.012    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 31.2 bits (70), Expect = 0.012
Identities = 15/114 (13%), Positives = 34/114 (29%), Gaps = 2/114 (1%)

Query: 17 DKEQKQEQTEEQQIVEEQRPVEPPVETAADIDAQTPAHSKAETEAFAEEVVDVTEKVQES 76
+++ K E + Q++ + V P E + + Q + + +E T ++
Sbjct: 1109 EEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADT 1168

Query: 77 EKP-QPVEPEPATAVETAAPQIAVEREELPLPEEVKDEAISPEEWQAEAETVEV 129
E+P + V + PE P + +
Sbjct: 1169 EQPAKETSSNVEQPVTESTTVNTGNSVV-ENPENTTPATTQPTVNSESSNKPKN 1221



Score = 30.8 bits (69), Expect = 0.017
Identities = 22/128 (17%), Positives = 41/128 (32%), Gaps = 12/128 (9%)

Query: 17 DKEQKQEQTEEQQIVEEQRPVEPPVETAADIDA--QTPAHSKAETEAFAEEVVD------ 68
+ EQ E VE PV + ++ + + T A + V+
Sbjct: 1159 QSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNK 1218

Query: 69 VTEKVQESEKPQPVEPEPATAVETAAPQIAVEREELPLPEEVKDEAISPEEWQAEAETVE 128
+ + S + P EPAT + + L + +S +A+ +
Sbjct: 1219 PKNRHRRSVRSVPHNVEPAT-TSSNDRSTVALCD---LTSTNTNAVLSDARAKAQFVALN 1274

Query: 129 VIAAVEEE 136
V AV +
Sbjct: 1275 VGKAVSQH 1282


70  GX95_01375  GX95_01405  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
GX95_01375  -3      14       2.038135   sn-glycerol-3-phosphate ABC transporter
GX95_01380  -2      15       2.667735   glycerol-3-phosphate transporter permease
GX95_01385  -1      12       1.823928   glycerol-3-phosphate transporter
GX95_01390  -1      10       1.463415   sn-glycerol-3-phosphate ABC transporter
GX95_01395  1       12       -0.803576  glycerophosphodiester phosphodiesterase
GX95_01400  1       15       -3.752881  hypothetical protein
GX95_01405  1       15       -3.652020  gamma-glutamyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
GX95_01375  MALTOSEBP  43     1e-06    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 43.2 bits (101), Expect = 1e-06
Identities = 46/176 (26%), Positives = 73/176 (41%), Gaps = 17/176 (9%)

Query: 133 SGHLLSQPFNSSTPVLYYNKDAFKKAGLDPEQPPKTWQELADYTAKLRAAGMKCGYASGW 192
+G L++ P L YNKD L P PPKTW+E+ +L+A G +
Sbjct: 126 NGKLIAYPIAVEALSLIYNKD------LLP-NPPKTWEEIPALDKELKAKGKSALMFNLQ 178

Query: 193 QGWIQLENFSAWNGLPFASKNNGFDGTDAVLEF--NKPEQVKHIALLEEMNKKGDFSYVG 250
+ + +A G F +N +D D ++ K + L++ + D Y
Sbjct: 179 EPYFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDY-- 236

Query: 251 RKDESTEKFYNGDCAMTTASSGSLANIRQYAKFNYGVGMMPYDADIKGAPQNAIIG 306
+ F G+ AMT + +NI +K NYGV ++P KG P +G
Sbjct: 237 --SIAEAAFNKGETAMTINGPWAWSNIDT-SKVNYGVTVLP---TFKGQPSKPFVG 286


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_01390  PF05272  29     0.041    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.9 bits (64), Expect = 0.041
Identities = 10/29 (34%), Positives = 16/29 (55%)

Query: 33 IVMVGPSGCGKSTLLRMVAGLERVTSGDI 61
+V+ G G GKSTL+ + GL+ +
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHF 627


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_01395  PF04619  28     0.032    Dr-family adhesin
		>PF04619#Dr-family adhesin

Length = 160

Score = 27.6 bits (61), Expect = 0.032
Identities = 11/60 (18%), Positives = 21/60 (35%), Gaps = 4/60 (6%)

Query: 29 VGARYGHTMIEFDAKLSKDGEIFLLHDDNLERTSNGWGVAGELNWQD----LLRVDAGGW 84
+G ++ D + G+ FL+ D+N ++ W + D G W
Sbjct: 70 LGCDARQVALKADTDNFEQGKFFLISDNNRDKLYVNIRPTDNSAWTTDNGVFYKNDVGSW 129


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
GX95_01405  NAFLGMOTY  32     0.004    Sodium-type flagellar protein MotY precursor signature.
		>NAFLGMOTY#Sodium-type flagellar protein MotY precursor signature.

Length = 293

Score = 32.4 bits (73), Expect = 0.004
Identities = 27/82 (32%), Positives = 37/82 (45%), Gaps = 17/82 (20%)

Query: 275 RTPISGDYRGYQVFSMPPPSSGGIHIVQILNI--LENFDMKKYGF-GSADAMQIMAEAEK 331
R P+ G+ R + SMPPP G H +I N+ + FD G+ G A I++E EK
Sbjct: 77 RRPM-GETRNVSLISMPPPWRPGEHADRITNLKFFKQFD----GYVGGQTAWGILSELEK 131

Query: 332 YAYADRSEYLGDPDFVKVPWQA 353
Y P F WQ+
Sbjct: 132 GRY---------PTFSYQDWQS 144


71  GX95_01775  GX95_01835  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
GX95_01775  1       19       1.913519   phosphoribulokinase
GX95_01780  2       19       2.121940   hypothetical protein
GX95_01785  0       15       2.142677   hydrolase
GX95_01790  0       14       1.790434   monooxygenase
GX95_01795  0       14       2.360104   LysR family transcriptional regulator
GX95_01800  0       13       2.221241   ABC transporter ATP-binding protein
GX95_01805  -2      13       0.632929   glutathione-regulated potassium-efflux system
GX95_01810  -1      13       -0.614136  glutathione-regulated potassium-efflux system
GX95_01815  0       20       -1.269688  hypothetical protein
GX95_01820  1       19       -0.956051  peptidylprolyl isomerase
GX95_01825  3       15       -1.677852  lysis protein
GX95_01830  2       17       -1.530448  FKBP-type peptidyl-prolyl cis-trans isomerase
GX95_01835  1       22       -1.898332  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_01775  PF07299  36     1e-04    Fibronectin-binding protein (FBP)
		>PF07299#Fibronectin-binding protein (FBP)

Length = 219

Score = 36.0 bits (83), Expect = 1e-04
Identities = 10/46 (21%), Positives = 21/46 (45%), Gaps = 2/46 (4%)

Query: 71 PEANDFSLLEHTFIEYGQTGKGQSRKYLHTYDEAVPWNQVPGTFTP 116
P+ + + E ++ KG SRK++ ++ + + GTF
Sbjct: 112 PDMEELDMKELSY--LSWIDKGSSRKFIIAKNDKNKFVGLQGTFQS 155


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_01780  FLGFLIH  25     0.024    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 25.1 bits (54), Expect = 0.024
Identities = 17/46 (36%), Positives = 23/46 (50%), Gaps = 3/46 (6%)

Query: 3 IPWQGLAPDTLDNLIESFV---LREGTDYGEHERSLEQKVADVKRQ 45
+PW+ PD L FV E T E E SLEQ++A ++ Q
Sbjct: 5 LPWKTWTPDDLAPPQAEFVPIVEPEETIIEEAEPSLEQQLAQLQMQ 50


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_01800  PYOCINKILLER  31     0.021    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 30.5 bits (68), Expect = 0.021
Identities = 21/85 (24%), Positives = 33/85 (38%), Gaps = 7/85 (8%)

Query: 522 VQKQENQADDAPKENNANSAQSRKDQKRREAELRTLT---QPLRKEITRLEKEMEKLNAQ 578
+ E + A +E N N ++ RE E T + + I+ L+ M L A
Sbjct: 151 TRTAEEIGEQAVREGNINGPEAYMRFLDREMEGLTAAYNVKLFTEAISSLQIRMNTLTAA 210

Query: 579 LA----QAEEKLGDSSLYDPSRKAE 599
A A K + + + RKAE
Sbjct: 211 KASIEAAAANKAREQAAAEAKRKAE 235


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_01805  ISCHRISMTASE  28     0.025    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 27.7 bits (61), Expect = 0.025
Identities = 35/138 (25%), Positives = 52/138 (37%), Gaps = 22/138 (15%)

Query: 11 YAHPESQDSVANRVLLKPAIQHNNVTVHDLYARYPDFFID--TPYEQ-----ALLREHDV 63
Y P + D N+V P + +HD+ + D F +P + L+ V
Sbjct: 9 YQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANIRKLKNQCV 68

Query: 64 IVFQH--PLYTYSCPALLKEWLDRVLSRGFASGPGGNQLVGKYWRSVITTGEPESA---- 117
Q P+ + P DR L F GPG N G Y +IT PE
Sbjct: 69 ---QLGIPVVYTAQPGSQNP-DDRALLTDFW-GPGLNS--GPYEEKIITELAPEDDDLVL 121

Query: 118 --YRYDALNRYPMSDVLR 133
+RY A R + +++R
Sbjct: 122 TKWRYSAFKRTNLLEMMR 139


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_01830  INFPOTNTIATR  128    2e-38    Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 128 bits (323), Expect = 2e-38
Identities = 80/226 (35%), Positives = 121/226 (53%), Gaps = 9/226 (3%)

Query: 28 AAKPAATADSKAAFKNDDQKAAYALGASLGRYMENSLKEQEKLGIKLDKDQLIAGVQDAF 87
A A A + D K +Y++GA LG K + GI ++ D L G+QD
Sbjct: 14 AMSTAMAATDATSLTTDKDKLSYSIGADLG-------KNFKNQGIDINPDVLAKGMQDGM 66

Query: 88 A-DKSKLSDQEIEQTLQTFEARVKSAAQAKMEKDAADNEAKGKTFRDAFAKEKGVKTSST 146
+ + L++++++ L F+ + + A+ K A +N+AKG F A + G+ +
Sbjct: 67 SGAQLILTEEQMKDVLSKFQKDLMAKRSAEFNKKAEENKAKGDAFLSANKSKPGIVVLPS 126

Query: 147 GLLYKVEKEGTGEAPKDSDTVVVNYKGTLIDGKEFDNSYTRGEPLSFRLDGVIPGWTEGL 206
GL YK+ GTG P SDTV V Y GTLIDG FD++ G+P +F++ VIPGWTE L
Sbjct: 127 GLQYKIIDAGTGAKPGKSDTVTVEYTGTLIDGTVFDSTEKAGKPATFQVSQVIPGWTEAL 186

Query: 207 KNIKKGGKIKLVIPPALAYGKTGVPG-IPANSTLVFDVELLDIKPA 251
+ + G ++ +P LAYG V G I N TL+F + L+ +K A
Sbjct: 187 QLMPAGSTWEVFVPADLAYGPRSVGGPIGPNETLIFKIHLISVKKA 232


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_01835  ACRIFLAVINRP  29     0.022    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 28.7 bits (64), Expect = 0.022
Identities = 14/62 (22%), Positives = 29/62 (46%), Gaps = 1/62 (1%)

Query: 160 ASSVEDLVTQTLEFTIEEVNADRNV-SNNAKNRQIVLNLYEKGIFDIKDAINQVADRLNI 218
A +V+D VTQ +E + ++ + S + + + L + D A QV ++L +
Sbjct: 54 AQTVQDTVTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQL 113

Query: 219 SK 220
+
Sbjct: 114 AT 115


72  GX95_01865  GX95_01885  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
GX95_01865  4       45       -2.136950  translation elongation factor G
GX95_01870  3       40       -2.353126  translation elongation factor Tu
GX95_01875  3       33       -3.118115  bacterioferritin-associated ferredoxin
GX95_01880  5       36       -2.734755  bacterioferritin
GX95_01885  6       49       -0.847491  prepilin peptidase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
GX95_01865  TCRTETOQM  616    0.0      Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 616 bits (1591), Expect = 0.0
Identities = 178/698 (25%), Positives = 305/698 (43%), Gaps = 81/698 (11%)

Query: 9 RYRNIGISAHIDAGKTTTTERILFYTGVNHKIGEVHDGAATMDWMEQEQERGITITSAAT 68
+ NIG+ AH+DAGKTT TE +L+ +G ++G V G D E++RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 69 TAFWSGMAKQYEPHRINIIDTPGHVDFTIEVERSMRVLDGAVMVYCAVGGVQPQSETVWR 128
+ W ++NIIDTPGH+DF EV RS+ VLDGA+++ A GVQ Q+ ++
Sbjct: 62 SFQWEN-------TKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFH 114

Query: 129 QANKYKVPRIAFVNKMDRMGANFLKVVGQIKTRLGANPVPLQLAIGAEEGFTGVVDLVKM 188
K +P I F+NK+D+ G + V IK +L A V Q V M
Sbjct: 115 ALRKMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQ----------KVELYPNM 164

Query: 189 KAINWNDADQGVTFEYEDIPADMQDLANEWHQNLIESAAEASEELMEKYLGGEELTEEEI 248
N+ +++Q ++ E +++L+EKY+ G+ L E+
Sbjct: 165 CVTNFTESEQ------------------------WDTVIEGNDDLLEKYMSGKSLEALEL 200

Query: 249 KQALRQRVLNNEIILVTCGSAFKNKGVQAMLDAVIDYLPSPVDVPAINGILDDGKDTPAE 308
+Q R N + V GSA N G+ +++ + + S
Sbjct: 201 EQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSSTH----------------- 243

Query: 309 RHASDDEPFSALAFKIATDPFVGNLTFFRVYSGVVNSGDTVLNSVKTARERFGRIVQMHA 368
FKI L + R+YSGV++ D+V S K + + +
Sbjct: 244 ---RGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEK-EKIKITEMYTSIN 299

Query: 369 NKREEIKEVRAGDIAAAIG----LKDVTTGDTLCDPENPIILERMEFPEPVISIAVEPKT 424
+ +I + +G+I L V GDT P+ ER+E P P++ VEP
Sbjct: 300 GELCKIDKAYSGEIVILQNEFLKLNSV-LGDTKLLPQR----ERIENPLPLLQTTVEPSK 354

Query: 425 KADQEKMGLALGRLAKEDPSFRVWTDEESNQTIIAGMGELHLDIIVDRMKREFNVEANVG 484
+E + AL ++ DP R + D +++ I++ +G++ +++ ++ +++VE +
Sbjct: 355 PQQREMLLDALLEISDSDPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEKYHVEIEIK 414

Query: 485 KPQVAYREAIRAKVTDIEGKHAKQSGGRGQYGHVVIDMYPLEPGSNPKGYEFINDIKGGV 544
+P V Y E K E + + + + + PL GS G ++ + + G
Sbjct: 415 EPTVIYMERPLKKA---EYTIHIEVPPNPFWASIGLSVSPLPLGS---GMQYESSVSLGY 468

Query: 545 IPGEYIPAVDKGIQEQLKSGPLAGYPVVDLGVRLHFGSYHDVDSSELAFKLAASIAFKEG 604
+ + AV +GI+ + G L G+ V D + +G Y+ S+ F++ A I ++
Sbjct: 469 LNQSFQNAVMEGIRYGCEQG-LYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQV 527

Query: 605 FKKAKPVLLEPIMKVEVETPEENTGDVIGDLSRRRGMLKGQESEVTGVKIHAEVPLSEMF 664
KKA LLEP + ++ P+E D + + + + V + E+P +
Sbjct: 528 LKKAGTELLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKNNEVILSGEIPARCIQ 587

Query: 665 GYATQLRSLTKGRASYTMEFLKYDDAPNNVAQAVIEAR 702
Y + L T GR+ E Y + V + R
Sbjct: 588 EYRSDLTFFTNGRSVCLTELKGYHVT---TGEPVCQPR 622


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
GX95_01870  TCRTETOQM  80     3e-18    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 79.5 bits (196), Expect = 3e-18
Identities = 57/198 (28%), Positives = 87/198 (43%), Gaps = 13/198 (6%)

Query: 13 VNVGTIGHVDHGKTTLTAAI------TTVLAKTYGGAARAFDQIDNAPEEKARGITINTS 66
+N+G + HVD GKTTLT ++ T L G R DN E+ RGITI T
Sbjct: 4 INIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRT----DNTLLERQRGITIQTG 59

Query: 67 HVEYDTPTRHYAHVDCPGHADYVKNMITGAAQMDGAILVVAATDGPMPQTREHILLGRQV 126
+ +D PGH D++ + + +DGAIL+++A DG QTR R++
Sbjct: 60 ITSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKM 119

Query: 127 GVPYIIVFLNKCDMVDDEELLELVEMEVRELLSQYDFPGDDTPIVRGSALKALEGDAEWE 186
G+P I F+NK D + L V +++E LS + + +W+
Sbjct: 120 GIP-TIFFINKIDQNGID--LSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWD 176

Query: 187 AKIIELAGFLDSYIPEPE 204
I L+ Y+
Sbjct: 177 TVIEGNDDLLEKYMSGKS 194


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
GX95_01880  HELNAPAPROT  37     1e-05    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 36.8 bits (85), Expect = 1e-05
Identities = 18/103 (17%), Positives = 43/103 (41%), Gaps = 10/103 (9%)

Query: 44 EYHESIDEMKHADKYIERILFLEGIPN--LQDLGKL------GIGEDVEEMLRSDLRLEL 95
E ++ E D ER+L + G P +++ + G EM+++ +
Sbjct: 52 ELYDHAAE--TVDTIAERLLAIGGQPVATVKEYTEHASITDGGNETSASEMVQALVNDYK 109

Query: 96 EGAKDLREAIAYADSVHDYVSRDMMIEILADEEGHIDWLETEL 138
+ + + + I A+ D + D+ + ++ + E + L + L
Sbjct: 110 QISSESKFVIGLAEENQDNATADLFVGLIEEVEKQVWMLSSYL 152


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_01885  PREPILNPTASE  141    2e-44    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 141 bits (358), Expect = 2e-44
Identities = 59/143 (41%), Positives = 86/143 (60%), Gaps = 2/143 (1%)

Query: 4 ALPFLIFYASFSLLLGIYDARTGLLPDRFTCPLLWGGLLYHQICLPERLPDALWGAIAGY 63
L L+ + L D LLPD+ T PLLWGGLL++ + L DA+ GA+AGY
Sbjct: 134 TLAALLLT-WVLVALTFIDLDKMLLPDQLTLPLLWGGLLFNLLGGFVSLGDAVIGAMAGY 192

Query: 64 GGFALIYWGYRLRYQKEGLGYGDVKYLAALGAWHCWETLPLLVFLAAMLACGGFGVALLV 123
+YW ++L KEG+GYGD K LAALGAW W+ LP+++ L++++ G+ L++
Sbjct: 193 LVLWSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALPIVLLLSSLV-GAFMGIGLIL 251

Query: 124 RGKSALINPLPFGPWLAVAGFIT 146
P+PFGP+LA+AG+I
Sbjct: 252 LRNHHQSKPIPFGPYLAIAGWIA 274


73  GX95_02140  GX95_02175  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
GX95_02140  -1      16       -1.667403  **hypothetical protein
GX95_02145  -2      15       -2.271544  multidrug efflux RND transporter permease
GX95_02150  -2      16       -2.372458  efflux transporter periplasmic adaptor subunit
GX95_02155  -1      15       -2.846617  acrEF/envCD operon transcriptional regulator
GX95_02160  -1      14       -1.121686  histidine kinase
GX95_02165  -2      12       -1.237366  DUF2556 domain-containing protein
GX95_02170  -2      12       -0.764775  adenine-specific DNA-methyltransferase
GX95_02175  -2      13       1.033799   Fis family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits      Score  E-value  Comments
GX95_02140  adhesinb  29     0.001    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 29.0 bits (65), Expect = 0.001
Identities = 14/68 (20%), Positives = 26/68 (38%), Gaps = 10/68 (14%)

Query: 1 MKR---LIPVALLTTLLAGCAHDSPCVPVYDDQGRLVHTNTCMKGTTQDNWETAGAIAGG 57
MK+ L+ + L LA C+ + +V TN+ + T++ IAG
Sbjct: 1 MKKCRFLVLLLLAFVGLAACSSQKSSTETGSSKLNVVATNSIIADITKN-------IAGD 53

Query: 58 AAAVAGLT 65
+ +
Sbjct: 54 KINLHSIV 61


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_02145  ACRIFLAVINRP  1386   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1386 bits (3589), Expect = 0.0
Identities = 914/1032 (88%), Positives = 972/1032 (94%)

Query: 1 MANFFIRRPIFAWVLAIILMMAGALAIMQLPVAQYPTIAPPAVSISATYPGADAQTVQDT 60
MANFFIRRPIFAWVLAIILMMAGALAI+QLPVAQYPTIAPPAVS+SA YPGADAQTVQDT
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120
VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 EVQQQGISVEKSSSSFLMVAGFVSDNPNTTQDDISDYVASNIKDSISRLNGVGDVQLFGA 180
EVQQQGISVEKSSSS+LMVAGFVSDNP TTQDDISDYVASN+KD++SRLNGVGDVQLFGA
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 QYAMRIWLDANLLNKYQLTPVDVINQLTVQNDQIAAGQLGGTPALPGQQLNASIIAQTRL 240
QYAMRIWLDA+LLNKY+LTPVDVINQL VQNDQIAAGQLGGTPALPGQQLNASIIAQTR
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 KDPQEFGKVTLRVNTDGSVVHLKDVARIELGGENYNVVARINGKPASGLGIKLATGANAL 300
K+P+EFGKVTLRVN+DGSVV LKDVAR+ELGGENYNV+ARINGKPA+GLGIKLATGANAL
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 DTATAIKAKLAELQPFFPQGMKVVYPYDTTPFVKISIHEVVKTLFEAIILVFLVMYLFLQ 360
DTA AIKAKLAELQPFFPQGMKV+YPYDTTPFV++SIHEVVKTLFEAI+LVFLVMYLFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NIRATLIPTIAVPVVLLGTFAVLAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420
N+RATLIPTIAVPVVLLGTFA+LAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 MEDNLSPREATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480
MED L P+EATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SVLVALILTPALCATLLKPVSAEHHEKKSGFFGWFNTRFDHSVNHYTNSVSGIVRNTGRY 540
SVLVALILTPALCATLLKPVSAEHHE K GFFGWFNT FDHSVNHYTNSV I+ +TGRY
Sbjct: 481 SVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY 540

Query: 541 LIIYLLIVVGMAVLFLRLPTSFLPEEDQGVFLTMIQLPSGATQERTQKVLDQVTHYYLNN 600
L+IY LIV GM VLFLRLP+SFLPEEDQGVFLTMIQLP+GATQERTQKVLDQVT YYL N
Sbjct: 541 LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN 600

Query: 601 EKANVESVFTVNGFSFSGQGQNSGMAFVSLKPWEERNGEENSVEAVIARATRAFSQIRDG 660
EKANVESVFTVNGFSFSGQ QN+GMAFVSLKPWEERNG+ENS EAVI RA +IRDG
Sbjct: 601 EKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDG 660

Query: 661 LVFPFNMPAIVELGTATGFDFELIDQGGLGHDALTKARNQLLGMVAKHPDLLVRVRPNGL 720
V PFNMPAIVELGTATGFDFELIDQ GLGHDALT+ARNQLLGM A+HP LV VRPNGL
Sbjct: 661 FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGL 720

Query: 721 EDTPQFKLDVDQEKAQALGVSLSDINETISAALGGYYVNDFIDRGRVKKVYVQADAQLRM 780
EDT QFKL+VDQEKAQALGVSLSDIN+TIS ALGG YVNDFIDRGRVKK+YVQADA+ RM
Sbjct: 721 EDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRM 780

Query: 781 LPGDINNLYVRSANGEMVPFSTFSSARWIYGSPRLERYNGMPSMELLGEAAPGRSTGEAM 840
LP D++ LYVRSANGEMVPFS F+++ W+YGSPRLERYNG+PSME+ GEAAPG S+G+AM
Sbjct: 781 LPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAM 840

Query: 841 SLMENLASQLPNGIGYDWTGMSYQERLSGNQAPALYAISLIVVFLCLAALYESWSIPFSV 900
+LMENLAS+LP GIGYDWTGMSYQERLSGNQAPAL AIS +VVFLCLAALYESWSIP SV
Sbjct: 841 ALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV 900

Query: 901 MLVVPLGVVGALLAASLRGLNNDVYFQVGLLTTIGLSAKNAILIVEFAKDLMEKEGRGLI 960
MLVVPLG+VG LLAA+L NDVYF VGLLTTIGLSAKNAILIVEFAKDLMEKEG+G++
Sbjct: 901 MLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVV 960

Query: 961 EATLEASRMRLRPILMTSLAFILGVMPLVISRGAGSGAQNAVGTGVMGGMLTATLLAIFF 1020
EATL A RMRLRPILMTSLAFILGV+PL IS GAGSGAQNAVG GVMGGM++ATLLAIFF
Sbjct: 961 EATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFF 1020

Query: 1021 VPVFFVVVKRRF 1032
VPVFFVV++R F
Sbjct: 1021 VPVFFVVIRRCF 1032


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
GX95_02150  RTXTOXIND  43     2e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 42.5 bits (100), Expect = 2e-06
Identities = 24/137 (17%), Positives = 48/137 (35%), Gaps = 15/137 (10%)

Query: 98 ATYQADYDSAKGELAKSEAAAAIAHLTVKRYVPLVGTKYISQQEYDQAIADA-RQADAAV 156
+ K +L + E+ A + Q + I D RQ +
Sbjct: 262 VEAVNELRVYKSQLEQIESEILSAKEEYQLV----------TQLFKNEILDKLRQTTDNI 311

Query: 157 VAAKAAVESARINLAYTKVTSPISGRIGKSNV-TEGALVTNGQSTELATVQQLDPIYVDV 215
+ + + +P+S ++ + V TEG +VT + T + V + D + V
Sbjct: 312 GLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAE-TLMVIVPEDDTLEVTA 370

Query: 216 TQSSND--FMRLKQSVE 230
+ D F+ + Q+
Sbjct: 371 LVQNKDIGFINVGQNAI 387



Score = 37.1 bits (86), Expect = 1e-04
Identities = 22/127 (17%), Positives = 41/127 (32%), Gaps = 13/127 (10%)

Query: 46 TAPLAVTTELPGR-TSAFRIAEVRPQVSGIVLKRNFTEGSDVEAGQSLYQIDPATYQADY 104
+ + G+ T + R E++P + IV + EG V G L ++ +AD
Sbjct: 77 LGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEAD- 135

Query: 105 DSAKGELAKSEAAAAIAHLTVKRYVPLVGTKYISQQEYDQAIADARQADAAVVAAKAAVE 164
K++++ A L RY L E ++ +
Sbjct: 136 ------TLKTQSSLLQARLEQTRYQIL-----SRSIELNKLPELKLPDEPYFQNVSEEEV 184

Query: 165 SARINLA 171
+L
Sbjct: 185 LRLTSLI 191


ORFs having significant similarity with Known Virulence factors
LocusTagHitsScoreE-valueComments
GX95_02155HTHTETR1282e-39 TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 128 bits (324), Expect = 2e-39
Identities = 82/216 (37%), Positives = 130/216 (60%), Gaps = 3/216 (1%)

Query: 1 MAKKTKADALKTRQHLIETAIAQFALRGVANTTLNDIADAADVTRGAIYWHFENKTQLFN 60
MA+KTK +A +TRQH+++ A+ F+ +GV++T+L +IA AA VTRGAIYWHF++K+ LF+
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 EVW-LQQPPLRELIQDRLTGCWNDNPLQDLREKFIAALQYIAAVPRQQALMQILYHKCEF 119
E+W L + + EL + +PL LRE I L+ R++ LM+I++HKCEF
Sbjct: 61 EIWELSESNIGELELEYQAKF-PGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEF 119

Query: 120 HNGM-ISEQAIREKMGFHHQSMLEVLQRCMDKKLISGSLDLDVILIILHGSFSGIVKNWL 178
M + +QA R + + + L+ C++ K++ L II+ G SG+++NWL
Sbjct: 120 VGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWL 179

Query: 179 MNPTSYDLYKQAPALVDNVLKMLSPDGSVRQLMPNE 214
P S+DL K+A V +L+M ++R NE
Sbjct: 180 FAPQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_02175  DNABINDNGFIS  157  3e-54  DNA-binding protein FIS signature.
		>DNABINDNGFIS#DNA-binding protein FIS signature.

Length = 98

Score = 157 bits (399), Expect = 3e-54
Identities = 98/98 (100%), Positives = 98/98 (100%)

Query: 1 MFEQRVNSDVLTVSTVNSQDQVTQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQ 60
MFEQRVNSDVLTVSTVNSQDQVTQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQ
Sbjct: 1 MFEQRVNSDVLTVSTVNSQDQVTQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQ 60

Query: 61 PLLDMVMQYTRGNQTRAALMMGINRGTLRKKLKKYGMN 98
PLLDMVMQYTRGNQTRAALMMGINRGTLRKKLKKYGMN
Sbjct: 61 PLLDMVMQYTRGNQTRAALMMGINRGTLRKKLKKYGMN 98


74  GX95_02755  GX95_02790  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTagDNBiasCDNBias%GCBiasProduct
GX95_027551133.045283hypothetical protein
GX95_027601131.979973osmotically-inducible protein OsmY
GX95_027651131.856062phosphoheptose isomerase
GX95_02770-1132.141098YraN family protein
GX95_02775-2141.790641penicillin-binding protein activator
GX95_02780-3150.38759416S rRNA
GX95_02785-212-0.252333transcriptional regulator
GX95_02790-2120.233056galactitol-1-phosphate 5-dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_02755  NUCEPIMERASE  29  0.009  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 29.4 bits (66), Expect = 0.009
Identities = 14/55 (25%), Positives = 22/55 (40%), Gaps = 16/55 (29%)

Query: 4 VLITGATGLVGGHLLRMLINTPQVSAIAAPTRRPLTDIVGV--YNP-HDPQLTDA 55
L+TGA G +G H+ + L+ +VG+ N +D L A
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAGH-------------QVVGIDNLNDYYDVSLKQA 44


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_02765  RTXTOXINA  28  0.028  Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 28.0 bits (62), Expect = 0.028
Identities = 26/111 (23%), Positives = 44/111 (39%), Gaps = 22/111 (19%)

Query: 42 NKILCCGNGTSAANAQHFAASMINRFETERPSLPAIALNTDNVVLTAIA-------NDRL 94
K+L GN + A T + IA + V AI+ D+
Sbjct: 277 TKVL--GNVGKGISQYIIAQRAAQGLSTSAAAAGLIA----SAVTLAISPLSFLSIADKF 330

Query: 95 HD----EVYAKQVRALGHAGDVLLAISTRGNSRDIVKAVEAAVTRDMTIVA 141
E Y+++ + LG+ GD LLA + A++A++T T++A
Sbjct: 331 KRANKIEEYSQRFKKLGYDGDSLLAAFHKETG-----AIDASLTTISTVLA 376


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_02775  IGASERPTASE  34  0.003  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 33.9 bits (77), Expect = 0.003
Identities = 24/120 (20%), Positives = 38/120 (31%), Gaps = 8/120 (6%)

Query: 289 QAVEMQPAAAPDAPVEPGVEETQPQMTNGVASPSQASVSDLTDDAPSQSATPVSAPQTPP 348
Q+ +QP A P +P V +PQ + A + S PV+ T
Sbjct: 1135 QSETVQPQAEPARENDPTVNIKEPQSQTN----TTADTEQPAKETSSNVEQPVTESTTVN 1190

Query: 349 ATASAPADPSAELKIYDTSSQPLD-QVLAQVQQDGASIVVGPLLKNNVEALMKSNTPLNV 407
S +P ++QP + ++ V + N A SN V
Sbjct: 1191 TGNSVVENPENTT---PATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTV 1247


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_02790  DHBDHDRGNASE  38  3e-05  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 38.1 bits (88), Expect = 3e-05
Identities = 26/92 (28%), Positives = 39/92 (42%), Gaps = 2/92 (2%)

Query: 156 AQGCEGKNVIIVGAGT-IGLLALQCARELGARSVTAIDINPQKLELAKALGATHTCNSRE 214
A+G EGK I GA IG + GA + A+D NP+KLE + ++
Sbjct: 3 AKGIEGKIAFITGAAQGIGEAVARTLASQGAH-IAAVDYNPEKLEKVVSSLKAEARHAEA 61

Query: 215 MTADDIQTALSDIQFDQLVLETAGTPQTVSLA 246
AD +A D ++ E V++A
Sbjct: 62 FPADVRDSAAIDEITARIEREMGPIDILVNVA 93


75  GX95_04750  GX95_04850  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTagDNBiasCDNBias%GCBiasProduct
GX95_04750-224-6.272744EscC/YscC/HrcC family type III secretion system
GX95_04755-124-5.840653SepL/TyeA/HrpJ family type III secretion system
GX95_04760-223-4.577749EscV/YscV/HrcV family type III secretion system
GX95_04765-224-4.200277type III secretion system chaperone SpaK
GX95_04770-224-3.911257EscN/YscN/HrcN family type III secretion system
GX95_04775-126-5.344344type III secretion system protein SpaM
GX95_04780-126-5.983360antigen presentation protein SpaN
GX95_04785-127-6.811652type III secretion system protein SpaO
GX95_04790122-6.007002EscR/YscR/HrcR family type III secretion system
GX95_04795122-5.206718EscS/YscS/HrcS family type III secretion system
GX95_04800120-5.317244EscT/YscT/HrcT family type III secretion system
GX95_04805123-5.571171EscU/YscU/HrcU family type III secretion system
GX95_04810025-5.235257CesD/SycD/LcrH family type III secretion system
GX95_04815025-5.122098pathogenicity island 1 effector protein SipB
GX95_04820028-7.144465pathogenicity island 1 effector protein SipC
GX95_04825231-8.199321cell invasion protein SipD
GX95_04830232-9.033294pathogenicity island 1 effector protein SipA
GX95_04835232-10.936672acyl carrier protein
GX95_04840232-10.993555hypothetical protein
GX95_04845231-9.597675chaperone protein SicP
GX95_04850231-9.227773pathogenicity island 1 effector protein StpP
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_04750  TYPE3OMGPROT  576  0.0  Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 576 bits (1485), Expect = 0.0
Identities = 170/540 (31%), Positives = 271/540 (50%), Gaps = 57/540 (10%)

Query: 4 HILLARVLACAALVLVAPGYSSE----KIPVTGSGFVAKDDSLRTFFDAMALQLKEPVIV 59
H RVL L+L + ++ E IP +VAK +SLR V+V
Sbjct: 6 HSFFKRVLTGTLLLLSSYSWAQELDWLPIPYV---YVAKGESLRDLLTDFGANYDATVVV 62

Query: 60 SKMAARKKITGNFEFHDPNALLEKLSLQLGLIWYFDGQAIYIYDASEMRNAVVSLRNVSL 119
S K++G FE +P L+ ++ L+WY+DG +YI+ SE+ + ++ L+
Sbjct: 63 SD-KINDKVSGQFEHDNPQDFLQHIASLYNLVWYYDGNVLYIFKNSEVASRLIRLQESEA 121

Query: 120 NEFNNFLKRSGLYNKNYPLRGDNRKGTFYVSGPPVYVDMVVNAATMMDKQND--GIELGR 177
E L+RSG++ + R D YVSGPP Y+++V A +++Q + G
Sbjct: 122 AELKQALQRSGIWEPRFGWRPDASNRLVYVSGPPRYLELVEQTAAALEQQTQIRSEKTGA 181

Query: 178 QKIGVMRLNNTFVGDRTYNLRDQKMVIPGIATAIERLLQGEEQPLGNIVSSEPPAMPAFS 237
I + L DRT + RD ++ PG+AT ++R+L + + P
Sbjct: 182 LAIEIFPLKYASASDRTIHYRDDEVAAPGVATILQRVLSDATIQQVTVDNQRIP------ 235

Query: 238 ANGEKGKAANYAGGMSLQEALKQNAAAGNIKIVAYPDTNSLLVKGTAEQVHFIEMLVKAL 297
Q A + +A A ++ A P N+++V+ + E++ + L+ AL
Sbjct: 236 -----------------QAATRASAQA---RVEADPSLNAIIVRDSPERMPMYQRLIHAL 275

Query: 298 DVAKRHVELSLWIVDLNKSDLERLGTSWSGSI-----------TIGDKLGVSLNQASIST 346
D +E++L IVD+N L LG W I T GD+ ++ N A S
Sbjct: 276 DKPSARIEVALSIVDINADQLTELGVDWRVGIRTGNNHQVVIKTTGDQSNIASNGALGSL 335

Query: 347 LDG---SRFIAAVNALEEKKQATVVSRPVLLTQENVPAIFDNNRTFYTKLIGERNVALEH 403
+D +A VN LE + A VVSRP LLTQEN A+ D++ T+Y K+ G+ L+
Sbjct: 336 VDARGLDYLLARVNLLENEGSAQVVSRPTLLTQENAQAVIDHSETYYVKVTGKEVAELKG 395

Query: 404 VTYGTMIRVLPRFSADG---QIEMSLDIEDGNDKTPQSDTTTSVDALPEVGRTLISTIAR 460
+TYGTM+R+ PR G +I ++L IEDGN Q ++ ++ +P + RT++ T+AR
Sbjct: 396 ITYGTMLRMTPRVLTQGDKSEISLNLHIEDGN----QKPNSSGIEGIPTISRTVVDTVAR 451

Query: 461 VPHGKSLLVGGYTRDANTDTVQSIPFLGKLPLIGSLFRYSSKNKSNVVRVFMIEPKEIVD 520
V HG+SL++GG RD + + +P LG +P IG+LFR S+ VR+F+IEP+ I +
Sbjct: 452 VGHGQSLIIGGIYRDELSVALSKVPLLGDIPYIGALFRRKSELTRRTVRLFIIEPRIIDE 511


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_04755  INVEPROTEIN  604  0.0  Salmonella/Shigella invasion protein E (InvE) signat...
		>INVEPROTEIN#Salmonella/Shigella invasion protein E (InvE) signature.

Length = 372

Score = 604 bits (1558), Expect = 0.0
Identities = 371/372 (99%), Positives = 371/372 (99%)

Query: 1 MIPGSTSGISFSRILSRQTSHQDATQHTDAQQAEIQQAAEDSSPGAEVQKFVQSTDEMSA 60
MIPGSTSGISFSRILSRQ SHQDATQHTDAQQAEIQQAAEDSSPGAEVQKFVQSTDEMSA
Sbjct: 1 MIPGSTSGISFSRILSRQASHQDATQHTDAQQAEIQQAAEDSSPGAEVQKFVQSTDEMSA 60

Query: 61 ALAQFRNRRDYEKKSSNLSNSFERVLEDEALPKAKQILKLISVHGGALEDFLRQARSLFP 120
ALAQFRNRRDYEKKSSNLSNSFERVLEDEALPKAKQILKLISVHGGALEDFLRQARSLFP
Sbjct: 61 ALAQFRNRRDYEKKSSNLSNSFERVLEDEALPKAKQILKLISVHGGALEDFLRQARSLFP 120

Query: 121 DPSDLVLVLRELLRRKDLEEIVRKKLESLLKHVEEQTDPKTLKAGINCALKARLFGKTLS 180
DPSDLVLVLRELLRRKDLEEIVRKKLESLLKHVEEQTDPKTLKAGINCALKARLFGKTLS
Sbjct: 121 DPSDLVLVLRELLRRKDLEEIVRKKLESLLKHVEEQTDPKTLKAGINCALKARLFGKTLS 180

Query: 181 LKPGLLRASYRQFIQSESHEVEIYSDWIASYGYQRRLVVLDFIEGSLLTDIDANDASCSR 240
LKPGLLRASYRQFIQSESHEVEIYSDWIASYGYQRRLVVLDFIEGSLLTDIDANDASCSR
Sbjct: 181 LKPGLLRASYRQFIQSESHEVEIYSDWIASYGYQRRLVVLDFIEGSLLTDIDANDASCSR 240

Query: 241 LEFGQLLRRLTQLKMLRSADLLFVSTLLSYSFTKAFNAEESSWLLLMLSLLQQPHEVDSL 300
LEFGQLLRRLTQLKMLRSADLLFVSTLLSYSFTKAFNAEESSWLLLMLSLLQQPHEVDSL
Sbjct: 241 LEFGQLLRRLTQLKMLRSADLLFVSTLLSYSFTKAFNAEESSWLLLMLSLLQQPHEVDSL 300

Query: 301 LADIIGLNALLLSHKEHASFLQIFYQVCKAIPSSLFYEEYWQEELLMALRSMTDIAYKHE 360
LADIIGLNALLLSHKEHASFLQIFYQVCKAIPSSLFYEEYWQEELLMALRSMTDIAYKHE
Sbjct: 301 LADIIGLNALLLSHKEHASFLQIFYQVCKAIPSSLFYEEYWQEELLMALRSMTDIAYKHE 360

Query: 361 MAEQRRTIEKLS 372
MAEQRRTIEKLS
Sbjct: 361 MAEQRRTIEKLS 372


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_04765  SSPAKPROTEIN  206  3e-72  Invasion protein B family signature.
		>SSPAKPROTEIN#Invasion protein B family signature.

Length = 133

Score = 206 bits (525), Expect = 3e-72
Identities = 43/133 (32%), Positives = 76/133 (57%)

Query: 1 MQHLDIAELVRSALEVSGCDPSLIGGIDSHSTIVLDLFALPSICISVKDDDVWIWAQLGA 60
M ++++ +LVR +L GC PS+I +DSHS I + L ++P+I I++ ++ V +WA A
Sbjct: 1 MSNINLVQLVRDSLFTIGCPPSIITDLDSHSAITISLDSMPAINIALVNEQVMLWANFDA 60

Query: 61 DSMVVLQQRAYEILMTIMEGCHFARGGQLLLGEQNGELTLKALVHPDFLSDGEKFSTALN 120
S V LQ AY IL ++ ++ + L + L L+ ++ D++ DG F+ L+
Sbjct: 61 PSDVKLQSSAYNILNLMLMNFSYSINELVELHRSDEYLQLRVVIKDDYVHDGIVFAEILH 120

Query: 121 GFYNYLEVFSRSL 133
FY +E+ + L
Sbjct: 121 EFYQRMEILNGVL 133


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_04775  SSPAMPROTEIN  167  2e-56  Salmonella surface presentation of antigen gene typ...
		>SSPAMPROTEIN#Salmonella surface presentation of antigen gene type M signature.

Length = 147

Score = 167 bits (423), Expect = 2e-56
Identities = 141/147 (95%), Positives = 143/147 (97%)

Query: 1 MHSLTRIKVLQRRCTVFHSQCESILLRYQDEDRGLQAEEEAILEQIAGLKLLLDTLRAEN 60
MHSLTRIKVLQRRCTVFHSQCESILLRYQDEDR LQ EEEAI+EQIAGLKLLLDTLRAEN
Sbjct: 1 MHSLTRIKVLQRRCTVFHSQCESILLRYQDEDRRLQVEEEAIVEQIAGLKLLLDTLRAEN 60

Query: 61 RQLSREEIYTLLRKQSIVRRQIKDLELQIIQIQEKRSELEKKREEFQKKSKYWLRKEGNY 120
RQLSREEIY LLRKQSIVRRQIKDLELQIIQIQEKRSELEKKREEFQ+KSKYWLRKEGNY
Sbjct: 61 RQLSREEIYALLRKQSIVRRQIKDLELQIIQIQEKRSELEKKREEFQEKSKYWLRKEGNY 120

Query: 121 QRWIIRQKRHYIQREIQQEEAESEEII 147
QRWIIRQKR YIQREIQQEEAESEEII
Sbjct: 121 QRWIIRQKRLYIQREIQQEEAESEEII 147


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_04780  SSPANPROTEIN  598  0.0  Salmonella invasion protein InvJ signature.
		>SSPANPROTEIN#Salmonella invasion protein InvJ signature.

Length = 336

Score = 598 bits (1542), Expect = 0.0
Identities = 331/336 (98%), Positives = 332/336 (98%)

Query: 1 MGDVSAVSSSGNILLPQQDEVGGLSEALKKAVEKHKTEYSDHKKDRDYGDAFVMHKETAL 60
MGDVSAVSSSGNILLPQQDEVGGLSEALKKAVEKHKTEYS KKDRDYGDAFVMHKETAL
Sbjct: 1 MGDVSAVSSSGNILLPQQDEVGGLSEALKKAVEKHKTEYSGDKKDRDYGDAFVMHKETAL 60

Query: 61 PVLLAAWRHGAPAKSEHHNGNVSGLHHNGKGELRIAEKLLKVTAEKSVGLISAEAKVDKS 120
P+LLAAWRHGAPAKSEHHNGNVSGLHHNGK ELRIAEKLLKVTAEKSVGLISAEAKVDKS
Sbjct: 61 PLLLAAWRHGAPAKSEHHNGNVSGLHHNGKSELRIAEKLLKVTAEKSVGLISAEAKVDKS 120

Query: 121 AALLSPKNRPLESVSGKKLSADLKAVESVSEVTDNATGISDDNIKALPGDNKAIAGEGVR 180
AALLS KNRPLESVSGKKLSADLKAVESVSEVTDNATGISDDNIKALPGDNKAIAGEGVR
Sbjct: 121 AALLSSKNRPLESVSGKKLSADLKAVESVSEVTDNATGISDDNIKALPGDNKAIAGEGVR 180

Query: 181 KEGAPLARDVAPARMAAANTGKPEDKDHKKVKDVSQLPLQPTTIADLSQLTGGDEKMPLA 240
KEGAPLARDVAPARMAAANTGKPEDKDHKKVKDVSQLPLQPTTIADLSQLTGGDEKMPLA
Sbjct: 181 KEGAPLARDVAPARMAAANTGKPEDKDHKKVKDVSQLPLQPTTIADLSQLTGGDEKMPLA 240

Query: 241 AQSKPMMTIFPTADGVKGEDSSLTYRFQRWGNDYSVNIQARQAGEFSLIPSNTQVEHRLH 300
AQSKPMMTIFPTADGVKGEDSSLTYRFQRWGNDYSVNIQARQAGEFSLIPSNTQVEHRLH
Sbjct: 241 AQSKPMMTIFPTADGVKGEDSSLTYRFQRWGNDYSVNIQARQAGEFSLIPSNTQVEHRLH 300

Query: 301 DQWQNGNPQRWHLTRDDQQNPQQQQHRQQSGEEDDA 336
DQWQNGNPQRWHLTRDDQQNPQQQQHRQQSGEEDDA
Sbjct: 301 DQWQNGNPQRWHLTRDDQQNPQQQQHRQQSGEEDDA 336


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_04785  TYPE3OMOPROT  538  0.0  Type III secretion system outer membrane O protein ...
		>TYPE3OMOPROT#Type III secretion system outer membrane O protein family signature.

Length = 303

Score = 538 bits (1386), Expect = 0.0
Identities = 302/303 (99%), Positives = 302/303 (99%)

Query: 1 MSLRVRQIDRREWLLAQTATECQRHGREATLEYPTRQGMWVRLSDAEKRWSAWIKPGDWL 60
MSLRVRQIDRREWLLAQTATECQRHGREATLEYPTRQGMWVRLSDAEKRWSAWIKPGDWL
Sbjct: 1 MSLRVRQIDRREWLLAQTATECQRHGREATLEYPTRQGMWVRLSDAEKRWSAWIKPGDWL 60

Query: 61 EHVSPALAGAAVSAGAEHLVVPWLAATERPFELPVPHLSCRRLCVENPVPGSALPEGKLL 120
EHVSPALAGAAVSAGAEHLVVPWLAATERPFELPVPHLSCRRLCVENPVPGSALPEGKLL
Sbjct: 61 EHVSPALAGAAVSAGAEHLVVPWLAATERPFELPVPHLSCRRLCVENPVPGSALPEGKLL 120

Query: 121 HIMSDRGGLWFEHLPELPAVGGGRPKMLRWPLRFVIGSSDTQRSLLGRIGIGDVLLIRTS 180
HIMSDRGGLWFEHLPELPAVGGGRPKMLRWPLRFVIGSSDTQRSLLGRIGIGDVLLIRTS
Sbjct: 121 HIMSDRGGLWFEHLPELPAVGGGRPKMLRWPLRFVIGSSDTQRSLLGRIGIGDVLLIRTS 180

Query: 181 RAEVYCYAKKLGHFNRVEGGIIVETLDIQHIEEENNTTETAETLPGLNQLPVKLEFVLYR 240
RAEVYCYAKKLGHFNRVEGGIIVETLDIQHIEEENNTTETAETLPGLNQLPVKLEFVLYR
Sbjct: 181 RAEVYCYAKKLGHFNRVEGGIIVETLDIQHIEEENNTTETAETLPGLNQLPVKLEFVLYR 240

Query: 241 KNVTLAELETMGQQQLLSLPTNAELNVEIMANGVLLGNGELVQMNDTLGVEIHEWLSESG 300
KNVTLAELE MGQQQLLSLPTNAELNVEIMANGVLLGNGELVQMNDTLGVEIHEWLSESG
Sbjct: 241 KNVTLAELEAMGQQQLLSLPTNAELNVEIMANGVLLGNGELVQMNDTLGVEIHEWLSESG 300

Query: 301 NGE 303
NGE
Sbjct: 301 NGE 303


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_04790  TYPE3IMPPROT  303  e-107  Type III secretion system inner membrane P protein ...
		>TYPE3IMPPROT#Type III secretion system inner membrane P protein family signature.

Length = 224

Score = 303 bits (777), Expect = e-107
Identities = 224/224 (100%), Positives = 224/224 (100%)

Query: 1 MGNDISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLS 60
MGNDISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLS
Sbjct: 1 MGNDISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLS 60

Query: 61 MFVMWPIMHDAYVYFEDEDVTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFFENAQL 120
MFVMWPIMHDAYVYFEDEDVTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFFENAQL
Sbjct: 61 MFVMWPIMHDAYVYFEDEDVTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFFENAQL 120

Query: 121 KRQYGEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLVVSSVL 180
KRQYGEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLVVSSVL
Sbjct: 121 KRQYGEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLVVSSVL 180

Query: 181 LALGMMMMSPVTISTPIKLVLFVALDGWTLLSKGLILQYMDIAT 224
LALGMMMMSPVTISTPIKLVLFVALDGWTLLSKGLILQYMDIAT
Sbjct: 181 LALGMMMMSPVTISTPIKLVLFVALDGWTLLSKGLILQYMDIAT 224


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_04795  TYPE3IMQPROT  89  4e-27  Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 88.7 bits (220), Expect = 4e-27
Identities = 86/86 (100%), Positives = 86/86 (100%)

Query: 1 MDDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCL 60
MDDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCL
Sbjct: 1 MDDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCL 60

Query: 61 FLLSGWYGEVLLSYGRQVIFLALAKG 86
FLLSGWYGEVLLSYGRQVIFLALAKG
Sbjct: 61 FLLSGWYGEVLLSYGRQVIFLALAKG 86


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_04800  TYPE3IMRPROT  188  3e-61  Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 188 bits (478), Expect = 3e-61
Identities = 48/237 (20%), Positives = 104/237 (43%), Gaps = 4/237 (1%)

Query: 12 LVASAALGFARVAPIFFFLPFLNSGVLSGAPRNAIIILVALGVWPHALNEAPPFLSVAMI 71
+ RV + P L+ + + + +++ + P P S +
Sbjct: 12 WLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPANDVPVFSFFAL 71

Query: 72 PLVLQEAAVGVMLGCLLSWPFWVMHALGCIIDNQRGATLSSSIDPANGIDTSEMANFLNM 131
L +Q+ +G+ LG + + F + G II Q G + ++ +DPA+ ++ +A ++M
Sbjct: 72 WLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHLNMPVLARIMDM 131

Query: 132 FAAVVYLQNGGLVTMVDVLNKSYQLCDPMNEC--TPSLPPLLTFINQVAQNALVLASPVV 189
A +++L G + ++ +L ++ E + + L + + N L+LA P++
Sbjct: 132 LALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIFLNGLMLALPLI 191

Query: 190 LVLLLSEVFLGLLSRFAPQMNAFAISLTVKSGIAVLIMLLYFS--PVLPDNVLRLSF 244
+LL + LGLL+R APQ++ F I + + + +M +++ F
Sbjct: 192 TLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFCEHLFSEIF 248


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_04805  TYPE3IMSPROT  340  e-118  Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 340 bits (875), Expect = e-118
Identities = 120/360 (33%), Positives = 205/360 (56%), Gaps = 19/360 (5%)

Query: 1 MSSNKTEKPTKKRLEDSAKKGQSFKSKDLIIACLTLGGIAYLVSYGSFN-EFMGIIKIII 59
MS KTE+PT K++ D+ KKGQ KSK+++ L + A L+ + E + +I
Sbjct: 1 MSGEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLIP 60

Query: 60 ADNFDQSMADYSLAVFGIGLKYLIPFMLLCL---VCSALPAL----LQAGFVLATEALKP 112
+QS +S A+ + L+ F LC +AL A+ +Q GF+++ EA+KP
Sbjct: 61 ---AEQSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKP 117

Query: 113 NLSALNPVEGAKKLFSMRTVKDTVKTLLYLSSFVVAAIICWKKYKVEIFSQLNGNIVGIA 172
++ +NP+EGAK++FS++++ + +K++L + V+ +I+ W K + + L GI
Sbjct: 118 DIKKINPIEGAKRIFSIKSLVEFLKSILKV---VLLSILIWIIIKGNLVTLLQLPTCGIE 174

Query: 173 VIWRELLLALVLTCLACA---LIVLLLDAIAEYFLTMKDMKMDKEEVKREMKEQEGNPEV 229
I L L + C +++ + D EY+ +K++KM K+E+KRE KE EG+PE+
Sbjct: 175 CITPLLGQILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEI 234

Query: 230 KSKRREVHMEILSEQVKSDIENSRLIVANPTHITIGIYFKPELMPIPMISVYETNQRALA 289
KSKRR+ H EI S ++ +++ S ++VANPTHI IGI +K P+P+++ T+ +
Sbjct: 235 KSKRRQFHQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQT 294

Query: 290 VRAYAEKVGVPVIVDIKLARSLFKTHRRYDLVSLEEIDEVLRLLVWLE--EVENAGKDVI 347
VR AE+ GVP++ I LAR+L+ + E+I+ +L WLE +E +++
Sbjct: 295 VRKIAEEEGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQNIEKQHSEML 354


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_04810  SYCDCHAPRONE  128  2e-40  Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 128 bits (322), Expect = 2e-40
Identities = 39/160 (24%), Positives = 72/160 (45%), Gaps = 4/160 (2%)

Query: 4 QNNVSEERVAEMIWDAVSEGATLKDVHGIPQDMMDGLYAHAYEFYNQGRLDEAETFFRFL 63
Q + + + G T+ ++ I D ++ LY+ A+ Y G+ ++A F+ L
Sbjct: 3 QETTDTQEYQLAMESFLKGGGTIAMLNEISSDTLEQLYSLAFNQYQSGKYEDAHKVFQAL 62

Query: 64 CIYDFYNPDYTMGLAAVCQLKKQFQKACDLYAVAFTLLKNDYRPVFFTGQCQLLMRKAAK 123
C+ D Y+ + +GL A Q Q+ A Y+ + + R F +C L + A+
Sbjct: 63 CVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMDIKEPRFPFHAAECLLQKGELAE 122

Query: 124 ARQCF----ELVNERTEDESLRAKALVYLEALKTAETEQH 159
A EL+ ++TE + L + LEA+K + +H
Sbjct: 123 AESGLFLAQELIADKTEFKELSTRVSSMLEAIKLKKEMEH 162


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_04815  BACINVASINB  841  0.0  Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 841 bits (2173), Expect = 0.0
Identities = 592/593 (99%), Positives = 592/593 (99%)

Query: 1 MVNDASSISRSGYTQNPRLAEAAFEGVRKNTDFLKAADKAFKDVVATKAGDLKAGTKSGE 60
MVNDASSISRSGYTQNPRLAEAAFEGVRKNTDFLKAADKAFKDVVATKAGDLKAGTKSGE
Sbjct: 1 MVNDASSISRSGYTQNPRLAEAAFEGVRKNTDFLKAADKAFKDVVATKAGDLKAGTKSGE 60

Query: 61 SAINTVGLKPPTDAAREKLSSEGQLTLLLGKLMTLLGDVSLSQLESRLAVWQAMIESQKE 120
SAINTVGLKPPTDAAREKLSSEGQLTLLLGKLMTLLGDVSLSQLESRLAVWQAMIESQKE
Sbjct: 61 SAINTVGLKPPTDAAREKLSSEGQLTLLLGKLMTLLGDVSLSQLESRLAVWQAMIESQKE 120

Query: 121 MGIQVSKEFQTALGEAQEATDLYEASIKKTDTAKSVYDAAAKKLTQAQNKLQSLDPADPG 180
MGIQVSKEFQTALGEAQEATDLYEASIKKTDTAKSVYDAA KKLTQAQNKLQSLDPADPG
Sbjct: 121 MGIQVSKEFQTALGEAQEATDLYEASIKKTDTAKSVYDAATKKLTQAQNKLQSLDPADPG 180

Query: 181 YAQAEAAVEQAGKEATEAKEALDKATDATVKAGTDAKAKAEKADNILTKFQGTANAASQN 240
YAQAEAAVEQAGKEATEAKEALDKATDATVKAGTDAKAKAEKADNILTKFQGTANAASQN
Sbjct: 181 YAQAEAAVEQAGKEATEAKEALDKATDATVKAGTDAKAKAEKADNILTKFQGTANAASQN 240

Query: 241 QVSQGEQDNLSNVARLTMLMAMFIEIVGKNTEESLQNDLALFNALQEGRQAEMEKKSAEF 300
QVSQGEQDNLSNVARLTMLMAMFIEIVGKNTEESLQNDLALFNALQEGRQAEMEKKSAEF
Sbjct: 241 QVSQGEQDNLSNVARLTMLMAMFIEIVGKNTEESLQNDLALFNALQEGRQAEMEKKSAEF 300

Query: 301 QEETRKAEETNRIMGCIGKVLGALLTIVSVVAAVFTGGASLALAAVGLAVMVADEIVKAA 360
QEETRKAEETNRIMGCIGKVLGALLTIVSVVAAVFTGGASLALAAVGLAVMVADEIVKAA
Sbjct: 301 QEETRKAEETNRIMGCIGKVLGALLTIVSVVAAVFTGGASLALAAVGLAVMVADEIVKAA 360

Query: 361 TGVSFIQQALNPIMEHVLKPLMELIGKAITKALEGLGVDKKTAEMAGSIVGAIVAAIAMV 420
TGVSFIQQALNPIMEHVLKPLMELIGKAITKALEGLGVDKKTAEMAGSIVGAIVAAIAMV
Sbjct: 361 TGVSFIQQALNPIMEHVLKPLMELIGKAITKALEGLGVDKKTAEMAGSIVGAIVAAIAMV 420

Query: 421 AVIVVVAVVGKGAAAKLGNALSKMMGETIKKLVPNVLKQLAQNGSKLFTQGMQRITSGLG 480
AVIVVVAVVGKGAAAKLGNALSKMMGETIKKLVPNVLKQLAQNGSKLFTQGMQRITSGLG
Sbjct: 421 AVIVVVAVVGKGAAAKLGNALSKMMGETIKKLVPNVLKQLAQNGSKLFTQGMQRITSGLG 480

Query: 481 NVGSKMGLQTNALSKELVGNTLNKVALGMEVTNTAAQSAGGVAEGVFIKNASEALADFML 540
NVGSKMGLQTNALSKELVGNTLNKVALGMEVTNTAAQSAGGVAEGVFIKNASEALADFML
Sbjct: 481 NVGSKMGLQTNALSKELVGNTLNKVALGMEVTNTAAQSAGGVAEGVFIKNASEALADFML 540

Query: 541 ARFAMDQIQQWLKQSVEIFGENQKVTAELQKAMSSAVQQNADASRFILRQSRA 593
ARFAMDQIQQWLKQSVEIFGENQKVTAELQKAMSSAVQQNADASRFILRQSRA
Sbjct: 541 ARFAMDQIQQWLKQSVEIFGENQKVTAELQKAMSSAVQQNADASRFILRQSRA 593


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_04820  BACINVASINC  514  0.0  Salmonella/Shigella invasin protein C signature.
		>BACINVASINC#Salmonella/Shigella invasin protein C signature.

Length = 409

Score = 514 bits (1325), Expect = 0.0
Identities = 406/409 (99%), Positives = 408/409 (99%)

Query: 1 MLISNVGVNPAAYLNNHSVENSSQTASQSVSAKDILNSIGISSSKVSDLGLSPTLSAPAP 60
MLISNVG+NPAAYLNNHSVENSSQTASQSVSAKDILNSIGISSSKVSDLGLSPTLSAPAP
Sbjct: 1 MLISNVGINPAAYLNNHSVENSSQTASQSVSAKDILNSIGISSSKVSDLGLSPTLSAPAP 60

Query: 61 GVLTQTPGTITSFLKASIQNTDMNQDLNALANNVTTKANEVVQTQLREQQAEVGKFFDIS 120
GVLTQTPGTITSFLKASIQNTDMNQDLNALANNVTTKANEVVQTQLREQQAEVGKFFDIS
Sbjct: 61 GVLTQTPGTITSFLKASIQNTDMNQDLNALANNVTTKANEVVQTQLREQQAEVGKFFDIS 120

Query: 121 GMSSSAVALLAAANTLMLTLNQADSKLSGKLSLVSFDAAKTTASSMMREGMNALSGSISQ 180
GMSSSAVALLAAANTLMLTLNQADSKLSGKLSLVSFDAAKTTASSMMREGMNALSGSISQ
Sbjct: 121 GMSSSAVALLAAANTLMLTLNQADSKLSGKLSLVSFDAAKTTASSMMREGMNALSGSISQ 180

Query: 181 SALQLGITGVGAKLEYKGLQNERGALKHNAAKIDKLTTESHSIKNVLNGQNSVKLGAEGV 240
SALQLGITGVGAKLEYKGLQNERGALKHNAAKIDKLTTESHSIKNVLNGQNSVKLGAEGV
Sbjct: 181 SALQLGITGVGAKLEYKGLQNERGALKHNAAKIDKLTTESHSIKNVLNGQNSVKLGAEGV 240

Query: 241 DSLKSLNMKKTGTDATKNLNDATLKSNAGTSATESLGIKDSNKQISPEHQAILSKRLESV 300
DSLKSLNMKKTGTDATKNLNDATLKSNAGTSATESLGIK+SNKQISPEHQAILSKRLESV
Sbjct: 241 DSLKSLNMKKTGTDATKNLNDATLKSNAGTSATESLGIKNSNKQISPEHQAILSKRLESV 300

Query: 301 ESDIRLEQNTMDMTRIDARKMQMTGDLIMKNSVTVGGIAGASGQYAATQERSEQQISQVN 360
ESDIRLEQNTMDMTRIDARKMQMTGDLIMKNSVTVGGIAGAS QYAATQERSEQQISQVN
Sbjct: 301 ESDIRLEQNTMDMTRIDARKMQMTGDLIMKNSVTVGGIAGASRQYAATQERSEQQISQVN 360

Query: 361 NRVASTASDEARESSRKSTSLIQEMLKTMESINQSKASALAAIAGNIRA 409
NRVASTASDEARESSRKSTSLIQEMLKTMESINQSKASALAAIAGNIRA
Sbjct: 361 NRVASTASDEARESSRKSTSLIQEMLKTMESINQSKASALAAIAGNIRA 409


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_04845  PF05932  43  2e-08  Tir chaperone protein (CesT)
		>PF05932#Tir chaperone protein (CesT)

Length = 127

Score = 43.3 bits (102), Expect = 2e-08
Identities = 18/128 (14%), Positives = 46/128 (35%), Gaps = 8/128 (6%)

Query: 2 QAHQDIIANIGEKLGL-PLTFDDNNQCLLLLDSDIFTSIEAK--DDIWLLNGMIIPLSPV 58
++ ++ + L + PL FDD+ C +++D+ ++ + LL G++ P
Sbjct: 4 LFYKTLLDDFSRSLEMQPLVFDDHGTCNMIIDNTFALTLSCDYARERLLLIGLLEPH--- 60

Query: 59 CGDSIWRQIMVINGELAANNEGTLAYIEAAETLLFIHAI-TDLTNTYHIISQLESFVNQQ 117
D + ++ N L E + +I + + + ++ +
Sbjct: 61 -KDIPQQCLLAGALNPLLNAGPGLGLDEKSGLYHAYQSIPREKLSVPTLKREMAGLLEWM 119

Query: 118 EALKNILQ 125
+ Q
Sbjct: 120 RGWREASQ 127


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_04850  BACYPHPHTASE  300  1e-98  Salmonella/Yersinia modular tyrosine phosphatase si...
		>BACYPHPHTASE#Salmonella/Yersinia modular tyrosine phosphatase signature.

Length = 468

Score = 300 bits (770), Expect = 1e-98
Identities = 67/212 (31%), Positives = 101/212 (47%), Gaps = 17/212 (8%)

Query: 340 GKPVALAGSYPKNTPDALEAHMKMLLEKECSCLVVLTSEDQMQAKQ--LPAYFRGSYTFG 397
G +A YP LE+H +ML E L VL S ++ ++ +P YFR S T+G
Sbjct: 252 GNTRTIACQYP--LQSQLESHFRMLAENRTPVLAVLASSSEIANQRFGMPDYFRQSGTYG 309

Query: 398 EVHTNSQKVSSASQGGAI--DQYNMHL-SCGEKQYTIPVLHVKNWPDHQPLPS--TDQLE 452
+ S+ G I D Y + + G+K ++PV+HV NWPD + S T L
Sbjct: 310 SITVESKMTQQVGLGDGIMADMYTLTIREAGQKTISVPVVHVGNWPDQTAVSSEVTKALA 369

Query: 453 YLADRVKNSNQNGAPGRSSS-----DKHLPMIHCLGGVGRTGTMAAALVLKDNPHSNL-- 505
L D+ + +N + SS K P+IHC GVGRT + A+ + D+ +S L
Sbjct: 370 SLVDQTAETKRNMYESKGSSAVGDDSKLRPVIHCRAGVGRTAQLIGAMCMNDSRNSQLSV 429

Query: 506 EQVRADFRDSRNNRMLEDASQF-VQLKAMQAQ 536
E + + R RN M++ Q V +K + Q
Sbjct: 430 EDMVSQMRVQRNGIMVQKDEQLDVLIKLAEGQ 461


76  GX95_05165  GX95_05195  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTagDNBiasCDNBias%GCBiasProduct
GX95_05165-2120.493202S-ribosylhomocysteine lyase
GX95_05170-2110.528753porin
GX95_05175-2142.839683MFS transporter
GX95_05180-2111.358994multidrug export protein EmrA
GX95_05185-3110.577697transcriptional repressor MprA
GX95_05190-2131.016491MFS transporter
GX95_05195-1130.308784glycine betaine ABC transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_05165  LUXSPROTEIN  287  e-103  Bacterial autoinducer-2 (AI-2) production protein Lu...
		>LUXSPROTEIN#Bacterial autoinducer-2 (AI-2) production protein LuxS signature.

Length = 171

Score = 287 bits (736), Expect = e-103
Identities = 130/170 (76%), Positives = 145/170 (85%)

Query: 2 PLLDSFAVDHTRMQAPAVRVAKTMNTPHGDAITVFDLRFCIPNKEVMPEKGIHTLEHLFA 61
PLLDSF VDHTRM APAVRVAKTM TP GD ITVFDLRF PNK+++ EKGIHTLEHL+A
Sbjct: 1 PLLDSFTVDHTRMNAPAVRVAKTMQTPKGDTITVFDLRFTAPNKDILSEKGIHTLEHLYA 60

Query: 62 GFMRDHLNGNGVEIIDISPMGCRTGFYMSLIGTPDEQRVADAWKAAMADVLKVQDQNQIP 121
GFMR+HLNG+ VEIIDISPMGCRTGFYMSLIGTP EQ+VADAW AAM DVLKV++QN+IP
Sbjct: 61 GFMRNHLNGDSVEIIDISPMGCRTGFYMSLIGTPSEQQVADAWIAAMEDVLKVENQNKIP 120

Query: 122 ELNVYQCGTYQMHSLSEAQDIARHILERDVRVNSNKELALPKEKLQELHI 171
ELN YQCGT MHSL EA+ IA++ILE V VN N ELALP+ L+EL I
Sbjct: 121 ELNEYQCGTAAMHSLDEAKQIAKNILEVGVAVNKNDELALPESMLRELRI 170


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_05175  TCRTETB  129  7e-35  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 129 bits (326), Expect = 7e-35
Identities = 94/405 (23%), Positives = 164/405 (40%), Gaps = 23/405 (5%)

Query: 17 IALSLATFMQVLDSTIANVAIPTIAGNLGSSLSQGTWVITSFGVANAISIPLTGWLAKRF 76
I L + +F VL+ + NV++P IA + + WV T+F + +I + G L+ +
Sbjct: 17 IWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQL 76

Query: 77 GEVKLFMWSTVAFAAASWACGVS-SSLNMLIFFRVVQGVVAGPLIPLSQSLLLNNYPPAK 135
G +L ++ + S V S ++LI R +QG A L ++ P
Sbjct: 77 GIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKEN 136

Query: 136 RSIALALWSMTVIVAPICGPILGGYISDNYHWGWIFFINVPIGIAVVLMTLHTLRGRETH 195
R A L V + GP +GG I+ HW ++ I + I I V + L+
Sbjct: 137 RGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPM-ITIITVPFLMKLLKKEVRI 195

Query: 196 TERRRIDAVGLALLVIGIGSLQIMLDRGKELDWFSSQEIIILTVVAVIAISFLIVWELTD 255
D G+ L+ +GI + ML F++ I +V+V++ +
Sbjct: 196 KG--HFDIKGIILMSVGI--VFFML--------FTTSYSISFLIVSVLSFLIFVKHIRKV 243

Query: 256 DHPIVDLSLFKSRNFTIGCLCISLAYMLYFGAIVLLPQLLQEVYGYTATWAGLASAPVGI 315
P VD L K+ F IG LC + + G + ++P ++++V+ + G G
Sbjct: 244 TDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGT 303

Query: 316 IPVILS-PIIGRFAHKLDMRRLVTFSFIMYAVCFYWRAWTFEPGMDFGASAWPQFIQGF- 373
+ VI+ I G + ++ +V F ++ S + I F
Sbjct: 304 MSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFL-----LETTSWFMTIIIVFV 358

Query: 374 --AVACFFMPLTTITLSGLPPERLAAASSLSNFTRTLAGSIGTSI 416
++ ++TI S L + A SL NFT L+ G +I
Sbjct: 359 LGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAI 403


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_05180  RTXTOXIND  74  2e-16  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 74.1 bits (182), Expect = 2e-16
Identities = 61/418 (14%), Positives = 125/418 (29%), Gaps = 97/418 (23%)

Query: 19 KRKTALLLLTLLFVIIAVAYGIYWFLVLRHIEETDDA----YVAGNQVQIMAQVSGSVTK 74
+ L F++ + VL +E A +G +I + V +
Sbjct: 51 TPVSRRPRLVAYFIMGFLVIAFILS-VLGQVEIVATANGKLTHSGRSKEIKPIENSIVKE 109

Query: 75 VWADNTDFVKEGDVLVTLDQT--------------------------------------- 95
+ + V++GDVL+ L
Sbjct: 110 IIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELK 169

Query: 96 -------------DAKQAFERAKTALASSVRQTHQLMINSKQ-------LQANIDVQKTA 135
+ + K ++ Q +Q +N + + A I+ +
Sbjct: 170 LPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENL 229

Query: 136 LAQAQSDLNRRVPLGNANLIGREELQHARDAVASAQAQLDVAIQQYNSNQAMILNSNLED 195
+S L+ L + I + + + A +L V Q ++ IL++ E
Sbjct: 230 SRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEY 289

Query: 196 QPAVQQAATEVRN------------------AWLALERTRIVSPMTGYVSRRAVQ-PGAQ 236
Q Q E+ + + + I +P++ V + V G
Sbjct: 290 QLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGV 349

Query: 237 ISPTTPLMAVVPATD-LWVDANFKETQLANMRIGQPVTVITDIYGDDVKY---TGKVVGL 292
++ LM +VP D L V A + + + +GQ + + + +Y GKV +
Sbjct: 350 VTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAF-PYTRYGYLVGKVKNI 408

Query: 293 DMGTGSAFSLLPAQNATGNWIKVVQRLPVRVELDARQLEQHPLRIGLSTLVTVDTANR 350
+ ++ G V+ + + PL G++ + T R
Sbjct: 409 -----NLDAIE--DQRLGLVFNVIISIEENCLSTGNK--NIPLSSGMAVTAEIKTGMR 457


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_05190  TCRTETB  46  1e-07  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 46.4 bits (110), Expect = 1e-07
Identities = 31/165 (18%), Positives = 66/165 (40%), Gaps = 2/165 (1%)

Query: 34 LDTIAHHFSLSASSAGFIVTAAQLGYAAGLLFLVPLGDMFE-RRTLIVSMTLLAAGGMLI 92
L IA+ F+ +S ++ TA L ++ G L D +R L+ + + G ++
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 93 TASSQSLSMMILGTALTGLFSVVAQILVPLA-ATLATPATRGKVVGTIMSGLLLGILLAR 151
S++I+ + G + LV + A RGK G I S + +G +
Sbjct: 97 FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGP 156

Query: 152 TVAGLLANLGGWRTVFWVASALMALMAVALWRGLPKLKSDTHLNY 196
+ G++A+ W + + + + + +++ H +
Sbjct: 157 AIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_05195  PF06057  29  0.014  Type IV secretory pathway VirJ component
		>PF06057#Type IV secretory pathway VirJ component

Length = 243

Score = 29.4 bits (66), Expect = 0.014
Identities = 8/55 (14%), Positives = 17/55 (30%)

Query: 277 FAIMKLPLADINAQNAMMHAGKSSEADVQGHVDGWINAHQQQFDGWVKEALAAQK 331
F + ++P + S +D + HV + + Q + Q
Sbjct: 133 FVLNEMPARYRKNVLGAVLLSPSQSSDFEIHVSEMVTSDNQSARYLTLPEVNKQT 187


77  GX95_05395  GX95_05465  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_05395-1111.217763multidrug ABC transporter ATP-binding protein
GX95_05400-213-0.543299glycosyl transferase
GX95_05405-112-0.358670DNA-invertase
GX95_054101133.295237flagellin
GX95_054151143.600635repressor
GX95_054252143.907871secretion protein HlyD
GX95_054302144.095268ATP-binding protein
GX95_054353154.141330type I secretion protein TolC
GX95_054402153.863269Ig-like domain repeat protein
GX95_05445112-0.897293SsrA-binding protein
GX95_05450112-0.441519ubiquinone-binding protein
GX95_05455112-1.157836RnfH family protein
GX95_05460011-1.037866outer membrane protein assembly factor BamE
GX95_05465011-1.074305DNA repair protein RecN
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_05395  PF05272  31  0.032  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.2 bits (70), Expect = 0.032
Identities = 44/217 (20%), Positives = 64/217 (29%), Gaps = 49/217 (22%)

Query: 992 PPG----TVVAVVGRSGVGKSTLIKLLAGLYSPGSGQIRVGER-----------LIDAAS 1036
PG V + G G+GKSTLI L GL +G + +
Sbjct: 590 EPGCKFDYSVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYEQIAGIVAYELSE 649

Query: 1037 LSDYRRQTGLVTQDVALFSGDIAENI-RYSRPDSSDTEVEIAARRAGLFETV---QHL-- 1090
++ +RR D + RY V+ R+ ++ T Q+L
Sbjct: 650 MTAFRR------ADAEAVKAFFSSRKDRYRGA--YGRYVQDHPRQVVIWCTTNKRQYLFD 701

Query: 1091 PLGFRT--PVNNGG----TDLSAGQRQLIALARAQLAQ----------AHILLLDEATAR 1134
G R PV G L + QL A A I E R
Sbjct: 702 ITGNRRFWPVLVPGRANLVWLQKFRGQLFAEALHLYLAGERYFPSPEDEEIYFRPEQELR 761

Query: 1135 -IDRSAEERLITSLTGVTHTEKRIALIVAHRLTTARR 1170
++ + RL LT A A + +
Sbjct: 762 LVETGVQGRLWALLTREG---APAAEGAAQKGYSVNT 795


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_05410  FLAGELLIN  280  8e-91  Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 280 bits (718), Expect = 8e-91
Identities = 267/510 (52%), Positives = 316/510 (61%), Gaps = 13/510 (2%)

Query: 2 AQVINTNSLSLLTQNNLNKSQSALGTAIERLSSGLRINSAKDDAAGQAIANRFTANIKGL 61
AQVINTNSLSLLTQNNLNKSQS+L +AIERLSSGLRINSAKDDAAGQAIANRFT+NIKGL
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 TQASRNANDGISIAQTTEGALNEINNNLQRVRELAVQSANSTNSQSDLDSIQAEITQRLN 121
TQASRNANDGISIAQTTEGALNEINNNLQRVREL+VQ+ N TNS SDL SIQ EI QRL
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 EIDRVSGQTQFNGVKVLAQDNTLTIQVGANDGETIDIDLKQINSQTLGLDSLNVQKAYDV 181
EIDRVS QTQFNGVKVL+QDN + IQVGANDGETI IDL++I+ ++LGLD NV +
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEA 180

Query: 182 SATDVISSTYSDGTQALTAPTATDIKAALGNPTVTGDTLTAAVSFKDGKYYATVSGYTDA 241
+ D+ SS + A A + + + V DT V D Y +G
Sbjct: 181 TVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVP--DKVYVNAANGQLTT 238

Query: 242 GDTAKNGKYEVTVDSATGAVSFGATPTKSTVTGDTAVTKVQVNAPVAADAATKKALQDGG 301
D N ++ + + A + A + G V TK G
Sbjct: 239 DDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKG-VTFTIDTKTGNDGNG 297

Query: 302 VSSADANAATLVKMSYTDKNGKTIEGGYALKAGDKYYAA------DYDEATGAIKAKTTS 355
S N + G L++ Y + +D+ T AK +
Sbjct: 298 KVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSD 357

Query: 356 YTAADGTTKTAANQLGGVDG----KTEVVTIDGKTYNASKAAGHDFKAQPELAEAAAKTT 411
A + + + G + + VT+ GKT K A E A AA K+T
Sbjct: 358 LEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAAAAKKST 417

Query: 412 ENPLQKIDAALAQVDALRSDLGAVQNRFNSAITNLGNTVNNLSEARSRIEDSDYATEVSN 471
NPL ID+AL++VDA+RS LGA+QNRF+SAITNLGNTV NL+ ARSRIED+DYATEVSN
Sbjct: 418 ANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYATEVSN 477

Query: 472 MSRAQILQQAGTSVLAQANQVPQNVLSLLR 501
MS+AQILQQAGTSVLAQANQVPQNVLSLLR
Sbjct: 478 MSKAQILQQAGTSVLAQANQVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_05425  RTXTOXIND  243  3e-78  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 243 bits (621), Expect = 3e-78
Identities = 95/432 (21%), Positives = 176/432 (40%), Gaps = 56/432 (12%)

Query: 18 ERAFSGAGRIVLICSLLFLILGI-WAWFGRLDEVSTGNGKVIPSSREQVLQSLDGGILAQ 76
E S R+V + FL++ + G+++ V+T NGK+ S R + ++ ++ I+ +
Sbjct: 50 ETPVSRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKE 109

Query: 77 LTVREGDRVQANQIVARLDPTRLASNVGESAAKYRASLASSARLTAEVSDLPL------- 129
+ V+EG+ V+ ++ +L ++ ++ + + R + L
Sbjct: 110 IIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELK 169

Query: 130 --AFPAELNGWPDLIAAETRLYKSR-----------RAQLADTEAELRDALASVNK---- 172
P N + + T L K + L AE LA +N+
Sbjct: 170 LPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENL 229

Query: 173 ------ELAITQRLEKSGAASHVEVLRLQRQKSDLG---------------------LKI 205
L L A + VL + + + +
Sbjct: 230 SRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEY 289

Query: 206 TDLRSQYYVQAREALSKANAEVDMLSAILKGREDSVTRLTVRSPVRGIVKNIQVTTIGGV 265
+ + + + L + + +L+ L E+ +R+PV V+ ++V T GGV
Sbjct: 290 QLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGV 349

Query: 266 IPPNGEMMEIVPVDDRLLIETRLSPRDIAFIHPGQRALVKITAYDYAIYGGLDGVVETIS 325
+ +M IVP DD L + + +DI FI+ GQ A++K+ A+ Y YG L G V+ I+
Sbjct: 350 VTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNIN 409

Query: 326 PDTIQDKVKPEIFYYRVFIRTHQDYLQNKSGRRFSIVPGMIATVDIKTGEKTIVDYLIKP 385
D I+D+ + V I ++ L + + GM T +IKTG ++++ YL+ P
Sbjct: 410 LDAIEDQRLGL--VFNVIISIEENCLSTG-NKNIPLSSGMAVTAEIKTGMRSVISYLLSP 466

Query: 386 F-NRAKEALRER 396
E+LRER
Sbjct: 467 LEESVTESLRER 478


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_05435  RTXTOXIND  35  5e-04  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 35.2 bits (81), Expect = 5e-04
Identities = 32/224 (14%), Positives = 63/224 (28%), Gaps = 32/224 (14%)

Query: 209 DVVQTEARIESARSQLAQYQANLDSAKASLMSWLGWNSLNGINNDFPAKLARSCETATPD 268
+ EA +S L Q + + S D P S E
Sbjct: 128 TALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRL 187

Query: 269 DRLVPAVLAAW-AQANVARANLDYASAQ---MTPTISLEPSVQHYLNDKYPSHEVLDKTQ 324
L+ + W Q NLD A+ + I+ ++ + L Q
Sbjct: 188 TSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQ 247

Query: 325 YSTWVKVEMPLYQGGGLTARRNAASHAVDAAQSTIQRTRLDVRQKLMEARSQAMSLASAL 384
V + ++ +L +SQ + S
Sbjct: 248 AIAKHAVL-------------------------EQENKYVEAVNELRVYKSQLEQIES-- 280

Query: 385 QILRRQQQLSERTRELYQQQYLDLGSRPLLDVLNAEQEVYQARF 428
+IL +++ T +L++ + LD + ++ E+ +
Sbjct: 281 EILSAKEEYQLVT-QLFKNEILDKLRQTTDNIGLLTLELAKNEE 323


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_05440  INTIMIN  43  3e-05  Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 43.1 bits (101), Expect = 3e-05
Identities = 66/311 (21%), Positives = 102/311 (32%), Gaps = 30/311 (9%)

Query: 2724 TPAQTNGQPLLAFAQDKAGNTGIAAGFTAPDTRVPEAPIITNVVDDVGIYTGAIANGQ-- 2781
+N + A A D+ GN+ T + V D T A A+G
Sbjct: 518 VQGGSNVYKVTARAYDRNGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEA 577

Query: 2782 VTNDAQPTLNGTAQAGATVS--IYNNGALLGTTTANASGNWSFTPTGNLTEGSHAFT-AT 2838
+T A NG AQA VS I + A+L +AN +G+ T T L +
Sbjct: 578 ITYTATVKKNGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVT--LKSDKPGQVVVS 635

Query: 2839 ATNANGTGSVSTAATVIVDTLAPGTPSGTLSADGGSLSGQAEANSTVTVTLAGG------ 2892
A A T +++ A + VD +GQ TV V
Sbjct: 636 AKTAEMTSALNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQE 695

Query: 2893 VTLTTTAG----------SNGAWSLTLPTKQIEGQLINVTATDAAGN-ASGALGITAPVL 2941
VT TTT G +NG +TL + L++ +D A + + + +
Sbjct: 696 VTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLT 755

Query: 2942 PLAARDNITSLDLTSTAVTSTQNYSDYGLLLVGALGNVASVLGN------DTAQVEFIIA 2995
I + T Y L G G N D + + +
Sbjct: 756 IDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLK 815

Query: 2996 EGGTGDVTIDA 3006
E GT +++ +
Sbjct: 816 EKGTTTISVIS 826



Score = 39.3 bits (91), Expect = 4e-04
Identities = 79/414 (19%), Positives = 146/414 (35%), Gaps = 39/414 (9%)

Query: 2197 IYNGSALVGTA-QVQANGSWSFT-------PSTSLGAGVWNLTATATDAAGNTSAASEIR 2248
+++ SAL Q+Q +GS S G+ V+ +TA A D GN+S + +
Sbjct: 486 VWDDSALRSQGGQIQHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDRNGNSS-NNVLL 544

Query: 2249 SFTIDTTAPAAPVIDTVYDGTGPITGNLSSGQ--ITDEARPVISGTREAN--TTIRLYDN 2304
+ T+ + V D T T + G IT A +G +AN + +
Sbjct: 545 TITV-LSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQANVPVSFNIVSG 603

Query: 2305 GTLLAEIPADNSNSWRYTPDASLATGNHVITVIAVDAAGNASPV-SDSVNFVVDTTPPLT 2363
+L+ A+ + S + T +L + V++ A S + +++V FV T +T
Sbjct: 604 TAVLSANSANTNGSGKAT--VTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASIT 661

Query: 2364 PVITSVSDDQAPGLGTIANGQN--TNDPTPTFSGTAEAGATITLYENGTVIGTTTAQ--P 2419
+ A +ANGQ+ T + +T + +T +
Sbjct: 662 EI--KADKTTA-----VANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDT 714

Query: 2420 DGAWSVSTSKLTSGTHVITAVATDAAGNSSPNSTAFTLTVDTTAPQTPILTSVVDDVAGG 2479
+G V+ + T G +++A +D A + F T+ I+ + V
Sbjct: 715 NGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPT 774

Query: 2480 VTGNLANGQITNDNRPTLNGTAEAGSVISIYDGNTLLGVTSANAGGAWSFTPTTGLNDGT 2539
V + A I+ D ++ G + G + + + N
Sbjct: 775 VWLQYGQVNLKASGGNGKYTWRSANPAIASVDASS--GQVTLKEKGTTTISVISSDN--- 829

Query: 2540 RTLTVTATDPAGNVSPATSGFTIVVD------TLAPTVPLITSIVDDVPNNTGA 2587
+T T T P + P S D +P + +++V GA
Sbjct: 830 QTATYTIATPNSLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGA 883



Score = 36.6 bits (84), Expect = 0.003
Identities = 63/279 (22%), Positives = 90/279 (32%), Gaps = 36/279 (12%)

Query: 1508 TLPVTSALPDGVYTLTAIAADAAGNSSGVSNSFTFTVDTVPLLPPVVN--EILDDVAPVT 1565
LP VY +TA A D GNS SN+ T+ TV VV+ + D A T
Sbjct: 513 ILPAYVQGGSNVYKVTARAYDRNGNS---SNNVLLTI-TVLSNGQVVDQVGVTDFTADKT 568

Query: 1566 GPLTDG--AFTNDRTLTINGSGENGSTVTIYDNGVAIGTALVTDGVWTFN-----TPELS 1618
DG A T T+ NG + V+ + GTA+++ N T L
Sbjct: 569 SAKADGTEAITYTATVKKNGVAQANVPVSF---NIVSGTAVLSANSANTNGSGKATVTLK 625

Query: 1619 EVSHALTFSATDDAGNTTAQTQPITITVDITAPPAPTIQTVDDDGTRVAGRADPYA-TVE 1677
+ A T+A I VD T I+ T VA D TV+
Sbjct: 626 SDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASITEIKADKT--TAVANGQDAITYTVK 683

Query: 1678 IHHADGTLVGSAVANGTGEFVVTLSPAQTDG---------GTLTAIAIDRAGNNGPATNF 1728
+ D + V T ++ S +TD T ++ A + A +
Sbjct: 684 VMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDV 743

Query: 1729 PASDSGLPAVPAITAIEDNVGSVQGNIAAGGATDDTTPT 1767
A + I + G PT
Sbjct: 744 KAPEVEFFTTLTIDDGNIEI--------VGTGVKGKLPT 774


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_05450  FLGMOTORFLIM  28  0.026  Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 27.6 bits (61), Expect = 0.026
Identities = 16/78 (20%), Positives = 30/78 (38%), Gaps = 8/78 (10%)

Query: 49 GSRVLESSPAQMTAAVDVSKAGISKTFTTRNQLTRNQSILMHLVDGPFKKLIGGWK---- 104
G+ VLE P+ + +D G + + LT I +++G +++ +
Sbjct: 113 GNAVLEVDPSITFSIIDRLFGGTGQAAKVQRDLT---DIENSVMEGVIVRILANVRESWT 169

Query: 105 -FTPLSPEACRIEFQLDF 121
L P +IE F
Sbjct: 170 QVIDLRPRLGQIETNPQF 187


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_05465  RTXTOXIND  31  0.009  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.3 bits (71), Expect = 0.009
Identities = 31/198 (15%), Positives = 64/198 (32%), Gaps = 36/198 (18%)

Query: 177 QQQSQERAARAELLQYQLKELNDFNPQAGEFEQIDEEYKRLANSGQLLTTSQNALALLAD 236
+ QS AR E +YQ+ + + E + DE Y + + ++L + +
Sbjct: 138 KTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFS- 196

Query: 237 GEDVNLQSQLYSAKQLVSELVGMDSKLSGILDMLEEATIQLTEASDELRHYCERLDLDPN 296
Q+Q Y Q L ++ +L A I E +
Sbjct: 197 ----TWQNQKY---QKELNLDKKRAERLTVL-----ARINRYENLSRV------------ 232

Query: 297 RLFELEQRIAKQISLARKHHVSPEALPQLYQSLLEEQQQLDDQADSLETLTLAVNKHHQQ 356
+ R+ SL K ++ ++LE++ + + + L + + +
Sbjct: 233 ----EKSRLDDFSSLLHKQAIA-------KHAVLEQENKYVEAVNELRVYKSQLEQIESE 281

Query: 357 ALETAQALHQQRQFYAQE 374
L + Q + E
Sbjct: 282 ILSAKEEYQLVTQLFKNE 299


78  GX95_06615  GX95_06650  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_06615-116-1.975226MFS transporter
GX95_06620-215-1.362347ABC transporter substrate-binding protein
GX95_06625-213-1.461865sensor histidine kinase
GX95_06630-112-2.264987transcriptional regulator
GX95_06635013-0.391707outer membrane protease
GX95_066450100.799246*hypothetical protein
GX95_06650-191.515017phospholipid-binding lipoprotein MlaA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_06615  TCRTETA  34  0.001  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 34.0 bits (78), Expect = 0.001
Identities = 72/429 (16%), Positives = 140/429 (32%), Gaps = 45/429 (10%)

Query: 28 IQALLSVFLGYLAYYIVRNNFTLSTPYLKEQLDLSATQI---GLLSSCMLIAYGISKGVM 84
I L +V L + ++ P L L S G+L + + V+
Sbjct: 8 IVILSTVALDAVGIGLI----MPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVL 63

Query: 85 SSLADKASPKVFMACGLVLCAIVNVGLGFSSAFWIFAALVVFNGLFQGMGVGPSFITIAN 144
+L+D+ + + L A+ + + W+ + G+ G IA+
Sbjct: 64 GALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAG-AYIAD 122

Query: 145 WFPRRERGRVGAFWNISHNVGGGIVA-PIVGAAFAILGSEHWQSASYIVPACVAIIFALI 203
ER R F +S G G+VA P++G A + A + + L
Sbjct: 123 ITDGDERAR--HFGFMSACFGFGMVAGPVLGGLMGGFSPH----APFFAAAALNGLNFLT 176

Query: 204 VLVLGKGSPREEGLPSLEQMMPEEKVVLKTKNMAKAPENMSAWQIFCTYVLRNKNAWYIS 263
L +PE + + P A ++ +
Sbjct: 177 GCFL----------------LPE------SHKGERRPLRREALNPLASFRWARGMTVVAA 214

Query: 264 LVDVFVYMVRFGMISWLPIYLLTVKHFSKEQMSVAFLFFEWA---AIPSTLLAGWLSDKL 320
L+ VF M G + + F + ++ + ++ ++ G ++ +L
Sbjct: 215 LMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARL 274

Query: 321 FKGRRMPLAMICMALIFVCLIGYWKSESLLMVTIFAAIVGCLIYVPQFLASVQTMEIVPS 380
+ R + L MI ++ L + + + A G + Q + S Q E
Sbjct: 275 GERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQG 334

Query: 381 FAVGSAVGLRGFMSYIFGASLGTSLFGVMVDKLGWYGGFYLLMGGIVCCILFCYLSHRGA 440
GS L ++ I G L T+++ + + G+ + G + + L RG
Sbjct: 335 QLQGSLAALTS-LTSIVGPLLFTAIYAA---SITTWNGWAWIAGAALYLLCLPAL-RRGL 389

Query: 441 LELERQRQN 449
QR +
Sbjct: 390 WSGAGQRAD 398


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_06620  FLGMOTORFLIM  29  0.050  Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 28.7 bits (64), Expect = 0.050
Identities = 5/35 (14%), Positives = 15/35 (42%), Gaps = 4/35 (11%)

Query: 330 QQLVQRMFDTAISFRLAQLKDAWRALHSAETRLKR 364
+++ + LA ++++W + RL +
Sbjct: 150 NSVMEGVIVRI----LANVRESWTQVIDLRPRLGQ 180


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_06630  HTHFIS  248  4e-80  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 248 bits (636), Expect = 4e-80
Identities = 120/474 (25%), Positives = 192/474 (40%), Gaps = 73/474 (15%)

Query: 7 SILLIDDDVDVLDAYTQMLEQAGYRVRGFTHPFEAKEWVKADWEGIVLSDVCMPGCSGID 66
+IL+ DDD + Q L +AGY VR ++ W+ A +V++DV MP + D
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 67 LMTLFHQDDDQLPILLITGHGDVPMAVDAVKKGAWDFLQKPVDPGKLLILIEDALRQRRS 126
L+ + LP+L+++ A+ A +KGA+D+L KP D +L+ +I AL + +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 127 VIARRQYCQQTLQVELIGRSEWMNQFRQRLQQLAETDIAVWFYGEHGTGRMTGARYLHQL 186
++ + Q L+GRS M + + L +L +TD+ + GE GTG+ AR LH
Sbjct: 125 RPSKLEDDSQDGM-PLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDY 183

Query: 187 GRNAKGPFVRYELT--PENAGQLETF-----------------IDQAQGGTLVLSHPEYL 227
G+ GPFV + P + + E F +QA+GGTL L +
Sbjct: 184 GKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDM 243

Query: 228 TREQQHHLAR-LQSLEHRP----------FRLVGVGSASLVEQAAANQIAAELYYCFAMT 276
+ Q L R LQ E+ R+V + L + +LYY +
Sbjct: 244 PMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVV 303

Query: 277 QIACQSLSQRPDDIEPLFRHYLRKACLRLNHPVPEIAGELLKGIMRRAWPSNVRELANAA 336
+ L R +DI L RH++++A + V E L+ + WP NVREL N
Sbjct: 304 PLRLPPLRDRAEDIPDLVRHFVQQAE-KEGLDVKRFDQEALELMKAHPWPGNVRELENLV 362

Query: 337 ELFAV-----------------------------------GVLPLAETVNPQLL------ 355
+ E Q
Sbjct: 363 RRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDA 422

Query: 356 LQEPTPLDRRVEEYERQIITEALNIHQGRINEVAEYLQIPRKKLYLRMKKYGLS 409
L DR + E E +I AL +G + A+ L + R L ++++ G+S
Sbjct: 423 LPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_06635  OMPTIN  475  e-173  Omptin serine protease signature.
		>OMPTIN#Omptin serine protease signature.

Length = 317

Score = 475 bits (1224), Expect = e-173
Identities = 148/320 (46%), Positives = 213/320 (66%), Gaps = 11/320 (3%)

Query: 1 MKKHAIAVMMIAVFSESVYAESTLFIPDVSPESVTTSLSVGVLNGKSRELVYD-TDTGRK 59
M+ + +++ + S +A + +P+++ +S+G L+GK++E VY + GRK
Sbjct: 1 MRAKLLGIVLTTPIAISSFASTET--LSFTPDNINADISLGTLSGKTKERVYLAEEGGRK 58

Query: 60 LSQLDWKIKNVATLQGDLSWEPYSFMTLDARGWTSLASGSGHMVDHDWMSSEQPG-WTDR 118
+SQLDWK N A ++G ++W+ +++ A GWT+L S G+MVD DWM S PG WTD
Sbjct: 59 VSQLDWKFNNAAIIKGAINWDLMPQISIGAAGWTTLGSRGGNMVDQDWMDSSNPGTWTDE 118

Query: 119 SIHPDTSVNYANEYDLNVKGWLLQGDNYKAGVTAGYQETRFSWTARGGSYIYDNGR---- 174
S HPDT +NYANE+DLN+KGWLL NY+ G+ AGYQE+R+S+TARGGSYIY +
Sbjct: 119 SRHPDTQLNYANEFDLNIKGWLLNEPNYRLGLMAGYQESRYSFTARGGSYIYSSEEGFRD 178

Query: 175 YIGNFPHGVRGIGYSQRFEMPYIGLAGDYRINDFECNVLFKYSDWVNAHDNDEHY--MRK 232
IG+FP+G R IGY QRF+MPYIGL G YR DFE FKYS WV + DNDEHY ++
Sbjct: 179 DIGSFPNGERAIGYKQRFKMPYIGLTGSYRYEDFELGGTFKYSGWVESSDNDEHYDPGKR 238

Query: 233 LTFREKTENSRYYGASIDAGYYITSNAKIFAEFAYSKYEEGKGGTQIIDKTSGDTAYFGG 292
+T+R K ++ YY +++AGYY+T NAK++ E A+++ KG T + D + +T+ +
Sbjct: 239 ITYRSKVKDQNYYSVAVNAGYYVTPNAKVYVEGAWNRVTNKKGNTSLYDHNN-NTSDYSK 297

Query: 293 DAAGIANNNYTVTAGLQYRF 312
+ AGI N N+ TAGL+Y F
Sbjct: 298 NGAGIENYNFITTAGLKYTF 317


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_06645  PF06580  29  0.030  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 28.7 bits (64), Expect = 0.030
Identities = 22/113 (19%), Positives = 46/113 (40%), Gaps = 12/113 (10%)

Query: 199 WIIATMVWMFPAAGGAKIVVIILMTWLIALGDTTHIVVGSVEILYLV-FNGTLPWSDFLW 257
I W+ G I+ ++ +I + V + I L+ F T P + F
Sbjct: 61 SFIKRQGWLKLNMG-QIILRVLPACVVIGM----VWFVANTSIWRLLAFINTKPVA-FTL 114

Query: 258 PFALPTLAGNICGGTFIFALMSHAQIRNDMSNKRKEEARLRGERLERERKKAE 310
P AL ++ N+ TF+++L+ K ++A + ++ ++A+
Sbjct: 115 PLAL-SIIFNVVVVTFMWSLLYFGWHFF----KNYKQAEIDQWKMASMAQEAQ 162


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_06650  VACJLIPOPROT  398  e-144  VacJ lipoprotein signature.
		>VACJLIPOPROT#VacJ lipoprotein signature.

Length = 251

Score = 398 bits (1025), Expect = e-144
Identities = 238/251 (94%), Positives = 248/251 (98%)

Query: 1 MKLRLSALALGTTLLVGCASSGTEQQGRSDPFEGFNRTMYNFNFNVLDPYVVRPVAVAWR 60
MKLRLSALALGTTLLVGCASSGT+QQGRSDP EGFNRTMYNFNFNVLDPY+VRPVAVAWR
Sbjct: 1 MKLRLSALALGTTLLVGCASSGTDQQGRSDPLEGFNRTMYNFNFNVLDPYIVRPVAVAWR 60

Query: 61 DYVPQPARNGLSNFTGNLEEPAIMVNYFLQGDPYQGMVHFTRFFLNTLLGMGGFIDVAGM 120
DYVPQPARNGLSNFTGNLEEPA+MVNYFLQGDPYQGMVHFTRFFLNT+LGMGGFIDVAGM
Sbjct: 61 DYVPQPARNGLSNFTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAGM 120

Query: 121 ANPKLQRVEPHRFGSTLGHYGVGYGPYVQLPFYGSFTLREDGGDMADTLYPVLSWLTWPM 180
ANPKLQR EPHRFGSTLGHYGVGYGPYVQLPFYGSFTLR+DGGDMAD LYPVLSWLTWPM
Sbjct: 121 ANPKLQRTEPHRFGSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMADALYPVLSWLTWPM 180

Query: 181 SIGKWTIEGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGGKLKPQENPNAQA 240
S+GKWT+EGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGG+LKPQENPNAQA
Sbjct: 181 SVGKWTLEGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGGELKPQENPNAQA 240

Query: 241 IQDELKEIDSE 251
IQD+LK+IDSE
Sbjct: 241 IQDDLKDIDSE 251


79  GX95_06770  GX95_06810  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_06770-1152.586490tRNA pseudouridine(38-40) synthase TruA
GX95_067750152.049936hypothetical protein
GX95_06780-1101.847727acetyl-CoA carboxylase subunit beta
GX95_06785-190.505179bifunctional tetrahydrofolate
GX95_06790010-1.063717cell division protein DedD
GX95_06795-19-1.822614colicin V production protein
GX95_06800010-1.499832amidophosphoribosyltransferase
GX95_06805114-1.652512sigma-54-dependent Fis family transcriptional
GX95_06810013-1.927305diaminopimelate decarboxylase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_06770  FbpA_PF05833  29  0.029  Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 28.7 bits (64), Expect = 0.029
Identities = 20/63 (31%), Positives = 31/63 (49%), Gaps = 6/63 (9%)

Query: 204 VRNIVGS-LLEVGAHNQPESWIAELLAAKDRTLAAATAKAEGLYLVAVDYPDRFDLPKPP 262
+NI GS ++ + PES + E AA LAA +K++ V VDY + ++ KP
Sbjct: 496 TKNIPGSHVIVKNIMDIPESTLLE--AAN---LAAYYSKSQNSSNVPVDYTEVKNVKKPN 550

Query: 263 MGP 265

Sbjct: 551 GAK 553


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_06790  PF05616  30  0.008  Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 29.7 bits (66), Expect = 0.008
Identities = 22/89 (24%), Positives = 34/89 (38%), Gaps = 19/89 (21%)

Query: 53 PDMMPAATQALPTQPPEGAAEEVRAGDAAAPSLDPSRMASNNVELDPIPAETPKPKPQPK 112
PD+ P + +A QP P + P+ +NN P P E P +P P+
Sbjct: 313 PDLTPGSAEAPNAQP--------------LPEVSPAENPANN----PAPNENPGTRPNPE 354

Query: 113 PQQPVVVASTPTPAPKPAAD-DKPAPTGK 140
P + + P +P D PA +
Sbjct: 355 PDPDLNPDANPDTDGQPGTRPDSPAVPDR 383


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_06800  ANTHRAXTOXNA  34  0.002  Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 33.6 bits (76), Expect = 0.002
Identities = 13/37 (35%), Positives = 24/37 (64%), Gaps = 2/37 (5%)

Query: 469 KDVDQQYLDFLDSLRND-DAKAVLFQNEM-ENLEMHN 503
K +D ++L+ + SL +D D+ +LF + E LE++N
Sbjct: 186 KSLDPEFLNLIKSLSDDSDSSDLLFSQKFKEKLELNN 222


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_06805  HTHFIS  348  e-118  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 348 bits (894), Expect = e-118
Identities = 121/371 (32%), Positives = 185/371 (49%), Gaps = 24/371 (6%)

Query: 122 NMSGVRRLQEQVVELNQLLYADHHE---KHHAIITENPEMLSNIAKAKRLAASNIPVTIV 178
+++ + + + + + + + ++ + M RL +++ + I
Sbjct: 107 DLTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMIT 166

Query: 179 GETGTGKELFSRLIHQCSKRANKPFIALNCGALPPTLIESTLFGTVRGAYTGAENS-QGY 237
GE+GTGKEL +R +H KR N PF+A+N A+P LIES LFG +GA+TGA+ G
Sbjct: 167 GESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGR 226

Query: 238 LELANGGTLFLDELNAMPIEMQSKLLRFLQDKTFWRLGGQQQLHSDVRIVAAMNEAPVKL 297
E A GGTLFLDE+ MP++ Q++LLR LQ + +GG+ + SDVRIVAA N+ +
Sbjct: 227 FEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQS 286

Query: 298 IQQERLRADLFYRLSVGMLTLPPLRARPEDIPLLANYFIDKYRNDVPQDIHGLSETARAD 357
I Q R DL+YRL+V L LPPLR R EDIP L +F+ + + D+ + A
Sbjct: 287 INQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKE-GLDVKRFDQEALEL 345

Query: 358 LLNHAWPGNVRMLENAIVRSMIMQEKDGLLKHIIF-------------------EQDELN 398
+ H WPGNVR LEN + R + +D + + II ++
Sbjct: 346 MKAHPWPGNVRELENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSIS 405

Query: 399 LGVPETAPENPLPSSPDPQYEGSLEVRVANYERHLIETALDTHQGNIAAAARSLNVSRTT 458
V E + G + +A E LI AL +GN AA L ++R T
Sbjct: 406 QAVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNT 465

Query: 459 LQYKVQKYAIR 469
L+ K+++ +
Sbjct: 466 LRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_06810  ALARACEMASE  32  0.006  Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 31.7 bits (72), Expect = 0.006
Identities = 23/133 (17%), Positives = 47/133 (35%), Gaps = 20/133 (15%)

Query: 87 VLKAIRDAGICAEANSQYEVRKCLEIGFRGDQIVFNGVVKKPADLEYAIANDLYLINVDS 146
+ AI A N + E E G++G ++ G DLE + L
Sbjct: 46 IWSAIGATDGFALLNLE-EAITLRERGWKGPILMLEGFFH-AQDLEIYDQHRLTT----C 99

Query: 147 LYELEHIDAIS-RKLKKVANVCVRVEPNVPSATHAELVTAFHAKSGLDLEQAEETCRRIL 205
++ + A+ +LK ++ ++V + + + G ++ +++
Sbjct: 100 VHSNWQLKALQNARLKAPLDIYLKVN------------SGMN-RLGFQPDRVLTVWQQLR 146

Query: 206 AMPYVHLRGLHMH 218
AM V L H
Sbjct: 147 AMANVGEMTLMSH 159


80  GX95_07215  GX95_07250  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_07215-19-3.517421hypothetical protein
GX95_07220-19-2.416824MFS transporter
GX95_07225013-1.166778MR-MLE family protein
GX95_072300150.282223DNA gyrase subunit A
GX95_07235-1130.496203two-component system sensor histidine
GX95_07240-2131.064990DNA-binding response regulator
GX95_07245-3111.519849phosphotransferase RcsD
GX95_07250-2152.927327porin OmpC
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_07215  NUCEPIMERASE  28  0.031  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 27.8 bits (62), Expect = 0.031
Identities = 16/75 (21%), Positives = 29/75 (38%), Gaps = 15/75 (20%)

Query: 133 AERDTQAYLKLDHDFHYVFVKYADNKYISQAHLLISARLLAIRYRLDFTAEYITSSNRGH 192
A+R+ L F VF+ + A+RY L+ Y S+ G
Sbjct: 62 ADREGMTDLFASGHFERVFISPH------RL---------AVRYSLENPHAYADSNLTGF 106

Query: 193 ATILDMLKNNNVEGV 207
IL+ ++N ++ +
Sbjct: 107 LNILEGCRHNKIQHL 121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_07220  TCRTETB  31  0.012  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 30.6 bits (69), Expect = 0.012
Identities = 36/179 (20%), Positives = 68/179 (37%), Gaps = 12/179 (6%)

Query: 25 ILYFFNYMDRVNIGFAALRMNESLGITPEDFANISSIFFISYLIFQIPSSIGLQKLGARK 84
IL FF+ ++ + + + + P +++ F +++ I +LG ++
Sbjct: 21 ILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKR 80

Query: 85 W--ISSIIIGWGAVTGLIFFAKDTQHIL-LARIFLGVFEAGFFPGMVYYLACWFPARERG 141
II +G+V G F +L +AR G A F ++ +A + P RG
Sbjct: 81 LLLFGIIINCFGSVIG--FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRG 138

Query: 142 KVNSFFMLSIAVASVLAAPMSGWIIEHLNTPDYEGWRWLFAIEGIPTVFLGILTFYLLP 200
K +A+ + + G I ++ W +L I I T+ LL
Sbjct: 139 KAFGLIGSIVAMGEGVGPAIGGMIAHYI------HWSYLLLIPMI-TIITVPFLMKLLK 190


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_07235  HTHFIS  80  1e-17  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.9 bits (197), Expect = 1e-17
Identities = 29/104 (27%), Positives = 47/104 (45%)

Query: 827 ILVVDDHPINRRLLADQLGSLGYQCKTANDGVDALNVLSKNAIDIVLSDVNMPNMDGYRL 886
ILV DD R +L L GY + ++ ++ D+V++DV MP+ + + L
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 887 TQRIRQLGLTLPVVGVTANALAEEKQRCLESGMDSCLSKPVTLD 930
RI++ LPV+ ++A + E G L KP L
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLT 109


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_07240  HTHFIS  48  8e-09  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 47.9 bits (114), Expect = 8e-09
Identities = 26/145 (17%), Positives = 60/145 (41%), Gaps = 20/145 (13%)

Query: 1 MNNMNVIIADDHPIVLFGIRKSLEQIEWVNVVGEFEDSTALINNLPKLDAHVLITDLSMP 60
M +++ADD + + ++L + + + ++ L + D +++TD+ MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRI--TSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 61 GDKYGDGITLIKYIKRHFPSLSIIVLTMNNNPAILSAVLDLDIEGIVLKQGA------PT 114
+ L+ IK+ P L ++V++ N +A+ ++GA P
Sbjct: 59 D---ENAFDLLPRIKKARPDLPVLVMSAQNTFM--TAIKA-------SEKGAYDYLPKPF 106

Query: 115 DLPKALAALQKGKKFTPESVSRLLE 139
DL + + + + S+L +
Sbjct: 107 DLTELIGIIGRALAEPKRRPSKLED 131


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_07250  ECOLIPORIN  534  0.0  E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 534 bits (1378), Expect = 0.0
Identities = 260/389 (66%), Positives = 297/389 (76%), Gaps = 17/389 (4%)

Query: 1 MKVKVLSLLVPALLVAGAANAAEIYNKDGNKLDLFGKVDGLHYFSDDKGSDGDQTYMRIG 60
MK KVL+L++PALL AGAA+AAEIYNKDGNKLDL+GKVDGLHYFSDD DGDQTYMR+G
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMRVG 60

Query: 61 FKGETQVNDQLTGYGQWEYQIQGNQTEG-SNDSWTRVAFAGLKFADAGSFDYGRNYGVTY 119
FKGETQ+NDQLTGYGQWEY +Q N TEG +SWTR+AFAGLKF D GSFDYGRNYGV Y
Sbjct: 61 FKGETQINDQLTGYGQWEYNVQANTTEGEGANSWTRLAFAGLKFGDYGSFDYGRNYGVLY 120

Query: 120 DVTSWTDVLPEFGGDTYG-ADNFMQQRGNGYATYRNTDFFGLVDGLDFALQYQGKNGSVS 178
DV WTD+LPEFGGD+Y ADN+M R NG ATYRNTDFFGLVDGL+FALQYQGKN S S
Sbjct: 121 DVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNESQS 180

Query: 179 GENDG--------GRSLLNQNGDGYGGSLTYAIGEGFSVGGAITTSKRTADQNNTADARL 230
++ G + NGDG+G S TY IG GFS G A TTS RT +Q N
Sbjct: 181 ADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNA--GGT 238

Query: 231 YGNGDRATVYTGGLKYDANNIYLAAQYSQTYNATRFGTSNGNNKSDSYGFANKAQNFEVV 290
GD+A +T GLKYDANNIYLA YS+T N T +G ++ G ANK QNFEV
Sbjct: 239 IAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGY---DGGVANKTQNFEVT 295

Query: 291 AQYQFDFGLRPSVAYLQSKGKDISNGYGASYGDQDIVKYVDVGATYYFNKNMSTYVDYKI 350
AQYQFDFGLRP+V++L SKGKD++ + D+D+VKY DVGATYYFNKN STYVDYKI
Sbjct: 296 AQYQFDFGLRPAVSFLMSKGKDLTYN-NVNGDDKDLVKYADVGATYYFNKNFSTYVDYKI 354

Query: 351 NLLDKND-FTRDAGINTDDIVALGLVYQF 378
NLLD +D F +DAGI+TDDIVALG+VYQF
Sbjct: 355 NLLDDDDPFYKDAGISTDDIVALGMVYQF 383


81  GX95_07915  GX95_07955  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_079150131.837485two-component system response regulator BaeR
GX95_07925-2143.593978two-component system sensor histidine kinase
GX95_07930-1154.085311multidrug transporter subunit MdtD
GX95_07935-1154.408871multidrug transporter subunit MdtC
GX95_07945-1133.289113multidrug transporter subunit MdtB
GX95_07950-1122.819385multidrug transporter subunit MdtA
GX95_07955-1132.450281molecular chaperone
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_07925  HTHFIS  75  7e-18  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 75.3 bits (185), Expect = 7e-18
Identities = 27/140 (19%), Positives = 65/140 (46%), Gaps = 2/140 (1%)

Query: 11 PRILIVEDEPKLGQLLIDYLRAASYAPTLINHGDKVLPYVRQTPPDLILLDLMLPGTDGL 70
IL+ +D+ + +L L A Y + ++ + ++ DL++ D+++P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 71 TLCREIR-RFSDIPIVMVTAKIEEIDRLLGLEIGADDYICKPYSPREVVARVKTIL-RRC 128
L I+ D+P+++++A+ + + E GA DY+ KP+ E++ + L
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 129 KPQRELQQQDAESPLMIDES 148
+ +L+ + ++ S
Sbjct: 124 RRPSKLEDDSQDGMPLVGRS 143


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_07930  BCTERIALGSPF  31  0.010  Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 31.0 bits (70), Expect = 0.010
Identities = 20/66 (30%), Positives = 26/66 (39%), Gaps = 14/66 (21%)

Query: 187 RGLLAPVKRLVEGTHRLAAGDFTTRVTPTSADEL-----------GKLAQDFNQLASTLE 235
L+A V+ V H LA + P S + L G L N+LA E
Sbjct: 104 SQLMAAVRSKVMEGHSLAD---AMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTE 160

Query: 236 KNQQMR 241
+ QQMR
Sbjct: 161 QRQQMR 166


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_07935  TCRTETB  124  3e-33  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 124 bits (313), Expect = 3e-33
Identities = 95/450 (21%), Positives = 199/450 (44%), Gaps = 25/450 (5%)

Query: 20 FMQSLDTTIVNTALPSMAKSLGESPLHMHMVVVSYVLTVAVMLPASGWLADKIGVRNIFF 79
F L+ ++N +LP +A + P + V +++LT ++ G L+D++G++ +
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 80 AAIVLFTLGSLFCALSGT-LNQLVLARVLQGVGGAMMVPVGRLTVMKIVPRAQYMAAMTF 138
I++ GS+ + + + L++AR +QG G A + + V + +P+ A
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 139 VTLPGQIGPLLGPALGGVLVEYASWHWIFLINIPVGIVGAIATFM-LMPNYTIETRRFDL 197
+ +G +GPA+GG++ Y HW +L+ IP+ + + M L+ FD+
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHY--IHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201

Query: 198 PGFLLLAIGMAVLTLALDGSKSMGISPWTLAGLAAGGAAAILLYLLHAKKNSGALFSLRL 257
G +L+++G+ L + L + L+++ H +K + L
Sbjct: 202 KGIILMSVGIVFFMLFTTSYSISFLIVSVL---------SFLIFVKHIRKVTDPFVDPGL 252

Query: 258 FRTPTFSLGLLGSFAGRIGSGMLPFMTPVFLQIGLGFSPFHAG-LMMIPMVLGSMGMKRI 316
+ F +G+L M P ++ S G +++ P + + I
Sbjct: 253 GKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYI 312

Query: 317 VVQIVNRFGYRRVLVATTLGLALVSLLFMSVALL----GWYYLLPLVLLLQGMVNSARFS 372
+V+R G VL +G+ +S+ F++ + L W+ + +V +L G+ S +
Sbjct: 313 GGILVDRRGPLYVL---NIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGL--SFTKT 367

Query: 373 SMNTLTLKDLPDTLASSGNSLLSMIMQLSMSIGVTIAGMLL--GMFGQQHIGIDSSATHH 430
++T+ L A +G SLL+ LS G+ I G LL + Q+ + ++ + +
Sbjct: 368 VISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLDQRLLPMEVDQSTY 427

Query: 431 VFMYTWLCMAVIIALPAIIFARVPNDTQQN 460
++ L + II + ++ V +Q++
Sbjct: 428 LYSNLLLLFSGIIVISWLVTLNVYKHSQRD 457


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_07940  ACRIFLAVINRP  881    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 881 bits (2277), Expect = 0.0
Identities = 282/1035 (27%), Positives = 502/1035 (48%), Gaps = 36/1035 (3%)

Query: 6 LFIYRPVATILIAAAITLCGILGFRLLPVAPLPQVDFPVIMVSASLPGASPETMASSVAT 65
FI RP+ ++A + + G L LPVA P + P + VSA+ PGA +T+ +V
Sbjct: 4 FFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQ 63

Query: 66 PLERSLGRIAGVNEMTSSS-SLGSTRIILEFNFDRDINGAARDVQAAINAAQSLLPGGMP 124
+E+++ I + M+S+S S GS I L F D + A VQ + A LLP +
Sbjct: 64 VIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQ 123

Query: 125 SRPTYRKANPSDAPIMILTLTSES--WSQGKLYDFASTQLAQTIAQIDGVGDVDVGGSSL 182
+ S + +M+ S++ +Q + D+ ++ + T+++++GVGDV + G+
Sbjct: 124 -QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQY 182

Query: 183 PAVRVGLNPQALFNQGVSLDEVREAIDSANVRRPQGAIEDSV------HRWQIQTNDELK 236
A+R+ L+ L ++ +V + N + G + + I K
Sbjct: 183 -AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 237 TAAEYQPLIIHYN-NGAAVRLGDVASVTDSVQDVRNAGMTNANPAILLMIRKLPEANIIQ 295
E+ + + N +G+ VRL DVA V ++ N PA L I+ AN +
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 296 TVDGIRAKLPELRAMIPAAIDLQIAQDRSPTIRASLQEVEETLAISVALVILVVFLFLRS 355
T I+AKL EL+ P + + D +P ++ S+ EV +TL ++ LV LV++LFL++
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 356 GRATLIPAVAVPVSLIGTFAAMYLCGFSLNNLSLMALTIATGFVVDDAIVVLENIARHL- 414
RATLIP +AVPV L+GTFA + G+S+N L++ + +A G +VDDAIVV+EN+ R +
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 415 EAGMKPLQAALQGTREVGFTVISMSLSLVAVFLPLLLMGGLPGRLLREFAVTLSVAIGIS 474
E + P +A + ++ ++ +++ L AVF+P+ GG G + R+F++T+ A+ +S
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 475 LVVSLTLTPMMCGWMLKSSKPRTQPRKRGVG----RLLVALQQGYGTSLKWVLNHTRLVG 530
++V+L LTP +C +LK K G Y S+ +L T
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYL 541

Query: 531 VVFLGTVALNIWLYIAIPKTFFPEQDTGVLMGGIQADQSISFQ----AMRGKLQDFMKII 586
+++ VA + L++ +P +F PE+D GV + IQ + + + ++K
Sbjct: 542 LIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNE 601

Query: 587 RD-DPAVNNVTGFT-GGSRVNSGMMFITLKPRGER---KETAQQIIDRLRVKLAKEPGAR 641
+ +V V GF+ G N+GM F++LKP ER + +A+ +I R +++L K
Sbjct: 602 KANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGF 661

Query: 642 LFLMAVQDIRVGGRQANASYQYTLLSDSLAALREWEPKIRKALSAL-----PQLADVNSD 696
+ + I G ++ L D + + R L + L V +
Sbjct: 662 VIPFNMPAIVELGTATGFDFE---LIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 697 QQDNGAEMNLIYDRDTMSRLGIDVQAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRY 756
++ A+ L D++ LG+ + N ++ A G ++ K+ ++ D ++
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 757 SQDISALEKMFVINRDGKAIPLSYFAQWRPANAPLSVNHQGLSAASTIAFNLPTGTSLSQ 816
++K++V + +G+ +P S F + + I GTS
Sbjct: 779 RMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGD 838

Query: 817 ATEAINRTMTQLGVPPTVRGSFSGTAQVFQQTMNSQLILIVAAIATVYIVLGILYESYVH 876
A + ++L P + ++G + + + N L+ + V++ L LYES+
Sbjct: 839 AMALMENLASKL--PAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSI 896

Query: 877 PLTILSTLPSAGVGALLALELFNAPFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRSGG 936
P++++ +P VG LLA LFN + ++G++ IG+ KNAI++V+FA + G
Sbjct: 897 PVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEG 956

Query: 937 LTPEQAIFQACLLRFRPIMMTTLAALFGALPLVLSGGDGSELRQPLGITIVGGLVMSQLL 996
+A A +R RPI+MT+LA + G LPL +S G GS + +GI ++GG+V + LL
Sbjct: 957 KGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLL 1016

Query: 997 TLYTTPVVYLFFDRL 1011
++ PV ++ R
Sbjct: 1017 AIFFVPVFFVVIRRC 1031


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_07945  ACRIFLAVINRP  890    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 890 bits (2302), Expect = 0.0
Identities = 292/1036 (28%), Positives = 504/1036 (48%), Gaps = 29/1036 (2%)

Query: 13 SRLFILRPVATTLLMAAILLAGIIGYRFLPVAALPEVDYPTIQVVTLYPGASPDVMTSAV 72
+ FI RP+ +L +++AG + LPVA P + P + V YPGA + V
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 73 TAPLERQFGQMSGLKQMSSQS-SGGASVVTLQFQLTLPLDVAEQEVQAAINAATNLLPSD 131
T +E+ + L MSS S S G+ +TL FQ D+A+ +VQ + AT LLP +
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE 121

Query: 132 LPNPPIYSKVNPADPPIMTLAVTSNAMPMTQVE--DMVETRVAQKISQVSGVGLVTLAGG 189
+ I S + +M S+ TQ + D V + V +S+++GVG V L G
Sbjct: 122 VQQQGI-SVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 190 QRPAVRVKLNAQAVAALGLTSETVRTAITGANVNSAKGSLDGP------ERAVTLSANDQ 243
Q A+R+ L+A + LT V + N A G L G + ++ A +
Sbjct: 181 QY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 244 MQSADEYRRLII-AYQNGAPVRLGDVATVEQGAENSWLGAWANQAPAIVMNVQRQPGANI 302
++ +E+ ++ + +G+ VRL DVA VE G EN + A N PA + ++ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 303 IATADSIRQMLPQLTESLPKSVKVTVLSDRTTNIRASVRDTQFELMLAIALVVMIIYLFL 362
+ TA +I+ L +L P+ +KV D T ++ S+ + L AI LV +++YLFL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 363 RNIPATIIPGVAVPLSLIGTFAVMVFLDFSINNLTLMALTIATGFVVDDAIVVIENISRY 422
+N+ AT+IP +AVP+ L+GTFA++ +SIN LT+ + +A G +VDDAIVV+EN+ R
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 423 I-EKGEKPLAAALKGAGEIGFTIISLTFSLIAVLIPLLFMGDIVGRLFREFAVTLAVAIL 481
+ E P A K +I ++ + L AV IP+ F G G ++R+F++T+ A+
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 482 ISAVVSLTLTPMMCARML---SQQSLRKQNRFSRACERMFDRVIASYGRGLAKVLNHPWL 538
+S +V+L LTP +CA +L S + + F FD + Y + K+L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 539 TLSVAFATLLLSVMLWIVIPKGFFPVQDNGIIQGTLQAPQSSSYASMAQRQRQVAERILQ 598
L + + V+L++ +P F P +D G+ +Q P ++ + QV + L+
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 599 DPA--VQSLTTFVGVDGANPTLNSARLQINLKPLDARDDR---VQQVISRLQTAVATIPG 653
+ V+S+ T G + N+ ++LKP + R+ + VI R + + I
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRD 659

Query: 654 VALYLQPTQDLTIDTQVSRTQYQFTLQ---ATTLDALSHWVPKL-QNALQSLPQLSEVSS 709
++ P I + T + F L DAL+ +L A Q L V
Sbjct: 660 G--FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRP 717

Query: 710 DWQDRGLAAWVNVDRDSASRLGISMADVDNALYNAFGQRLISTIYTQANQYRVVLEHNTA 769
+ + + VD++ A LG+S++D++ + A G ++ + ++ ++ +
Sbjct: 718 NGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAK 777

Query: 770 STPGLAALETIRLTSRDGGTVPLSAIARIEQRFAPLSINHLDQFPVTTFSFNVPEGYSLG 829
++ + + S +G VP SA + + + P G S G
Sbjct: 778 FRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSG 837

Query: 830 DAVQAILDTEKTLALPADITTQFQGSTLAFQAALGSTVWLIVAAVVAMYIVLGVLYESFI 889
DA+ + + LPA I + G + + + L+ + V +++ L LYES+
Sbjct: 838 DAMALMENLAS--KLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWS 895

Query: 890 HPITILSTLPTAGVGALLALIIAGSELDIIAIIGIILLIGIVKKNAIMMIDFALAAEREQ 949
P++++ +P VG LLA + + D+ ++G++ IG+ KNAI++++FA ++
Sbjct: 896 IPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKE 955

Query: 950 GMSPRDAIFQACLLRFRPILMTTLAALLGALPLMLSTGVGAELRRPLGIAMVGGLLVSQV 1009
G +A A +R RPILMT+LA +LG LPL +S G G+ + +GI ++GG++ + +
Sbjct: 956 GKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATL 1015

Query: 1010 LTLFTTPVIYLLFDRL 1025
L +F PV +++ R
Sbjct: 1016 LAIFFVPVFFVVIRRC 1031


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
GX95_07950  RTXTOXIND  41     6e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 41.4 bits (97), Expect = 6e-06
Identities = 36/172 (20%), Positives = 71/172 (41%), Gaps = 10/172 (5%)

Query: 123 KVALAQAQGQLAKDNATLANARRDLARYQQ---LAKTNLVSRQELDAQQAL--VNETQGT 177
K A+ + + + + L + L + + AK +L + L + +T
Sbjct: 251 KHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDN 310

Query: 178 IKADEANVASAQLQLDWSRITAPVSGRV-GLKQVDVGNQISSSDTAGIVVITQTHPIDLI 236
I +A + + S I APVS +V LK G +++++T +V++ + +++
Sbjct: 311 IGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETL-MVIVPEDDTLEVT 369

Query: 237 FTLPESDIATVVQAQKAGKALVVEAWDRTNSHKL-SEGVLLSLDNQIDPTTG 287
+ DI + Q A + VEA+ T L + ++LD D G
Sbjct: 370 ALVQNKDIGFINVGQNA--IIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLG 419



Score = 40.6 bits (95), Expect = 8e-06
Identities = 20/122 (16%), Positives = 46/122 (37%), Gaps = 13/122 (10%)

Query: 79 GTVTAA-NTVTVRSRVDGQLIALHFQEGQQVNAGDLLAQIDPSQFKVALAQAQGQLAKDN 137
G +T + + ++ + + + +EG+ V GD+L ++ + + Q
Sbjct: 88 GKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQ------- 140

Query: 138 ATLANARRDLARYQQLAKTNLVSRQELDAQQALVNETQGTIKADEANVASAQLQLDWSRI 197
++L AR + RYQ L+++ EL+ L + + L +
Sbjct: 141 SSLLQARLEQTRYQILSRS-----IELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQF 195

Query: 198 TA 199
+
Sbjct: 196 ST 197


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_07955  SHAPEPROTEIN  51     5e-09    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 50.5 bits (121), Expect = 5e-09
Identities = 32/129 (24%), Positives = 56/129 (43%), Gaps = 20/129 (15%)

Query: 132 AMMVHIRHTAHSQ-LPEAITQAVIGRPINFQGLGGDDANRQAQGILERAAKRAGFQEVVF 190
M+ H HS + ++ P+ R+A + +A+ AG +EV
Sbjct: 89 KMLQHFIKQVHSNSFMRPSPRVLVCVPVGA-----TQVERRA---IRESAQGAGAREVFL 140

Query: 191 QYEPVAAGLDYEATLREEKRVLVVDIGGGTTDCSMLLMGPQWRQRADRENSLLGHSGCRV 250
EP+AA + + E +VVDIGGGTT+ +++ + ++ S R+
Sbjct: 141 IEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLN-----------GVVYSSSVRI 189

Query: 251 GGNDLDIAL 259
GG+ D A+
Sbjct: 190 GGDRFDEAI 198



Score = 35.1 bits (81), Expect = 5e-04
Identities = 33/137 (24%), Positives = 55/137 (40%), Gaps = 23/137 (16%)

Query: 332 RLSYRLV---RCAEESKIALSG--QADITARLPFISDDLA------VAISQQGLEAALDQ 380
R +Y + AE K + D + +LA ++ + AL +
Sbjct: 203 RRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGRNLAEGVPRGFTLNSNEILEALQE 262

Query: 381 PLARILEQVQLALDSAQEKPDV--------IYLTGGSARSPLIKKALSEQLPGIPVAGGD 432
PL I+ V +AL+ Q P++ + LTGG A + + L E+ GIPV +
Sbjct: 263 PLTGIVSAVMVALE--QCPPELASDISERGMVLTGGGALLRNLDRLLMEET-GIPVVVAE 319

Query: 433 D-FGSVTAGLARWAEVV 448
D V G + E++
Sbjct: 320 DPLTCVARGGGKALEMI 336


82  GX95_08130  GX95_08155  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
GX95_08130  2       24       -8.253492  UDP-glucose 4-epimerase GalE
GX95_08135  -1      14       -5.137243  ISAs1 family transposase
GX95_08140  -3      13       -1.997705  phosphogluconate dehydrogenase
GX95_08145  -3      13       -1.396606  UDP-glucose 6-dehydrogenase
GX95_08150  -2      14       0.583732   LPS O-antigen chain length determinant protein
GX95_08155  -2      16       1.676955   bifunctional phosphoribosyl-AMP
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_08130  NUCEPIMERASE  183    2e-57    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 183 bits (465), Expect = 2e-57
Identities = 81/351 (23%), Positives = 153/351 (43%), Gaps = 39/351 (11%)

Query: 1 MAILVTGGAGYIGTHTIISLLDKGYDIVVIDNFSNSSKDAL---TQVEKISAKKINFYHG 57
M LVTG AG+IG H LL+ G+ +V IDN N D ++E ++ F+
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNL-NDYYDVSLKQARLELLAQPGFQFHKI 59

Query: 58 DIRDRHVLKDIFSQNSISDVIHFAGLKAVGESVTSPLKYYDNNISGTLCLLNEMLLFNVK 117
D+ DR + D+F+ V AV S+ +P Y D+N++G L +L ++
Sbjct: 60 DLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQ 119

Query: 118 SIIFSSSATVYGVPKTLPIKESDPLGQITNPYGRTKIMAENILKDLTKAIPDFRATILRY 177
++++SS++VYG+ + +P D + + Y TK E + + + AT LR+
Sbjct: 120 HLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSH-LYGLPATGLRF 178

Query: 178 FNPVGAHPSGLIGENPKGIPN-NLFPVISQTIAGRNQSVNIFGSDYATPDGTGIRDYIHV 236
F G P G P+ LF + G+ S++++ G RD+ ++
Sbjct: 179 FTVYG----------PWGRPDMALFKFTKAMLEGK--SIDVYN------YGKMKRDFTYI 220

Query: 237 MDLAAGHFSALDKQREGKN---------------FKVYNLGTGKGYSVLQIIKEFENQIN 281
D+A D ++VYN+G ++ I+ E+ +
Sbjct: 221 DDIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALG 280

Query: 282 KKIKIDLCPRREGDIAQCWSSPDLALEELSWKANLSLEDMIRDTLNWLSKY 332
+ K ++ P + GD+ + + E + + +++D +++ +NW +
Sbjct: 281 IEAKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDF 331


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
GX95_08135  adhesinmafb  29     0.028    Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 29.3 bits (65), Expect = 0.028
Identities = 11/49 (22%), Positives = 19/49 (38%), Gaps = 6/49 (12%)

Query: 296 ASAMRSHWDVENKLHWRLDVAMNEDDCRIRRGNSAELFSGIRHIAVNIL 344
S D N+ + + ++ R GNS E +G+ A+N
Sbjct: 197 GSNFSDRADEANRKMFEHNAKLD------RWGNSMEFINGVAAGALNPF 239


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
GX95_08150  IGASERPTASE  32     0.005    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 31.6 bits (71), Expect = 0.005
Identities = 17/84 (20%), Positives = 36/84 (42%), Gaps = 1/84 (1%)

Query: 138 STTAEGAQRRLAEYIQQVDEEVAKELEVDLKDNITLQTKTLQESLETQEVVAQEQKDLRI 197
+T R +A+ + + + EV + T +T+T E+ ET V +E+ +
Sbjct: 1058 ATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQT-TETKETATVEKEEKAKVET 1116

Query: 198 KQIEEALRYADEAKITQPQIQQTQ 221
++ +E + + Q Q + Q
Sbjct: 1117 EKTQEVPKVTSQVSPKQEQSETVQ 1140


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_08155  PF07201  28     0.024    Hypersensitivity response secretion protein HrpJ
		>PF07201#Hypersensitivity response secretion protein HrpJ

Length = 293

Score = 27.9 bits (62), Expect = 0.024
Identities = 19/102 (18%), Positives = 35/102 (34%), Gaps = 13/102 (12%)

Query: 105 FGDASHQWLFLYQLEQLLAERKTADPASSYTAKLYASGTKRIAQKVGEE---GVETALAA 161
+ S Q+ L L L R P ++ + L +A++ GE G A
Sbjct: 129 SEEPSEQFKMLCGLRDALKGR----PELAHLSHLVEQALVSMAEEQGETIVLGARITPEA 184

Query: 162 TVNDRFELTNEAS--DLMYHLLVLLQDQDLNLTAVIDNLRKR 201
+ + D ++ Q + A+ +L+KR
Sbjct: 185 YRESQSGVNPLQPLRDTYRDAVMGYQ----GIYAIWSDLQKR 222


83  GX95_08620  GX95_08700  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
GX95_08620  -1      13       0.471899   flagellar biosynthetic protein FliR
GX95_08625  -2      15       1.269786   flagellar export apparatus protein FliQ
GX95_08630  -2      16       3.321279   flagellar biosynthetic protein FliP
GX95_08635  0       15       3.262689   flagellar biosynthetic protein FliO
GX95_08640  -2      15       3.767225   flagellar motor switch protein FliN
GX95_08645  -1      16       4.437560   flagellar motor switch protein FliM
GX95_08650  1       15       4.748507   flagellar basal body-associated protein FliL
GX95_08655  0       12       4.682470   flagellar hook-length control protein FliK
GX95_08660  -1      13       3.953351   flagellar biosynthesis chaperone FliJ
GX95_08665  -2      12       3.503723   flagellum-specific ATP synthase FliI
GX95_08670  -1      14       2.192564   flagellar assembly protein FliH
GX95_08675  -2      16       1.884626   flagellar motor switch protein FliG
GX95_08680  -2      13       2.089847   flagellar M-ring protein FliF
GX95_08685  -1      14       -0.110069  flagellar hook-basal body complex protein FliE
GX95_08690  -2      15       -0.719743  hypothetical protein
GX95_08695  -3      14       -0.171535  SirA-like protein
GX95_08700  -3      11       -1.108384  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_08620  TYPE3IMRPROT  213    5e-71    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 213 bits (543), Expect = 5e-71
Identities = 231/260 (88%), Positives = 246/260 (94%)

Query: 1 MIQVTSEQWLYWLHLYFWPLLRVLALISTAPILSERAIPKRVKLGLGIMITLVIAPSLPA 60
M+QVTSEQWL WL+LYFWPLLRVLALISTAPILSER++PKRVKLGL +MIT IAPSLPA
Sbjct: 1 MLQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60

Query: 61 NDTPLFSIAALWLAMQQILIGIALGFTMQFAFAAVRTAGEFIGLQMGLSFATFVDPGSHL 120
ND P+FS ALWLA+QQILIGIALGFTMQFAFAAVRTAGE IGLQMGLSFATFVDP SHL
Sbjct: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120

Query: 121 NMPVLARIMDMLAMLLFLTFNGHLWLISLLVDTFHTLPIGSNPVNSNAFMALARAGGLIF 180
NMPVLARIMDMLA+LLFLTFNGHLWLISLLVDTFHTLPIG P+NSNAF+AL +AG LIF
Sbjct: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180

Query: 181 LNGLMLALPVITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGIMLMAALMPLIAPFC 240
LNGLMLALP+ITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGI LMAALMPLIAPFC
Sbjct: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240

Query: 241 EHLFSEIFNLLADIVSEMPI 260
EHLFSEIFNLLADI+SE+P+
Sbjct: 241 EHLFSEIFNLLADIISELPL 260


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_08625  TYPE3IMQPROT  67     1e-18    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 67.5 bits (165), Expect = 1e-18
Identities = 23/78 (29%), Positives = 42/78 (53%)

Query: 4 ESVMMMGTEAMKVALALAAPLLLVALITGLIISILQAATQINEMTLSFIPKIVAVFIAII 63
+ ++ G +A+ + L L+ +VA I GL++ + Q TQ+ E TL F K++ V + +
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 VAGPWMLNLLLDYVRTLF 81
+ W +LL Y R +
Sbjct: 62 LLSGWYGEVLLSYGRQVI 79


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_08630  FLGBIOSNFLIP  330    e-117    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 330 bits (847), Expect = e-117
Identities = 225/245 (91%), Positives = 233/245 (95%)

Query: 1 MRRLLFLSLAGLWLFSPAAAAQLPGLISQPLAGGGQSWSLSVQTLVFITSLTFLPAILLM 60
MRRLL ++ LWL +P A AQLPG+ SQPL GGGQSWSL VQTLVFITSLTF+PAILLM
Sbjct: 1 MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM 60

Query: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEQK 120
MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSE+K
Sbjct: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK 120

Query: 121 ISMQEALDKGAQPLRAFMLRQTREADLALFARLANSGPLQGPEAVPMRILLPAYVTSELK 180
ISMQEAL+KGAQPLR FMLRQTREADL LFARLAN+GPLQGPEAVPMRILLPAYVTSELK
Sbjct: 121 ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK 180

Query: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240
TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA
Sbjct: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240

Query: 241 QSFYS 245
QSFYS
Sbjct: 241 QSFYS 245


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_08640  FLGMOTORFLIN  209    2e-73    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 209 bits (534), Expect = 2e-73
Identities = 136/137 (99%), Positives = 136/137 (99%)

Query: 1 MSDMNNPSDENTGALDDLWADALNEQKATTNKSAADAVFQQLGGGDVSGAMQDIDLIMDI 60
MSDMNNPSDENTGALDDLWADALNEQKATT KSAADAVFQQLGGGDVSGAMQDIDLIMDI
Sbjct: 1 MSDMNNPSDENTGALDDLWADALNEQKATTTKSAADAVFQQLGGGDVSGAMQDIDLIMDI 60

Query: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120
PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV
Sbjct: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120

Query: 121 RITDIITPSERMRRLSR 137
RITDIITPSERMRRLSR
Sbjct: 121 RITDIITPSERMRRLSR 137


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_08645  FLGMOTORFLIM  384    e-136    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 384 bits (987), Expect = e-136
Identities = 86/324 (26%), Positives = 148/324 (45%), Gaps = 10/324 (3%)

Query: 5 ILSQAEIDALLNGDS--DTKDEPTPGIASDSDIRPYDPNTQRRVVRERLQALEIINERFA 62
+LSQ EID LL S D E I+ I YD + +E+++ L +++E FA
Sbjct: 4 VLSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHETFA 63

Query: 63 RQFRMGLFNLLRRSPDITVGAIRIQPYHEFARNLPVPTNLNLIHLKPLRGTGLVVFSPSL 122
R L LR + V ++ Y EF R++P P+ L +I + PL+G ++ PS+
Sbjct: 64 RLTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEVDPSI 123

Query: 123 VFIAVDNLFGGDGRFPTKVEGREFTHTEQRVINRMLKLALEGYSDAWKAINPLEVEYVRS 182
F +D LFGG G+ KV+ R+ T E V+ ++ L ++W + L +
Sbjct: 124 TFSIIDRLFGGTGQ-AAKVQ-RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRLGQI 181

Query: 183 EMQVKFTNITTSPNDIVVNTPFHVEIGNLTGEFNICLPFSMIEPLRELLVNPPLENS--R 240
E +F I P+++VV ++G G N C+P+ IEP+ L + +S R
Sbjct: 182 ETNPQFAQI-VPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSVRR 240

Query: 241 HEDQNWRDNLVRQVQHSELELVANFADIPLRLSQILKLKPGDVLPIEKP---DRIIAHVD 297
+ L ++ ++++VA + L + IL L+ GD++ + D + +
Sbjct: 241 SSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPFVLSIG 300

Query: 298 GVPVLTSQYGTVNGQYALRVEHLI 321
Q G V + A ++ I
Sbjct: 301 NRKKFLCQPGVVGKKIAAQILERI 324


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
GX95_08655  FLGHOOKFLIK  405    e-142    Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 405 bits (1041), Expect = e-142
Identities = 193/411 (46%), Positives = 232/411 (56%), Gaps = 40/411 (9%)

Query: 1 MITLPQLITTDTDMTAGLTSGKTTGSAEDFLALLAGALGADGAQGKDARITLADLQAAGG 60
MI L LIT D D T L GK + +A+DFLALL+ AL + K A L
Sbjct: 1 MIRLAPLITADVDTTT-LPGGKASDAAQDFLALLSEALAGETTTDKAAPQLL-------- 51

Query: 61 KLSKGLLTQHGEPGQAVKLADLLAQKAN---ATDETLTDLTQAQHLLSTLTPSLKTSALA 117
++ T GEP + ++D AQ+AN DET + Q + LT + + A
Sbjct: 52 -VATDKPTTKGEPLISDIVSD--AQQANLLIPVDETPPVINDEQSTSTPLTTAQTMALAA 108

Query: 118 ALSKTAQHDEKTPALSDEDLASLSALFAMLPGQPVATPVAGETPAENHIALPSLLRGDMP 177
K DEK L+++ ASLSALFAMLPG V D P
Sbjct: 109 VADKNTTKDEKADDLNEDVTASLSALFAMLPGFDNTPKVT-----------------DAP 151

Query: 178 SAPQEETHTLSFSEHEKGKTEASLTRASDDRATGPVLTPLVVAAAATSAKVEVDSPSAPV 237
S F++ T LT A D A G PL A +K EV S +PV
Sbjct: 152 STVLPTEKPTLFTK----LTSEQLTTAQPDDAPGTPAQPLTPLVAEAQSKAEVISTPSPV 207

Query: 238 THGAAMPTLSSATAQPQPLPVASAPVLSAPLGSHEWQQTFSQQVMLFTRQGQQSAQLRLH 297
T AA P ++ QP LP +APVLSAPLGSHEWQQ+ SQ + LFTRQGQQSA+LRLH
Sbjct: 208 T-AAASPLITPHQTQP--LPTVAAPVLSAPLGSHEWQQSLSQHISLFTRQGQQSAELRLH 264

Query: 298 PEELGQVHISLKLDDNQAQLQMVSPHSHVRAALEAALPMLRTQLAESGIQLGQSSISSES 357
P++LG+V ISLK+DDNQAQ+QMVSPH HVRAALEAALP+LRTQLAESGIQLGQS+IS ES
Sbjct: 265 PQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAALPVLRTQLAESGIQLGQSNISGES 324

Query: 358 FAGQQQ-SSSQQQSSRAQHTDAFGAEDDIALAAPASLQAAARGNGAVDIFA 407
F+GQQQ +S QQQS R + + EDD L P SLQ GN VDIFA
Sbjct: 325 FSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVSLQGRVTGNSGVDIFA 375


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_08660  FLGFLIJ  206    4e-72    Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 206 bits (526), Expect = 4e-72
Identities = 130/147 (88%), Positives = 138/147 (93%)

Query: 1 MAQHGALETLKDLAEKEVDDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRSNLNTDMGNG 60
MA+HGAL TLKDLAEKEV+DAARLLGEMRRGCQQAEEQLKMLIDYQNEYR+NLN+DM G
Sbjct: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60

Query: 61 IASNRWINYQQFIQTLEKAIEQHRLQLTQWTQKVDLALKSWREKKQRLQAWQTLQDRQTA 120
I SNRWINYQQFIQTLEKAI QHR QL QWTQKVD+AL SWREKKQRLQAWQTLQ+RQ+
Sbjct: 61 ITSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120

Query: 121 AALLAENRMDQKKMDEFAQRAAMRKPE 147
AALLAENR+DQKKMDEFAQRAAMRKPE
Sbjct: 121 AALLAENRLDQKKMDEFAQRAAMRKPE 147


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_08670  FLGFLIH  367    e-133    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 367 bits (944), Expect = e-133
Identities = 192/235 (81%), Positives = 209/235 (88%), Gaps = 7/235 (2%)

Query: 1 MSNELPWQVWTPDDLAPPPETFVPVEADNVTLTDDTPEPELTAEQQLEQELAQLKIQAHE 60
MS+ LPW+ WTPDDLAPP FVP+ T+ ++ AE LEQ+LAQL++QAHE
Sbjct: 1 MSDNLPWKTWTPDDLAPPQAEFVPIVEPEETIIEE-------AEPSLEQQLAQLQMQAHE 53

Query: 61 QGYNAGLAEGRQKGHAQGYQEGLAQGLEQGQAQAQTQQAPIHARMQQLVSEFQNTLDALD 120
QGY AG+AEGRQ+GH QGYQEGLAQGLEQG A+A++QQAPIHARMQQLVSEFQ TLDALD
Sbjct: 54 QGYQAGIAEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALD 113

Query: 121 SVIASRLMQMALEAARQVIGQTPAVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRV 180
SVIASRLMQMALEAARQVIGQTP VDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRV
Sbjct: 114 SVIASRLMQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRV 173

Query: 181 EEMLGATLSLHGWRLRGDPTLHHGGCKVSADEGDLDASVATRWQELCRLAAPGVL 235
++MLGATLSLHGWRLRGDPTLH GGCKVSADEGDLDASVATRWQELCRLAAPGV+
Sbjct: 174 DDMLGATLSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 228


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_08675  FLGMOTORFLIG  339    e-118    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 339 bits (870), Expect = e-118
Identities = 114/329 (34%), Positives = 196/329 (59%), Gaps = 2/329 (0%)

Query: 1 MSNLSGTDKSVILLMTIGEDRAAEVFKHLSTREVQALSTAMANVRQISNKQLTDVLSEFE 60
+S L+G K+ ILL++IG + +++VFK+LS E+++L+ +A + I+++ +VL EF+
Sbjct: 12 VSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFK 71

Query: 61 QEAEQFAALNINANEYLRSVLVKALGEERASSLLEDILETRDTTSGIETLNFMEPQSAAD 120
+ + +Y R +L K+LG ++A ++ + L + + E + +P + +
Sbjct: 72 ELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINN-LGSALQSRPFEFVRRADPANILN 130

Query: 121 LIRDEHPQIIATILVHLKRSQAADILALFDERLRHDVMLRIATFGGVQPAALAELTEVLN 180
I+ EHPQ IA IL +L +A+ IL+ ++ +V RIA P + E+ VL
Sbjct: 131 FIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLE 190

Query: 181 GLLDGQ-NLKRSKMGGVRTAAEIINLMKTQQEEAVITAVREFDGELAQKIIDEMFLFENL 239
L + + GGV EIIN+ + E+ +I ++ E D ELA++I +MF+FE++
Sbjct: 191 KKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDI 250

Query: 240 VDVDDRSIQRLLQEVDSESLLIALKGAEPPLREKFLRNMSQRAADILRDDLANRGPVRLS 299
V +DDRSIQR+L+E+D + L ALK + P++EK +NMS+RAA +L++D+ GP R
Sbjct: 251 VLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRK 310

Query: 300 QVENEQKAILLIVRRLAETGEMVIGSGED 328
VE Q+ I+ ++R+L E GE+VI G +
Sbjct: 311 DVEESQQKIVSLIRKLEEQGEIVISRGGE 339


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_08680  FLGMRINGFLIF  786    0.0      Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 786 bits (2031), Expect = 0.0
Identities = 558/559 (99%), Positives = 559/559 (100%)

Query: 2 SATASTATQPKPLEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQ 61
SATASTATQPKPLEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQ
Sbjct: 1 SATASTATQPKPLEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQ 60

Query: 62 DGGAIVAQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKF 121
DGGAIVAQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKF
Sbjct: 61 DGGAIVAQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKF 120

Query: 122 GISQFSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLE 181
GISQFSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLE
Sbjct: 121 GISQFSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLE 180

Query: 182 PGRALDEGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDLNDAQLKFANDV 241
PGRALDEGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDLNDAQLKFANDV
Sbjct: 181 PGRALDEGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDLNDAQLKFANDV 240

Query: 242 ESRIQRRIEAILSPIVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNIS 301
ESRIQRRIEAILSPIVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNIS
Sbjct: 241 ESRIQRRIEAILSPIVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNIS 300

Query: 302 EQVGAGYPGGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRNTQRN 361
EQVGAGYPGGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPR+TQRN
Sbjct: 301 EQVGAGYPGGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRN 360

Query: 362 ETSNYEVDRTIRHTKMNVGDIERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMG 421
ETSNYEVDRTIRHTKMNVGDIERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMG
Sbjct: 361 ETSNYEVDRTIRHTKMNVGDIERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMG 420

Query: 422 FSDKRGDTLNVVNSPFSAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVR 481
FSDKRGDTLNVVNSPFSAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVR
Sbjct: 421 FSDKRGDTLNVVNSPFSAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVR 480

Query: 482 PQLTRRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSD 541
PQLTRRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSD
Sbjct: 481 PQLTRRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSD 540

Query: 542 NDPRVVALVIRQWMSNDHE 560
NDPRVVALVIRQWMSNDHE
Sbjct: 541 NDPRVVALVIRQWMSNDHE 559


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
GX95_08685  FLGHOOKFLIE  114    1e-36    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 114 bits (286), Expect = 1e-36
Identities = 90/103 (87%), Positives = 96/103 (93%)

Query: 2 AAIQGIEGVISQLQATAMAARGQDTHSQSTVSFAGQLHAALDRISDRQTAARVQAEKFTL 61
+AIQGIEGVISQLQATAM+AR Q++ Q T+SFAGQLHAALDRISD QTAAR QAEKFTL
Sbjct: 1 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL 60

Query: 62 GEPGIALNDVMADMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 104
GEPG+ALNDVM DMQKASVSMQMGIQVRNKLVAAYQEVMSMQV
Sbjct: 61 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_08695  PF01206  93     6e-29    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 92.5 bits (230), Expect = 6e-29
Identities = 16/71 (22%), Positives = 37/71 (52%)

Query: 7 DYRLDMVGEPCPYPAVATLEAMPQLKKGEILEVVSDCPQSINNIPLDARNHGYTVLDIQQ 66
D LD G CP P + + + + GE+L V++ P S+ + ++ G+ +L+ ++
Sbjct: 5 DQSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKE 64

Query: 67 DGPTIRYLIQK 77
+ T + +++
Sbjct: 65 EDGTYHFRLKR 75


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
GX95_08700  RTXTOXIND  29     0.025    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.4 bits (66), Expect = 0.025
Identities = 10/53 (18%), Positives = 17/53 (32%), Gaps = 2/53 (3%)

Query: 164 RFTLLPIFRIPVKMQKVSAASPLTQKPDQARRRF--RLGMLVFIGMIGWALLT 214
R L R + + + A L + P R R M + ++L
Sbjct: 26 RKQLDTPVREKDENEFLPAHLELIETPVSRRPRLVAYFIMGFLVIAFILSVLG 78


84  GX95_08900  GX95_08945  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias   Product
GX95_08900  -1      10       0.942995  flagellar motor stator protein MotA
GX95_08905  -1      10       1.583735  flagellar motor protein MotB
GX95_08910  -1      11       1.410917  chemotaxis protein CheA
GX95_08915  -2      12       1.623422  chemotaxis protein CheW
GX95_08920  -1      10       1.945783  methyl-accepting chemotaxis protein II
GX95_08925  -2      12       2.031463  chemotaxis protein-glutamate
GX95_08930  -1      14       2.690832  chemotaxis response regulator protein-glutamate
GX95_08935  -2      13       1.993084  two-component system response regulator
GX95_08940  -2      12       1.407898  protein phosphatase CheZ
GX95_08945  -2      11       0.760812  flagellar biosynthesis protein FlhB
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_08900  PF05844  32  0.002  YopD protein
		>PF05844#YopD protein

Length = 295

Score = 31.9 bits (72), Expect = 0.002
Identities = 12/28 (42%), Positives = 21/28 (75%), Gaps = 2/28 (7%)

Query: 76 MDLLALLYRLMAKSRQQGMFSLERDIEN 103
++LL +L+R+ K+R+ G+ L+RD EN
Sbjct: 74 VELLLILFRIAQKARELGV--LQRDNEN 99


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_08905  OMPADOMAIN  42  1e-06  OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 42.2 bits (99), Expect = 1e-06
Identities = 25/118 (21%), Positives = 46/118 (38%), Gaps = 11/118 (9%)

Query: 162 FKTGSAEVEPYMRDILRAIAPVL---NGIPNRISLAGHTDDFPYANGEKGYSNWELSADR 218
F A ++P + L + L + + + G+TD G Y N LS R
Sbjct: 223 FNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRI----GSDAY-NQGLSERR 277

Query: 219 ANASRRELVAGGLDNGKVLRVVGMAATMRLSDRGPDDAINRR--ISLLVLNKQAEQAI 274
A + L++ G+ K+ GM + ++ D+ R I L +++ E +
Sbjct: 278 AQSVVDYLISKGIPADKI-SARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_08910  PF06580  42  7e-06  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 41.8 bits (98), Expect = 7e-06
Identities = 23/151 (15%), Positives = 49/151 (32%), Gaps = 52/151 (34%)

Query: 376 ELDKSLIERIIDPLT--HLVRNSLDHGIEMPEKRLEAGKNVVGNLILSAEHQGGNICIEV 433
+++ ++++ + P+ LV N + HGI G ++L G + +EV
Sbjct: 245 QINPAIMDVQVPPMLVQTLVENGIKHGIA--------QLPQGGKILLKGTKDNGTVTLEV 296

Query: 434 TDDGAGLNRERILAKAMSQGMAVNENMTDDEVGMLIFAPGFSTAEQVTDVSGRGVGMDVV 493
+ G+ + G G+ V
Sbjct: 297 ENTGSLALKNTK--------------------------------------ESTGTGLQNV 318

Query: 494 KRNIQEMGG---HVEIQSKQGSGTTIRILLP 521
+ +Q + G +++ KQG +L+P
Sbjct: 319 RERLQMLYGTEAQIKLSEKQG-KVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_08930  HTHFIS  64  1e-13  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 64.5 bits (157), Expect = 1e-13
Identities = 30/141 (21%), Positives = 61/141 (43%), Gaps = 6/141 (4%)

Query: 1 MSKIRVLSVDDSALMRQIMTEIINSHSDMEMVATAPDPLVARDLIKKFNPDVLTLDVEMP 60
M+ +L DD A +R ++ + ++ V + I + D++ DV MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAG--YDVRITSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 61 RMDGLDFLEKLMRLRPMPVVMVSSLTGKGS-EVTLRALELGAIDFVTKPQLGIREGMLAY 119
+ D L ++ + RP V+V ++ + + ++A E GA D++ KP + E +
Sbjct: 59 DENAFDLLPRIKKARPDLPVLV--MSAQNTFMTAIKASEKGAYDYLPKP-FDLTELIGII 115

Query: 120 SEMIAEKVRTAARARIAAHKP 140
+AE R ++ +
Sbjct: 116 GRALAEPKRRPSKLEDDSQDG 136


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_08935  HTHFIS  89  7e-24  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 88.7 bits (220), Expect = 7e-24
Identities = 29/105 (27%), Positives = 51/105 (48%), Gaps = 3/105 (2%)

Query: 7 KFLVVDDFSTMRRIVRNLLKELGFNNVEEAEDGVDALNKLQAGGFGFIISDWNMPNMDGL 66
LV DD + +R ++ L G++ V + + AG +++D MP+ +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYD-VRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 67 ELLKTIRADSAMSALPVLMVTAEAKKENIIAAAQAGASGYVVKPF 111
+LL I+ LPVL+++A+ I A++ GA Y+ KPF
Sbjct: 64 DLLPRIKKARPD--LPVLVMSAQNTFMTAIKASEKGAYDYLPKPF 106


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_08945  TYPE3IMSPROT  420  e-149  Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 420 bits (1082), Expect = e-149
Identities = 102/351 (29%), Positives = 179/351 (50%), Gaps = 14/351 (3%)

Query: 7 DDKTEAPTPHRLEKAREEGQIPRSRELTSLLILLVGVCIIWFGGESLARQLAGMLSAGLH 66
+KTE PTP ++ AR++GQ+ +S+E+ S +++ ++ + + ++ +
Sbjct: 3 GEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLML--IP 60

Query: 67 FDHRMVNDPNLILGQIILLIKAAMMALLPLIAGVVLVALISPVMLGGLIFSGKSLQPKFS 126
+ + + + ++ PL+ L+A+ S V+ G + SG++++P
Sbjct: 61 AEQSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIK 120

Query: 127 KLNPLPGIKRMFSAQTGAELLKAVLKSTLVGCVTGFYLWHNWPQMMRLMAESPIVAMGNA 186
K+NP+ G KR+FS ++ E LK++LK L+ + + N +++L P +
Sbjct: 121 KINPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQL----PTCGIECI 176

Query: 187 LDLVGLCALLVVLGVIPMVGF------DVFFQIFSHLKKLRMSRQDIRDEFKESEGDPHV 240
L+G +L L VI VGF D F+ + ++K+L+MS+ +I+ E+KE EG P +
Sbjct: 177 TPLLG--QILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEI 234

Query: 241 KGKIRQMQRAAAQRRMMEDVPKADVIVTNPTHYSVALQYDENKMSAPKVVAKGAGLIALR 300
K K RQ + R M E+V ++ V+V NPTH ++ + Y + P V K
Sbjct: 235 KSKRRQFHQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQT 294

Query: 301 IREIGAEHRVPTLEAPPLARALYRHAEIGQQIPGQLYAAVAEVLAWVWQLK 351
+R+I E VP L+ PLARALY A + IP + A AEVL W+ +
Sbjct: 295 VRKIAEEEGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQN 345


85  GX95_09740  GX95_09760  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_09740-2111.125681invasin
GX95_09745-2161.519035two-component system response regulator NarL
GX95_09750-1171.657515two-component system sensor histidine kinase
GX95_09755-1211.944456hypothetical protein
GX95_09760-2201.891033nitrate/nitrite transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_09740  INTIMIN  248  1e-75  Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 248 bits (635), Expect = 1e-75
Identities = 126/444 (28%), Positives = 216/444 (48%), Gaps = 24/444 (5%)

Query: 22 SFSLSLLLLTASGTICAQAQDPFDQNRL----PDLGMMPESHEGEKHFAEMAKAFGEASM 77
F S L L S + A N+L PD+ + + ++A A + +
Sbjct: 118 PFEYSALPLLGSAPLVAAGGVAGHTNKLTKMSPDVTKSNMTDDKALNYAAQQAASLGSQL 177

Query: 78 KNNDLDTGEQARQFAFGQVRDVVSEQVNQQLESWLSAWGSASVDINVDNEGHFNGSRGSW 137
++ L+ G+ A+ A G + Q + QL++WL +G+A V++ N F+GS +
Sbjct: 178 QSRSLN-GDYAKDTALG----IAGNQASSQLQAWLQHYGTAEVNLQSGNN--FDGSSLDF 230

Query: 138 FIPLQDKQRYLTWSQLGLTQQTDGLVSNIGVGQRWAQDGWLLGYNTFYDNLLDENLQRAG 197
+P D ++ L + Q+G +N+G GQR+ +LGYN F D + R G
Sbjct: 231 LLPFYDSEKMLAFGQVGARYIDSRFTANLGAGQRFFLPENMLGYNVFIDQDFSGDNTRLG 290

Query: 198 FGAEAWGEYLRLSANYYQPFADWQT--HTATLEQRMARGYDINAQMRLPFYQHINTSVSL 255
G E W +Y + S N Y + W + ++R A G+DI LP Y + +
Sbjct: 291 IGGEYWRDYFKSSVNGYFRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYPALGAKLMY 350

Query: 256 EQYFGDSVDLFDSGTGYHNPVALKLGLNYTPVPLLTMTAQHKQGESGVSQNNLGLTLNYR 315
EQY+GD+V LF+S NP A +G+NYTP+PL+TM ++ G + + Y+
Sbjct: 351 EQYYGDNVALFNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLLYSMQFRYQ 410

Query: 316 FGVPLKKQLAASEVAQSQSLRGSRYDTPQRNSLPTMEYRQRKTLTVFLATPPWDLTPGET 375
F P +Q+ V + ++L GSRYD QRN+ +EY+++ L++ + + T T
Sbjct: 411 FDKPWSQQIEPQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNI-PHDINGTERST 469

Query: 376 VALKLQVRSVHGIRHLSWQGDTQALSLTAG----TDTRSTEGWTIIMPAWDHREGAANRW 431
++L V+S +G+ + W D AL G + ++S + + I+PA+ +G +N +
Sbjct: 470 QKIQLIVKSKYGLDRIVW--DDSALRSQGGQIQHSGSQSAQDYQAILPAY--VQGGSNVY 525

Query: 432 RLSVVVEDEKGQRVSSNEITLALT 455
+++ D G SSN + L +T
Sbjct: 526 KVTARAYDRNGN--SSNNVLLTIT 547


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_09745  HTHFIS  72  5e-17  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 72.2 bits (177), Expect = 5e-17
Identities = 33/117 (28%), Positives = 56/117 (47%), Gaps = 2/117 (1%)

Query: 7 ATILLIDDHPMLRTGVKQLVSMAPDISVVGEASNGEQGIDLAESLDPDLILLDLNMPGMN 66
ATIL+ DD +RT + Q +S A V SN + D DL++ D+ MP N
Sbjct: 4 ATILVADDDAAIRTVLNQALSRA-GYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 67 GLETLDKLREKALSGRIVVFSVSNHEEDVVTALKRGADGYLLKDMEPEDLLKALQQA 123
+ L ++++ ++V S N + A ++GA YL K + +L+ + +A
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA 118


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_09750  PF06580  51  4e-09  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 51.4 bits (123), Expect = 4e-09
Identities = 30/123 (24%), Positives = 54/123 (43%), Gaps = 17/123 (13%)

Query: 473 SARFGFTVKLDYQLPPRL----VPSHQAIHLLQIAREALSNALKH-----SHADDVVVTV 523
S +F ++ + Q+ P + VP L+Q E N +KH +++
Sbjct: 233 SIQFEDRLQFENQINPAIMDVQVPPM----LVQTLVE---NGIKHGIAQLPQGGKILLKG 285

Query: 524 TQCGKQVKLKVQDNGCGVPENAERSNHYGMIIMRDRAQSLRG-DCQVRRRETGGTEVTVT 582
T+ V L+V++ G +N + S G+ +R+R Q L G + Q++ E G +
Sbjct: 286 TKDNGTVTLEVENTGSLALKNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345

Query: 583 FIP 585
IP
Sbjct: 346 LIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_09760  TCRTETB  30  0.026  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 29.8 bits (67), Expect = 0.026
Identities = 18/58 (31%), Positives = 28/58 (48%), Gaps = 1/58 (1%)

Query: 128 TPFSTFIIISLLCGFAGANF-ASSMANISFFFPKQKQGGALGLNGGLGNMGVSVMQLV 184
+ FS I+ + G A F A M ++ + PK+ +G A GL G + MG V +
Sbjct: 101 SFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAI 158


86  GX95_10265  GX95_10300  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_10265028-6.510676oxidoreductase
GX95_10270230-6.114164TetR family transcriptional regulator
GX95_10275430-5.967961hypothetical protein
GX95_10285534-7.836560hypothetical protein
GX95_10290432-7.312164hypothetical protein
GX95_10295229-4.907672hypothetical protein
GX95_10300127-4.548831invasin
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_10270  DHBDHDRGNASE  86  2e-22  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 86.3 bits (213), Expect = 2e-22
Identities = 68/249 (27%), Positives = 112/249 (44%), Gaps = 24/249 (9%)

Query: 7 KSVLVLGGSRGIGAAIVRRFSADGASVV-FSYAGSR----EAAEKLAAETGSTAIQTDSA 61
K + G ++GIG A+ R ++ GA + Y + ++ K A + A D
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARH-AEAFPADVR 67

Query: 62 DRDAIISLV----REYGPLDILVVNAGVALFGDALEQDSDAIDRLFRINIHAPYHASVEA 117
D AI + RE GP+DILV AGV G + + F +N ++AS
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 118 ARNMP--EGGRIIIIGSVNGDRMPVPGMAAYAASKSALQGLARGLARDFGPRGITINVVQ 175
++ M G I+ +GS N +P MAAYA+SK+A + L + I N+V
Sbjct: 128 SKYMMDRRSGSIVTVGS-NPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVS 186

Query: 176 PGPIDTDI--------NPEDGPMKELMHSF---MAIKRHGRPEEVAGMVAWLAGPEASFV 224
PG +TD+ N + +K + +F + +K+ +P ++A V +L +A +
Sbjct: 187 PGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHI 246

Query: 225 TGAMHTIDG 233
T +DG
Sbjct: 247 TMHNLCVDG 255


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_10275  HTHTETR  45  3e-08  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 45.4 bits (107), Expect = 3e-08
Identities = 14/115 (12%), Positives = 40/115 (34%), Gaps = 5/115 (4%)

Query: 6 SRTPGRPRQFDPEQAIETAQHLFHSRGYDAVSVADLTKAFGINPPSFYAAFGSKLGLYTR 65
+T ++ + ++ A LF +G + S+ ++ KA G+ + Y F K L++
Sbjct: 3 RKTKQEAQE-TRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE 61

Query: 66 VLK----RYRMTDAIPLGALLRHDRPTAKCLIDVLMEAARRYAADPDATGCLVLE 116
+ + + + ++ ++E+ + +
Sbjct: 62 IWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHK 116


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_10280  adhesinb  28  0.002  Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 28.3 bits (63), Expect = 0.002
Identities = 10/48 (20%), Positives = 18/48 (37%), Gaps = 6/48 (12%)

Query: 1 MQKCSLITVLSLSVLMLAGCTTTYTMTTRTGDIIETQGKPEVDTATGM 48
M+KC + +L L+ + LA C++ K V +
Sbjct: 1 MKKCRFLVLLLLAFVGLAACSSQ------KSSTETGSSKLNVVATNSI 42


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_10300  INTIMIN  217  2e-62  Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 217 bits (554), Expect = 2e-62
Identities = 117/409 (28%), Positives = 187/409 (45%), Gaps = 21/409 (5%)

Query: 29 SDNEIQSWIAGTASSISPHLQEGTLE-DYAKGKIKALPGQAANHLVNEGMKSAFPAIIFR 87
+D++ ++ A A+S+ LQ +L DYAK + G A+ + ++ A
Sbjct: 158 TDDKALNYAAQQAASLGSQLQSRSLNGDYAKDTALGIAGNQASSQLQAWLQHYGTA---- 213

Query: 88 GGVNLEDGAKYRSSEFDMFIPVEETTSSLLFGQLGFRDHDSSSFDGRTYVNVGVGYRQEV 147
VNL+ G + S D +P ++ L FGQ+G R DS R N+G G R +
Sbjct: 214 -EVNLQSGNNFDGSSLDFLLPFYDSEKMLAFGQVGARYIDS-----RFTANLGAGQRFFL 267

Query: 148 NGWLLGVNTFLDADIRYSHLRGGIGGEVYKDSLAFSGNYYFPLTGWKTSAVHELHDERPA 207
+LG N F+D D + R GIGGE ++D S N YF ++GW S + +DERPA
Sbjct: 268 PENMLGYNVFIDQDFSGDNTRLGIGGEYWRDYFKSSVNGYFRMSGWHESYNKKDYDERPA 327

Query: 208 YGFDLRTKGTLPDFPWFSGELTYEQYYGDKVDLLGNGTLSRNPRAAGAALVWNPVPLLEV 267
GFD+R G LP +P +L YEQYYGD V L + L NP AA + + P+PL+ +
Sbjct: 328 NGFDIRFNGYLPSYPALGAKLMYEQYYGDNVALFNSDKLQSNPGAATVGVNYTPIPLVTM 387

Query: 268 RAGYRDAGNGGSQAEGGLRVNYSFGTPLHEQLDYRNV-GAPSNTTNRRAFVDRNYDIVMA 326
YR + ++ Y F P +Q++ + V + + +R V RN +I++
Sbjct: 388 GIDYRHGTGNENDLLYSMQFRYQFDKPWSQQIEPQYVNELRTLSGSRYDLVQRNNNIILE 447

Query: 327 YREQAS-KIRITAMPVSGLSGTLVTLMATVDSRYPIEKVEWSGDAELLAGLQLQGSLGSG 385
Y++Q + I ++G + + V S+Y ++++ W A G Q+Q S
Sbjct: 448 YKKQDILSLNIPH-DINGTERSTQKIQLIVKSKYGLDRIVWDDSALRSQGGQIQHSGSQS 506

Query: 386 -----LILPQLPLTATDGQEYSLYLTVTDSRGTRVTSERIPVRVTQDET 429
ILP Y + D G + + + V +
Sbjct: 507 AQDYQAILP--AYVQGGSNVYKVTARAYDRNGNSSNNVLLTITVLSNGQ 553


87  GX95_11555  GX95_11605  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_11555019-4.779498Bcr/CflA family multidrug efflux transporter
GX95_11560023-6.575269cyclopropane-fatty-acyl-phospholipid synthase
GX95_11565127-7.487547riboflavin synthase subunit alpha
GX95_11570334-8.234330MATE family efflux transporter
GX95_11585545-11.284828**EscU/YscU/HrcU family type III secretion system
GX95_11590442-10.299808EscT/YscT/HrcT family type III secretion system
GX95_11595240-6.818491EscS/YscS/HrcS family type III secretion system
GX95_11600135-6.648906EscR/YscR/HrcR family type III secretion system
GX95_11605235-6.568457type III secretion system protein SsaQ
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_11555  TCRTETB  76  3e-17  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 76.5 bits (188), Expect = 3e-17
Identities = 48/194 (24%), Positives = 84/194 (43%), Gaps = 3/194 (1%)

Query: 8 LVWLAGLSVLGFLATDMYLPAFAAIQADLQTPAAAVSASLSLFLAGFAVAQLLWGPLSDR 67
L+WL LS L + + I D P A+ + + F+ F++ ++G LSD+
Sbjct: 16 LIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQ 75

Query: 68 YGRKPILLLGLSIFALGSLGMLWVESAAALLTL-RFVQAVGVCAATVIWQALVTDYYPSQ 126
G K +LL G+ I GS+ S +LL + RF+Q G A + +V Y P +
Sbjct: 76 LGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKE 135

Query: 127 KINRIFATIMPLVGLSPALAPLLGSWILTHFSWQAIFATLFVITLLLMLPALRLKPSVKA 186
+ F I +V + + P +G I + W + L + ++ +P L +
Sbjct: 136 NRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLL--LIPMITIITVPFLMKLLKKEV 193

Query: 187 RTEGQDKLTFATLL 200
R +G + L+
Sbjct: 194 RIKGHFDIKGIILM 207


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_11585  TYPE3IMSPROT  385  e-136  Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 385 bits (991), Expect = e-136
Identities = 125/350 (35%), Positives = 202/350 (57%), Gaps = 4/350 (1%)

Query: 2 SEKTEQPTEKKLRDGRKEGQVVKSIEITSLFQLIALYLYFHFFTEKMILILIASITFTLQ 61
EKTEQPT KK+RD RK+GQV KS E+ S ++AL ++ + +
Sbjct: 3 GEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLIPAE 62

Query: 62 LVNKPFSYALTQLT-HALIESLTSALLFLGAGVIVATVGSVFLQVGVVIASKAIGFKSEH 120
PFS AL+ + + L+E L ++A + S +Q G +I+ +AI +
Sbjct: 63 QSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMA-IASHVVQYGFLISGEAIKPDIKK 121

Query: 121 INPVSNFKQIFSLHSVVELCKSSLKVIMLSLIFAFFFYYYASTFRALPYCGLACGLLVVS 180
INP+ K+IFS+ S+VE KS LKV++LS++ T LP CG+ C ++
Sbjct: 122 INPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLLG 181

Query: 181 SLIKWLWVGVMAFYIVVGILDYSFQYYKIRKDLKMSKDDVKQEHKDLEGDPQMKTRRREM 240
+++ L V ++V+ I DY+F+YY+ K+LKMSKD++K+E+K++EG P++K++RR+
Sbjct: 182 QILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQF 241

Query: 241 QSEIQSGSLAQSVKQSVAVVRNPTHIAVCLGYHPTDMPIPRVLEKGSDAQANYIVNIAER 300
EIQS ++ ++VK+S VV NPTHIA+ + Y + P+P V K +DAQ + IAE
Sbjct: 242 HQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIAEE 301

Query: 301 NCIPVVENVELARSLFFEVERGDKIPETLFEPVAALLRMVMK--IDYAHS 348
+P+++ + LAR+L+++ IP E A +LR + + I+ HS
Sbjct: 302 EGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQNIEKQHS 351


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_11590  TYPE3IMRPROT  164  3e-52  Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 164 bits (418), Expect = 3e-52
Identities = 54/229 (23%), Positives = 100/229 (43%), Gaps = 5/229 (2%)

Query: 8 WLIALAVAFIRPLSLSLLLPLLKSGSLGSALLRNGVLMSLTFPILPIIYQQKIMMHIGKD 67
WL +R L+L P+L S+ ++ G+ M +TF I P + + +
Sbjct: 12 WLNLYFWPLLRVLALISTAPILSERSVPK-RVKLGLAMMITFAIAPSLPANDVPVF---S 67

Query: 68 YSWLGLVTGEVIIGFLIGFCAAVPFWAVDMAGFLLDTLRGATMGTIFNSTMEAETSLFGL 127
+ L L +++IG +GF F AV AG ++ G + T + +
Sbjct: 68 FFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHLNMPVLAR 127

Query: 128 LFSQFLCVIFFISGGMEFILNILYESYQYLPPGRTLLFDRQFLKYIQAEWRTLYQLCISF 187
+ ++F G +++++L +++ LP G L FL +A ++ +
Sbjct: 128 IMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSL-IFLNGLML 186

Query: 188 SVPAIICMVLADLALGLLNRSAQQLNVFFFSMPLKSILVLLTLLISFPY 236
++P I ++ +LALGLLNR A QL++F PL + + + P
Sbjct: 187 ALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPL 235


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_11595  TYPE3IMQPROT  72  9e-21  Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 72.5 bits (178), Expect = 9e-21
Identities = 30/85 (35%), Positives = 50/85 (58%)

Query: 4 SELTQFVTQLLWIVLFTSMPVVLVASVVGVIVSLVQALTQIQDQTLQFMIKLLAIAITLM 63
+L + L++VL S +VA+++G++V L Q +TQ+Q+QTL F IKLL + + L
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 VSYPWLSGILLNYTRQIMLRIGEHG 88
+ W +LL+Y RQ++ G
Sbjct: 62 LLSGWYGEVLLSYGRQVIFLALAKG 86


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_11600  TYPE3IMPPROT  231  9e-80  Type III secretion system inner membrane P protein ...
		>TYPE3IMPPROT#Type III secretion system inner membrane P protein family signature.

Length = 224

Score = 231 bits (592), Expect = 9e-80
Identities = 79/215 (36%), Positives = 130/215 (60%), Gaps = 8/215 (3%)

Query: 8 LQLIGILFLLSILPLIIVMGTSFLKLAVVFSILRNALGIQQVPPNIALYGLALVLSLFIM 67
+ LI +L ++LP II GT F+K ++VF ++RNALG+QQ+P N+ L G+AL+LS+F+M
Sbjct: 5 ISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLSMFVM 64

Query: 68 GPTLLAVKERWHPVQVAGAPFWT-SEWDSKALAPYRQFLQKNSEEKEANYFRNLIKRTWP 126
P + + V + S+ + L YR +L K S+ + +F N +
Sbjct: 65 WPIMHDAYVYFEDEDVTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFFENAQLKRQY 124

Query: 127 ED-------IKRKIKPDSLLILIPAFTVSQLTQAFRIGLLIYLPFLAIDLLISNILLAMG 179
+ K +I+ S+ L+PA+ +S++ AF+IG +YLPF+ +DL++S++LLA+G
Sbjct: 125 GEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLVVSSVLLALG 184

Query: 180 MMMVSPMTISLPFKLLIFLLAGGWDLTLAQLVQSF 214
MMM+SP+TIS P KL++F+ GW L L+ +
Sbjct: 185 MMMMSPVTISTPIKLVLFVALDGWTLLSKGLILQY 219


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_11605  FLGMOTORFLIN  51  3e-10  Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 51.1 bits (122), Expect = 3e-10
Identities = 21/67 (31%), Positives = 38/67 (56%)

Query: 247 LEQIPQQVLFEIGRASLEIGQLRQLKTGDVLPVGGCFAPEVTIRVNDRIIGQGELIACGN 306
+ IP ++ E+GR + I +L +L G V+ + G + I +N +I QGE++ +
Sbjct: 57 IMDIPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVAD 116

Query: 307 EFMVRIT 313
++ VRIT
Sbjct: 117 KYGVRIT 123


88  GX95_11725  GX95_11760  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_11725233-8.223244EscC/YscC/HrcC family type III secretion system
GX95_11730231-7.234123pathogenicity island chaperone protein SpiC
GX95_11735028-5.969249hybrid sensor histidine kinase/response
GX95_117401180.093266DNA-binding response regulator
GX95_117450152.323485helix-turn-helix-type transcriptional regulator
GX95_117500132.554311hypothetical protein
GX95_11755-1123.280031hypothetical protein
GX95_11760-1121.848363DNA-binding response regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_11725  TYPE3OMGPROT  581  0.0  Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 581 bits (1500), Expect = 0.0
Identities = 158/500 (31%), Positives = 261/500 (52%), Gaps = 15/500 (3%)

Query: 11 LLFILNTAKSDELSWKGNDFTLYARQMPLAEVLHLLSENYDTAITISPLITATFSGKIPP 70
LL + + + + EL W + A+ L ++L NYD + +S I SG+
Sbjct: 17 LLLLSSYSWAQELDWLPIPYVYVAKGESLRDLLTDFGANYDATVVVSDKINDKVSGQFEH 76

Query: 71 GPPVDILNNLAAQYDLLTWFDGSMLYVYPASLLKHQVITFNILSTGRFIHYLRSQNILSS 130
P D L ++A+ Y+L+ ++DG++LY++ S + ++I L+ I
Sbjct: 77 DNPQDFLQHIASLYNLVWYYDGNVLYIFKNSEVASRLIRLQESEAAELKQALQRSGIWE- 135

Query: 131 PGCEVKEITGTRAVEVSGVPSCLTRISQLASVLDNALIKR--KDSAVSVSIYTLKYATAM 188
P + R V VSG P L + Q A+ L+ R K A+++ I+ LKYA+A
Sbjct: 136 PRFGWRPDASNRLVYVSGPPRYLELVEQTAAALEQQTQIRSEKTGALAIEIFPLKYASAS 195

Query: 189 DTQYQYRDQSVVVPGVVSVL-REMSKTSVPASSTNN-----GSPATQALPMFAADPRQNA 242
D YRD V PGV ++L R +S ++ + +N + A ADP NA
Sbjct: 196 DRTIHYRDDEVAAPGVATILQRVLSDATIQQVTVDNQRIPQAATRASAQARVEADPSLNA 255

Query: 243 VIVRDYAANMAGYRKLITELDQRQQMIEISVKIIDVNAGDINQLGIDWGTAVSLGG---- 298
+IVRD M Y++LI LD+ IE+++ I+D+NA + +LG+DW + G
Sbjct: 256 IIVRDSPERMPMYQRLIHALDKPSARIEVALSIVDINADQLTELGVDWRVGIRTGNNHQV 315

Query: 299 --KKIAFNTGLNDGGASGFSTVISDTSNFMVRLNALEKSSQAYVLSQPSVVTLNNIQAVL 356
K + + GA G + R+N LE A V+S+P+++T N QAV+
Sbjct: 316 VIKTTGDQSNIASNGALGSLVDARGLDYLLARVNLLENEGSAQVVSRPTLLTQENAQAVI 375

Query: 357 DKNITFYTKLQGEKVAKLESITTGSLLRVTPRLLNDNGTQKIMLNLNIQDGQQSDTQSET 416
D + T+Y K+ G++VA+L+ IT G++LR+TPR+L +I LNL+I+DG Q S
Sbjct: 376 DHSETYYVKVTGKEVAELKGITYGTMLRMTPRVLTQGDKSEISLNLHIEDGNQKPNSSGI 435

Query: 417 DSLPEVQNSEIASQATLLAGQSLLLGGFKQGKQIHSQNKIPLLGDIPVVGHLFRNDTTQV 476
+ +P + + + + A + GQSL++GG + + + +K+PLLGDIP +G LFR +
Sbjct: 436 EGIPTISRTVVDTVARVGHGQSLIIGGIYRDELSVALSKVPLLGDIPYIGALFRRKSELT 495

Query: 477 HSVIRLFLIKASVVNNGISH 496
+RLF+I+ +++ GI+H
Sbjct: 496 RRTVRLFIIEPRIIDEGIAH 515


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_11735  HTHFIS  68  6e-14  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 68.3 bits (167), Expect = 6e-14
Identities = 31/156 (19%), Positives = 57/156 (36%), Gaps = 13/156 (8%)

Query: 691 ILLVDDADINRDIIGKMLVSLGQHVTIAASSNEALTLSQQQRFDLVLIDIRMPEIDGIEC 750
IL+ DD R ++ + L G V I +++ DLV+ D+ MP+ + +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 751 VQLWHDEPNNLDPDCMFVALSASVATEDIHRCKKNGIHHYITKPVTLATLARYISIAAEY 810
+ PD + +SA + + G + Y+ KP L L
Sbjct: 66 LP----RIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIG-------- 113

Query: 811 QLLRNIELQEQDPSRCSALLAT-DDMVINSKIFQSL 845
+ R + ++ PS+ +V S Q +
Sbjct: 114 IIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEI 149


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_11740  HTHFIS  66  6e-15  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 66.0 bits (161), Expect = 6e-15
Identities = 28/119 (23%), Positives = 50/119 (42%), Gaps = 2/119 (1%)

Query: 1 MKEYKILLVDDHEIIINGIMNALLPWPHFKIVEHVKNGLEVYNACCAYEPDILILDLSLP 60
M IL+ DD I + AL + V N ++ A + D+++ D+ +P
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGY--DVRITSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 61 GINGLDIIPQLHQRWPAMNILVYTAYQQEYMTIKTLAAGANGYVLKSSSQQVLLAALQT 119
N D++P++ + P + +LV +A IK GA Y+ K L+ +
Sbjct: 59 DENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_11760  HTHFIS  84  2e-21  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 84.5 bits (209), Expect = 2e-21
Identities = 31/127 (24%), Positives = 56/127 (44%)

Query: 2 ATIHLLDDDTAVTNACAFLLESLGYDVKCWTQGADFLAQASLYQAGVVLLDMRMPVLDGQ 61
ATI + DDD A+ L GYDV+ + A + +V+ D+ MP +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 GVHDALRQCGSTLAVVFLTGHGDVPMAVEQMKRGAVDFLQKPVSVKPLQAALERALTVSS 121
+ +++ L V+ ++ A++ ++GA D+L KP + L + RAL
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 122 AAVARRE 128
++ E
Sbjct: 124 RRPSKLE 130


89  GX95_13095  GX95_13145  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_130951152.312525ribonuclease E
GX95_131001152.492150flagellar hook-filament junction protein FlgL
GX95_13110-2142.294690flagellar hook-associated protein FlgK
GX95_13115-1163.304501flagellar rod assembly protein/muramidase FlgJ
GX95_131201153.168066flagellar biosynthesis protein FlgA
GX95_131252142.660115flagellar basal body L-ring protein
GX95_131302152.567161flagellar basal-body rod protein FlgG
GX95_131351122.851885flagellar biosynthesis protein FlgF
GX95_131401131.435680flagellar hook protein FlgE
GX95_131450171.245347flagellar basal body rod modification protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_13100  IGASERPTASE  55  2e-09  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 54.7 bits (131), Expect = 2e-09
Identities = 49/259 (18%), Positives = 92/259 (35%), Gaps = 26/259 (10%)

Query: 513 PSEEEYAERKRPEQPALATFAMPDVPPAPTPVEPAVSVATAKKDNVAAAQPAQPGLFSRF 572
P E+ + DVP P+ + A+ D PA P S
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSN-----NEEIARVDEAPVPPPA-PATPSET 1036

Query: 573 LNALKQLFSGEETKTVETAAPKAEEKAERQQDRRKPRQNNRRDRNERRDTRDNRAGRDGG 632
S +E+KTVE A E + ++ K ++N + +T+ N + G
Sbjct: 1037 TE-TVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKA-----NTQTNEVAQSGS 1090

Query: 633 ESRDDNRRNRRQAQQQNAEAR---DTRQQETAEKVKTGDEQQQTPRRERSRRRNDDKRQA 689
E+++ ++ E + +T + + KV + Q +P++E+S A
Sbjct: 1091 ETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTS----QVSPKQEQSETVQPQAEPA 1146

Query: 690 QQEVKALNREELPVQETEQEERVQQVQPRRKQRQLNQKVRFTNSAVVETVDTPVVVDEPR 749
++ +N +E Q + QP ++ N + T S V T ++ V E
Sbjct: 1147 RENDPTVNIKEPQSQTNTTAD---TEQPAKETSS-NVEQPVTESTTVNTGNSVVENPENT 1202

Query: 750 PVENVEQPVPAPRTELAKV 768
+ P +E +
Sbjct: 1203 TPATTQ---PTVNSESSNK 1218



Score = 39.3 bits (91), Expect = 1e-04
Identities = 51/372 (13%), Positives = 88/372 (23%), Gaps = 47/372 (12%)

Query: 630 DGGESRDDNRRNRRQAQQQNAEARDTRQQETAEKVKTGDEQQQTPRRERSRRRNDDKRQA 689
D G + R + N E Q + T + Q S +
Sbjct: 963 DLGAWKYKLRNVNGRYDLYNPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDE 1022

Query: 690 QQEVKALNREELPVQETEQEERVQQVQPRRKQRQLNQKVRFTNSAVVETVDTPVVVDEPR 749
ET E Q+ + K Q + N V + V +
Sbjct: 1023 APVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKE-AKSNVKANTQ 1081

Query: 750 PVENVEQPVPAPRTELAKVDLPVVADIAPEQDDSVEPRDNTGMPRRSRRSPRHLRVSGQR 809
E + T+ + A + E+ VE +++ P+ +
Sbjct: 1082 TNEVAQSGSETKETQTTETKET--ATVEKEEKAKVETE-------KTQEVPKVTSQVSPK 1132

Query: 810 RRRYRDERYPTQSPMPLTVACASPEMASGKVWIRYPIVRPQETQVVEEQREADLALPQPV 869
+ + + + P V +E Q + AD P
Sbjct: 1133 QEQSETVQPQAEPARE-----------------NDPTVNIKEPQS-QTNTTADTEQPAKE 1174

Query: 870 VAEPQVIAATVALEPQASVQAVENVAVEPQTVAEPQAPEVVEVETTHPEVIAAPVDEQPQ 929
Q V + + PE TT P V + ++
Sbjct: 1175 T-------------SSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKN 1221

Query: 930 LIAESDTPEAQEVIA------DAEPVAETADASITVAENVADVVVVEPEEETKAEAAVVE 983
S V D VA S ++D AV +
Sbjct: 1222 RHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAVSQ 1281

Query: 984 HTAEETVIATAQ 995
H ++ + Q
Sbjct: 1282 HISQLEMNNEGQ 1293


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_13105  FLAGELLIN  41  4e-06  Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 41.2 bits (96), Expect = 4e-06
Identities = 30/138 (21%), Positives = 59/138 (42%)

Query: 1 MRISTQMMYEQNMSGITNSQAEWMKLGEQMSTGKRVTNPSDDPIAASQAVVLSQAQAQNS 60
I+T + + + SQ+ E++S+G R+ + DD + A + +
Sbjct: 2 QVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLT 61

Query: 61 QYALARTFATQKVSLEESVLSQVTTAIQTAQEKIVYAGNGTLSDDDRASLATDLQGIRDQ 120
Q + E L+++ +Q +E V A NGT SD D S+ ++Q ++
Sbjct: 62 QASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEE 121

Query: 121 LMNLANSTDGNGRYIFAG 138
+ ++N T NG + +
Sbjct: 122 IDRVSNQTQFNGVKVLSQ 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_13110  FLGHOOKAP1  664  0.0  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 664 bits (1714), Expect = 0.0
Identities = 438/553 (79%), Positives = 487/553 (88%), Gaps = 8/553 (1%)

Query: 2 SSLINHAMSGLNAAQAALNTVSNNINNYNVAGYTRQTTILAQANSTLGAGGWIGNGVYVS 61
SSLIN+AMSGLNAAQAALNT SNNI++YNVAGYTRQTTI+AQANSTLGAGGW+GNGVYVS
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 60

Query: 62 GVQREYDAFITNQLRGAQNQSSGLTTRYEQMSKIDNLLADKSSSLSGSLQSFFTSLQTLV 121
GVQREYDAFITNQLR AQ QSSGLT RYEQMSKIDN+L+ +SSL+ +Q FFTSLQTLV
Sbjct: 61 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 120

Query: 122 SNAEDPAARQALIGKAEGLVNQFKTTDQYLRDQDKQVNIAIGSSVAQINNYAKQIANLND 181
SNAEDPAARQALIGK+EGLVNQFKTTDQYLRDQDKQVNIAIG+SV QINNYAKQIA+LND
Sbjct: 121 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 180

Query: 182 QISRMTGVGAGASPNDLLDQRDQLVSELNKIVGVEVSVQDGGTYNLTMANGYTLVQGSTA 241
QISR+TGVGAGASPN+LLDQRDQLVSELN+IVGVEVSVQDGGTYN+TMANGY+LVQGSTA
Sbjct: 181 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 240

Query: 242 RQLAAVPSSADPTRTTVAYVDEAAGNIEIPEKLLNTGSLGGLLTFRSQDLDQTRNTLGQL 301
RQLAAVPSSADP+RTTVAYVD AGNIEIPEKLLNTGSLGG+LTFRSQDLDQTRNTLGQL
Sbjct: 241 RQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL 300

Query: 302 ALAFADAFNAQHTKGYDADGNKGKDFFSIGSPVVYSNSNNADKTVSLTAKVVDSTKVQAT 361
ALAFA+AFN QH G+DA+G+ G+DFF+IG P V N+ N V++ A V D++ V AT
Sbjct: 301 ALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGD-VAIGATVTDASAVLAT 359

Query: 362 DYKIVFDGTDWQVTRTADNTTFTATKDADGKLEIDGLKVTVGTGAQKNDSFLLKPVSNAI 421
DYKI FD WQVTR A NTTFT T DA+GK+ DGL++T NDSF LKPVS+AI
Sbjct: 360 DYKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAI 419

Query: 422 VDMNVKVTNEAEIAMASESKLDPDVDTGDSDNRNGQALLDLQ-NSNVVGGNKTFNDAYAT 480
V+M+V +T+EA+IAMASE D GDSDNRNGQALLDLQ NS VGG K+FNDAYA+
Sbjct: 420 VNMDVLITDEAKIAMASEE------DAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYAS 473

Query: 481 LVSDVGNKTSTLKTSSTTQANVVKQLYKQQQSVSGVNLDEEYGNLQRYQQYYLANAQVLQ 540
LVSD+GNKT+TLKTSS TQ NVV QL QQQS+SGVNLDEEYGNLQR+QQYYLANAQVLQ
Sbjct: 474 LVSDIGNKTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQ 533

Query: 541 TANALFDALLNIR 553
TANA+FDAL+NIR
Sbjct: 534 TANAIFDALINIR 546


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_13115  FLGFLGJ  499  0.0  Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 499 bits (1285), Expect = 0.0
Identities = 263/316 (83%), Positives = 289/316 (91%), Gaps = 3/316 (0%)

Query: 1 MIGDGKLLASAAWDAQSLNELKAKAGQDPAANIRPVARQVEGMFVQMMLKSMREALPKDG 60
MI D KLLASAAWDAQSLNELKAKAG+DPAANIRPVARQVEGMFVQMMLKSMR+ALPKDG
Sbjct: 1 MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG 60

Query: 61 LFSSDQTRLYTSMYDQQIAQQMTAGKGLGLADMMVKQMTGGQTMPADDAPQVPLKFSLET 120
LFSS+ TRLYTSMYDQQIAQQMTAGKGLGLA+MMVKQMT Q +P + P P+KF LET
Sbjct: 61 LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET 120

Query: 121 VNSYQNQALTQLVRKAIPKTPDSSDAPLSGDSKDFLARLSLPARLASEQSGVPHHLILAQ 180
V YQNQAL+QLV+KA+P+ D S L GDSK FLA+LSLPA+LAS+QSGVPHHLILAQ
Sbjct: 121 VVRYQNQALSQLVQKAVPRNYDDS---LPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQ 177

Query: 181 AALESGWGQRQILRENGEPSYNVFGVKATASWKGPVTEITTTEYENGEAKKVKAKFRVYS 240
AALESGWGQRQI RENGEPSYN+FGVKA+ +WKGPVTEITTTEYENGEAKKVKAKFRVYS
Sbjct: 178 AALESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYS 237

Query: 241 SYLEALSDYVALLTRNPRYAAVTTAATAEQGAVALQNAGYATDPNYARKLTSMIQQLKAM 300
SYLEALSDYV LLTRNPRYAAVTTAA+AEQGA ALQ+AGYATDP+YARKLT+MIQQ+K++
Sbjct: 238 SYLEALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSI 297

Query: 301 SEKVSKTYSANLDNLF 316
S+KVSKTYS N+DNLF
Sbjct: 298 SDKVSKTYSMNIDNLF 313


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_13120  FLGPRINGFLGI  429  e-153  Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 429 bits (1104), Expect = e-153
Identities = 153/362 (42%), Positives = 215/362 (59%), Gaps = 9/362 (2%)

Query: 5 LAGIVLALVTTLAHAERIRDLTSVQGVRENSLIGYGLVVGLDGTGDQTTQTPFTTQTLNN 64
A L+ A RI+D+ S+Q R+N LIGYGLVVGL GTGD +PFT Q++
Sbjct: 14 SALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMRA 73

Query: 65 MLSQLGITVPTGTNMQLKNVAAVMVTASYPPFARQGQTIDVVVSSMGNAKSLRGGTLLMT 124
ML LGIT G + KN+AAVMVTA+ PPFA G +DV VSS+G+A SLRGG L+MT
Sbjct: 74 MLQNLGITTQGGQS-NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLIMT 132

Query: 125 PLKGVDSQVYALAQGNILVGGAGASAGGSSVQVNQLNGGRITNGAIIERELPTQFGAGNT 184
L G D Q+YA+AQG ++V G A +++ R+ NGAIIERELP++F
Sbjct: 133 SLSGADGQIYAVAQGALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDSVN 192

Query: 185 INLQLNDEDFTMAQQITDAINRAR----GYGSATALDARTVQVRVPSGNSSQVRFLADIQ 240
+ LQL + DF+ A ++ D +N G A D++ + V+ P + R +A+I+
Sbjct: 193 LVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKPRV-ADLTRLMAEIE 251

Query: 241 NMEVNVTPQDAKVVINSRTGSVVMNREVTLDSCAVAQGNLSVTVNRQLNVNQPNTPFGGG 300
N+ V T AKVVIN RTG++V+ +V + AV+ G L+V V V QP PF G
Sbjct: 252 NLTVE-TDTPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQP-APFSRG 309

Query: 301 QTVVTPQTQIDLRQSGGSLQSVRSSANLNSVVRALNALGATPMDLMSILQSMQSAGCLRA 360
QT V PQT I Q G + ++ +L ++V LN++G +++ILQ ++SAG L+A
Sbjct: 310 QTAVQPQTDIMAMQEGSKV-AIVEGPDLRTLVAGLNSIGLKADGIIAILQGIKSAGALQA 368

Query: 361 KL 362
+L
Sbjct: 369 EL 370


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_13125  FLGLRINGFLGH  353  e-127  Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 353 bits (908), Expect = e-127
Identities = 211/232 (90%), Positives = 223/232 (96%)

Query: 1 MQKYALHAYPVMALMVATLTGCAWIPAKPLVQGATTAQPIPGPVPVANGSIFQSAQPINY 60
MQK A H Y + +L+V +LTGCAWIP+ PLVQGAT+AQP+PGP PVANGSIFQSAQPINY
Sbjct: 1 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY 60

Query: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTSFGFDTVPRYLQGLFGNS 120
GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKT+FGFDTVPRYLQGLFGN+
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA 120

Query: 121 RADMEASGGNSFNGKGGANASNTFSGTLTVTVDQVLANGNLHVVGEKQIAINQGTEFIRF 180
RAD+EASGGN+FNGKGGANASNTFSGTLTVTVDQVL NGNLHVVGEKQIAINQGTEFIRF
Sbjct: 121 RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF 180

Query: 181 SGVVNPRTISGSNSVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 232
SGVVNPRTISGSN+VPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM
Sbjct: 181 SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 232


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_13130  FLGHOOKAP1  44  4e-07  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 43.8 bits (103), Expect = 4e-07
Identities = 18/81 (22%), Positives = 36/81 (44%), Gaps = 14/81 (17%)

Query: 3 SSLWIAKTGLDAQQTNMDVIANNLANVSTNGFKRQRAVFEDLLYQTIRQPGAQSSEQTTL 62
S + A +GL+A Q ++ +NN+++ + G+ RQ + + +TL
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI--------------MAQANSTL 47

Query: 63 PSGLQIGTGVRPVATERLHSQ 83
+G +G GV +R +
Sbjct: 48 GAGGWVGNGVYVSGVQREYDA 68



Score = 41.1 bits (96), Expect = 3e-06
Identities = 11/41 (26%), Positives = 21/41 (51%)

Query: 220 ETSNVNVAEELVNMIQVQRAYEINSKAVSTTDQMLQKLTQL 260
S VN+ EE N+ + Q+ Y N++ + T + + L +
Sbjct: 505 SISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_13140  FLGHOOKAP1  41  7e-06  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 41.1 bits (96), Expect = 7e-06
Identities = 17/48 (35%), Positives = 29/48 (60%)

Query: 356 LTNGALEASNVDLSKELVNMIVAQRNYQSNAQTIKTQDQILNTLVNLR 403
L+N S V+L +E N+ Q+ Y +NAQ ++T + I + L+N+R
Sbjct: 499 LSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546



Score = 37.6 bits (87), Expect = 9e-05
Identities = 22/60 (36%), Positives = 31/60 (51%), Gaps = 4/60 (6%)

Query: 2 SFSQAVSGLNAAATNLDVIGNNIANSATYGFKSGTASFAD----MFAGSKVGLGVKVAGI 57
+ A+SGLNAA L+ NNI++ G+ T A + AG VG GV V+G+
Sbjct: 3 LINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSGV 62


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_13145  SYCECHAPRONE  29  0.010  Gram-negative bacterial type III secretion SycE cha...
		>SYCECHAPRONE#Gram-negative bacterial type III secretion SycE chaperone signature.

Length = 130

Score = 28.5 bits (63), Expect = 0.010
Identities = 16/34 (47%), Positives = 20/34 (58%), Gaps = 2/34 (5%)

Query: 44 LKNQDPTNPLQNNELTTQLAQISTVSGIEKLNTT 77
L N+ P N L NN L TQL + V G E+L T+
Sbjct: 89 LWNRQPLNSLDNNSLYTQLEML--VQGAERLQTS 120


90  GX95_13555  GX95_13585  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_13555025-4.582033DNA-binding response regulator
GX95_13560025-5.215267two-component sensor histidine kinase
GX95_13565126-6.990991dipeptidase
GX95_13570430-8.332770hypothetical protein
GX95_13575237-9.784877DUF3950 domain-containing protein
GX95_13580338-10.015951inositol phosphatase
GX95_13585435-9.900271type III secretion chaperone protein SigE
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_13555  HTHFIS  82  2e-20  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 82.2 bits (203), Expect = 2e-20
Identities = 30/117 (25%), Positives = 56/117 (47%), Gaps = 1/117 (0%)

Query: 2 KILLIEDNQKTIEWVRQGLTEAGYVVDYACDGRDGLHLALQEHYSLIILDIMLPGLDGWQ 61
IL+ +D+ + Q L+ AGY V + L++ D+++P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 VLRALRTAHQS-PVICLTARDSVEDRVKGLEAGANDYLVKPFSFAELLARVRAQLRQ 117
+L ++ A PV+ ++A+++ +K E GA DYL KPF EL+ + L +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_13560  PF06580  34  0.001  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 33.7 bits (77), Expect = 0.001
Identities = 18/102 (17%), Positives = 38/102 (37%), Gaps = 15/102 (14%)

Query: 348 ILLQRVLSNLLTNAIRYSDENAVIRIESAYDDNVAEIRVANPGSHPADADKLFRRFWRGD 407
+L+Q ++ N + + I + I ++ D+ + V N GS K
Sbjct: 258 MLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKE-------- 309

Query: 408 NARHTAGFGLGLSLVNA-IALLHGGSASYRYADEHNIFSVRL 448
G GL V + +L+G A + +++ + +
Sbjct: 310 ------STGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_13580  TYPE3OMBPROT  656  0.0  Type III secretion system outer membrane B protein ...
		>TYPE3OMBPROT#Type III secretion system outer membrane B protein family signature.

Length = 538

Score = 656 bits (1693), Expect = 0.0
Identities = 187/396 (47%), Positives = 254/396 (64%), Gaps = 5/396 (1%)

Query: 166 LNNQPWQTIKNTLTHNGHHYTNTQLPAAEMKIGAKDIFPSAYQGKGVCSWDTRNIHHANN 225
LNN+ W + ++H+G +Y PA+ MKIG K+IF Y GKG+C TR H N
Sbjct: 146 LNNKNWGPVNKNISHHGKNYGFQLTPASHMKIGNKNIFVKEYNGKGICCASTRESDHIAN 205

Query: 226 LWMSTVSVHEDGKDKTLFCGIRHGVLSPYH-EKDPLLRQVGAENKAKEVLTAALFSKPEL 284
+W+S V V ++GK+ +F GIRHGV+S Y +K+ R V A NKA+E+++AAL+S+PEL
Sbjct: 206 MWLSKV-VDDEGKE--IFSGIRHGVISAYGLKKNSSERAVAARNKAEELVSAALYSRPEL 262

Query: 285 LNRALEGEAVSLKLVSVGLLTASNIFGKEGTMVEDQMRAWQSL-TQPGKMIHLKIRNKDG 343
L++AL G+ V LK+VS LLT +++ G E +M++DQ+ A + L ++ G+ L IRN DG
Sbjct: 263 LSQALSGKTVDLKIVSTSLLTPTSLTGGEESMLKDQVNALKGLNSKRGEPTKLLIRNSDG 322

Query: 344 DLQTVKIKPDVAAFNVGVNELALKLGFGLKASDRYNAEALHQLLGNDLRPEARPGGWVGE 403
L+ V + V FN GVNELALK+G G + D+ N E++ LLG++ GGW E
Sbjct: 323 LLKEVSVNLKVVTFNFGVNELALKMGLGWRNVDKLNDESICSLLGDNFLKNGVIGGWAAE 382

Query: 404 WLAQYPDNYEVVNTLARQIKDIWKNNLHHKDGGEPYKLAQRLAMLAHEIDAVPAWNCKSG 463
+ + P V LA QIK+I L D GEPYKL+QR+ +LA+ I AVP WNCKSG
Sbjct: 383 AIEKNPPCKNDVIYLANQIKEIINKKLQKNDNGEPYKLSQRMTLLAYTIGAVPCWNCKSG 442

Query: 464 KDRTGMMDSEIKREHISLHQTHMLSAPGSLPDSGGQKIFQKVLLNSGNLEIQKQNTGGAG 523
KDRTGM D+EIKRE I H+T S S S +++F +L+NSGN+EIQ+ NTG G
Sbjct: 443 KDRTGMQDAEIKREIIRKHETGQFSQLNSKLSSEEKRLFSTILMNSGNMEIQEMNTGVPG 502

Query: 524 NKVMKNLSPEVLNLSYQKRVGDENIWQSVKGISSLI 559
NKVMK L L LSY +R+GD IW VKG SS +
Sbjct: 503 NKVMKKLPLSSLELSYSERIGDSKIWNMVKGYSSFV 538


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_13585  PF07824  165  1e-56  Type III secretion chaperone
		>PF07824#Type III secretion chaperone

Length = 120

Score = 165 bits (419), Expect = 1e-56
Identities = 33/114 (28%), Positives = 63/114 (55%), Gaps = 1/114 (0%)

Query: 1 MESLLNRLYDALGLDAPE-DEPLLIIDDGIQVYFNESDHTLEMCCPFMPLPDDILTLQHF 59
ME L + + ALG+ + + D+ +++DD + +Y + ++ + CPF LP++I L +
Sbjct: 1 MEDLADVICRALGIPSIDTDDQAIMLDDDVLIYIEKEGDSINLLCPFCALPENINDLIYA 60

Query: 60 LRLNYTSAVTIGADADNTALVALYRLPQTSTEEEALTGFELFISNVKQLKEHYA 113
L LNY+ + + D + +L+A L + E+ E +IS V+ LK+ +A
Sbjct: 61 LSLNYSEKICLATDDEGGSLIARLDLTGINEFEDIYVNTEYYISRVRWLKDEFA 114


91  GX95_14565  GX95_14595  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_14565-128-8.043912DNA-binding transcriptional regulator
GX95_14570-120-5.682411hypothetical protein
GX95_14575-1110.126397sugar-phosphatase
GX95_145801120.713515multidrug transporter MdfA
GX95_14585-1110.753841undecaprenyl-diphosphate phosphatase
GX95_14590-211-0.052822DNA-binding transcriptional repressor DeoR
GX95_14595-2100.215388serine-type D-Ala-D-Ala carboxypeptidase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_14580  HTHTETR  46  1e-08  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 46.2 bits (109), Expect = 1e-08
Identities = 17/80 (21%), Positives = 33/80 (41%)

Query: 7 RRANDPKRREKIIQATLEAVKTYGVHAVTHRKIAAIAQVPLGSMTYYFAGMDALLSEAFT 66
+ + R+ I+ L GV + + +IA A V G++ ++F L SE +
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 67 LFTENMSRQYQDFFAQVTDA 86
L N+ ++ A+
Sbjct: 65 LSESNIGELELEYQAKFPGD 84


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_14585  TCRTETB  33  0.002  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 33.3 bits (76), Expect = 0.002
Identities = 33/150 (22%), Positives = 65/150 (43%), Gaps = 6/150 (4%)

Query: 218 LLIGVVVLAMAFAEGSANDWL-PLLMVDGHGFSP-TSGSLIYAGFTLGMTVGRFTGGWFI 275
+IGV+ + F + + P +M D H S GS+I T+ + + + GG +
Sbjct: 258 FMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILV 317

Query: 276 DRYSRVTVVR-ASALM--GALGIGLIIFVDSDWVA-GVSVILWGLGASLGFPLTISAASD 331
DR + V+ + L ++ S ++ + +L GL + TI ++S
Sbjct: 318 DRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVISTIVSSSL 377

Query: 332 TGPDAPTRVSVVATTGYLAFLVGPPLLGYL 361
+A +S++ T +L+ G ++G L
Sbjct: 378 KQQEAGAGMSLLNFTSFLSEGTGIAIVGGL 407



Score = 28.7 bits (64), Expect = 0.042
Identities = 32/135 (23%), Positives = 56/135 (41%), Gaps = 11/135 (8%)

Query: 38 IRDILSVSTAEMGAVLFGLSIGSMSGILCS---AWLVKRFGTRKVIRTTMTCAVTGMVIL 94
++D+ +STAE+G+V+ + G+MS I+ LV R G V+ +T +
Sbjct: 283 MKDVHQLSTAEIGSVI--IFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTA 340

Query: 95 SVALWCASPLIFALGLAVFGASFGAAEVAINVEGAAVERELNKTVLPMMHGFYSFGTLAG 154
S L S + + + V G + I+ V L + +F +
Sbjct: 341 SFLLETTSWFMTIIIVFVLG-GLSFTKTVIS---TIVSSSLKQQEAGAGMSLLNFTSFLS 396

Query: 155 AGVGMALTA--LSVP 167
G G+A+ LS+P
Sbjct: 397 EGTGIAIVGGLLSIP 411


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_14595  TCRTETB  44  6e-07  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 44.1 bits (104), Expect = 6e-07
Identities = 66/356 (18%), Positives = 127/356 (35%), Gaps = 51/356 (14%)

Query: 48 QAGLDWVPTSMTAYLAGGMFLQWLLGPLSDRIGRRPVMLAGVVWFIVTCLATLLAKNIEQ 107
A +WV T+ + G + G LSD++G + ++L G++ + + +
Sbjct: 48 PASTNWVNTAFMLTFSIGTAV---YGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFS 104

Query: 108 FT-FLRFLQGISLCFIGAVGYAAIQESFEEAVCIKITALMANVALIAPLLGPLVGAAWVH 166
RF+QG A+ + + K L+ ++ + +GP +G H
Sbjct: 105 LLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAH 164

Query: 167 VLPWEGMFILFAALAAIAFFGLQRAMPETATRRGE------------------------- 201
+ W +L + I L + + + +G
Sbjct: 165 YIHW-SYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFMLFTTSYSI 223

Query: 202 ------TLSFKALGRDYRLV---------IKNRRFVAGALALGFVSLPLLAWIAQSPIII 246
LSF + R V KN F+ G L G + + +++ P ++
Sbjct: 224 SFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMM 283

Query: 247 ISGEQLSSYEYG-LLQVPVFGALIAGNLVLARLTSRRTVRSLIVMGGWPIVAGLIIAAAA 305
QLS+ E G ++ P ++I + L RR ++ +G + + A+
Sbjct: 284 KDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTAS-- 341

Query: 306 TVVSSHAYLWMTAGLSVYAFGIGLANAGLVRLTLFSSDMSKGTVSAAMGMLQMLIF 361
+ +MT + V+ G GL+ V T+ SS + + A M +L F
Sbjct: 342 -FLLETTSWFMTIII-VFVLG-GLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSF 394


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_14610  BLACTAMASEA  47  5e-08  Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 47.1 bits (112), Expect = 5e-08
Identities = 49/207 (23%), Positives = 78/207 (37%), Gaps = 25/207 (12%)

Query: 1 MTQYASSLRSLAAGSVLLFLFASPVKAEEQTIAPPGVDAR-AWILMDYASGKVLAEGNAD 59
M + SL A ++ L + ASP E+ ++ + R I MD ASG+ L AD
Sbjct: 1 MRYIRLCIISLLA-TLPLAVHASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRAD 59

Query: 60 EKLDPASLTKIMTSYVVGQALKAGKIKLTDMVTVGKDAWATGNPALRGSSVMFLKPGDQV 119
E+ S K++ V + AG +L + + +P V D +
Sbjct: 60 ERFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYSP------VSEKHLADGM 113

Query: 120 SVADLNKGIIIQSGNDACIALADYVAGSQESFIGLMNAYAKRLGLTNTT---FQTVHGLD 176
+V +L I S N A L V G + A+ +++G T ++T
Sbjct: 114 TVGELCAAAITMSDNSAANLLLATVGGPAG-----LTAFLRQIGDNVTRLDRWETELNEA 168

Query: 177 APGQF---STARDMA------LLGKAL 194
PG +T MA L + L
Sbjct: 169 LPGDARDTTTPASMAATLRKLLTSQRL 195


92  GX95_14830  GX95_14855  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_14830-1152.172267ATP-dependent RNA helicase RhlE
GX95_14835-2151.516631transcriptional regulator
GX95_14840-2151.921908secretion protein HlyD
GX95_14845-3151.404621multidrug ABC transporter ATP-binding protein
GX95_14850-1181.155585hypothetical protein
GX95_14855-1160.737870hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_14830  SECA  30  0.024  SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 29.8 bits (67), Expect = 0.024
Identities = 20/67 (29%), Positives = 34/67 (50%), Gaps = 4/67 (5%)

Query: 246 QQVLVFTRTKHGANHLAEQLNKDGIRSAAIHG-NKSQGARTRALADFKSGDIRVLVATDI 304
Q VLV T + + ++ +L K GI+ ++ + A A A + + V +AT++
Sbjct: 450 QPVLVGTISIEKSELVSNELTKAGIKHNVLNAKFHANEAAIVAQAGYPAA---VTIATNM 506

Query: 305 AARGLDI 311
A RG DI
Sbjct: 507 AGRGTDI 513


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_14835  HTHTETR  66  2e-15  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 65.8 bits (160), Expect = 2e-15
Identities = 22/103 (21%), Positives = 46/103 (44%), Gaps = 7/103 (6%)

Query: 6 TTTKGEQAKSQLIAAALAQFGEYGLHATT-RDIAALAGQNIAAITYYFGSKEDLYLACAQ 64
T + ++ + ++ AL F + G+ +T+ +IA AG AI ++F K DL+ +
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 65 WIADFLGEKFRPHAEKAERLFSQPAPDRDAIRELILLACKNMI 107
+GE E + P +RE+++ ++ +
Sbjct: 65 LSESNIGELEL---EYQAKFPGDPLS---VLREILIHVLESTV 101


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_14840  RTXTOXIND  61  2e-12  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 61.0 bits (148), Expect = 2e-12
Identities = 48/286 (16%), Positives = 104/286 (36%), Gaps = 28/286 (9%)

Query: 55 ASLNVDEGDAIKAGQVLGELDHAPYENALMQAKAGVSVAQAQYDLMLAGYRDEEIAQAAA 114
NV E + L + + ++N Q + + +A+ +LA E
Sbjct: 175 YFQNVSEEEV-LRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 115 AVRQAQAAYDYAQNFYNRQQGLWKSRTISA--NDLENARSSRDQAQATLKSAQDKLSQYR 172
R + + + L + N+L +S +Q ++ + SA+++
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQL-V 292

Query: 173 TGNREQDI----AQAKASLEQAKAQLAQAQLDLQDTTLIAPANGTLLTRAV-EPGSMLNA 227
T + +I Q ++ +LA+ + Q + + AP + + V G ++
Sbjct: 293 TQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTT 352

Query: 228 GSTVLTLSLT-RPVWVRAYVDERNLSQTQPGRDILLYTDGRPDKPYH---GKIGFVSPTA 283
T++ + + V A V +++ G++ ++ + P Y GK+ ++ A
Sbjct: 353 AETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDA 412

Query: 284 EFTPKTVETPDLRTDLVYRLRIIVT-------DADDALRQGMPVTV 322
D R LV+ + I + + + L GM VT
Sbjct: 413 --------IEDQRLGLVFNVIISIEENCLSTGNKNIPLSSGMAVTA 450


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_14845  PF05272  32  0.009  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.6 bits (71), Expect = 0.009
Identities = 22/89 (24%), Positives = 29/89 (32%), Gaps = 21/89 (23%)

Query: 294 PRFEDAFIDLLGGAGTSESPLGSILHTVEGTAGETVIEAQELTKKFGDFAATDHVNFVVQ 353
PR E + +LG P + Q + K HV V++
Sbjct: 548 PRLEKWLVHVLGKTPDDYKP-------------RRLRYLQLVGKYI----LMGHVARVME 590

Query: 354 RGEIFG----LLGPNGAGKSTTFKMMCGL 378
G F L G G GKST + GL
Sbjct: 591 PGCKFDYSVVLEGTGGIGKSTLINTLVGL 619



Score = 29.7 bits (66), Expect = 0.043
Identities = 11/23 (47%), Positives = 13/23 (56%)

Query: 34 YVTGLVGPDGAGKTTLMRMLAGL 56
Y L G G GK+TL+ L GL
Sbjct: 597 YSVVLEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_14855  ABC2TRNSPORT  46  1e-07  ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 45.7 bits (108), Expect = 1e-07
Identities = 35/139 (25%), Positives = 60/139 (43%), Gaps = 5/139 (3%)

Query: 197 AREREQGTLDQLLVSPLTTWQIFVGKAVPALIVATFQATIVLAIGIWAYQIPFAGSLALF 256
R Q T + +L + L I +G+ A A IG+ A + + L+L
Sbjct: 92 GRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGA---GIGVVAAALGYTQWLSLL 148

Query: 257 YFTMVI--YGLSLVGFGLLISSLCATQQQAFIGVFVFMMPAILLSGYVSPVENMPVWLQN 314
Y VI GL+ G+++++L + + + P + LSG V PV+ +P+ Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 315 LTWINPIRHFTDITKQIYL 333
P+ H D+ + I L
Sbjct: 209 AARFLPLSHSIDLIRPIML 227


93  GX95_16160  GX95_16185  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_16160-2114.4739592,3-dihydro-2,3-dihydroxybenzoate dehydrogenase
GX95_16165-1125.016800isochorismatase
GX95_161700125.5055882,3-dihydroxybenzoate-AMP ligase
GX95_161751145.513596isochorismate synthase EntC
GX95_161801153.768982Fe2+-enterobactin ABC transporter
GX95_161852164.549306MFS transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_16160  DHBDHDRGNASE  337  e-120  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 337 bits (866), Expect = e-120
Identities = 105/257 (40%), Positives = 148/257 (57%), Gaps = 20/257 (7%)

Query: 9 KTVWVTGAGKGIGYATALAFVDAGARVFGFDRE---------------FTQENYPFATEV 53
K ++TGA +GIG A A GA + D E +P
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP----- 63

Query: 54 MDVADAAQVAQVCQRVLQKTPRLDVLVNAAGILRMGATDALSVDDWQQTFAVNVGGAFNL 113
DV D+A + ++ R+ ++ +D+LVN AG+LR G +LS ++W+ TF+VN G FN
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 114 FSQTMAQFRRQQGGAIVTVASDAAHTPRIGMSAYGASKAALKSLALTVGLELAGCGVRCN 173
++ G+IVTV S+ A PR M+AY +SKAA +GLELA +RCN
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCN 183

Query: 174 VVSPGSTDTDMQRTLWVSEDAEQQRIRGFGEQFKLGIPLGKIARPQEIANTILFLASDLA 233
+VSPGST+TDMQ +LW E+ +Q I+G E FK GIPL K+A+P +IA+ +LFL S A
Sbjct: 184 IVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQA 243

Query: 234 SHITLQDIVVDGGSTLG 250
HIT+ ++ VDGG+TLG
Sbjct: 244 GHITMHNLCVDGGATLG 260


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_16165  ISCHRISMTASE  424  e-153  Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 424 bits (1092), Expect = e-153
Identities = 147/299 (49%), Positives = 192/299 (64%), Gaps = 18/299 (6%)

Query: 1 MAIPKLQSYALPTALDIPTNKVNWAFEPERAALLIHDMQDYFVSFWGRNCPMMDQVIANI 60
MAIP +Q Y +PTA D+P NKV+W +P RA LLIHDMQ+YFV + + ++ ANI
Sbjct: 1 MAIPAIQPYQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANI 60

Query: 61 AALRQYCKEHHIPVYYTAQPKEQSDEDRALLNDMWGPGLTRSPEQQKVVEALTPDEADTV 120
L+ C + IPV YTAQP Q+ +DRALL D WGPGL P ++K++ L P++ D V
Sbjct: 61 RKLKNQCVQLGIPVVYTAQPGSQNPDDRALLTDFWGPGLNSGPYEEKIITELAPEDDDLV 120

Query: 121 LVKWRYSAFHRSPLEQMLKDTGRNQLIITGVYAHIGCMTTATDAFMRDIKPFMVADALAD 180
L KWRYSAF R+ L +M++ GR+QLIITG+YAHIGC+ TA +AFM DIK F V DA+AD
Sbjct: 121 LTKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVAD 180

Query: 181 FSRKEHLMALNYVAGRSGRVVMTESLL------PTPVPASKA-----------ALRALIL 223
FS ++H MAL Y AGR VMT+SLL P V + A +R I
Sbjct: 181 FSLEKHQMALEYAAGRCAFTVMTDSLLDQLQNAPADVQKTSANTGKKNVFTCENIRKQIA 240

Query: 224 PLLDETDEPLD-DENLIDYGLDSVRMMGLAARWRKVHGDIDFVMLAKNPTIDAWWALLS 281
LL ET E + E+L+D GLDSVR+M L +WR+ ++ FV LA+ PTI+ W LL+
Sbjct: 241 ELLQETPEDITDQEDLLDRGLDSVRIMTLVEQWRREGAEVTFVELAERPTIEEWQKLLT 299


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_16180  FERRIBNDNGPP  60  3e-12  Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 59.6 bits (144), Expect = 3e-12
Identities = 46/210 (21%), Positives = 80/210 (38%), Gaps = 21/210 (10%)

Query: 105 EPNAETVAAQMPDLILISATGGDSALALYDQLSAIAPTLVINYDDKS-----WQSLLTQL 159
EPN E + P ++ SA G S + L+ IAP N+ D + LT++
Sbjct: 86 EPNLELLTEMKPSFMVWSAGYGPS----PEMLARIAPGRGFNFSDGKQPLAMARKSLTEM 141

Query: 160 GEITGQEKQAAARIAEFEAQLTTVKQRIALPPQPVSALVYTPAAHSANLWTPESAQGKLL 219
++ + A +A++E + ++K R L ++ P S ++L
Sbjct: 142 ADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVFGPNSLFQEIL 201

Query: 220 TQLGFTLATLPRGLQTSKSQGKRHDIIQLGGENLAAGLNGQSLFLFAGDNKDVAALYTNP 279
+ G A + + + + LAA + L ++KD+ AL P
Sbjct: 202 DEYGIPNAW--------QGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNSKDMDALMATP 253

Query: 280 LLAHLPAVQNKRVYALGTETFRLDYYSATL 309
L +P V+ R + F Y ATL
Sbjct: 254 LWQAMPFVRAGRFQRVPAVWF----YGATL 279


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_16185  TCRTETB  29  0.048  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 28.7 bits (64), Expect = 0.048
Identities = 69/397 (17%), Positives = 130/397 (32%), Gaps = 66/397 (16%)

Query: 27 FISIVSLGLLGVAVPVQIQMMTHSTWQVGLSVTLTGGAMFIGLMVGGVLADRYERKKVIL 86
F S+++ +L V++P T IG V G L+D+ K+++L
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 87 LARGTCGIGFIGLCVNALLPEPSLLAIYLLGLWDGFFASLGVTALLAATPALVGRENLMQ 146
G + V ++A ++ G F +L ++ + +EN +
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPAL----VMVVVARYIPKENRGK 139

Query: 147 AGAITMLTVRLGSVISPMLGGILLASGGVAWNYGLAAAGTFITLLPLLTLPRLPVPPQPR 206
A + V +G + P +GG++ + W+Y L IT++ + L +L
Sbjct: 140 AFGLIGSIVAMGEGVGPAIGGMIAHY--IHWSYLLLIP--MITIITVPFLMKLLKKEVRI 195

Query: 207 ------------------------------------------------ENPFIAL-LAAF 217
+PF+ L
Sbjct: 196 KGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKN 255

Query: 218 RFLLASPLIGGIALLGGLVTMASAVRVLYPALAMSWQMSAAQIGLLYAAI-PLGAAIGAL 276
+ L GGI T+A V ++ + Q+S A+IG + + I
Sbjct: 256 IPFMIGVLCGGIIF----GTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGY 311

Query: 277 TSGQLAHSVRPGLIMLVSTVG---SFLAVGLFAIMPVWIAGVICLALFGWLSAISSLLQY 333
G L P ++ + SFL W +I + + G LS +++
Sbjct: 312 IGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVIST 371

Query: 334 TLLQTQTPENMLGRMNGLWTAQNVTGDAIGAALLGGL 370
+ + + M L + + G A++GGL
Sbjct: 372 IVSSSLKQQEAGAGM-SLLNFTSFLSEGTGIAIVGGL 407


94  GX95_16560  GX95_16600  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
GX95_165602142.549130DNA polymerase III subunit gamma/tau
GX95_16565014-0.258288adenine phosphoribosyltransferase
GX95_16570014-1.009455hypothetical protein
GX95_16575012-0.604414primosomal replication protein N''
GX95_16580-110-0.400641DUF2496 domain-containing protein
GX95_16585-110-0.697373hypothetical protein
GX95_16590-110-0.983222DNA-binding transcriptional repressor AcrR
GX95_16595014-1.425407efflux transporter periplasmic adaptor subunit
GX95_16600-114-1.088648aminoglycoside/multidrug transporter permease
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_16560  IGASERPTASE  44  2e-06  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 44.3 bits (104), Expect = 2e-06
Identities = 52/275 (18%), Positives = 85/275 (30%), Gaps = 34/275 (12%)

Query: 366 PEPETPRQSFAPVAPTAVMTPP--QVQQPSAP-----------APQTSPAPLPASTSQVL 412
PE E Q V T + TP Q PS P AP PAP S +
Sbjct: 983 PEVEKRNQ---TVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTET 1039

Query: 413 AARNQLQRAQGVTKTKK--SEPAAASRARPVNNSALERLASVSERVQARPTPSALETAPV 470
A N Q ++ V K ++ +E A + R + + + E A
Sbjct: 1040 VAENSKQESKTVEKNEQDATETTAQN-----------REVAKEAKSNVKANTQTNEVAQS 1088

Query: 471 KKEAYRWKATTPVVQTKEVVATPKALKKALEHEKTPELAAKLAAEAIERDPWAAQVSQLS 530
E T +TKE K K +E EKT E K+ ++ + + V +
Sbjct: 1089 GSET----KETQTTETKETATVEKEEKAKVETEKTQE-VPKVTSQVSPKQEQSETVQPQA 1143

Query: 531 LPKLVEQVALNAWKEQNGNAVCLHLRSTQRHLNSSGAQQKLAQALSDLTGTTVELTIVED 590
P +N + Q+ + +S+ Q + + VE
Sbjct: 1144 EPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTT 1203

Query: 591 DNPAVRTPLEWRQAIYEEKLAQARESIIADNNIQT 625
T + + ++ S+ + T
Sbjct: 1204 PATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPAT 1238


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_16575  CHANLCOLICIN  27  0.041  Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 27.3 bits (60), Expect = 0.041
Identities = 19/61 (31%), Positives = 26/61 (42%), Gaps = 2/61 (3%)

Query: 116 QEFERRLLAMTQE--RKIRLAQATSLVEQQTLQKEVEIYEGRLARCRHALEKIENVLARL 173
E ER LA +E RK A + E + +KE+E + R E E LA L
Sbjct: 125 AEDERLRLAKAEEKARKEAEAAEKAFQEAEQRRKEIEREKAETERQLKLAEAEEKRLAAL 184

Query: 174 T 174
+
Sbjct: 185 S 185


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_16590  CHANLCOLICIN  37  6e-04  Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 36.6 bits (84), Expect = 6e-04
Identities = 41/219 (18%), Positives = 81/219 (36%), Gaps = 18/219 (8%)

Query: 92 RQKVAQAPEKMRQ-ATAALNALSDVDNDDEMRKTLSALSLRQLELRVA--QVLDDLQNSQ 148
R ++A+A EK R+ A AA A + + + + A + RQL+L A + L L
Sbjct: 129 RLRLAKAEEKARKEAEAAEKAFQEAEQRRKEIEREKAETERQLKLAEAEEKRLAALSEEA 188

Query: 149 NDLAAYNSQLVSLQTQPERVQNAMYTASQQI-------QQIRNRLDGNNVGEAALRPSQQ 201
+ +L + Q++ ++ + T + ++ L G A +
Sbjct: 189 KAVEIAQKKLSAAQSEVVKMDGEIKTLNSRLSSSIHARDAEMKTLAGKRNELAQASAKYK 248

Query: 202 VLLQAKQALLNAQID--------QQRKSLEGNTVLQDTLQKQRDYVTANSNRLEHQLQLL 253
L + + L D + + G +++ QKQ NR+ + +
Sbjct: 249 ELDELVKKLSPRANDPLQNRPFFEATRRRVGAGKIREEKQKQVTASETRINRINADITQI 308

Query: 254 QEAVNSKRLTLTEKTAQEAISPDETARIQANPLVKQELD 292
Q+A++ A+ + + + Q N L Q D
Sbjct: 309 QKAISQVSNNRNAGIARVHEAEENLKKAQNNLLNSQIKD 347


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_16595  HTHTETR  204  8e-69  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 204 bits (519), Expect = 8e-69
Identities = 187/214 (87%), Positives = 199/214 (92%)

Query: 1 MARKTKQQALETRQHILDVALRLFSQQGVSATSLAEIANAAGVTRGAIYWHFKNKSDLFS 60
MARKTKQ+A ETRQHILDVALRLFSQQGVS+TSL EIA AAGVTRGAIYWHFK+KSDLFS
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 EIWELSESNIGELEIEYQAKFPDDPLSVLREILVHILEATVTEERRRLLMEIIFHKCEFV 120
EIWELSESNIGELE+EYQAKFP DPLSVLREIL+H+LE+TVTEERRRLLMEIIFHKCEFV
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 121 GEMVVVQQAQRSLCLESYDRIEQTLKHCINAKMLPENLLTRRAAILMRSFISGLMENWLF 180
GEM VVQQAQR+LCLESYDRIEQTLKHCI AKMLP +L+TRRAAI+MR +ISGLMENWLF
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 181 APQSFDLKKEARAYVTILLEMYQLCPTLRASTVN 214
APQSFDLKKEAR YV ILLEMY LCPTLR N
Sbjct: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRNPATN 214


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
GX95_16600  RTXTOXIND  43  1e-06  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 43.3 bits (102), Expect = 1e-06
Identities = 33/216 (15%), Positives = 75/216 (34%), Gaps = 27/216 (12%)

Query: 100 TYQATYDSAKGDLAKAQAAANIAELTVKRYQKLLGTQYISKQEYDQALADAQQATAAVVA 159
+ Y A +L + + ++ + Q +++ ++ L +Q T +
Sbjct: 256 EQENKYVEAVNELR--VYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGL 313

Query: 160 AKAAVETARINLAYTKVTSPISGRIGKSSV-TEGALVQNGQASALATVQQLDPIYVDVTQ 218
+ + + +P+S ++ + V TEG +V + + V + D + V
Sbjct: 314 LTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAET-LMVIVPEDDTLEVTALV 372

Query: 219 SSNDFLRLKQELA------------NGSLKQENGKAKVDLVTSDGIKFPQSGTLEFSDVT 266
+ D + G L KV + D I+ + G + ++
Sbjct: 373 QNKDIGFINVGQNAIIKVEAFPYTRYGYL-----VGKVKNINLDAIEDQRLGLVFNVIIS 427

Query: 267 VDQTTGSITLRAIFPNPDHTLLPGMFVRARLQEGTK 302
+++ S + I L GM V A ++ G +
Sbjct: 428 IEENCLSTGNKNIP------LSSGMAVTAEIKTGMR 457



Score = 32.9 bits (75), Expect = 0.002
Identities = 24/133 (18%), Positives = 45/133 (33%), Gaps = 10/133 (7%)

Query: 49 PLQITTELPGR-TVAYRIAEVRPQVSGIILKRNFV-EGSDIEAGVSLYQIDP-------A 99
++I G+ T + R E++P + I+ K V EG + G L ++
Sbjct: 79 QVEIVATANGKLTHSGRSKEIKPIENSIV-KEIIVKEGESVRKGDVLLKLTALGAEADTL 137

Query: 100 TYQATYDSAKGDLAKAQAAANIAELTVKRYQKLLGTQYISKQEYDQALADAQQATAAVVA 159
Q++ A+ + + Q + EL KL Y ++ L
Sbjct: 138 KTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFST 197

Query: 160 AKAAVETARINLA 172
+ +NL
Sbjct: 198 WQNQKYQKELNLD 210


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_16605  ACRIFLAVINRP  1369   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1369 bits (3544), Expect = 0.0
Identities = 811/1033 (78%), Positives = 918/1033 (88%), Gaps = 1/1033 (0%)

Query: 1 MPNFFIDRPIFAWVIAIIIMLAGGLAILKLPVAQYPTIAPPAVTISATYPGADAKTVQDT 60
M NFFI RPIFAWV+AII+M+AG LAIL+LPVAQYPTIAPPAV++SA YPGADA+TVQDT
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTQVIEQNMNGIDNLMYMSSNSDSTGTVQITLTFESGTDADIAQVQVQNKLQLAMPLLPQ 120
VTQVIEQNMNGIDNLMYMSS SDS G+V ITLTF+SGTD DIAQVQVQNKLQLA PLLPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 EVQQQGVSVEKSSSSFLMVVGVINTDGTMTQEDISDYVAANMKDPISRTSGVGDVQLFGS 180
EVQQQG+SVEKSSSS+LMV G ++ + TQ+DISDYVA+N+KD +SR +GVGDVQLFG+
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 QYAMRIWMNPTELTKYQLTPVDVINAIKAQNAQVAAGQLGGTPPVKGQQLNASIIAQTRL 240
QYAMRIW++ L KY+LTPVDVIN +K QN Q+AAGQLGGTP + GQQLNASIIAQTR
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 TSTDEFGKILLKVNQDGSQVRLRDVAKIELGGENYDVIAKFNGQPASGLGIKLATGANAL 300
+ +EFGK+ L+VN DGS VRL+DVA++ELGGENY+VIA+ NG+PA+GLGIKLATGANAL
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 DTATAIRAELKKMEPFFPPGMKIVYPYDTTPFVKISIHEVVKTLVEAIILVFLVMYLFLQ 360
DTA AI+A+L +++PFFP GMK++YPYDTTPFV++SIHEVVKTL EAI+LVFLVMYLFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NFRATLIPTIAVPVVLLGTFAVLAAFGFSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420
N RATLIPTIAVPVVLLGTFA+LAAFG+SINTLTMFGMVLAIGLLVDDAIVVVENVERVM
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 TEEGLPPKEATRKSMGQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480
E+ LPPKEAT KSM QIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SVLVALILTPALCATMLKPVAKGDHGEGKKGFFGWFNRLFDKSTHHYTDSVGNILRSTGR 540
SVLVALILTPALCAT+LKPV+ H E K GFFGWFN FD S +HYT+SVG IL STGR
Sbjct: 481 SVLVALILTPALCATLLKPVSAE-HHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 541 YLLLYLIIVVGMAYLFVRLPSSFLPDEDQGVFLTMVQLPAGATQERTQKVLDEVTDYYLN 600
YLL+Y +IV GM LF+RLPSSFLP+EDQGVFLTM+QLPAGATQERTQKVLD+VTDYYL
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 601 KEKANVESVFAVNGFGFAGRGQNTGIAFVSLKDWAERPGEKNKVEAITQRATAAFSQIKD 660
EKANVESVF VNGF F+G+ QN G+AFVSLK W ER G++N EA+ RA +I+D
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRD 659

Query: 661 AMVFAFNLPAIVELGTATGFDFELIDQAGLGHEKLTQARNQLFGEVAKYPDLLVGVRPNG 720
V FN+PAIVELGTATGFDFELIDQAGLGH+ LTQARNQL G A++P LV VRPNG
Sbjct: 660 GFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNG 719

Query: 721 LEDTPQFKIDIDQEKAQALGVSISDINTTLGAAWGGSYVNDFIDRGRVKKVYVMSEAKYR 780
LEDT QFK+++DQEKAQALGVS+SDIN T+ A GG+YVNDFIDRGRVKK+YV ++AK+R
Sbjct: 720 LEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFR 779

Query: 781 MLPDDINDWYVRGSDGQMVPFSAFSSSRWEYGSPRLERYNGLPSMEILGQAAPGKSTGEA 840
MLP+D++ YVR ++G+MVPFSAF++S W YGSPRLERYNGLPSMEI G+AAPG S+G+A
Sbjct: 780 MLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDA 839

Query: 841 MAMMEELASKLPSGIGYDWTGMSYQERLSGNQAPALYAISLIVVFLCLAALYESWSIPFS 900
MA+ME LASKLP+GIGYDWTGMSYQERLSGNQAPAL AIS +VVFLCLAALYESWSIP S
Sbjct: 840 MALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVS 899

Query: 901 VMLVVPLGVIGALLAATFRGLTNDVYFQVGLLTTIGLSAKNAILIVEFAKDLMDKEGKGL 960
VMLVVPLG++G LLAAT NDVYF VGLLTTIGLSAKNAILIVEFAKDLM+KEGKG+
Sbjct: 900 VMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGV 959

Query: 961 VEATLEAVRMRLRPILMTSLAFMLGVMPLVISSGAGSGAQNAVGTGVLGGMVTATVLAIF 1020
VEATL AVRMRLRPILMTSLAF+LGV+PL IS+GAGSGAQNAVG GV+GGMV+AT+LAIF
Sbjct: 960 VEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIF 1019

Query: 1021 FVPVFFVVVRRRF 1033
FVPVFFVV+RR F
Sbjct: 1020 FVPVFFVVIRRCF 1032


95  GX95_16980  GX95_17005  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
GX95_16980  2       17       2.836268   two-component system sensor histidine kinase
GX95_16985  1       16       3.506126   phosphate regulon transcriptional regulatory
GX95_16990  3       15       3.381843   hypothetical protein
GX95_16995  2       16       3.462607   exonuclease subunit SbcD
GX95_17000  2       14       2.904928   exonuclease subunit SbcC
GX95_17005  -2      15       1.261140   MFS transporter AraJ
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_16980  PF06580  36     2e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.0 bits (83), Expect = 2e-04
Identities = 17/123 (13%), Positives = 38/123 (30%), Gaps = 28/123 (22%)

Query: 300 TFTFEVDDSLSVLGNEEQLRSAISNLVYNAVNH----TPAGTHITVNWRRAAHGAEFCIQ 355
F +++ ++ + + + LV N + H P G I + + ++
Sbjct: 241 QFENQINPAIM---DVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVE 297

Query: 356 DNGPGIAAEHIPRLTERFYRVDKARSRQTGGSGLGLAIVKHALNH---HESRLEIDSSPG 412
+ G +G GL V+ L E+++++ G
Sbjct: 298 NTGSLAL------------------KNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQG 339

Query: 413 KGT 415
K
Sbjct: 340 KVN 342


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
GX95_16985  HTHFIS  97     2e-25    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 96.8 bits (241), Expect = 2e-25
Identities = 34/149 (22%), Positives = 63/149 (42%), Gaps = 9/149 (6%)

Query: 4 RILVVEDEAPIREMVCFVLEQNGFQPVEAEDYDSAVNKLNEPWPDLILLDWMLPGGSGLQ 63
ILV +D+A IR ++ L + G+ + + + DL++ D ++P +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 64 FIKHLKREALTRDIPVVMLTARGEEEDRVRGLETGADDYITKPFSPKELVARIKAVMRRI 123
+ +K+ D+PV++++A+ ++ E GA DY+ KPF EL+ I +
Sbjct: 65 LLPRIKKARP--DLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA-- 120

Query: 124 SPMAVEEVIEMQGLSLDPGSHRVMTGDSP 152
E L D + G S
Sbjct: 121 -----EPKRRPSKLEDDSQDGMPLVGRSA 144


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
GX95_16995  FRAGILYSIN  30     0.021    Fragilysin metallopeptidase (M10C) enterotoxin signat...
		>FRAGILYSIN#Fragilysin metallopeptidase (M10C) enterotoxin signature.
Length = 405

Score = 29.7 bits (66), Expect = 0.021
Identities = 14/70 (20%), Positives = 29/70 (41%), Gaps = 4/70 (5%)

Query: 149 KQQQLLHAIADYYQQQYREACQLRGERKLPVIATGHLTTVGASKSDAVRDIYIGTLDAFP 208
K+ Q+++ IA++Y +++ + E++ T D + + I A
Sbjct: 135 KEAQMMNEIAEFYAAPFKKTRAIN-EKEAFECIYDSRTRSA--GKD-IVSVKINIDKAKK 190

Query: 209 AQHFPPADYI 218
+ P DYI
Sbjct: 191 ILNLPECDYI 200


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
GX95_17000  RTXTOXIND  46     7e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 46.0 bits (109), Expect = 7e-07
Identities = 31/198 (15%), Positives = 70/198 (35%), Gaps = 13/198 (6%)

Query: 373 TQQSHDRAQLSQWQQQLLSDTRQRDALPPLTLDLTPQALAEARALHTRQRPLRHRLAALQ 432
TQ S +A+L Q + Q+LS + + + LP L L P + R L ++
Sbjct: 139 TQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSL------IK 192

Query: 433 GQIIPKQKRQAQLQAAIARHHQEQTQYTQHLTDKRLSYKTKAQELADVRTICEQ----EA 488
Q Q ++ Q + + + E+ + + + L D ++ + +
Sbjct: 193 EQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKH 252

Query: 489 RIKDLESQRAHLQS--GQPCPLCGSTTHPAIAAYQALELSANQTRRDALEKEVKTLAEEG 546
+ + E++ + ++A + +L + + L+K +T
Sbjct: 253 AVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNI- 311

Query: 547 AALRGQLDALTQQLQRDE 564
L +L ++ Q
Sbjct: 312 GLLTLELAKNEERQQASV 329


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_17005  TCRTETA  52     2e-09    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 51.7 bits (124), Expect = 2e-09
Identities = 70/356 (19%), Positives = 122/356 (34%), Gaps = 35/356 (9%)

Query: 5 IFSLALGTFGLGMAEFGIMGVLTELARDVGITIPAAGH---MISFYAFGVVLGAPVMALF 61
+ ++AL G+G+ IM VL L RD+ + H +++ YA APV+
Sbjct: 11 LSTVALDAVGIGL----IMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 62 SSRFSLKHILLFLVTLCVMGNAIFTFSSSYLMLAVGRLVSGFPHGAFFGVGAIVLSKIIR 121
S RF + +LL + + AI + +L +GR+V+G GA + I
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIAD--IT 124

Query: 122 PGKVTAAVAGMVSGMTVANLVGIPVGTYLSQEFSWRYTFLLIAVFNIAVLTAIFFWVPDI 181
G A G +S +V PV L FS F A N F +P+
Sbjct: 125 DGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPES 184

Query: 182 RDKAQGSLREQ----------FHFLRSPAPWLI--FAATMFGNAGVFAWFSYIKPFMMYI 229
+ LR + + A + F + G W +
Sbjct: 185 HKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG------E 238

Query: 230 SGFSETSMTFIMMLVGLGM---VLGNLLSGKLSGRYTPLRIAVVTDLIIVLSLMALFFFS 286
F + T + L G+ + +++G ++ R R ++ ++ +
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLG---MIADGTGYILLA 295

Query: 287 GYKTASLTFAFICCAGLFALSAPLQILLLQNAKGGELLGAAGGQIAF--NLGSAIG 340
+ F + + P +L E G G +A +L S +G
Sbjct: 296 FATRGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVG 351


96  GX95_17105  GX95_17140  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
GX95_17105  1       17       -0.959712  transcriptional regulator
GX95_17110  1       18       0.281052   autotransporter outer membrane beta-barrel
GX95_17115  1       17       3.474641   delta-aminolevulinic acid dehydratase
GX95_17120  1       17       3.164678   propionate--CoA ligase
GX95_17125  1       14       2.138367   2-methylcitrate dehydratase
GX95_17130  0       13       1.018472   2-methylcitrate synthase
GX95_17135  -2      14       1.460178   methylisocitrate lyase
GX95_17140  -2      16       0.861996   propionate catabolism operon regulatory protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_17105  PF06291  30     0.002    Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 30.0 bits (67), Expect = 0.002
Identities = 21/68 (30%), Positives = 30/68 (44%), Gaps = 11/68 (16%)

Query: 28 VNDKEIICSPDESNTHTFVILEGVVSLVRGDKVLIGIVQAPFIFGLADGVAKKEAQYKLI 87
V +K +P E+ TH F VS + K V A I G A+ V K E Q +
Sbjct: 29 VGNKPTAVTPKETITHHFF-----VSGIGQKKT----VDAAKICGGAENVVKTETQQTFV 79

Query: 88 AESGCIGY 95
+G +G+
Sbjct: 80 --NGLLGF 85


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_17110  PRTACTNFAMLY  121    5e-30    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 121 bits (304), Expect = 5e-30
Identities = 99/436 (22%), Positives = 166/436 (38%), Gaps = 59/436 (13%)

Query: 597 TYSANGEADNSYTDNVVA---ATGNYKVRIDNATGAGSVADYKGNELIRVNDVNTDATFS 653
+ N AD +D +V A+G +++ + N+ GS L+ + + ATF+
Sbjct: 483 LFRMNVFADLGLSDKLVVMQDASGQHRLWVRNS---GSEPASANTLLLVQTPLGSAATFT 539

Query: 654 AAN---KADLGAYTYQAKQEGNTV------------------------------------ 674
AN K D+G Y Y+ GN
Sbjct: 540 LANKDGKVDIGTYRYRLAANGNGQWSLVGAKAPPAPKPAPQPGPQPPQPPQPQPEAPAPQ 599

Query: 675 VLEQMELTDYANMALSIP--SANTNIWNLEQDTVGTRLTNARHGLADNGGAWVSYFGGNF 732
EL+ AN A++ + +W E + + RL R D GGAW F
Sbjct: 600 PPAGRELSAAANAAVNTGGVGLASTLWYAESNALSKRLGELRL-NPDAGGAWGRGFAQRQ 658

Query: 733 NGDNGTIN-YDQDVNGIMVGVDTKVDGNNAKWIVGAAAGFAKGDLS---DRTGQVDQDSQ 788
DN +DQ V G +G D V +W +G AG+ +GD D G D
Sbjct: 659 QLDNRAGRRFDQKVAGFELGADHAVAVAGGRWHLGGLAGYTRGDRGFTGDGGGHTD---- 714

Query: 789 SAYIYSSARFANN--IFVDGNLSYSHFNNDLSANMSDGTYVDGNTSSDAWGFGLKLGYDW 846
S ++ A + + ++D L S ND SDG V G + G L+ G +
Sbjct: 715 SVHVGGYATYIADSGFYLDATLRASRLENDFKVAGSDGYAVKGKYRTHGVGASLEAGRRF 774

Query: 847 KLGDAGYVTPYGSVSGLFQSGDDYQLSNNMKVDGQSYDSMRYEIGVDAGYTFTYSEDQAL 906
D ++ P ++ G Y+ +N ++V + S+ +G++ G + + +
Sbjct: 775 THADGWFLEPQAELAVFRAGGGAYRAANGLRVRDEGGSSVLGRLGLEVGKRIELAGGRQV 834

Query: 907 TPYFKLAYVYD-DSNNDADVNGDSIDNGVEGSAVRVGLGTQFSFTKNFSAYTDANYLGGG 965
PY K + + + D NG + + G+ +GLG + + S Y Y G
Sbjct: 835 QPYIKASVLQEFDGAGTVHTNGIAHRTELRGTRAELGLGMAAALGRGHSLYASYEYSKGP 894

Query: 966 DVDQDWSANVGVKYTW 981
+ W+ + G +Y+W
Sbjct: 895 KLAMPWTFHAGYRYSW 910


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_17115  BINARYTOXINB  32     0.003    Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 32.3 bits (73), Expect = 0.003
Identities = 19/69 (27%), Positives = 29/69 (42%)

Query: 254 DVLREIRERTELPLGAYQVSGEYAMIKFAAMAGAIDEEKVVLESLGSIKRAGADLIFSYF 313
+ E+ + +L L QV G A F +D E L I+ A +IF+
Sbjct: 466 NQFLELEKTKQLRLDTDQVYGNIATYNFENGRVRVDTGSNWSEVLPQIQETTARIIFNGK 525

Query: 314 ALDLAEKNI 322
L+L E+ I
Sbjct: 526 DLNLVERRI 534


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
GX95_17140  HTHFIS  345    e-115    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 345 bits (886), Expect = e-115
Identities = 122/379 (32%), Positives = 191/379 (50%), Gaps = 51/379 (13%)

Query: 191 DALDMTRLTRR--QRVDYPPGKGLQTRYELGDIRGQSPQMEQLRQTITLYARSRAAVLIQ 248
D ++ + R P K + + G+S M+++ + + ++ ++I
Sbjct: 107 DLTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMIT 166

Query: 249 GETGTGKELAAQAIHQTFFHRQPHRQNKPSPPFVAVNCGAITESLLEAELFGYEEGAFTG 308
GE+GTGKEL A+A+H R+N P FVA+N AI L+E+ELFG+E+GAFTG
Sbjct: 167 GESGTGKELVARALHD-----YGKRRNGP---FVAINMAAIPRDLIESELFGHEKGAFTG 218

Query: 309 SRRGGRAGLFEIAHGGTLFLDEIGEMPLPLQTRLLRVLEEKAVTRVGGHQPIPVDVRVIS 368
++ G FE A GGTLFLDEIG+MP+ QTRLLRVL++ T VGG PI DVR+++
Sbjct: 219 AQTR-STGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVA 277

Query: 369 ATHCDLDREIMQGRFRPDLFYRLSILRLTLPPLRERQADILPLAENFLKQSLAAMEIPFT 428
AT+ DL + I QG FR DL+YRL+++ L LPPLR+R DI L +F++Q +
Sbjct: 278 ATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQ-AEKEGLD-- 334

Query: 429 ESIRHGLTQCQPLLLAWRWPGNIRELRNMMERLALFLS---------------------- 466
++ + L+ A WPGN+REL N++ RL
Sbjct: 335 --VKRFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVITREIIENELRSEIPDSPI 392

Query: 467 ----VDPAPTLDRQFMRQLLPELMVNTAELTPST---------VDAHALQDVLARFNGDK 513
Q + + + + + + P + ++ + L G++
Sbjct: 393 EKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQ 452

Query: 514 TAAARYLGISRTTLWRRLK 532
AA LG++R TL ++++
Sbjct: 453 IKAADLLGLNRNTLRKKIR 471


97  GX95_17195  GX95_17250  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
GX95_17195  0       14       4.215263    MFS transporter
GX95_17200  0       13       4.425169    hypothetical protein
GX95_17205  -1      13       4.949121    heavy metal transport/detoxification protein
GX95_17210  -1      12       4.611877    Cu(I)-responsive transcriptional regulator
GX95_17215  -1      13       4.212851    copper-translocating P-type ATPase
GX95_17220  -1      16       1.819208    efflux transporter periplasmic adaptor subunit
GX95_17225  -1      16       -0.195716   multidrug efflux RND transporter permease
GX95_17230  1       25       -5.122247   multidrug transporter
GX95_17235  2       36       -12.306505  hypothetical protein
GX95_17240  2       34       -10.083513  hypothetical protein
GX95_17245  2       34       -10.229386  transcriptional regulator
GX95_17250  3       35       -10.870281  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_17195  TCRTETA  56     7e-11    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 56.4 bits (136), Expect = 7e-11
Identities = 56/306 (18%), Positives = 108/306 (35%), Gaps = 17/306 (5%)

Query: 19 FTSWMLDAFDFFILVFVLSDLAEWFHAS---VSDVSIAIMLTLAVRPIGALLFGRMAEKY 75
++ LDA +++ VL L S + I + L ++ A + G +++++
Sbjct: 11 LSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRF 70

Query: 76 GRRPILMLNILFFTVFELLSAWSPTFMAFLIFRVMYGVAMGGIWGVASSLAMETIPDRSR 135
GRRP+L++++ V + A +P I R++ G+ G VA + + R
Sbjct: 71 GRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGIT-GATGAVAGAYIADITDGDER 129

Query: 136 ----GLMSGIFQAGYPCGYLFASVIFGLFYSMVGWRGMFLIGA---LPVVLLPYIWFKVP 188
G MS F G + A + G F A L
Sbjct: 130 ARHFGFMSACFGFG-----MVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPES 184

Query: 189 ESPVWLAARARKENTALLPVLRKQWKLCLYLVLVMAFFNFFSHGTQDLYPTFLKMQHGFD 248
R N + + L+ V L+ F + + +D
Sbjct: 185 HKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWD 244

Query: 249 PHLISI-IAIFYNIAAMLGGIFYGTLSERIGRKKAIMIAAFLALPVLPLWAFSSGSFTIG 307
I I +A F + ++ + G ++ R+G ++A+M+ L AF++ +
Sbjct: 245 ATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAF 304

Query: 308 LGAFLM 313
L+
Sbjct: 305 PIMVLL 310



Score = 33.6 bits (77), Expect = 0.001
Identities = 37/186 (19%), Positives = 77/186 (41%), Gaps = 10/186 (5%)

Query: 3 TPLNWTTTQRHVAFASFTSWMLDAF-DFFILVFVLSDLAEWFHASVSDVSIAIMLTLAVR 61
W VA +++ ++V+ + FH + + I++ +
Sbjct: 201 ASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG-EDRFHWDATTIGISLAAFGILH 259

Query: 62 PIG-ALLFGRMAEKYGRRPILMLNILF-FTVFELLSAWSPTFMAFLIFRVMYGVAMG--G 117
+ A++ G +A + G R LML ++ T + LL+ + +MAF I ++ +G
Sbjct: 260 SLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGMPA 319

Query: 118 IWGVASSLAMETIPDRSRGLMSGIFQAGYPCGYLFASVIFGLFYSMVGWRG-MFLIGA-L 175
+ + S E + +G ++ + G L + I+ S+ W G ++ GA L
Sbjct: 320 LQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYA--ASITTWNGWAWIAGAAL 377

Query: 176 PVVLLP 181
++ LP
Sbjct: 378 YLLCLP 383


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
GX95_17220  RTXTOXIND  48     4e-08    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 48.3 bits (115), Expect = 4e-08
Identities = 18/112 (16%), Positives = 37/112 (33%), Gaps = 7/112 (6%)

Query: 74 ELRSRVGGTLDAVSVPEGRLVSRGQLLFQIDPRPFEVALDTAVAQLRQAEVLARQAQADF 133
E++ + + V EG V +G +L ++ E + L QA + + Q
Sbjct: 98 EIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILS 157

Query: 134 DRIQR-------LVASGAVSRKNADDVTATRNARQAQMQSAKAAVAAARLEL 178
I+ L + ++V + + Q + + L L
Sbjct: 158 RSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNL 209



Score = 34.4 bits (79), Expect = 7e-04
Identities = 20/106 (18%), Positives = 37/106 (34%), Gaps = 13/106 (12%)

Query: 112 LDTAVAQLRQAEVLARQAQADFDRIQRLVASGAVSRKNADDVTATRNARQAQMQSAKAAV 171
L +QL Q E A+ ++ + +L + ++ + +
Sbjct: 268 LRVYKSQLEQIESEILSAKEEYQLVTQLFKN---------EILDKLRQTTDNIGLLTLEL 318

Query: 172 AAARLELSWTRITAPIAGRVDRILVTRGNLVSGGVAGNATLLTTIV 217
A + I AP++ +V ++ V GGV A L IV
Sbjct: 319 AKNEERQQASVIRAPVSVKVQQLKVHT----EGGVVTTAETLMVIV 360


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_17225  ACRIFLAVINRP  1046   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1046 bits (2706), Expect = 0.0
Identities = 435/1040 (41%), Positives = 660/1040 (63%), Gaps = 19/1040 (1%)

Query: 6 FFIARPIFAIVLSLLMLLAGAIAFLKLPLSEYPAVTPPTVQVSASYPGANPQVIADTVAA 65
FFI RPIFA VL++++++AGA+A L+LP+++YP + PP V VSA+YPGA+ Q + DTV
Sbjct: 4 FFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQ 63

Query: 66 PLEQVINGVDGMLYMNTQMAIDGRMVISIAFEQGTDPDMAQIQVQNRVSRALPRLPEEVQ 125
+EQ +NG+D ++YM++ G + I++ F+ GTDPD+AQ+QVQN++ A P LP+EVQ
Sbjct: 64 VIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQ 123

Query: 126 RIGVVTEKTSPDMLMVVHLVSPQKRYDSLYLSNFAIRQVRDELARLPGVGDVLVWGAGEY 185
+ G+ EK+S LMV VS +S++ V+D L+RL GVGDV ++G +Y
Sbjct: 124 QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG-AQY 182

Query: 186 AMRVWLDPAKIANRGLTASDIVTALREQNVQVAAGSVGQQPEASA-AFQMTVNTLGRLTS 244
AMR+WLD + LT D++ L+ QN Q+AAG +G P ++ R +
Sbjct: 183 AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKN 242

Query: 245 EEQFGEIVVKIGADGEVTRLRDVARVTLGADAYTLRSLLNGEAAPALQIIQSPGANAIDV 304
E+FG++ +++ +DG V RL+DVARV LG + Y + + +NG+ A L I + GANA+D
Sbjct: 243 PEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALDT 302

Query: 305 SNAIRGKMDELQQNFPQDIEYRIAYDPTVFVRASLQSVAITLLEALVLVVLVVVLFLQTW 364
+ AI+ K+ ELQ FPQ ++ YD T FV+ S+ V TL EA++LV LV+ LFLQ
Sbjct: 303 AKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQNM 362

Query: 365 RASIIPLVAVPVSLVGTFALMHLFGFSLNTLSLFGLVLSIGIVVDDAIVVVENVERHISQ 424
RA++IP +AVPV L+GTFA++ FG+S+NTL++FG+VL+IG++VDDAIVVVENVER + +
Sbjct: 363 RATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMME 422

Query: 425 GKSPG-EAAKKAMDEVTGPILSITSVLTAVFIPSAFLAGLQGEFYRQFALTIAISTILSA 483
K P EA +K+M ++ G ++ I VL+AVFIP AF G G YRQF++TI + LS
Sbjct: 423 DKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSV 482

Query: 484 INSLTLSPALAAILLRPHHDTAKADWLTRLMGTVTGGFFHRFNRFFDSASNRYVSAVRRA 543
+ +L L+PAL A LL+P ++ GGFF FN FD + N Y ++V +
Sbjct: 483 LVALILTPALCATLLKP---------VSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKI 533

Query: 544 VRGSVIVMVLYAGFVGLTWLGFHQVPNGFVPAQDKYYLVGIAQLPSGASLDRTEAVVKQM 603
+ + +++YA V + F ++P+ F+P +D+ + + QLP+GA+ +RT+ V+ Q+
Sbjct: 534 LGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQV 593

Query: 604 SAIALA--EPGVESVVVFPGLSVNGPVNVPNSALMFAMLKPFDEREDPSLSANAIAGKLM 661
+ L + VESV G S +G N+ + F LKP++ER SA A+ +
Sbjct: 594 TDYYLKNEKANVESVFTVNGFSFSG--QAQNAGMAFVSLKPWEERNGDENSAEAVIHRAK 651

Query: 662 HKFSHIPDGFIGIFPPPPVPGLGATGGFKLQIEDRAELGFEAMTKVQSEIMSKAMQTP-E 720
+ I DGF+ F P + LG GF ++ D+A LG +A+T+ +++++ A Q P
Sbjct: 652 MELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPAS 711

Query: 721 LANMLASFQTNAPQLQVDIDRVKAKSMGVSLTDIFETLQINLGSLYVNDFNRFGRTWRVM 780
L ++ + + Q ++++D+ KA+++GVSL+DI +T+ LG YVNDF GR ++
Sbjct: 712 LVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLY 771

Query: 781 AQADAPFRMQQEDIGLLKVRNAKGEMIPLSAFVTIMRQSGPDRIIHYNGFPSVDISGGPA 840
QADA FRM ED+ L VR+A GEM+P SAF T G R+ YNG PS++I G A
Sbjct: 772 VQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAA 831

Query: 841 PGFSSGQATDAIEKIVRETLPEGMVFEWTDLVYQEKQAGNSALAIFALAVLLAFLILAAQ 900
PG SSG A +E + + LP G+ ++WT + YQE+ +GN A A+ A++ ++ FL LAA
Sbjct: 832 PGTSSGDAMALMENLASK-LPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAAL 890

Query: 901 YNSWSLPFAVLLIAPMSLLSAIVGVWVSGGDNNIFTQIGFVVLVGLAAKNAILIVEFAR- 959
Y SWS+P +V+L+ P+ ++ ++ + N+++ +G + +GL+AKNAILIVEFA+
Sbjct: 891 YESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKD 950

Query: 960 AKEHDGADPLTAVLEASRLRLRPILMTSFAFIAGVVPLVLATGAGAEMRHAMGIAVFAGM 1019
E +G + A L A R+RLRPILMTS AFI GV+PL ++ GAG+ ++A+GI V GM
Sbjct: 951 LMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGM 1010

Query: 1020 LGVTLFGLLLTPVFYVVVRR 1039
+ TL + PVF+VV+RR
Sbjct: 1011 VSATLLAIFFVPVFFVVIRR 1030



Score = 89.5 bits (222), Expect = 4e-20
Identities = 68/427 (15%), Positives = 143/427 (33%), Gaps = 36/427 (8%)

Query: 643 FDEREDPSLSANAIAGKLMHKFSHIPDGFIGIFPPPPVPGLGATGGFKLQIEDRAELGFE 702
F DP ++ + KL +P + ++ + + ++
Sbjct: 94 FQSGTDPDIAQVQVQNKLQLATPLLPQEV----QQQGISVEKSSSSYLMVAGFVSDNP-- 147

Query: 703 AMTKVQSEIMSKAMQTPELANM--LASFQTNAPQLQVDI--DRVKAKSMGVSLTDIFETL 758
T+ + L+ + + Q Q + I D ++ D+ L
Sbjct: 148 GTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQYAMRIWLDADLLNKYKLTPVDVINQL 207

Query: 759 QIN--------LGSLYVNDFNRFGRTWRVMAQADAPFRMQQEDIGLLKVR-NAKGEMIPL 809
++ LG + + + P E+ G + +R N+ G ++ L
Sbjct: 208 KVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKNP-----EEFGKVTLRVNSDGSVVRL 262

Query: 810 SAFVTIMRQSGPDRII-HYNGFPSVDISGGPAPGFSSGQATDAIEKIV---RETLPEGM- 864
+ +I NG P+ + A G ++ AI+ + + P+GM
Sbjct: 263 KDVARVELGGENYNVIARINGKPAAGLGIKLATGANALDTAKAIKAKLAELQPFFPQGMK 322

Query: 865 ---VFEWTDLVYQEKQAGNSALAIFALAVLLAFLILAAQYNSWSLPFAVLLIAPMSLLSA 921
++ T V + + + + A++L FL++ + + P+ LL
Sbjct: 323 VLYPYDTTPFV---QLSIHEVVKTLFEAIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGT 379

Query: 922 IVGVWVSGGDNNIFTQIGFVVLVGLAAKNAILIVE-FARAKEHDGADPLTAVLEASRLRL 980
+ G N T G V+ +GL +AI++VE R D P A ++
Sbjct: 380 FAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQ 439

Query: 981 RPILMTSFAFIAGVVPLVLATGAGAEMRHAMGIAVFAGMLGVTLFGLLLTPVFYVVVRRM 1040
++ + A +P+ G+ + I + + M L L+LTP + +
Sbjct: 440 GALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLKP 499

Query: 1041 ALKRENR 1047
+
Sbjct: 500 VSAEHHE 506


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits      Score  E-value  Comments
GX95_17230  cdtoxinb  29     0.042    Cytolethal distending toxin B signature.
		>cdtoxinb#Cytolethal distending toxin B signature.

Length = 269

Score = 28.8 bits (64), Expect = 0.042
Identities = 21/70 (30%), Positives = 27/70 (38%), Gaps = 4/70 (5%)

Query: 75 AIALRNNRDLRKAGLNVEAARALYRIQRAEMLPTLGIATAMDAGRTPADLSVMDEPEINR 134
AIA+RNN A VE +R R + L D R PADL + + R
Sbjct: 155 AIAMRNN----DAPALVEEVYNFFRDSRDPVHQALNWMILGDFNREPADLEMNLTVPVRR 210

Query: 135 RYEMAGATTA 144
E+ A
Sbjct: 211 ASEIISPAAA 220


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_17250  ENTEROVIROMP  134    7e-43    Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.
Length = 171

Score = 134 bits (339), Expect = 7e-43
Identities = 59/183 (32%), Positives = 88/183 (48%), Gaps = 21/183 (11%)

Query: 1 MKRRSSFLVFLGLLLASPLALANDQHTVSFGYAQTHLSSLKNSDSKDLRGFNFKYRYEFN 60
MK+ + +L + TV+ GYAQ+ N + GFN KYRYE +
Sbjct: 1 MKKIACLSALAAVLAFTAGTSVAATSTVTGGYAQSDAQGQMN----KMGGFNLKYRYEED 56

Query: 61 ET-WGMLGSFTATRNEMENYTWKEGKLHKNGSDSVDYGSLMFGPTYRFNDYVSLYGNAGI 119
+ G++GSFT T + K Y + GP YR ND+ S+YG G+
Sbjct: 57 NSPLGVIGSFTYTEKSRTASSGDYNK--------NQYYGITAGPAYRINDWASIYGVVGV 108

Query: 120 ATMKF--------NKHSKEDSFAYGAGVIFNPVKSISIDASWEASRFFAVDTNTFGVSVG 171
KF + + F+YGAG+ FNP++++++D S+E SR +VD T+ VG
Sbjct: 109 GYGKFQTTEYPTYKHDTSDYGFSYGAGLQFNPMENVALDFSYEQSRIRSVDVGTWIAGVG 168

Query: 172 YRF 174
YRF
Sbjct: 169 YRF 171


98  GX95_19715  GX95_19745  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias   Product
GX95_19715  -1      13       1.093724  hypothetical protein
GX95_19720  -1      13       1.545638  MFS transporter
GX95_19725  0       15       1.991118  hypothetical protein
GX95_19730  -1      15       1.802544  hypothetical protein
GX95_19735  -1      13       1.416938  hypothetical protein
GX95_19740  -1      13       1.964806  hypothetical protein
GX95_19745  -1      14       1.171610  beta-aspartyl-peptidase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_19715  FLGFLIH  34     7e-04    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 33.6 bits (76), Expect = 7e-04
Identities = 23/70 (32%), Positives = 33/70 (47%), Gaps = 8/70 (11%)

Query: 258 KGERQGRQVGLEEGLAEGLEKGLEKGQHVAALRIAR--------QMLADGLDRETVQRFT 309
+G +QG + G +EGLA+GLE+GL + + A AR Q D LD R
Sbjct: 62 EGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRLM 121

Query: 310 GLTAEELQDV 319
+ E + V
Sbjct: 122 QMALEAARQV 131



Score = 30.9 bits (69), Expect = 0.005
Identities = 16/47 (34%), Positives = 26/47 (55%)

Query: 238 PHTKERLMTLIERIRAADRRKGERQGRQVGLEEGLAEGLEKGLEKGQ 284
P +++L L + + G +GRQ G ++G EGL +GLE+G
Sbjct: 38 PSLEQQLAQLQMQAHEQGYQAGIAEGRQQGHKQGYQEGLAQGLEQGL 84


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_19720  TCRTETB  48     5e-08    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 47.6 bits (113), Expect = 5e-08
Identities = 35/141 (24%), Positives = 64/141 (45%), Gaps = 5/141 (3%)

Query: 58 SLYLAGGMALQWLLGPLSDRIGRRPVLIAGALIFTLACAATLLTTSMTQFLV-ARFVQGT 116
L + G A+ G LSD++G + +L+ G +I + S L+ ARF+QG
Sbjct: 59 MLTFSIGTAV---YGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGA 115

Query: 117 SICFIATVGYVTVQEAFGQTKAIKLMAIITSIVLVAPVIGPLSGAALMHFVHWKVLFGII 176
+ V V + K +I SIV + +GP G + H++HW L +I
Sbjct: 116 GAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLL-LI 174

Query: 177 AVMGLLALCGLLLAMPETVQR 197
++ ++ + L+ + + V+
Sbjct: 175 PMITIITVPFLMKLLKKEVRI 195


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
GX95_19730  TCRTETA  39     2e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 39.4 bits (92), Expect = 2e-05
Identities = 83/396 (20%), Positives = 141/396 (35%), Gaps = 31/396 (7%)

Query: 9 PRHPIFTALFGMMVLTLGMGVGRFLYTPMLPVMLAEKQLTFNQLSWIASANYAGYLAGSL 68
P P+ L + + +G+G L P+LP +L + + N ++ A Y
Sbjct: 3 PNRPLIVILSTVALDAVGIG----LIMPVLPGLLRDLVHS-NDVTAHYGILLALYALMQF 57

Query: 69 LFSFGLFHLPSRL--RPMLLASAVATGILILSMAIFTQPAVVMLVRFLAGVASAGMMIFG 126
+ L L R RP+LL S + MA V+ + R +AG+ A + G
Sbjct: 58 ACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAG 117

Query: 127 SMI-----VLHHTRHPFVIAALFSGVGAGIALGNEYVIGGLHYALSAHSLWLGAGALAGI 181
+ I RH ++A F G G+ G V+GGL S H+ + A AL G+
Sbjct: 118 AYIADITDGDERARHFGFMSACF---GFGMVAGP--VLGGLMGGFSPHAPFFAAAALNGL 172

Query: 182 LLLIVAMLIPPRAHALPPAPLARIENQPMPWWQLA-LLYGFAGFGYIIVATYLPLMAKSA 240
L L+P +H PL R P+ ++ A + A + L +A
Sbjct: 173 NFLTGCFLLPE-SHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAA 231

Query: 241 GSPLLTAHL--WSLVGLAIIPGCFGWLWA----------AKHWGVLPCLTANLLIQSACV 288
+ W + I FG L + A G L ++
Sbjct: 232 LWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGY 291

Query: 289 LLSLASDSLLLLILSSIGFGATFMGTTSLVMPLARQLSAPGNINLLGLVTLTYGIGQILG 348
+L + + + + +G +L L+RQ+ L G + + I+G
Sbjct: 292 ILLAFATRGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVG 351

Query: 349 PLAASLSGNGAPAIINATLCGAAALFFAALISAAQQ 384
PL + + N A A + + A ++
Sbjct: 352 PLLFTAIYAASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
GX95_19750  UREASE  37     1e-04    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 36.6 bits (85), Expect = 1e-04
Identities = 32/129 (24%), Positives = 49/129 (37%), Gaps = 33/129 (25%)

Query: 26 CDVLLANGKIIAVGADIPGDIVPDCT--------VINLSGRMLCPGFIDQHVHLIGG--- 74
D+ L +G+I A+G D+ P T VI G+++ G +D H+H I
Sbjct: 86 ADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEVIAGEGKIVTAGGMDSHIHFICPQQI 145

Query: 75 ------------GGEAGP------TTRTP-EVSLSRLTEA--GITTVVGLLGTDSVSRHP 113
GG GP TT TP ++R+ EA + G + S P
Sbjct: 146 EEALMSGLTCMLGGGTGPAHGTLATTCTPGPWHIARMIEAADAFPMNLAFAGKGNAS-LP 204

Query: 114 ASLLAKTRA 122
+L+
Sbjct: 205 GALVEMVLG 213


99  GX95_20410  GX95_20440  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias   Product
GX95_20410  -1      17       2.137002  protease modulator HflC
GX95_20415  -3      13       2.812354  HflK protein
GX95_20420  -3      16       2.479073  GTPase HflX
GX95_20425  -1      15       3.624725  RNA chaperone Hfq
GX95_20430  -1      13       3.615148  tRNA (adenosine(37)-N6)-dimethylallyltransferase
GX95_20435  -1      12       3.101377  DNA mismatch repair protein MutL
GX95_20440  -1      11       2.258084  N-acetylmuramoyl-L-alanine amidase AmiB
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_20410  PYOCINKILLER  29     0.030    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 29.0 bits (64), Expect = 0.030
Identities = 18/65 (27%), Positives = 30/65 (46%), Gaps = 3/65 (4%)

Query: 225 NRMRAEREAVARRHRSQGQEEAEKLRAAADYEVTK---TLAEAERQGRIMRGEGDAEAAK 281
N+ R + A A+R + + +RAA Y + +A A +G I +G A A+
Sbjct: 220 NKAREQAAAEAKRKAEEQARQQAAIRAANTYAMPANGSVVATAAGRGLIQVAQGAASLAQ 279

Query: 282 LFADA 286
+DA
Sbjct: 280 AISDA 284


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits  Score  E-value  Comments
GX95_20420  SECA  33     0.002    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 33.3 bits (76), Expect = 0.002
Identities = 26/144 (18%), Positives = 55/144 (38%), Gaps = 6/144 (4%)

Query: 282 HVVDAADVRVQENIEAVNTVLEEIDAHEIPTLMVMNKIDMLDDFEPRIDRDEENK-PIRV 340
++D +DV N + IDA+ P + ++ + + R+ D + PI
Sbjct: 665 ELLDVSDVSETINSIREDVFKATIDAYIPPQSL--EEMWDIPGLQERLKNDFDLDLPIAE 722

Query: 341 WLSAQSGVGIPQLFQALTERLSGEVAQHTLRLPPQEGRLRSRFYQLQAIEKEWMEEDGSV 400
WL + + L + + + + + + R + LQ ++ W E ++
Sbjct: 723 WLDKEPELHEETLRERILAQSIEVYQRKEEVVGAEMMRHFEKGVMLQTLDSLWKEHLAAM 782

Query: 401 SLQVRMPIVDWRRLCKQEPALIEY 424
+R I R +++P EY
Sbjct: 783 D-YLRQGIH-LRGYAQKDP-KQEY 803


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
GX95_20435  ALARACEMASE  30     0.028    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 30.1 bits (68), Expect = 0.028
Identities = 26/161 (16%), Positives = 57/161 (35%), Gaps = 18/161 (11%)

Query: 31 VENSLDAGATRVDIDIER---GGAKLIR-IRDNGCGIKKEELALALARHATSKIASLDDL 86
++ SLD A + ++ I R A++ ++ N G E + A+ + +L++
Sbjct: 5 IQASLDLQALKQNLSIVRQAATHARVWSVVKANAYGHGIERIWSAIGATDGFALLNLEEA 64

Query: 87 EAIISLGFRGEAL----------ASISSVSRLTLTSRTAEQAEAWQAYAEGRDMDVTVK- 135
+ G++G L I RLT + Q +A Q +D+ +K
Sbjct: 65 ITLRERGWKGPILMLEGFFHAQDLEIYDQHRLTTCVHSNWQLKALQNARLKAPLDIYLKV 124

Query: 136 -PAAHPVGTTLEVLDLFYNTPARRKFMRTEK--TEFNHIDE 173
+ +G + + + + + F +
Sbjct: 125 NSGMNRLGFQPDRVLTVWQQLRAMANVGEMTLMSHFAEAEH 165


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_20440  PF03544       29     0.030    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 29.2 bits (65), Expect = 0.030
Identities = 15/64 (23%), Positives = 25/64 (39%), Gaps = 7/64 (10%)

Query: 130 PPPPPPVVAKRVESAPRPTEPARNPFKSSDDRLTGVTSSNTVTRPAARASAGAGDKVVIA 189
P P P K+VE R +P + S + + RP + + A K V +
Sbjct: 100 KPKPKPKPVKKVEQPKRDVKPVESRPASPFE-------NTAPARPTSSTATAATSKPVTS 152

Query: 190 IDAG 193
+ +G
Sbjct: 153 VASG 156


100  GX95_20825  GX95_20855  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
GX95_20825  -1      12       -1.105805  phosphoethanolamine transferase EptA
GX95_20830  -1      13       -0.389949  two-component system response regulator BasR
GX95_20835  -1      12       -0.861005  two-component system sensor histidine kinase
GX95_20840  0       12       -1.438824  proline/betaine transporter
GX95_20845  -2      17       -0.111487  hypothetical protein
GX95_20850  -2      14       0.094600   VOC family protein
GX95_20855  -2      13       0.211918   aminoalkylphosphonic acid N-acetyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_20825  BCTERIALGSPF  32     0.005    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 32.1 bits (73), Expect = 0.005
Identities = 39/163 (23%), Positives = 60/163 (36%), Gaps = 13/163 (7%)

Query: 80 CVFILVGAAAQYFILTYGIIIDRSMIANMMDTTPAETFALM-TPQMVLTLG---LSGVLA 135
CV +V A +L+ + +M P T LM V T G L +LA
Sbjct: 177 CVLTVVAIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLA 236

Query: 136 AVIAFWVKIRPATPRLRSGLYRLASVLISILLVILVAAFFYKDYASLFRNNKQLIKALSP 195
+AF V +R R+ L LI + L A + + + L + L++A+
Sbjct: 237 GFMAFRVMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRI 296

Query: 196 SNSIVASWSWYSHQRLANLPLVRIGEDAHRN--------PLML 230
S V S + H+ VR G H+ P+M
Sbjct: 297 S-GDVMSNDYARHRLSLATDAVREGVSLHKALEQTALFPPMMR 338


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_20830  HTHFIS        92     8e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 92.2 bits (229), Expect = 8e-24
Identities = 46/144 (31%), Positives = 69/144 (47%), Gaps = 1/144 (0%)

Query: 2 KILIVEDDTLLLQGLILAAQTEGYACDGVSTARAAEHSLESGHYSLMVLDLGLPDEDGLH 61
IL+ +DD + L A GY S A + +G L+V D+ +PDE+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 FLTRIRQKKYTLPVLILTARDTLNDRISGLDVGADDYLVKPFALEELHARI-RALLRRHN 120
L RI++ + LPVL+++A++T I + GA DYL KPF L EL I RAL
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 121 NQGESELTVGNLTLNIGRHQAWRD 144
+ E + +GR A ++
Sbjct: 125 RPSKLEDDSQDGMPLVGRSAAMQE 148


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_20835  PF06580       36     2e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.0 bits (83), Expect = 2e-04
Identities = 38/182 (20%), Positives = 78/182 (42%), Gaps = 34/182 (18%)

Query: 184 ARLDQMMDSVSQLLQLARVGQSFSSGNYQEVKLLEDV-ILPSYDELNTM-LETR-QQTLL 240
+ +M+ S+S+L++ S N ++V L +++ ++ SY +L ++ E R Q
Sbjct: 191 TKAREMLTSLSELMR-----YSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQ 245

Query: 241 LPESAADVVVRGDATLLRMLLRNLVENAHRY----SPEGTHITIHISADPDAI-MAVEDE 295
+ + DV V ML++ LVEN ++ P+G I + + D + + VE+
Sbjct: 246 INPAIMDVQV------PPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENT 299

Query: 296 GPGIDESKCGKLSEAFVRMDSRYGGIGLGLSIV-SRITQLHQGQFFLQNRTERTGTRAWV 354
G + + G GL V R+ L+ + ++ ++ A V
Sbjct: 300 GSLA--------------LKNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345

Query: 355 ML 356
++
Sbjct: 346 LI 347


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_20840  TCRTETA       43     2e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 42.5 bits (100), Expect = 2e-06
Identities = 53/284 (18%), Positives = 104/284 (36%), Gaps = 40/284 (14%)

Query: 85 FFGMLGDKYGRQKILAITIVIMSISTFCIGLIPSYATIGIWAPILLLLCKMAQGFSVGGE 144
G L D++GR+ +L +++ ++ + P +W +L + ++ G + G
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPF-----LW---VLYIGRIVAGIT-GAT 112

Query: 145 YTGASIFVAEYSPDRKR----GFMGSWLDFGSIAGFVLGAGVVVLISTIVGEENFLEWGW 200
A ++A+ + +R GFM + FG +AG VLG G++ S
Sbjct: 113 GAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLG-GLMGGFSP------------ 159

Query: 201 RIPFFIALPLGIIGLYLRHALEETPAFQQHVDKLEQGDREGLQDGPKVSFKEIATKHWRS 260
PFF A L + L K E+ P SF+ +
Sbjct: 160 HAPFFAAAALNGLNFLTGCFLLPESH------KGERRPLRREALNPLASFRWARGMTVVA 213

Query: 261 LLSCIGLVIATNVTYYMLLTYMPSYLSHNLHYS-EDHGVLIIIAIMIGMLFVQPVMGLLS 319
L + ++ + + + H+ G+ + ++ L + G ++
Sbjct: 214 ALMAVFFIM--QLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVA 271

Query: 320 DRFGRRPFVIMGSIA-LFALAIPAFILINSNVIGLIFAGLLMLA 362
R G R +++G IA + AF + F +++LA
Sbjct: 272 ARLGERRALMLGMIADGTGYILLAFA----TRGWMAFPIMVLLA 311



Score = 38.7 bits (90), Expect = 5e-05
Identities = 37/164 (22%), Positives = 73/164 (44%), Gaps = 16/164 (9%)

Query: 286 LSHNLHYSEDHGVLI-IIAIMIGMLFVQPVMGLLSDRFGRRPFVIMGSIALFALAIPAFI 344
L H+ + +G+L+ + A+M PV+G LSDRFGRRP ++ ++L A+ I
Sbjct: 35 LVHSNDVTAHYGILLALYALM--QFACAPVLGALSDRFGRRPVLL---VSLAGAAVDYAI 89

Query: 345 LINSNVIGLIFAGLLMLAVILNCFTGVMASTLPAMFPTHIR---YSALAAAFNISVLIAG 401
+ + + +++ G ++A I V + + + R + ++A F ++AG
Sbjct: 90 MATAPFLWVLYIG-RIVAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFG-MVAG 147

Query: 402 LTPTLAAWLVESSQDLMMPAYYLMVIAVIGLITGI-SMKETANR 444
P L + S P + + + +TG + E+
Sbjct: 148 --PVLGGLMGGFS--PHAPFFAAAALNGLNFLTGCFLLPESHKG 187


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_20855  SACTRNSFRASE  33     2e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 33.0 bits (75), Expect = 2e-04
Identities = 21/86 (24%), Positives = 33/86 (38%), Gaps = 9/86 (10%)

Query: 51 LALRNGEVVGMISLHMQFHLHHANWIG--EIQELVVLPQMRGQKVGSQLLAWAEEEARQA 108
L +G I + +NW G I+++ V R + VG+ LL A E A++
Sbjct: 69 LYYLENNCIGRIKIR-------SNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKEN 121

Query: 109 GAELTELSTNIKRRDAHRFYVREGYK 134
L T A FY + +
Sbjct: 122 HFCGLMLETQDINISACHFYAKHHFI 147


101  GX95_21815  GX95_21850  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
GX95_21815  -2      12       -0.153545  MFS transporter
GX95_21820  -2      12       -1.341087  CDP-diacylglycerol diphosphatase
GX95_21825  -1      14       -0.448357  sulfate transporter subunit
GX95_21830  0       14       0.415206   6-phosphofructokinase
GX95_21835  0       15       0.839830   divalent metal cation transporter FieF
GX95_21840  0       14       1.453815   stress adaptor protein CpxP
GX95_21845  2       16       1.935647   DNA-binding response regulator
GX95_21850  0       16       1.911141   two-component system sensor histidine kinase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_21820  TCRTETB       29     0.040    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 29.1 bits (65), Expect = 0.040
Identities = 12/58 (20%), Positives = 24/58 (41%), Gaps = 1/58 (1%)

Query: 16 MAINVVIIAMQLLLAYFYTDIYGLSAADVGVLFVVVRMIDAII-DPAMGVLTDKLNTR 72
I + ++ Y D++ LS A++G + + + II G+L D+
Sbjct: 266 GIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPL 323


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_21840  ABC2TRNSPORT  28     0.034    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 28.4 bits (63), Expect = 0.034
Identities = 32/140 (22%), Positives = 58/140 (41%), Gaps = 27/140 (19%)

Query: 10 SRAAIAATAMASALLLIKIFAWWYTGSVSILAALVD-SLVDIAASLTNLLVVRYSLQPAD 68
++AA+A + + W S+L AL +L +A + ++V +L P+
Sbjct: 123 TKAALAGAGIGVVAAALGYTQWL-----SLLYALPVIALTGLAFASLGMVVT--ALAPSY 175

Query: 69 DEHTFGHGKAESLAALAQSMFISGSAL--------------FLFLTSIQNLIKPTPMNDP 114
D F ++L + +F+SG+ FL L+ +LI+P + P
Sbjct: 176 DYFIF----YQTLV-ITPILFLSGAVFPVDQLPIVFQTAARFLPLSHSIDLIRPIMLGHP 230

Query: 115 GVGIGVTVIALICTIILVTF 134
V + V AL I++ F
Sbjct: 231 VVDVCQHVGALCIYIVIPFF 250


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_21850  HTHFIS        94     2e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 94.1 bits (234), Expect = 2e-24
Identities = 36/128 (28%), Positives = 64/128 (50%), Gaps = 2/128 (1%)

Query: 3 KILLVDDDRELTSLLKELLEMEGFNVLVAHDGEQALELL-DDSIDLLLLDVMMPKKNGID 61
IL+ DDD + ++L + L G++V + + + DL++ DV+MP +N D
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 TLKALRQTH-QTPVIMLTARGSELDRVLGLELGADDYLPKPFNDRELVARIRAILRRSHW 120
L +++ PV++++A+ + + + E GA DYLPKPF+ EL+ I L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 121 SEQQQSSD 128
+ D
Sbjct: 125 RPSKLEDD 132


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_21855  PF06580       29     0.037    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 29.1 bits (65), Expect = 0.037
Identities = 19/108 (17%), Positives = 37/108 (34%), Gaps = 28/108 (25%)

Query: 354 LENIVRNALRY------SHTKIEVGFSVDKDGITITVDDDGPGVSPEDREQIFRPFYRTD 407
++ +V N +++ KI + + D +T+ V++ G +E
Sbjct: 260 VQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKE---------- 309

Query: 408 EARDRESGGTGLGLAIVESAMQQHRGWVKAD---DSPLGGLRLTLWLP 452
TG GL V +Q G +A G + + +P
Sbjct: 310 --------STGTGLQNVRERLQMLYG-TEAQIKLSEKQGKVNAMVLIP 348


102  GX95_22075  GX95_22120  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
GX95_22075  1       19       -2.438138  hypothetical protein
GX95_22080  0       19       -1.866144  MFS transporter
GX95_22085  1       20       0.222808   porin
GX95_22090  0       20       1.229636   hypothetical protein
GX95_22095  0       20       1.434232   GTP-binding protein TypA
GX95_22100  -2      15       1.491978   type I glutamate--ammonia ligase
GX95_22105  -2      13       0.788135   two-component system sensor histidine kinase
GX95_22110  -1      10       0.127588   nitrogen regulation protein NR(I)
GX95_22115  0       11       -0.964256  oxygen-independent coproporphyrinogen III
GX95_22120  -1      12       -1.826447  GTPase-activating protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_22075  TCRTETA       32     0.005    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 31.7 bits (72), Expect = 0.005
Identities = 31/161 (19%), Positives = 55/161 (34%), Gaps = 8/161 (4%)

Query: 186 AQLSYIFAATLFSLFGLLFMWLCYAGVKERYVEVKPVDKAQKPGLLQSFRAIAGNRPLFI 245
+ AA L L L +L K E +P+ + + L SFR G +
Sbjct: 159 PHAPFFAAAALNGLNFLTGCFLLPESHKG---ERRPLRR-EALNPLASFRWARGMTVVAA 214

Query: 246 LCIANLCTLGAFNVKLAIQVYYTQYVLN-DPILLSWM--GFFSMGCIFIGVFLMPGAVRR 302
L V A+ V + + + D + F + + + R
Sbjct: 215 LMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLA-QAMITGPVAAR 273

Query: 303 FGKKKVYIGGLLIWVAGDLLNYFFGGGSVSFVAFSCLAFFG 343
G+++ + G++ G +L F G ++F LA G
Sbjct: 274 LGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGG 314


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_22095  TCRTETOQM     178    1e-50    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 178 bits (454), Expect = 1e-50
Identities = 100/448 (22%), Positives = 170/448 (37%), Gaps = 87/448 (19%)

Query: 4 NLRNIAIIAHVDHGKTTLVDKLLQQSGTFDARAETQE--RVMDSNDLEKERGITILAKNT 61
+ NI ++AHVD GKTTL + LL SG + D+ LE++RGITI T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 62 AIKWNDYRINIVDTPGHADFGGEVERVMSMVDSVLLVVDAFDGPMPQTRFVTKKAFAHGL 121
+ +W + ++NI+DTPGH DF EV R +S++D +L++ A DG QTR + G+
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 122 KPIVVINKVDRPGARPDWVVDQVFD-------------LFVNLDATDEQLD--------- 159
I INK+D+ G V + + L+ N+ T+
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEG 181

Query: 160 --------------------------------FPIIYASALNGIAGLDHEDMAEDMTPLY 187
FP+ + SA N I G+D+ L
Sbjct: 182 NDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNI-GIDN---------LI 231

Query: 188 QAIIDHVPAPDVDLDGPLQMQISQLDYNNYVGVIGIGRIKRGKVKPNQQVTIIDSEGKTR 247
+ I + + L ++ +++Y+ + R+ G + V I + E
Sbjct: 232 EVITNKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI-- 289

Query: 248 NAKVGKVLTHLGLERIDSDIAEAGDIIAITGLG-ELN--ISDTICDPQNVEALPALSVDE 304
K+ ++ T + E D A +G+I+ + +LN + DT PQ +
Sbjct: 290 --KITEMYTSINGELCKIDKAYSGEIVILQNEFLKLNSVLGDTKLLPQR----ERIENPL 343

Query: 305 PTVSMFFCVNTSPFCGKEGKFVTSRQILDRLNKELVHNVALRVEETEDADAFRVSGRGEL 364
P + + + D L LR +S G++
Sbjct: 344 PLLQTTVEPSKPQQREMLLDALLEISDSDPL---------LRYYVDSATHEIILSFLGKV 394

Query: 365 HLSVLIENMRRE-GFELAVSRPKVIFRE 391
+ V ++ + E+ + P VI+ E
Sbjct: 395 QMEVTCALLQEKYHVEIEIKEPTVIYME 422



Score = 32.5 bits (74), Expect = 0.005
Identities = 13/75 (17%), Positives = 29/75 (38%), Gaps = 1/75 (1%)

Query: 398 EPYENVTLDVEEQHQGSVMQALGERKGDLKNMNPDGKGRVRLDYVIPSRGLIGFRSEFMT 457
EPY + + +++ + ++ + V L IP+R + +RS+
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKN-NEVILSGEIPARCIQEYRSDLTF 595

Query: 458 MTSGTGLLYSTFSHY 472
T+G + + Y
Sbjct: 596 FTNGRSVCLTELKGY 610


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_22105  PF06580       29     0.034    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 28.7 bits (64), Expect = 0.034
Identities = 33/189 (17%), Positives = 71/189 (37%), Gaps = 39/189 (20%)

Query: 171 IIEQADRLRNLVDRL-------LGPQHPGMHIT--ESIHKVAERVVALVSMELPDNVRLI 221
I+E + R ++ L L ++ + + V + + L S++ D ++
Sbjct: 186 ILEDPTKAREMLTSLSELMRYSLRYS-NARQVSLADELTVV-DSYLQLASIQFEDRLQFE 243

Query: 222 RDYDPSLPELPHDPEQIEQVLL-NIVRNALQALGPEGGEITLRTRTAFQLTLHGERYRLA 280
+P++ ++ P + Q L+ N +++ + L P+GG+I L+
Sbjct: 244 NQINPAIMDVQV-PPMLVQTLVENGIKHGIAQL-PQGGKILLKGT------KDNGTVT-- 293

Query: 281 ARIDVEDNGPGIPPHLQDTLFYPMVSGREGGTGLGLSIARNLIDQHAGK---IEFTSWPG 337
++VE+ G + ++ TG GL R + G I+ + G
Sbjct: 294 --LEVENTGSLALKNTKE------------STGTGLQNVRERLQMLYGTEAQIKLSEKQG 339

Query: 338 HTEFSVYLP 346
V +P
Sbjct: 340 KVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_22110  HTHFIS        597    0.0      FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 597 bits (1540), Expect = 0.0
Identities = 204/478 (42%), Positives = 299/478 (62%), Gaps = 11/478 (2%)

Query: 1 MQRGIVWVVDDDSSIRWVLERALAGAGLTCTTFENGNEVLAALASKTPDVLLSDIRMPGM 60
M + V DDD++IR VL +AL+ AG N + +A+ D++++D+ MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 DGLALLKQIKQRHPMLPVIIMTAHSDLDAAVSAYQQGAFDYLPKPFDIDEAVALVERAIS 120
+ LL +IK+ P LPV++M+A + A+ A ++GA+DYLPKPFD+ E + ++ RA++
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 121 HYQEQQQPRNIEVNGPTTDMIGEAPAMQDVFRIIGRLSRSSISVLINGESGTGKELVAHA 180
+ + + ++G + AMQ+++R++ RL ++ ++++I GESGTGKELVA A
Sbjct: 121 EPKRRPSKLEDDSQDGM-PLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARA 179

Query: 181 LHRHSPRAKAPFIALNMAAIPKDLIESELFGHEKGAFTGANTIRQGRFEQADGGTLFLDE 240
LH + R PF+A+NMAAIP+DLIESELFGHEKGAFTGA T GRFEQA+GGTLFLDE
Sbjct: 180 LHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDE 239

Query: 241 IGDMPLDVQTRLLRVLADGQFYRVGGYAPVKVDVRIIAATHQNLERRVQEGKFREDLFHR 300
IGDMP+D QTRLLRVL G++ VGG P++ DVRI+AAT+++L++ + +G FREDL++R
Sbjct: 240 IGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYR 299

Query: 301 LNVIRIHLPPLRERREDIPRLARHFLQVAARELGVEAKLLHPETETALTRLAWPGNVRQL 360
LNV+ + LPPLR+R EDIP L RHF+Q A +E G++ K E + WPGNVR+L
Sbjct: 300 LNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKE-GLDVKRFDQEALELMKAHPWPGNVREL 358

Query: 361 ENTCRWLTVMAAGQEVLIQDLPGELFEASAPDSPSHLPPDSWATLLAQWADRALRS---- 416
EN R LT + + + + EL S + ++Q + +R
Sbjct: 359 ENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFAS 418

Query: 417 -----GHQNLLSEAQPELERTLLTTALRHTQGHKQEAARLLGWGRNTLTRKLKELGME 469
L E+E L+ AL T+G++ +AA LLG RNTL +K++ELG+
Sbjct: 419 FGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
GX95_22120  SECA          28     0.018    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 28.3 bits (63), Expect = 0.018
Identities = 14/74 (18%), Positives = 27/74 (36%)

Query: 11 KAFGKQRRKTREELNQEARDRKRLKKHRGHAPGSRAAGGNSASGGGNQNQQKDPRIGSKT 70
K + + EE+ + + R+ + +SA+ Q + ++G
Sbjct: 824 STLSKVQVRMPEEVEELEQQRRMEAERLAQMQQLSHQDDDSAAAAALAAQTGERKVGRND 883

Query: 71 PVPLGVTEKVTQQH 84
P P G +K Q H
Sbjct: 884 PCPCGSGKKYKQCH 897



 
Contact Sachin Pundhir for Bugs/Comments.