PredictBias

identification of genomic and pathogenicity islands in prokaryotic genome
 
A) Input parameters
Genome                        Gambia94_24.gbk
Threshold dinucleotide bias   2
Threshold codon bias          4
Threshold %GC bias            3
E-value (RPSBlast)            0.05
Genome (non-pathogenic)
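
The E-value (RPSBlast) setting can be read as an upper bound on the virulence-factor hits that get reported. The following is a minimal Python sketch of that idea, assuming the cut-off is applied as a plain `<=` comparison; the page does not show the tool's actual filtering logic. `filter_hits` is a hypothetical helper, the first three tuples are hits listed further down this page, and the fourth is a made-up entry included only for contrast.

```python
# Hypothetical helper: keep only hits whose E-value is at or below the cut-off.
E_VALUE_CUTOFF = 0.05  # "E-value (RPSBlast)" from the input parameters


def filter_hits(hits, cutoff=E_VALUE_CUTOFF):
    """Return the (locus_tag, signature, e_value) tuples that pass the cut-off."""
    return [hit for hit in hits if hit[2] <= cutoff]


hits = [
    ("HPGAM_00390", "UREASE", 0.0),
    ("HPGAM_00425", "FLAGELLIN", 3e-05),
    ("HPGAM_00430", "CABNDNGRPT", 0.043),
    # Illustrative only: this entry does not appear in the results.
    ("EXAMPLE_0001", "MADE_UP_SIGNATURE", 0.2),
]

kept = filter_hits(hits)
for locus_tag, signature, e_value in kept:
    print(locus_tag, signature, e_value)
```

Under this assumption the three reported hits are kept and the made-up weak hit is dropped.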
 
C) Potential GIs and PAIs in CP002332
S.No   Start         End           Bias   Virulence   Insertion elements   Prediction
1      HPGAM_00365   HPGAM_00430   Y      Y           Y                    Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
HPGAM_00365   4       21       3.548416   Urease accessory protein UreH
HPGAM_00370   4       23       3.388819   Urease accessory protein UreG
HPGAM_00375   3       21       2.434430   urease accessory protein
HPGAM_00380   2       18       2.435309   urease accessory protein UreE
HPGAM_00385   2       21       2.353867   urea transporter
HPGAM_00390   0       18       2.354824   urease subunit beta
HPGAM_00395   -3      11       1.448014   urease subunit alpha
HPGAM_00400   1       13       2.073661   *lipoprotein signal peptidase
HPGAM_00405   2       14       2.673107   phosphoglucosamine mutase
HPGAM_00410   3       17       2.923095   30S ribosomal protein S20
HPGAM_00415   2       14       2.211708   peptide chain release factor 1
HPGAM_00420   3       15       1.650372   hypothetical protein
HPGAM_00425   3       16       1.681236   outer membrane protein (omp3)
HPGAM_00430   2       17       0.937726   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits           Score   E-value   Comments
HPGAM_00390   UREASE         1044    0.0       Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 1044 bits (2701), Expect = 0.0
Identities = 353/569 (62%), Positives = 442/569 (77%), Gaps = 4/569 (0%)

Query: 3 KISRKEYVSMYGPTTGDKVRLGDTDLIAEVEHDYTIYGEELKFGGGKTLREGMSQSN-NP 61
++SR Y +M+GPT GDKVRL DT+L EVE D+T +GEE+KFGGGK +R+GM QS
Sbjct: 4 RMSRAAYANMFGPTVGDKVRLADTELFIEVEKDFTTHGEEVKFGGGKVIRDGMGQSQVTR 63

Query: 62 SKEELDLIITNALIVDYTGIYKADIGIKDGKIAGIGKGGNKDMQDGVKNNLSVGPATEAL 121
+D +ITNALI+D+ GI KADIG+KDG+IA IGK GN DMQ GV + VGP TE +
Sbjct: 64 EGGAVDTVITNALILDHWGIVKADIGLKDGRIAAIGKAGNPDMQPGV--TIIVGPGTEVI 121

Query: 122 AGEGLIVTAGGIDTHIHFISPQQIPTAFASGVTTMIGGGTGPADGTNATTITPGRRNLKW 181
AGEG IVTAGG+D+HIHFI PQQI A SG+T M+GGGTGPA GT ATT TPG ++
Sbjct: 122 AGEGKIVTAGGMDSHIHFICPQQIEEALMSGLTCMLGGGTGPAHGTLATTCTPGPWHIAR 181

Query: 182 MLRAAEEYSMNLGFLAKGNTSNDASLADQIEAGAIGFKIHEDWGTTPSAINHALDVADKY 241
M+ AA+ + MNL F KGN S +L + + GA K+HEDWGTTP+AI+ L VAD+Y
Sbjct: 182 MIEAADAFPMNLAFAGKGNASLPGALVEMVLGGATSLKLHEDWGTTPAAIDCCLSVADEY 241

Query: 242 DVQVAIHTDTLNEAGCVEDTMAAIAGRTMHTFHTEGAGGGHAPDIIKVAGEHNILPASTN 301
DVQV IHTDTLNE+G VEDT+AAI GRT+H +HTEGAGGGHAPDII++ G+ N++P+STN
Sbjct: 242 DVQVMIHTDTLNESGFVEDTIAAIKGRTIHAYHTEGAGGGHAPDIIRICGQPNVIPSSTN 301

Query: 302 PTIPFTVNTEAEHMDMLMVCHHLDKSIKEDVQFADSRIRPQTIAAEDTLHDMGIFSITSS 361
PT P+TVNT AEH+DMLMVCHHL +I ED+ FA+SRIR +TIAAED LHD+G FSI SS
Sbjct: 302 PTRPYTVNTLAEHLDMLMVCHHLSPTIPEDIAFAESRIRKETIAAEDILHDIGAFSIISS 361

Query: 362 DSQAMGRVGEVITRTWQTADKNKKEFGRLKEEKGDNDNFRIKRYLSKYTINPAIAHGISE 421
DSQAMGRVGEV RTWQTADK K++ GRLKEE GDNDNFR+KRY++KYTINPAIAHG+S
Sbjct: 362 DSQAMGRVGEVAIRTWQTADKMKRQRGRLKEETGDNDNFRVKRYIAKYTINPAIAHGLSH 421

Query: 422 YVGSVEVGKVADLVLWSPAFFGVKPNMIIKGGFIALSQMGDANASIPTPQPVYYREMFAH 481
+GS+EVGK ADLVLW+PAFFGVKP+M++ GG IA + MGD NASIPTPQPV+YR MF
Sbjct: 422 EIGSLEVGKRADLVLWNPAFFGVKPDMVLLGGTIAAAPMGDPNASIPTPQPVHYRPMFGA 481

Query: 482 HGKAKYDANITFVSQAAYDKGIKEELGLERQVLPVKNCR-NITKKDMQFNDTTAHIEVNP 540
+G+++ ++++TFVSQA+ D G+ LG+ ++++ V+N R I K M N T HIEV+P
Sbjct: 482 YGRSRTNSSVTFVSQASLDAGLAGRLGVAKELVAVQNTRGGIGKASMIHNSLTPHIEVDP 541

Query: 541 ETYHVFVDGKEVTSKPANKVSLAQLFSIF 569
ETY V DG+ +T +PA + +AQ + +F
Sbjct: 542 ETYEVRADGELLTCEPATVLPMAQRYFLF 570
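
The percentages in an alignment summary line can be rechecked from the raw counts. The following is a minimal Python sketch, assuming the `Name = count/length (NN%)` layout seen in the summary lines on this page; `parse_summary` is a hypothetical helper. It recomputes each percentage with integer truncation, which matches the figures shown here (for example 442/569 is about 77.7% and is displayed as 77%). That is an observation from these examples, not a documented rule.

```python
import re

# Summary line copied from the UREASE alignment above.
SUMMARY = "Identities = 353/569 (62%), Positives = 442/569 (77%), Gaps = 4/569 (0%)"

# Matches one field, e.g. "Identities = 353/569 (62%)".
FIELD = re.compile(r"(\w+) = (\d+)/(\d+) \((\d+)%\)")


def parse_summary(line):
    """Parse a summary line into {name: counts and percentages}."""
    stats = {}
    for name, count, length, shown in FIELD.findall(line):
        count, length = int(count), int(length)
        stats[name] = {
            "count": count,
            "length": length,
            "shown_percent": int(shown),
            # Integer truncation (floor), not rounding.
            "floor_percent": count * 100 // length,
        }
    return stats


stats = parse_summary(SUMMARY)
for name, s in stats.items():
    print(f"{name}: {s['count']}/{s['length']} -> {s['floor_percent']}% "
          f"(page shows {s['shown_percent']}%)")
```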


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits           Score   E-value   Comments
HPGAM_00425   FLAGELLIN      40      3e-05     Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 39.6 bits (92), Expect = 3e-05
Identities = 36/285 (12%), Positives = 77/285 (27%), Gaps = 16/285 (5%)

Query: 66 NTASSDSQEVTTLENTATTDSQTATTDQTYTKSTDTTVADAAKQVETDNTAVQSADTALQ 125
+ + + + T+ + ++ D + V + V TD TA D
Sbjct: 170 DGFNVNGPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKV-- 227

Query: 126 SAVTQVENDAKATNFDEKTFESDQQAEQTAEANLQKAENQLTNDQNALETALKDQTPSTP 185
N T+ E D + A +A+ + E D T
Sbjct: 228 --YVNAANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTF 285

Query: 186 PTKETPPAKKDETSGTPSSSGGTGGDKHTASSGSPTPSTPPTPTPSTSGGSTITSQLTKD 245
D ++ G A T + +T+ S
Sbjct: 286 TIDTK--TGNDGNGKVSTTINGEKVTLTVADI---------TAGAANVDAATLQSSKNVY 334

Query: 246 TTMVNNLKSVSVSAMNTTLSGVTQLSQQTAAISNLLSGNSNLGSVITNAQGLSSAFNALE 305
T++VN + N + + + ++ N + ++ A +
Sbjct: 335 TSVVNGQFTFDDKTKNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMF 394

Query: 306 SAQNTLKGYLDSSSATIGQLTNGTNAVVGALDKAINQVDMALADL 350
+ + T + ++D A+++VD + L
Sbjct: 395 IDKTASGVSTLINEDAAAAKK-STANPLASIDSALSKVDAVRSSL 438


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits           Score   E-value   Comments
HPGAM_00430   CABNDNGRPT     30      0.043     NodO calcium binding signature.
		>CABNDNGRPT#NodO calcium binding signature.

Length = 479

Score = 29.6 bits (66), Expect = 0.043
Identities = 22/148 (14%), Positives = 44/148 (29%), Gaps = 15/148 (10%)

Query: 437 NNMDNTHANDSKD---QGGNALINPNNATNDDHNDDHMDTNTTDTSNANDTPTDDKDAGG 493
+++ ++N +D ++ + + D + ++ N D GG
Sbjct: 268 DSVYGFNSNTDRDFYTATDSSKALIFSVWDAGGTDTFDFSGYSNNQRINLNEGSFSDVGG 327

Query: 494 NNTGDMNNTDTGNTDTGNTDTGNTDDMSNMNNGND----DTGNANDDMGNSND-----MG 544
N + N G+ +D+ N+ ++ GN G D G
Sbjct: 328 ---LKGNVSIAHGVTIENAIGGSGNDILVGNSADNILQGGAGNDVLYGGAGADTLYGGAG 384

Query: 545 DDMNNANDMNDDMGNSNDDMGDMGDMND 572
D D + D + D D
Sbjct: 385 RDTFVYGSGQDSTVAAYDWIADFQKGID 412


2      HPGAM_00550   HPGAM_00595   Y      Y           N                    Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
HPGAM_00550   1       13       3.203749   hypothetical protein
HPGAM_00555   0       12       3.685651   Methyl-accepting chemotaxis protein tlpB;
HPGAM_00560   -1      12       3.932727   2',3'-cyclic-nucleotide 2'-phosphodiesterase
HPGAM_00565   -2      12       4.380657   S-ribosylhomocysteinase
HPGAM_00570   -2      13       4.067780   cystathionine gamma-synthase/cystathionine
HPGAM_00575   -1      14       2.757148   cysteine synthase B
HPGAM_00590   1       11       0.722234   molecular chaperone DnaK
HPGAM_00595   2       11       1.521472   heat shock protein GrpE
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits           Score   E-value   Comments
HPGAM_00565   LUXSPROTEIN    226     2e-79     Bacterial autoinducer-2 (AI-2) production protein Lu...
		>LUXSPROTEIN#Bacterial autoinducer-2 (AI-2) production protein LuxS signature.

Length = 171

Score = 226 bits (578), Expect = 2e-79
Identities = 58/145 (40%), Positives = 92/145 (63%), Gaps = 7/145 (4%)

Query: 5 VESFNLDHTKVKAPYVRVADRKKGVNGDLIVKYDVRFKQPNQDHMDMPSLHSLEHLVAEI 64
++SF +DHT++ AP VRVA + GD I +D+RF PN+D + +H+LEHL A
Sbjct: 3 LDSFTVDHTRMNAPAVRVAKTMQTPKGDTITVFDLRFTAPNKDILSEKGIHTLEHLYAGF 62

Query: 65 IRNHAN----YVVDWSPMGCQTGFYLTVLNHDNYTEILEVLEKTMQDVLKAK---EVPAS 117
+RNH N ++D SPMGC+TGFY++++ + ++ + M+DVLK + ++P
Sbjct: 63 MRNHLNGDSVEIIDISPMGCRTGFYMSLIGTPSEQQVADAWIAAMEDVLKVENQNKIPEL 122

Query: 118 NEKQCGWAANHTLEGAQDLARAFLD 142
NE QCG AA H+L+ A+ +A+ L+
Sbjct: 123 NEYQCGTAAMHSLDEAKQIAKNILE 147


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits           Score   E-value   Comments
HPGAM_00590   SHAPEPROTEIN   149     3e-42     Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 149 bits (378), Expect = 3e-42
Identities = 77/384 (20%), Positives = 141/384 (36%), Gaps = 86/384 (22%)

Query: 5 IGIDLGTTNSAMAVYEG----NEAKIIA-NKEGKNTTPSIVAFTDKGEILVGESAKRQAV 59
+ IDLGT N+ + V NE ++A ++ + S+ A VG AK+
Sbjct: 13 LSIDLGTANTLIYVKGQGIVLNEPSVVAIRQDRAGSPKSVAA--------VGHDAKQMLG 64

Query: 60 TNPEKTIYSIKRIMGLMFNEDKAKEAEKRLPYKIVDRNGACAIEISGKIYTPQEISAKIL 119
P I +I+ + ++G A + +++ +
Sbjct: 65 RTPGN-IAAIRPM-----------------------KDGVIA-----DFFVTEKMLQHFI 95

Query: 120 MKLKEDAESYLGESVTEAVITVPAYFNDSQRKATKEAGTIAGLNVLRIINEPTSAALAYG 179
++ ++ ++ VP +R+A +E+ AG + +I EP +AA+ G
Sbjct: 96 KQVHSNS---FMRPSPRVLVCVPVGATQVERRAIRESAQGAGAREVFLIEEPMAAAIGAG 152

Query: 180 LDKKESEKIMVYDLGGGTFDVTVLETGDNVVEVLATGGDAFLGGDDFDNRVIDFLAAEFK 239
L E+ MV D+GGGT +V V+ V +GGD FD +I+++ +
Sbjct: 153 LPVSEATGSMVVDIGGGTTEVAVISLNGVV-----YSSSVRIGGDRFDEAIINYVRRNYG 207

Query: 240 NETGIEIKNDVMALQRLKEAAENAKKELSSAM----ETEINLPFITADATGPKHLVKKLT 295
+ G + AE K E+ SA EI + P+ +
Sbjct: 208 SLIG-------------EATAERIKHEIGSAYPGDEVREIEVRGRNLAEGVPRGFTLN-S 253

Query: 296 RAKFESLTEDL----------VEETISKIESVIKDAGLTKNEISEVVMVGGSTRIPKVQE 345
E+L E L +E+ ++ S I + G +V+ GG + +
Sbjct: 254 NEILEALQEPLTGIVSAVMVALEQCPPELASDISERG--------MVLTGGGALLRNLDR 305

Query: 346 RVKAFINKELNKSVNPDEVVAVGA 369
+ + + +P VA G
Sbjct: 306 LLMEETGIPVVVAEDPLTCVARGG 329


3      HPGAM_00975   HPGAM_01275   Y      Y           N                    Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
HPGAM_00975   0       10       3.269011   fumarate reductase iron-sulfur subunit
HPGAM_00980   -1      11       2.817129   fumarate reductase flavoprotein subunit
HPGAM_00985   -2      12       0.564119   fumarate reductase cytochrome b-556 subunit
HPGAM_00990   0       14       -0.594785  triosephosphate isomerase
HPGAM_00995   1       15       -0.105739  enoyl-(acyl carrier protein) reductase
HPGAM_01000   5       20       -0.633394  UDP-3-O-[3-hydroxymyristoyl] glucosamine
HPGAM_01005   10      29       -1.288828  hypothetical protein
HPGAM_01010   11      31       -1.941252  hypothetical protein
HPGAM_01015   13      32       -1.922356  hypothetical protein
HPGAM_01020   10      30       -1.266024  hypothetical protein
HPGAM_01025   8       30       -0.943428  hypothetical protein
HPGAM_01030   6       28       -2.069117  hypothetical protein
HPGAM_01035   5       25       -3.842916  hypothetical protein
HPGAM_01040   6       25       -2.954459  hypothetical protein
HPGAM_01045   6       25       -3.183809  hypothetical protein
HPGAM_01050   6       26       -1.669357  hypothetical protein
HPGAM_01055   7       27       -1.414999  hypothetical protein
HPGAM_01060   8       27       -0.689228  hypothetical protein
HPGAM_01065   7       26       0.918961   phage-related CUP0950-like protein
HPGAM_01070   7       26       1.396384   hypothetical protein
HPGAM_01075   7       24       0.580653   mosaic CUP1551/CUP0957-like protein
HPGAM_01080   7       22       0.051517   hypothetical protein
HPGAM_01085   6       21       -0.059991  hypothetical protein
HPGAM_01090   6       21       -0.746265  hypothetical protein
HPGAM_01095   7       21       -3.531728  hypothetical protein
HPGAM_01100   2       18       -1.101820  hypothetical protein
HPGAM_01105   1       18       0.534128   hypothetical protein
HPGAM_01110   1       18       1.023772   hypothetical protein
HPGAM_01115   0       20       0.878789   hypothetical protein
HPGAM_01120   -2      19       2.307663   hypothetical protein
HPGAM_01125   -2      17       3.198747   S-adenosylmethionine synthetase
HPGAM_01130   -2      18       2.283356   mulitfunctional nucleoside diphosphate kinase
HPGAM_01135   -3      18       1.448014   hypothetical protein
HPGAM_01140   -3      17       1.376657   50S ribosomal protein L32
HPGAM_01145   -1      13       -3.838616  putative phosphate acyltransferase
HPGAM_01150   0       12       -3.552634  3-oxoacyl-(acyl carrier protein) synthase III
HPGAM_01155   0       13       -4.528864  hypothetical protein
HPGAM_01160   0       13       -4.550331  hypothetical protein
HPGAM_01165   0       11       -3.560710  hypothetical protein
HPGAM_01170   0       11       -3.780192  hypothetical protein
HPGAM_01175   -2      13       -0.262007  hypothetical protein
HPGAM_01180   -2      12       -1.261883  putative lipopolysaccharide biosynthesis
HPGAM_01185   -2      13       0.077541   putative lipopolysaccharide biosynthesis
HPGAM_01190   -1      9        0.599886   hypothetical protein
HPGAM_01195   -2      10       1.272930   hypothetical protein
HPGAM_01200   -1      11       1.867767   heat shock protein 90
HPGAM_01205   -1      11       2.806412   hypothetical protein
HPGAM_01210   -1      12       3.083869   succinyl-diaminopimelate desuccinylase
HPGAM_01215   -1      10       2.721081   tRNA uridine 5-carboxymethylaminomethyl
HPGAM_01220   -1      11       3.385253   sodium-dependent transporter
HPGAM_01225   -1      12       3.605805   phosphatidate cytidylyltransferase
HPGAM_01230   -1      11       2.835498   1-deoxy-D-xylulose 5-phosphate reductoisomerase
HPGAM_01245   -2      11       2.992894   hypothetical protein
HPGAM_01250   -1      13       2.549266   hypothetical protein
HPGAM_01255   -1      13       2.542614   cysteine desulfurase
HPGAM_01260   -2      14       1.977170   nifU-like protein
HPGAM_01265   -1      17       1.080161   hypothetical protein
HPGAM_01270   0       15       2.344107   DNA repair protein RadA
HPGAM_01275   3       15       2.042347   bifunctional methionine sulfoxide reductase A/B
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits           Score   E-value   Comments
HPGAM_00995   DHBDHDRGNASE   58      4e-12     2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 58.1 bits (140), Expect = 4e-12
Identities = 61/263 (23%), Positives = 110/263 (41%), Gaps = 29/263 (11%)

Query: 4 LKGKKGLIVGVANNKSIAYGIAQSCFNQGATL-AFTYLNESLEKRVRPIAQELNSSYVYE 62
++GK I G A + I +A++ +QGA + A Y E LEK V + E + +
Sbjct: 6 IEGKIAFITGAA--QGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 63 LDVSKEEHFKSLYNNIKQDLGSLDFIVHSVAF--------APKEALEGSLLETSKSAFNT 114
DV + I++++G +D +V+ E E + S FN
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 115 AMEISVYSLIELTNTLKPLLNNGASVLTLSYLGSTKYMAHYNVMGLAKAALESAVRYLAV 174
+ +S Y + + ++ + +N A V S MA Y +KAA + L +
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTS-------MAAY---ASSKAAAVMFTKCLGL 173

Query: 175 DLGKHHIRVNALSAGPIRT-----LASSGIADFRMILKWNE---INAPLRKNVSLEEVGN 226
+L +++IR N +S G T L + ++I E PL+K ++ +
Sbjct: 174 ELAEYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIAD 233

Query: 227 AGMYLLSSLSNGVSGEVHFVDAG 249
A ++L+S + ++ VD G
Sbjct: 234 AVLFLVSGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits           Score   E-value   Comments
HPGAM_01030   RTXTOXIND      32      0.003     Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.7 bits (72), Expect = 0.003
Identities = 29/152 (19%), Positives = 55/152 (36%), Gaps = 7/152 (4%)

Query: 31 SMQAALQSEQLALNEQAQGLQSEQLRAKMQIDFLGMQANLQSAKADTLNKLIQCQ-AMLK 89
+L EQ + Q Q Q E K + + L + A + + + ++ + +
Sbjct: 185 LRLTSLIKEQFS-TWQNQKYQKELNLDKKRAERLTVLARINRYENLS--RVEKSRLDDFS 241

Query: 90 SLKDNAMINRANAFVSLLQ-VQANAANTITFHNFETAFKIIAQIGSEYDQITS-YNRNVS 147
SL I + + V+A + E I EY +T + +
Sbjct: 242 SLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEIL 301

Query: 148 VKEKEQTNELKTLLNELGKELEKLNQQSEVNS 179
K ++ T+ + L EL K E+ Q S + +
Sbjct: 302 DKLRQTTDNIGLLTLELAKNEERQ-QASVIRA 332


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits           Score   E-value   Comments
HPGAM_01065   BONTOXILYSIN   29      0.047     Bontoxilysin signature.
		>BONTOXILYSIN#Bontoxilysin signature.

Length = 1196

Score = 29.5 bits (66), Expect = 0.047
Identities = 9/53 (16%), Positives = 22/53 (41%)

Query: 303 YFDPEYLKKVFTHELGEMNIYIFVDNALSLSQNADNRAIVIVGVENYNESVRY 355
+ D + +K F ++ ++ I L L + + +I + N ++Y
Sbjct: 810 FLDIQSIKNFFNSQVEQVMKEILSPYQLLLFASKGPNSNIIEDISGKNTLIQY 862


4      HPGAM_01660   HPGAM_01810   Y      Y           N                    Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
HPGAM_01660   1       15       3.779517   50S ribosomal protein L21
HPGAM_01665   1       15       3.856593   50S ribosomal protein L27
HPGAM_01670   1       15       3.761267   dipeptide ABC transporter periplasmic
HPGAM_01675   1       14       4.012775   dipeptide permease protein
HPGAM_01680   0       13       3.287255   dipeptide permease protein
HPGAM_01685   -3      13       2.915681   peptide ABC transporter ATP-binding protein
HPGAM_01690   -3      14       2.711720   dipeptide ABC transporter
HPGAM_01695   -2      13       2.156919   GTPase CgtA
HPGAM_01700   -1      14       1.775659   hypothetical protein
HPGAM_01705   0       18       2.301086   hypothetical protein
HPGAM_01710   1       19       2.857205   glutamate-1-semialdehyde aminotransferase
HPGAM_01715   3       19       2.005859   hypothetical protein
HPGAM_01720   3       16       1.759925   hypothetical protein
HPGAM_01725   2       15       0.901951   hypothetical protein
HPGAM_01730   1       15       0.032376   polysaccharide deacetylase
HPGAM_01735   0       16       -2.350518  hypothetical protein
HPGAM_01740   0       18       -3.076979  hypothetical protein
HPGAM_01745   0       18       -3.181836  nitrite extrusion protein (narK)
HPGAM_01750   0       20       -3.485419  putative ABC transporter permease
HPGAM_01755   -1      18       -2.361967  ABC transporter, permease
HPGAM_01760   -2      14       -1.779699  ABC transporter, ATP-binding protein
HPGAM_01765   3       14       -1.246680  hypothetical protein
HPGAM_01770   2       14       -1.235813  putative heme iron utilization protein
HPGAM_01775   1       14       -1.517361  arginyl-tRNA synthetase
HPGAM_01780   3       14       -1.158169  Sec-independent protein translocase protein
HPGAM_01785   1       14       -1.444293  guanylate kinase
HPGAM_01790   1       14       -1.631072  poly E-rich protein
HPGAM_01795   -1      14       -2.180801  nuclease NucT
HPGAM_01800   1       14       -2.165435  putative Outer membrane protein
HPGAM_01805   2       16       -2.208952  flagellar basal body L-ring protein
HPGAM_01810   2       14       -1.879088  CMP-N-acetylneuraminic acid synthetase
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits           Score   E-value   Comments
HPGAM_01745   TCRTETB        32      0.005     Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 31.8 bits (72), Expect = 0.005
Identities = 33/177 (18%), Positives = 71/177 (40%), Gaps = 1/177 (0%)

Query: 53 GSFLIQFLSPLMSLESIAKISFGLIALSFLVCYFDSIPFFWLWIWRFIAGVASSALMILV 112
G+ + LS + ++ + + ++ + F L + RFI G ++A LV
Sbjct: 65 GTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALV 124

Query: 113 APLSLPYVKEHKKALVGGLIFSAVGIGSVFSGFVLPWISSYNIKWAWIFLGGSCLIAFIL 172
+ Y+ + + GLI S V +G + I+ Y I W+++ L I +
Sbjct: 125 MVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHY-IHWSYLLLIPMITIITVP 183

Query: 173 SLVGLKTRSLRKKSVKKEESAFKIPFHLWLLLISCALNAIGFLPHTLFWVDYLIRHL 229
L+ L + +R K + + + ++ +I FL ++ ++H+
Sbjct: 184 FLMKLLKKEVRIKGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLSFLIFVKHI 240


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits           Score   E-value   Comments
HPGAM_01785   PF05272        29      0.011     Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.011
Identities = 9/18 (50%), Positives = 11/18 (61%)

Query: 8 LILSGPSGAGKSTLTKYL 25
++L G G GKSTL L
Sbjct: 599 VVLEGTGGIGKSTLINTL 616


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits           Score   E-value   Comments
HPGAM_01790   IGASERPTASE    67      9e-14     IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 67.4 bits (164), Expect = 9e-14
Identities = 43/267 (16%), Positives = 90/267 (33%), Gaps = 21/267 (7%)

Query: 140 ESLGDLEALAKEEPNNEEQL--LPTLNEQEGETLKEETQEEIKKEEVKEMQEEIKEKEKQ 197
+ +++A P+N E++ + E E K+ + +++ E+
Sbjct: 998 TTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQD 1057

Query: 198 EVAEKPQDEEKPKDDETQGSVETPKDKEVSKELETQEQVETPKEEKQEQEPIKEQEPIKE 257
Q+ E K+ ++ T ++ ET+E T +E E ++E K
Sbjct: 1058 ATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVE---KEEKAKV 1114

Query: 258 ETQEIKEEKQEKTQDSPSAQELEAMQELVKEIQENSNDQENKKETQETQETTETPQDIET 317
ET++ +E + +Q SP ++ E +Q + +EN K+ +T T +T Q +
Sbjct: 1115 ETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKE 1174

Query: 318 QELEIPKEEETQEVAEKTQAQGLEKEEIAETPQEKEIQETQDETPQELEVQDEKLQEKET 377
+ + + + E P+ TQ E
Sbjct: 1175 TSSNVEQPVTESTTVNTGNS-------VVENPENTTPATTQPTVNSESS---------NK 1218

Query: 378 PKDENMQESTQNLQEKETQELETPQAQ 404
PK+ + + E +
Sbjct: 1219 PKNRHRRSVRSVPHNVEPATTSSNDRS 1245



Score = 63.2 bits (153), Expect = 2e-12
Identities = 62/324 (19%), Positives = 104/324 (32%), Gaps = 41/324 (12%)

Query: 193 EKEKQEVAEKPQDEEKPKDDETQGSVETPKDKEVSKELETQEQVETPKEEKQEQEPIKEQ 252
E EK+ + P + + ++E+++ E P + E + E
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAEN 1043

Query: 253 EPIKEETQEIKEEKQEKTQDSPSAQELEAMQELVKEIQENSNDQENKKETQETQETTETP 312
+ +T E E+ +T EA + Q N Q ET+ETQ T
Sbjct: 1044 SKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQS-GSETKETQTTET-- 1100

Query: 313 QDIETQELEIPKEEETQEVAEKTQAQGLEKEEIAETPQEKEIQETQDETPQELEVQDEKL 372
KE T E EK + E E+ E P+ + E + ++ Q E
Sbjct: 1101 -----------KETATVEKEEKAKV---ETEKTQEVPKVTSQVSPKQEQSETVQPQAEPA 1146

Query: 373 QEKETPKDENMQESTQNLQEKETQELETPQAQDETPQEDHYESIEDIPEPVMAKAMGEEL 432
+E + + + Q + +T Q ET + +PV
Sbjct: 1147 RENDP------TVNIKEPQSQTNTTADTEQPAKETSSN--------VEQPVTESTTVNTG 1192

Query: 433 PFLNEAVAKTPNNENATETPKENTTETS-----KNENDTETPQEKEESDKTSSPLELRLN 487
+V + P N T +E+S ++ + E TSS +
Sbjct: 1193 N----SVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVA 1248

Query: 488 LQDLLKSLNQESFKSLLENKTLSI 511
L DL S N + S K +
Sbjct: 1249 LCDLT-STNTNAVLSDARAKAQFV 1271



Score = 38.9 bits (90), Expect = 5e-05
Identities = 35/209 (16%), Positives = 62/209 (29%), Gaps = 15/209 (7%)

Query: 109 QKKLGSNASELEPSQNLDPTQEVLETNWDELESLGDLEALAKEEPNNEEQLLPTLNEQEG 168
+ + E N+ + E E+ KE E+ E++
Sbjct: 1060 ETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEK-------EEKA 1112

Query: 169 ETLKEETQEEIKKE-EVKEMQEEIKEKEKQEVAEKPQD-----EEKPKDDETQGSVETPK 222
+ E+TQE K +V QE+ + + Q + D +E T E P
Sbjct: 1113 KVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPA 1172

Query: 223 DKEVSKELETQEQVETPKEEKQEQEPIKEQEPIKEETQEIKEEKQEKTQDSPSAQELEAM 282
KE S +E T + TQ + + + + ++
Sbjct: 1173 -KETSSNVEQPVTESTT-VNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSV 1230

Query: 283 QELVKEIQENSNDQENKKETQETQETTET 311
V+ +SND+ T T
Sbjct: 1231 PHNVEPATTSSNDRSTVALCDLTSTNTNA 1259


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits           Score   E-value   Comments
HPGAM_01805   FLGLRINGFLGH   195     1e-64     Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 195 bits (496), Expect = 1e-64
Identities = 52/172 (30%), Positives = 84/172 (48%), Gaps = 18/172 (10%)

Query: 56 GERPLFADRRAMKPNDLITIIVSEKASANYSSS----KDYKSASGGNSTPPRLTYNGLDE 111
G +PLF DRR D +TI++ E SA+ SSS +D K+ G ++ P L GL
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYL--QGLFG 118

Query: 112 RKKKEAEYLDDKNNYNFTKSSNNTNFKGGGSQKKSEDLEIVLSARIIKVLENGNYFIYGN 171
+ + E S F G G S L+ + +VL NGN + G
Sbjct: 119 NARADVEA------------SGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGE 166

Query: 172 KEVLVDGEKQILKVSGVIRPYDIERNNTIQSKFLADAKIEYTNLGHLSDSNK 223
K++ ++ + ++ SGV+ P I +NT+ S +ADA+IEY G+++++
Sbjct: 167 KQIAINQGTEFIRFSGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQN 218


5      HPGAM_02435   HPGAM_02490   Y      Y           N                    Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
HPGAM_02435   3       13       0.987697   GTP-binding protein TypA
HPGAM_02440   6       17       0.227146   putative type II DNA modification enzyme
HPGAM_02445   5       18       1.093602   type II restriction endonuclease
HPGAM_02450   4       19       1.655156   type II DNA modification (methyltransferase)
HPGAM_02465   2       18       0.956447   catalase-like protein
HPGAM_02470   3       18       0.352776   hypothetical protein
HPGAM_02475   3       17       -1.113689  putative Outer membrane protein
HPGAM_02480   4       17       -1.229857  hypothetical protein
HPGAM_02485   4       17       -1.098518  hypothetical protein
HPGAM_02490   3       18       -1.545816  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits           Score   E-value   Comments
HPGAM_02435   TCRTETOQM      197     2e-57     Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 197 bits (503), Expect = 2e-57
Identities = 115/461 (24%), Positives = 190/461 (41%), Gaps = 67/461 (14%)

Query: 3 NIRNIAVIAHVDHGKTTLVDGLLSQSGTFSEREKVDE--RVMDSNDLEKERGITILSKNT 60
I NI V+AHVD GKTTL + LL SG +E VD+ D+ LE++RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 61 AIYYKDTKINIIDTPGHADFGGEVERVLKMVDGVLLLVDAQEGVMPQTKFVVKKALSFGI 120
+ +++TK+NIIDTPGH DF EV R L ++DG +LL+ A++GV QT+ + GI
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 121 CPIVVVNKIDKPAAEPDRVVDEVFDLF---------VAMGASDKQLDFPV-----VYAAA 166
I +NKID+ + V ++ + V + + +F
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEG 181

Query: 167 RDGYAMKSLDDE----------------------------KKNL--EPLFETILEHVPSP 196
D K + + K N+ + L E I S
Sbjct: 182 NDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSS 241

Query: 197 SGSVDEPLQMQIFTLDYDNYVGKIGIARVFNGSVKKNESVLLMKSDGSKENGRITKLIGF 256
+ L ++F ++Y ++ R+++G + +SV + KE +IT++
Sbjct: 242 THRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRI----SEKEKIKITEMYTS 297

Query: 257 LGLARTEIENAYAGDIVALAG--FNAMDV-GDSVVDPTNPMPLDPMHLEEPTMSVYFAVN 313
+ +I+ AY+G+IV L V GD+ + P +P P + +
Sbjct: 298 INGELCKIDKAYSGEIVILQNEFLKLNSVLGDTKLLPQRERIENP----LPLLQTTVEPS 353

Query: 314 DSPLAGLEGKHVTANKLKDRLLKEMQTNIAMKCEEMGEGKFKVSGRGELQITILAENLRR 373
+ + D LL+ + + +S G++Q+ + L+
Sbjct: 354 KPQQREMLLDALLEISDSDPLLRYYVDSAT--------HEIILSFLGKVQMEVTCALLQE 405

Query: 374 E-GFEFSISRPEVIIKEENGVKCEPFEHLVIDTPQDFSGAI 413
+ E I P VI E K E H+ + P F +I
Sbjct: 406 KYHVEIEIKEPTVIYMERPLKKAEYTIHIEVP-PNPFWASI 445



Score = 41.8 bits (98), Expect = 8e-06
Identities = 20/80 (25%), Positives = 30/80 (37%), Gaps = 1/80 (1%)

Query: 396 EPFEHLVIDTPQDFSGAIIERLGKRKAEMKAMNPMSDGYTRLEFEIPARGLIGYRSEFLT 455
EP+ I PQ++ K A + + + L EIPAR + YRS+
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCIQEYRSDLTF 595

Query: 456 DTKGEGVMNHSFLEFRPFSG 475
T G V + +G
Sbjct: 596 FTNGRSVCLTELKGYHVTTG 615


6      HPGAM_02640   HPGAM_02785   Y      Y           N                    Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
HPGAM_02640   2       15       -2.597510  GTPase Era
HPGAM_02645   4       15       -2.750816  hypothetical protein
HPGAM_02650   6       19       -3.216086  hypothetical protein
HPGAM_02655   8       19       -2.392377  putative cag island protein
HPGAM_02660   8       19       -2.406299  cag island protein
HPGAM_02665   9       17       -2.126757  cag island protein
HPGAM_02670   8       15       -2.014056  cag pathogenicity island protein (cag4)
HPGAM_02675   8       16       -2.331235  hypothetical protein
HPGAM_02680   8       18       -2.359016  CAG pathogenicity island protein 5
HPGAM_02685   8       19       -2.796365  cag island protein, DNA transfer protein
HPGAM_02690   8       20       -2.936105  cag pathogenicity island protein (cag6)
HPGAM_02695   9       20       -2.765319  cag pathogenicity island protein Y VirB10-like
HPGAM_02700   10      25       -4.312065  cag pathogenicity island protein X
HPGAM_02705   9       27       -4.565520  cag pathogenicity island protein W
HPGAM_02710   11      22       -5.479861  cag pathogenicity island protein V
HPGAM_02715   10      22       -5.333045  cag island protein
HPGAM_02720   11      21       -5.094664  CAG pathogenicity island protein T
HPGAM_02725   7       17       -4.347057  CAG pathogenicity island protein S
HPGAM_02730   6       18       -2.567535  hypothetical protein
HPGAM_02735   6       17       -2.371568  cag island protein
HPGAM_02740   7       19       -2.536391  cag pathogenicity island protein (cagN, cag17)
HPGAM_02745   5       19       -2.572095  cag pathogenicity island protein L
HPGAM_02750   5       19       -2.804284  cag pathogenicity island protein I
HPGAM_02755   5       19       -2.933302  cag island protein
HPGAM_02760   6       20       -4.027858  cag island protein
HPGAM_02765   7       21       -4.045131  cag island protein
HPGAM_02770   5       22       -2.432276  DNA transfer protein
HPGAM_02775   6       25       -1.944940  cag pathogenicity island protein D
HPGAM_02780   6       26       -1.474999  cag pathogenicity island protein (cagC, cag25)
HPGAM_02785   3       21       -1.251565  cag pathogenicity island protein B
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits           Score   E-value   Comments
HPGAM_02640   PF03944        31      0.005     delta endotoxin
		>PF03944#delta endotoxin

Length = 633

Score = 31.2 bits (70), Expect = 0.005
Identities = 25/94 (26%), Positives = 48/94 (51%), Gaps = 3/94 (3%)

Query: 68 LHHQEKLLNQCMLSQALKAMGDAELRVFLASVHDDLKGYEEFLSLCQKPHILALSKIDTA 127
L E+ LNQ + + + A +AEL A+V + + + FL+ + L+++
Sbjct: 94 LRETERFLNQRLNTDTV-ARVNAELTGLQANVEEFNRQVDNFLNPNRNAVPLSITSSVNT 152

Query: 128 TRKQVLQKLQEYQKYSSQFLDLVPLSAKKSQNLN 161
++ L +L ++Q Q L L+PL A+ + NL+
Sbjct: 153 MQQLFLNRLPQFQMQGYQLL-LLPLFAQAA-NLH 184


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits           Score   E-value   Comments
HPGAM_02660   ANTHRAXTOXNA   29      0.011     Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 29.3 bits (65), Expect = 0.011
Identities = 9/30 (30%), Positives = 18/30 (60%)

Query: 104 IDLAKQNERKKDLEKEKKELLNKTEKQKIK 133
++ ++ + D+++ K NKTEK+K K
Sbjct: 31 VNAMNEHYTESDIKRNHKTEKNKTEKEKFK 60


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits           Score   E-value   Comments
HPGAM_02695   IGASERPTASE    45      5e-06     IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 44.7 bits (105), Expect = 5e-06
Identities = 38/231 (16%), Positives = 84/231 (36%), Gaps = 5/231 (2%)

Query: 595 TPEAKKLLEEEAKESVKAYLDCVSQAKTEAEKQECEKLLTPEAKKKLEEAKKSIRVYLDC 654
P ++ + + ++KT + ++ T + ++ +EAK +++
Sbjct: 1023 APVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQT 1082

Query: 655 VSKAKNEAEKKECEKLLTPE-AKKLLEEEAKESVKAYLDCVSQAKTEAEKQECEKLLTPE 713
A++ +E KE + T E A EE+AK + + + KQE + + P+
Sbjct: 1083 NEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQ 1142

Query: 714 AKKKLEEAKKSIRVYLDCVSKAKNEAEKKECEKLLTPEAKKLLEEEAKESVKAYLDCVSQ 773
A+ E + S+ A+ ++ K + ++ + E + + +
Sbjct: 1143 AEPAREND--PTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPE 1200

Query: 774 AKTEAEKQECEKLLTPEAKKKLEEAKKSIRVYLDCVSQAKTEAEKQECEKL 824
T A Q + K ++S+R V A T + + L
Sbjct: 1201 NTTPATTQPTVNSESSNKPK--NRHRRSVRSVPHNVEPATTSSNDRSTVAL 1249



Score = 37.4 bits (86), Expect = 7e-04
Identities = 43/241 (17%), Positives = 91/241 (37%), Gaps = 8/241 (3%)

Query: 849 KAKNEAERKECEKLLTPEARKLLEEAKESVKAYKDCVSRARNEKEKKECEKLLTPEARKL 908
+ NE + E + P A E E+V S+ + E+ E T + R++
Sbjct: 1011 PSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATET--TAQNREV 1068

Query: 909 LEESKKSVKAYLDC--VSQAKNEAERKECEKLLTPEARKLLEEAKESVKAYKDCVSRARN 966
+E+K +VKA V+Q+ +E + + + + E+AK + ++
Sbjct: 1069 AKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQ 1128

Query: 967 EKEKKECEKLLTPEAKKKLEQQVLDCLKNAK----TEAEKKRCVKDLPKDLQKKVLAKES 1022
K+E + + P+A+ E +K + T A+ ++ K+ ++++ V +
Sbjct: 1129 VSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTT 1188

Query: 1023 VRVYLDCVSKAKNEAERKECEKLLTPEARKLLEEAKESVKAYKDCVSRARNEKEKKECEK 1082
V V +N + + + K + SV++ V A +
Sbjct: 1189 VNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVA 1248

Query: 1083 L 1083
L
Sbjct: 1249 L 1249



Score = 37.4 bits (86), Expect = 7e-04
Identities = 40/221 (18%), Positives = 84/221 (38%), Gaps = 6/221 (2%)

Query: 994 KNAKTEAEKKRCVKDLP-KDLQKKVLAKESVRVYLDCVSKAKNEAERKECEKLLTPEARK 1052
+ + E+ V + P ++ + V + ++K + ++ T + R+
Sbjct: 1008 PSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNRE 1067

Query: 1053 LLEEAKESVKAYKDCVSRARNEKEKKECEKLLTPEARKLLEQEVKKSVKAYKDCVSR-AR 1111
+ +EAK +VKA A++ E KE + T E + ++E K V +
Sbjct: 1068 VAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTS 1127

Query: 1112 NEKEKKECEKLLTPEARKLLENQALDCLKNAK----TEAEKKRCVKDLPKDLQKKVLAKE 1167
K+E + + P+A EN +K + T A+ ++ K+ ++++ V
Sbjct: 1128 QVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTEST 1187

Query: 1168 SVRVYLDCVSRARNEKEKKECEKLLTPEARKLLEEAKESVR 1208
+V V N + + + K + SVR
Sbjct: 1188 TVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVR 1228



Score = 35.8 bits (82), Expect = 0.002
Identities = 36/237 (15%), Positives = 80/237 (33%), Gaps = 23/237 (9%)

Query: 1277 EESKKSVKAYLDCVSKAKNEAERKECEKLLTPEARKLLEEAKESVKAYKDCLSQARNETE 1336
S+ + + ++K + ++ T + R++ +EAK +VKA A++ +E
Sbjct: 1032 TPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSE 1091

Query: 1337 RRACEKLLTPEARKLLEQEVKKSVKAYLDCVSRARNEKEKQECEKYLTPEARKFLEKQRQ 1396
+ + T E + ++E +A+ E EK + +T + +Q
Sbjct: 1092 TKETQTTETKETATVEKEE-------------KAKVETEKTQEVPKVTSQV-----SPKQ 1133

Query: 1397 QKDKAIKDCLKNADPNDRAAIMKCLDGL-----SDEEKLKYLQEAREKAVLDCLKTARTD 1451
++ + ++ + A ND +K E+ K E+ V + +
Sbjct: 1134 EQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGN 1193

Query: 1452 EEKRKCQNLYSDLIQEIQNKKAQNKQNQLSKTERLHQASECLDNLDDPTDQEAIEQC 1508
+N Q N ++ NK + D+ + C
Sbjct: 1194 SVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALC 1250



Score = 35.4 bits (81), Expect = 0.003
Identities = 35/237 (14%), Positives = 84/237 (35%), Gaps = 7/237 (2%)

Query: 475 AKTDEERNECLKNIPQDLQKELLADMSVKAYKDCVSRARNEKEKKECEKLLTPEAKKKLE 534
A+ DE E +A+ S + K ++ E + + EAK ++
Sbjct: 1018 ARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVK 1077

Query: 535 --QQVLDCLKNAKTDEERKKCLKDLPKDLQSDILAKESLKAYKDCASQAKTEAEKKECEK 592
Q + ++ +E + ++ + AK + ++ + K+E +
Sbjct: 1078 ANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSE 1137

Query: 593 LLTPEAKKLLEEEAKESVKAYLDCVSQAKTEAEKQECEKLLTPEAKKKLEEAKKSIRVYL 652
+ P+A+ E + ++K SQ T A+ ++ K + ++ + E+
Sbjct: 1138 TVQPQAEPARENDPTVNIKEP---QSQTNTTADTEQPAKETSSNVEQPVTESTTVNT--G 1192

Query: 653 DCVSKAKNEAEKKECEKLLTPEAKKLLEEEAKESVKAYLDCVSQAKTEAEKQECEKL 709
+ V + + + E+ + + SV++ V A T + + L
Sbjct: 1193 NSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVAL 1249



Score = 33.9 bits (77), Expect = 0.009
Identities = 24/156 (15%), Positives = 63/156 (40%), Gaps = 4/156 (2%)

Query: 796 EEAKKSIRVYLDCVSQAKTEAEKQECEKLLTPEAKKLLEESKKSVKAYLDC--VSKAKNE 853
++ + V + ++KT + ++ T + +++ +E+K +VKA V+++ +E
Sbjct: 1032 TPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSE 1091

Query: 854 AERKECEKLLTPEARKLLEEAKESVKAYKDCVSRARNEKEKKECEKLLTPEARKLLEESK 913
+ + + + E+AK + ++ K+E + + P+A E
Sbjct: 1092 TKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPAREND- 1150

Query: 914 KSVKAYLDCVSQAKNEAERKECEKLLTPEARKLLEE 949
+ SQ A+ ++ K + + + E
Sbjct: 1151 -PTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTE 1185



Score = 33.5 bits (76), Expect = 0.010
Identities = 35/209 (16%), Positives = 77/209 (36%), Gaps = 16/209 (7%)

Query: 9 ETSKKAQQDSPQDLSNEEATEANHFEDLLKEESSDNHLDNPTETKTHFDEDKLEETQTQM 68
E SK Q+ + + ++ATE + +E+ N N + + +ETQT
Sbjct: 1042 ENSK--QESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTT- 1098

Query: 69 DSGGNETSESSNGSLADKLFKKARKLVDDKRPFTQQKSLDEEAQKLNEEDDQENNEHQEE 128
ET E++ + KA+ + + + S Q+ +E + +E
Sbjct: 1099 -----ETKETATVEKEE----KAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPAREN 1149

Query: 129 TQTDLIDGETSEKAQQDSPQDLSNEEATEANHFEDLLKEESSDNHLDNPTESSDNHLDNS 188
T I ++Q ++ D +++ E + E ++ N ++ E+ +N +
Sbjct: 1150 DPTVNIK---EPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPAT 1206

Query: 189 AE-TKTQETKTHFDEDKLEEITDDSNDQE 216
+ T E+ + ++ E
Sbjct: 1207 TQPTVNSESSNKPKNRHRRSVRSVPHNVE 1235


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_02700  TYPE4SSCAGX   864    0.0      Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 864 bits (2234), Expect = 0.0
Identities = 516/522 (98%), Positives = 517/522 (99%), Gaps = 1/522 (0%)

Query: 1 MEQAFFKKIVGCFCLGYLFLSSVIEAAP-DIKNFNRGRVKVVNKKIAYLGDEKPITIWTS 59
M QAFFKKIVGCFCLGYLFLSS IEA DIKNFNRGRVKVVNKKIAYLGDEKPITIWTS
Sbjct: 1 MGQAFFKKIVGCFCLGYLFLSSAIEAVALDIKNFNRGRVKVVNKKIAYLGDEKPITIWTS 60

Query: 60 LDNVTVIQLEKDETISYITTGFNKGWNIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR 119
LDNVTVIQLEKDETISYITTGFNKGW+IVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR
Sbjct: 61 LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR 120

Query: 120 DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANL 179
DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANL
Sbjct: 121 DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANL 180

Query: 180 ENLTNAMSNPQNLSNNKNLSELIKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA 239
ENLTNAMSNPQNLSNNKNLSELIKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA
Sbjct: 181 ENLTNAMSNPQNLSNNKNLSELIKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA 240

Query: 240 EEAVRQRAKDKISIKTDKSQKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD 299
EEAVRQRAKDKISIKTDKSQKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD
Sbjct: 241 EEAVRQRAKDKISIKTDKSQKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD 300

Query: 300 NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELIKQENLNTTAYINRVMMASNE 359
NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELIKQENLNTTAYINRVMMASNE
Sbjct: 301 NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELIKQENLNTTAYINRVMMASNE 360

Query: 360 QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF 419
QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF
Sbjct: 361 QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF 420

Query: 420 DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTNSGLRWYRVNEIAEKFKLIK 479
DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTNSGLRWYRVNEIAEKFKLIK
Sbjct: 421 DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTNSGLRWYRVNEIAEKFKLIK 480

Query: 480 DKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVRDK 521
DKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVRDK
Sbjct: 481 DKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVRDK 522


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_02710  PF04335       117    3e-34    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 117 bits (294), Expect = 3e-34
Identities = 43/205 (20%), Positives = 73/205 (35%), Gaps = 10/205 (4%)

Query: 27 KLNKTNRTFKRAFYL---SMALNVAAVTSIVMMMPLKKTDIFVYGIDRYTGEFKIVKRSD 83
KL R+ K A+ + + AL A V ++ + PLK + +V +DR TGE I +
Sbjct: 24 KLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNTGEASIAAKLH 83

Query: 84 A-RQIVNSEAVVDSATSKFVSLLFGYSKNSLRDRKDQLMQYCDVSFQTQAMRMFNENIRQ 142
I EAV + +V G+ + + D +M Q + R + + Q
Sbjct: 84 GDATITYDEAVRKYFLATYVRYREGWIAAAREEYFDAVMVMSARPEQDRWSRFYKTDNPQ 143

Query: 143 FVDKVRA-EAIISSNIQREKVKNSPLTRLTFFITIKITPDTMENYEYITKKQVTIYYDFA 201
+ A + I + +F +T T TI Y
Sbjct: 144 SPQNILANRTDVFVEI-KRVSFLGGNVAQVYFTKESVTGSNS----TKTDAVATIKYKVD 198

Query: 202 RGNSSQENLIINPFGFKVFDIQITD 226
S + + NP G++V +
Sbjct: 199 GTPSKEVDRFKNPLGYQVESYRADV 223


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_02740  TYPE4SSCAGX   32     0.003    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 32.1 bits (72), Expect = 0.003
Identities = 30/119 (25%), Positives = 56/119 (47%), Gaps = 16/119 (13%)

Query: 24 AINTALLPSEYKELVALGFKKIKTLYQRHDDKEITKEEKEFATNALREKLRNDRARAEQI 83
A+N AL+ +Y+E + K K + D KE+ +++K EK + + +A++
Sbjct: 112 AVNFALMTRDYQEFL----KTKKLIVDAPDPKELEEQKKAL------EKEKEAKEQAQKA 161

Query: 84 QKNIEAFEKKNNSSVQKKAAKHRGLQELNETNANPLNDNPNGNSSTETKSNKDDNFDEM 142
QK+ K +++A L+ L +NP N + N N S K +++ D+M
Sbjct: 162 QKD------KREKRKEERAKNRANLENLTNAMSNPQNLSNNKNLSELIKQQRENELDQM 214


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_02770  ACRIFLAVINRP  32     0.015    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 31.7 bits (72), Expect = 0.015
Identities = 20/88 (22%), Positives = 32/88 (36%), Gaps = 18/88 (20%)

Query: 19 EVQKRQFQKIEELKADMQKGINPFFKVLFDGGNRLFGFPETFIYSSI-------FILFVT 71
+ K K+ EL+ +G+ +D F+ SI F +
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMK--VLYPYD--------TTPFVQLSIHEVVKTLFEAIML 350

Query: 72 IVLSVILF-QAYEPVLIVAIVIVLVALG 98
+ L + LF Q LI I + +V LG
Sbjct: 351 VFLVMYLFLQNMRATLIPTIAVPVVLLG 378


7   HPGAM_03460  HPGAM_03575  Y        Y        Y  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPGAM_03460  -1      17       -4.017873  hypothetical protein
HPGAM_03465  -1      10       0.249816   hypothetical protein
HPGAM_03470  -1      10       0.512408   putative Outer membrane protein
HPGAM_03475  -1      10       0.296127   aspartate aminotransferase
HPGAM_03480  -1      10       -0.120383  phage integrase family site specific
HPGAM_03485  2       10       0.682294   methylated-DNA--protein-cysteine
HPGAM_03490  1       10       1.449954   hypothetical protein
HPGAM_03495  0       10       1.655638   putative lipopolysaccharide biosynthesis
HPGAM_03500  2       9        1.292716   ribonucleotide-diphosphate reductase subunit
HPGAM_03505  2       9        1.170731   hypothetical protein
HPGAM_03510  1       10       0.768698   hypothetical protein
HPGAM_03515  0       10       0.464865   bifunctional N-acetylglucosamine-1-phosphate
HPGAM_03520  2       11       0.861888   flagellar biosynthesis protein FliP
HPGAM_03525  2       11       1.999131   iron(III) dicitrate transport protein
HPGAM_03530  2       12       2.431221   ferrous iron transport protein B
HPGAM_03535  2       14       2.882209   hypothetical protein
HPGAM_03565  3       14       3.471941   hypothetical protein
HPGAM_03570  2       13       3.596681   N-methylhydantoinase
HPGAM_03575  0       12       4.193347   hydantoin utilization protein A
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_03510  PF07132       33     1e-04    Harpin protein (HrpN)
		>PF07132#Harpin protein (HrpN)

Length = 356

Score = 33.1 bits (75), Expect = 1e-04
Identities = 19/45 (42%), Positives = 31/45 (68%)

Query: 37 IGGGVGAGMGGAMGGMIGALGGPWGTVFGAGIGGGIGAYSGAEIG 81
+G +G G+GG +GG+ +LGG G + G G+GGG+G+ G+ +G
Sbjct: 61 MGSMMGGGLGGGLGGLGSSLGGLGGGLLGGGLGGGLGSSLGSGLG 105



Score = 31.6 bits (71), Expect = 4e-04
Identities = 18/49 (36%), Positives = 28/49 (57%)

Query: 34 GKLIGGGVGAGMGGAMGGMIGALGGPWGTVFGAGIGGGIGAYSGAEIGD 82
G ++GGG+G G+GG + G GG G G G+G +G+ G+ +G
Sbjct: 62 GSMMGGGLGGGLGGLGSSLGGLGGGLLGGGLGGGLGSSLGSGLGSALGG 110


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_03520  FLGBIOSNFLIP  274    2e-95    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 274 bits (701), Expect = 2e-95
Identities = 112/245 (45%), Positives = 162/245 (66%), Gaps = 2/245 (0%)

Query: 1 MRFFIFLILICPLICPLMSADSTLPSVNLSLNAPSDPKQLVTTLNVIALLTLLVLAPSLI 60
MR + + + L A + LP + S P + + + +T L P+++
Sbjct: 1 MRRLLSVAPVL-LWLITPLAFAQLPGIT-SQPLPGGGQSWSLPVQTLVFITSLTFIPAIL 58

Query: 61 LVMTSFTRLIVVFSFLRTALGTQQTPPTQILVSLSLILTFFIMEPSLKKAYDTGIKPYMD 120
L+MTSFTR+I+VF LR ALGT PP Q+L+ L+L LTFFIM P + K Y +P+ +
Sbjct: 59 LMMTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSE 118

Query: 121 KKISYTEAFEKSALPFKKFMLKNTREKDLALFFRIRNLPNPKTPDDVSLSVLIPAFMISE 180
+KIS EA EK A P ++FML+ TRE DL LF R+ N + P+ V + +L+PA++ SE
Sbjct: 119 EKISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSE 178

Query: 181 LKTAFQIGFLLYLPFLVIDMVISSILMAMGMMMLPPVMISLPFKILVFILVDGFNLLTEN 240
LKTAFQIGF +++PFL+ID+VI+S+LMA+GMMM+PP I+LPFK+++F+LVDG+ LL +
Sbjct: 179 LKTAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGS 238

Query: 241 LVASF 245
L SF
Sbjct: 239 LAQSF 243


8   HPGAM_03680  HPGAM_03720  Y        Y        N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPGAM_03680  2       13       -0.246348  ABC-type transport system, ATP binding protein
HPGAM_03685  1       13       -0.789363  hypothetical protein
HPGAM_03690  0       12       0.167832   DNA polymerase III subunits gamma and tau
HPGAM_03695  2       12       1.843076   hypothetical protein
HPGAM_03700  1       12       3.015561   hypothetical protein
HPGAM_03705  1       13       3.048629   hypothetical protein
HPGAM_03710  2       17       3.025352   hypothetical protein
HPGAM_03715  2       16       3.142178   putative Outer membrane protein
HPGAM_03720  1       14       3.178265   anaerobic C4-dicarboxylate transporter
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_03690  IGASERPTASE   35     6e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 35.4 bits (81), Expect = 6e-04
Identities = 24/94 (25%), Positives = 37/94 (39%), Gaps = 8/94 (8%)

Query: 487 LENKSAPEETKEVKDFKISSLREKILPKP-------TTETTAEIQEKEIKEKEVQENETK 539
N A + + +I+ + E +P P TTET AE ++E K E E +
Sbjct: 1000 PNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDAT 1059

Query: 540 ETKETQPKEAPTALQEFMANHSEL-IEEIKSEFE 572
ET + A A AN + + SE +
Sbjct: 1060 ETTAQNREVAKEAKSNVKANTQTNEVAQSGSETK 1093


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_03705  SECA          28     0.013    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 27.9 bits (62), Expect = 0.013
Identities = 12/43 (27%), Positives = 23/43 (53%), Gaps = 2/43 (4%)

Query: 71 RIARKNLSKMSEEDFKKMREEVRK--ELEEKTKGLSDEEIKAK 111
++ K ++ ++MR+ V +E + + LSDEE+K K
Sbjct: 4 KLLTKVFGSRNDRTLRRMRKVVNIINAMEPEMEKLSDEELKGK 46


9   HPGAM_04575  HPGAM_04700  Y        Y        Y  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPGAM_04575  4       19       -4.458588  hypothetical protein
HPGAM_04580  7       25       -7.050428  hypothetical protein
HPGAM_04585  7       25       -7.345636  hypothetical protein
HPGAM_04590  2       22       1.588477   hypothetical protein
HPGAM_04595  2       22       2.862980   hypothetical protein
HPGAM_04600  1       21       4.126986   hypothetical protein
HPGAM_04605  1       21       4.744434   hypothetical protein
HPGAM_04610  1       20       4.720448   hypothetical protein
HPGAM_04615  1       21       5.328573   outer membrane protein
HPGAM_04620  1       20       3.552293   **hypothetical protein
HPGAM_04625  0       17       2.690806   hydrogenase isoenzymes formation protein HypD
HPGAM_04630  0       15       1.889573   hydrogenase assembly chaperone
HPGAM_04635  1       14       2.096071   hydrogenase/urease nickel incorporation protein
HPGAM_04640  1       13       2.055296   hypothetical protein
HPGAM_04645  1       12       2.512939   hypothetical protein
HPGAM_04650  1       13       1.791125   acetate kinase A/propionate kinase 2
HPGAM_04655  1       12       1.160536   phosphotransacetylase
HPGAM_04660  2       15       0.589709   hypothetical protein
HPGAM_04665  0       13       1.598688   flagellar basal body rod modification protein
HPGAM_04670  0       15       2.282298   flagellar hook protein FlgE
HPGAM_04675  1       16       1.669589   putative restriction endonuclease
HPGAM_04680  1       19       2.613600   adenine specific DNA methyltransferase
HPGAM_04695  1       17       3.418164   hypothetical protein
HPGAM_04700  1       17       3.392983   Outer membrane porin and adhesin HopC; putative
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_04580  BINARYTOXINA  26     0.041    Clostridial binary toxin A signature.
		>BINARYTOXINA#Clostridial binary toxin A signature.

Length = 454

Score = 25.8 bits (56), Expect = 0.041
Identities = 19/73 (26%), Positives = 34/73 (46%), Gaps = 8/73 (10%)

Query: 19 LNQKIELEVFDLVVESLRNQIPLDKRFKDHALVGTYKGCRE-----CHIK-PDV--LLVY 70
+I LE F+ + E++++++ FKD +L G + H+K P +L Y
Sbjct: 151 NQNEISLEKFNELKETIQDKLFKQDGFKDVSLYEPGNGDEKPTPLLIHLKLPKNTGMLPY 210

Query: 71 RVKNNVLTLVRLG 83
N+V TL+
Sbjct: 211 INSNDVKTLIEQD 223


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_04650  ACETATEKNASE  478    e-171    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 478 bits (1233), Expect = e-171
Identities = 192/405 (47%), Positives = 271/405 (66%), Gaps = 9/405 (2%)

Query: 1 MEILVLNLGSSSIKFKLFDMKENKPLASGLAEKIGEEIGQLKIKSHLHHNDQELKEKLVI 60
M+ILV+N GSSS+K++L + K+ LA GLAE+IG L N +++K K +
Sbjct: 1 MKILVINCGSSSLKYQLIESKDGNVLAKGLAERIGINDSLLTHN----ANGEKIKIKKDM 56

Query: 61 KDHASGLLMIRENLT--KMGIIKDFNQIDAIGHRVVQGGDKFHAPVLVDEKVMREIGNLS 118
KDH + ++ + L G+IKD ++IDA+GHRVV GG+ F + VL+ + V++ I +
Sbjct: 57 KDHKDAIKLVLDALVNSDYGVIKDMSEIDAVGHRVVHGGEYFTSSVLITDDVLKAITDCI 116

Query: 119 ILAPLHNPANLAGIEFVQKAHPHIPQIAVFDTAFHASMPSYAYMYALPYGLYEKYQIRRY 178
LAPLHNPAN+ GI+ + P +P +AVFDTAFH +MP YAY+Y +PY Y KY+IR+Y
Sbjct: 117 ELAPLHNPANIEGIKACTQIMPDVPMVAVFDTAFHQTMPDYAYLYPIPYEYYTKYKIRKY 176

Query: 179 GFHGTSHHYVAKEAAKFLNIPYEEFNAISLHLGNGSSAAAIQNGKSVDTSMGLTPLEGLI 238
GFHGTSH YV++ AA+ LN P E I+ HLGNGSS AA++NGKS+DTSMG TPLEGL
Sbjct: 177 GFHGTSHKYVSQRAAEILNKPIESLKIITCHLGNGSSIAAVKNGKSIDTSMGFTPLEGLA 236

Query: 239 MGTRCGDIDPTVVEYIAQCANKSLEEVIKILNHESGLKGICG-DNDARNIEARK-EKGDK 296
MGTR G IDP+++ Y+ + N S EEV+ ILN +SG+ GI G +D R++E + GDK
Sbjct: 237 MGTRSGSIDPSIISYLMEKENISAEEVVNILNKKSGVYGISGISSDFRDLEDAAFKNGDK 296

Query: 297 QARLAFEMCTYRIKKYIGAYMVVLKKVDAIIFTGGLGENYSALRESVCEGLENLGIVLNK 356
+A+LA + YR+KK IG+Y + VD I+FT G+GEN +RE + +GLE LG L+K
Sbjct: 297 RAQLALNVFAYRVKKTIGSYAAAMGGVDVIVFTAGIGENGPEIREFILDGLEFLGFKLDK 356

Query: 357 TINNEPGSGLVDLSQPNTKIQILRIPTDEELEIALQAKEMVEKLK 401
N G +S ++K+ ++ +PT+EE IA +++VE LK
Sbjct: 357 EKNKVRGEE-AIISTADSKVNVMVVPTNEEYMIAKDTEKIVESLK 400


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_04660  IGASERPTASE   37     1e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 37.4 bits (86), Expect = 1e-04
Identities = 43/234 (18%), Positives = 74/234 (31%), Gaps = 13/234 (5%)

Query: 278 KRDKTLSKKNSEKTPIHAKTQTTAPSATPENAPKISLKTLPLMPLIGANPPNDNIPTPLE 337
KR++T+ N Q PS N + P+ P A TP E
Sbjct: 987 KRNQTVDTTNITTP---NNIQADVPSVPSNNEEIARVDEAPVPPPAPA--------TPSE 1035

Query: 338 KEEKTKEVSDNKEKAKETSSSAQSAQNTPASDKTSENKNIAPKETIKHFTQQLKQEIQEY 397
E E S + K E + + + E K+ T + Q E +E
Sbjct: 1036 TTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKET 1095

Query: 398 KPPMSKISMDLFPKELGKVEVTIQKVGKNLKVSVISHNNSLQTFLDNQQDLKNSLNTLGF 457
+ +K + + +E KVE + + V +T + + + T+
Sbjct: 1096 QTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNI 1155

Query: 458 EGVDLSFSQDSSKEQQAPKDQPKEPFKEQELTPLKENALKSYQENTDHENQETS 511
+ + + EQ A + E T N S EN ++ T+
Sbjct: 1156 KEPQSQTNTTADTEQPAKETSSNVEQPVTESTT--VNTGNSVVENPENTTPATT 1207



Score = 35.8 bits (82), Expect = 4e-04
Identities = 51/279 (18%), Positives = 96/279 (34%), Gaps = 45/279 (16%)

Query: 106 HPTAEHEAQEVHETDPKTPNETLSKNEKKPNEALSNAHQTNLPNKNPITPTNHANNAIKT 165
+P E Q V T+ TPN + P SN + ++ P+ P A + T
Sbjct: 982 NPEVEKRNQTVDTTNITTPNNIQADVPSVP----SNNEEIARVDEAPVPPPAPATPSETT 1037

Query: 166 PTTPTHSTKEPKTLKDIQTLSQKHDLNASNIQAATTPENKNPLNASDQLALKATQTPTNH 225
T +S +E KT++ N Q AT +N + K ++
Sbjct: 1038 ETVAENSKQESKTVE-------------KNEQDATETTAQN------REVAKEAKSNVKA 1078

Query: 226 TLAKNDAKNTANLSSVLQSLEKKEPQNKERANPQNSEKKTPPLKEALQMNAIKRDKTLSK 285
N+ + + + Q+ E KE E+ E ++ + + K
Sbjct: 1079 NTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVET--------------EKTQEVPK 1124

Query: 286 KNSEKTPIHAKTQTTAPSATP--ENAPKISLK------TLPLMPLIGANPPNDNIPTPLE 337
S+ +P +++T P A P EN P +++K A + N+ P+
Sbjct: 1125 VTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVT 1184

Query: 338 KEEKTKEVSDNKEKAKETSSSAQSAQNTPASDKTSENKN 376
+ + E + T+ + S +N++
Sbjct: 1185 ESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRH 1223


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_04670  FLGHOOKAP1    35     7e-04    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 35.3 bits (81), Expect = 7e-04
Identities = 12/33 (36%), Positives = 20/33 (60%)

Query: 2 NDTLLNAYSGIKTHQFGIDSLSNNIANVNTLGY 34
+ + NA SG+ Q +++ SNNI++ N GY
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGY 33



Score = 33.0 bits (75), Expect = 0.004
Identities = 10/48 (20%), Positives = 20/48 (41%)

Query: 557 IRHKYLETSNVNAGNALTNLILMQRGYSMNARAFGAGDDMIKEAISLK 604
+ ++ S VN NL Q+ Y NA+ + + I+++
Sbjct: 499 LSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546


10  HPGAM_05065  HPGAM_05245  Y        Y        N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPGAM_05065  3       17       -1.703824  cell division protein FtsA
HPGAM_05070  2       20       -4.184710  cell division protein FtsZ
HPGAM_05085  3       18       -5.811245  hypothetical protein
HPGAM_05090  3       18       -5.568189  hypothetical protein
HPGAM_05095  4       22       -6.649026  hypothetical protein
HPGAM_05100  4       21       -6.661686  DNA transfer protein
HPGAM_05105  3       19       -5.958640  topoisomerase I
HPGAM_05110  5       18       -5.938489  conjugal plasmid transfer system protein
HPGAM_05115  4       24       -7.282840  competence protein
HPGAM_05120  4       27       -8.844979  hypothetical protein
HPGAM_05125  5       27       -8.748064  hypothetical protein
HPGAM_05130  4       29       -9.895991  hypothetical protein
HPGAM_05135  3       26       -7.762414  hypothetical protein
HPGAM_05140  2       25       -7.350824  VirB11 type IV secretion ATPase
HPGAM_05145  1       23       -7.351072  hypothetical protein
HPGAM_05150  2       19       -6.362726  hypothetical protein
HPGAM_05155  2       18       -5.864945  hypothetical protein
HPGAM_05160  3       16       -4.214023  VirD4 coupling protein
HPGAM_05165  4       18       -3.755812  hypothetical protein
HPGAM_05170  2       16       -1.988368  hypothetical protein
HPGAM_05185  2       16       -1.721385  hypothetical protein
HPGAM_05190  4       19       -1.380730  hypothetical protein
HPGAM_05195  4       19       -1.355707  chromosome partitioning protein ParA
HPGAM_05210  3       19       -2.779183  hypothetical protein
HPGAM_05235  3       19       -2.429281  hypothetical protein
HPGAM_05240  5       21       -4.245509  hypothetical protein
HPGAM_05245  5       21       -4.071366  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_05065  SHAPEPROTEIN  41     1e-05    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 40.5 bits (95), Expect = 1e-05
Identities = 38/176 (21%), Positives = 66/176 (37%), Gaps = 12/176 (6%)

Query: 211 AASIATLSNDERELGVACVDMGGETCNLTIYSGNSIRYNKYLPVGSHHLTTDL------S 264
AA+I G VD+GG T + + S N + Y+ + +G + +
Sbjct: 146 AAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNGVVYSSSVRIGGDRFDEAIINYVRRN 205

Query: 265 HMLNTPFPYAEEVKIKYGDLSFESGAETPSQSVQIPTTGSDGNESHIVPLSEIQTIMRER 324
+ AE +K + G S G E V+ + +EI ++E
Sbjct: 206 YGSLIGEATAERIKHEIG--SAYPGDEVREIEVRGRNLAEGVPRGFTLNSNEILEALQEP 263

Query: 325 ALETFKIIHRSIQDSGLE---EHLGGGVVLTGGMALMKGIKELARTHFTNYPVRLA 377
+ +++ E + G+VLTGG AL++ + L T PV +A
Sbjct: 264 LTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNLDRLLM-EETGIPVVVA 318


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_05235  RTXTOXINA     34     0.003    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 33.8 bits (77), Expect = 0.003
Identities = 37/184 (20%), Positives = 59/184 (32%), Gaps = 28/184 (15%)

Query: 175 SSAYDNNPNSPSNNAINGKDGANGSNGYGVN----GNDGVNGNDGVNGNDGVNGSSGSNG 230
++ D S + +G DG + G N G+ G + G NG+D + G G++
Sbjct: 725 TTRADKFFGSKFTDIFHGADGDDLIEGNDGNDRLYGDKGNDTLSGGNGDDQLYGGDGNDK 784

Query: 231 VNESHSNNNAVGSGIDT--------------DGVLGVDGVNGSSSS---SGGS-----VG 268
+ NN G D G G D + GS + GG G
Sbjct: 785 LIGVAGNNYLNGGDGDDEFQVQGNSLAKNVLFGGKGNDKLYGSEGADLLDGGEGDDLLKG 844

Query: 269 GYENNFTNHGSTNNNTGGYDNFNNGSSSGGSLGNGGLFPIPFGNGDTNNSNNSTNTTSPT 328
GY N+ + S + D + G SL + + F + +
Sbjct: 845 GYGNDIYRYLSGYGHHIIDD--DGGKEDKLSLADIDFRDVAFKREGNDLIMYKGEGNVLS 902

Query: 329 NGSS 332
G
Sbjct: 903 IGHK 906


11  HPGAM_05670  HPGAM_05735  Y        Y        N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPGAM_05670  2       9        -0.735882  glucokinase
HPGAM_05675  3       11       -1.909957  zinc-dependent alcohol dehydrogenase
HPGAM_05680  2       14       -3.019551  putative lipopolysaccharide biosynthesis
HPGAM_05685  4       16       -1.951740  lipopolysaccharide biosynthesis protein
HPGAM_05690  5       16       -0.357567  hypothetical protein
HPGAM_05695  3       14       2.038838   hypothetical protein
HPGAM_05700  1       16       3.143589   putative Outer membrane protein
HPGAM_05705  0       15       3.231337   hypothetical protein
HPGAM_05710  -1      13       2.804075   pyruvate flavodoxin oxidoreductase subunit
HPGAM_05715  -1      12       2.363785   pyruvate flavodoxin oxidoreductase subunit
HPGAM_05720  -1      11       2.054746   pyruvate flavodoxin oxidoreductase subunit
HPGAM_05725  1       12       0.024271   pyruvate ferredoxin oxidoreductase, beta
HPGAM_05730  2       13       -0.475352  adenylosuccinate lyase
HPGAM_05735  2       17       -1.147260  putative Outer membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_05710  YERSSTKINASE  29     0.011    Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 28.9 bits (64), Expect = 0.011
Identities = 13/33 (39%), Positives = 22/33 (66%)

Query: 80 IENIFANEKEDTTYIITSYLNKEELFEKKPELK 112
+ N+ A+EK D ++++ L+ E FEK PE+K
Sbjct: 314 VGNLGASEKSDVFLVVSTLLHCIEGFEKNPEIK 346


12  HPGAM_05780  HPGAM_05805  Y        Y        N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPGAM_05780  3       19       -0.919403  type II DNA modification (methyltransferase)
HPGAM_05785  3       15       -0.991996  hypothetical protein
HPGAM_05790  3       12       -1.931296  hypothetical protein
HPGAM_05795  4       13       -1.689499  FKBP-type peptidyl-prolyl cis-trans isomerase
HPGAM_05800  4       14       -2.478398  hypothetical protein
HPGAM_05805  4       14       -2.137983  peptidoglycan-associated lipoprotein precursor
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_05780  PRPHPHLPASEC  31     0.005    Prokaryotic zinc-dependent phospholipase C signature.
		>PRPHPHLPASEC#Prokaryotic zinc-dependent phospholipase C signature.

Length = 398

Score = 31.1 bits (70), Expect = 0.005
Identities = 22/105 (20%), Positives = 37/105 (35%), Gaps = 8/105 (7%)

Query: 137 GYTTHYQILNSADFQLAQKRERLYIVGFRKDLKHPFNFPLGLANDYCFKDFLDADNEYYL 196
G TH I+ L + RK+L+ L + D+ D Y
Sbjct: 35 GTGTHAMIVTQGVSILENDLSKNEPESVRKNLEILKENMHELQLGSTYPDY---DKNAY- 90

Query: 197 DVSNATFQRYLRNPYNHNRVSLENILTLENAVLDTRQSDLRLYFN 241
+Q + +P N S +N L ++ DT +S +R +
Sbjct: 91 ----DLYQDHFWDPDTDNNFSKDNSWYLAYSIPDTGESQIRKFSA 131


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_05805  OMPADOMAIN    147    1e-45    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 147 bits (372), Expect = 1e-45
Identities = 48/169 (28%), Positives = 75/169 (44%), Gaps = 24/169 (14%)

Query: 22 KMDNKTVAGDVSAKTVQTAPV-TTEPAPEKEEPKQEPAPVVEEKPAIESGTIIASIYFDF 80
+ DN ++ VS + Q PAP PAP V+ K T+ + + F+F
Sbjct: 177 RPDNGMLSLGVSYRFGQGEAAPVVAPAPA-------PAPEVQTK----HFTLKSDVLFNF 225

Query: 81 DKYEIKESDQETLDEIVQKAKE---NHMQVLLEGNTDEFGSSEYNQALGVKRTLSVKNAL 137
+K +K Q LD++ + V++ G TD GS YNQ L +R SV + L
Sbjct: 226 NKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQSVVDYL 285

Query: 138 VIKGVEKDMIKTISFGETKPKCVQ-----KTR----ECYRENRRVDVKL 177
+ KG+ D I GE+ P K R +C +RRV++++
Sbjct: 286 ISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334


13  HPGAM_05860  HPGAM_05885  Y        Y        N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPGAM_05860  0       17       -4.175848  F0F1 ATP synthase subunit B'
HPGAM_05865  1       17       -3.592622  plasmid replication-partition related protein
HPGAM_05870  1       17       -3.931753  hypothetical protein
HPGAM_05875  2       17       -3.947819  biotin--protein ligase
HPGAM_05880  2       18       -4.618032  methionyl-tRNA formyltransferase
HPGAM_05885  2       19       -4.641417  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_05860  TRNSINTIMINR  28     0.013    Translocated intimin receptor (Tir) signature.
		>TRNSINTIMINR#Translocated intimin receptor (Tir) signature.

Length = 549

Score = 28.2 bits (62), Expect = 0.013
Identities = 13/43 (30%), Positives = 22/43 (51%)

Query: 59 EIGHQIETLLKEAAEKRREMLAEAIQKATESYDAVIKKKENEL 101
+I QI KEA E R+ E+ +A + Y+ +++ EL
Sbjct: 316 DIVEQIAQQAKEAGEVARQQAVESNAQAQQRYEDQHARRQEEL 358


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_05880  FERRIBNDNGPP  32     0.002    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 32.2 bits (73), Expect = 0.002
Identities = 12/33 (36%), Positives = 19/33 (57%)

Query: 72 EPEVQILKGLKPDFIVVVAYGKILPKEVLAIAP 104
EP +++L +KP F+V A P+ + IAP
Sbjct: 86 EPNLELLTEMKPSFMVWSAGYGPSPEMLARIAP 118


14  HPGAM_06355  HPGAM_06455  Y        Y        N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPGAM_06355  3       10       -0.339093  DNA polymerase III subunit delta'
HPGAM_06360  1       11       -0.501237  dihydropteroate synthase
HPGAM_06365  1       11       0.454107   hypothetical protein
HPGAM_06370  0       10       1.348384   hypothetical protein
HPGAM_06375  0       11       2.055857   putative membrane transport protein
HPGAM_06380  -2      10       1.974150   hypothetical protein
HPGAM_06385  -1      14       2.739274   hypothetical protein
HPGAM_06390  -3      10       3.389535   carbamoyl phosphate synthase small subunit
HPGAM_06395  -2      12       2.665782   formamidase
HPGAM_06400  -1      13       1.124218   hypothetical protein
HPGAM_06405  0       11       2.146521   hypothetical protein
HPGAM_06410  1       12       2.004445   Maf-like protein
HPGAM_06415  1       12       2.040607   alanyl-tRNA synthetase
HPGAM_06420  3       19       1.854834   hypothetical protein
HPGAM_06425  3       19       2.031348   hypothetical protein
HPGAM_06430  0       15       1.503381   outer membrane protein - adhesin
HPGAM_06435  2       14       -1.074130  hypothetical protein
HPGAM_06440  1       13       -0.631857  30S ribosomal protein S18
HPGAM_06445  2       13       -0.613072  single-stranded DNA-binding protein
HPGAM_06450  2       11       -0.781099  30S ribosomal protein S6
HPGAM_06455  2       11       -0.535794  DNA polymerase III subunit delta
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_06370  TYPE3IMSPROT  26     0.030    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 25.9 bits (57), Expect = 0.030
Identities = 8/20 (40%), Positives = 12/20 (60%)

Query: 53 VFFVFSYFYKELKMDKQKVK 72
F + + KELKM K ++K
Sbjct: 203 YAFEYYQYIKELKMSKDEIK 222


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_06405  adhesinmafb   30     0.005    Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 29.6 bits (66), Expect = 0.005
Identities = 16/50 (32%), Positives = 22/50 (44%), Gaps = 2/50 (4%)

Query: 32 MEEIENSDPNQNNPFITA--AMGIGGAAISIFFPDTKPIVDGVKPLAEKG 79
ME I NPFI+A A+GIG + K + + PL +G
Sbjct: 225 MEFINGVAAGALNPFISAGEALGIGDILYGTRYAIDKAAMRNIAPLPAEG 274


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_06420  PF05844       25     0.039    YopD protein
		>PF05844#YopD protein

Length = 295

Score = 24.6 bits (53), Expect = 0.039
Identities = 12/65 (18%), Positives = 28/65 (43%), Gaps = 1/65 (1%)

Query: 10 SVLKANNPHFDKIFEKHNQLDDDIKTAEQQNASDAEVSHMKKQKLKLKDEIHSMIIEYRE 69
L+A F+ + I++ Q + +V + Q ++E+++ I + +
Sbjct: 197 VALRAAGRAFESRNGALQVANTVIQSFVQMANASVQVRQGESQASAREEEVNATIGQ-SQ 255

Query: 70 KQKSD 74
KQK +
Sbjct: 256 KQKVE 260


15  HPGAM_07145  HPGAM_07185  Y        Y        N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPGAM_07145  -2      12       -3.042460  prephenate dehydrogenase
HPGAM_07150  -1      12       -3.650962  hypothetical protein
HPGAM_07155  -1      12       -3.844710  putative endonuclease
HPGAM_07160  1       13       -4.367217  adenine-specific DNA-methyltransferase
HPGAM_07165  -1      13       -3.405189  putative type III restriction enzyme R protein
HPGAM_07170  -3      14       -2.554841  biotin synthase
HPGAM_07175  -1      14       -3.591586  putative ribonuclease N
HPGAM_07180  0       13       -3.743840  hypothetical protein
HPGAM_07185  0       12       -3.184792  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_07145  SHIGARICIN    28     0.038    Ribosome inactivating protein family signature.
		>SHIGARICIN#Ribosome inactivating protein family signature.

Length = 289

Score = 27.9 bits (62), Expect = 0.038
Identities = 9/74 (12%), Positives = 19/74 (25%), Gaps = 5/74 (6%)

Query: 79 TPIKKSATIIDLGGAKAQILHNIPKSIRQNFIAAHPMCGTEFYGPKASVKGLYENALVIL 138
P + L GA + ++R+ + Y L
Sbjct: 18 APAVEGDVSFRLSGATSSSYGVFISNLRKALPYERKLYDIPLLRSTLPGSQRY-----AL 72

Query: 139 CDLEDSGTKQVELA 152
L + + + +A
Sbjct: 73 IHLTNYADETISVA 86


16  HPGAM_07275  HPGAM_07535  Y        Y        Y  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPGAM_07275  4       23       0.462507   hypothetical protein
HPGAM_07280  2       20       0.324931   hypothetical protein
HPGAM_07285  0       18       2.162262   hypothetical protein
HPGAM_07290  -1      15       2.010500   hypothetical protein
HPGAM_07295  -1      13       1.888245   hypothetical protein
HPGAM_07300  -2      12       1.675375   hypothetical protein
HPGAM_07305  -2      12       1.980570   16S ribosomal RNA methyltransferase KsgA/Dim1
HPGAM_07310  -1      12       0.957734   hypothetical protein
HPGAM_07315  2       16       -3.431506  arabinose-5-phosphate isomerase
HPGAM_07320  6       21       -5.820278  ribosomal RNA large subunit methyltransferase N
HPGAM_07325  10      25       -7.756287  hypothetical protein
HPGAM_07330  10      26       -7.903618  hypothetical protein
HPGAM_07335  10      26       -8.034291  type I R-M system specificity subunit
HPGAM_07340  11      27       -7.970062  relaxase
HPGAM_07345  9       29       -8.101005  integrase/recombinase (xerD)
HPGAM_07350  7       30       -7.081086  hypothetical protein
HPGAM_07355  6       29       -6.687075  hypothetical protein
HPGAM_07360  8       29       -6.215024  hypothetical protein
HPGAM_07365  8       29       -6.020334  hypothetical protein
HPGAM_07370  9       30       -5.681700  hypothetical protein
HPGAM_07375  10      31       -5.466013  type IV secretion system protein TrbL
HPGAM_07380  11      29       -5.855728  hypothetical protein
HPGAM_07385  13      29       -6.109952  hypothetical protein
HPGAM_07390  10      29       -6.605533  hypothetical protein
HPGAM_07395  10      30       -6.234827  hypothetical protein
HPGAM_07400  11      31       -6.235091  hypothetical protein
HPGAM_07405  10      30       -6.970221  hypothetical protein
HPGAM_07410  10      30       -7.223854  hypothetical protein
HPGAM_07415  10      32       -6.650459  hypothetical protein
HPGAM_07420  8       30       -5.874085  hypothetical protein
HPGAM_07425  8       28       -6.686317  DNA topoisomerase I
HPGAM_07430  6       25       -6.649666  hypothetical protein
HPGAM_07435  7       27       -6.243344  hypothetical protein
HPGAM_07440  7       27       -6.207232  hypothetical protein
HPGAM_07455  7       28       -4.778936  VirD4 coupling protein
HPGAM_07460  8       28       -4.575701  hypothetical protein
HPGAM_07465  8       27       -4.512868  VirB11 type IV secretion ATPase
HPGAM_07470  8       27       -4.822490  hypothetical protein
HPGAM_07475  8       27       -4.741723  hypothetical protein
HPGAM_07480  7       24       -5.246328  hypothetical protein
HPGAM_07485  6       23       -6.537612  VirB10 type IV secretion protein
HPGAM_07490  7       23       -7.383656  type IV secretion system protein VirB9
HPGAM_07495  7       24       -7.715528  VirB8 type IV secretion protein
HPGAM_07500  8       25       -7.954440  VirB7 type IV secretion protein
HPGAM_07505  8       26       -8.241920  VirB4 type IV secretion ATPase
HPGAM_07510  13      33       -9.166365  hypothetical protein
HPGAM_07515  8       31       -8.239749  hypothetical protein
HPGAM_07520  3       20       -6.086869  hypothetical protein
HPGAM_07525  1       16       -4.247718  hypothetical protein
HPGAM_07530  -1      16       -3.544467  hypothetical protein
HPGAM_07535  -1      15       -3.433946  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_07310  TYPE4SSCAGX   32     0.012    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 31.7 bits (71), Expect = 0.012
Identities = 20/76 (26%), Positives = 34/76 (44%)

Query: 4 NNQNNENHENSSENSKDHHEARAGAFERFTNRKKRFRENAQKNAESSNHETLSHHKKERH 63
N QN N++N SE K E ER + +++ + NA K E N + ++R
Sbjct: 189 NPQNLSNNKNLSELIKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQAEEAVRQRA 248

Query: 64 PNKKPNNHHKPKHAPQ 79
+K K + +P+
Sbjct: 249 KDKISIKTDKSQKSPE 264


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_07335  ACRIFLAVINRP  28     0.027    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 27.9 bits (62), Expect = 0.027
Identities = 12/59 (20%), Positives = 20/59 (33%), Gaps = 8/59 (13%)

Query: 5 NQNWKKVRLGDIAEIKRGVRITKNELDVFGKYPVVSGGVGFLGYTNNFNRYENTITIAQ 63
N + VRL D+A ++ G N + P G+ N + A+
Sbjct: 254 NSDGSVVRLKDVARVELGGE-NYNVIARINGKPAAGLGI-------KLATGANALDTAK 304


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
HPGAM_07385  IGASERPTASE  30     0.027    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.4 bits (68), Expect = 0.027
Identities = 20/128 (15%), Positives = 44/128 (34%), Gaps = 9/128 (7%)

Query: 547 NSTATQQENTKQNQAIEQNGTTQAKEPQSKQELKKTLHPDE-------PWLDYDPKAHKC 599
N ++ T I QA P ++ DE P +
Sbjct: 982 NPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVA 1041

Query: 600 LQERQKEEIQEKAQSNNSDEPWIEHGKRMQEKAKAHYQACLEREKAKELAKEQNNAQKEV 659
+Q+ + EK + + ++ + + ++AK++ +A + + + E Q
Sbjct: 1042 ENSKQESKTVEKNEQDATET--TAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTE 1099

Query: 660 KKEMPTID 667
KE T++
Sbjct: 1100 TKETATVE 1107


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
HPGAM_07390  ALARACEMASE  29     0.020    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 29.4 bits (66), Expect = 0.020
Identities = 8/61 (13%), Positives = 21/61 (34%), Gaps = 9/61 (14%)

Query: 106 KASIAYIRD--------YEMRYVKARDEQGNLIPLKDKEGNLKHYSNG-EVIYENEKVPQ 156
+ I ++ Y RY +++ ++ +G +H G V+ + +
Sbjct: 236 SSEIIGVQTLKAGERVGYGGRYTARDEQRIGIVAAGYADGYPRHAPTGTPVLVDGVRTMT 295

Query: 157 R 157

Sbjct: 296 V 296


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
HPGAM_07480  cloacin  37     2e-04    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 37.4 bits (86), Expect = 2e-04
Identities = 23/87 (26%), Positives = 32/87 (36%), Gaps = 3/87 (3%)

Query: 134 GKGFSGSGASGMGYASGYGDTSNNAGSNGTSANGVNGTSGNNGAKGENGSSGANGANGTS 193
G G G + G G++S + GS G GN G G +G G N ++
Sbjct: 26 GLGVGGGASDGSGWSSE--NNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGNLSA 83

Query: 194 GYQGVGSNPFPPIAGSGNGSSGSSNSG 220
V FP ++ G G S S
Sbjct: 84 VAAPVAFG-FPALSTPGAGGLAVSISA 109



Score = 36.2 bits (83), Expect = 5e-04
Identities = 32/101 (31%), Positives = 40/101 (39%), Gaps = 24/101 (23%)

Query: 140 SGASGMGYASGYGDTSNNAGSNGTSANGVNGTSGNNGAKGENGSSGANGANGTSGYQGVG 199
SG G G+ +G TS N +NG G G GA+ SG+
Sbjct: 2 SGGDGRGHNTGAHSTSGN----------INGGPTGLGVGG--------GASDGSGW-SSE 42

Query: 200 SNPFPPIAGSG---NGSSGSSNSGYTHFMNGGGGIGDMGGG 237
+NP+ +GSG G SG N G N GGG G G
Sbjct: 43 NNPWGGGSGSGIHWGGGSGHGNGG--GNGNSGGGSGTGGNL 81



Score = 34.3 bits (78), Expect = 0.002
Identities = 26/91 (28%), Positives = 34/91 (37%), Gaps = 7/91 (7%)

Query: 184 SGANGANGTSGYQGVGSNPFPPIAGSGNGSSGSSNSGYTHFMNGGGGIGDMGGGFIPFPY 243
SG +G +G N G G G S SG++ N GG G +
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHW----- 56

Query: 244 SPGLQNGSGANGINGTNGINGTSGANGSNSA 274
G +G G G NG +G +G N S A
Sbjct: 57 --GGGSGHGNGGGNGNSGGGSGTGGNLSAVA 85


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
HPGAM_07495  PF04335  88     2e-22    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 88.0 bits (218), Expect = 2e-22
Identities = 34/224 (15%), Positives = 74/224 (33%), Gaps = 29/224 (12%)

Query: 121 ESFKKDELDLSSVFEIQRKNTQIAYRLAIGGLIGIVALSIAIFIMMPLKENTPYFIDFAN 180
F++ ++ ++A+ +A A +A+ + PLK PY I
Sbjct: 12 AYFEEAASWERDKLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDR 71

Query: 181 SDKHFAVVQRADTRLDYS--EAFLRNLVGSYITARETINHIDDKIRLNETIREQSSEEVW 238
+ ++ + + EA + + +Y+ RE + + + S+
Sbjct: 72 NTGEASIAAKLHGDATITYDEAVRKYFLATYVRYREGWIAAAREEYF-DAVMVMSARPEQ 130

Query: 239 KILEQLVSSKG-----SIYSNSNMDREIKIINISIYKQGKQQNIAVADIVAKVFDKGYLI 293
+ + +I +N D ++I +S + VA+V+ +
Sbjct: 131 DRWSRFYKTDNPQSPQNILANRT-DVFVEIKRVSF----------LGGNVAQVYFTKESV 179

Query: 294 SEKRYRVSLIYHFKPLIQFDYSSMP-------KNPTGFIVDKYS 330
+ S I++ P KNP G+ V+ Y
Sbjct: 180 TGSN---STKTDAVATIKYKVDGTPSKEVDRFKNPLGYQVESYR 220


17  HPGAM_07715  HPGAM_07775  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPGAM_07715  3       12       0.789968   hypothetical protein
HPGAM_07720  2       10       0.353775   putative Outer membrane protein
HPGAM_07725  2       11       0.370396   branched-chain amino acid aminotransferase
HPGAM_07730  1       12       -0.548696  outer membrane protein
HPGAM_07735  1       12       -0.756741  DNA polymerase I
HPGAM_07740  -1      17       0.492582   type II restriction enzyme
HPGAM_07745  -1      17       0.765651   putative type II DNA modification enzyme
HPGAM_07750  3       21       1.510345   hypothetical protein
HPGAM_07755  2       15       1.136843   thymidylate kinase
HPGAM_07760  2       12       0.510161   phosphopantetheine adenylyltransferase
HPGAM_07765  2       12       0.486917   3-octaprenyl-4-hydroxybenzoate carboxy-lyase
HPGAM_07770  2       11       0.155331   hypothetical protein
HPGAM_07775  2       11       0.202261   flagellar basal body P-ring biosynthesis protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_07760  LPSBIOSNTHSS  225    7e-79    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 225 bits (575), Expect = 7e-79
Identities = 63/147 (42%), Positives = 93/147 (63%)

Query: 4 IGIYPGTFDPVTNGHIDIIHRSSELFEKLIVAVAHSSAKNPMFSLDERLKMMQLATKSFT 63
IYPG+FDP+T GH+DII R LF+++ VAV + K PMFS+ ERL+ + A
Sbjct: 2 NAIYPGSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAHLP 61

Query: 64 NVECVAFEGLLANLAKEYHCKVLVRGLRVVSDFEYELQMGYANKSLNHELETLYFMPTLQ 123
N + +FEGL N A++ ++RGLRV+SDFE ELQM NK+L +LET++ + +
Sbjct: 62 NAQVDSFEGLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDLETVFLTTSTE 121

Query: 124 NAFISSSIVRSIIAHKGDASHLVPEEI 150
+F+SSS+V+ + G+ H VP +
Sbjct: 122 YSFLSSSLVKEVARFGGNVEHFVPSHV 148


18  HPGAM_07930  HPGAM_08070  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPGAM_07930  2       13       2.693099   saccharopine dehydrogenase
HPGAM_07935  1       12       2.128570   ferrodoxin-like protein
HPGAM_07940  0       10       1.948403   putative glycerol-3-phosphate acyltransferase
HPGAM_07945  -1      11       2.183527   dihydroneopterin aldolase
HPGAM_07950  0       11       2.260899   hypothetical protein
HPGAM_07955  0       11       2.157570   iron-regulated outer membrane protein
HPGAM_07960  0       11       -2.201825  selenocysteine synthase
HPGAM_07965  0       12       -2.342929  transcription elongation factor NusA
HPGAM_07980  1       15       -4.382247  hypothetical protein
HPGAM_07985  1       12       -3.787407  hypothetical protein
HPGAM_07990  1       11       -3.629230  hypothetical protein
HPGAM_08015  1       10       -3.316996  type III restriction enzyme
HPGAM_08020  1       11       -3.213267  type III R-M system modification enzyme
HPGAM_08025  0       11       -2.799845  putative type III restriction enzyme M protein
HPGAM_08030  0       14       -1.110763  ATP-dependent DNA helicase RecG
HPGAM_08035  0       15       -0.711278  hypothetical protein
HPGAM_08040  -1      14       -0.715644  hypothetical protein
HPGAM_08045  0       12       0.007259   exodeoxyribonuclease III
HPGAM_08055  1       13       0.177825   *hypothetical protein
HPGAM_08060  3       17       0.170344   chromosomal replication initiation protein
HPGAM_08065  4       22       0.067608   hypothetical protein
HPGAM_08070  3       20       0.148182   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
HPGAM_08060  HTHFIS  35     4e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 35.2 bits (81), Expect = 4e-04
Identities = 9/51 (17%), Positives = 24/51 (47%), Gaps = 4/51 (7%)

Query: 127 TVYEIAKKVAQSDTPPYNPVLFYGGTGLGKTHILNAIGNHALEKHKKVVLV 177
+Y + ++ Q+D ++ G +G GK + A+ ++ ++ V +
Sbjct: 148 EIYRVLARLMQTDLT----LMITGESGTGKELVARALHDYGKRRNGPFVAI 194


19  HPGAM_00190  HPGAM_00225  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPGAM_00190  -3      13       0.781977   type IV secretion system protein VirB8
HPGAM_00195  -3      12       0.412144   DNA transformation compentancy
HPGAM_00200  -3      13       0.285125   DNA transformation compentancy
HPGAM_00205  -1      13       0.245868   mannose-1-phosphate guanyltransferase
HPGAM_00210  -2      9        1.027643   GDP-D-mannose dehydratase
HPGAM_00215  -1      11       0.793240   putative sugar nucleotide biosynthesis
HPGAM_00220  0       15       0.658675   hypothetical protein
HPGAM_00225  0       14       0.731778   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
HPGAM_00190  PF04335  133    1e-40    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 133 bits (336), Expect = 1e-40
Identities = 36/202 (17%), Positives = 71/202 (35%), Gaps = 4/202 (1%)

Query: 40 QSVFRLERNRLKIAYRLLGLMSFIALVLAIVLISILPLQKTEHHF--VDFLNQDKHYAII 97
+ K+A+ + G+ +A + + ++ PL+ E + VD + A
Sbjct: 22 RDKLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNTGEASIAAK 81

Query: 98 QRADKSISSNEALARSLIGAYVLNRESINRIDDKSRYELVRLQSSSKVWQRFEDLIKAQN 157
D +I+ +EA+ + + YV RE + ++ V + S+ R+ K N
Sbjct: 82 LHGDATITYDEAVRKYFLATYVRYREGWIAAAREEYFDAVMVMSARPEQDRWSRFYKTDN 141

Query: 158 SIYAQSHLEREVHI-VNIAIYQQDNNPIASVSIAAKLTNENKLVYEKRYKIVLSYLFDTP 216
Q+ L + V I +A V + + + + Y D
Sbjct: 142 PQSPQNILANRTDVFVEIKRVSFLGGNVAQVYFTKESVTGSNST-KTDAVATIKYKVDGT 200

Query: 217 DFDYASMPKNPTGFKITRYSIT 238
KNP G+++ Y
Sbjct: 201 PSKEVDRFKNPLGYQVESYRAD 222


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
HPGAM_00195  TYPE4SSCAGX  31     0.005    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 31.3 bits (70), Expect = 0.005
Identities = 35/115 (30%), Positives = 52/115 (45%), Gaps = 25/115 (21%)

Query: 155 FIEDKNYYTNAFIKPQKENQENMTENAPKDAQKNNKPLKEEKEETKTPEEEVIIIGDNTN 214
I+ +N T A+I N A + N + ++EEK++ II D
Sbjct: 339 LIKQENLNTTAYI--------NRVMMASNEQIINKEKIREEKQK---------IILDQAK 381

Query: 215 AMKIIKKDIQKGYKALKSSQ--RKWYCLWACSKKSKLSLMPEEIFNDKQFTYFKF 267
A+ + Q + ALK + R + A K+SK +MP EIF+D FTYF F
Sbjct: 382 AL-----ETQYVHNALKRNPVPRNYNYYQAPEKRSK-HIMPSEIFDDGTFTYFGF 430


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_00210  NUCEPIMERASE  88     2e-21    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 87.5 bits (217), Expect = 2e-21
Identities = 46/180 (25%), Positives = 72/180 (40%), Gaps = 19/180 (10%)

Query: 7 LITGVTGQDGSYLAEYLLNLGYEVHGLKRRSSSINTSRIDHLYEDLHSEHKRRFFLHYGD 66
L+TG G G ++++ LL G++V G+ + + S E L F H D
Sbjct: 4 LVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQP---GFQFHKID 60

Query: 67 MTDSSNLIHLIATTKPTEIYNLAAQSHVKVSFETPEYTANADGIGTLRILEAMRILGLEK 126
+ D + L A+ ++ + V+ S E P A+++ G L ILE R ++
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQ- 119

Query: 127 KTRFYQASTSELYGEVLETPQNENTPF-------NPRSPYAVAKMYAFYITKNYREAYNL 179
AS+S +YG N PF +P S YA K + Y Y L
Sbjct: 120 --HLLYASSSSVYGL------NRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGL 171


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_00215  NUCEPIMERASE  51     1e-09    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 51.3 bits (123), Expect = 1e-09
Identities = 52/346 (15%), Positives = 108/346 (31%), Gaps = 54/346 (15%)

Query: 5 ILITGAYGMVGQNTALYFKKNKPDV-----------TLLTPKKSELY-----------LL 42
L+TGA G +G + + + V L + EL L
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDLA 62

Query: 43 DKDNVQAYLKEYKPTGIIHCAGRVGGIVANMNDLSTYMVENLLMGLYLFSSALDLGVKKA 102
D++ + + R + ++ + Y NL L + ++
Sbjct: 63 DREGMTDLFASGHFERVFISPHR-LAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHL 121

Query: 103 INLASSCTYPKYAPNPLKESDLLNGSLEPTNEGYALAKLSVMKYCEYVSAEKGVFYKTLV 162
+ +SS Y P D ++ + YA K + S G+ L
Sbjct: 122 LYASSSSVYGLNRKMPFSTDDSVDHPVSL----YAATKKANELMAHTYSHLYGLPATGLR 177

Query: 163 PCNLYGEFDKFEEKIAHMIPGLIARMHTAKLKNEKNFAMWGDGTARREYLNAKDLARFIA 222
+YG + + P + T + K+ ++ G +R++ D+A I
Sbjct: 178 FFTVYGPWGR---------PDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAII 228

Query: 223 LAYENIAQ----------MPS-------VMNVGSGVDYSIEEYYEKVAQVLDYKGVFVKD 265
+ I P+ V N+G+ + +Y + + L +
Sbjct: 229 RLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNML 288

Query: 266 LSKPVGMQQKLMDISK-QKALKWELEIPLEQGIKETYEYYLKLLEV 310
+P + + D + + + E ++ G+K +Y +V
Sbjct: 289 PLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDFYKV 334


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_00225  ANTHRAXTOXNA  27     0.032    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 27.0 bits (59), Expect = 0.032
Identities = 22/71 (30%), Positives = 32/71 (45%), Gaps = 15/71 (21%)

Query: 20 KEKGIKKKIKEEET-IKKEKIKEKIK---------EAFD-----EKLPKKPPIDIEVTFS 64
E IK+ K E+ +KEK K+ I E D + L KK P D+ +S
Sbjct: 39 TESDIKRNHKTEKNKTEKEKFKDSINNLVKTEFTNETLDKIQQTQDLLKKIPKDVLEIYS 98

Query: 65 KYGHGLYWIDI 75
+ G +Y+ DI
Sbjct: 99 ELGGEIYFTDI 109


20  HPGAM_01385  HPGAM_01420  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPGAM_01385  -2      15       1.400084   neutrophil activating protein
HPGAM_01390  -3      15       1.393854   putative histidine kinase sensor protein
HPGAM_01395  -3      13       2.027058   hypothetical protein
HPGAM_01400  -3      12       2.422373   flagellar basal body P-ring protein
HPGAM_01405  -2      12       2.145173   ATP-dependent RNA helicase DeaD
HPGAM_01410  -2      11       1.750823   hypothetical protein
HPGAM_01415  -2      11       1.207631   hypothetical protein
HPGAM_01420  -2      12       2.036790   ABC transporter, ATP-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
HPGAM_01385  HELNAPAPROT  148    8e-49    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 148 bits (374), Expect = 8e-49
Identities = 38/140 (27%), Positives = 75/140 (53%), Gaps = 1/140 (0%)

Query: 5 EILKHLQADAIVLFMKVHNFHWNVKGTDFFNVHKATEEIYEGFADMFDDLAERIVQLGHH 64
L ++ +L+ K+H FHW VKG FF +H+ EE+Y+ A+ D +AER++ +G
Sbjct: 15 NSLNTQLSNWFLLYSKLHRFHWYVKGPHFFTLHEKFEELYDHAAETVDTIAERLLAIGGQ 74

Query: 65 PLVTLSEALKLTRVKEETKTSFHSKDIFKEILGDYKHLEKEFEELSNTAEKEGDKVTVTY 124
P+ T+ E + + + + + ++ + ++ DYK + E + + AE+ D T
Sbjct: 75 PVATVKEYTEHASITDGGNET-SASEMVQALVNDYKQISSESKFVIGLAEENQDNATADL 133

Query: 125 ADDQLAKLQKSIWMLEAHLA 144
+ +++K +WML ++L
Sbjct: 134 FVGLIEEVEKQVWMLSSYLG 153


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
HPGAM_01390  PF06580  30     0.014    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 30.2 bits (68), Expect = 0.014
Identities = 10/71 (14%), Positives = 25/71 (35%), Gaps = 13/71 (18%)

Query: 286 IVLQNFLYNAIDAIEALEESEQ-GQVKIEAFIQNEFIVFTIIDNGKEVENKSALFEPFET 344
+++Q + N I + + Q G++ ++ N + + + G +
Sbjct: 258 MLVQTLVENGI--KHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTK------- 308

Query: 345 TKLKGNGLGLA 355
+ G GL
Sbjct: 309 ---ESTGTGLQ 316


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_01400  FLGPRINGFLGI  363    e-127    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 363 bits (933), Expect = e-127
Identities = 117/345 (33%), Positives = 191/345 (55%), Gaps = 26/345 (7%)

Query: 19 AEKIGDIASVVGVRDNQLIGYGLVIGLNGTGDK-SGSKFTMQSISNMLESVNVKISADDI 77
+I DIAS+ RDNQLIGYGLV+GL GTGD S FT QS+ ML+++ +
Sbjct: 28 TSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMRAMLQNLGITTQGGQS 87

Query: 78 KSKNVAAVMITASLPPFARQGDKIDIHISSIGDAKSIQGGTLVMTPLNAVDGNIYALAQG 137
+KN+AAVM+TA+LPPFA G ++D+ +SS+GDA S++GG L+MT L+ DG IYA+AQG
Sbjct: 88 NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLIMTSLSGADGQIYAVAQG 147

Query: 138 AIVSGN-----------SNNLLSANIINGATIEREVSYDLFHKNAMTLSLKNPNFKNAIQ 186
A++ SA + NGA IERE+ + L L+NP+F A++
Sbjct: 148 ALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDSVNLVLQLRNPDFSTAVR 207

Query: 187 VQNTLNKV----FGNKVATALDPKTIQITRPERLSMVEFLALVQEIPINYSAKNKIIVDE 242
V + +N +G+ +A D + I + +P + +A ++ + + K++++E
Sbjct: 208 VADVVNAFARARYGDPIAEPRDSQEIAVQKPRVADLTRLMAEIENLTVETDTPAKVVINE 267

Query: 243 KSGTIVSGVDIIVHPIVVTSQDITLKITKEP--------LSDSKNTQDLDNNMSLDTAHN 294
++GTIV G D+ + + V+ +T+++T+ P Q + M++
Sbjct: 268 RTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQPAPFSRGQTAVQPQTDIMAMQEGSK 327

Query: 295 TLSSNGKNITIAGVVKALQKIGVSAKGMVSILQALKKSGAISAEM 339
G ++ +V L IG+ A G+++ILQ +K +GA+ AE+
Sbjct: 328 VAIVEGPDLR--TLVAGLNSIGLKADGIIAILQGIKSAGALQAEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
HPGAM_01420  HTHFIS  32     0.006    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 32.1 bits (73), Expect = 0.006
Identities = 16/50 (32%), Positives = 21/50 (42%), Gaps = 7/50 (14%)

Query: 30 VAIVGESGSGKSSIANLIMRLNPR----FKPHNGEILFETTNLLKESEAF 75
+ I GESG+GK +A + R F N + L ESE F
Sbjct: 163 LMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRD---LIESELF 209


21  HPGAM_01895  HPGAM_01950  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPGAM_01895  -2      10       1.113168   flagellar MS-ring protein
HPGAM_01900  -2      12       1.475857   flagellar motor switch protein G
HPGAM_01905  -2      13       1.080917   flagellar assembly protein H
HPGAM_01910  -2      11       1.990907   1-deoxy-D-xylulose-5-phosphate synthase
HPGAM_01915  -1      11       1.260824   GTP-binding protein LepA
HPGAM_01920  0       13       -0.082099  hypothetical protein
HPGAM_01925  2       14       1.092562   hypothetical protein
HPGAM_01930  0       12       -0.624222  hypothetical protein
HPGAM_01935  -1      11       0.457412   flagellar basal-body rod protein
HPGAM_01940  0       12       -0.162913  alpha-ketoglutarate permease
HPGAM_01945  0       13       -0.739537  hypothetical protein
HPGAM_01950  0       13       -0.634560  cell division protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_01895  FLGMRINGFLIF  551    0.0      Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 551 bits (1420), Expect = 0.0
Identities = 179/582 (30%), Positives = 294/582 (50%), Gaps = 66/582 (11%)

Query: 11 VDFFIKLNKKQKIALIAAGVLITALLVFLLLYPFKEKDYAQGGYGVLFERLDSSDNALIL 70
+++ +L +I LI AG A++V ++L+ K DY LF L D I+
Sbjct: 13 LEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWA-KTPDYR-----TLFSNLSDQDGGAIV 66

Query: 71 QHLQQNQIPYKVLKDD-TILIPKDKVYEERITLASQGIPKTSKVGFEIFDTKDFGATDFD 129
L Q IPY+ I +P DKV+E R+ LA QG+PK VGFE+ D + FG + F
Sbjct: 67 AQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQFS 126

Query: 130 QNIKLIRAIEGELSRTIESLNPILKANVHIAIPKDSVFVAKEVPPSASVMLKLKPDMKLS 189
+ + RA+EGEL+RTIE+L P+ A VH+A+PK S+FV ++ PSASV + L+P L
Sbjct: 127 EQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRALD 186

Query: 190 PTQILGIKNLIAAAVPKLTIENVKIVNENGESIGEGDILENSKELALEQLRYKQNFENIL 249
QI + +L+++AV L NV +V+++G + + + + ++L QL++ + E+ +
Sbjct: 187 EGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNT--SGRDLNDAQLKFANDVESRI 244

Query: 250 ENKIVNILAPIVGGKNKVVARVNAEFDFSQKKSTKETFDPNN-----VVRSEQNLEEKKE 304
+ +I IL+PIVG N V A+V A+ DF+ K+ T+E + PN +RS Q ++
Sbjct: 245 QRRIEAILSPIVGNGN-VHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQV 303

Query: 305 GASKKQVGGVPGVVSN-IGPVQGLKDNKEPEKYEKSQN---------------------- 341
GA GGVPG +SN P P + +QN
Sbjct: 304 GAGYP--GGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNE 361

Query: 342 TTNYEVGKTISEIKGEFGTLVRLNAAVVVDGKYKIALKDGANALEYEPLSDESLKKINAL 401
T+NYEV +TI K G + RL+ AVVV+ K L DG + PL+ + +K+I L
Sbjct: 362 TSNYEVDRTIRHTKMNVGDIERLSVAVVVNYK---TLADG----KPLPLTADQMKQIEDL 414

Query: 402 VKQAIGYNQNRGDDVAVSNFEFNPMAPMIDNATLSEKIMHKTQKILGSFTPLIKYILVFI 461
++A+G++ RGD + V N F+ + T E + Q + +++LV +
Sbjct: 415 TREAMGFSDKRGDTLNVVNSPFSAVDN-----TGGELPFWQQQSFIDQLLAAGRWLLVLV 469

Query: 462 VLFIFYKKVIVPFSERMLEVVPDEDKEVKSMFEEMDEEEDELNKLGDLRKKVEDQLGLNA 521
V +I ++K + P R +E ++ + E + E L+K L+++ +Q
Sbjct: 470 VAWILWRKAVRPQLTRRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQ----- 524

Query: 522 SFSEEEVRYEIILEKIRGTLKERPDEIATLFKLLIKDEISSD 563
+ E++ ++IR E D + L+I+ +S+D
Sbjct: 525 -----RLGAEVMSQRIR----EMSDNDPRVVALVIRQWMSND 557


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_01900  FLGMOTORFLIG  348    e-122    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 348 bits (895), Expect = e-122
Identities = 121/338 (35%), Positives = 208/338 (61%), Gaps = 4/338 (1%)

Query: 8 KQKAQLDELSMSEKIAILLIQVGEDTTGEILRHLDIDSITEISKQIVQLNGTDKQIGAAV 67
K+ + L+ +K AILL+ +G + + ++ ++L + I ++ +I +L ++ V
Sbjct: 7 KEILDVSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNV 66

Query: 68 LEEFFAIFQSNQYINTGGLEYARELLTRTLGSEEARKVMDKLTKSLQTQKNFAYLGKIKP 127
L EF + + ++I GG++YARELL ++LG+++A +++ L +LQ+ + F ++ + P
Sbjct: 67 LLEFKELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQS-RPFEFVRRADP 125

Query: 128 QQLADFIINEHPQTIALILAHMEAPNAAETLSYFPDEMKAEISIRMANLGEISPQVVKRV 187
+ +FI EHPQTIALIL++++ A+ LS P E++ ++ R+A + SP+VV+ V
Sbjct: 126 ANILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREV 185

Query: 188 STVLENKLESLTSYK-IEVGGLRAVAEIFNRLGQKSAKTTLARIESVDNKLAGAIKEMMF 246
VLE KL SL+S GG+ V EI N +K+ K + +E D +LA IK+ MF
Sbjct: 186 ERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMF 245

Query: 247 TFEDIAKLDNFAIREILKVADKKDLSLALKTSTQDLTDKFLNNMSSRAAEQFVEEMQYLG 306
FEDI LD+ +I+ +L+ D ++L+ ALK+ + +K NMS RAA E+M++LG
Sbjct: 246 VFEDIVLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLG 305

Query: 307 AVKIKDVDVAQRKIIEIVQSLQEKG--VIQTGEEEDVI 342
+ KDV+ +Q+KI+ +++ L+E+G VI G EEDV+
Sbjct: 306 PTRRKDVEESQQKIVSLIRKLEEQGEIVISRGGEEDVL 343



Score = 31.3 bits (71), Expect = 0.006
Identities = 20/103 (19%), Positives = 41/103 (39%), Gaps = 3/103 (2%)

Query: 4 KLTPKQKAQLDELSMSEKIAILLIQVGEDTTGEILRHLDIDSITEISKQIVQLNGTDKQI 63
+ P + + IA++L + IL L + T ++++I ++ T ++
Sbjct: 122 RADPANILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEV 181

Query: 64 GAA---VLEEFFAIFQSNQYINTGGLEYARELLTRTLGSEEAR 103
VLE+ A S Y + GG++ E++ E
Sbjct: 182 VREVERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKF 224


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
HPGAM_01905  FLGFLIH  37     2e-05    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 37.5 bits (86), Expect = 2e-05
Identities = 44/207 (21%), Positives = 90/207 (43%), Gaps = 14/207 (6%)

Query: 48 PLEKKAIENDLIDCLLKKTDELSSHLVKLQMQFEKAQEES-KALIENAKNDGYKIGFKEG 106
E I + + L L +LQMQ A E+ +A I + G+K G++EG
Sbjct: 19 QAEFVPIVEPEETIIEEAEPSLEQQLAQLQMQ---AHEQGYQAGIAEGRQQGHKQGYQEG 75

Query: 107 EEKMRNELTHSVNEEKNQLLHAITALDEKMKSSENHLMALE----KELSAIAIDIAKEVI 162
+ L + E K+Q + + + + L AL+ L +A++ A++VI
Sbjct: 76 ---LAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRLMQMALEAARQVI 132

Query: 163 LKEVEDNSQKVALALAEELLKNVLDATDIHLKVNPLDYPYLNERLQNASKI---KLESNE 219
+ ++ + + + L + L + L+V+P D +++ L + +L +
Sbjct: 133 GQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGATLSLHGWRLRGDP 192

Query: 220 AISKGGVMITSSNGSLDGNLMERFKTL 246
+ GG +++ G LD ++ R++ L
Sbjct: 193 TLHPGGCKVSADEGDLDASVATRWQEL 219


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
HPGAM_01915  TCRTETOQM  112    3e-28    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 112 bits (282), Expect = 3e-28
Identities = 53/162 (32%), Positives = 87/162 (53%), Gaps = 7/162 (4%)

Query: 11 NIRNFSIIAHIDHGKSTLADCLIAECNAIS---NREMKSQVMDTMDIEKERGITIKAQSV 67
I N ++AH+D GK+TL + L+ AI+ + + + D +E++RGITI+
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 68 RLNYTLKGEDYVLNLIDTPGHVDFSYEVSRSLCSCEGALLVVDATQGVEAQTIANTYIAL 127
+ E+ +N+IDTPGH+DF EV RSL +GA+L++ A GV+AQT +
Sbjct: 62 SFQW----ENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALR 117

Query: 128 DNNLEILPVINKIDLPNANVLEVKQDIEDTIGIDCSNANEVS 169
+ + INKID ++ V QDI++ + + +V
Sbjct: 118 KMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVE 159



Score = 82.6 bits (204), Expect = 1e-18
Identities = 50/215 (23%), Positives = 90/215 (41%), Gaps = 17/215 (7%)

Query: 169 SAKAKLGIKDLLEKIITTIPAPSGDPNAPLKALIYDSWFDNYLGALALVRIMDGSINTEQ 228
SAK +GI +L+E I + + + L ++ + LA +R+ G ++
Sbjct: 220 SAKNNIGIDNLIEVITNKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRD 279

Query: 229 EILVMGTGKKHGVLGLYYPNPLKKIPTKSLECGEIGIV---SLGLKSVTDIAVGDTLTDA 285
+ + K + +Y + GEI I+ L L SV +GDT
Sbjct: 280 SVRISEKEKI-KITEMYTSINGELCKIDKAYSGEIVILQNEFLKLNSV----LGDTKLL- 333

Query: 286 KNPTSKPIEGFMPAKPFVFAGLYPIETDRFEDLREALLKLQLNDCALNFEPESSVALGFG 345
P + IE P + + P + + E L +ALL++ +D L + +S+
Sbjct: 334 --PQRERIEN---PLPLLQTTVEPSKPQQREMLLDALLEISDSDPLLRYYVDSATH---E 385

Query: 346 FRVGFLGLLHMEVIKERLEREFSLNLIATAPTVVY 380
+ FLG + MEV L+ ++ + + PTV+Y
Sbjct: 386 IILSFLGKVQMEVTCALLQEKYHVEIEIKEPTVIY 420



Score = 31.0 bits (70), Expect = 0.015
Identities = 15/75 (20%), Positives = 28/75 (37%), Gaps = 2/75 (2%)

Query: 407 IKEPFVRATIITPSEFLGNLMQLLNNKRGIQEKMEYLNQSRVMLTYSLPSNEIVMDFYDK 466
+ EP++ I P E+L + L + V+L+ +P+ I ++
Sbjct: 535 LLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCI-QEYRSD 592

Query: 467 LKSCTKGYASFDYEP 481
L T G + E
Sbjct: 593 LTFFTNGRSVCLTEL 607


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
HPGAM_01935  FLGHOOKAP1  30     0.009    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 30.3 bits (68), Expect = 0.009
Identities = 9/40 (22%), Positives = 16/40 (40%)

Query: 3 NGYYAATGAMATQFNRLDLTSNNLANLNTNGFKRDDAITG 42
+ A + L+ SNN+++ N G+ R I
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMA 41


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
HPGAM_01940  TCRTETB  40     1e-05    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 39.9 bits (93), Expect = 1e-05
Identities = 58/315 (18%), Positives = 104/315 (33%), Gaps = 67/315 (21%)

Query: 37 APYFAKEFTHTNDPTLALISAFLVFMLGFFMRPLGSLFFGKLGDKKGRKTSMVYSIILMA 96
P A +F T + +AF++ G+ +GKL D+ G K +++ II+
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSI------GTAVYGKLSDQLGIKRLLLFGIIINC 90

Query: 97 LGSFMLALLPTKEIVGEWAFLFLLLARLLQGFSVGGE------YGVVATYLSELGKNGKK 150
GS + VG F L++AR +QG G VVA Y+ + +
Sbjct: 91 FGSVIGF-------VGHSFFSLLIMARFIQG--AGAAAFPALVMVVVARYIPKENRGKAF 141

Query: 151 GFYGSFQYVTLVGGQLLAIFSLFIVENIYTHEQISAFAWRYLFALGGILALLSLFLRNIM 210
G GS + +G + I I+ W YL + I + FL ++
Sbjct: 142 GLIGS---IVAMGEGVGPAIGGMIAHYIH---------WSYLLLIPMITIITVPFLMKLL 189

Query: 211 EETMDSKTTSKTTIKEETQRGSLKELLNHKKALM-------IVFGLTMGGSLCFYTFTVY 263
+ + +K + K ++ + T +
Sbjct: 190 K-----------------KEVRIKGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLS 232

Query: 264 LKIFLTNSSSFSPK-------ESSFIMLLALSYFIFLQPLCG---MLADKIKRTQMLMVF 313
IF+ + + ++ M+ L I + G M+ +K L
Sbjct: 233 FLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTA 292

Query: 314 AITGLIVTPVVFYGI 328
I +I+ P I
Sbjct: 293 EIGSVIIFPGTMSVI 307


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
HPGAM_01950  IGASERPTASE  30     0.035    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.4 bits (68), Expect = 0.035
Identities = 34/209 (16%), Positives = 73/209 (34%), Gaps = 13/209 (6%)

Query: 163 QAFNLDFTLKKEGFENTPLDAQKKETKNDKDKENPKENPIDESHKTPNEESFLAIPTPYN 222
Q +L+ +L + + + D NP+ +++ T N + I
Sbjct: 949 QRDHLNVSLVGNTVDLGAWKYKLRNVNGRYDLYNPEVEKRNQTVDTTNITTPNNIQADVP 1008

Query: 223 TTLNNSEPQEGLVQISPHPPTHYTIYPKRNRFDDLTNPTNPPLKEPKQETKEREPTLKKE 282
+ +N+E + V +P PP + T N + E E++ T E
Sbjct: 1009 SVPSNNE-EIARVDEAPVPPPAPATPSETT----ETVAENSKQESKTVEKNEQDAT---E 1060

Query: 283 TPTTLKPIMPISAPNTENDNKTENHKTPNHPIKEDALQENPQKENQKENIEEKENLKEEE 342
T + + + N + + +T N + + + + +E++E K E
Sbjct: 1061 TTAQNREVAKEAKSNVKANTQT-NEVAQ---SGSETKETQTTETKETATVEKEEKAKVE- 1115

Query: 343 KRETQNAPNFSPITPTSAKKPVMVKELSE 371
+TQ P + ++ V+ +E
Sbjct: 1116 TEKTQEVPKVTSQVSPKQEQSETVQPQAE 1144


22  HPGAM_02210  HPGAM_02245  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPGAM_02210  -1      14       -0.104798  3-dehydroquinate dehydratase
HPGAM_02215  1       17       -0.062154  X-Pro aminopeptidase
HPGAM_02220  1       15       -0.503080  2-amino-4-hydroxy-6-
HPGAM_02225  1       16       -0.133313  flagellar biosynthesis regulator FlhF
HPGAM_02230  2       16       0.032376   hypothetical protein
HPGAM_02235  3       17       -0.052583  flagellar biosynthesis sigma factor
HPGAM_02240  0       13       0.284505   flagellar motor switch protein FliM
HPGAM_02245  -1      12       -0.039165  flagellar motor switch protein FliY
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
HPGAM_02210  V8PROTEASE  27     0.045    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 26.9 bits (59), Expect = 0.045
Identities = 9/27 (33%), Positives = 16/27 (59%)

Query: 141 MVNILAEMKAFQEAQKNNPNNPNNPIN 167
+ + ++ + Q NNP+NP+NP N
Sbjct: 274 LKQNIEDIHFANDDQPNNPDNPDNPNN 300


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
HPGAM_02225  PF05272  33     0.003    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.7 bits (74), Expect = 0.003
Identities = 11/57 (19%), Positives = 24/57 (42%), Gaps = 3/57 (5%)

Query: 224 ENSVTIKRYFREVLRKMI---LCRPEDLNLRQKRILMLVGPTGVGKTTTLAKLAARY 277
+ RY + V + ++ + R + + ++L G G+GK+T + L
Sbjct: 564 DYKPRRLRYLQLVGKYILMGHVARVMEPGCKFDYSVVLEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_02240  FLGMOTORFLIM  431    e-154    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 431 bits (1109), Expect = e-154
Identities = 122/342 (35%), Positives = 207/342 (60%), Gaps = 3/342 (0%)

Query: 1 MADILSQEEIDALLEVVDENVDIQNVQKKDIIPQRSVTLYDFKRPNRVSKEQLRSFRSIH 60
M ++LSQ+EID LL + D + I R +TLYDF+RP++ SKEQ+R+ +H
Sbjct: 1 MTEVLSQDEIDQLLTAISSG-DASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMH 59

Query: 61 DKMARNLSSQVSSIMRSIVEIQLHSVDQMTYGEFLMSLPSPTSFNVFSMKPMGGTGVLEI 120
+ AR ++ +S+ +RS+V + + SVDQ+TY EF+ S+P+P++ V +M P+ G VLE+
Sbjct: 60 ETFARLTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEV 119

Query: 121 NPSIAFPMIDRLLGGKGSAYDQNREFSDIELNLLDTILRQVMQILKEVWSPVVEMYPTID 180
+PSI F +IDRL GG G A R+ +DIE ++++ ++ +++ ++E W+ V+++ P +
Sbjct: 120 DPSITFSIIDRLFGGTGQAAKVQRDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRLG 179

Query: 181 AKESSANVVQIVAQNEISIMVVLEIIIGHSRGMMNICYPVISIESILSKMGSRDFMLSET 240
E++ QIV +E+ ++V LE +G GMMN C P I+IE I+SK+ S+ + S
Sbjct: 180 QIETNPQFAQIVPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSVR 239

Query: 241 NSKKSRNKE-LQALLSGVSVDMMVFLGAVELSLKEMLDLDVGDTIRLNKI-ANDEVSVYV 298
S ++ L+ LS V +D++ +G++ LS++++L L VGD IRL+ D + +
Sbjct: 240 RSSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPFVLSI 299

Query: 299 HKKKRYLASVGFQGYRKTIQIKEVVYSEKERTKEILEMLEEQ 340
+K++L G G + QI E + S + E L EE+
Sbjct: 300 GNRKKFLCQPGVVGKKIAAQILERIESTSQEDFEELSADEEE 341


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_02245  FLGMOTORFLIN  112    3e-33    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 112 bits (282), Expect = 3e-33
Identities = 42/124 (33%), Positives = 77/124 (62%), Gaps = 1/124 (0%)

Query: 161 TEAFEGQFEKTHKEEKEETTKSATEET-KTHDASLENIEIRNISMLLDVKLNVKVRIGQK 219
T A + + E+K TTKSA + + + +++I +++D+ + + V +G+
Sbjct: 12 TGALDDLWADALNEQKATTTKSAADAVFQQLGGGDVSGAMQDIDLIMDIPVKLTVELGRT 71

Query: 220 KMILKDVVSMDIGSVVELDQLVNDPLEILVDDKVIAKGEVVIVDGNFGIQITDIGTKKER 279
+M +K+++ + GSVV LD L +PL+IL++ +IA+GEVV+V +G++ITDI T ER
Sbjct: 72 RMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGVRITDIITPSER 131

Query: 280 LEQL 283
+ +L
Sbjct: 132 MRRL 135


23  HPGAM_02980  HPGAM_03010  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPGAM_02980  1       13       -1.584338  hypothetical protein
HPGAM_02985  1       13       -0.856431  hypothetical protein
HPGAM_02990  1       15       -0.593500  dihydroorotase
HPGAM_02995  0       16       -2.694962  hypothetical protein
HPGAM_03000  -2      14       -2.866644  hypothetical protein
HPGAM_03005  -2      15       -2.389843  flagellar motor switch protein
HPGAM_03010  -1      13       -1.091941  endonuclease III
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_02980  TYPE3IMSPROT  31     0.003    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 30.5 bits (69), Expect = 0.003
Identities = 19/66 (28%), Positives = 29/66 (43%), Gaps = 4/66 (6%)

Query: 88 LQSYSVMLFFNLLLLTDILGFLPFSIYHHFMASLIFSAFFCSSLFLSSPLLGVIALVALS 147
L Y F L+L+ +LPFS S + +L PLL V AL+A++
Sbjct: 45 LSDYYFEHFSKLMLIPAEQSYLPFSQ----ALSYVVDNVLLEFFYLCFPLLTVAALMAIA 100

Query: 148 SSLLVR 153
S ++
Sbjct: 101 SHVVQY 106


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
HPGAM_02995  TONBPROTEIN  49     5e-09    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 49.2 bits (117), Expect = 5e-09
Identities = 24/57 (42%), Positives = 28/57 (49%)

Query: 83 APKPTLAGPQKPPTPPTPPTPPTPPKPIEKPKPEPKPKPKPEPKKPNHKHKALKKVE 139
P P +P P P P P IEKPKP+PKPKPKP K + +K VE
Sbjct: 62 QPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVE 118



Score = 45.0 bits (106), Expect = 1e-07
Identities = 25/70 (35%), Positives = 32/70 (45%), Gaps = 8/70 (11%)

Query: 84 PKPTLAGPQKPPTPPTPPTPPTPPKPIEKPKPEPKPKPKPEPKKPNHKHKALKKVEKVEE 143
P + P +P P P P P P E P KPKPKP+PK K V+KV+E
Sbjct: 57 PPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKP--------KPVKKVQE 108

Query: 144 KKVVEEKKEE 153
+ + K E
Sbjct: 109 QPKRDVKPVE 118



Score = 39.2 bits (91), Expect = 9e-06
Identities = 43/214 (20%), Positives = 79/214 (36%), Gaps = 34/214 (15%)

Query: 98 PTPPTPPTPPKPIEKPKPEPKPKPKPEPKKPNHKHKALKKVEKVEEKKVVEEKKEEKKVV 157
P PP +P +P EP+P+P+P P+ P K+ V EK + K + K V
Sbjct: 52 PADLEPPQAVQPPPEPVVEPEPEPEPIPEPP-------KEAPVVIEKPKPKPKPKPKPVK 104

Query: 158 EQKVEQKKIEEKKPVKKEFDPNQLSFLPKEVAPPRQENNKGLDNQTRRDIDELYGEEFGD 217
KV+++ + KPV E P N T +
Sbjct: 105 --KVQEQPKRDVKPV--------------ESRPASPFENTAPARLTSSTATAATSKPVTS 148

Query: 218 LGTAEKDFIRNNLRDIGRITQKYLEYPQVAAYLGQDGTNAVEFYLHPNGDITDLKIIIGS 277
+ + + RN + YP A L +G V+F + P+G + +++I+
Sbjct: 149 VASGPRALSRNQPQ-----------YPARAQALRIEGQVKVKFDVTPDGRVDNVQILSAK 197

Query: 278 EYKMLDDNTLKTIQIAYKDYPRPKTKTLIRIRVR 311
M + ++ + +P + ++ I +
Sbjct: 198 PANMFEREVKNAMRRWRYEPGKPGSGIVVNILFK 231



Score = 39.2 bits (91), Expect = 9e-06
Identities = 25/72 (34%), Positives = 31/72 (43%), Gaps = 1/72 (1%)

Query: 87 TLAGPQKPPTPPTPPTPPTPPKPIEKPKPEPKPKPKPEPKKPNHKHKALKKVEKVEEKKV 146
T+ P P PP P E P+PEP P+P E K K K + KKV
Sbjct: 48 TMVTPADLEPPQAVQPPPEPVVEPE-PEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKV 106

Query: 147 VEEKKEEKKVVE 158
E+ K + K VE
Sbjct: 107 QEQPKRDVKPVE 118



Score = 38.0 bits (88), Expect = 2e-05
Identities = 16/54 (29%), Positives = 21/54 (38%)

Query: 74 QDPNKNNPGAPKPTLAGPQKPPTPPTPPTPPTPPKPIEKPKPEPKPKPKPEPKK 127
Q +P P P P PKP KPKP+P K + +PK+
Sbjct: 59 QAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKR 112



Score = 32.7 bits (74), Expect = 0.001
Identities = 14/56 (25%), Positives = 22/56 (39%)

Query: 74 QDPNKNNPGAPKPTLAGPQKPPTPPTPPTPPTPPKPIEKPKPEPKPKPKPEPKKPN 129
+P P+P P++ P P P PKP K + +PK +P +
Sbjct: 65 PEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESR 120



Score = 31.1 bits (70), Expect = 0.005
Identities = 12/52 (23%), Positives = 16/52 (30%)

Query: 75 DPNKNNPGAPKPTLAGPQKPPTPPTPPTPPTPPKPIEKPKPEPKPKPKPEPK 126
+P P + P P P P K E+PK + KP
Sbjct: 72 EPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESRPAS 123


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_03005  FLGMOTORFLIN  100    2e-30    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 100 bits (250), Expect = 2e-30
Identities = 25/77 (32%), Positives = 47/77 (61%)

Query: 34 LICDYKNLLDMEIVFSAELGSTQIPLLQILRFEKGSVIDLQKPAGESVDTFVNGRVIGKG 93
+ D ++D+ + + ELG T++ + ++LR +GSV+ L AGE +D +NG +I +G
Sbjct: 50 AMQDIDLIMDIPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQG 109

Query: 94 EVMVFERNLAIRLNEIL 110
EV+V +R+ +I+
Sbjct: 110 EVVVVADKYGVRITDII 126


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
HPGAM_03010  OMS28PORIN  29     0.009    OMS28 porin signature.
		>OMS28PORIN#OMS28 porin signature.

Length = 257

Score = 29.4 bits (65), Expect = 0.009
Identities = 26/109 (23%), Positives = 51/109 (46%), Gaps = 5/109 (4%)

Query: 23 NQTTELHHKNPYELLVATILSAQCTDARVNQITPKLFEKYPSVNDLALASLE--EVKEII 80
N+ E+ K E A ++ + T QI + K P+ +L L E +V+++
Sbjct: 132 NKVVEMSKKAVQETQKAVSVAGEATFLIEKQI---MLNKSPNNKELELTKEEFAKVEQVK 188

Query: 81 QSVSYSNNKSKHLINMAQKVVRDFKGVIPSTQKELMSLDGVGQKTANVV 129
+++ S + AQKV+ G+ PS + ++++ V + +NVV
Sbjct: 189 ETLMASERALDETVQEAQKVLNMVNGLNPSNKDQVLAKKDVAKAISNVV 237


24  HPGAM_03080  HPGAM_03160  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPGAM_03080  -2      12       -0.143830  methyl-accepting chemotaxis protein (MCP)
HPGAM_03095  -2      10        0.008014  hypothetical protein
HPGAM_03100  -3      10        0.754869  flagellin A
HPGAM_03105  -4      11        0.867241  3-methyladenine DNA glycosylase
HPGAM_03110  -3      11        1.236553  hypothetical protein
HPGAM_03115   1      11        0.598709  uroporphyrinogen decarboxylase
HPGAM_03120   1      10        0.205281  outer-membrane protein of the hefABC efflux
HPGAM_03125   2      10        0.094273  hypothetical protein
HPGAM_03130   2      10       -0.271792  putative efflux transporter
HPGAM_03135   3      11       -1.059228  hypothetical protein
HPGAM_03140   1      11       -0.871530  putative vacuolating cytotoxin (VacA)-like
HPGAM_03145  -1      17       -2.465140  putative ABC transporter permease
HPGAM_03150  -3      12       -0.749017  ABC transporter, permease
HPGAM_03155  -3      10       -0.341896  ABC transporter, ATP-binding protein
HPGAM_03160  -2       9       -0.440112  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
HPGAM_03080  OMS28PORIN  30     0.014    OMS28 porin signature.
		>OMS28PORIN#OMS28 porin signature.

Length = 257

Score = 30.1 bits (67), Expect = 0.014
Identities = 26/102 (25%), Positives = 49/102 (48%), Gaps = 2/102 (1%)

Query: 143 NAAKNGEEHSNEGLITVNKTGQDIESLYEKMQNATSLADSLNQRS--NEITQVISLIDDI 200
N + ++ N+ L T+NK +D+ S E ++ ++ N + +SL+ D+
Sbjct: 47 NKKLDQKDQVNQALDTINKVTEDVSSKLEGVRESSLELVESNDAGVVKKFVGSMSLMSDV 106

Query: 201 AEQTNLLALNAAIEAARAGEHGRGFAVVADEVRKLAEKTQKA 242
A+ T + + A I A +G G V + +K ++TQKA
Sbjct: 107 AKGTVVASQEATIVAKCSGMVAEGANKVVEMSKKAVQETQKA 148


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
HPGAM_03100  FLAGELLIN  244    6e-77    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 244 bits (624), Expect = 6e-77
Identities = 126/518 (24%), Positives = 209/518 (40%), Gaps = 22/518 (4%)

Query: 2 AFQVNTNINAMNAHVQSALTQNALKTSLERLSSGLRINKAADDASGMTVADSLRSQASSL 61
A +NTN ++ +Q++L +++ERLSSGLRIN A DDA+G +A+ S L
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 GQAIANTNDGMGIIQVADKAMDEQLKILDTVKVKATQAAQDGQTTESRKAIQSDIVRLIQ 121
QA N NDG+ I Q + A++E L V+ + QA + K+IQ +I + ++
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 GLDNIGNTTTYNGQALLSGQFTNKEFQVGAYSNQSIKASIGSTTSDKIGQVRI-ATGALI 180
+D + N T +NG +LS + QVGA ++I + +G G
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDN-QMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKE 179

Query: 181 TASGDISLTFKQVDGVNDVTLESVKVSSSAGTGIGVLAEVINKNSNRTGVKAYASVITTS 240
GD+ +FK V G + + + K +G V ++ V A +TT
Sbjct: 180 ATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTD 239

Query: 241 DVAVQSGSLSNLTLNGIHLGNIADIKKNDSDGRLVAAINAVTSETGVEAYTDQKGRLNLR 300
D N + K A A+ + + + +
Sbjct: 240 DAE-----------NNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTID 288

Query: 301 SIDGRGIEIKTDSVSNGPSALTMVNGGQDLTKGSTNYGRLSLTRLDAKSINV------VS 354
+ G K + NG V S + +N +
Sbjct: 289 TKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKT 348

Query: 355 ASDSQHLGFTAIGFGESQVAETTVNLRDVTGNFNANVKSASGANYNAVIASGNQSL---G 411
++S L ++ TVN + T N + + +G + S
Sbjct: 349 KNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINE 408

Query: 412 SGVTTLRGAMVVIDIAESAMKMLDKVRSDLGSVQNQMISTVNNISITQVNVKAAESQIRD 471
+ + +SA+ +D VRS LG++QN+ S + N+ T N+ +A S+I D
Sbjct: 409 DAAAAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIED 468

Query: 472 VDFAEESANFNKNNILAQSGSYAMSQANTVQQNILRLL 509
D+A E +N +K IL Q+G+ ++QAN V QN+L LL
Sbjct: 469 ADYATEVSNMSKAQILQQAGTSVLAQANQVPQNVLSLL 506


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
HPGAM_03120  RTXTOXIND  29     0.032    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.4 bits (66), Expect = 0.032
Identities = 14/113 (12%), Positives = 41/113 (36%), Gaps = 16/113 (14%)

Query: 203 LARMIALQKKLEQIQTDIKRVTKLYDKGLTTIDDL-----QSLKAQGNLSEY--DILDMQ 255
LAR+ + ++ + + L K + + ++A L Y + ++
Sbjct: 220 LARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIE 279

Query: 256 FALEQNRLTLEYLTNLNVKNLKKTTIDVPNLQLRE-RKDLVSLREQISALKYQ 307
+ + + +T K +D +LR+ ++ L +++ + +
Sbjct: 280 SEILSAKEEYQLVTQ----LFKNEILD----KLRQTTDNIGLLTLELAKNEER 324


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
HPGAM_03125  RTXTOXIND  52     5e-10    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 51.8 bits (124), Expect = 5e-10
Identities = 24/82 (29%), Positives = 37/82 (45%), Gaps = 5/82 (6%)

Query: 27 NVKAIQDSKLTLDSTGIVDSIKVTEGSVVKKGDVLLLLYNQDKQAQSDSTEQQLIFAKKQ 86
K I+ IV I V EG V+KGDVLL L +A + T+ L+ A+ +
Sbjct: 95 RSKEIKPI-----ENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLE 149

Query: 87 YQRYSKIGGAVDKNTLEGYEFT 108
RY + +++ N L +
Sbjct: 150 QTRYQILSRSIELNKLPELKLP 171



Score = 32.1 bits (73), Expect = 0.002
Identities = 19/115 (16%), Positives = 41/115 (35%), Gaps = 13/115 (11%)

Query: 70 QAQSDSTEQQLIFAKKQYQR--YSKIGGAVDKNTLEGYEFTYRRLESDYAYSIAVLNKTI 127
+++ S +++ + ++ K+ D L L + A + ++
Sbjct: 279 ESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGL---------LTLELAKNEERQQASV 329

Query: 128 LRAPFDGVVASKNIQVGEGVSANSTVLLRLVSHARKLVIE--FDSKYINAVKVGD 180
+RAP V + GV + L+ +V L + +K I + VG
Sbjct: 330 IRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQ 384


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_03130  ACRIFLAVINRP  898    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 898 bits (2323), Expect = 0.0
Identities = 287/1040 (27%), Positives = 518/1040 (49%), Gaps = 42/1040 (4%)

Query: 1 MYKTAINRPITTLMFALAIVFFGTMGFKKLSVALFPKIDLPTVVVTTTYPGASAEIIESK 60
M I RPI + A+ ++ G + +L VA +P I P V V+ YPGA A+ ++
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTDKIEEAVMGIDGIKKVTSTSSKNVSIVV-IEFELEKPNEEALNDVVNKISSVR-FDDS 118
VT IE+ + GID + ++STS S+ + + F+ + A V NK+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 119 NIKKPSVNKFDTDSQAIISLFVSSSSVPAT--TLNDYAKNTIKPMLQKINGVGGVQLNGF 176
+++ ++ + S ++ S + T ++DY + +K L ++NGVG VQL G
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG- 179

Query: 177 RERQIRIYADPTLMNKYNLTYADLFSTLKAENVEIDGGRIVNS------QRELSILVNAN 230
+ +RI+ D L+NKY LT D+ + LK +N +I G++ + Q SI+
Sbjct: 180 AQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 231 SYSVADVEKIQV-----GNHVRLGDIAKIEIGLEEDNTFASFKDKPGVILEIQKIAGANE 285
+ + K+ + G+ VRL D+A++E+G E N A KP L I+ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 286 IEIVDRVYEALKRIQAISP-SYEIRPFLDTTTFIRSSIEDVKFDLILGAILAVLVVFAFL 344
++ + L +Q P ++ DTT F++ SI +V L +L LV++ FL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 345 RSGTITLVSAISIPISIMGTFALIQWMGFSLNMLTMVALTLAIGIIIDDAIVVIENIHK- 403
++ TL+ I++P+ ++GTFA++ G+S+N LTM + LAIG+++DDAIVV+EN+ +
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 404 KLEMGMSKRKASYEGVREIGFALVAISAMLLSVFVPIGNMKGIIGRFFQSFGITVALAIA 463
+E + ++A+ + + +I ALV I+ +L +VF+P+ G G ++ F IT+ A+A
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 464 LSYVVVVTIIPMVSSVVVNPKHS-------RFYVWSEPFFKALESRYTRLLQWVLNHKLI 516
LS +V + + P + + ++ P + F+ W F + YT + +L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 517 IFIAVVLVFVGSLFVASKLGMEFMLKEDRGRFLVWLKAKPGVSIDY----MTQKSKIFQK 572
+ L+ G + + +L F+ +ED+G FL ++ G + + + Q + + K
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 573 AIEKHDEVEFTTLQVGY-GTTQNPFKAKIFVQLKPLKERKKEHKLGQFELMSALKKELKS 631
+ + E FT + G QN FV LKP +ER + ++ K EL
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNA--GMAFVSLKPWEERNG-DENSAEAVIHRAKMELGK 656

Query: 632 MPEAKDLDSINLSEVALIGGGGDSSPFQTFVFSHSQEAVDKSVENLRKFLLESPELKGKV 691
+ + + N+ + G ++ F + + D + + L + + +
Sbjct: 657 IRDGF-VIPFNMPAIV---ELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASL 712

Query: 692 ESYHTSTSESQPQLQLKILRQNANKYGVSAQTIGSVVSSAFSGTSQASVFKEDGKEYDMI 751
S + E Q +L++ ++ A GVS I +S+A G + + F + G+ +
Sbjct: 713 VSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGG-TYVNDFIDRGRVKKLY 771

Query: 752 IRVPDDKRVSVEDIKRLQVRNKYDKLMFLDALVEITETKSPSSISRYNRQRSVTVLAEPN 811
++ R+ ED+ +L VR+ +++ A + RYN S+ + E
Sbjct: 772 VQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEA- 830

Query: 812 RNAGVSLGEILTQVSKNTKEWLVEGANYRFTGEADNAKESNGEFLVALATAFVLIYMILA 871
G S G+ + + +N L G Y +TG + + S + +A +FV++++ LA
Sbjct: 831 -APGTSSGDAMALM-ENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLA 888

Query: 872 ALYESILEPFIIMVTMPLSFSGAFFALGLVHQPLSMFSMIGLILLIGMVGKNATLLIDVA 931
ALYES P +M+ +PL G A L +Q ++ M+GL+ IG+ KNA L+++ A
Sbjct: 889 ALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFA 948

Query: 932 NE-ERKKGLNIQEAILFAGKTRLRPILMTTIAMVCGMLPLALASGDGAAMKSPIGIAMSG 990
+ K+G + EA L A + RLRPILMT++A + G+LPLA+++G G+ ++ +GI + G
Sbjct: 949 KDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMG 1008

Query: 991 GLMISMVLSLLIVPVFYRLL 1010
G++ + +L++ VPVF+ ++
Sbjct: 1009 GMVSATLLAIFFVPVFFVVI 1028


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_03140  VACCYTOTOXIN  273    4e-76    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 273 bits (699), Expect = 4e-76
Identities = 106/397 (26%), Positives = 181/397 (45%), Gaps = 14/397 (3%)

Query: 2804 AGNNSILWLNELFAAKGGNPLFAPYYLQDNPTEHIVTLMKDITSALGMLSNSNLKNNSTD 2863
+G L L + +A + I + T+ L +++ K +
Sbjct: 904 SGAQGRDLLQTLLI-DSHDAGYARTMIDATSANEITKQLNTATTTLNNIASLEHKTSGLQ 962

Query: 2864 VLQLNTYTQQMSRLAKLSNFASFDSTDFSERLSSLKNQRFADATPNAMDVILKYSQRDKL 2923
L L+ SRL LS + F++RL +LK+QRFA +A +V+ +++ + +
Sbjct: 963 TLSLSNAMILNSRLVNLSRRHTNHIDSFAKRLQALKDQRFAS-LESAAEVLYQFAPKYEK 1021

Query: 2924 KNNLWATGVGGVSFVENGTGTLYGVNVGYDRFVRG---VIVGGYAAYGYSGFYER--ITN 2978
N+WA +GG S G +LYG + G D ++ G IVGG+ +YGYS F + N
Sbjct: 1022 PTNVWANAIGGTSLNSGGNASLYGTSAGVDAYLNGEVEAIVGGFGSYGYSSFSNQANSLN 1081

Query: 2979 SKSDNVDVGLYARAFIKKSELTFSVNETWGANKTQISSNDTLLSMINQSYKYSTWTTNAK 3038
S ++N + G+Y+R F + E F G++++ ++ LL +NQSY Y ++ +
Sbjct: 1082 SGANNTNFGVYSRIFANQHEFDFEAQGALGSDQSSLNFKSALLRDLNQSYNYLAYSAATR 1141

Query: 3039 VNYGYDFMFKNKSIILKPQIGLRYYYIGMSGLEGVMNNALYNQFKANADPSKKSVLTIDF 3098
+YGYDF F +++LKP +G+ Y ++G + + + S + +
Sbjct: 1142 ASYGYDFAFFRNALVLKPSVGVSYNHLGSTNFKS----NSNQKVALKNGASSQHLFNASA 1197

Query: 3099 ALENRHYFNTNSYFYAIGGVGRDLLVNSMGDKLVRFIGNNTLSYRKGDLYNTFANITTGG 3158
+E R+Y+ SYFY GV ++ N V + R NT A + GG
Sbjct: 1198 NVEARYYYGDTSYFYMNAGVLQEFA-NFGSSNAVSLNTFKVNATRNP--LNTHARVMMGG 1254

Query: 3159 EVRLFKSFYANAGVGARFGLDYKMIDIIGNIGMRLAF 3195
E++L K + N G L + N+GMR +F
Sbjct: 1255 ELKLAKEVFLNLGFVYLHNLISNIGHFASNLGMRYSF 1291



Score = 35.8 bits (82), Expect = 0.004
Identities = 16/100 (16%), Positives = 31/100 (31%), Gaps = 5/100 (5%)

Query: 703 SYTFDGANNTFNEDKFNGGSFNFNHAEQTDAFNNNSFNGGSFSFNAKQVDFNHNSFNGGV 762
SY+ + E FN + ++A Q +N + G+ + N + G
Sbjct: 272 SYSTINTSKVTGEVNFNHLTVGDHNAAQAGIIASNKTHIGTLDLW-QSAGLNIIAPPEGG 330

Query: 763 FNF---NNTPKASFTNDTFNVNNQFKING-TQTDFTFNKG 798
+ + + N + + N TQ N
Sbjct: 331 YKDKPNDKPSNTTQNNAKNDKQESSQNNSNTQVINPPNSA 370



Score = 33.9 bits (77), Expect = 0.013
Identities = 58/297 (19%), Positives = 97/297 (32%), Gaps = 32/297 (10%)

Query: 251 SNGATTISGV-TFNNNGALTYKGGNGIGGGITFINSNINHYKLNLNANSVTFNNSTLGSM 309
+ G T+ + N N T + G G +T ++++ K +N ++ S L
Sbjct: 386 AGGKNTVVNINRINTNADGTIRVG-GFKASLTTNAAHLHIGKGGINLSNQASGRSLLVEN 444

Query: 310 PNGNANTIGNAYILNANNITFNNLTFNGGWFVFMRPDSKIDFQGTTTINNPTSPFLNMTS 369
GN G L NN G + +F+ T N T+ F N S
Sbjct: 445 LTGNITVDGP---LRVNNQV-GGYALAGS-------SANFEFKAGTDTKNGTATFNNDIS 493

Query: 370 KVTINPNAIFNIQNYTPSIGSAYTLFSMKNGSIAYNDVNNLWNIIRLKNTQATKDADKNH 429
+ I + F+ + ++ V N NI +L T +T A KN
Sbjct: 494 LGRFVNLKVDAHTANFKGIDTGNGGFNT----LDFSGVTNKVNINKL-ITASTNVAVKN- 547

Query: 430 TSSNNNTHTYYVTYNLGGTLYNFRQIFSPDSIVLQSVYYGANNIYYTNSVNIHDNVFNLK 489
+ N ++G + I S I + G +IY K
Sbjct: 548 -FNINELVVKTNGVSVGEYTHFSEDIGSQSRINTVRLETGTRSIYSGG--------VKFK 598

Query: 490 NINDDRADAIFYLNGLNTWNYTNARFAQTYDGKNSALVFNATTPWANGSIPKSNSTV 546
+ +Y WNY +AR + + N +PW + +N T+
Sbjct: 599 GGEKLVINDFYY----APWNYFDARNIKNVEITNKLAFGPQGSPWGTAKLMFNNLTL 651


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
HPGAM_03160  LCRVANTIGEN  30     0.002    Low calcium response V antigen signature.
		>LCRVANTIGEN#Low calcium response V antigen signature.

Length = 326

Score = 29.7 bits (66), Expect = 0.002
Identities = 15/33 (45%), Positives = 20/33 (60%)

Query: 16 KRKRLLTELAELEAEIKVGSERRSGFNVSLSPS 48
R +L ELAEL AE+K+ S ++ N LS S
Sbjct: 149 ARSKLREELAELTAELKIYSVIQAEINKHLSSS 181


25  HPGAM_03955  HPGAM_03985  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPGAM_03955  -1      14       -0.931733  hypothetical protein
HPGAM_03960  -1      11       -1.257961  hypothetical protein
HPGAM_03965  -1      12       -1.173431  molybdenum cofactor biosynthesis protein A
HPGAM_03970  -1      10       -0.451015  molybdopterin-guanine dinucleotide biosynthesis
HPGAM_03975  -2       9       -0.069792  flagellar biosynthesis protein FlhB
HPGAM_03980  -2      12        0.091673  hypothetical protein
HPGAM_03985   0      12        0.653664  N-acetylmuramoyl-L-alanine amidase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
HPGAM_03955  RTXTOXINA  34     7e-04    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 34.2 bits (78), Expect = 7e-04
Identities = 21/109 (19%), Positives = 48/109 (44%), Gaps = 11/109 (10%)

Query: 121 IEKMLIGYGSLGASSFTAGAVLGGGLAASGLAGAAVLGGL--VAGPALAILGAISTDEME 178
++K+L Y G +G L +G +L G AL+ ++ DE+
Sbjct: 114 LDKLLQKYQKAGNILGGGAENIGDNLGKAG----GILSTFQNFLGTALS---SMKIDELI 166

Query: 179 KKRDDAKAY--LSQVEAAVKKADAMIDNIQAVRKVVDLFTEQITKLDAL 225
KK+ +A+++ + ++D + ++ V+ F++Q+ L ++
Sbjct: 167 KKQKSGGNVSSSELAKASIELINQLVDTVASLNNNVNSFSQQLNTLGSV 215


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_03960  56KDTSANTIGN  30     0.034    Rickettsia 56kDa type-specific antigen protein sign...
		>56KDTSANTIGN#Rickettsia 56kDa type-specific antigen protein signature.

Length = 533

Score = 29.5 bits (66), Expect = 0.034
Identities = 14/37 (37%), Positives = 19/37 (51%), Gaps = 5/37 (13%)

Query: 353 LTRYLSGKIDKTELLKQLGKANTTLVSSGAMAVAGQA 389
L Y KID ++ K T +V+SGA+ VA A
Sbjct: 464 LGSYTYAKIDNKDV-----KGYTGMVASGALGVAINA 495


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_03975  TYPE3IMSPROT  367    e-129    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 367 bits (943), Expect = e-129
Identities = 118/359 (32%), Positives = 194/359 (54%), Gaps = 14/359 (3%)

Query: 4 EEKTELPSTKKIQKAREEGNVPKSMEVVGFLGLLAGLMSIFVFFIWWVDGFSEMYRHVLK 63
EKTE P+ KKI+ AR++G V KS EVV ++A + ++ + FS++ +
Sbjct: 3 GEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLIPAE 62

Query: 64 DFSLDFSKESVQELFNQLAKDTFLLLLPVLIILMVVAFLSNVLQFGWLFAPKVIEPKFSK 123
L FS +++ + + + + F L P+L + ++A S+V+Q+G+L + + I+P K
Sbjct: 63 QSYLPFS-QALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIKK 121

Query: 124 INPINGVKNLFSLKKLLDGSLITLKVFLAFFLGFFIFSLFLGEL------NHAALLNLQG 177
INPI G K +FS+K L++ LKV L L + I L L + L G
Sbjct: 122 INPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLLG 181

Query: 178 QLLWFKSKALWLISSLLFLFFVLAFVDLIIKRRQYTNSLKMTKQEVKDEYKQQEGNPEIK 237
Q+L + L +I ++ F+ V++ D + QY LKM+K E+K EYK+ EG+PEIK
Sbjct: 182 QIL----RQLMVICTVGFV--VISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIK 235

Query: 238 AKIRQMMVKNATNKMMQEIPKANVVVTNPTHYAVALKFD-EEHPVPVVVAKGTDYLAIRI 296
+K RQ + + M + + +++VVV NPTH A+ + + E P+P+V K TD +
Sbjct: 236 SKRRQFHQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTV 295

Query: 297 KGIAREHDIEIIENKTLARELYRDVKLNATIPEELFEAVAIVFAQVAKLEQERQKQKII 355
+ IA E + I++ LAR LY D ++ IP E EA A V + + E+Q +++
Sbjct: 296 RKIAEEEGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQNIEKQHSEML 354


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
HPGAM_03985  TONBPROTEIN  30     0.021    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 29.6 bits (66), Expect = 0.021
Identities = 13/73 (17%), Positives = 22/73 (30%), Gaps = 11/73 (15%)

Query: 135 QTPKPTPKPIKKEAKKTKEKTPTKHAHSKHAHSPLNERSAKKEIPKKEIPKKEIPKKEIP 194
Q P + E + E + K + K PK +E P
Sbjct: 62 QPPPEPVVEPEPEPEPIPEPPKEAPVVIE-----------KPKPKPKPKPKPVKKVQEQP 110

Query: 195 KKEIPKKEAENES 207
K+++ E+ S
Sbjct: 111 KRDVKPVESRPAS 123


26  HPGAM_04545  HPGAM_04580  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPGAM_04545  -1      10        1.947395  cysteinyl-tRNA synthetase
HPGAM_04550   0      10        2.474232  vacuolating cytotoxin
HPGAM_04555   0      16        1.289837  putative lipopolysaccharide biosynthesis
HPGAM_04560   0      18        1.872003  IRON(III) dicitrate transport system ATP-binding
HPGAM_04565   0      18        1.474454  iron(III) dicitrate ABC transporter, permease
HPGAM_04570   0      16       -1.619803  hypothetical protein
HPGAM_04575   4      19       -4.458588  hypothetical protein
HPGAM_04580   7      25       -7.050428  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
HPGAM_04545  OMS28PORIN  30     0.015    OMS28 porin signature.
		>OMS28PORIN#OMS28 porin signature.

Length = 257

Score = 30.1 bits (67), Expect = 0.015
Identities = 17/51 (33%), Positives = 32/51 (62%), Gaps = 4/51 (7%)

Query: 309 EEDLLVSKKRLDKIYRLKQRVLGTLGGINPNFKKEILECMQDDLNVSKALS 359
+E L+ S++ LD+ + Q+VL + G+NP+ K ++L +V+KA+S
Sbjct: 188 KETLMASERALDETVQEAQKVLNMVNGLNPSNKDQVLA----KKDVAKAIS 234


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_04550  VACCYTOTOXIN  2021   0.0      Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 2021 bits (5238), Expect = 0.0
Identities = 1152/1292 (89%), Positives = 1218/1292 (94%), Gaps = 5/1292 (0%)

Query: 1 MEIQQTHRKMNRPLVSLVLAGALISAIPQESHAAFFTTVIIPAIVGGIATGAAVGTVSGL 60
MEIQQTHRK+NRPLVSL L GAL+S PQ+SHAAFFTTVIIPAIVGGIATGAAVGTVSGL
Sbjct: 1 MEIQQTHRKINRPLVSLALVGALVSITPQQSHAAFFTTVIIPAIVGGIATGAAVGTVSGL 60

Query: 61 LSWGLKQAEEANKTPDKPDKVWRIQAGRGFNNFPHKQYDLYQSLLSSKIDGGWDWGNAAR 120
L WGLKQAEEANKTPDKPDKVWRIQAG+GFN FP+K+YDLY+SLLSSKIDGGWDWGNAAR
Sbjct: 61 LGWGLKQAEEANKTPDKPDKVWRIQAGKGFNEFPNKEYDLYKSLLSSKIDGGWDWGNAAR 120

Query: 121 HYWVKGGQWNKLEVDMKDAVGTYKLSGLRNFTGGDLDVNMQKATLRLGQFNGNSFTSYKD 180
HYWVK GQWNKLEVDM++AVGTY LSGL NFTGGDLDVNMQKATLRLGQFNGNSFTSYKD
Sbjct: 121 HYWVKDGQWNKLEVDMQNAVGTYNLSGLINFTGGDLDVNMQKATLRLGQFNGNSFTSYKD 180

Query: 181 SADRTTRVNFNAKNISIENFVEINNRVGSGAGRKASSTVLTLQASEGITSSKNAEISLYD 240
SADRTTRV+FNAKNI I+NF+EINNRVGSGAGRKASSTVLTLQASEGITS +NAEISLYD
Sbjct: 181 SADRTTRVDFNAKNILIDNFLEINNRVGSGAGRKASSTVLTLQASEGITSRENAEISLYD 240

Query: 241 GATLNLASNSVKLNGNVWMGRLQYVGAYLAPSYSTINTSKVQGEVDFNHLTVGDQNAAQA 300
GATLNLASNSVKL GNVWMGRLQYVGAYLAPSYSTINTSKV GEV+FNHLTVGD NAAQA
Sbjct: 241 GATLNLASNSVKLMGNVWMGRLQYVGAYLAPSYSTINTSKVTGEVNFNHLTVGDHNAAQA 300

Query: 301 GIIASNKTHIGTLDLWQSAGLNIIAPPEGGYKDKPNNTPS---QSGTKNDKQEISQNNNS 357
GIIASNKTHIGTLDLWQSAGLNIIAPPEGGYKDKPN+ PS Q+ KNDKQE SQNN S
Sbjct: 301 GIIASNKTHIGTLDLWQSAGLNIIAPPEGGYKDKPNDKPSNTTQNNAKNDKQESSQNN-S 359

Query: 358 NTEVINPPNNTQKTETEPTQVIDGPFAGGKDTVVNIDRINTKADGTIRVGGFKASLTTNA 417
NT+VINPPN+ QKTE +PTQVIDGPFAGGK+TVVNI+RINT ADGTIRVGGFKASLTTNA
Sbjct: 360 NTQVINPPNSAQKTEIQPTQVIDGPFAGGKNTVVNINRINTNADGTIRVGGFKASLTTNA 419

Query: 418 AHLNIGKGGVNLSNQASGRTLLVENLTGNITVDGPLRVNNQVGGYALAGSSANFEFKAGV 477
AHL+IGKGG+NLSNQASGR+LLVENLTGNITVDGPLRVNNQVGGYALAGSSANFEFKAG
Sbjct: 420 AHLHIGKGGINLSNQASGRSLLVENLTGNITVDGPLRVNNQVGGYALAGSSANFEFKAGT 479

Query: 478 DTKNGTATFNNDISLGRFVNLKVDAHTANFKGIDTGNGGFNTLDFSGVTGKVNINKLITA 537
DTKNGTATFNNDISLGRFVNLKVDAHTANFKGIDTGNGGFNTLDFSGVT KVNINKLITA
Sbjct: 480 DTKNGTATFNNDISLGRFVNLKVDAHTANFKGIDTGNGGFNTLDFSGVTNKVNINKLITA 539

Query: 538 STNVAVKNFNINELVVKTNGISVGEYTHFSEDIGSQSRINTVRLETGTRSIFSGGVKFKG 597
STNVAVKNFNINELVVKTNG+SVGEYTHFSEDIGSQSRINTVRLETGTRSI+SGGVKFKG
Sbjct: 540 STNVAVKNFNINELVVKTNGVSVGEYTHFSEDIGSQSRINTVRLETGTRSIYSGGVKFKG 599

Query: 598 GEKLVIDEFYYSPWNYFDARNVKNVEITRKFASSTPENPWGTSKLMFNNLTLGQNAVMDY 657
GEKLVI++FYY+PWNYFDARN+KNVEIT K A +PWGT+KLMFNNLTLGQNAVMDY
Sbjct: 600 GEKLVINDFYYAPWNYFDARNIKNVEITNKLAFGPQGSPWGTAKLMFNNLTLGQNAVMDY 659

Query: 658 SQFSNLTIQGDFINNQGTINYLVRGGKVATLSVGNAAAMMFNNDIDSATGFYKPLIKINS 717
SQFSNLTIQGDF+NNQGTINYLVRGG+VATL+VGNAAAM F+N++DSATGFY+PL+KINS
Sbjct: 660 SQFSNLTIQGDFVNNQGTINYLVRGGQVATLNVGNAAAMFFSNNVDSATGFYQPLMKINS 719

Query: 718 AQDLIKNTEHVLLKAKIIGYGNVSTGTNSISNVNLEEQFKERLALYNNNNRMDTCVVRNE 777
AQDLIKN EHVLLKAKIIGYGNVS GT+SI+NVNL EQFKERLALYNNNNRMD CVVRN
Sbjct: 720 AQDLIKNKEHVLLKAKIIGYGNVSAGTDSIANVNLIEQFKERLALYNNNNRMDICVVRNT 779

Query: 778 NDIKACGMAIGNQSMVNNPENYKYLIGKAWRNIGISKTANGSKISVYYLGNSTPTENGGN 837
+DIKACG AIGNQSMVNNPENYKYL GKAW+NIGISKTANGSKISV+YLGNSTPTENGGN
Sbjct: 780 DDIKACGTAIGNQSMVNNPENYKYLEGKAWKNIGISKTANGSKISVHYLGNSTPTENGGN 839

Query: 838 TTNLPTNTTNNARSANYALVKNAPFA-HSATPNLVAINQHDFGTIESVFELANRSKDIDT 896
TTNLPTNTTN R A+YAL+KNAPFA +SATPNLVAINQHDFGTIESVFELANRS DIDT
Sbjct: 840 TTNLPTNTTNKVRFASYALIKNAPFARYSATPNLVAINQHDFGTIESVFELANRSNDIDT 899

Query: 897 LYTHSGTKGRDLLQTLLIDSHDAGYARQMIDNTSTGEITKQLNAATDALNNVASLEHKQS 956
LY +SG +GRDLLQTLLIDSHDAGYAR MID TS EITKQLN AT LNN+ASLEHK S
Sbjct: 900 LYANSGAQGRDLLQTLLIDSHDAGYARTMIDATSANEITKQLNTATTTLNNIASLEHKTS 959

Query: 957 GLQTLSLSNAMILNSRLVNLSRKHTNHIDSFAQRLQALKGQRFASLESAAEVLYQFAPKY 1016
GLQTLSLSNAMILNSRLVNLSR+HTNHIDSFA+RLQALK QRFASLESAAEVLYQFAPKY
Sbjct: 960 GLQTLSLSNAMILNSRLVNLSRRHTNHIDSFAKRLQALKDQRFASLESAAEVLYQFAPKY 1019

Query: 1017 EKPTNVWANAIGGASLNSGSNASLYGTSAGVDAYLNGEVEAIVGGFGSYGYSSFSNQANS 1076
EKPTNVWANAIGG SLNSG NASLYGTSAGVDAYLNGEVEAIVGGFGSYGYSSFSNQANS
Sbjct: 1020 EKPTNVWANAIGGTSLNSGGNASLYGTSAGVDAYLNGEVEAIVGGFGSYGYSSFSNQANS 1079

Query: 1077 LNSGANNANFGVYSRFFANQHEFDFEAQGALGSDQSSLNFKSALLQDLNQSYNYLAYSAT 1136
LNSGANN NFGVYSR FANQHEFDFEAQGALGSDQSSLNFKSALL+DLNQSYNYLAYSA
Sbjct: 1080 LNSGANNTNFGVYSRIFANQHEFDFEAQGALGSDQSSLNFKSALLRDLNQSYNYLAYSAA 1139

Query: 1137 ARASYGYDFAFFRNALVLKPSVGVGYNHLGSTNFKSNSQSQVALKNGASSQHLFNANANV 1196
RASYGYDFAFFRNALVLKPSVGV YNHLGSTNFKSNS +VALKNGASSQHLFNA+ANV
Sbjct: 1140 TRASYGYDFAFFRNALVLKPSVGVSYNHLGSTNFKSNSNQKVALKNGASSQHLFNASANV 1199

Query: 1197 EARYYYGDTSYFYLHAGVLQEFAHFGSNDVASLNTFKINAARSPLSTYARAMMGGELRLA 1256
EARYYYGDTSYFY++AGVLQEFA+FGS++ SLNTFK+NA R+PL+T+AR MMGGEL+LA
Sbjct: 1200 EARYYYGDTSYFYMNAGVLQEFANFGSSNAVSLNTFKVNATRNPLNTHARVMMGGELKLA 1259

Query: 1257 KEVFLNLGVVYLHNLISNASHFASNLGMRYSF 1288
KEVFLNLG VYLHNLISN HFASNLGMRYSF
Sbjct: 1260 KEVFLNLGFVYLHNLISNIGHFASNLGMRYSF 1291


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_04570  DHBDHDRGNASE  89     7e-23    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 88.6 bits (219), Expect = 7e-23
Identities = 56/233 (24%), Positives = 104/233 (44%), Gaps = 10/233 (4%)

Query: 14 KVAVITGASSGIGLECALMLLDQGYKVYALSRRATLCVALNHALC------ECVDIDVSD 67
K+A ITGA+ GIG A L QG + A+ + +L E DV D
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 68 SNALKEVFLNISAKEDHCDVLINSAGYGVFGSVEDTPIEEVKKQFSVNFFALCEVVQLCL 127
S A+ E+ I + D+L+N AG G + EE + FSVN + +
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 128 PLLKNKPYSKIFNLSSIAGRVSMLFLGHYSASKHALEAYSDALRLELKPFNVQVCLIEPG 187
+ ++ I + S V + Y++SK A ++ L LEL +N++ ++ PG
Sbjct: 129 KYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPG 188

Query: 188 PVKSNWEKTAFENDERKDSVYTLEVNAAKSFYSGV-YQKALNAKEVAQKIVFL 239
+++ + + + ++ + V + ++F +G+ +K ++A ++FL
Sbjct: 189 STETDMQWSLWADENGAEQVIK---GSLETFKTGIPLKKLAKPSDIADAVLFL 238


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPGAM_04580  BINARYTOXINA  26     0.041    Clostridial binary toxin A signature.
		>BINARYTOXINA#Clostridial binary toxin A signature.

Length = 454

Score = 25.8 bits (56), Expect = 0.041
Identities = 19/73 (26%), Positives = 34/73 (46%), Gaps = 8/73 (10%)

Query: 19 LNQKIELEVFDLVVESLRNQIPLDKRFKDHALVGTYKGCRE-----CHIK-PDV--LLVY 70
+I LE F+ + E++++++ FKD +L G + H+K P +L Y
Sbjct: 151 NQNEISLEKFNELKETIQDKLFKQDGFKDVSLYEPGNGDEKPTPLLIHLKLPKNTGMLPY 210

Query: 71 RVKNNVLTLVRLG 83
N+V TL+
Sbjct: 211 INSNDVKTLIEQD 223


27  HPGAM_06145  HPGAM_06180  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPGAM_06145  -2      14        0.656916  putative arabinose transporter
HPGAM_06150   0      17       -0.505886  Alpha-carbonic anhydrase; putative signal
HPGAM_06155  -1      11        1.118265  hypothetical protein
HPGAM_06160  -2      11        2.335862  aspartate-semialdehyde dehydrogenase
HPGAM_06165  -1      13        2.184364  histidyl-tRNA synthetase
HPGAM_06170   0      11        3.674507  ADP-heptose--LPS heptosyltransferase II
HPGAM_06175   1      12        3.881259  hypothetical protein
HPGAM_06180   1      12        4.063165  elongation factor G
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
HPGAM_06145  TCRTETB  51     6e-09    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 50.6 bits (121), Expect = 6e-09
Identities = 43/193 (22%), Positives = 85/193 (44%), Gaps = 6/193 (3%)

Query: 37 LSDIAKSFEMESATVGLMITAYAWVVSLGSLPLMLLSAKVERKRLLLFLFALFILSHILS 96
L DIA F A+ + TA+ S+G+ LS ++ KRLLLF + ++
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 97 ALAWNFWVLLI-SRIGIAFAHSIFWSITASLVIRVAPRNKKQQALGLLALGSSLAMILGL 155
+ +F+ LLI +R + F ++ +V R P+ + +A GL+ ++ +G
Sbjct: 97 FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGP 156

Query: 156 PLGRIIGQMLDWRSTFGVIGGVATLIALLMWKLLPPLPSRNAGTLASVPILMKRPLLMGI 215
+G +I + W ++ ++ + T+I + L R G I++ + +GI
Sbjct: 157 AIGGMIAHYIHW--SYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIIL---MSVGI 211

Query: 216 YLLVIMVISGHFT 228
++ S +
Sbjct: 212 VFFMLFTTSYSIS 224


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
HPGAM_06155  IGASERPTASE  37     1e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 37.0 bits (85), Expect = 1e-04
Identities = 28/191 (14%), Positives = 69/191 (36%), Gaps = 15/191 (7%)

Query: 88 DDQSKKEVAETQKEAENARDRANKSGIELEQEQQKTEQEKQKTEQEK-----QKTSNIET 142
+ EVA++ E + + K + ++EK K E EK + TS +
Sbjct: 1078 ANTQTNEVAQSGSETKETQTTETK------ETATVEKEEKAKVETEKTQEVPKVTSQVSP 1131

Query: 143 NNQIKVEQEKQKTNNTQKDLIKKAEQNCQENHNQFFIKKVGIKGGAIEVEAECKTPKPTK 202
+ + Q + D ++ + + ++ + + + ++
Sbjct: 1132 KQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNT 1191

Query: 203 TNQTPIQPKH-LPNSKQPRSQRGSKARELIAYLQKELESLPYSQKAIAKQVDFYKPSSIA 261
N P++ P + QP S + + ++ + S+P++ + + S++A
Sbjct: 1192 GNSVVENPENTTPATTQPTVNSESSNKPKNRH-RRSVRSVPHNVEPATTSSN--DRSTVA 1248

Query: 262 HLELDPRDFNA 272
+L + NA
Sbjct: 1249 LCDLTSTNTNA 1259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPGAM_06160  CLENTEROTOXN  32  0.003  Clostridium enterotoxin signature.
		>CLENTEROTOXN#Clostridium enterotoxin signature.

Length = 319

Score = 31.9 bits (72), Expect = 0.003
Identities = 21/110 (19%), Positives = 39/110 (35%), Gaps = 15/110 (13%)

Query: 45 KIRAFNKDYEILETTH-EVFEKEEIDIAFFSAGGSVSEEFAISASKTALVIDNTSFFRLN 103
K+ A + Y+ + +H + + I + G +S+ A S ID S
Sbjct: 131 KVYATYRKYQAIRISHGNISDDGSI---YKLTGIWLSKTSADSLGN----IDQGSLIETG 183

Query: 104 EKVPLVVPEINAKEIFNAPLNIIANPNCSTIQMTQIL--NPLHLHFKIKS 151
E+ L VP + ++ + +T L NP + +S
Sbjct: 184 ERCVLTVPSTDIEKEILDL-----AAATERLNLTDALNSNPAGNLYDWRS 228


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPGAM_06180  TCRTETOQM  642  0.0  Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 642 bits (1657), Expect = 0.0
Identities = 179/671 (26%), Positives = 306/671 (45%), Gaps = 66/671 (9%)

Query: 9 RIRNIGIAAHIDAGKTTTSERILFYTGVSHKIGEVHDGAATMDWMEQEKERGITITSAAT 68
+I NIG+ AH+DAGKTT +E +L+ +G ++G V G D E++RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 69 TCFWKDHQINLIDTPGHVDFTIEVERSMRVLDGAVSVFCSVGGVQPQSETVWRQANKYGV 128
+ W++ ++N+IDTPGH+DF EV RS+ VLDGA+ + + GVQ Q+ ++ K G+
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 129 PRIVFVNKMDRIGANFYSVENQIKQRLKANPVPINIPIGAEDTFIGVIDLVQMKAIVWNN 188
P I F+NK+D+ G + +V IK++L A V
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVI--------------------------- 154

Query: 189 ETMGAKYDVEEIPSDLLEKAKEYREKLVEAVAEQDEALMEKYLGGEELSVEEIKKGIKTG 248
K VE P+ + E + + V E ++ L+EKY+ G+ L E+++
Sbjct: 155 -----KQKVELYPNMCVTNFTESEQ--WDTVIEGNDDLLEKYMSGKSLEALELEQEESIR 207

Query: 249 CLNMSLVPMLCGSSFKNKGVQTLLDAVIDYLPAPTEVVDIKGIDPKSEEEVFVKSSDDGE 308
N SL P+ GS+ N G+ L++ + + + T E
Sbjct: 208 FHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSSTH-------------------RGQSE 248

Query: 309 FAGLAFKIMTDPFVGQLTFVRVYRGKLESGSYVYNSTKDKKERVGRLLKMHSNKREDIKE 368
G FKI +L ++R+Y G L V S K+K ++ + + + I +
Sbjct: 249 LCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI-KITEMYTSINGELCKIDK 307

Query: 369 VYAGEICAFVG----LKDTLTGDTLCDEKNAVVLERMEFPEPVIHIAVEPKTKADQEKMG 424
Y+GEI L L GDT + ER+E P P++ VEP +E +
Sbjct: 308 AYSGEIVILQNEFLKLNSVL-GDTKLLPQR----ERIENPLPLLQTTVEPSKPQQREMLL 362

Query: 425 VALGKLAEEDPSFRVMTQEETGQTLIGGMGELHLEIIVDRLKREFKVEAEIGQPQVAFRE 484
AL ++++ DP R T + ++ +G++ +E+ L+ ++ VE EI +P V + E
Sbjct: 363 DALLEISDSDPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEKYHVEIEIKEPTVIYME 422

Query: 485 TIRSSVSKEHKYAKQSGGRGQYGHVFIKLEPKEPGSGYEFVNEISGGVIPKEYIPAVDKG 544
R E+ + + + + + P GSG ++ + +S G + + + AV +G
Sbjct: 423 --RPLKKAEYTIHIEVPPNPFWASIGLSVSPLPLGSGMQYESSVSLGYLNQSFQNAVMEG 480

Query: 545 IQEAMQNGVLAGYPVVDFKVTLYDGSYHDVDSSEMAFKIAGSMAFKEASRAANPVLLEPM 604
I+ + G L G+ V D K+ G Y+ S+ F++ + ++ + A LLEP
Sbjct: 481 IRYGCEQG-LYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQVLKKAGTELLEPY 539

Query: 605 MKVEVEVPEEYMGDVIGDLNRRRGQINSMDDRLGLKIVNAFVPLVEMFGYSTDLRSATQG 664
+ ++ P+EY+ D + I + I++ +P + Y +DL T G
Sbjct: 540 LSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKNNEVILSGEIPARCIQEYRSDLTFFTNG 599

Query: 665 RGTYSMEFDHY 675
R E Y
Sbjct: 600 RSVCLTELKGY 610



 
Contact Sachin Pundhir for Bugs/Comments.