PredictBias

identification of genomic and pathogenicity islands in prokaryotic genome
 
A) Input parameters
Genome: SouthAfrica7.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
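
The E-value parameter above can be checked against the "Score = ..., Expect = ..." lines that accompany each virulence-factor hit in section C. The following is a minimal sketch, not the PredictBias implementation: the names `parse_score_line` and `passes_threshold` are hypothetical, and the inclusive cut-off is an assumption based only on the fact that a hit with Expect = 0.050 is reported on this page.

```python
import re

# Threshold copied from the input parameters (E-value (RPSBlast) = 0.05).
E_VALUE_THRESHOLD = 0.05

# Pattern for a BLAST-style score line as it appears on this page,
# e.g. "Score = 34.8 bits (80), Expect = 0.001".
SCORE_LINE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*([\d.eE+-]+)"
)


def parse_score_line(line):
    """Return (bit_score, raw_score, e_value), or None if the line does not match."""
    match = SCORE_LINE.search(line)
    if match is None:
        return None
    return float(match.group(1)), int(match.group(2)), float(match.group(3))


def passes_threshold(e_value, threshold=E_VALUE_THRESHOLD):
    # Assumption: the cut-off is inclusive (a hit with Expect = 0.050 is listed).
    return e_value <= threshold


# Three score lines copied verbatim from the hit reports in section C.
sample_lines = [
    "Score = 34.8 bits (80), Expect = 0.001",
    "Score = 1044 bits (2701), Expect = 0.0",
    "Score = 26.4 bits (58), Expect = 0.050",
]

parsed = [parse_score_line(line) for line in sample_lines]
kept = [hit for hit in parsed if hit is not None and passes_threshold(hit[2])]

for bit_score, raw_score, e_value in kept:
    print(f"bits={bit_score} raw={raw_score} E={e_value}")
```

All three sample lines pass under this assumed rule; a line such as "Length = 484" is skipped because it does not match the pattern.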
 
B) Compare a potential GI or PAI in related non-pathogenic sp. (phylogenetic tree)
Potential GI or PAI start    end  
Select Organism     
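
In section C, each region carries Bias, Virulence and Insertion elements flags plus a Prediction label. On this page the label is consistent with a simple rule: biased composition with a virulence hit gives "Pathogenicity Island (biased-composition)", and biased composition without one gives "Genomic Island". The sketch below checks that reading against the eleven summary rows. It is an inference from this page only, not the tool's documented logic, and `label_from_flags` and `IslandSummary` are hypothetical names.

```python
from typing import NamedTuple, Optional


class IslandSummary(NamedTuple):
    number: int
    bias: bool
    virulence: bool
    insertion_elements: bool
    prediction: str


GI = "Genomic Island"
PAI_BIASED = "Pathogenicity Island (biased-composition)"


def label_from_flags(bias: bool, virulence: bool) -> Optional[str]:
    """Hypothetical rule inferred from the rows on this page only."""
    if bias and virulence:
        return PAI_BIASED
    if bias:
        return GI
    return None  # no row on this page covers a region without bias


# Summary rows transcribed from section C (Y -> True, N -> False).
ROWS = [
    IslandSummary(1, True, False, True, GI),
    IslandSummary(2, True, True, False, PAI_BIASED),
    IslandSummary(3, True, True, True, PAI_BIASED),
    IslandSummary(4, True, False, False, GI),
    IslandSummary(5, True, True, False, PAI_BIASED),
    IslandSummary(6, True, True, False, PAI_BIASED),
    IslandSummary(7, True, True, False, PAI_BIASED),
    IslandSummary(8, True, False, False, GI),
    IslandSummary(9, True, True, False, PAI_BIASED),
    IslandSummary(10, True, True, False, PAI_BIASED),
    IslandSummary(11, True, True, False, PAI_BIASED),
]

# Region numbers whose reported label disagrees with the inferred rule.
mismatches = [
    row.number
    for row in ROWS
    if label_from_flags(row.bias, row.virulence) != row.prediction
]
print(mismatches)
```

The Insertion elements flag does not change the label in any row here (regions 1 and 3 both have it set but receive different labels), which is why the sketch ignores it.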
 
C) Potential GIs and PAIs in CP002336
S.No  Start       End         Bias  Virulence  Insertion elements  Prediction
1     HPSA_00010  HPSA_00035  Y     N          Y                   Genomic Island
LocusTag    DNBias  CDNBias  %GCBias    Product
HPSA_00010  -1      14       3.662919   6,7-dimethyl-8-ribityllumazine synthase
HPSA_00015  -1      14       3.394208   2-dehydro-3-deoxyphosphooctonate aldolase
HPSA_00020  -1      15       4.128556   carbonic anhydrase
HPSA_00025  -1      16       4.434022   orotidine 5'-phosphate decarboxylase
HPSA_00030  -1      14       3.625828   pantoate--beta-alanine ligase
HPSA_00035  -1      13       3.387755   *****outer membrane protein HopZ
2     HPSA_00095  HPSA_00160  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPSA_00095  -2      15       3.216624   carboxynorspermidine decarboxylase
HPSA_00100  -1      16       3.324944   lipid A 1-phosphatase
HPSA_00105  -1      16       3.055353   lipid A phosphoethanolamine transferase
HPSA_00110  1       17       3.395548   hypothetical protein
HPSA_00115  2       15       2.054543   hypothetical protein
HPSA_00120  2       13       0.741140   type II citrate synthase
HPSA_00125  2       14       -0.117067  isocitrate dehydrogenase
HPSA_00130  -1      11       -0.873009  hypothetical protein
HPSA_00135  0       11       -0.893985  dethiobiotin synthetase
HPSA_00140  0       13       -0.925883  hypothetical protein
HPSA_00145  -2      15       0.099522   hypothetical protein
HPSA_00150  -2      12       0.093535   ATP-dependent Clp protease adapter protein clpS
HPSA_00155  -1      13       -0.109262  ATP-dependent Clp protease, ATP-binding subunit
HPSA_00160  2       15       -0.102986  aspartate alpha-decarboxylase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_00155  HTHFIS        35     0.001    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 34.8 bits (80), Expect = 0.001
Identities = 16/56 (28%), Positives = 27/56 (48%), Gaps = 4/56 (7%)

Query: 163 KNLSALAQDNALDPVIGREEEILRVIEILGR--RKKNNPLLIGEAGVGKTSIAEAL 216
L +QD P++GR + + +L R + ++ GE+G GK +A AL
Sbjct: 127 SKLEDDSQDGM--PLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARAL 180


3     HPSA_00265  HPSA_00450  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPSA_00265  2       16       -0.151546  bifunctional proline
HPSA_00270  6       21       -1.772459  hypothetical protein
HPSA_00275  5       18       -1.532494  hypothetical protein
HPSA_00280  4       16       -1.190205  hypothetical protein
HPSA_00285  3       14       -1.177221  hypothetical protein
HPSA_00300  3       15       -0.774806  hypothetical protein
HPSA_00305  1       12       -0.109436  hypothetical protein
HPSA_00310  1       10       0.464630   hypothetical protein
HPSA_00315  -1      10       1.499870   hypothetical protein
HPSA_00330  -1      10       1.612614   ATP-binding protein
HPSA_00335  -1      11       1.751384   ATP-binding protein
HPSA_00340  -1      12       2.011367   ATP-binding protein
HPSA_00345  3       20       3.471925   urease accessory protein UreD
HPSA_00350  4       22       3.230105   Urease accessory protein UreG
HPSA_00355  4       19       2.546871   hypothetical protein
HPSA_00360  3       17       2.581474   urease accessory protein UreE
HPSA_00365  3       19       2.693403   Urease accessory protein ureI
HPSA_00370  2       16       2.717484   urease subunit beta
HPSA_00375  -1      9        1.768144   urease subunit alpha
HPSA_00385  1       13       2.094169   *lipoprotein signal peptidase
HPSA_00390  2       14       2.532840   phosphoglucosamine mutase
HPSA_00395  3       14       2.115501   30S ribosomal protein S20
HPSA_00400  2       13       2.284559   peptide chain release factor 1
HPSA_00405  4       15       2.120307   hypothetical protein
HPSA_00410  3       15       1.852295   outer membrane protein HorA
HPSA_00415  2       15       1.423334   hypothetical protein
HPSA_00420  -2      14       0.588652   hypothetical protein
HPSA_00425  0       15       0.272054   30S ribosomal protein S9
HPSA_00430  -1      14       -0.055711  50S ribosomal protein L13
HPSA_00435  0       12       0.152769   hypothetical protein
HPSA_00440  0       11       0.486393   hypothetical protein
HPSA_00445  0       11       -1.259323  hypothetical protein
HPSA_00450  2       11       -1.285038  RNA polymerase sigma factor RpoD
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_00275  GPOSANCHOR    32     0.002    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 32.0 bits (72), Expect = 0.002
Identities = 17/69 (24%), Positives = 29/69 (42%)

Query: 2 LENDKQVLNNEKIELSNDITKLTAEKDDLLKTKENLTKEKENLNTDLSNAKNEASQTSQK 61
LE + N S I L AEK L K +L + + LN + + + + + +
Sbjct: 265 LEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREA 324

Query: 62 LKDLQQKHA 70
K L+ +H
Sbjct: 325 KKQLEAEHQ 333


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_00280  GPOSANCHOR    41     3e-06    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 41.2 bits (96), Expect = 3e-06
Identities = 43/228 (18%), Positives = 66/228 (28%)

Query: 4 LSSTKEKLEARIDLLESEIIDLSTGIKNLVAETSKLKDANNQLRQKNDKLFTTKERLTKE 63
L E ++I L L A + L+ A + + L E
Sbjct: 125 LEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAE 184

Query: 64 NAELENRNAELSKEKENLVVKIRGLENANDQLWQAKENFTKENTELASKNTVLTEKTAEL 123
A LE R AEL K E + L K +L +
Sbjct: 185 KAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTAD 244

Query: 124 KNENDRLNHQVIALNNEQGSLKQERAQLQDACGFLEETCANLEKDNQQLTDKLKKLESVQ 183
+ L + AL Q L++ + LE + L + LE
Sbjct: 245 SAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQS 304

Query: 184 KNLENTNNQLRQAREKIAEEKTELEREMARLKGLEGMEAKSNLDLHNK 231
+ L LR+ + E K +LE E +L+ + S L
Sbjct: 305 QVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRD 352



Score = 39.3 bits (91), Expect = 1e-05
Identities = 38/212 (17%), Positives = 70/212 (33%)

Query: 4 LSSTKEKLEARIDLLESEIIDLSTGIKNLVAETSKLKDANNQLRQKNDKLFTTKERLTKE 63
+ + ++I LE+ DL ++ + ++ L + L K L K
Sbjct: 104 NDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKA 163

Query: 64 NAELENRNAELSKEKENLVVKIRGLENANDQLWQAKENFTKENTELASKNTVLTEKTAEL 123
N + S + + L + LE +L +A E +T ++K L + A L
Sbjct: 164 LEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAAL 223

Query: 124 KNENDRLNHQVIALNNEQGSLKQERAQLQDACGFLEETCANLEKDNQQLTDKLKKLESVQ 183
L + N + + L+ LE A LEK + + +
Sbjct: 224 AARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKI 283

Query: 184 KNLENTNNQLRQAREKIAEEKTELEREMARLK 215
K LE L + + + L L+
Sbjct: 284 KTLEAEKAALEAEKADLEHQSQVLNANRQSLR 315



Score = 31.2 bits (70), Expect = 0.004
Identities = 29/210 (13%), Positives = 55/210 (26%)

Query: 48 QKNDKLFTTKERLTKENAELENRNAELSKEKENLVVKIRGLENANDQLWQAKENFTKENT 107
N+ T +++ R + E L +K L N L + T+E +
Sbjct: 36 NTNEVSAVATRSQTDTLEKVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELS 95

Query: 108 ELASKNTVLTEKTAELKNENDRLNHQVIALNNEQGSLKQERAQLQDACGFLEETCANLEK 167
K + +E ++ L + L LE A L
Sbjct: 96 NAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAA 155

Query: 168 DNQQLTDKLKKLESVQKNLENTNNQLRQAREKIAEEKTELEREMARLKGLEGMEAKSNLD 227
L L+ + L + + + ELE+ + ++
Sbjct: 156 RKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKT 215

Query: 228 LHNKRLANENRDLKTQNRKLEEENAKLKKE 257
L ++ A R + N
Sbjct: 216 LEAEKAALAARKADLEKALEGAMNFSTADS 245


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_00285  GPOSANCHOR    35     2e-04    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 35.0 bits (80), Expect = 2e-04
Identities = 28/172 (16%), Positives = 55/172 (31%)

Query: 46 LKNENAELLRQNGDLAVKIKNLENTNNQLRQARENWIEEKRELTTEKERLVRKNTELENR 105
N + + L + L L +A E + + + + L + LE R
Sbjct: 132 AMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEAR 191

Query: 106 NAELYAELLKEKESLTKANTELAEKIKALNAQNNELVASLADQSELDLHNKRLASENRDL 165
AEL L T + ++ A + +++ + L
Sbjct: 192 QAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTL 251

Query: 166 KTQNRKLEEENIKLKKEVKDVKERHSQLQQQNDELERLNANAERTQHDLKQQ 217
+ + LE +L+K ++ + + LE A E + DL+ Q
Sbjct: 252 EAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQ 303


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_00305  CHANNELTSX    27     0.011    Nucleoside-specific channel-forming protein Tsx signa...
		>CHANNELTSX#Nucleoside-specific channel-forming protein Tsx

signature.
Length = 294

Score = 26.5 bits (58), Expect = 0.011
Identities = 11/29 (37%), Positives = 18/29 (62%)

Query: 32 LSNHFHNLGSWQDAKRDNFSEVIDNLRST 60
++ +FHN G W D + NF + ++RST
Sbjct: 254 VARYFHNGGQWADDAKLNFGDGPFSVRST 282


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_00370  UREASE        1044   0.0      Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 1044 bits (2701), Expect = 0.0
Identities = 354/569 (62%), Positives = 442/569 (77%), Gaps = 4/569 (0%)

Query: 3 KISRKEYVSMYGPTTGDKVRLGDTELIAEVEHDYTIYGEELKFGGGKTLREGMSQSN-NP 61
++SR Y +M+GPT GDKVRL DTEL EVE D+T +GEE+KFGGGK +R+GM QS
Sbjct: 4 RMSRAAYANMFGPTVGDKVRLADTELFIEVEKDFTTHGEEVKFGGGKVIRDGMGQSQVTR 63

Query: 62 SKEELDLIITNALIVDYTGIYKADIGIKDGKIAGIGKGGNKDMQDGVKNNLSVGPATEAL 121
+D +ITNALI+D+ GI KADIG+KDG+IA IGK GN DMQ GV + VGP TE +
Sbjct: 64 EGGAVDTVITNALILDHWGIVKADIGLKDGRIAAIGKAGNPDMQPGV--TIIVGPGTEVI 121

Query: 122 AGEGLIVTAGGIDTHIHFISPQQIPTAFASGVTTMIGGGTGPADGTNATTITPGRRNLKW 181
AGEG IVTAGG+D+HIHFI PQQI A SG+T M+GGGTGPA GT ATT TPG ++
Sbjct: 122 AGEGKIVTAGGMDSHIHFICPQQIEEALMSGLTCMLGGGTGPAHGTLATTCTPGPWHIAR 181

Query: 182 MLRAAEEYSMNLGFLAKGNTSNDASLADQIEAGAIGFKIHEDWGTTPSAINHALDVADKY 241
M+ AA+ + MNL F KGN S +L + + GA K+HEDWGTTP+AI+ L VAD+Y
Sbjct: 182 MIEAADAFPMNLAFAGKGNASLPGALVEMVLGGATSLKLHEDWGTTPAAIDCCLSVADEY 241

Query: 242 DVQVAIHTDTLNEAGCVEDTMAAIAGRTMHTFHTEGAGGGHAPDIIKVAGEHNILPASTN 301
DVQV IHTDTLNE+G VEDT+AAI GRT+H +HTEGAGGGHAPDII++ G+ N++P+STN
Sbjct: 242 DVQVMIHTDTLNESGFVEDTIAAIKGRTIHAYHTEGAGGGHAPDIIRICGQPNVIPSSTN 301

Query: 302 PTIPFTVNTEAEHMDMLMVCHHLDKSIKEDVQFADSRIRPQTIAAEDTLHDMGIFSITSS 361
PT P+TVNT AEH+DMLMVCHHL +I ED+ FA+SRIR +TIAAED LHD+G FSI SS
Sbjct: 302 PTRPYTVNTLAEHLDMLMVCHHLSPTIPEDIAFAESRIRKETIAAEDILHDIGAFSIISS 361

Query: 362 DSQAMGRVGEVITRTWQTADKNKKEFGRLKEEKGDNDNFRIKRYLSKYTINPAIAHGISE 421
DSQAMGRVGEV RTWQTADK K++ GRLKEE GDNDNFR+KRY++KYTINPAIAHG+S
Sbjct: 362 DSQAMGRVGEVAIRTWQTADKMKRQRGRLKEETGDNDNFRVKRYIAKYTINPAIAHGLSH 421

Query: 422 YVGSVEVGKVADLVLWSPAFFGVKPNMIIKGGFIALSQMGDANASIPTPQPVYYREMFAH 481
+GS+EVGK ADLVLW+PAFFGVKP+M++ GG IA + MGD NASIPTPQPV+YR MF
Sbjct: 422 EIGSLEVGKRADLVLWNPAFFGVKPDMVLLGGTIAAAPMGDPNASIPTPQPVHYRPMFGA 481

Query: 482 HGKAKYDANITFVSQAAYDKGIKEELGLERQVLPVKNCR-NITKKDMQFNDTTAHIEVNP 540
+G+++ ++++TFVSQA+ D G+ LG+ ++++ V+N R I K M N T HIEV+P
Sbjct: 482 YGRSRTNSSVTFVSQASLDAGLAGRLGVAKELVAVQNTRGGIGKASMIHNSLTPHIEVDP 541

Query: 541 ETYHVFVDGKEVTSKPANKVSLAQLFSIF 569
ETY V DG+ +T +PA + +AQ + +F
Sbjct: 542 ETYEVRADGELLTCEPATVLPMAQRYFLF 570


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_00410  FLAGELLIN     37     1e-04    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 37.3 bits (86), Expect = 1e-04
Identities = 32/260 (12%), Positives = 80/260 (30%), Gaps = 5/260 (1%)

Query: 80 TTASTDNATATTDKAYTNSTDTTVANAAKQVETENTAVQNAETDLQNAETDLQNAVTKVE 139
S+ D + V + V T+ TA + NA + Q E
Sbjct: 184 DLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNA-ANGQLTTDDAE 242

Query: 140 NDAKAKDFDETTFQADQKAEQSAQTALQNAEDQFTNDQNALNTALQDQKTPTPSTPSPTK 199
N+ F T A ++ A++ ++ T D + + + + T
Sbjct: 243 NNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVSTT 302

Query: 200 KDETSGGSGDKDQHTASSGTPSSSGNAVASQLTKDTATVDGFSKVSVNSMNTTLSGVTQM 259
+ G ++G + + S T+ V+G + N +
Sbjct: 303 IN---GEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLE 359

Query: 260 SQQTETISNLLSSSADLGSVISNSQGLSDAFSALKSAQNTLKGYLDSSSATIGQLTNGSN 319
+ + ++ + + + ++ A + + + +
Sbjct: 360 ANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAAAAKK-STA 418

Query: 320 AVVSTLDKAINEVDMALANL 339
++++D A+++VD ++L
Sbjct: 419 NPLASIDSALSKVDAVRSSL 438


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_00450  IGASERPTASE   36     5e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 36.2 bits (83), Expect = 5e-04
Identities = 27/168 (16%), Positives = 54/168 (32%), Gaps = 7/168 (4%)

Query: 3 KKANEEKAL-KKAKTEAKQEAKTEAKENKGNSKAKETKIKEAKVKESKEAKIKEGKAKEI 61
+ A E A ++ EAK K + N+ ETK + + KE KAK
Sbjct: 1056 QDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVE 1115

Query: 62 KAKETDPTKKLS------FNEALEELFADSLSDCVSYESIIQISAKVPTLAQIKKIKELC 115
K + K S + A+ + +I + ++ T A ++ +
Sbjct: 1116 TEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKET 1175

Query: 116 QKYQKRLVSSSEYAKKLNAIDKIKNTEEKQKVLDEELEDGYDFLKEKD 163
++ V+ S N++ + + + K +
Sbjct: 1176 SSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRH 1223


4     HPSA_00685  HPSA_00740  Y     N          N                   Genomic Island
LocusTag    DNBias  CDNBias  %GCBias    Product
HPSA_00685  -1      12       3.200169   bacterioferritin comigratory protein
HPSA_00690  -2      12       2.760199   hypothetical protein
HPSA_00695  0       11       2.902607   iron-sulfur cluster binding protein
HPSA_00700  0       10       2.614666   Fe-S oxidoreductase
HPSA_00705  0       11       2.798861   L-lactate permease
HPSA_00710  0       10       2.120161   L-lactate permease
HPSA_00715  2       12       1.481927   DNA glycosylase MutY
HPSA_00720  3       13       1.865752   hypothetical protein
HPSA_00725  4       15       1.269592   cbb3-type cytochrome c oxidase subunit I
HPSA_00730  0       16       0.320134   cbb3-type cytochrome c oxidase subunit II
HPSA_00735  3       16       -1.525653  cytochrome c oxidase subunit Q
HPSA_00740  3       14       -1.373353  cytochrome c oxidase, cbb3-type, subunit III
5     HPSA_00960  HPSA_01135  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPSA_00960  -1      10       3.294127   fumarate reductase iron-sulfur subunit
HPSA_00965  -1      11       3.342809   fumarate reductase flavoprotein subunit
HPSA_00970  -2      14       2.148507   fumarate reductase cytochrome b-556 subunit
HPSA_00975  -2      15       2.288015   triosephosphate isomerase
HPSA_00980  -2      16       3.511704   enoyl-(acyl carrier protein) reductase
HPSA_00985  -2      17       3.849041   UDP-3-O-[3-hydroxymyristoyl] glucosamine
HPSA_00990  -2      18       3.989952   S-adenosylmethionine synthetase
HPSA_00995  -2      21       2.966255   nucleoside diphosphate kinase
HPSA_01000  -2      22       2.331786   hypothetical protein
HPSA_01005  -1      14       -3.299193  50S ribosomal protein L32
HPSA_01010  -1      12       -2.065836  putative glycerol-3-phosphate acyltransferase
HPSA_01015  -1      11       -3.139815  3-oxoacyl-(acyl carrier protein) synthase III
HPSA_01020  0       11       -4.090908  hypothetical protein
HPSA_01025  0       11       -3.412604  hypothetical protein
HPSA_01030  -1      10       -3.209540  hypothetical protein
HPSA_01035  -2      12       -0.093414  ATP-binding protein
HPSA_01040  -1      10       -0.481129  LPS 1,2-glycosyltransferase
HPSA_01045  -1      9        0.819900   lipopolysaccharide biosynthesis protein
HPSA_01050  -1      9        1.284638   outer membrane protein
HPSA_01055  -1      10       1.962109   heat shock protein 90
HPSA_01060  0       10       3.015283   hypothetical protein
HPSA_01065  0       12       2.949763   succinyl-diaminopimelate desuccinylase
HPSA_01070  0       12       2.867204   tRNA uridine 5-carboxymethylaminomethyl
HPSA_01075  -1      15       1.761502   putative transporter
HPSA_01080  -1      14       1.005319   phosphatidate cytidylyltransferase
HPSA_01085  -1      13       -0.362685  1-deoxy-D-xylulose 5-phosphate reductoisomerase
HPSA_01090  2       11       1.046168   hypothetical protein
HPSA_01095  2       11       2.349712   hypothetical protein
HPSA_01100  1       10       2.117863   hypothetical protein
HPSA_01105  -1      11       3.147950   hypothetical protein
HPSA_01110  -1      13       2.920673   hypothetical protein
HPSA_01115  -1      14       3.036803   cysteine desulfurase
HPSA_01120  -1      14       2.394208   nifU-like protein
HPSA_01125  0       12       2.451497   hypothetical protein
HPSA_01130  1       11       2.655690   DNA repair protein RadA
HPSA_01135  2       15       2.550877   bifunctional methionine sulfoxide reductase A/B
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_00980  DHBDHDRGNASE  59     3e-12    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 58.5 bits (141), Expect = 3e-12
Identities = 60/263 (22%), Positives = 110/263 (41%), Gaps = 29/263 (11%)

Query: 4 LKGKKGLVVGVANNKSIAYGIAQSCFNQGATL-AFTYLNESLEKRVRPIAQELNSSYVYE 62
++GK + G A + I +A++ +QGA + A Y E LEK V + E + +
Sbjct: 6 IEGKIAFITGAA--QGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 63 LDVSKEEHFKSLYNSIKQDLGSLDFIVHSVAF--------APKEALEGSLLETSKSAFNT 114
DV + I++++G +D +V+ E E + S FN
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 115 AMEISVYSLIELTNTLKPLLNDGASVLTLSYLGSTKYMAHYNVMGLAKAALESAVRYLAV 174
+ +S Y + + ++ + ++ A V S MA Y +KAA + L +
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTS-------MAAY---ASSKAAAVMFTKCLGL 173

Query: 175 DLGKHNIRVNALSAGPIRT-----LASSGIADFRMILKWNE---INAPLRKNVSLEEVGN 226
+L ++NIR N +S G T L + ++I E PL+K ++ +
Sbjct: 174 ELAEYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIAD 233

Query: 227 AGMYLLSHLSNGVSGEVHFVDAG 249
A ++L+S + ++ VD G
Sbjct: 234 AVLFLVSGQAGHITMHNLCVDGG 256


6     HPSA_01510  HPSA_01655  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPSA_01510  0       17       3.256277   50S ribosomal protein L21
HPSA_01515  1       16       3.453646   50S ribosomal protein L27
HPSA_01520  0       15       3.331658   periplasmic dipeptide-binding protein
HPSA_01525  -1      15       3.880549   dipeptide permease protein
HPSA_01530  -1      15       3.398867   dipeptide transport system permease protein
HPSA_01535  -3      15       3.316155   ABC-type transport system, ATP-binding protein;
HPSA_01540  -3      14       3.091579   dipeptide transport system ATP-binding protein
HPSA_01545  -2      15       2.509112   GTPase ObgE
HPSA_01550  -2      14       2.045317   hypothetical protein
HPSA_01555  -1      15       2.564037   hypothetical protein
HPSA_01560  0       17       2.959596   glutamate-1-semialdehyde aminotransferase
HPSA_01565  3       17       1.654799   hypothetical protein
HPSA_01570  3       14       1.542547   hypothetical protein
HPSA_01575  3       15       1.330256   N-carbamoyl-D-amino acid amidohydrolase
HPSA_01580  2       13       2.168004   hypothetical protein
HPSA_01585  2       13       1.484546   hypothetical protein
HPSA_01590  2       14       1.217510   putative cobalamin synthesis protein
HPSA_01595  -1      15       0.799557   nitrite extrusion protein (narK)
HPSA_01600  0       17       0.832975   hypothetical protein
HPSA_01605  -1      16       1.032072   outer membrane protein HopD
HPSA_01610  2       15       -0.511324  hypothetical protein
HPSA_01615  1       15       -0.396312  putative heme iron utilization protein
HPSA_01620  0       14       -0.511431  arginyl-tRNA synthetase
HPSA_01625  1       13       -0.058140  Sec-independent protein translocase protein
HPSA_01630  0       12       -0.292265  guanylate kinase
HPSA_01635  0       12       -0.710760  hypothetical protein
HPSA_01640  0       12       -1.763301  nuclease NucT
HPSA_01645  1       12       -1.864309  Outer membrane protein HorC; putative signal
HPSA_01650  2       13       -1.926780  flagellar basal body L-ring protein
HPSA_01655  2       13       -1.660829  CMP-N-acetylneuraminic acid synthetase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_01595  TCRTETB       30     0.014    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 30.2 bits (68), Expect = 0.014
Identities = 32/153 (20%), Positives = 63/153 (41%), Gaps = 1/153 (0%)

Query: 23 VLIPLLILSGSLTPHQSFQLGIAVLMGYVFGSFLIQFLSPLMSLESIAKISFGLIALSFL 82
V +P + + P + + A ++ + G+ + LS + ++ + + +
Sbjct: 35 VSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSV 94

Query: 83 ICYFDSIPFFWLWIWRFIAGVASSALMILVAPLSLPYVKENKRALVGGLIFSAVGIGSVF 142
I + F L + RFI G ++A LV + Y+ + R GLI S V +G
Sbjct: 95 IGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGV 154

Query: 143 SGFVLPWISSYNIKWAWIFLGGSCLIAFILSLV 175
+ I+ Y I W+++ L I + L+
Sbjct: 155 GPAIGGMIAHY-IHWSYLLLIPMITIITVPFLM 186


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_01630  PF05272       29     0.011    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.011
Identities = 9/18 (50%), Positives = 11/18 (61%)

Query: 8 LILSGPSGAGKSTLTKYL 25
++L G G GKSTL L
Sbjct: 599 VVLEGTGGIGKSTLINTL 616


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_01635  IGASERPTASE   61     6e-12    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 61.2 bits (148), Expect = 6e-12
Identities = 52/280 (18%), Positives = 97/280 (34%), Gaps = 18/280 (6%)

Query: 153 EEPNAEEQLLPTLDAQKENEEIKEEEKQEVAE--TPQEKEVSQELENPQVQEAETQEVAE 210
PN + +P+ NEEI ++ V E ++ + QE++T E E
Sbjct: 998 TTPNNIQADVPS--VPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNE 1055

Query: 211 QTET-------QELETPKDETQENAETPQEKEVSKELETLQTQEAETQEIQEEKQEQEQV 263
Q T + + K + N +T + + E + QT E + E++++ +
Sbjct: 1056 QDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVE 1115

Query: 264 KEKTQDSPSAQELEAMQELVKEIQENSNEDKKETQEIAEIPQTQEEEIIETPQEIQTQEI 323
EKTQ+ P + ++ E + E +E I + Q + E +E
Sbjct: 1116 TEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKET 1175

Query: 324 TETPQTQETETETQEVTEQKETQEITETPQEAETKTQDQETPPKTQETQENYETIEDIPE 383
+ + TE+ T + E P+ T ++ +N
Sbjct: 1176 SSNVEQPVTESTTVNTGNS-----VVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSV 1230

Query: 384 PVMAKAMGEALP-LVNEALIE-TSNNENAAQTPKESVTQT 421
P + + AL + TS N NA + + Q
Sbjct: 1231 PHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQF 1270



Score = 60.1 bits (145), Expect = 1e-11
Identities = 42/273 (15%), Positives = 88/273 (32%), Gaps = 23/273 (8%)

Query: 168 QKENEEIKEEEKQEVAETPQEKEVSQELENPQVQEAETQEVAEQTETQELETPKDETQEN 227
++ + + Q S N ++ + V TP + T+
Sbjct: 985 VEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPA----TPSETTETV 1040

Query: 228 AETPQEKEVSKELETLQTQEAETQEIQEEKQEQEQVKEKTQDSPSAQELEAMQELVKEIQ 287
AE +++ + E E Q + K+ + VK TQ + E+
Sbjct: 1041 AENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTN--------------EVA 1086

Query: 288 ENSNEDKKETQEIAEIPQTQEEEIIETPQEIQTQEITETPQTQETETETQEVTEQKETQE 347
++ +E K+ + T E+ E +++T++ E P+ + QE +E + Q
Sbjct: 1087 QSGSETKETQTTETKETATVEK---EEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQA 1143

Query: 348 ITETPQEAETKTQDQETPPKTQETQEN--YETIEDIPEPVMAKAMGEALPLVNEALIETS 405
+ ++ ++ T E ET ++ +PV V E T+
Sbjct: 1144 EPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTT 1203

Query: 406 NNENAAQTPKESVTQTPQENAKIPQKSDATSSP 438
ES + + + + P
Sbjct: 1204 PATTQPTVNSESSNKPKNRHRRSVRSVPHNVEP 1236



Score = 53.5 bits (128), Expect = 1e-09
Identities = 41/257 (15%), Positives = 87/257 (33%), Gaps = 18/257 (7%)

Query: 225 QENAETPQEKEVSKELETLQTQEAETQEIQEEKQEQEQ--VKEKTQDSPSAQELEAMQEL 282
E + +Q EE ++ V +PS +
Sbjct: 985 VEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENS 1044

Query: 283 VKEIQENSNEDKKETQEIAEIPQTQEEEIIETPQEIQTQEITETPQTQETETETQEVTEQ 342
+E + ++ T+ A+ + +E QT E+ + +ET+ + TE
Sbjct: 1045 KQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQ----SGSETKETQTTET 1100

Query: 343 KETQEITETPQEAETKTQDQETPPKTQET---QENYETI----EDIPEPVMAKAMGEALP 395
KET + + + + QE P T + QE ET+ E E + E
Sbjct: 1101 KETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQS 1160

Query: 396 LVNEALIETSNNENAAQTPKESVTQTPQENA-----KIPQKSDATSSPLELRLNLQDLLK 450
N + + ++ VT++ N + P+ + ++ + + K
Sbjct: 1161 QTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPK 1220

Query: 451 SLNQESLKSLLENKTLS 467
+ ++ S++S+ N +
Sbjct: 1221 NRHRRSVRSVPHNVEPA 1237



Score = 42.7 bits (100), Expect = 3e-06
Identities = 37/186 (19%), Positives = 65/186 (34%), Gaps = 10/186 (5%)

Query: 122 ESSQNLEPVKETLEANWDALEDLGDLESLAKEEPNAEEQLLPTLDAQKENEEIK----EE 177
E+ N++ +T E E + KE E++ ++ +K E K
Sbjct: 1071 EAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVS 1130

Query: 178 EKQEVAETPQEKEVSQELENPQVQEAETQEVAEQTETQELETPKDETQENAETPQEKEVS 237
KQE +ET Q + +P V E Q ++ T + E P ET N E P + +
Sbjct: 1131 PKQEQSETVQPQAEPARENDPTVNIKEPQ--SQTNTTADTEQPAKETSSNVEQPVTESTT 1188

Query: 238 KELETLQTQEAETQEIQEEKQEQEQVKEKTQDSPSAQELEAMQELVKEIQENSNEDKKET 297
E E Q V ++ + P + +++ V E + +
Sbjct: 1189 VN---TGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRS-VPHNVEPATTSSNDR 1244

Query: 298 QEIAEI 303
+A
Sbjct: 1245 STVALC 1250


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_01650  FLGLRINGFLGH  194    2e-64    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 194 bits (493), Expect = 2e-64
Identities = 52/172 (30%), Positives = 84/172 (48%), Gaps = 18/172 (10%)

Query: 56 GERPLFADRRAMKPNDLITIIVSEKASANYSSS----KDYKSASGGNSTPPRLTYNGLDE 111
G +PLF DRR D +TI++ E SA+ SSS +D K+ G ++ P L GL
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYL--QGLFG 118

Query: 112 KKKKEAEYLDDKNNYNFTKSSNNTNFKGGGSQKKSEDLEIVLSARIIKVLENGNYFIYGN 171
+ + E S F G G S L+ + +VL NGN + G
Sbjct: 119 NARADVEA------------SGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGE 166

Query: 172 KEVLVDGEKQILKVSGVIRPYDIERNNTIQSKFLADAKIEYTNLGHLSDSNK 223
K++ ++ + ++ SGV+ P I +NT+ S +ADA+IEY G+++++
Sbjct: 167 KQIAINQGTEFIRFSGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQN 218


7     HPSA_02270  HPSA_02355  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPSA_02270  0       13       -3.281893  molybdenum ABC transporter ModA
HPSA_02275  0       13       -3.867388  molybdenum ABC transporter ModB
HPSA_02280  0       10       -2.396389  molybdate transport system ATP-binding protein
HPSA_02285  -1      12       -2.318882  glutamyl-tRNA synthetase
HPSA_02290  -1      14       -2.720242  hypothetical protein
HPSA_02295  -1      14       -2.673702  type II adenine specific methyltransferase
HPSA_02300  0       15       -1.613003  hypothetical protein
HPSA_02305  0       13       -0.453204  GTP-binding elongation factor protein, TypA
HPSA_02310  6       19       -0.175709  proline/betaine transporter
HPSA_02315  6       18       0.436713   GTP-binding protein
HPSA_02320  3       19       0.014998   type II DNA modification (methyltransferase)
HPSA_02335  3       20       -0.030706  hypothetical protein
HPSA_02340  2       17       0.213036   catalase-like protein
HPSA_02345  3       19       0.398146   Outer membrane protein HofC; putative signal
HPSA_02350  2       17       -0.799855  hypothetical protein
HPSA_02355  2       18       -0.645732  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_02280  PF05272       30     0.013    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.7 bits (66), Expect = 0.013
Identities = 11/29 (37%), Positives = 17/29 (58%)

Query: 33 LLGESGAGKSTILRILAGLEAVSSGYIEV 61
L G G GKST++ L GL+ S + ++
Sbjct: 601 LEGTGGIGKSTLINTLVGLDFFSDTHFDI 629


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_02305  TCRTETOQM     198    9e-58    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family

signature.
Length = 639

Score = 198 bits (505), Expect = 9e-58
Identities = 115/461 (24%), Positives = 190/461 (41%), Gaps = 67/461 (14%)

Query: 3 NIRNIAVIAHVDHGKTTLVDGLLSQSGTFSEREKVDE--RVMDSNDLERERGITILSKNT 60
I NI V+AHVD GKTTL + LL SG +E VD+ D+ LER+RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 61 AIYYKDTKINIIDTPGHADFGGEVERVLKMVDGVLLLVDAQEGVMPQTKFVVKKALSFGI 120
+ +++TK+NIIDTPGH DF EV R L ++DG +LL+ A++GV QT+ + GI
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 121 CPIVVVNKIDKPAAEPDRVVDEVFDLF---------VAMGASDKQLDFPV-----VYAAA 166
I +NKID+ + V ++ + V + + +F
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEG 181

Query: 167 RDGYAMKSLDDE----------------------------KKNL--EPLFETILEHVPSP 196
D K + + K N+ + L E I S
Sbjct: 182 NDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSS 241

Query: 197 SGSVDEPLQMQIFTLDYDNYVGKIGIARVFNGSVKKNESVLLMKSDGSKENGRITKLIGF 256
+ L ++F ++Y ++ R+++G + +SV + KE +IT++
Sbjct: 242 THRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRI----SEKEKIKITEMYTS 297

Query: 257 LGLARTEIENAYAGDIVAIAG--FNAMDV-GDSVVDPNNPMPLDPMHLEEPTMSVYFAVN 313
+ +I+ AY+G+IV + V GD+ + P +P P + +
Sbjct: 298 INGELCKIDKAYSGEIVILQNEFLKLNSVLGDTKLLPQRERIENP----LPLLQTTVEPS 353

Query: 314 DSPLAGLEGKHVTANKLKDRLLKEMQTNIAMKCEEMGEGKFKVSGRGELQITILAENLRR 373
+ + D LL+ + + +S G++Q+ + L+
Sbjct: 354 KPQQREMLLDALLEISDSDPLLRYYVDSAT--------HEIILSFLGKVQMEVTCALLQE 405

Query: 374 E-GFEFSISRPEVIIKEENGVKCEPFEHLVIDTPQDFSGAI 413
+ E I P VI E K E H+ + P F +I
Sbjct: 406 KYHVEIEIKEPTVIYMERPLKKAEYTIHIEVP-PNPFWASI 445



Score = 41.8 bits (98), Expect = 8e-06
Identities = 20/80 (25%), Positives = 30/80 (37%), Gaps = 1/80 (1%)

Query: 396 EPFEHLVIDTPQDFSGAIIERLGKRKAEMKAMNPMSDGYTRLEFEIPARGLIGYRSEFLT 455
EP+ I PQ++ K A + + + L EIPAR + YRS+
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCIQEYRSDLTF 595

Query: 456 DTKGEGVMNHSFLEFRPFSG 475
T G V + +G
Sbjct: 596 FTNGRSVCLTELKGYHVTTG 615


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_02310  TCRTETA       32     0.005    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 31.7 bits (72), Expect = 0.005
Identities = 39/273 (14%), Positives = 92/273 (33%), Gaps = 49/273 (17%)

Query: 33 NTYGIFAAGY-LARPLGGIVMAHFGDRFGRKNMFMLSILLMVIPTFMLALMPTFDHFVSF 91
YGI A Y L + V+ DRFGR+ + ++S+ + ++A P
Sbjct: 43 AHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVL--- 99

Query: 92 GVNSMGLAPENAHYLGYIAPIFLVFVRVCQGVAVGGELPGAWVFVQEHAPQGQKNTFVGF 151
Y+G R+ G+ G A ++ + ++ GF
Sbjct: 100 -------------YIG----------RIVAGIT-GATGAVAGAYIADITDGDERARHFGF 135

Query: 152 LTASVVSGILLGSLVYIGIYMVFDKAVVEDWAWRVAFGLGGIFGIISVYLRRFLEETPVF 211
++A G++ G + +G M ++ F ++ FL
Sbjct: 136 MSACFGFGMVAGPV--LGGLM-------GGFSPHAPFFAAAALNGLNFLTGCFLLPESHK 186

Query: 212 QQMKQDNILVKFPLKEVFKDSLFGILVSMLITWVLTACI-----LIFILFIPNFTLMHPN 266
+ + PL ++ +++ + + + ++++F +
Sbjct: 187 GERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGED------R 240

Query: 267 FNFTPFEKTY-FQILGLVSIVSSIILTGFLADK 298
F++ G++ ++ ++TG +A +
Sbjct: 241 FHWDATTIGISLAAFGILHSLAQAMITGPVAAR 273


8     HPSA_03150  HPSA_03175  Y     N          N                   Genomic Island
LocusTag    DNBias  CDNBias  %GCBias    Product
HPSA_03150  -1      15       -4.674864  hypothetical protein
HPSA_03155  -1      16       -4.657074  coproporphyrinogen III oxidase
HPSA_03160  0       19       -4.582225  glycerol-3-phosphate dehydrogenase
HPSA_03165  3       24       -6.277421  hypothetical protein
HPSA_03170  1       21       -5.286730  hypothetical protein
HPSA_03175  -1      19       -4.741316  hypothetical protein
9     HPSA_03230  HPSA_03295  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPSA_03230  2       10       0.791134   ribonucleotide-diphosphate reductase subunit
HPSA_03235  3       12       0.488441   hypothetical protein
HPSA_03240  2       12       0.128835   hypothetical protein
HPSA_03245  2       9        0.734795   bifunctional N-acetylglucosamine-1-phosphate
HPSA_03250  2       10       0.672768   flagellar biosynthesis protein FliP
HPSA_03255  2       10       1.204678   Iron(III) dicitrate transport protein FecA;
HPSA_03260  -1      12       1.811404   iron(II) transport protein (feoB)
HPSA_03265  1       12       2.407414   hypothetical protein
HPSA_03270  1       14       3.721262   acetyl-CoA acetyltransferase
HPSA_03275  3       14       3.310567   succinyl-CoA-transferase subunit A
HPSA_03280  4       15       3.276709   succinyl-CoA-transferase subunit B
HPSA_03285  4       16       2.864396   short-chain fatty acids transporter
HPSA_03290  3       14       1.853082   outer membrane protein
HPSA_03295  2       16       1.790786   hydantoin utilization protein A
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_03250  FLGBIOSNFLIP  275    5e-96    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP

signature.
Length = 245

Score = 275 bits (705), Expect = 5e-96
Identities = 115/244 (47%), Positives = 164/244 (67%), Gaps = 3/244 (1%)

Query: 2 RFLKIIFLIFALIGPLMSADSALPSVNLSLNAPNDPKQLVTTLNVIALLTLLVLAPSLIL 61
R L + ++ LI PL A + LP + S P + + + +T L P+++L
Sbjct: 3 RLLSVAPVLLWLITPL--AFAQLPGIT-SQPLPGGGQSWSLPVQTLVFITSLTFIPAILL 59

Query: 62 VMTSFIRLIVVFSFLRTALGTQQTPPTQILVSLSLILTFFIMEPSLKKAYDTGIKPYMDK 121
+MTSF R+I+VF LR ALGT PP Q+L+ L+L LTFFIM P + K Y +P+ ++
Sbjct: 60 MMTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEE 119

Query: 122 KISYTEAFEKSALPFKEFMLKNTREKDLALFFRIRNLPNPKTPDDVSLSVLIPAFMISEL 181
KIS EA EK A P +EFML+ TRE DL LF R+ N + P+ V + +L+PA++ SEL
Sbjct: 120 KISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSEL 179

Query: 182 KTAFQIGFLLYLPFLVIDMVISSILMAMGMMMLPPVMISLPFKILVFILVDGFNLLTENL 241
KTAFQIGF +++PFL+ID+VI+S+LMA+GMMM+PP I+LPFK+++F+LVDG+ LL +L
Sbjct: 180 KTAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSL 239

Query: 242 VASF 245
SF
Sbjct: 240 AQSF 243


10    HPSA_03415  HPSA_03490  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPSA_03415  3       12       -0.909886  RNA polymerase factor sigma-54
HPSA_03420  1       13       -0.068960  ABC transporter, ATP-binding protein
HPSA_03425  1       14       -0.674673  hypothetical protein
HPSA_03430  0       12       0.495473   DNA polymerase III subunits gamma and tau
HPSA_03435  2       15       3.024460   hypothetical protein
HPSA_03440  2       14       3.085308   hypothetical protein
HPSA_03445  2       13       2.861268   hypothetical protein
HPSA_03450  1       14       2.869321   hypothetical protein
HPSA_03455  1       13       2.099192   L-asparaginase II
HPSA_03460  0       11       0.562593   anaerobic C4-dicarboxylate transporter
HPSA_03465  0       13       -0.932728  Outer membrane protein HopP
HPSA_03470  1       11       -2.560860  putative Outer membrane protein
HPSA_03475  2       14       -4.060211  putative transcriptional regulator
HPSA_03480  2       16       -4.891008  tRNA(Ile)-lysidine synthase
HPSA_03485  1       12       -3.892741  hypothetical protein
HPSA_03490  2       13       -3.452062  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPSA_03440  SECA  26  0.050  SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 26.4 bits (58), Expect = 0.050
Identities = 11/43 (25%), Positives = 23/43 (53%), Gaps = 2/43 (4%)

Query: 71 RIAKKNLSQMSEEDFKKMREEVRK--ELEEKTKGLSEEEIKAK 111
++ K ++ ++MR+ V +E + + LS+EE+K K
Sbjct: 4 KLLTKVFGSRNDRTLRRMRKVVNIINAMEPEMEKLSDEELKGK 46


11  HPSA_04210  HPSA_04240  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
HPSA_042101163.422469hydrogenase nickel incorporation protein
HPSA_042150133.470016flagellar hook protein FlgE
HPSA_042202192.017193hypothetical protein
HPSA_042251143.431226CDP-diacylglycerol pyrophosphatase
HPSA_042301143.300943alkylphosphonate uptake protein
HPSA_042351143.048912hypothetical protein
HPSA_042401133.182993hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPSA_04215  FLGHOOKAP1  42  7e-06  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 41.9 bits (98), Expect = 7e-06
Identities = 13/49 (26%), Positives = 27/49 (55%)

Query: 669 SISGSKLESSNVDLSRSLTNLIVVQRGFQANSKAVTTSDQILNTLLNLK 717
+S + S V+L NL Q+ + AN++ + T++ I + L+N++
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546



Score = 39.2 bits (91), Expect = 5e-05
Identities = 11/35 (31%), Positives = 20/35 (57%)

Query: 4 SLWSGVNGMQAHQIALDIESNNIANVNTTGFKYSR 38
+ + ++G+ A Q AL+ SNNI++ N G+
Sbjct: 3 LINNAMSGLNAAQAALNTASNNISSYNVAGYTRQT 37


12  HPSA_04570  HPSA_04755  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
HPSA_04570122-4.461507hypothetical protein
HPSA_04575222-4.994595*protein tRNA-associated locus protein
HPSA_04580320-4.137674***hypothetical protein
HPSA_04585423-5.994428hypothetical protein
HPSA_04590417-4.374887integrase-recombinase protein
HPSA_04595417-3.792279hypothetical protein
HPSA_04600515-3.349641hypothetical protein
HPSA_04605414-3.061795hypothetical protein
HPSA_04620414-4.354658hypothetical protein
HPSA_04625415-3.994028hypothetical protein
HPSA_04630516-6.350114PARA protein
HPSA_04635518-6.744448hypothetical protein
HPSA_04640420-7.275801hypothetical protein
HPSA_04645424-8.192702hypothetical protein
HPSA_04650325-7.611275hypothetical protein
HPSA_04655326-7.716903hypothetical protein
HPSA_04660427-8.152030VirD4 coupling protein
HPSA_04665530-10.065771hypothetical protein
HPSA_04670529-9.233243hypothetical protein
HPSA_04675326-7.112230hypothetical protein
HPSA_04680520-6.004614VirB11 type IV secretion ATPase
HPSA_04685520-5.831041hypothetical protein
HPSA_04690519-5.512639hypothetical protein
HPSA_04695521-6.085825hypothetical protein
HPSA_04700523-7.138598competence protein
HPSA_04705724-7.786450comB9-like competence protein
HPSA_04710628-8.519560hypothetical protein
HPSA_04715628-8.923145hypothetical protein
HPSA_04720627-8.743781topoisomerase I
HPSA_04725421-8.148447DNA transfer protein
HPSA_04730316-6.829619hypothetical protein
HPSA_04735113-4.658689hypothetical protein
HPSA_04740011-4.422183hypothetical protein
HPSA_04755211-1.331437cagY like protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPSA_04605  CABNDNGRPT  40  3e-05  NodO calcium binding signature.
		>CABNDNGRPT#NodO calcium binding signature.

Length = 479

Score = 40.0 bits (93), Expect = 3e-05
Identities = 24/119 (20%), Positives = 39/119 (32%), Gaps = 7/119 (5%)

Query: 146 TNDPMYANTPFSNNPNSPNDNAINGKDGANGSNGYGINGNDGINGSSGSNGANGSNGNTS 205
N + P + AI GAN + + G N ++ + ++ + +
Sbjct: 232 NETGADYNGHYGGAPMIDDIAAIQRLYGANMTTR-TGDSVYGFNSNTDRDFYTATDSSKA 290

Query: 206 NNNAI--GSGIDTDGVLGVDG---VNGSSSSSGGSVGGYENNFTNHGSTNNN-TGGYDN 258
++ G DT G +N + S G N HG T N GG N
Sbjct: 291 LIFSVWDAGGTDTFDFSGYSNNQRINLNEGSFSDVGGLKGNVSIAHGVTIENAIGGSGN 349



Score = 29.6 bits (66), Expect = 0.046
Identities = 24/123 (19%), Positives = 41/123 (33%), Gaps = 27/123 (21%)

Query: 150 MYANTPFSNNPNSPNDNAINGKDGANGSNGYGINGNDGINGSSGSNGANGSNGN------ 203
FS+ + +I N G +GND + G+S N G GN
Sbjct: 316 NLNEGSFSDVGGLKGNVSIAHGVTIE--NAIGGSGNDILVGNSADNILQGGAGNDVLYGG 373

Query: 204 -----------------TSNNNAIGSGIDT--DGVLGVDGVNGSSSSSGGSVGGYENNFT 244
S ++ + D D G+D ++ S+ + G + ++ FT
Sbjct: 374 AGADTLYGGAGRDTFVYGSGQDSTVAAYDWIADFQKGIDKIDLSAFRNEGQLSFVQDQFT 433

Query: 245 NHG 247
G
Sbjct: 434 GKG 436


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPSA_04625  FbpA_PF05833  33  0.030  Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 32.5 bits (74), Expect = 0.030
Identities = 55/290 (18%), Positives = 107/290 (36%), Gaps = 33/290 (11%)

Query: 1183 FALQNNRFDSFIPSDQLKIVNAI-ASHF-GFKQEKLQRWYEKIDTANFGYSEQDYKII-- 1238
F + ++F + L++ + I + F G + ++ + S + K I
Sbjct: 174 FDFSYDMIENFTKENSLQLNDNIFSKIFTGVSKTLSSEICFRLKNNSIDLSLSNLKEIVE 233

Query: 1239 --KDFMDRVGENNINLNEQTLNEYFIN-HPENILGHLSLEKTRY---SFEINGEQIYKYE 1292
KD + N N T N F+ + N++ +K +Y S + K +
Sbjct: 234 VCKDLFKEIQSNKFEFNCYTKNNSFVGFYCLNLMSKEDYKKIQYDSSSKLLENFYYAKDK 293

Query: 1293 LQALENKSLDLSQALSQAIEKLPKDVYQYHKTTLKTDALIIDANNERYQEVQKLIK---- 1348
L++KS DL + + I + K + T K + + + ++ +L+
Sbjct: 294 SDRLKSKSSDLQKIVMNNINRCTKKDKILNNTLKKCE------DKDIFKLYGELLTANIY 347

Query: 1349 NLARG-ELVKWDNLYYQLEQNNEMGIFLRPTKINSKAQDLRIKAYFKIKDALNDL----- 1402
L +G ++ N YY E + + I L K S+ K Y K+K +
Sbjct: 348 ALKKGLSHIELAN-YYS-ENYDTVKITLDENKTPSQNVQSYYKKYNKLKKSEEAANEQLL 405

Query: 1403 -TSAELNPLSS---DLELESKRAKLNLVYDEFVKKFGYLNENRNRKDIKQ 1448
ELN L S ++ ++ + E ++ GY+ + K K
Sbjct: 406 QNEEELNYLYSVLTNINNADNYDEIEEIKKELIET-GYIKFKKIYKSKKS 454


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPSA_04710  PF04335  98  2e-26  VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 98.4 bits (245), Expect = 2e-26
Identities = 40/209 (19%), Positives = 77/209 (36%), Gaps = 16/209 (7%)

Query: 87 EADVLFQAERKIGDWIFSSAVFFFALAFIEAIIIICLLPLKEKVPYLVTFSNATQNFAIV 146
E D L AER + A ALA + + L PLK PY++T T +I
Sbjct: 21 ERDKLAAAERS-KKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNTGEASIA 79

Query: 147 QR--ADKSIRANQALVRQLVASYVNNRE--NISNIKEQNEIAHETIRLQSAFEVWDFFEK 202
+ D +I ++A+ + +A+YV RE + +E + + + SA D + +
Sbjct: 80 AKLHGDATITYDEAVRKYFLATYVRYREGWIAAAREEY----FDAVMVMSARPEQDRWSR 135

Query: 203 LVSYEH-----SIYTNINLTRKISIINIALISKTQANIEISAQLFHKEKLESLKRYRIIM 257
++ +I N + I ++ + A + + + + +
Sbjct: 136 FYKTDNPQSPQNILAN-RTDVFVEIKRVSFLGGNVAQVYFT-KESVTGSNSTKTDAVATI 193

Query: 258 TFEFEPIEIDTKSVPLNPTGFIVTGYDVT 286
++ + NP G+ V Y
Sbjct: 194 KYKVDGTPSKEVDRFKNPLGYQVESYRAD 222


13  HPSA_04905  HPSA_04955  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
HPSA_049052160.597248hypothetical protein
HPSA_04910414-0.109897cell division protein FtsA
HPSA_04915417-0.980644cell division protein FtsZ
HPSA_04920214-2.710256hypothetical protein
HPSA_04925113-2.393700hypothetical protein
HPSA_04930213-2.506844hypothetical protein
HPSA_04935115-2.466257hypothetical protein
HPSA_04940120-0.319922mechanosensitive channel MscS
HPSA_04945015-2.517609hypothetical protein
HPSA_04950316-2.121698hypothetical protein
HPSA_04955316-2.518885hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPSA_04910  SHAPEPROTEIN  42  4e-06  Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 41.7 bits (98), Expect = 4e-06
Identities = 40/181 (22%), Positives = 69/181 (38%), Gaps = 13/181 (7%)

Query: 211 AASIATLSNDEKELGVACVDMGGETCNLTIYSGNSIRYNKYLPIGSHHLTTDL------S 264
AA+I + G VD+GG T + + S N + Y+ + IG + +
Sbjct: 146 AAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNGVVYSSSVRIGGDRFDEAIINYVRRN 205

Query: 265 YMLNTPFPCAEEVKIKYGDLSFESEGETPSQNVQIPTTGSDGNENHIVPLSEIQSIMRER 324
Y AE +K + G S E V+ + +EI ++E
Sbjct: 206 YGSLIGEATAERIKHEIG--SAYPGDEVREIEVRGRNLAEGVPRGFTLNSNEILEALQEP 263

Query: 325 ALETFKIIHKSIQESGLE---EHLGGGVVLTGGMALMKGIKELARIHFTNYPVRLA-TPM 380
+ ++++ E + G+VLTGG AL++ + L T PV +A P+
Sbjct: 264 LTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNLDRLLM-EETGIPVVVAEDPL 322

Query: 381 E 381

Sbjct: 323 T 323


14  HPSA_05250  HPSA_05485  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
HPSA_05250211-0.248069biotin carboxyl carrier protein
HPSA_05255211-0.119917biotin carboxylase
HPSA_05270211-1.822660putative type II restriction endonuclease
HPSA_05275310-0.706568hypothetical protein
HPSA_05280390.001114sugar nucleotide biosynthesis
HPSA_05285311-0.246526*ribonucleotide-diphosphate reductase subunit
HPSA_05290114-0.532691protein-L-isoaspartate O-methyltransferase
HPSA_05295012-0.526179hypothetical protein
HPSA_05300-114-0.301210tRNA pseudouridine synthase A
HPSA_05305-2130.109612UDP-glucose 4-epimerase
HPSA_053151130.140989**hypothetical protein
HPSA_05320191.261014putative outer membrane protein
HPSA_05325091.399319short chain alcohol dehydrogenase
HPSA_053302100.938301hypothetical protein
HPSA_05335082.211842hypothetical protein
HPSA_05340-182.0392692-keto-3-deoxy-6-phosphogluconate aldolase
HPSA_05345-170.989873phosphogluconate dehydratase
HPSA_05350-19-0.394013glucose-6-phosphate 1-dehydrogenase
HPSA_053551100.0789606-phosphogluconolactonase
HPSA_053603100.286434glucokinase
HPSA_05365313-0.773384NADP-dependent alcohol dehydrogenase
HPSA_05370214-0.890912putative lipopolysaccharide biosynthesis
HPSA_053752121.112876lipopolysaccharide biosynthesis protein
HPSA_053803112.953689hypothetical protein
HPSA_053851123.123313putative outer membrane protein HorH; putative
HPSA_053901112.526783pyruvate flavodoxin oxidoreductase subunit
HPSA_05395-192.304602pyruvate flavodoxin oxidoreductase subunit
HPSA_05400-281.621012pyruvate flavodoxin oxidoreductase subunit
HPSA_054051140.157168pyruvate ferredoxin oxidoreductase, beta
HPSA_05410216-0.354113adenylosuccinate lyase
HPSA_054151160.262949outer membrane protein 3
HPSA_054202140.402833excinuclease ABC subunit B
HPSA_05425215-0.068802hypothetical protein
HPSA_05430115-0.722567hypothetical protein
HPSA_05435114-1.050444hypothetical protein
HPSA_05440112-1.096841gamma-glutamyltranspeptidase (ggt)
HPSA_05445013-3.603961flagellar hook-associated protein FlgK
HPSA_05450220-4.130340hypothetical protein
HPSA_05455218-3.756426hypothetical protein
HPSA_05460118-1.807830type II DNA modification enzyme
HPSA_05465214-1.098728FlgM protein
HPSA_05470413-1.937307hypothetical protein
HPSA_05475412-1.737079FKBP-type peptidyl-prolyl cis-trans isomerase
HPSA_05480513-2.716110hypothetical protein
HPSA_05485313-1.934056peptidoglycan-associated lipoprotein precursor
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPSA_05275  cloacin  30  0.007  Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 30.1 bits (67), Expect = 0.007
Identities = 31/151 (20%), Positives = 54/151 (35%), Gaps = 19/151 (12%)

Query: 20 QRVNSRARKEENKEIQNLSENDERIKLAKQAKQDNLAIGNLESRLKSLKSMDKDAKELMS 79
+R RAR E N+ ++++ N ER A Q SR L + +K + ++
Sbjct: 320 ERNYERARAELNQANEDVARNQERQAKAVQV---------YNSRKSELDAANKTLADAIA 370

Query: 80 ISKAYAHNNEKDQNDFKHFKNRLDKAIDSFNQNLGNDAASLKLPSNIDIDDPKALEKFSK 139
K + ++ N A+ + D AL S
Sbjct: 371 EIKQFNRFAHDPMAGGHRMWQMAGLKAQRAQTDVNNKQAAFDAAAKEKSDADAAL---SS 427

Query: 140 SLESEKENIQNSLHQWKKQLAETNHLNKEYN 170
++ES K+ + K+ + N+LN E N
Sbjct: 428 AMESRKK-------KEDKKRSAENNLNDEKN 451


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPSA_05305  NUCEPIMERASE  116  4e-32  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 116 bits (292), Expect = 4e-32
Identities = 65/341 (19%), Positives = 129/341 (37%), Gaps = 48/341 (14%)

Query: 1 MALLFTGACGYIGSHTARAFLENTTENIVIVDDLSTGF---LEHVKTLEHYYPNRVMFIQ 57
M L TGA G+IG H ++ LE +V +D+L+ + L+ LE F +
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAG-HQVVGIDNLNDYYDVSLKQ-ARLELLAQPGFQFHK 58

Query: 58 ANLNETHKLDAFLDKQQLKDPIEAILHFGAKISVEESTRLPLEYYTNNTLNTLELVKLCL 117
+L + + E + +++V S P Y +N L +++ C
Sbjct: 59 IDLADREGMTDLFASGH----FERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCR 114

Query: 118 KHHIKRFIFSSTAVVYGESDS-SLNEESPLN-PINPYGASKMMSERILLDASKVADFNCV 175
+ I+ +++S++ VYG + + + ++ P++ Y A+K +E + S +
Sbjct: 115 HNKIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPAT 174

Query: 176 ILRYFNVAGACMHNDYATPYTLGQRTLNATHLIKIACECAVGKRKKMGIFGTDYPTRDGT 235
LR+F V G D A + + L GK ++ G
Sbjct: 175 GLRFFTVYGPWGRPDMA-LFKFTKAMLE-------------GKSID--VYN------YGK 212

Query: 236 CIRDYIHVDDLANAHLASYHTLLEQNKS---------------EIYNVGYNQGHSVKEVI 280
RD+ ++DD+A A + + + +YN+G + + + I
Sbjct: 213 MKRDFTYIDDIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYI 272

Query: 281 NKVKEISNNNFLVEILDKRQGDPASLIANNAKILQNTPFKP 321
+++ +L + GD A+ + + F P
Sbjct: 273 QALEDALGIEAKKNMLPLQPGDVLETSADTKALYEVIGFTP 313


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPSA_05325  DHBDHDRGNASE  81  2e-20  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 80.9 bits (199), Expect = 2e-20
Identities = 50/232 (21%), Positives = 89/232 (38%), Gaps = 19/232 (8%)

Query: 4 ILVSGATSGFGLEIAKAFLQKNHVVFGTGRRQKNL------QELQLAYPKRFIPLCFDLK 57
++GA G G +A+ + + + L + + + + F P D++
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAF-PA--DVR 67

Query: 58 NKLETKQAVETLFSITDRIDALINNAGLALGLNKTYECEFNDWEVMIDTNIKGLLYLTRL 117
+ + + ID L+N AG+ L + +WE N G+ +R
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGV-LRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 118 ILPSMIEHNRGTIINLGSIAGTYAYPGGNVYGASKAFVKQFSLNLRADLAGTNIRVSNVE 177
+ M++ G+I+ +GS Y +SKA F+ L +LA NIR + V
Sbjct: 127 VSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVS 186

Query: 178 PG---------LCGETEFSMVRFKGDKLKAQSVYENTTYLKPQDIANIVLWI 220
PG L + + KG ++ KP DIA+ VL++
Sbjct: 187 PGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFL 238


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPSA_05390  YERSSTKINASE  29  0.018  Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 28.6 bits (63), Expect = 0.018
Identities = 13/33 (39%), Positives = 21/33 (63%)

Query: 80 IENIFANEKEDTTYIITSYLSKEELFEKKPELK 112
+ N+ A+EK D ++++ L E FEK PE+K
Sbjct: 314 VGNLGASEKSDVFLVVSTLLHCIEGFEKNPEIK 346


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPSA_05410  TYPE3IMRPROT  29  0.027  Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 29.3 bits (66), Expect = 0.027
Identities = 14/54 (25%), Positives = 24/54 (44%), Gaps = 7/54 (12%)

Query: 316 LFITSDFMLSRLNGVVENLVIYP-------KNMLKNLALSGGLVFSQRVLLELP 362
LF+T + L ++ +V+ P N L +G L+F ++L LP
Sbjct: 136 LFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIFLNGLMLALP 189


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPSA_05440  NAFLGMOTY  31  0.012  Sodium-type flagellar protein MotY precursor signature.
		>NAFLGMOTY#Sodium-type flagellar protein MotY precursor signature.

Length = 293

Score = 30.9 bits (69), Expect = 0.012
Identities = 15/37 (40%), Positives = 21/37 (56%), Gaps = 1/37 (2%)

Query: 259 YNVKWRKPVMGSYRGYKIISMSPPSSGGTHLIQILNV 295
+ +K R+P MG R +ISM PP G H +I N+
Sbjct: 72 FELKMRRP-MGETRNVSLISMPPPWRPGEHADRITNL 107


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPSA_05445  FLGHOOKAP1  565  0.0  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 565 bits (1457), Expect = 0.0
Identities = 128/610 (20%), Positives = 229/610 (37%), Gaps = 75/610 (12%)

Query: 6 SSLNTSYTGLQAHQSMMDVTGNNISNASDEFYSRQRVIAKPQAAYMYGTKNVNMGVDVEA 65
S +N + +GL A Q+ ++ NNIS+ + Y+RQ I + + V GV V
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSG 61

Query: 66 IERVHDEFVFSRYTKANYENTYYDTEFSHLKEASAYFPDIDEASLFTDLQDYFNSWKELS 125
++R +D F+ ++ A +++ + + + +SL T +QD+F S + L
Sbjct: 62 VQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNML-STSTSSLATQMQDFFTSLQTLV 120

Query: 126 KNAKDSAQKQALAQKTEALTHNIKDTRERLTTLQHKASEELKSVIKEVNSLGSQIAEINK 185
NA+D A +QAL K+E L + K T + L + + + + + ++N+ QIA +N
Sbjct: 121 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 180

Query: 186 RIKEVENNKSLKHANELRDKRDELEFHLRELLGGNVFKSSIKTHSLTDKDSADFDESYNL 245
+I + + N L D+RD+L L +++G V S +YN+
Sbjct: 181 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEV--------------SVQDGGTYNI 226

Query: 246 NIGHGFNIIDGSIFHPLVVKESENKGGLNQVYFQSDDFKVTNITDK-LNQGKVGALLNVY 304
+ +G++++ GS L S V + I +K LN G +G +L
Sbjct: 227 TMANGYSLVQGSTARQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFR 286

Query: 305 NDGSNGTLKGKLQDYIDLLDSFARGLIESTNAIYAQSASHNIEGEPVEFNSDEAFKDTNY 364
+ L + L A E+ N + +A D N
Sbjct: 287 SQ--------DLDQTRNTLGQLALAFAEAFNTQH------------------KAGFDANG 320

Query: 365 NIKNGSFDL----IAYNTDGKEIARKTIAITPITTMNDIIQAINANTDDN-----KDNNT 415
+ F + + NT K +T + + I+ + + N T
Sbjct: 321 DAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATDYKISFDNNQWQVTRLASNTT 380

Query: 416 ENDFDDYFTASFNNETKKFVIQPKNASQGLFVSMKDNGTNFMGALKLNPFFQGDDASNIS 475
D + + + + + M L D + I+
Sbjct: 381 FTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIVNMDVLI-------TDEAKIA 433

Query: 476 LNKEYKKEPTTIRPWLAPINGNFDVANMMQQLQYDSVDFYNDKFDIKPMKISEFYQFLTG 535
+ E E G+ D N L S N K ++ Y L
Sbjct: 434 MASE---EDA----------GDSDNRNGQALLDLQS----NSKTVGGAKSFNDAYASLVS 476

Query: 536 KINTDAEKSGRILDTKKSMLETIKKEQLSISQVSVDEEMVNLIKFQSGYAANAKVITAID 595
I T+ +++ + +Q SIS V++DEE NL +FQ Y ANA+V+ +
Sbjct: 477 DIGNKTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTAN 536

Query: 596 RMIDTLLGIK 605
+ D L+ I+
Sbjct: 537 AIFDALINIR 546


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPSA_05480  IGASERPTASE  34  9e-04  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 34.3 bits (78), Expect = 9e-04
Identities = 32/160 (20%), Positives = 57/160 (35%), Gaps = 10/160 (6%)

Query: 51 SQVESNTQAQEGLKSVYEGQSNKIKHLNDAILSQEESLRALKASQEVHANTIKQQSEILE 110
+ + + Q + SV +N+ D + + E A KQ+S+ +E
Sbjct: 995 TNITTPNNIQADVPSV--PSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVE 1052

Query: 111 DLRNEIRANQQAIQQLDKQNKEMSELLTKLSQDLVAQIAL-----IQKTLKEQEKQEKAE 165
+NE A + Q + + S + + VAQ KE EK E
Sbjct: 1053 --KNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEE 1110

Query: 166 KQYKQSAPTNEK-SKQSLAADALEQDKENQPKHQQLPKEE 204
K ++ T E S + EQ + QP+ + + +
Sbjct: 1111 KAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPAREND 1150


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPSA_05485  OMPADOMAIN  143  4e-44  OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 143 bits (362), Expect = 4e-44
Identities = 46/172 (26%), Positives = 72/172 (41%), Gaps = 27/172 (15%)

Query: 22 NMDKETVAGDVSAKTVQTAPV-TTEPAPEKEEPKEEPVPVVEEKPAKPAIESGTIIASIY 80
D ++ VS + Q PAP P P V+ K T+ + +
Sbjct: 177 RPDNGMLSLGVSYRFGQGEAAPVVAPAPA-------PAPEVQTK-------HFTLKSDVL 222

Query: 81 FDFDKYEIKDSDQETLDEIVQKAKE---NHMQVLLEGNTDEFGSSEYNQALGVKRTLSVK 137
F+F+K +K Q LD++ + V++ G TD GS YNQ L +R SV
Sbjct: 223 FNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQSVV 282

Query: 138 NALVIKGVEKDMIKTISFGEAKPKCTQ-----KTR----ECYKENRRVDVKL 180
+ L+ KG+ D I GE+ P K R +C +RRV++++
Sbjct: 283 DYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334


15  HPSA_05540  HPSA_05565  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
HPSA_05540019-3.638071F0F1 ATP synthase subunit B'
HPSA_05545117-3.211725plasmid replication-partition related protein
HPSA_05550117-3.788588SpoOJ regulator (soj)
HPSA_05555018-3.842987biotin--protein ligase
HPSA_05560016-3.828431methionyl-tRNA formyltransferase
HPSA_05565016-4.013495ATPase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPSA_05560  FERRIBNDNGPP  34  6e-04  Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 33.8 bits (77), Expect = 6e-04
Identities = 12/33 (36%), Positives = 20/33 (60%)

Query: 72 EPEVQILKSLKPDFIVVVAYGKILPKEVLKIAP 104
EP +++L +KP F+V A P+ + +IAP
Sbjct: 86 EPNLELLTEMKPSFMVWSAGYGPSPEMLARIAP 118


16  HPSA_06845  HPSA_06880  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
HPSA_068452151.008541hypothetical protein
HPSA_068502150.722306putative inner membrane protein translocase
HPSA_068652150.504547tRNA modification GTPase TrmE
HPSA_068704160.931103Outer membrane protein HomD; putative signal
HPSA_06875316-0.654573hypothetical protein
HPSA_06880217-0.639418hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPSA_06850  60KDINNERMP  413  e-141  60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 413 bits (1062), Expect = e-141
Identities = 123/336 (36%), Positives = 202/336 (60%), Gaps = 18/336 (5%)

Query: 226 YTFSGVLLENNDKKIEKIE---DKDAKEIKRFSNILFLSSVDRYFTTLLFTDNPQGFEVL 282
+TF G D+K EK + D + + S +++ + +YF T N G
Sbjct: 216 HTFRGAAYSTPDEKYEKYKFDTIADNENLNISSKGGWVAMLQQYFATAWIPHN-DGTNNF 274

Query: 283 INPETGTKNPLGFISLKNEA-----------TLHGYIGPKDYRSLKAISPMLTDAIEYGL 331
G N + I K++ ++GP+ + A++P L ++YG
Sbjct: 275 YTANLG--NGIAAIGYKSQPVLVQPGQTGAMNSTLWVGPEIQDKMAAVAPHLDLTVDYGW 332

Query: 332 ITFFAKGVFVLLDYLYQFVGNWGWAIILLTIIVRIILYPLSYKGMVSMQKLKELAPKMKE 391
+ F ++ +F LL +++ FVGNWG++II++T IVR I+YPL+ SM K++ L PK++
Sbjct: 333 LWFISQPLFKLLKWIHSFVGNWGFSIIIITFIVRGIMYPLTKAQYTSMAKMRMLQPKIQA 392

Query: 392 LQEEYKGEPQKLQAHMMQLYKKHGANPLGGCLPLILQIPVFFAIYRVLYNAVELKSSEWI 451
++E + Q++ MM LYK NPLGGC PL++Q+P+F A+Y +L +VEL+ + +
Sbjct: 393 MRERLGDDKQRISQEMMALYKAEKVNPLGGCFPLLIQMPIFLALYYMLMGSVELRQAPFA 452

Query: 452 LWIHDLSIMDPYFILPLLMGASMYWHQSVTPNTMTDPMQAKIFKLLPLLFTLFLITFPAG 511
LWIHDLS DPY+ILP+LMG +M++ Q ++P T+TDPMQ KI +P++FT+F + FP+G
Sbjct: 453 LWIHDLSAQDPYYILPILMGVTMFFIQKMSPTTVTDPMQQKIMTFMPVIFTVFFLWFPSG 512

Query: 512 LVLYWTTNNILSVLQQLIINKVLENKKRAHVQNKKE 547
LVLY+ +N+++++QQ +I + LE K+ H + KK+
Sbjct: 513 LVLYYIVSNLVTIIQQQLIYRGLE-KRGLHSREKKK 547


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPSA_06865  TCRTETOQM  36  4e-04  Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 35.6 bits (82), Expect = 4e-04
Identities = 33/134 (24%), Positives = 55/134 (41%), Gaps = 25/134 (18%)

Query: 216 LSIIGKPNAGKSSLLNAMLLEERA---LVSDIKGTTR-DTIEE-------------VIEL 258
+ ++ +AGK++L ++L A L S KGTTR D +
Sbjct: 6 IGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQW 65

Query: 259 QGHKVRLIDTAGIRESMDEIERLGIEKSLKSLENCDIILGVFDLSKPLEKEDFTIIDALN 318
+ KV +IDT G + + E+ R SL L D + + ++ + + AL
Sbjct: 66 ENTKVNIIDTPGHMDFLAEVYR-----SLSVL---DGAILLISAKDGVQAQTRILFHALR 117

Query: 319 RAKKPCIVVLNKND 332
+ P I +NK D
Sbjct: 118 KMGIPTIFFINKID 131


17  HPSA_06945  HPSA_07025  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
HPSA_069452111.561426ABC transporter ATP-binding protein
HPSA_069501110.224819putative ABC transport system permease protein
HPSA_06955212-0.013618hypothetical protein
HPSA_069602110.273687outer membrane protein
HPSA_069652120.392660branched-chain amino acid aminotransferase
HPSA_06970212-0.398585outer membrane protein (omp31)
HPSA_06975213-0.634669DNA polymerase I
HPSA_069802170.714009DNA polymerase I
HPSA_069952170.681405type II restriction modification enzyme
HPSA_070003190.632690hypothetical protein
HPSA_070052140.392851thymidylate kinase
HPSA_070102120.377322phosphopantetheine adenylyltransferase
HPSA_070152120.5322833-octaprenyl-4-hydroxybenzoate carboxy-lyase
HPSA_070202110.176711hypothetical protein
HPSA_070252110.169561flagellar basal body P-ring biosynthesis protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPSA_07010  LPSBIOSNTHSS  224  2e-78  Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 224 bits (572), Expect = 2e-78
Identities = 63/147 (42%), Positives = 93/147 (63%)

Query: 4 IGIYPGTFDPVTNGHIDIIHRSSELFEKLIVAVAHSSAKNPMFSLDERLKMMQLATKSFK 63
IYPG+FDP+T GH+DII R LF+++ VAV + K PMFS+ ERL+ + A
Sbjct: 2 NAIYPGSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAHLP 61

Query: 64 NVECIAFEGLLANLAKEYHCKVLVRGLRVVSDFEYELQMGYANKSLNHELETLYFMPTLQ 123
N + +FEGL N A++ ++RGLRV+SDFE ELQM NK+L +LET++ + +
Sbjct: 62 NAQVDSFEGLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDLETVFLTTSTE 121

Query: 124 NAFISSSIVRSIIAHKGDASHLVPKEI 150
+F+SSS+V+ + G+ H VP +
Sbjct: 122 YSFLSSSLVKEVARFGGNVEHFVPSHV 148


18  HPSA_07180  HPSA_07245  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
HPSA_071802132.668933hypothetical protein
HPSA_071851111.992278ferredoxin-like protein
HPSA_07190-1101.732582putative glycerol-3-phosphate acyltransferase
HPSA_07195-191.098092dihydroneopterin aldolase
HPSA_07200-181.253410hypothetical protein
HPSA_07205-181.101637iron-regulated outer membrane protein
HPSA_07210112-2.634121selenocysteine synthase
HPSA_07215-113-2.990064transcription elongation factor NusA
HPSA_07220015-4.277784hypothetical protein
HPSA_07225016-3.697532hypothetical protein
HPSA_07230113-3.719112hypothetical protein
HPSA_0723509-3.209406type IIS restriction enzyme R and M protein
HPSA_0724009-3.327931putative type IIS restriction-modification
HPSA_07245110-3.123341hypothetical protein
19  HPSA_07350  HPSA_07445  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
HPSA_073502143.650400hypothetical protein
HPSA_073553134.421682hypothetical protein
HPSA_073601122.667760hypothetical protein
HPSA_073651122.842569iron(III) dicitrate transport protein (fecA)
HPSA_07370-281.563604arginase
HPSA_07375-281.229180alanine dehydrogenase
HPSA_07380-19-0.912671NADP-dependent alcohol dehydrogenase
HPSA_07385110-1.786172hypothetical protein
HPSA_0739019-1.443569putative outer membrane protein
HPSA_07395310-1.768203probable inorganic polyphosphate/ATP-NAD kinase
HPSA_0740039-2.010303DNA repair protein RecN
HPSA_07405012-2.567663fibronectin/fibrinogen-binding protein
HPSA_07410113-1.066736hypothetical protein
HPSA_07415-110-1.473843hypothetical protein
HPSA_07420011-2.088429DNA polymerase III subunit epsilon
HPSA_07425114-2.091156ribulose-phosphate 3-epimerase
HPSA_07430114-1.267494fructose-1,6-bisphosphatase
HPSA_07435314-1.580772hypothetical protein
HPSA_07440013-2.636436putative type II DNA methylase protein
HPSA_07445011-3.353674hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPSA_07405  FbpA_PF05833  115  5e-30  Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 115 bits (290), Expect = 5e-30
Identities = 77/361 (21%), Positives = 143/361 (39%), Gaps = 31/361 (8%)

Query: 97 AKDLAYKSETFILRLELIPKKANLMILDQEKCVIEA--FRFNDRVAKNDILGALPLN-TY 153
+ ++ ++ + + L L K + + I++ F FN N +G LN
Sbjct: 209 SSEICFRLKNNSIDLSLSNLKEIVEVCKDLFKEIQSNKFEFNCYTKNNSFVGFYCLNLMS 268

Query: 154 EHQEKDLDFKDLLEILEKDFLFYQHKE----LEHKKNQIIKRLNAQKERLKEKLDNLENP 209
+ K + + ++LE FY K+ L+ K + + K + R +K L N
Sbjct: 269 KEDYKKIQYDSSSKLLEN---FYYAKDKSDRLKSKSSDLQKIVMNNINRCTKKDKILNNT 325

Query: 210 KNLQLEAKELQTQASLLLTYQHLINKHESCVILKDFED---KECAIEIDKSMPLNAFINK 266
+ + LL + + K S + L ++ I +D++ + +
Sbjct: 326 LKKCEDKDIFKLYGELLTANIYALKKGLSHIELANYYSENYDTVKITLDENKTPSQNVQS 385

Query: 267 KFTLSKKKKQKSQFLYLEEENLKEKIAFKENQIHYIKGATEESVLE------------MF 314
+ K K+ + + +E++ + + + I A +E F
Sbjct: 386 YYKKYNKLKKSEEAANEQLLQNEEELNYLYSVLTNINNADNYDEIEEIKKELIETGYIKF 445

Query: 315 MPVKNSKIKRPMSGYEVLYYKDFKIGLGKNQKENIKL-LQDARANDLWMHVRDIPGSHLI 373
+ SK + + I +GKN +N L L+ A +D+W H ++IPGSH+I
Sbjct: 446 KKIYKSKKSKTSKPMHFISKDGIDIYVGKNNIQNDYLTLKFANKHDIWFHTKNIPGSHVI 505

Query: 374 VFCQKNAPKDEIIMELAKMLIKMQKDVFNS-YEIDYTQRKFVKIIKGAH---VIYSKYRT 429
V + P + ++E A + K +S +DYT+ K VK GA VIYS +T
Sbjct: 506 VKNIMDIP-ESTLLEAANLAAYYSKSQNSSNVPVDYTEVKNVKKPNGAKPGMVIYSTNQT 564

Query: 430 I 430
I
Sbjct: 565 I 565



Score = 34.5 bits (79), Expect = 8e-04
Identities = 19/104 (18%), Positives = 49/104 (47%), Gaps = 7/104 (6%)

Query: 36 KERHAFVVDLN--TPYIGLSQKPLEGVLKNTLALDFCLNKFTKNAKILQANIIDNDRI-- 91
+ ++ + P I L+ +K + L K+ NAKI+ + I+ DRI
Sbjct: 43 RLSFKLLISSSSNYPRIHLTDLTKPNPIKAPMFCMV-LRKYISNAKIVDIHQINQDRIVV 101

Query: 92 LEITGAKDLAYKSETFILRLELIPKKANLMILD-QEKCVIEAFR 134
++ +L + S L +E++ + +N+ ++ ++ ++++ +
Sbjct: 102 IDFESTDELGFNSIY-SLIIEIMGRHSNMTLIRKRDNIIMDSIK 144


20  HPSA_07520  HPSA_07750  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
HPSA_07520014-3.131212rod shape-determining protein MreB
HPSA_07525320-6.066547rod shape-determining protein MreC
HPSA_07540624-7.654943hypothetical protein
HPSA_07545625-7.236006hypothetical protein
HPSA_07550626-6.659661hypothetical protein
HPSA_07555522-5.832573hypothetical protein
HPSA_07560722-5.954360hypothetical protein
HPSA_07565720-6.113365VirB4-like protein
HPSA_07570815-5.010663DNA topoisomerase I (topA)
HPSA_07575714-4.729100IS200 insertion sequence from SARA17
HPSA_07580716-5.176551IS606 transposase
HPSA_07585818-5.732621DNA topoisomerase I (topA)
HPSA_07590717-5.734124VirB8 type IV secretion protein
HPSA_07595821-5.683044VirB9 type IV secretion protein
HPSA_07600825-6.821398VirB10 type IV secretion protein
HPSA_076051027-8.723260hypothetical protein
HPSA_07610926-7.911778hypothetical protein
HPSA_07615924-7.674532hypothetical protein
HPSA_07620823-7.558559type IV secretion system ATPase (virB11-like)
HPSA_07625922-8.373685hypothetical protein
HPSA_07630922-8.011763hypothetical protein
HPSA_07635820-6.910171conjugal transfer protein (traG)
HPSA_07640614-3.962576hypothetical protein
HPSA_07645515-4.318700hypothetical protein
HPSA_07650514-3.696264hypothetical protein
HPSA_07655514-2.734135hypothetical protein
HPSA_07660515-2.772029PARA protein
HPSA_07665515-2.978147hypothetical protein
HPSA_07670419-3.670058hypothetical protein
HPSA_07685418-3.342295hypothetical protein
HPSA_07690418-5.091189hypothetical protein
HPSA_07695218-5.377891hypothetical protein
HPSA_07700115-5.237199hypothetical protein
HPSA_07705-110-5.158018hypothetical protein
HPSA_07710-110-4.322964integrase/recombinase XercD family protein
HPSA_07715012-2.375581putative type III restriction enzyme R protein
HPSA_07720-112-1.119002modification methylase KpnI
HPSA_07725012-1.229720type III restriction-modification system:
HPSA_07730113-0.041238putative transcriptional regulator
HPSA_077351120.058775hypothetical protein
HPSA_07740213-0.094648hypothetical protein
HPSA_07745111-0.633601replicative DNA helicase
HPSA_07750214-0.875330DNA transfer protein ComE
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPSA_07520  SHAPEPROTEIN  474  e-171  Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 474 bits (1222), Expect = e-171
Identities = 180/347 (51%), Positives = 249/347 (71%), Gaps = 2/347 (0%)

Query: 2 IFSKLIGLFSHDIAIDLGTANTIVLVKGQGIIINEPSIVAVRMGLFDSKAYDILAVGSEA 61
+ K G+FS+D++IDLGTANT++ VKGQGI++NEPS+VA+R S + AVG +A
Sbjct: 1 MLKKFRGMFSNDLSIDLGTANTLIYVKGQGIVLNEPSVVAIRQDRAGSPKS-VAAVGHDA 59

Query: 62 KEMLGKTPNSIRAIRPMKDGVIADYDITAKMIRYFIEKVHKRKTW-IRPRIMVCVPYGLT 120
K+MLG+TP +I AIRPMKDGVIAD+ +T KM+++FI++VH PR++VCVP G T
Sbjct: 60 KQMLGRTPGNIAAIRPMKDGVIADFFVTEKMLQHFIKQVHSNSFMRPSPRVLVCVPVGAT 119

Query: 121 SVERNAVKESALSAGAREVFLIEEPMAAAIGAGLPVKEPQGSLIVDIGGGTTEIGVISLG 180
VER A++ESA AGAREVFLIEEPMAAAIGAGLPV E GS++VDIGGGTTE+ VISL
Sbjct: 120 QVERRAIRESAQGAGAREVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLN 179

Query: 181 GLVISKSIRVAGDKLDQSIVEYIRKKFNLLIGERTGEEIKIEIGCAIKLDPPLTMEVSGR 240
G+V S S+R+ GD+ D++I+ Y+R+ + LIGE T E IK EIG A D +EV GR
Sbjct: 180 GVVYSSSVRIGGDRFDEAIINYVRRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGR 239

Query: 241 DQVSGLLHTIELSSDDVFEAIKDQVREISSALRSVLEEVKPDLARDIVQNGVVLTGGGAL 300
+ G+ L+S+++ EA+++ + I SA+ LE+ P+LA DI + G+VLTGGGAL
Sbjct: 240 NLAEGVPRGFTLNSNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGAL 299

Query: 301 IKGLDKYLSDMVKLPVYVGDEPLLAVAKGTGEAIQDLDLLSRVGFSE 347
++ LD+ L + +PV V ++PL VA+G G+A++ +D+ FSE
Sbjct: 300 LRNLDRLLMEETGIPVVVAEDPLTCVARGGGKALEMIDMHGGDLFSE 346


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_07590  PF04335       93     4e-24    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 93.0 bits (231), Expect = 4e-24
Identities = 35/213 (16%), Positives = 70/213 (32%), Gaps = 13/213 (6%)

Query: 142 FEEVRD-ASVIYHLEKKLGDYIFYVACFFFGTTALLIILLIILLPLKQKEPYLVQFSNNK 200
FEE ++ + VA ++ + L PLK EPY++ N
Sbjct: 14 FEEAASWERDKLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNT 73

Query: 201 ENFALVQ--KADSTITANKALIRSLVGAYVLNRESITHIKQHEKMRQNTIKEQSSNEVWY 258
++ D+TIT ++A+ + + YV RE + + + S+
Sbjct: 74 GEASIAAKLHGDATITYDEAVRKYFLATYVRYREGWI--AAAREEYFDAVMVMSARPEQD 131

Query: 259 EFEKLIA-----HYDSIYTNPLLTRKVKIANI-YLDKDLAYIDIEVSLYHSGELESLKRY 312
+ + +I N V+I + +L ++A + +G +
Sbjct: 132 RWSRFYKTDNPQSPQNILAN-RTDVFVEIKRVSFLGGNVAQVYFTKESV-TGSNSTKTDA 189

Query: 313 KVVMSFEFKKQEINFDSMSLNPTGFIVTGYDVT 345
+ ++ NP G+ V Y
Sbjct: 190 VATIKYKVDGTPSKEVDRFKNPLGYQVESYRAD 222


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_07665  GPOSANCHOR    32     0.046    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 32.0 bits (72), Expect = 0.046
Identities = 16/129 (12%), Positives = 39/129 (30%), Gaps = 3/129 (2%)

Query: 2476 EEVKLRAEIKSEEAKYKAFNKEHYFNEENLKNNASKLDFLHKELKDLETLQSSVMIPTHT 2535
+ L AE + EA+ K +++K+ L E L ++ +
Sbjct: 177 KIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADL---EKA 233

Query: 2536 EIKLYDLKNEESKDYELIKVKEVEPLKENASMSEELTHKKLKEQNKQIAEQNKEKLDAIK 2595
+ +S + ++ ++ A + + L + E A
Sbjct: 234 LEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAAL 293

Query: 2596 KQFASNLND 2604
+ ++L
Sbjct: 294 EAEKADLEH 302


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_07685  CABNDNGRPT    40     3e-05    NodO calcium binding signature.
		>CABNDNGRPT#NodO calcium binding signature.

Length = 479

Score = 40.0 bits (93), Expect = 3e-05
Identities = 24/119 (20%), Positives = 39/119 (32%), Gaps = 7/119 (5%)

Query: 146 TNDPMYANTPFSNNPNSPNDNAINGKDGANGSNGYGINGNDGINGSSGSNGANGSNGNTS 205
N + P + AI GAN + + G N ++ + ++ + +
Sbjct: 232 NETGADYNGHYGGAPMIDDIAAIQRLYGANMTTR-TGDSVYGFNSNTDRDFYTATDSSKA 290

Query: 206 NNNAI--GSGIDTDGVLGVDG---VNGSSSSSGGSVGGYENNFTNHGSTNNN-TGGYDN 258
++ G DT G +N + S G N HG T N GG N
Sbjct: 291 LIFSVWDAGGTDTFDFSGYSNNQRINLNEGSFSDVGGLKGNVSIAHGVTIENAIGGSGN 349



Score = 29.6 bits (66), Expect = 0.046
Identities = 24/123 (19%), Positives = 41/123 (33%), Gaps = 27/123 (21%)

Query: 150 MYANTPFSNNPNSPNDNAINGKDGANGSNGYGINGNDGINGSSGSNGANGSNGN------ 203
FS+ + +I N G +GND + G+S N G GN
Sbjct: 316 NLNEGSFSDVGGLKGNVSIAHGVTIE--NAIGGSGNDILVGNSADNILQGGAGNDVLYGG 373

Query: 204 -----------------TSNNNAIGSGIDT--DGVLGVDGVNGSSSSSGGSVGGYENNFT 244
S ++ + D D G+D ++ S+ + G + ++ FT
Sbjct: 374 AGADTLYGGAGRDTFVYGSGQDSTVAAYDWIADFQKGIDKIDLSAFRNEGQLSFVQDQFT 433

Query: 245 NHG 247
G
Sbjct: 434 GKG 436


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_07730  HTHFIS        90     3e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 89.9 bits (223), Expect = 3e-23
Identities = 37/118 (31%), Positives = 56/118 (47%), Gaps = 2/118 (1%)

Query: 1 MQK-KIFLLEDDYLLSESVKEFLEHLGYEVACAFNGKEAYERLSVERFNLLLLDVQVPEM 59
M I + +DD + + + L GY+V N + ++ +L++ DV +P+
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 60 NSLELFKRIKNDFLISTPVIFITALQDNTTLKNAFNLGASDYLKKPFDLDELEARIKR 117
N+ +L RIK PV+ ++A T A GA DYL KPFDL EL I R
Sbjct: 61 NAFDLLPRIKKA-RPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


21  HPSA_00185  HPSA_00210  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPSA_00185  -2      12       0.365906   ComB1 protein
HPSA_00190  -3      13       0.238213   ComB2 protein
HPSA_00195  -2      12       0.381720   ComB10 competence protein
HPSA_00200  -2      15       -0.164847  mannose-6-phosphate isomerase /
HPSA_00205  -2      14       -0.167349  GDP-D-mannose dehydratase
HPSA_00210  0       10       0.013983   GDP-fucose synthetase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_00185  PF04335       133    1e-40    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 133 bits (336), Expect = 1e-40
Identities = 39/202 (19%), Positives = 73/202 (36%), Gaps = 4/202 (1%)

Query: 40 QSVFRLERNRLKIAYKLLGLMSFIVFVLAIVLISLLPLQKTEHHF--VDFLNQDKHYAII 97
+ K+A+ + G+ + + + +L PL+ E + VD + A
Sbjct: 22 RDKLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNTGEASIAAK 81

Query: 98 QRADKSISSNEALARSLIGAYVLNRESINRIDDKSRYELVRLQSNAKVWQRFEDLIKTNN 157
D +I+ +EA+ + + YV RE + ++ V + S R+ KT+N
Sbjct: 82 LHGDATITYDEAVRKYFLATYVRYREGWIAAAREEYFDAVMVMSARPEQDRWSRFYKTDN 141

Query: 158 SIYAQSHLEREVHI-VNIAIYQQDNNPIASVSITAKLLNENKLVYEKRYKIVMSYVFDAP 216
Q+ L + V I +A V T + + + + + Y D
Sbjct: 142 PQSPQNILANRTDVFVEIKRVSFLGGNVAQVYFTKESVTGSNST-KTDAVATIKYKVDGT 200

Query: 217 DFDYASMPKNPTGFKVTRYSIT 238
KNP G++V Y
Sbjct: 201 PSKEVDRFKNPLGYQVESYRAD 222


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_00190  TYPE4SSCAGX   29     0.020    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 29.4 bits (65), Expect = 0.020
Identities = 28/90 (31%), Positives = 41/90 (45%), Gaps = 7/90 (7%)

Query: 178 NTLENANKPLKETKAAEETKEK-EEEEVITIGDNTNAMKIVKKDIQRNYKALKSSQ-RKW 235
NT N+ + + KEK EE+ I D A+ + Q + ALK + +
Sbjct: 346 NTTAYINRVMMASNEQIINKEKIREEKQKIILDQAKAL-----ETQYVHNALKRNPVPRN 400

Query: 236 YCLWLCSKKSKLSLMPKEIFNDKQFTYFKF 265
Y + +K +MP EIF+D FTYF F
Sbjct: 401 YNYYQAPEKRSKHIMPSEIFDDGTFTYFGF 430


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_00205  NUCEPIMERASE  88     1e-21    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 87.9 bits (218), Expect = 1e-21
Identities = 46/180 (25%), Positives = 72/180 (40%), Gaps = 19/180 (10%)

Query: 7 LITGVTGQDGSYLAEYLLNLGYEVHGLKRRSSSINTSRIDHLYEDLHSEHKRRFFLHYGD 66
L+TG G G ++++ LL G++V G+ + + S E L F H D
Sbjct: 4 LVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQP---GFQFHKID 60

Query: 67 MTDSSNLIHLIATTKPTEIYNLAAQSHVKVSFETPEYTANADGIGTLRILEAMRILGLED 126
+ D + L A+ ++ + V+ S E P A+++ G L ILE R ++
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQ- 119

Query: 127 KTRFYQASTSELYGEVLETPQNENTPF-------NPRSPYAVAKMYAFYITKNYREAYNL 179
AS+S +YG N PF +P S YA K + Y Y L
Sbjct: 120 --HLLYASSSSVYGL------NRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGL 171


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_00210  NUCEPIMERASE  46     1e-07    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 45.5 bits (108), Expect = 1e-07
Identities = 50/358 (13%), Positives = 106/358 (29%), Gaps = 78/358 (21%)

Query: 5 ILITGAYGMVGQNTALYFKKN---------------------------KPNITLLTPKKS 37
L+TGA G +G + + + +P
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFH----- 57

Query: 38 ELYLLDKDSVQAYLKEYKPTGIIHCAGRVGGIVANMNDLSTYMVENLLMGLYLFSSALDL 97
++ L D++ + + R + ++ + Y NL L +
Sbjct: 58 KIDLADREGMTDLFASGHFERVFISPHR-LAVRYSLENPHAYADSNLTGFLNILEGCRHN 116

Query: 98 GIKKAINLASSCAYPKFAPNPLKESDLLNGSLEPTNEGYALAKLSVMKYCEYVSAEKGVF 157
I+ + +SS Y P D ++ + YA K + S G+
Sbjct: 117 KIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSL----YAATKKANELMAHTYSHLYGLP 172

Query: 158 YKTLVPCNLYGE-------FDKFEEKIAHMIPGLISRMHTAKLKGEKNFVMWGDGTARRE 210
L +YG KF + + L+G+ ++ G +R+
Sbjct: 173 ATGLRFFTVYGPWGRPDMALFKFTKAM---------------LEGKS-IDVYNYGKMKRD 216

Query: 211 YLNAKDLARFIALAYENIAQIPS-----------------VMNVGSGVDYSIEEYYEMVA 253
+ D+A I + I + V N+G+ + +Y + +
Sbjct: 217 FTYIDDIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALE 276

Query: 254 QVLDYKGVFVKDLSKPVGMQQKLMDISK-QKALKWELEIPLEQGIKEAYEYYLKLLEV 310
L + +P + + D + + + E ++ G+K +Y +V
Sbjct: 277 DALGIEAKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDFYKV 334


22  HPSA_00275  HPSA_00305  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPSA_00275  5       18       -1.532494  hypothetical protein
HPSA_00280  4       16       -1.190205  hypothetical protein
HPSA_00285  3       14       -1.177221  hypothetical protein
HPSA_00300  3       15       -0.774806  hypothetical protein
HPSA_00305  1       12       -0.109436  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_00275  GPOSANCHOR    32     0.002    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 32.0 bits (72), Expect = 0.002
Identities = 17/69 (24%), Positives = 29/69 (42%)

Query: 2 LENDKQVLNNEKIELSNDITKLTAEKDDLLKTKENLTKEKENLNTDLSNAKNEASQTSQK 61
LE + N S I L AEK L K +L + + LN + + + + + +
Sbjct: 265 LEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREA 324

Query: 62 LKDLQQKHA 70
K L+ +H
Sbjct: 325 KKQLEAEHQ 333


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_00280  GPOSANCHOR    41     3e-06    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 41.2 bits (96), Expect = 3e-06
Identities = 43/228 (18%), Positives = 66/228 (28%)

Query: 4 LSSTKEKLEARIDLLESEIIDLSTGIKNLVAETSKLKDANNQLRQKNDKLFTTKERLTKE 63
L E ++I L L A + L+ A + + L E
Sbjct: 125 LEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAE 184

Query: 64 NAELENRNAELSKEKENLVVKIRGLENANDQLWQAKENFTKENTELASKNTVLTEKTAEL 123
A LE R AEL K E + L K +L +
Sbjct: 185 KAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTAD 244

Query: 124 KNENDRLNHQVIALNNEQGSLKQERAQLQDACGFLEETCANLEKDNQQLTDKLKKLESVQ 183
+ L + AL Q L++ + LE + L + LE
Sbjct: 245 SAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQS 304

Query: 184 KNLENTNNQLRQAREKIAEEKTELEREMARLKGLEGMEAKSNLDLHNK 231
+ L LR+ + E K +LE E +L+ + S L
Sbjct: 305 QVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRD 352



Score = 39.3 bits (91), Expect = 1e-05
Identities = 38/212 (17%), Positives = 70/212 (33%)

Query: 4 LSSTKEKLEARIDLLESEIIDLSTGIKNLVAETSKLKDANNQLRQKNDKLFTTKERLTKE 63
+ + ++I LE+ DL ++ + ++ L + L K L K
Sbjct: 104 NDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKA 163

Query: 64 NAELENRNAELSKEKENLVVKIRGLENANDQLWQAKENFTKENTELASKNTVLTEKTAEL 123
N + S + + L + LE +L +A E +T ++K L + A L
Sbjct: 164 LEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAAL 223

Query: 124 KNENDRLNHQVIALNNEQGSLKQERAQLQDACGFLEETCANLEKDNQQLTDKLKKLESVQ 183
L + N + + L+ LE A LEK + + +
Sbjct: 224 AARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKI 283

Query: 184 KNLENTNNQLRQAREKIAEEKTELEREMARLK 215
K LE L + + + L L+
Sbjct: 284 KTLEAEKAALEAEKADLEHQSQVLNANRQSLR 315



Score = 31.2 bits (70), Expect = 0.004
Identities = 29/210 (13%), Positives = 55/210 (26%)

Query: 48 QKNDKLFTTKERLTKENAELENRNAELSKEKENLVVKIRGLENANDQLWQAKENFTKENT 107
N+ T +++ R + E L +K L N L + T+E +
Sbjct: 36 NTNEVSAVATRSQTDTLEKVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELS 95

Query: 108 ELASKNTVLTEKTAELKNENDRLNHQVIALNNEQGSLKQERAQLQDACGFLEETCANLEK 167
K + +E ++ L + L LE A L
Sbjct: 96 NAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAA 155

Query: 168 DNQQLTDKLKKLESVQKNLENTNNQLRQAREKIAEEKTELEREMARLKGLEGMEAKSNLD 227
L L+ + L + + + ELE+ + ++
Sbjct: 156 RKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKT 215

Query: 228 LHNKRLANENRDLKTQNRKLEEENAKLKKE 257
L ++ A R + N
Sbjct: 216 LEAEKAALAARKADLEKALEGAMNFSTADS 245


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_00285  GPOSANCHOR    35     2e-04    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 35.0 bits (80), Expect = 2e-04
Identities = 28/172 (16%), Positives = 55/172 (31%)

Query: 46 LKNENAELLRQNGDLAVKIKNLENTNNQLRQARENWIEEKRELTTEKERLVRKNTELENR 105
N + + L + L L +A E + + + + L + LE R
Sbjct: 132 AMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEAR 191

Query: 106 NAELYAELLKEKESLTKANTELAEKIKALNAQNNELVASLADQSELDLHNKRLASENRDL 165
AEL L T + ++ A + +++ + L
Sbjct: 192 QAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTL 251

Query: 166 KTQNRKLEEENIKLKKEVKDVKERHSQLQQQNDELERLNANAERTQHDLKQQ 217
+ + LE +L+K ++ + + LE A E + DL+ Q
Sbjct: 252 EAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQ 303


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_00305  CHANNELTSX    27     0.011    Nucleoside-specific channel-forming protein Tsx signa...
		>CHANNELTSX#Nucleoside-specific channel-forming protein Tsx signature.

Length = 294

Score = 26.5 bits (58), Expect = 0.011
Identities = 11/29 (37%), Positives = 18/29 (62%)

Query: 32 LSNHFHNLGSWQDAKRDNFSEVIDNLRST 60
++ +FHN G W D + NF + ++RST
Sbjct: 254 VARYFHNGGQWADDAKLNFGDGPFSVRST 282


23  HPSA_00590  HPSA_00610  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPSA_00590  -2      9        1.524797   flagellin B
HPSA_00595  -1      9        0.576649   DNA topoisomerase I
HPSA_00600  -1      8        0.495341   hypothetical protein
HPSA_00605  0       9        0.839743   hypothetical protein
HPSA_00610  0       13       2.493307   phosphoenolpyruvate synthase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_00590  FLAGELLIN     285    1e-92    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 285 bits (730), Expect = 1e-92
Identities = 130/519 (25%), Positives = 221/519 (42%), Gaps = 18/519 (3%)

Query: 2 SFRINTNIAALTSHAVGVQNNRDLSSSLEKLSSGLRINKAADDSSGMAIADSLRSQSANL 61
+ INTN +L + ++ LSS++E+LSSGLRIN A DD++G AIA+ S L
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 GQAIRNANDAIGMVQTADKAMDEQIKILDTIKTKAVQAAQDGQTLESRRALQSDIQRLLE 121
QA RNAND I + QT + A++E L ++ +VQA + +++Q +IQ+ LE
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 ELDNIANTTSFNGQQMLSGSFSNKEFQIGAYSNTTVKASIGSTSSDKIGHVRMETSSFSG 181
E+D ++N T FNG ++LS + Q+GA T+ + +G +
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDN-QMKIQVGANDGETITIDLQKIDVKSLGLDGFNVN---- 175

Query: 182 AGMLASAAAQNLTEVGLNFKQVNGVNDYKIETVRISTSAGTGIGALSEIINRFSNTLGVR 241
+ ++ +FK V G + Y + + +G + + V
Sbjct: 176 -----GPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVN 230

Query: 242 ASYNVMATG----GTPVQSGTVRELTINGVEIGTVNDVHKNDADGRLTNAINSVKDRTGV 297
A+ + T T V + T E + K +G T V
Sbjct: 231 AANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGD-TFDYKGVTFTIDT 289

Query: 298 EASLDIQGRINLHSIDGRAISVHAASASGQVFGGGNFAGISGNEHAVIGRLTLTRTDARD 357
+ D G+++ +I+G +++ A + S D +
Sbjct: 290 KTGNDGNGKVST-TINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKT 348

Query: 358 IIVSGVNFSHVGFHSAQGVAEYTVNLRAVRGIFDANVASAAGANANGAQAETNSQGIGAG 417
S ++ +G ++ TVN + + AG + + +
Sbjct: 349 KNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINE 408

Query: 418 --VTSLKGAMIVMDMADSARTQLDKIRSDMGSVQMELVTTINNISVTQVNVKAAESQIRD 475
+ K + DSA +++D +RS +G++Q + I N+ T N+ +A S+I D
Sbjct: 409 DAAAAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIED 468

Query: 476 VDFAEESANFSKYNILAQSGSFAMAQANAVQQNVLRLLQ 514
D+A E +N SK IL Q+G+ +AQAN V QNVL LL+
Sbjct: 469 ADYATEVSNMSKAQILQQAGTSVLAQANQVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_00595  FbpA_PF05833  31     0.023    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 30.6 bits (69), Expect = 0.023
Identities = 26/88 (29%), Positives = 32/88 (36%), Gaps = 26/88 (29%)

Query: 192 TLDAYFEPHLEAQLISYKGNKLK-----AQELIDEKKAQ--------------------- 225
TLD P Q K NKLK A E + + + +
Sbjct: 372 TLDENKTPSQNVQSYYKKYNKLKKSEEAANEQLLQNEEELNYLYSVLTNINNADNYDEIE 431

Query: 226 EIKNELEKESYIISSIVKKSKKSPTPPP 253
EIK EL + YI + KSKKS T P
Sbjct: 432 EIKKELIETGYIKFKKIYKSKKSKTSKP 459


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_00605  IGASERPTASE   31     0.011    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.8 bits (69), Expect = 0.011
Identities = 44/235 (18%), Positives = 81/235 (34%), Gaps = 9/235 (3%)

Query: 136 TEQEQQKTEQERQKANKSGIELEQEQQKTSNIKTNNQIKVEQEKQKTSNTQKDLVKEQKD 195
T E +T E K +E + EQ T N ++ E + +NTQ + V +
Sbjct: 1032 TPSETTETVAENSKQESKTVE-KNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGS 1090

Query: 196 LVKEQKDLVKEQKDLVKEQKDLVKEQKDLVKKAEQNCQENHNQFFIKKVGIKGGIAIEIE 255
KE + ++ V++++ E + + + Q + Q + V + A E +
Sbjct: 1091 ETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPAREND 1150

Query: 256 AECKTPKPTKTNQTP---IQPKHL--PNSKQPRSQRGSKAQELIAYLQKELESLPYSQKA 310
+P T QP N +QP ++ + + ++ + P + +
Sbjct: 1151 PTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTES-TTVNTGNSVVENPENTTPATTQP 1209

Query: 311 IAKQVDFYKPSSIAYLELDPRDFNVTEEWQKENLKIRSKAQAKMLEMRNPQAHLS 365
KP + + NV E + RS L N A LS
Sbjct: 1210 TVNSESSNKPKNRHRRSVRSVPHNV--EPATTSSNDRSTVALCDLTSTNTNAVLS 1262


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_00610  PHPHTRNFRASE  294    8e-92    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 294 bits (753), Expect = 8e-92
Identities = 104/454 (22%), Positives = 185/454 (40%), Gaps = 71/454 (15%)

Query: 388 DLEHMNSFKEGEILVTDN-TDPDWEPCMKK-ASAVITNRGGRTCHAAIVAREIGVPAIVG 445
+ + + E +++ ++ T D K+ T+ GGRT H+AI++R + +PA+VG
Sbjct: 146 ETGSLATIAEETVIIAEDLTPSDTAQLNKQFVKGFATDIGGRTSHSAIMSRSLEIPAVVG 205

Query: 446 VSGATDSLYTGMEITVSCAEGE---------EGYVYAGIYEHEIERVELSNMQETQT--- 493
T+ + G + V EG E ++ E + + +
Sbjct: 206 TKEVTEKIQHGDMVIVDGIEGIVIVNPTEEEVKAYEEKRAAFEKQKQEWAKLVGEPSTTK 265

Query: 494 -----KIYINIGNPEKAFSFSQLPNHGVGLARMEMIILNQIKAHPLALVDLHHKKSVKEK 548
++ NIG P+ G+GL R E + +++ + P
Sbjct: 266 DGAHVELAANIGTPKDVDGVLANGGEGIGLYRTEFLYMDRDQ-LPTE------------- 311

Query: 549 NEIENLMAGYANPKDFFVKKIAEGIGMISAAFYPKPVIVRTSDFKSNEYMRMLGGSSYEP 608
E Y K++ + KPV++RT D ++ + L P
Sbjct: 312 ---EEQFEAY--------KEVVQ-------RMDGKPVVIRTLDIGGDKELSYL----QLP 349

Query: 609 NEENPMLGYRGASRYYSESYNEAFSWECEALALVREEMGLTNMKVMIPFLRTIEEGKKVL 668
E NP LG+R + F + AL N+KVM P + T+EE ++
Sbjct: 350 KELNPFLGFRAIRLCLE--KQDIFRTQLRALL---RASTYGNLKVMFPMIATLEELRQAK 404

Query: 669 EILRKNNLESGKNG------LEIYIMCELPVNVILADDFLSLFDGFSIGSNDLTQLTLGV 722
I+++ + G +E+ IM E+P + A+ F D FSIG+NDL Q T+
Sbjct: 405 AIMQEEKDKLLSEGVDVSDSIEVGIMVEIPSTAVAANLFAKEVDFFSIGTNDLIQYTMAA 464

Query: 723 DRDSELVSHVFDERNEAMLKMFKKAIEACKRHNKYCGICGQAPSDYPEVTEFLVKEGITS 782
DR +E VS+++ + A+L++ I+A K+ G+CG+ D L+ G+
Sbjct: 465 DRMNERVSYLYQPYHPAILRLVDMVIKAAHSEGKWVGMCGEMAGD-EVAIPLLLGLGLDE 523

Query: 783 ISLNPDSVIPTWNAVAKLE----KELKEHGLTAR 812
S++ S++P + + KL K + L
Sbjct: 524 FSMSATSILPARSQLLKLSKEELKPFAQKALMLD 557


24  HPSA_01235  HPSA_01270  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPSA_01235  -2      15       1.127178   neutrophil activating protein
HPSA_01240  -3      14       0.923619   putative histidine kinase sensor protein
HPSA_01245  -3      13       1.608672   hypothetical protein
HPSA_01250  -3      13       2.103770   flagellar basal body P-ring protein
HPSA_01255  -3      11       2.194712   ATP-dependent RNA helicase
HPSA_01260  -2      10       2.224716   hypothetical protein
HPSA_01265  -2      9        1.912972   hypothetical protein
HPSA_01270  -2      9        1.880679   oligopeptide permease ATPase protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_01235  HELNAPAPROT   148    8e-49    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 148 bits (374), Expect = 8e-49
Identities = 39/140 (27%), Positives = 75/140 (53%), Gaps = 1/140 (0%)

Query: 5 EILKHLQADAIVLFMKVHNFHWNVKGTDFFNVHKATEEIYEEFADMFDDLAERIVQLGHH 64
L ++ +L+ K+H FHW VKG FF +H+ EE+Y+ A+ D +AER++ +G
Sbjct: 15 NSLNTQLSNWFLLYSKLHRFHWYVKGPHFFTLHEKFEELYDHAAETVDTIAERLLAIGGQ 74

Query: 65 PLVTLSEALKLTRVKEETKTSFHSKDIFKEILGDYKHLEKEFKELSNTAEKEDDKVTVTY 124
P+ T+ E + + + + + ++ + ++ DYK + E K + AE+ D T
Sbjct: 75 PVATVKEYTEHASITDGGNET-SASEMVQALVNDYKQISSESKFVIGLAEENQDNATADL 133

Query: 125 ADDRLAKLQKSIWMLEAHLA 144
+ +++K +WML ++L
Sbjct: 134 FVGLIEEVEKQVWMLSSYLG 153


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_01240  PF06580       30     0.014    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 30.2 bits (68), Expect = 0.014
Identities = 10/71 (14%), Positives = 25/71 (35%), Gaps = 13/71 (18%)

Query: 281 IVLQNFLYNAIDAIEALEESEQ-GQVKIEAFIQNEFIVFTIIDNGKEVENKSALFEPFET 339
+++Q + N I + + Q G++ ++ N + + + G +
Sbjct: 258 MLVQTLVENGI--KHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTK------- 308

Query: 340 TKLKGNGLGLA 350
+ G GL
Sbjct: 309 ---ESTGTGLQ 316


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_01250  FLGPRINGFLGI  350    e-122    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 350 bits (899), Expect = e-122
Identities = 118/345 (34%), Positives = 189/345 (54%), Gaps = 27/345 (7%)

Query: 19 AEKIGDIASVVGVRDNQLIGYGLVIGLNGTGDK-SGSKFTMQSISNMLESVNVKISADDI 77
+I DIAS+ RDNQLIGYGLV+GL GTGD S FT QS+ ML+++ +
Sbjct: 28 TSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMRAMLQNLGITTQGGQS 87

Query: 78 KSKNVAAVMITASLPPFARQGDKIDIHISSIGDAKSIQGGTLVMTPLNAVDGNIYALAQG 137
+KN+AAVM+TA+LPPFA G ++D+ +SS+GDA S++GG L+MT L+ DG IYA+AQG
Sbjct: 88 NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLIMTSLSGADGQIYAVAQG 147

Query: 138 SITSGN-----------SNNLLSANIINGATIEREVSYDLFHKNAMVLSLKNPNFKNAIQ 186
++ SA + NGA IERE+ +VL L+NP+F A++
Sbjct: 148 ALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDSVNLVLQLRNPDFSTAVR 207

Query: 187 VQNTLNGV----FGKGAAVALDPKTIQITRPERFSMVEFLALVQEIPINYSAKNKIIVDE 242
V + +N +G A D + I + +P + +A ++ + + K++++E
Sbjct: 208 VADVVNAFARARYGDPIAEPRDSQEIAVQKPRVADLTRLMAEIENLTVETDTPAKVVINE 267

Query: 243 KSGTIVSGVDIIVHPVVVTSQDITLKITKEPLN---------NSKNAQDLDSMSLDTAHN 293
++GTIV G D+ + V V+ +T+++T+ P + D M++
Sbjct: 268 RTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQPAPFSRGQTAVQPQTDIMAMQEGSK 327

Query: 294 TLSSNGKNITIAGVMKALQKIGVSAKGIVSILQALKKSGAISAEM 338
G ++ ++ L IG+ A GI++ILQ +K +GA+ AE+
Sbjct: 328 VAIVEGPDLR--TLVAGLNSIGLKADGIIAILQGIKSAGALQAEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_01255  SECA          30     0.031    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 29.8 bits (67), Expect = 0.031
Identities = 17/63 (26%), Positives = 31/63 (49%), Gaps = 2/63 (3%)

Query: 263 IVFTRTKKEADELHQFLISKNYKSTALHGDMDQRDRRASIMAFKKNDADVLVATDVASRG 322
+V T + ++++ + L K L+ + A+I+A A V +AT++A RG
Sbjct: 453 LVGTISIEKSELVSNELTKAGIKHNVLNAKFHANE--AAIVAQAGYPAAVTIATNMAGRG 510

Query: 323 LDI 325
DI
Sbjct: 511 TDI 513


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_01270  HTHFIS        31     0.009    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.3 bits (71), Expect = 0.009
Identities = 16/50 (32%), Positives = 21/50 (42%), Gaps = 7/50 (14%)

Query: 30 VAIVGESGSGKSSIANIIMRLNPR----FKPHNGEVLFETTNLLKESEEF 75
+ I GESG+GK +A + R F N + L ESE F
Sbjct: 163 LMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRD---LIESELF 209


25  HPSA_01745  HPSA_01765  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPSA_01745  0       11       0.843527   flagellar MS-ring protein
HPSA_01750  0       11       1.135384   flagellar motor switch protein G
HPSA_01755  0       11       0.723890   flagellar assembly protein H
HPSA_01760  0       12       1.103419   1-deoxy-D-xylulose-5-phosphate synthase
HPSA_01765  -1      13       0.243458   GTP-binding protein LepA
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_01745  FLGMRINGFLIF  549    0.0      Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 549 bits (1416), Expect = 0.0
Identities = 173/582 (29%), Positives = 292/582 (50%), Gaps = 66/582 (11%)

Query: 11 VDFFIKLNKKQKIALIAAGVLITALIVFLLIYPFKEKNYVQGGYGVLFERLDPSDNALIL 70
+++ +L +I LI AG A++V ++++ K +Y LF L D I+
Sbjct: 13 LEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWA-KTPDYR-----TLFSNLSDQDGGAIV 66

Query: 71 QHLQQNQIPYKVSKDD-TILIPKDKVYEERITLASQGIPKTSKVGFEIFDTKDFGATDFD 129
L Q IPY+ + I +P DKV+E R+ LA QG+PK VGFE+ D + FG + F
Sbjct: 67 AQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQFS 126

Query: 130 QNIKLIRAIEGELSRTIESLNPILKANVHIAIPKDSVFVAKEVPPSASVMLKIKPDMKLS 189
+ + RA+EGEL+RTIE+L P+ A VH+A+PK S+FV ++ PSASV + ++P L
Sbjct: 127 EQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRALD 186

Query: 190 PTQILGIKNLIAAAVPKLTIENVKIVNENGESIGEGDILENSKELALEQLHYKQNFENIL 249
QI + +L+++AV L NV +V+++G + + + + ++L QL + + E+ +
Sbjct: 187 EGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNT--SGRDLNDAQLKFANDVESRI 244

Query: 250 ENKIVNILAPIVGGKNKVVARVNAEFDFSQKKSTKETFDPNN-----VVRSEQNLEEKKE 304
+ +I IL+PIVG N V A+V A+ DF+ K+ T+E + PN +RS Q ++
Sbjct: 245 QRRIEAILSPIVGNGN-VHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQV 303

Query: 305 GTPKKQIGGVPGVVSN-IGPVQGLKDNKEQEKYEKSQN---------------------- 341
G GGVPG +SN P + +QN
Sbjct: 304 GAGYP--GGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNE 361

Query: 342 TTNYEVGKTISEIKGEFGTLVRLNAAVVVDGKYKIALKDGANTLEYEPLSDESLQKINAL 401
T+NYEV +TI K G + RL+ AVVV+ K L DG + PL+ + +++I L
Sbjct: 362 TSNYEVDRTIRHTKMNVGDIERLSVAVVVNYK---TLADG----KPLPLTADQMKQIEDL 414

Query: 402 VKQAIGYNQNRGDDVAVSNFEFNPMAPMIDNATLSEKIMRKTQKVLGSFTPLIKYVLVFI 461
++A+G++ RGD + V N F+ + T E + Q + +++LV +
Sbjct: 415 TREAMGFSDKRGDTLNVVNSPFSAVDN-----TGGELPFWQQQSFIDQLLAAGRWLLVLV 469

Query: 462 VLFIFYKKVIVPFSERMLEVVPDEDKEVKSMFEEMDEEEDELNKLGDLRKKVEDQLGLNA 521
V +I ++K + P R +E ++ + E + E L+K L+++ +Q
Sbjct: 470 VAWILWRKAVRPQLTRRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQ----- 524

Query: 522 TFSEEEVRYEIILEKIRGTLKERPDEIAMLFKLLIKDEISSD 563
+ E++ ++IR E D + L+I+ +S+D
Sbjct: 525 -----RLGAEVMSQRIR----EMSDNDPRVVALVIRQWMSND 557


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_01750  FLGMOTORFLIG  348    e-122    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 348 bits (895), Expect = e-122
Identities = 121/338 (35%), Positives = 208/338 (61%), Gaps = 4/338 (1%)

Query: 8 KQKAQLDELSMSEKIAILLIQVGEDTTGEILRHLDIDSITEISKQIVQLNGTDKQIGAAV 67
K+ + L+ +K AILL+ +G + + ++ ++L + I ++ +I +L ++ V
Sbjct: 7 KEILDVSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNV 66

Query: 68 LEEFFAIFQSNQYINTGGLEYARELLTRTLGSEEARKVMDKLTKSLQTQKNFAYLGKIKP 127
L EF + + ++I GG++YARELL ++LG+++A +++ L +LQ+ + F ++ + P
Sbjct: 67 LLEFKELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQS-RPFEFVRRADP 125

Query: 128 QQLADFIINEHPQTIALILAHMEAPNAAETLSYFPDEMKAEISIRMANLGEISPQVVKRV 187
+ +FI EHPQTIALIL++++ A+ LS P E++ ++ R+A + SP+VV+ V
Sbjct: 126 ANILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREV 185

Query: 188 STVLENKLESLTSYK-IEVGGLRAVAEIFNRLGQKSAKTTLARIESVDNKLAGAIKEMMF 246
VLE KL SL+S GG+ V EI N +K+ K + +E D +LA IK+ MF
Sbjct: 186 ERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMF 245

Query: 247 TFEDIAKLDNFAIREILKVADKKDLSLALKTSTQDLTDKFLNNMSSRAAEQFVEEMQYLG 306
FEDI LD+ +I+ +L+ D ++L+ ALK+ + +K NMS RAA E+M++LG
Sbjct: 246 VFEDIVLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLG 305

Query: 307 AVKIKDVDVAQRKIIEIVQSLQEKG--VIQTGEEEDVI 342
+ KDV+ +Q+KI+ +++ L+E+G VI G EEDV+
Sbjct: 306 PTRRKDVEESQQKIVSLIRKLEEQGEIVISRGGEEDVL 343



Score = 31.3 bits (71), Expect = 0.006
Identities = 20/103 (19%), Positives = 41/103 (39%), Gaps = 3/103 (2%)

Query: 4 KLTPKQKAQLDELSMSEKIAILLIQVGEDTTGEILRHLDIDSITEISKQIVQLNGTDKQI 63
+ P + + IA++L + IL L + T ++++I ++ T ++
Sbjct: 122 RADPANILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEV 181

Query: 64 GAA---VLEEFFAIFQSNQYINTGGLEYARELLTRTLGSEEAR 103
VLE+ A S Y + GG++ E++ E
Sbjct: 182 VREVERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKF 224


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_01755  FLGFLIH       39     7e-06    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 39.0 bits (90), Expect = 7e-06
Identities = 45/211 (21%), Positives = 92/211 (43%), Gaps = 14/211 (6%)

Query: 44 ISQEPLEKKAIENDLIDCLLKKTDELSSHLVKLQMQFEKAQEES-KALIENAKNDGYKIG 102
++ E I + + L L +LQMQ A E+ +A I + G+K G
Sbjct: 15 LAPPQAEFVPIVEPEETIIEEAEPSLEQQLAQLQMQ---AHEQGYQAGIAEGRQQGHKQG 71

Query: 103 FKEGEEKMRNELTHSVNEEKNQLLHAITALDEKMKSSQNHLMALE----KELSAIAIDMA 158
++EG + L + E K+Q + + + Q L AL+ L +A++ A
Sbjct: 72 YQEG---LAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRLMQMALEAA 128

Query: 159 KEVILKEVEDNSQKVALALAEELLKNVLDATDIHLKVNPLDYPYLNEHLQNASKI---KL 215
++VI + ++ + + + L + L + L+V+P D +++ L + +L
Sbjct: 129 RQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGATLSLHGWRL 188

Query: 216 ESNEAISKGGVMITSSNGSLDGNLMERFRTL 246
+ + GG +++ G LD ++ R++ L
Sbjct: 189 RGDPTLHPGGCKVSADEGDLDASVATRWQEL 219


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_01765  TCRTETOQM     145    5e-39    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 145 bits (367), Expect = 5e-39
Identities = 100/437 (22%), Positives = 177/437 (40%), Gaps = 85/437 (19%)

Query: 11 NIRNFSIIAHIDHGKSTLADCLIFECNAIS---NREMKSQVMDTMDIEKERGITIKAQSV 67
I N ++AH+D GK+TL + L++ AI+ + + + D +E++RGITI+
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 68 RLNYTFKGEDYVLNLIDTPGHVDFSYEVSRSLCSCEGALLVVDATQGVEAQTIANTYIAL 127
+F+ E+ +N+IDTPGH+DF EV RSL +GA+L++ A GV+AQT +
Sbjct: 62 ----SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALR 117

Query: 128 DNNLEILPVINKIDLPNANVLEVKQDIEDTIGIDCSSANEVSAKAKIGIKD--------- 178
+ + INKID ++ V QDI++ + + +V + + +
Sbjct: 118 KMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDT 177

Query: 179 -------LLEKIITTIPAPSGDASAPLKALIYD-------------------------SW 206
LLEK ++ + + ++ +
Sbjct: 178 VIEGNDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNK 237

Query: 207 F--------------------DNYLGALALVRIMDGNINTEQEILVMGTGKKHGVLGLYY 246
F LA +R+ G ++ + + K + +Y
Sbjct: 238 FYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI-KITEMYT 296

Query: 247 PNPLKKIPTKSLECGEIGIV---SLGLKSVTDIAVGDTLTDAKNPTPKPIEGFMPAKPFV 303
+ GEI I+ L L SV +GDT P + IE P +
Sbjct: 297 SINGELCKIDKAYSGEIVILQNEFLKLNSV----LGDTKLL---PQRERIEN---PLPLL 346

Query: 304 FAGLYPIETDRFEDLREALLKLQLNDCALNFEPESSVALGFGFRVGFLGLLHMEVIKERL 363
+ P + + E L +ALL++ +D L + +S+ + FLG + MEV L
Sbjct: 347 QTTVEPSKPQQREMLLDALLEISDSDPLLRYYVDSATH---EIILSFLGKVQMEVTCALL 403

Query: 364 EREFSLNLIATAPTVVY 380
+ ++ + + PTV+Y
Sbjct: 404 QEKYHVEIEIKEPTVIY 420



Score = 31.4 bits (71), Expect = 0.013
Identities = 15/82 (18%), Positives = 28/82 (34%), Gaps = 2/82 (2%)

Query: 407 IKEPFVRATIITPSEFLGNLMQLLNHKRGIQEKMEYLNQSRVMLTYSLPSNEIVMDFYDK 466
+ EP++ I P E+L + L + V+L+ +P+ I ++
Sbjct: 535 LLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCI-QEYRSD 592

Query: 467 LKSCTKGYASFDYEPIENREAN 488
L T G + E
Sbjct: 593 LTFFTNGRSVCLTELKGYHVTT 614


26  HPSA_02665  HPSA_02715  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPSA_02665  -1      12       -0.685075  signal peptidase I (lepB)
HPSA_02670  0       13       -0.595851  bifunctional 5,10-methylene-tetrahydrofolate
HPSA_02675  0       14       -1.189252  hypothetical protein
HPSA_02680  0       14       -0.453060  hypothetical protein
HPSA_02685  -1      15       0.164494   neuraminidase (sialidase)
HPSA_02690  -2      16       -1.912498  dihydroorotase
HPSA_02695  -1      15       -2.304244  hypothetical protein
HPSA_02710  -2      15       -2.336011  flagellar motor switch protein
HPSA_02715  -1      13       -1.033886  endonuclease III
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_02665  PREPILNPTASE  29     0.026    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 28.6 bits (64), Expect = 0.026
Identities = 10/53 (18%), Positives = 22/53 (41%), Gaps = 10/53 (18%)

Query: 14 WVGTIVIVLLVIFFIAQAFIIPSRSMVGTLYEGDMLFVKKFSYGIPIPKIPWI 66
W+G L ++ ++ S+VG ++ ++ PIP P++
Sbjct: 225 WLG--WQALPIVLLLS--------SLVGAFMGIGLILLRNHHQSKPIPFGPYL 267


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_02680  TYPE3IMSPROT  29     0.011    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 29.0 bits (65), Expect = 0.011
Identities = 19/64 (29%), Positives = 30/64 (46%), Gaps = 4/64 (6%)

Query: 87 LQSYSVMLFFNVLLLIDILGFLPFSIYHHFMASLIFSALFCGSLFLSSPLLGVIALMALS 146
L Y F ++L+ +LPFS S + + +L PLL V ALMA++
Sbjct: 45 LSDYYFEHFSKLMLIPAEQSYLPFSQ----ALSYVVDNVLLEFFYLCFPLLTVAALMAIA 100

Query: 147 SSLL 150
S ++
Sbjct: 101 SHVV 104


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_02690  TYPE3OMGPROT  30     0.018    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 29.9 bits (67), Expect = 0.018
Identities = 24/74 (32%), Positives = 33/74 (44%), Gaps = 11/74 (14%)

Query: 156 HSFPKLTIIIEHLSD---WRSIALIEKHDNLYATLTLHHIS---MTLDDLLGGSLNPHCF 209
+L II + D +AL D LT+ IS TL+ LLGGS
Sbjct: 496 RRTVRLFIIEPRIIDEGIAHHLALGNGQDLRTGILTVDEISNQSTTLNKLLGGSQ----- 550

Query: 210 CKPLIKTKKDQERL 223
C+PL K ++ Q+ L
Sbjct: 551 CQPLNKAQEVQKWL 564


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_02695  TONBPROTEIN   50     3e-09    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 50.0 bits (119), Expect = 3e-09
Identities = 24/57 (42%), Positives = 28/57 (49%)

Query: 83 APKPTLAGPQKPPTPPTPPKPPTPPKPIEKPKPEPKPKPKPEPKKPNHKHKALKKVE 139
P P +P P P P P IEKPKP+PKPKPKP K + +K VE
Sbjct: 62 QPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVE 118



Score = 46.5 bits (110), Expect = 4e-08
Identities = 25/70 (35%), Positives = 32/70 (45%), Gaps = 8/70 (11%)

Query: 84 PKPTLAGPQKPPTPPTPPKPPTPPKPIEKPKPEPKPKPKPEPKKPNHKHKALKKVEKVEE 143
P + P +P P P P P P E P KPKPKP+PK K V+KV+E
Sbjct: 57 PPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKP--------KPVKKVQE 108

Query: 144 KKVAEEKKEE 153
+ + K E
Sbjct: 109 QPKRDVKPVE 118



Score = 40.0 bits (93), Expect = 5e-06
Identities = 40/214 (18%), Positives = 75/214 (35%), Gaps = 34/214 (15%)

Query: 98 PTPPKPPTPPKPIEKPKPEPKPKPKPEPKKPNHKHKALKKVEKVEEKKVAEEKKEEKKVV 157
P +PP +P +P EP+P+P+P P+ P +E V EK + K
Sbjct: 52 PADLEPPQAVQPPPEPVVEPEPEPEPIPEPP-------------KEAPVVIEKPKPKPKP 98

Query: 158 EQKVEHKKVEEKKPAKKEFDPNQLSFLPKEVAPPRQENNKGLDNQTKRDIDELYGEEFGD 217
+ K K E+ K K + P N T +
Sbjct: 99 KPKPVKKVQEQPKRDVKPVESR----------PASPFENTAPARLTSSTATAATSKPVTS 148

Query: 218 LGTAEKDFIRNNLRDIGRITQKYLEYPQVAAYLGQDGTNAVEFYLHPNGDITDLKIIVGS 277
+ + + RN + YP A L +G V+F + P+G + +++I+
Sbjct: 149 VASGPRALSRNQPQ-----------YPARAQALRIEGQVKVKFDVTPDGRVDNVQILSAK 197

Query: 278 EYKMLDDNTLKTIQIAYKDYPRPKTKTLIRIRVR 311
M + ++ + +P + ++ I +
Sbjct: 198 PANMFEREVKNAMRRWRYEPGKPGSGIVVNILFK 231



Score = 39.2 bits (91), Expect = 9e-06
Identities = 16/54 (29%), Positives = 21/54 (38%)

Query: 74 QDPSKNNPGAPKPTLAGPQKPPTPPTPPKPPTPPKPIEKPKPEPKPKPKPEPKK 127
Q +P P P P PKP KPKP+P K + +PK+
Sbjct: 59 QAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKR 112



Score = 35.3 bits (81), Expect = 2e-04
Identities = 15/56 (26%), Positives = 23/56 (41%)

Query: 74 QDPSKNNPGAPKPTLAGPQKPPTPPTPPKPPTPPKPIEKPKPEPKPKPKPEPKKPN 129
+P P+P P++ P PKP PKP K + +PK +P +
Sbjct: 65 PEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESR 120



Score = 33.0 bits (75), Expect = 0.001
Identities = 13/52 (25%), Positives = 17/52 (32%)

Query: 75 DPSKNNPGAPKPTLAGPQKPPTPPTPPKPPTPPKPIEKPKPEPKPKPKPEPK 126
+P P + P P PKP K E+PK + KP
Sbjct: 72 EPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESRPAS 123


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_02710  FLGMOTORFLIN  101    9e-31    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 101 bits (252), Expect = 9e-31
Identities = 25/77 (32%), Positives = 47/77 (61%)

Query: 34 LICDYKNLLDMEIVFSAELGSTQIPLLQILRFEKGSVIDLQKPAGESVDTFVNGRVIGKG 93
+ D ++D+ + + ELG T++ + ++LR +GSV+ L AGE +D +NG +I +G
Sbjct: 50 AMQDIDLIMDIPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQG 109

Query: 94 EVMVFERNLAIRLNEIL 110
EV+V +R+ +I+
Sbjct: 110 EVVVVADKYGVRITDII 126


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_02715  OMS28PORIN    27     0.041    OMS28 porin signature.
		>OMS28PORIN#OMS28 porin signature.

Length = 257

Score = 27.4 bits (60), Expect = 0.041
Identities = 17/60 (28%), Positives = 34/60 (56%), Gaps = 3/60 (5%)

Query: 72 LASLEEVKEIIKSVSYFNNKSKHLINMAQKVVRDFNGVIPSTQKELMSLDGVGQKTANVV 131
A +E+VKE + + +++ + AQKV+ NG+ PS + ++++ V + +NVV
Sbjct: 181 FAKVEQVKETLMASERALDET---VQEAQKVLNMVNGLNPSNKDQVLAKKDVAKAISNVV 237


27  HPSA_02830  HPSA_02890  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPSA_02830  -2      13       1.031787   flagellin A
HPSA_02835  -3      12       1.319367   3-methyladenine DNA glycosylase
HPSA_02840  -2      12       1.524963   hypothetical protein
HPSA_02845  1       11       0.400791   uroporphyrinogen decarboxylase
HPSA_02850  1       11       0.103574   outer-membrane protein of the hefABC efflux
HPSA_02855  2       9        0.133915   putative efflux transporter
HPSA_02860  1       8        0.007637   acriflavine resistance protein AcrB
HPSA_02865  1       10       -0.854010  hypothetical protein
HPSA_02870  0       9        -0.593720  putative vacuolating cytotoxin VacA
HPSA_02875  -2      12       -0.645704  ABC-type multidrug transport system, permease
HPSA_02880  -2      11       -0.470122  hypothetical protein
HPSA_02885  -2      11       -0.665066  NAD-dependent DNA ligase LigA
HPSA_02890  -1      13       -0.487219  chemotaxis protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_02830  FLAGELLIN     241    1e-75    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 241 bits (615), Expect = 1e-75
Identities = 125/518 (24%), Positives = 208/518 (40%), Gaps = 22/518 (4%)

Query: 2 AFQVNTNINAMNAHVQSALTQNALKTSLERLSSGLRINKAADDASGMTVADSLRSQASSL 61
A +NTN ++ +Q++L +++ERLSSGLRIN A DDA+G +A+ S L
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 GQAIANTNDGMGIIQVADKAMDEQLKILDTVKVKATQAAQDGQTTESRKAIQSDIVRLIQ 121
QA N NDG+ I Q + A++E L V+ + QA + K+IQ +I + ++
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 GLDNIGNTTTYNGQALLSGQFTNKEFQVGAYSNQSIKASIGSTTSDKIGQVRI-ATGALI 180
+D + N T +NG +LS + QVGA ++I + +G G
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDN-QMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKE 179

Query: 181 TASGDISLTFKQVDGVNDVTLESVKISSSAGTGIGVLAEVINKNSNQTGVKAYANVITTS 240
GD+ +FK V G + + + K +G V ++ V A +TT
Sbjct: 180 ATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTD 239

Query: 241 DVAVQSGNLSNLTLNGIHLGNIADIKKNDSDGRLVAAINAVTSETGVEAYTDQNGRLNLR 300
D N + K A A+ + + + +
Sbjct: 240 DAE-----------NNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTID 288

Query: 301 SLDGRGIEIKTDSTNGVPSALTMVNGGQDLTKGSTNYGRLSLTRLDAKSINV------VS 354
+ G K +T V S + +N +
Sbjct: 289 TKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKT 348

Query: 355 ASDSQHLGFTAIGFGESQVAETTVNLRDVTGNFNANVKSASGANYNAVIASGNQSL---G 411
++S L ++ TVN + T N + + +G + S
Sbjct: 349 KNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINE 408

Query: 412 AGVTTLRGAMVVIDIAESAMKMLDKVRSDLGSVQNQMISTVNNISITQVNVKAAESQIRD 471
+ + +SA+ +D VRS LG++QN+ S + N+ T N+ +A S+I D
Sbjct: 409 DAAAAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIED 468

Query: 472 VDFAEESANFNKNNILAQSGSYAMSQANTVQQNILRLL 509
D+A E +N +K IL Q+G+ ++QAN V QN+L LL
Sbjct: 469 ADYATEVSNMSKAQILQQAGTSVLAQANQVPQNVLSLL 506


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_02850  RTXTOXIND     30     0.020    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.2 bits (68), Expect = 0.020
Identities = 15/113 (13%), Positives = 40/113 (35%), Gaps = 16/113 (14%)

Query: 208 LARMIALQKKLEQIQTDIKRVTKLYDKGLTTIDDL-----QSLKAQGNLSEY--DILDMQ 260
LAR+ + ++ + + L K + + ++A L Y + ++
Sbjct: 220 LARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIE 279

Query: 261 FALEQNRLTLEYLTNLNVKNLKKTTIDAPNLQLRERQD-LVSLREQISALRYQ 312
+ + + +T K +D +LR+ D + L +++ +
Sbjct: 280 SEILSAKEEYQLVTQ----LFKNEILD----KLRQTTDNIGLLTLELAKNEER 324


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_02855  RTXTOXIND     49     5e-09    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 49.1 bits (117), Expect = 5e-09
Identities = 22/69 (31%), Positives = 34/69 (49%)

Query: 40 SAGIVDSIKVTEGSVVKKGDVLLLLYNQEKQAQSDSTEQQLIFAKKQYQRYSKIGGAVDK 99
IV I V EG V+KGDVLL L +A + T+ L+ A+ + RY + +++
Sbjct: 103 ENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIEL 162

Query: 100 NTLESYEFN 108
N L +
Sbjct: 163 NKLPELKLP 171



Score = 31.3 bits (71), Expect = 0.003
Identities = 21/152 (13%), Positives = 48/152 (31%), Gaps = 25/152 (16%)

Query: 70 QAQSDSTEQQLIFAKKQYQR--YSKIGGAVDKNTLESYEFNYRRLESDYAYSIAVLNKTI 127
+++ S +++ + ++ K+ D + + E ++
Sbjct: 279 ESEILSAKEEYQLVTQLFKNEILDKLRQTTDN--IGLLTLELAKNEER-------QQASV 329

Query: 128 LRAPFDGVIASKNIQVGEGVSANNTVLLRLVSHARKLVIE--FDSKYINAVKVG------ 179
+RAP + + GV L+ +V L + +K I + VG
Sbjct: 330 IRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIK 389

Query: 180 -DTYTYSIDGDSNQHEAKITKIYP--TVDENT 208
+ + Y+ G K+ I D+
Sbjct: 390 VEAFPYTRYGYL---VGKVKNINLDAIEDQRL 418


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_02860  ACRIFLAVINRP  892    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 892 bits (2307), Expect = 0.0
Identities = 291/1040 (27%), Positives = 521/1040 (50%), Gaps = 42/1040 (4%)

Query: 1 MYKTAINRPITTLMFALAIVFFGTMGFKKLSVALFPKIDLPTVVVTTTYPGASAEIIESK 60
M I RPI + A+ ++ G + +L VA +P I P V V+ YPGA A+ ++
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTDKIEEAVMGIDGIKKVTSTSSKNVSIVV-IEFELEKPNEEALNDVVNKISSVR-FDDS 118
VT IE+ + GID + ++STS S+ + + F+ + A V NK+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 119 NIKKPSINKFDTDSQAIISLFVSSSSVPAT--TLNDYAKNTIKPMLQKINGVGGVQLNGF 176
+++ I+ + S ++ S + T ++DY + +K L ++NGVG VQL G
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG- 179

Query: 177 RERQIRIYADPTLMNKYNLTYADLFSTLKAENVEIDGGRIVNS------QRELSILINAN 230
+ +RI+ D L+NKY LT D+ + LK +N +I G++ + Q SI+
Sbjct: 180 AQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 231 SYSVADVEKIQV-----GNHVRLGDIAKIEIGLEEDNTFASFKDKPGVILEIQKIAGANE 285
+ + K+ + G+ VRL D+A++E+G E N A KP L I+ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 286 IEIVDRVYEALKRIQAISP-NYEIRPFLDTTSYIRTSIEDVKFDLVLGAILAVLVVFAFL 344
++ + L +Q P ++ DTT +++ SI +V L +L LV++ FL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 345 RNGTITLVSAISIPISIMGTFALIQWMGFSLNMLTMVALTLAIGIIIDDAIVVIENIHK- 403
+N TL+ I++P+ ++GTFA++ G+S+N LTM + LAIG+++DDAIVV+EN+ +
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 404 KLEMGMSKRKASYEGVREIGFALVAISAMLLSVFVPIGNMKGIIGRFFQSFGITVALAIA 463
+E + ++A+ + + +I ALV I+ +L +VF+P+ G G ++ F IT+ A+A
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 464 LSYVVVVTIIPMVSSVVVNPRHS-------RFYVWSEPFFKALESYYTKLLQWVLNHKLI 516
LS +V + + P + + ++ P + F+ W F ++YT + +L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 517 IFIAVVLVFVGSLFVASKLGMDFMLKEDRGRFQVWLKAKPGVSIDY----MTQKSKIFQK 572
+ L+ G + + +L F+ +ED+G F ++ G + + + Q + + K
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 573 AIEKHAEVEFTTLQVGY-GTSQNPFRAKIFVQLKPLEERKKEHQLGQFELMRALKKELRS 631
+ + E FT + G +QN FV LKP EER + + + RA K EL
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNA--GMAFVSLKPWEERNGDENSAEAVIHRA-KMELGK 656

Query: 632 LPEAKDLESINLSEVALIGGGGDSSPFQTYVFSHSQEAVDKSVANLKKFLLESPELKGKV 691
+ + + N+ + G ++ F + + D + L + + +
Sbjct: 657 IRDGF-VIPFNMPAIV---ELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASL 712

Query: 692 ESYHTSTSESQPQLQLKILRQNANKYGVSAQTIGSVVSSAFSGTSQASVFKEDGKEYDMI 751
S + E Q +L++ ++ A GVS I +S+A G + + F + G+ +
Sbjct: 713 VSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGG-TYVNDFIDRGRVKKLY 771

Query: 752 IRVPDDKRVSVEDIKRLQVRNKYDKLMFLDALVEITETQSPSSISRYNRQRSVTVLAEPN 811
++ R+ ED+ +L VR+ +++ A + RYN S+ + E
Sbjct: 772 VQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEA- 830

Query: 812 RSAGVSLGEILTQVSKNTKEWLVEGANYRFTGEADNAKESNGEFLIALATAFVLIYMILA 871
+ G S G+ + + +N L G Y +TG + + S + +A +FV++++ LA
Sbjct: 831 -APGTSSGDAMALM-ENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLA 888

Query: 872 ALYESILEPFIIMVTMPLSFSGAFFALGLVHQPLSMFSMIGLILLIGMVGKNATLLIDVA 931
ALYES P +M+ +PL G A L +Q ++ M+GL+ IG+ KNA L+++ A
Sbjct: 889 ALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFA 948

Query: 932 NE-ERKKGLNIQEAILFAGKTRLRPILMTTIAMVCGMLPLALASGDGSAMKSPIGIAMSG 990
+ K+G + EA L A + RLRPILMT++A + G+LPLA+++G GS ++ +GI + G
Sbjct: 949 KDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMG 1008

Query: 991 GLMISMVLSLLIVPVFYRLL 1010
G++ + +L++ VPVF+ ++
Sbjct: 1009 GMVSATLLAIFFVPVFFVVI 1028


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_02870  VACCYTOTOXIN  275    8e-77    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 275 bits (705), Expect = 8e-77
Identities = 102/397 (25%), Positives = 181/397 (45%), Gaps = 14/397 (3%)

Query: 2799 AGNHSIMWLNELFVAKGGNPLFAPYYLQDNPTEHIVTLMKDITSALGMLSKPNLKNNSTD 2858
+G L L + + +A + I + T+ L ++ K +
Sbjct: 904 SGAQGRDLLQTLLI-DSHDAGYARTMIDATSANEITKQLNTATTTLNNIASLEHKTSGLQ 962

Query: 2859 VLQLNTYTQQMGRLAKLSNFASFDSTDFSERLSGLKNQRFADAIPNAMDVILKYSQRDKL 2918
L L+ RL LS + F++RL LK+QRFA + +A +V+ +++ + +
Sbjct: 963 TLSLSNAMILNSRLVNLSRRHTNHIDSFAKRLQALKDQRFAS-LESAAEVLYQFAPKYEK 1021

Query: 2919 KNNLWATGVGGVSFVENGTGTLYGVNVGYDRFIKG---VIVGGYAAYGYSGFYER--ITN 2973
N+WA +GG S G +LYG + G D ++ G IVGG+ +YGYS F + N
Sbjct: 1022 PTNVWANAIGGTSLNSGGNASLYGTSAGVDAYLNGEVEAIVGGFGSYGYSSFSNQANSLN 1081

Query: 2974 SKSDNVDVGLYARAFIKKSELTFSINETWGANKTQISSTDTLLSMINQSYKYSTWTTNAK 3033
S ++N + G+Y+R F + E F G++++ ++ LL +NQSY Y ++ +
Sbjct: 1082 SGANNTNFGVYSRIFANQHEFDFEAQGALGSDQSSLNFKSALLRDLNQSYNYLAYSAATR 1141

Query: 3034 VNYGYDFMFKNKSVIIKPQISLGYYYIGMTGLEGVMDNSLYNQFKANADPSKKSVLTIDF 3093
+YGYDF F ++++KP + + Y ++G T + + + S + +
Sbjct: 1142 ASYGYDFAFFRNALVLKPSVGVSYNHLGSTNFKS----NSNQKVALKNGASSQHLFNASA 1197

Query: 3094 ALENRHYFNTNSYFYAIGGIGRDLLVRSMGDKLVRFIGNNTLSYRKGELYNTFASITTGG 3153
+E R+Y+ SYFY G+ ++ + V + R NT A + GG
Sbjct: 1198 NVEARYYYGDTSYFYMNAGVLQEFANFGSSNA-VSLNTFKVNATRNP--LNTHARVMMGG 1254

Query: 3154 EVRLFKSFYANAGVGARFGLDYKMINITGNIGMRLAF 3190
E++L K + N G L + + N+GMR +F
Sbjct: 1255 ELKLAKEVFLNLGFVYLHNLISNIGHFASNLGMRYSF 1291



Score = 36.5 bits (84), Expect = 0.002
Identities = 56/360 (15%), Positives = 112/360 (31%), Gaps = 48/360 (13%)

Query: 117 YIGTKNASATPNNNSIWFGEKGYIGFITGVFKAKDIFITGNVGSGNEWKTGGGAILVFES 176
I T + N N + G+ G+ + ++G+ + W++ G I+
Sbjct: 275 TINTSKVTGEVNFNHLTVGDHNAAQ--AGIIASNKT----HIGTLDLWQSAGLNIIAPPE 328

Query: 177 SN-----ELTTNGAYFQNNRAGTQNSWMNLISNHSVHLTNTNFSNQTPHGGFNVMGRKIT 231
+ N + Q S N + ++ N+ +
Sbjct: 329 GGYKDKPNDKPSNTTQNNAKNDKQESSQNNSNTQVINPPNSAQKTEIQP----------- 377

Query: 232 YNSGIINGGNFGFDNVDSHGTTTISGVTFNDNGALTYKGGNSIGGSITFTNSNINHYKLN 291
+ +I+G G T ++ N N T + G S+T ++++ K
Sbjct: 378 --TQVIDGPFAG------GKNTVVNINRINTNADGTIRVG-GFKASLTTNAAHLHIGKGG 428

Query: 292 LNANSITFNNSALGSMPNGNANTVGEAYILNASNITFNNLTFNGGWFVFNRPDAHVNFQG 351
+N ++ S L GN G + N G A+ F+
Sbjct: 429 INLSNQASGRSLLVENLTGNITVDGPLRV----NNQVGGYALAGS-------SANFEFKA 477

Query: 352 ITTINNPTSPFINMTGKVTINPNAIFNIQNYTPTIGSAYTLFSMKNGNITYNGVSDLWNI 411
T N T+ F N + I + F+ + ++GV++ NI
Sbjct: 478 GTDTKNGTATFNNDISLGRFVNLKVDAHTANFKGIDTGNGGFNT----LDFSGVTNKVNI 533

Query: 412 IRLKNTQATKDNSKNAASSNNTHTHYVTYNLGGTLYHFRQIFSPDSIILQSIHYGANNIH 471
+L A+ + + + N ++G + I S I + G +I+
Sbjct: 534 NKL--ITASTNVAVKNFNINELVVKTNGVSVGEYTHFSEDIGSQSRINTVRLETGTRSIY 591



Score = 34.2 bits (78), Expect = 0.012
Identities = 19/97 (19%), Positives = 31/97 (31%), Gaps = 4/97 (4%)

Query: 702 YSFEGANNAFNSSKFNHGSFHFSHAEQTNAFNNNSFSGGSFSFNAKQVDFSGNSFNGGVF 761
YS + FNH + +A Q +N G+ + + + G +
Sbjct: 273 YSTINTSKVTGEVNFNHLTVGDHNAAQAGIIASNKTHIGTLDLW-QSAGLNIIAPPEGGY 331

Query: 762 DFNNTPKANFNNDTFNVNNQFKLNGSQTTFSFNKVVF 798
P +N T N K SQ S +V+
Sbjct: 332 KDK--PNDKPSNTTQNNAKNDKQESSQNN-SNTQVIN 365



Score = 31.9 bits (72), Expect = 0.049
Identities = 51/273 (18%), Positives = 83/273 (30%), Gaps = 35/273 (12%)

Query: 983 FKANQIDITGTIRSGNGAKTGGG-----AILVFNAQERLNIA-NANLNNDKAGLQDSWMN 1036
F A I I + N +G G +L A E + NA ++ + N
Sbjct: 190 FNAKNILIDNFLEINNRVGSGAGRKASSTVLTLQASEGITSRENAEISLYDGATLNLASN 249

Query: 1037 FIVNNGNVNATNANFSNQTPHGGFNL----KADNITWNKGSVSGGGSFGVDNANSHGTTT 1092
+ GNV + ++ K G + + T
Sbjct: 250 SVKLMGNVWMGRLQYVGAYLAPSYSTINTSKVTGEVNFNHLTVGDHNAAQAGIIASNKTH 309

Query: 1093 ISGVTFND----NGTLIYKGGENDAGNSLTLENNTFNSYNINARVQNLIFNNNSFNGGSY 1148
I + N +GG D N N T N+ + + NNS
Sbjct: 310 IGTLDLWQSAGLNIIAPPEGGYKDKPND-KPSNTTQNNAKND---KQESSQNNSNTQVIN 365

Query: 1149 SFNDTKNTTFK-----------GTNTLINSDPF-SRLQGSIAIDHNSVFNIERNLTNNTT 1196
N + T + G NT++N + + G+I F +LT N
Sbjct: 366 PPNSAQKTEIQPTQVIDGPFAGGKNTVVNINRINTNADGTI---RVGGFK--ASLTTNAA 420

Query: 1197 YTLLSGDSIKYNNQILADNVFSKNLWNLIHYDG 1229
+ + I +NQ ++ +NL I DG
Sbjct: 421 HLHIGKGGINLSNQASGRSLLVENLTGNITVDG 453


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_02880  LCRVANTIGEN   30     0.001    Low calcium response V antigen signature.
		>LCRVANTIGEN#Low calcium response V antigen signature.

Length = 326

Score = 30.0 bits (67), Expect = 0.001
Identities = 15/33 (45%), Positives = 20/33 (60%)

Query: 16 KRKRLLTELAELEAEIKVGSERKSSFNISLSPS 48
R +L ELAEL AE+K+ S ++ N LS S
Sbjct: 149 ARSKLREELAELTAELKIYSVIQAEINKHLSSS 181


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_02890  HTHFIS        54     2e-10    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 54.5 bits (131), Expect = 2e-10
Identities = 24/111 (21%), Positives = 46/111 (41%), Gaps = 8/111 (7%)

Query: 194 ILIAEDSLSALKTLEKIVQTLELRYLAFPNGRELLNYLYEKEHYEKVGVVITDLEMPNIS 253
IL+A+D + L + + N L ++ +V+TD+ MP+ +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAA----GDGDLVVTDVVMPDEN 61

Query: 254 GFEVLKTIKA-DPKTNHIPVIINSSMSSDSNRQLAQSLEADGFVVKSNILE 303
F++L IK P +PV++ S+ ++ A A ++ K L
Sbjct: 62 AFDLLPRIKKARP---DLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLT 109


28  HPSA_04860  HPSA_04895  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPSA_04860  -2      9        -0.092252  virulence associated protein D (vapD)
HPSA_04865  -2      9        0.341068   hypothetical protein
HPSA_04870  -3      9        0.407524   cobalt-zinc-cadmium resistance protein
HPSA_04875  -2      10       -0.557663  putative cobalt-zinc-cadmium resistance protein,
HPSA_04880  -2      11       -0.764546  hypothetical protein
HPSA_04885  -3      10       -0.365154  glycyl-tRNA synthetase subunit beta
HPSA_04890  -1      10       0.864164   phosphoglyceromutase
HPSA_04895  1       11       0.541416   aspartyl/glutamyl-tRNA amidotransferase subunit
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_04860  PF04605       144    1e-48    Virulence-associated protein D (VapD)
		>PF04605#Virulence-associated protein D (VapD)

Length = 125

Score = 144 bits (365), Expect = 1e-48
Identities = 30/90 (33%), Positives = 45/90 (50%)

Query: 3 AVTFDLDTNCLNENGVNLSKVYSDIRKFMEQHGFKWQQGSVYFGDETINAVTCVATVQIL 62
A+ FDL T L + + + YS I+KFM ++GF+ +Q S Y E IN + V L
Sbjct: 7 AINFDLSTKSLEKYFKDTREPYSLIKKFMLENGFEHRQYSGYTSKEPINERRVIRIVNKL 66

Query: 63 AKQIPCFANCVKDVRMLKIEENNDLMPAIK 92
K+ CVK+ + +I E L I+
Sbjct: 67 TKKFTWLGECVKEFDITEIGEQYSLKETIQ 96


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_04870  ACRIFLAVINRP  752    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 752 bits (1943), Expect = 0.0
Identities = 225/1044 (21%), Positives = 466/1044 (44%), Gaps = 42/1044 (4%)

Query: 5 IIEFSLRQRVIVIVCAILILFFGTYSFINTPVDAFPDISPTQVKIILKLPGSSPEEMENN 64
+ F +R+ + V AI+++ G + + PV +P I+P V + PG+ + +++
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 65 IVRPLELELLGLKGQKSLRSVSKYSIS-DITIDFDDSVDIYLARNIVNERLASVMKDLPV 123
+ + +E + G+ + S S + S IT+ F D +A+ V +L LP
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 124 GVEGGMAPIVTPLSDIFMF----TIDGNITEIEKRQLLDFVIRPQLRMISGVADVNSIGG 179
V+ + S M + + T+ + + ++ L ++GV DV G
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 180 FSKAFVIVPDFNDMARLGVSISDLEAAVRVNLRNSGAGRVDR----DGETFLVKI--QTA 233
A I D + + + ++ D+ ++V AG++ G+ I QT
Sbjct: 181 -QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 234 SLSLEDIGKITV--STNLGHLHIKDFAKVISQSRTRLGFVTKDGIGETTEGLVLSLKEAN 291
+ E+ GK+T+ +++ + +KD A+V +G + AN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARING-KPAAGLGIKLATGAN 298

Query: 292 TKEIIAQVYQKLEELKPFLPNGVFINVFYDRSEFTQKAIATVSKTLIEAIVLIIITLFLF 351
+ + KL EL+PF P G+ + YD + F Q +I V KTL EAI+L+ + ++LF
Sbjct: 299 ALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLF 358

Query: 352 LGNLRASVAVGVILPLSLSVAFIFIKLNNLTLNLMSLGGLIIAIGMLIDSAVVVVENAFE 411
L N+RA++ + +P+ L F + ++N +++ G+++AIG+L+D A+VVVEN E
Sbjct: 359 LQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENV-E 417

Query: 412 TLSANTKTTKLHAIYRSCKEIAVSVVSGVVIIIVFFVPILTLQGLEGKMFRPLAQSIVYA 471
+ K A +S +I ++V +++ F+P+ G G ++R + +IV A
Sbjct: 418 RVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSA 477

Query: 472 LLGTLVLSITIIPVVSSLVLK--ATPHSET---FLTRFLNKIYAPLLDFFVHNPKKVI-- 524
+ ++++++ + P + + +LK + H E F F N + ++ + ++ K++
Sbjct: 478 MALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWF-NTTFDHSVNHYTNSVGKILGS 536

Query: 525 ----LGAFVFLIA-SLSLFPFVGKNFMPALDEGDVVLSVETTPSISLDQSRDLMLNIEST 579
L + ++A + LF + +F+P D+G + ++ + ++++ ++ +
Sbjct: 537 TGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDY 596

Query: 580 IKKHVKEVKTIVARTGSDELGLDLGGLNQTDTFISFIPKKEWSVKTKDELL-DKILDALK 638
K+ K V G G Q + ++F+ K W + DE + ++ K
Sbjct: 597 YLKNEKANVESVFTVN----GFSFSGQAQ-NAGMAFVSLKPWEERNGDENSAEAVIHRAK 651

Query: 639 -DFKGINFSFTQPIEM-RISEMLTGVRGDLA-VKIFGDDISALNELSFQIA-QALKGIKG 694
+ I F P M I E+ T D + G AL + Q+ A +
Sbjct: 652 MELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPAS 711

Query: 695 SSEVLTTLNEGVNYLYVTPNKEAMANVGISSDEFSKFLKSALEGLIVDVIPTGISRTPVM 754
V E + ++E +G+S + ++ + +AL G V+ +
Sbjct: 712 LVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLY 771

Query: 755 IRQESDFASSITKIKSLALTSKYGVLVPITSIAKIEEVDGPVSIVREDSMRMSVVRSNVV 814
++ ++ F + L + S G +VP ++ V G + R + + ++
Sbjct: 772 VQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAA 831

Query: 815 GRDLNSFVEEAKKVISHNIKLPPSYYITYGGQFENQQRANKRLSTVIPLSILAIFFILFF 874
+ + + KLP + G ++ + + ++ +S + +F L
Sbjct: 832 PGTSSGDAMALMENL--ASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAA 889

Query: 875 TFKSIPLALLILLNIPFAVTGGLIALFVVGEYISVPASVGFIALFGIAVLNGVVMIGYFK 934
++S + + ++L +P + G L+A + + V VG + G++ N ++++ + K
Sbjct: 890 LYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAK 949

Query: 935 ELLL-QGKSIEECVLLGAKRRLRPVLMTACIAGLGLIPLLFTHSVGSEVQKPLAIVVLGG 993
+L+ +GK + E L+ + RLRP+LMT+ LG++PL ++ GS Q + I V+GG
Sbjct: 950 DLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGG 1009

Query: 994 LVTSSALTLLLLPPMFMLIAKKIK 1017
+V+++ L + +P F++I + K
Sbjct: 1010 MVSATLLAIFFVPVFFVVIRRCFK 1033


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_04875  RTXTOXIND     29     0.041    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 28.6 bits (64), Expect = 0.041
Identities = 12/84 (14%), Positives = 35/84 (41%), Gaps = 8/84 (9%)

Query: 46 SKGLPFNAYIDFDSKSSVVQSLSFDASVVAVYKREGEQVKAGDVICEVSSID-------L 98
N + +S ++ + ++ V + +EGE V+ GDV+ +++++
Sbjct: 81 EIVATANGKLTHSGRSKEIKPIE-NSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKT 139

Query: 99 SNLYFELQNNQNKLKIAKDITKKD 122
+ + + Q + +I + +
Sbjct: 140 QSSLLQARLEQTRYQILSRSIELN 163


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_04895  PF05932       27     0.008    Tir chaperone protein (CesT)
		>PF05932#Tir chaperone protein (CesT)

Length = 127

Score = 27.1 bits (60), Expect = 0.008
Identities = 7/47 (14%), Positives = 15/47 (31%), Gaps = 1/47 (2%)

Query: 38 VENIFALETHELKTDAHLSTPLREDEPKSQNTAKDILSQNKHSQDHY 84
++N FAL L + P + +L+ + +
Sbjct: 34 IDNTFALTLSCDYARERLLLIGLLE-PHKDIPQQCLLAGALNPLLNA 79


29  HPSA_05105  HPSA_05140  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPSA_05105  -1      13       -0.346656  hypothetical protein
HPSA_05110  0       13       -0.098084  D-3-phosphoglycerate dehydrogenase
HPSA_05115  -1      13       0.587159   3-octaprenyl-4-hydroxybenzoate carboxy-lyase
HPSA_05120  0       14       0.745608   hypothetical protein
HPSA_05125  0       15       0.631222   hypothetical protein
HPSA_05130  0       15       0.921078   UDP-2,3-diacylglucosamine hydrolase
HPSA_05135  0       15       0.846153   chemotaxis protein CheV
HPSA_05140  -3      15       0.508124   auto phosphorylating histidine kinase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_05105  V8PROTEASE    30     0.006    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 29.6 bits (66), Expect = 0.006
Identities = 18/62 (29%), Positives = 32/62 (51%), Gaps = 1/62 (1%)

Query: 26 VKKEIPTVSFTH-QNPSQTEKTHDINLEENPNNHDTPNNEKALHNEEDRNNTLSQNLDAQ 84
+K+ I + F + P+ + + N +NPNN D PNN +N ++ +N + N D
Sbjct: 274 LKQNIEDIHFANDDQPNNPDNPDNPNNPDNPNNPDEPNNPDNPNNPDNPDNGDNNNSDNP 333

Query: 85 DS 86
D+
Sbjct: 334 DA 335


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_05125  ALARACEMASE   29     0.017    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 28.6 bits (64), Expect = 0.017
Identities = 8/43 (18%), Positives = 15/43 (34%), Gaps = 1/43 (2%)

Query: 136 GVKPEETLDFYSQISETCKRIQLKGLMCIGAHTDDETKIEKSF 178
G +P+ L + Q+ + LM A + I +
Sbjct: 132 GFQPDRVLTVWQQL-RAMANVGEMTLMSHFAEAEHPDGISGAM 173


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_05135  HTHFIS        59     7e-12    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 59.1 bits (143), Expect = 7e-12
Identities = 29/129 (22%), Positives = 51/129 (39%), Gaps = 13/129 (10%)

Query: 182 GEVLFLDDSKTARKTLKNHLSKLGFNITEAVDGEDGLNKLEMLFKKYGDDLRNHLKFIIS 241
+L DD R L LS+ G+++ + + +++
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIA----------AGDGDLVVT 53

Query: 242 DVEMPKMDGYHFLFKLQKDPRFAYIPVIFNSSICDNYSAERAKEMGAVAYLVK-FDAEKF 300
DV MP + + L +++K +PV+ S+ +A +A E GA YL K FD +
Sbjct: 54 DVVMPDENAFDLLPRIKK--ARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTEL 111

Query: 301 TEEISKILD 309
I + L
Sbjct: 112 IGIIGRALA 120


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_05140  HTHFIS        56     5e-10    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 55.6 bits (134), Expect = 5e-10
Identities = 24/121 (19%), Positives = 55/121 (45%), Gaps = 4/121 (3%)

Query: 686 VLAIDDSSTDRAIIRKCLKPLGITLLEATNGLEGLEMLKNGDKIPDAILVDIEMPKMDGY 745
+L DD + R ++ + L G + +N + GD D ++ D+ MP + +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGD--GDLVVTDVVMPDENAF 63

Query: 746 TFASEVRKYNKFKNLPLIAVTSRVTKTDRMRGVESGMTEYITKPYSGEYLTTVVKRSIKL 805
++K +LP++ ++++ T ++ E G +Y+ KP+ L ++ R++
Sbjct: 64 DLLPRIKKAR--PDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 806 E 806

Sbjct: 122 P 122


30  HPSA_07995  HPSA_08020  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPSA_07995  -2      15       2.416853   flagellar hook-basal body protein FliE
HPSA_08000  -2      13       1.961793   flagellar basal body rod protein FlgC
HPSA_08005  -2      13       1.375023   flagellar basal body rod protein FlgB
HPSA_08010  -2      15       1.790962   probable cell division protein ftsW
HPSA_08015  0       14       1.253757   iron(III) ABC transporter, periplasmic
HPSA_08020  0       13       0.566490   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_07995  FLGHOOKFLIE   74     9e-21    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 73.9 bits (181), Expect = 9e-21
Identities = 23/108 (21%), Positives = 46/108 (42%), Gaps = 7/108 (6%)

Query: 3 AIHNDKSLLSPFSELNTDNRTQREESSSAFKEQKGGEFSKLLKQSINELNSTQEQSDKAL 62
AI + ++S R Q F+ L +++ ++ TQ +
Sbjct: 2 AIQGIEGVISQLQATAMSARAQESLPQPTIS------FAGQLHAALDRISDTQTAARTQA 55

Query: 63 ADMATGQIK-DLHQAAIAIGKAETSMKLMLEVRNKAISAYKELLRTQI 109
G+ L+ + KA SM++ ++VRNK ++AY+E++ Q+
Sbjct: 56 EKFTLGEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_08000  FLGHOOKAP1    28     0.012    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 28.4 bits (63), Expect = 0.012
Identities = 10/38 (26%), Positives = 15/38 (39%)

Query: 121 NVNAVVEMADLVEATRAYQANVAAFQSAKNMAQNAIGM 158
VN E +L + Y AN Q+A + I +
Sbjct: 508 GVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_08015  FERRIBNDNGPP  34     6e-04    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 34.2 bits (78), Expect = 6e-04
Identities = 28/183 (15%), Positives = 77/183 (42%), Gaps = 10/183 (5%)

Query: 108 NVELLKKLSPDLVVTFVG-NPKAVEHAKKFGISFLSFQETT--IAEAMQAMQ--AQARVL 162
N+ELL ++ P +V G P A+ +F + +A A +++ A L
Sbjct: 88 NLELLTEMKPSFMVWSAGYGPSPEMLARIAPGRGFNFSDGKQPLAMARKSLTEMADLLNL 147

Query: 163 EIDASKKFAKMQETLDFIAERL-KNVKKKKGVELFHKAN--KISGHQAISSDILEKGGID 219
+ A A+ ++ + + R K + + + G ++ +IL++ GI
Sbjct: 148 QSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVFGPNSLFQEILDEYGIP 207

Query: 220 N-FGLKYVKFGRADISVEKIVK-ENPEIIFIWWVSPLTPEDVLNNPKFSTIKAIKNKQVY 277
N + + +G +S++++ ++ +++ + + ++ P + + ++ +
Sbjct: 208 NAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNSKDMDALMATPLWQAMPFVRAGRFQ 267

Query: 278 KLP 280
++P
Sbjct: 268 RVP 270


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPSA_08020  FERRIBNDNGPP  33     0.002    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 32.6 bits (74), Expect = 0.002
Identities = 28/183 (15%), Positives = 73/183 (39%), Gaps = 10/183 (5%)

Query: 106 NVELLKKLSPDLVVTFVGNPKAVEHAKKF--GISFLSFQEKTIAEVMEDID---AQAKAL 160
N+ELL ++ P +V G + E + G F K + A L
Sbjct: 88 NLELLTEMKPSFMVWSAGYGPSPEMLARIAPGRGFNFSDGKQPLAMARKSLTEMADLLNL 147

Query: 161 EVDASKKLAKMQETLDFIAERL-KNVKKKKGVELFHKAN--KISGHQALDSDILEKGGVD 217
+ A LA+ ++ + + R K + + + G +L +IL++ G+
Sbjct: 148 QSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVFGPNSLFQEILDEYGIP 207

Query: 218 N-FGLKYVKFGRADISVEKIVK-ENPEIIFIWWISPLSPEDVLNNPKFSTIKAIKNKQVY 275
N + + +G +S++++ ++ +++ + + ++ P + + ++ +
Sbjct: 208 NAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNSKDMDALMATPLWQAMPFVRAGRFQ 267

Query: 276 KLP 278
++P
Sbjct: 268 RVP 270



 
Contact Sachin Pundhir for Bugs/Comments.