PredictBias

identification of genomic and pathogenicity islands in prokaryotic genome
 
A) Input parameters
Genome: NC_021285.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
 
C) Potential GIs and PAIs in NC_021285
S.No | Start | End | Bias | Virulence | Insertion elements | Prediction
1 | NH44784_000031 | NH44784_000631 | Y | Y | N | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
NH44784_000031 | 0 | 39 | -4.605848 | phage-related hypothetical protein
NH44784_000041 | 1 | 40 | -6.243428 | phage-related hypothetical protein
NH44784_000051 | 0 | 37 | -6.701962 | Phage protein
NH44784_000061 | 1 | 36 | -6.993374 | hypothetical protein
NH44784_000071 | 2 | 34 | -6.814985 | hypothetical protein
NH44784_000081 | 3 | 33 | -6.525888 | hypothetical protein
NH44784_000091 | 4 | 23 | -4.916425 | IS, phage, Tn; Transposon-related functions
NH44784_000101 | 3 | 22 | -3.945550 | hypothetical protein
NH44784_000111 | 1 | 22 | -4.468529 | hypothetical protein
NH44784_000121 | 2 | 23 | -4.625271 | putative baseplate protein
NH44784_000131 | 2 | 23 | -4.816273 | hypothetical protein
NH44784_000141 | 3 | 23 | -5.083546 | hypothetical protein
NH44784_000151 | 1 | 18 | -4.278407 | hypothetical protein
NH44784_000161 | 1 | 17 | -3.893583 | hypothetical protein
NH44784_000171 | 3 | 15 | -2.575449 | hypothetical protein
NH44784_000181 | 2 | 16 | -2.803872 | hypothetical protein
NH44784_000191 | 4 | 18 | -2.540673 | hypothetical protein
NH44784_000201 | 3 | 16 | -2.515348 | FIG00456185: hypothetical protein
NH44784_000211 | 3 | 17 | -3.377696 | Phage protein
NH44784_000221 | 3 | 14 | -3.249923 | hypothetical protein
NH44784_000231 | 5 | 15 | -3.031583 | hypothetical protein
NH44784_000241 | 3 | 15 | -2.411686 | hypothetical protein
NH44784_000251 | 4 | 15 | -2.825514 | Phage protein
NH44784_000261 | 3 | 15 | -2.234197 | FIG00925362: hypothetical protein
NH44784_000271 | 3 | 16 | -1.744121 | hypothetical protein
NH44784_000281 | 3 | 16 | -2.122194 | hypothetical protein
NH44784_000291 | 3 | 17 | -2.462800 | Phage protein
NH44784_000301 | 3 | 15 | -2.400553 | Phage protein
NH44784_000311 | 1 | 18 | -0.896037 | Phage terminase, large subunit
NH44784_000321 | -1 | 23 | -0.998174 | putative phage-related protein
NH44784_000331 | -2 | 21 | -1.020867 | hypothetical protein
NH44784_000341 | -1 | 18 | 0.010107 | phage-related hypothetical protein
NH44784_000351 | -1 | 13 | 1.708456 | phage-related hypothetical protein
NH44784_000361 | 1 | 15 | 0.427343 | hypothetical protein
NH44784_000371 | 1 | 16 | -0.581378 | hypothetical protein
NH44784_000381 | 2 | 17 | -0.689212 | hypothetical protein
NH44784_000391 | 2 | 18 | -0.943863 | hypothetical protein
NH44784_000401 | 2 | 17 | -1.461636 | hypothetical protein
NH44784_000411 | 3 | 21 | -3.119428 | hypothetical protein
NH44784_000421 | 2 | 19 | -2.361227 | hypothetical protein
NH44784_000431 | 2 | 20 | -1.366606 | hypothetical protein
NH44784_000441 | 2 | 20 | -1.770376 | hypothetical protein
NH44784_000451 | 4 | 19 | -0.764648 | phage-related hypothetical protein
NH44784_000461 | 2 | 17 | -1.897721 | hypothetical protein
NH44784_000471 | 1 | 17 | -1.638676 | hypothetical protein
NH44784_000481 | 1 | 16 | -0.149428 | hypothetical protein
NH44784_000491 | 4 | 16 | -1.146694 | hypothetical protein
NH44784_000501 | 3 | 16 | -1.040667 | hypothetical protein
NH44784_000511 | 6 | 16 | -1.162935 | hypothetical protein
NH44784_000521 | 7 | 17 | -0.291501 | hypothetical protein
NH44784_000531 | 6 | 13 | -0.896191 | TolA protein
NH44784_000541 | 5 | 14 | -2.227948 | Phage related protein
NH44784_000551 | 3 | 13 | -2.147932 | phage related protein
NH44784_000561 | 3 | 17 | -2.734638 | phage-related hypothetical protein
NH44784_000571 | 1 | 21 | -2.790364 | hypothetical protein
NH44784_000581 | 1 | 25 | -3.700905 | hypothetical protein
NH44784_000591 | 0 | 27 | -4.128726 | methyltransferase, putative
NH44784_000601 | 1 | 31 | -5.007987 | hypothetical protein
NH44784_000611 | 0 | 29 | -4.464106 | phage-related hypothetical protein
NH44784_000621 | 1 | 20 | -3.786841 | FIG00434035: hypothetical protein
NH44784_000631 | 0 | 16 | -3.215529 | hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
NH44784_000511 | PF04619 | 28 | 0.018 | Dr-family adhesin
		>PF04619#Dr-family adhesin

Length = 160

Score = 28.0 bits (62), Expect = 0.018
Identities = 9/50 (18%), Positives = 19/50 (38%), Gaps = 3/50 (6%)

Query: 35 DDYHPQHVSILSRDRIDAINVR---AEGVLTFGDREFFFIVRDGNWDGTV 81
D++ ++S + D + V + D F+ G+W G +
Sbjct: 84 DNFEQGKFFLISDNNRDKLYVNIRPTDNSAWTTDNGVFYKNDVGSWGGII 133
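
The statistics lines printed above each alignment ("Score = ... bits (...), Expect = ..." and "Identities = ...") can be pulled out programmatically. The following is a minimal Python sketch, not part of PredictBias itself: the function name `parse_hit_stats` and the constant `EVALUE_CUTOFF` are hypothetical, and treating the "E-value (RPSBlast)" input of 0.05 as an inclusive upper bound is an assumption.

```python
import re

# Matches lines such as "Score = 28.0 bits (62), Expect = 0.018".
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)
# Matches the leading part of "Identities = 9/50 (18%), ...".
IDENT_RE = re.compile(r"Identities\s*=\s*(\d+)/(\d+)")


def parse_hit_stats(text):
    """Return one dict per 'Score = ...' line found in the report text."""
    hits = []
    for line in text.splitlines():
        m = SCORE_RE.search(line)
        if m:
            hits.append({
                "bits": float(m.group(1)),
                "raw": int(m.group(2)),
                "evalue": float(m.group(3)),
            })
            continue
        m = IDENT_RE.search(line)
        if m and hits:
            # Attach the identity fraction to the most recent hit.
            hits[-1]["identity"] = int(m.group(1)) / int(m.group(2))
    return hits


# Assumption: the RPSBlast E-value listed under the input parameters
# is used as an inclusive cutoff.
EVALUE_CUTOFF = 0.05

sample = """Score = 28.0 bits (62), Expect = 0.018
Identities = 9/50 (18%), Positives = 19/50 (38%), Gaps = 3/50 (6%)"""

hits = parse_hit_stats(sample)
passing = [h for h in hits if h["evalue"] <= EVALUE_CUTOFF]
```

Applied to the whole report, this yields one record per alignment block, which makes it straightforward to rank hits by bit score or to separate the weak matches (E-values close to the cutoff) from the strong ones.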


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
NH44784_000531 | IGASERPTASE | 50 | 7e-09 | IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 50.4 bits (120), Expect = 7e-09
Identities = 24/212 (11%), Positives = 61/212 (28%), Gaps = 10/212 (4%)

Query: 139 DAAELRATIEQVNARAVDASWEEYEAEAHRVKARALDALTQALAAREKYDAEQAELARLR 198
+ + T++ N + + + + A +E E
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTET---- 1039

Query: 199 AAEAEREQKEREERIAREATERAQREAEARAQAERDAAARREAEALA-----AAETARLN 253
AE +++ + E+ ++ATE + E +A+ + A + +A ET
Sbjct: 1040 VAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTE 1099

Query: 254 AELAEQRRIAAEQQAELDRQAAAVREREAAAQAEQRARQAAEQAAAAERQRIADEQAAAA 313
+ + + E ++ + + +Q + + A R+
Sbjct: 1100 TKETATVEKEEKAKVETEKTQEVPK-VTSQVSPKQEQSETVQPQAEPARENDPTVNIKEP 1158

Query: 314 AEAARREEDMAHKAAINRAALDAFVQGGMPED 345
D A + ++ V +
Sbjct: 1159 QSQTNTTADTEQPAKETSSNVEQPVTESTTVN 1190


2 | NH44784_001311 | NH44784_001531 | Y | Y | Y | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
NH44784_001311 | 2 | 11 | 0.895897* | hypothetical protein
NH44784_001331 | 0 | 9 | 0.827832* | GNAT family acetyltransferase YjcF
NH44784_001341 | 1 | 10 | 0.681058 | Permease of the drug/metabolite transporter
NH44784_001351 | -1 | 9 | 0.437331 | hypothetical protein
NH44784_001361 | -2 | 8 | 0.065546 | hypothetical protein
NH44784_001371 | -1 | 9 | -0.151558 | Ferredoxin-dependent glutamate synthase
NH44784_001381 | 1 | 13 | 0.217867 | Cardiolipin synthetase
NH44784_001391 | 2 | 15 | -0.004835 | hypothetical protein
NH44784_001401 | 3 | 16 | -0.342012 | ABC transporter ATP-binding protein
NH44784_001411 | 5 | 20 | -1.508260 | FIG00433932: hypothetical protein
NH44784_001431 | 2 | 23 | -1.758589* | hypothetical protein
NH44784_001441 | 1 | 26 | -2.219674 | hypothetical protein
NH44784_001451 | 1 | 25 | -2.617425 | hypothetical protein
NH44784_001461 | 0 | 25 | -2.790047 | hypothetical protein
NH44784_001471 | 0 | 21 | -2.213727 | Permease of the drug/metabolite transporter
NH44784_001481 | 0 | 18 | -0.618832 | diguanylate cyclase/phosphodiesterase (GGDEF &
NH44784_001491 | 1 | 18 | -0.550698 | phage-related hypothetical protein
NH44784_001501 | 1 | 16 | -0.634526 | phage-related putative membrane protein
NH44784_001511 | 3 | 12 | -0.742056 | Phage tail fiber protein
NH44784_001521 | 3 | 11 | -0.128548 | Putative bacteriophage protein
NH44784_001531 | 2 | 13 | -0.393362 | putative bacteriophage protein
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
NH44784_001431 | TCRTETB | 29 | 0.035 | Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 28.7 bits (64), Expect = 0.035
Identities = 28/155 (18%), Positives = 56/155 (36%), Gaps = 23/155 (14%)

Query: 30 VFAGAVIAAALSLALFAGGTGLGFLSVSPW---SGEGMSAPAVGIGVIAWMLFTQIVAYG 86
+ + GT GF+S+ P+ +S +G +I F ++
Sbjct: 252 LGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVII----FPGTMSVI 307

Query: 87 VGGYVAGRLRTKWADVHGDEI-------------YFRDTAHGFLTWALSAVV-SAALLGS 132
+ GY+ G L + ++ I + +T F+T + V+ + +
Sbjct: 308 IFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKT 367

Query: 133 AVATLASGAAKAGASAAAGAGTAITATAAAGAGAG 167
++T+ S + K A AG + T+ G G
Sbjct: 368 VISTIVSSSLKQ-QEAGAGMS-LLNFTSFLSEGTG 400


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
NH44784_001441 | PF05616 | 24 | 0.018 | Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 23.9 bits (51), Expect = 0.018
Identities = 13/28 (46%), Positives = 17/28 (60%), Gaps = 1/28 (3%)

Query: 5 PSPE-DPRPDPNHAPDDPGAPGITPPAP 31
P+PE DP +P+ PD G PG P +P
Sbjct: 351 PNPEPDPDLNPDANPDTDGQPGTRPDSP 378


3 | NH44784_002291 | NH44784_002601 | Y | Y | Y | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
NH44784_002291 | 2 | 17 | -1.492623 | RecA protein
NH44784_002301 | -1 | 10 | -1.237166 | two-component response regulatory protein
NH44784_002311 | -1 | 9 | -1.972236 | Sensor protein basS/pmrB
NH44784_002321 | -1 | 8 | -3.423864 | Permeases of the major facilitator superfamily
NH44784_002331 | -1 | 9 | -3.037441 | ortholog of Bordetella pertussis (BX470248)
NH44784_002341 | -1 | 9 | -2.939181 | DNA recombination-dependent growth factor C
NH44784_002351 | -3 | 12 | -3.034673 | Putative Heme-regulated two-component response
NH44784_002361 | 0 | 17 | -3.470565 | Aspartate carbamoyltransferase
NH44784_002371 | 5 | 26 | -2.291824 | hypothetical protein
NH44784_002381 | 3 | 18 | 0.818201 | putative integral membrane protein
NH44784_002391 | 3 | 22 | 0.002548 | putative lipoprotein
NH44784_002401 | 6 | 27 | 0.348231 | entericidin A anti-toxin precursor
NH44784_002411 | 7 | 29 | 0.851409 | hypothetical protein
NH44784_002421 | 5 | 23 | 1.037526 | hypothetical protein
NH44784_002431 | 4 | 20 | 0.817103 | NAD(FAD)-utilizing dehydrogenases
NH44784_002441 | 5 | 19 | -0.298809 | hypothetical protein
NH44784_002451 | 3 | 18 | -0.252097 | ortholog of Bordetella pertussis (BX470248)
NH44784_002471 | 2 | 18 | 0.289243* | hypothetical protein
NH44784_002481 | -1 | 16 | -0.060360 | Transcriptional regulator, GntR family domain /
NH44784_002491 | 2 | 18 | -0.856918 | DedA
NH44784_002501 | 0 | 16 | 0.420023 | hypothetical protein
NH44784_002511 | 0 | 12 | -0.062640 | ortholog of Bordetella pertussis (BX470248)
NH44784_002521 | 0 | 9 | -0.474850 | hypothetical protein
NH44784_002531 | 0 | 9 | -0.962781 | Transcriptional regulator, AsnC family
NH44784_002541 | -1 | 12 | -1.284609 | hypothetical protein
NH44784_002551 | -1 | 12 | -1.609285 | 4-hydroxythreonine-4-phosphate dehydrogenase
NH44784_002561 | 2 | 11 | -2.812953 | hypothetical protein
NH44784_002571 | 3 | 15 | -3.589814 | Replicative DNA helicase
NH44784_002581 | 4 | 14 | -3.917933 | ortholog of Bordetella pertussis (BX470248)
NH44784_002591 | 3 | 14 | -4.181929 | LSU ribosomal protein L9p
NH44784_002601 | 1 | 11 | -4.164249 | SSU ribosomal protein S18p
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
NH44784_002301 | HTHFIS | 93 | 5e-24 | FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 92.6 bits (230), Expect = 5e-24
Identities = 36/118 (30%), Positives = 63/118 (53%), Gaps = 1/118 (0%)

Query: 2 RILIAEDDSILADGLSRSLRHNGYAVDAVRDGLAADSALAAQAFDLLILDLGLPQLAGLE 61
IL+A+DD+ + L+++L GY V + +AA DL++ D+ +P +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 VLRRLRARNSALPVLILTAADSIEQRVKGLDLGADDYMAKPFALSE-LEARVRALTRR 118
+L R++ LPVL+++A ++ +K + GA DY+ KPF L+E + RAL
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEP 122


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
NH44784_002311 | PF06580 | 41 | 5e-06 | Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 41.4 bits (97), Expect = 5e-06
Identities = 21/77 (27%), Positives = 34/77 (44%), Gaps = 19/77 (24%)

Query: 390 LLNNLVDNALRY----TPRGGHITVRVQALDGQAVLEVEDSGPGIALEERERVFDRFYRV 445
L+ LV+N +++ P+GG I ++ +G LEVE++G +E
Sbjct: 259 LVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKE--------- 309

Query: 446 LGTLSDGSGLGLAIVRE 462
+G GL VRE
Sbjct: 310 ------STGTGLQNVRE 320


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
NH44784_002321 | TCRTETA | 41 | 1e-05 | Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 40.6 bits (95), Expect = 1e-05
Identities = 53/280 (18%), Positives = 100/280 (35%), Gaps = 27/280 (9%)

Query: 46 GFIFALLAFAAGFAVRPFGALVFGRLGDLVGRKYTFLVTIVIMGLSTFLVGVLPSYASIG 105
G + AL A FA P G L D GR+ LV++ + ++ P
Sbjct: 46 GILLALYALMQ-FACAPVL----GALSDRFGRRPVLLVSLAGAAVDYAIMATAPFL---- 96

Query: 106 LAAPAILIVLRLLQGLALGGEYGGAATYVAEHAPHGRRGFYTSWIQTTATLGLFLSLLVI 165
+L + R++ G+ G A Y+A+ R + ++ G+
Sbjct: 97 ----WVLYIGRIVAGIT-GATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAG---- 147

Query: 166 LGIRTFMGEDDFKAWGWRIPFLISVVLLGISVWIRLQLNESPTFQRMKDEGKGSKAPIAE 225
+ MG + PF + L G++ L + + + P+A
Sbjct: 148 PVLGGLMGG-----FSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLAS 202

Query: 226 SFGQWKNLKVVLLALLGLTAGQAVVWYTGQFYALFFLTQTLKVDANTANIMIAIALLIGT 285
+W V+ AL+ + +V + F DA T I +A ++ +
Sbjct: 203 F--RWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHS 260

Query: 286 PF-FVIFGALSDKIGRKPIIMAGCLIAAATYFPIFQGITH 324
+I G ++ ++G + +M G +IA T + + T
Sbjct: 261 LAQAMITGPVAARLGERRALMLG-MIADGTGYILLAFATR 299



Score = 33.6 bits (77), Expect = 0.002
Identities = 14/47 (29%), Positives = 24/47 (51%)

Query: 270 ANTANIMIAIALLIGTPFFVIFGALSDKIGRKPIIMAGCLIAAATYF 316
I++A+ L+ + GALSD+ GR+P+++ AA Y
Sbjct: 42 TAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYA 88


4 | NH44784_002921 | NH44784_003011 | Y | Y | N | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
NH44784_002921 | 2 | 14 | -1.081243 | transcriptional regulator, LysR-family
NH44784_002931 | 3 | 16 | -2.302882 | FIG00973797: hypothetical protein
NH44784_002941 | 5 | 16 | -2.734435 | GTP-binding protein TypA/BipA
NH44784_002951 | 4 | 16 | -1.426023 | tRNA pseudouridine synthase B
NH44784_002961 | 5 | 17 | -1.426023 | Ribosome-binding factor A
NH44784_002971 | 4 | 16 | -1.470305 | Translation initiation factor 2
NH44784_002981 | 4 | 21 | -2.427884 | Transcription termination protein NusA
NH44784_002991 | 4 | 26 | -1.631422 | COG0779: clustered with transcription
NH44784_003001 | 3 | 24 | -1.065117 | Ribosomal large subunit pseudouridine synthase
NH44784_003011 | 2 | 23 | -1.640287 | Segregation and condensation protein B
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
NH44784_002931 | MICOLLPTASE | 31 | 0.001 | Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 30.8 bits (69), Expect = 0.001
Identities = 10/49 (20%), Positives = 15/49 (30%), Gaps = 2/49 (4%)

Query: 83 ARFIQEIYAAFGEI--LGELHECSYVHVIDARAAAYGYGGKTQEYRHQH 129
F I L EL + H + R G G+ + Y+
Sbjct: 480 GTFFTYERTPEESIYTLEELFRHEFTHYLQGRYVVPGMWGQGEFYQEGV 528


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
NH44784_002941 | TCRTETOQM | 168 | 5e-47 | Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 168 bits (428), Expect = 5e-47
Identities = 94/435 (21%), Positives = 167/435 (38%), Gaps = 62/435 (14%)

Query: 5 LRNVAIIAHVDHGKTTLVDQLLRQSGTFRENQALTE--RVMDSNDLEKERGITILAKNCA 62
+ N+ ++AHVD GKTTL + LL SG E ++ + D+ LE++RGITI +
Sbjct: 3 IINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITS 62

Query: 63 VEYEGTHINIVDTPGHADFGGEVERVLSMVDGVLLLVDAVEGPMPQTIFVTRKALALGLK 122
++E T +NI+DTPGH DF EV R LS++DG +LL+ A +G QT + +G+
Sbjct: 63 FQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIP 122

Query: 123 PIVVVNKVDRPGART-------------DFVINATFDLFDKLGATDEQL----------- 158
I +NK+D+ G + VI +L+ + T+
Sbjct: 123 TIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEGN 182

Query: 159 -DFPVVYASG--LSG---YAGLTPDVREGDMRPLF--------------EAILQHVPQRD 198
D Y SG L + + P++ E I
Sbjct: 183 DDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSST 242

Query: 199 DDPNGPLQMQIISLDYNSYVGKIGVGRINRGRMRPGMEVAYKFGPEGQGGRGRINQVLKF 258
L ++ ++Y+ ++ R+ G + V + + +I ++
Sbjct: 243 HRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRIS-----EKEKIKITEMYTS 297

Query: 259 HGLERIVVDEAEAGDIVLINGIEDLGIGCTVTDPVTQDVLPMLRIDEPTLTMNFMVNTSP 318
E +D+A +G+IV++ E L + + D + P L +
Sbjct: 298 INGELCKIDKAYSGEIVILQN-EFLKLNSVLGDTKLLPQRERIENPLPLLQTTVEPSKPQ 356

Query: 319 LAGREGKFVTSRQLRDRLDRELKSNVALRVRDTGDDTVFEVSGRGELHLTILLETMRRE- 377
L D L S+ LR +S G++ + + ++ +
Sbjct: 357 QREM---------LLDALLEISDSDPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEKY 407

Query: 378 GYELAVSRPRVVFKE 392
E+ + P V++ E
Sbjct: 408 HVEIEIKEPTVIYME 422



Score = 33.7 bits (77), Expect = 0.003
Identities = 17/100 (17%), Positives = 32/100 (32%), Gaps = 1/100 (1%)

Query: 387 RVVFKEVDGVKCEPFESLTIDVEDAHQGGVMEELGRRKGDLQDMQPDGRGRTRLEYLIPA 446
V K+ EP+ S I + + + ++ D Q L IPA
Sbjct: 525 EQVLKKAGTELLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQL-KNNEVILSGEIPA 583

Query: 447 RGLIGFQNEFLTLTRGTGLMSHIFHEYAPIKEGSIGERRN 486
R + ++++ T G + Y + + R
Sbjct: 584 RCIQEYRSDLTFFTNGRSVCLTELKGYHVTTGEPVCQPRR 623


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
NH44784_002971 | TCRTETOQM | 77 | 2e-16 | Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 76.8 bits (189), Expect = 2e-16
Identities = 69/277 (24%), Positives = 94/277 (33%), Gaps = 76/277 (27%)

Query: 525 VMGHVDHGKTSLLDYIRRAKVAAGEAG------------------GITQHIGAYHVETER 566
V+ HVD GKT+L + + A E G GIT G + E
Sbjct: 8 VLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQWEN 67

Query: 567 GMVTFLDTPGHEAFTAMRARGAKATDIVILVCAADDGVMPQTREAIHHAKAAGVPMVVAM 626
V +DTPGH F A R D IL+ +A DGV QTR H + G+P + +
Sbjct: 68 TKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIPTIFFI 127

Query: 627 TKIDKPSANPERVKQ--------------------------------------------- 641
KID+ + V Q
Sbjct: 128 NKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEGNDDLLE 187

Query: 642 ------ELVAEEVVPEEYGG--DVPFVPV---SAKTGEGIDALLENVLLQAELLELKAPV 690
L A E+ EE + PV SAK GID L+E + +
Sbjct: 188 KYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVIT--NKFYSSTHRG 245

Query: 691 DAQAKGLVIEARLDKGRGPVATILVQSGTLHRGDVVL 727
++ G V + + R +A I + SG LH D V
Sbjct: 246 QSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVR 282


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
NH44784_003001 | cloacin | 32 | 0.009 | Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 31.6 bits (71), Expect = 0.009
Identities = 28/118 (23%), Positives = 36/118 (30%), Gaps = 10/118 (8%)

Query: 431 GEANGNRKGGKPTGGRGQAGAGGKSGGRGGKAGGGKARGVRAAAAGSAGAPEATVGAGRK 490
G+ G+ G T G G G G G G G + GS G+G
Sbjct: 4 GDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHG 63

Query: 491 PAGAKPGGKPAGARGGNAGRGNKPAGAGRAGNKAEGARAGGNKGPGAGGKPRAARGGS 548
G GN+G G+ G A PGAGG + G+
Sbjct: 64 NGGG----------NGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGA 111


5 | NH44784_004431 | NH44784_004491 | Y | N | N | Genomic Island
LocusTag | DNBias | CDNBias | %GCBias | Product
NH44784_004431 | 1 | 14 | -3.415271 | Probable transmembrane protein
NH44784_004441 | 1 | 14 | -3.893815 | hypothetical protein
NH44784_004451 | 1 | 19 | -4.844827 | Citrate synthase (si
NH44784_004461 | 0 | 17 | -4.028698 | YgfY COG2938
NH44784_004471 | 1 | 19 | -3.711050 | Succinate dehydrogenase iron-sulfur protein
NH44784_004481 | 1 | 21 | -2.898036 | Succinate dehydrogenase flavoprotein subunit
NH44784_004491 | 0 | 20 | -3.016437 | Succinate dehydrogenase hydrophobic membrane
6 | NH44784_005741 | NH44784_005811 | Y | Y | N | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
NH44784_005741 | 2 | 14 | -2.510286 | Transmembrane component BioN of energizing
NH44784_005751 | 2 | 12 | -1.721944 | ATPase component BioM of energizing module of
NH44784_005761 | 1 | 14 | -1.193870 | Peptide methionine sulfoxide reductase MsrB
NH44784_005771 | 4 | 14 | 0.079275 | Intracellular septation protein IspA
NH44784_005781 | 4 | 13 | 0.341750 | Cell division protein BolA
NH44784_005791 | 2 | 14 | 0.262025 | Peptidyl-prolyl cis-trans isomerase
NH44784_005801 | 1 | 10 | 1.273390 | DNA ligase
NH44784_005811 | 2 | 10 | 1.850224 | hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
NH44784_005791 | SOPEPROTEIN | 28 | 0.038 | Salmonella type III secretion SopE effector protein ...
		>SOPEPROTEIN#Salmonella type III secretion SopE effector protein signature.

Length = 239

Score = 27.8 bits (61), Expect = 0.038
Identities = 14/40 (35%), Positives = 21/40 (52%)

Query: 16 PAFAQNVATVNGKAIPQKSLDQFVKLLVSQGATDSPQLRE 55
PA+A A+ K+ DQ LL+S+G +P L+E
Sbjct: 103 PAYASQTREAILSAVYSKNKDQCCNLLISKGINIAPFLQE 142


7 | NH44784_005951 | NH44784_006151 | Y | Y | N | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
NH44784_005951 | 1 | 10 | -3.132300 | Phosphoserine phosphatase
NH44784_005961 | 2 | 12 | -3.711959 | Ferrichrome-iron receptor
NH44784_005971 | 3 | 16 | -4.107818 | hypothetical protein
NH44784_005981 | 3 | 17 | -3.500449 | Probable transmembrane protein
NH44784_005991 | 2 | 18 | -3.634859 | NADH-ubiquinone oxidoreductase chain N
NH44784_006001 | 2 | 17 | -4.477318 | NADH-ubiquinone oxidoreductase chain M
NH44784_006011 | 0 | 18 | -2.471320 | NADH-ubiquinone oxidoreductase chain L
NH44784_006021 | 0 | 19 | -2.230011 | NADH-ubiquinone oxidoreductase chain K
NH44784_006031 | -1 | 19 | -2.269115 | NADH-ubiquinone oxidoreductase chain J
NH44784_006041 | 0 | 20 | -2.781037 | NADH-ubiquinone oxidoreductase chain I
NH44784_006051 | 0 | 20 | -2.545061 | NADH-ubiquinone oxidoreductase chain H
NH44784_006061 | 0 | 20 | -2.234937 | NADH-ubiquinone oxidoreductase chain G
NH44784_006071 | 1 | 19 | -4.523755 | NADH-ubiquinone oxidoreductase chain F
NH44784_006081 | 5 | 22 | -5.489950 | NADH-ubiquinone oxidoreductase chain E
NH44784_006091 | 4 | 21 | -4.870468 | NADH-ubiquinone oxidoreductase chain D
NH44784_006101 | 4 | 20 | -4.027649 | NADH-ubiquinone oxidoreductase chain C
NH44784_006111 | 3 | 18 | -3.141289 | NADH-ubiquinone oxidoreductase chain B
NH44784_006121 | 2 | 18 | -1.557984 | NADH ubiquinone oxidoreductase chain A
NH44784_006131 | -1 | 15 | -0.089877 | outer membrane porin
NH44784_006141 | -1 | 11 | 2.383500 | Transcriptional regulator, TetR family
NH44784_006151 | -2 | 8 | 3.003278 | FIG00431876: hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
NH44784_006101 | PF04183 | 28 | 0.045 | IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 27.5 bits (61), Expect = 0.045
Identities = 16/72 (22%), Positives = 21/72 (29%), Gaps = 27/72 (37%)

Query: 100 WRLRVRTWAPDDEFPMV-ASLMEC----------------WPAVGWFER----------E 132
WR W DE P++ A+LMEC A W +
Sbjct: 351 WRENPCRWLKPDESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVPLYH 410

Query: 133 AFDLYGIVFEGH 144
YG+ H
Sbjct: 411 LLCRYGVALIAH 422


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
NH44784_006121 | HOKGEFTOXIC | 25 | 0.037 | Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 24.8 bits (54), Expect = 0.037
Identities = 10/30 (33%), Positives = 13/30 (43%)

Query: 4 QQYFPVLLFIVVATLIGFALLTAGSLLGPR 33
+ + IV TL+ F LT SL R
Sbjct: 5 RSSLVWCVLIVCLTLLIFTYLTRKSLCEIR 34


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
NH44784_006131 | ECOLNEIPORIN | 101 | 2e-26 | E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 101 bits (253), Expect = 2e-26
Identities = 84/388 (21%), Positives = 141/388 (36%), Gaps = 58/388 (14%)

Query: 1 MKKTLLAAALLAGFAGVAQAETSVTLYGIIDTGIGYNK-ISGAGDAKNGSKIGMINGVQN 59
MKK+L+A L A A VTLYG I G+ ++ ++ G + G V
Sbjct: 1 MKKSLIALTLAAL---PVAAMADVTLYGTIKAGVETSRSVAHNGAQAASVETGT-GIVDL 56

Query: 60 GSRWGLRGSEDLGDGLRAVFQLESGFDSGNGKSGQNGRLFGRQATVGLASDSWGQLDFGR 119
GS+ G +G EDLG+GL+A++Q+E +G + RQ+ +GL +G+L GR
Sbjct: 57 GSKIGFKGQEDLGNGLKAIWQVEQK----ASIAGTDSGWGNRQSFIGLKGG-FGKLRVGR 111

Query: 120 QTNIASKYFGSIDPFGAGFNVANIGTGMSAANTQRYDNMVMYQTPSFSGFQFGVGYSFSA 179
++ G I+P+ + ++ A + V Y +P F+G V Y+ +
Sbjct: 112 LNSVLKDT-GDINPW---DSKSDYLGVNKIAEPEARLISVRYDSPEFAGLSGSVQYALND 167

Query: 180 DEGNADGSTKADPDRVGFKTADNVRAITTGLRYVNGPLNVALTYDQLNASHNQAQGEVDA 239
+ G + + G Y NG V Q ++
Sbjct: 168 NAGRHNS-----------------ESYHAGFNYKNGGFFVQYGGAYKRHHQVQENVNIE- 209

Query: 240 TPRSYMIGGSYDFEVVKLALAYARTTDGWFATSNPAGASGTITVDGTSRKLGFGANQFAD 299
+ + + YD + + ++ + D A S + + A +F +
Sbjct: 210 KYQIHRLVSGYDNDALYASV-AVQQQD---AKLVEENYSHNSQTEVAAT----LAYRFGN 261

Query: 300 GFKANSYLLGLSAPIGGASSLFGSWQRVDPSNNKLTGDDATMNVFSAGYTYDLSKRTNLY 359
SY G + + G YD SKRT+
Sbjct: 262 VTPRVSYAHGFKGSFDAT------------------NYNNDYDQVVVGAEYDFSKRTSAL 303

Query: 360 AYGSYSKNYAFNDGVKATAVGVGLRHRF 387
+ + +TA GVGLRH+F
Sbjct: 304 VSAGWLQEGKGESKFVSTAGGVGLRHKF 331


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
NH44784_006141 | HTHTETR | 68 | 2e-16 | TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 68.1 bits (166), Expect = 2e-16
Identities = 33/150 (22%), Positives = 55/150 (36%), Gaps = 6/150 (4%)

Query: 8 ADRNERLSALRRQMILDAAQRVFERDGLEKTTIRAIAKEAGCTTGAIYPWFAGKEILYGA 67
A + ++ + RQ ILD A R+F + G+ T++ IAK AG T GAIY F K L+
Sbjct: 2 ARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE 61

Query: 68 LLDESLQRLHAHLQAATADC--APAAAARQAILAFFGYYAERRTDFSLGLYLFQ---GLG 122
+ + S + A P + R+ ++ L +F +G
Sbjct: 62 IWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVG 121

Query: 123 PRGLGRDMDEQLNGRLRQ-CVDVLGQALAR 151
+ + L L +
Sbjct: 122 EMAVVQQAQRNLCLESYDRIEQTLKHCIEA 151


8 | NH44784_006821 | NH44784_006871 | Y | N | N | Genomic Island
LocusTag | DNBias | CDNBias | %GCBias | Product
NH44784_006821 | 0 | 12 | -3.084039 | Nitroreductase family protein
NH44784_006831 | 0 | 18 | -3.480562 | Methyl-accepting chemotaxis protein I (serine
NH44784_006841 | 2 | 20 | -4.668032 | hypothetical protein
NH44784_006851 | 3 | 20 | -4.203801 | Putative phosphatase
NH44784_006861 | 2 | 27 | -4.345462 | hypothetical protein
NH44784_006871 | 1 | 20 | -3.006483 | hypothetical protein
9 | NH44784_007011 | NH44784_007471 | Y | Y | N | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
NH44784_007011 | 2 | 20 | 0.550912 | hypothetical protein
NH44784_007021 | 1 | 38 | -5.545874 | hypothetical protein
NH44784_007031 | 3 | 46 | -7.987384 | hypothetical protein
NH44784_007041 | 3 | 53 | -9.748936 | phage-related hypothetical protein
NH44784_007051 | 4 | 54 | -9.681765 | hypothetical protein
NH44784_007061 | 4 | 53 | -9.451072 | hypothetical protein
NH44784_007071 | 4 | 53 | -9.498033 | hypothetical protein
NH44784_007081 | 4 | 47 | -8.717379 | hypothetical protein
NH44784_007091 | 4 | 43 | -7.873799 | hypothetical protein
NH44784_007101 | 4 | 34 | -6.380204 | phage tape measure protein
NH44784_007111 | 4 | 25 | -5.653236 | hypothetical protein
NH44784_007121 | 3 | 25 | -5.970429 | hypothetical protein
NH44784_007131 | 3 | 28 | -7.114704 | hypothetical protein
NH44784_007141 | 3 | 31 | -7.302911 | hypothetical protein
NH44784_007151 | 3 | 32 | -7.845963 | gp7
NH44784_007161 | 5 | 34 | -7.635748 | FIG01047095: hypothetical protein
NH44784_007171 | 4 | 32 | -7.447529 | gp5
NH44784_007181 | 3 | 33 | -7.422673 | Phage terminase large subunit
NH44784_007191 | 4 | 30 | -6.245300 | Phage terminase, small subunit
NH44784_007201 | 4 | 30 | -6.148246 | Putative phage endonuclease
NH44784_007211 | 4 | 30 | -5.657457 | DNA replication protein DnaC
NH44784_007221 | 2 | 32 | -6.326400 | phage recombination protein
NH44784_007231 | 5 | 32 | -7.575364 | hypothetical protein
NH44784_007241 | 2 | 34 | -6.788799 | phage-related transcriptional transcriptional
NH44784_007251 | 3 | 32 | -6.829983 | hypothetical protein
NH44784_007261 | 4 | 34 | -7.482079 | putative phage repressor protein
NH44784_007271 | 7 | 29 | -8.630324 | hypothetical protein
NH44784_007281 | 6 | 26 | -7.346861 | hypothetical protein
NH44784_007291 | 5 | 24 | -6.271552 | hypothetical protein
NH44784_007301 | 6 | 24 | -6.772239 | hypothetical protein
NH44784_007311 | 6 | 23 | -6.103842 | hypothetical protein
NH44784_007321 | 4 | 24 | -6.452680 | putative bacteriophage protein
NH44784_007331 | 1 | 24 | -4.859859 | Single-stranded DNA-binding protein
NH44784_007341 | 1 | 24 | -4.813907 | hypothetical protein
NH44784_007351 | -1 | 24 | -4.273348 | hypothetical protein
NH44784_007361 | -1 | 25 | -4.205966 | hypothetical protein
NH44784_007371 | 0 | 26 | -4.092690 | Retron-type reverse transcriptase
NH44784_007381 | 0 | 17 | -0.846578 | hypothetical protein
NH44784_007391 | 1 | 14 | -0.338779 | hypothetical protein
NH44784_007401 | -1 | 12 | -0.355464 | FIG00434035: hypothetical protein
NH44784_007411 | 0 | 15 | -0.838479 | hypothetical protein
NH44784_007421 | 0 | 14 | -1.252813 | hypothetical protein
NH44784_007431 | 0 | 15 | -1.156191 | FIG00432251: hypothetical protein
NH44784_007441 | 0 | 14 | -1.955404 | Threonine dehydratase
NH44784_007451 | 2 | 20 | -2.743257 | putative lipoprotein
NH44784_007461 | 3 | 22 | -3.242814 | Polyribonucleotide nucleotidyltransferase
NH44784_007471 | 1 | 16 | -3.744319 | SSU ribosomal protein S15p (S13e
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
NH44784_007101 | PYOCINKILLER | 32 | 0.009 | Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 32.5 bits (73), Expect = 0.009
Identities = 31/109 (28%), Positives = 46/109 (42%), Gaps = 5/109 (4%)

Query: 310 GAMARYLASLTASTVANARAALGAKAQAAAALEQAVANEKAAAAAAGQAAAQVRLGGSLA 369
G A Y L +++ + + AA A +A A KA AA +A +
Sbjct: 183 GLTAAYNVKLFTEAISSLQIRMN-TLTAAKASIEAAAANKAREQAAAEAKRKAEEQARQQ 241

Query: 370 ASTAAANAHRAAQTALAGAQRAATAAGTGLVAMLGGPAGIIGLLATAAA 418
A+ AAN + A A ATAAG GL+ + G A + ++ A A
Sbjct: 242 AAIRAANTY--AMPANGSV--VATAAGRGLIQVAQGAASLAQAISDAIA 286


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
NH44784_007111 | INTIMIN | 34 | 0.001 | Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 33.9 bits (77), Expect = 0.001
Identities = 19/43 (44%), Positives = 26/43 (60%)

Query: 201 WTSSNPSMAMVNQVSGQITAVAAGSVTITAASSVAPAVTDTIA 243
W S+NP++A V+ SGQ+T G+ TI+ SS T TIA
Sbjct: 795 WRSANPAIASVDASSGQVTLKEKGTTTISVISSDNQTATYTIA 837


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
NH44784_007181 | OUTRSURFACE | 28 | 0.044 | Outer surface protein signature.
		>OUTRSURFACE#Outer surface protein signature.

Length = 273

Score = 28.3 bits (63), Expect = 0.044
Identities = 23/86 (26%), Positives = 38/86 (44%), Gaps = 15/86 (17%)

Query: 45 AGLRHNVDSIKSKARILIAWVDEAESVSKIAWQKLAPTVRESGSEIWITWNPEKDGSPTD 104
+G+ KSKA++ IA + +SK ++ +E G + KD + TD
Sbjct: 73 SGVLEGTKDDKSKAKLTIA-----DDLSKTTFE----LFKEDGKTLVSRKVSSKDKTSTD 123

Query: 105 ERFRKNPPTSAKIV------ELNYTD 124
E F + SAK + +L YT+
Sbjct: 124 EMFNEKGELSAKTMTRENGTKLEYTE 149


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
NH44784_007251 | INVEPROTEIN | 24 | 0.040 | Salmonella/Shigella invasion protein E (InvE) signat...
		>INVEPROTEIN#Salmonella/Shigella invasion protein E (InvE) signature.

Length = 372

Score = 23.5 bits (50), Expect = 0.040
Identities = 12/42 (28%), Positives = 19/42 (45%), Gaps = 2/42 (4%)

Query: 7 DLARRIFEDPSEKAVMDLFQELCAERDRMAWANEGRESATMH 48
AR +F DPS+ ++ +EL +D + ES H
Sbjct: 113 RQARSLFPDPSDLVLV--LRELLRRKDLEEIVRKKLESLLKH 152


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
NH44784_007451 | PF07520 | 29 | 0.021 | Virulence protein SrfB
		>PF07520#Virulence protein SrfB

Length = 1041

Score = 28.8 bits (64), Expect = 0.021
Identities = 12/58 (20%), Positives = 14/58 (24%)

Query: 112 IQILAVSLDPNYSGDRVGAFIYGLADTIIAAHNGKTRLYATDALDGQRVYNAARNVEA 169
IQIL D N R FI H L D + +
Sbjct: 27 IQILDFGFDINALDIRSFRFIERPEGAAEGRHRTLYPLTGEAERDAPILAATTPEDDE 84


10NH44784_008041NH44784_010941Y        Y        YPathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
NH44784_008041  -2  13  -3.318821  Dihydrodipicolinate synthase
NH44784_008051  -2  15  -4.046111  Methyl-accepting chemotaxis protein I (serine
NH44784_008061  -1  15  -4.773304  hypothetical protein
NH44784_008071  -2  14  -5.043936  hypothetical protein
NH44784_008081  -1  10  -3.413248  hypothetical protein
NH44784_008091  1  14  -3.339145  Isocitrate dehydrogenase [NADP]
NH44784_008101  -1  13  -2.120468  FIG00433688: hypothetical protein
NH44784_008111  -1  13  -2.170308  putative cytochrome
NH44784_008121  0  12  -2.074241  putative cytochrome
NH44784_008131  0  12  -1.706791  ATPase associated with various cellular
NH44784_008141  1  13  -1.664119  hypothetical protein
NH44784_008151  1  13  -1.665831  Phosphinothricin N-acetyltransferase
NH44784_008161  3  14  -2.550087  ortholog of Bordetella pertussis (BX470248)
NH44784_008171  4  14  -2.542250  acetyltransferase, GNAT family
NH44784_008181  4  16  -2.590936  putative zinc protease
NH44784_008191  6  14  -2.924395  hypothetical protein
NH44784_008201  5  11  -2.941175  Chaperone protein DnaJ
NH44784_008211  4  10  -3.517527  Chaperone protein DnaK
NH44784_008221  -1  11  -2.303824  putative thioredoxin
NH44784_008231  -1  10  -0.638340  Heat shock protein GrpE
NH44784_008241  -1  11  -1.352602  hypothetical protein
NH44784_008251  -1  12  -1.269529  Ferrochelatase, protoheme ferro-lyase
NH44784_008261  0  15  -0.706201  Heat-inducible transcription repressor HrcA
NH44784_008271  -2  12  0.615951  NAD kinase
NH44784_008281  0  13  -1.663103  DNA repair protein RecN
NH44784_008291  2  21  -4.761150  Ferric uptake regulation protein FUR
NH44784_008301  3  29  -6.284020  Outer membrane lipoprotein SmpA, a component of
NH44784_008311  2  36  -7.320415  Dihydrodipicolinate reductase
NH44784_008321  4  45  -8.313178  UDP-N-acetylenolpyruvoylglucosamine reductase
NH44784_008341  5  51  -10.259469  *Integrase
NH44784_008351  3  54  -11.507214  Transcriptional regulator, LysR family
NH44784_008361  3  52  -10.911118  Inner membrane component of tripartite multidrug
NH44784_008371  1  48  -9.251227  Membrane fusion component of tripartite
NH44784_008381  1  45  -8.820276  Outer membrane component of tripartite multidrug
NH44784_008391  1  41  -8.657944  Glucoamylase
NH44784_008401  0  34  -7.513516  Cytochrome O ubiquinol oxidase subunit I
NH44784_008411  1  32  -5.586974  Methyl-accepting chemotaxis protein I (serine
NH44784_008421  1  24  -4.377412  Pyruvate/2-oxoglutarate dehydrogenase
NH44784_008431  2  22  -4.306229  hypothetical protein
NH44784_008441  3  21  -3.779640  FIG01210837: hypothetical protein
NH44784_008451  2  23  -3.229945  hypothetical protein
NH44784_008461  2  21  -1.875636  hypothetical protein
NH44784_008471  3  20  -1.854158  FIG00460544: hypothetical protein
NH44784_008481  3  21  -1.669000  hypothetical protein
NH44784_008491  2  20  -1.017675  hypothetical protein
NH44784_008501  2  21  -1.281910  Protein-disulfide isomerase
NH44784_008511  2  22  -1.192719  Type IV secretory pathway, VirB4 components
NH44784_008521  2  21  -0.712888  putative lipoprotein
NH44784_008531  2  21  -0.870870  hypothetical protein
NH44784_008541  4  21  0.013296  corresponds to STY4569 from Accession AL513382:
NH44784_008551  5  22  -1.772524  Possible exported protein
NH44784_008561  2  20  -2.467690  FIG00453933: hypothetical protein
NH44784_008571  2  18  -2.555560  FIG00455763: hypothetical protein
NH44784_008581  2  19  -1.681362  FIG00953352: hypothetical protein
NH44784_008591  2  19  -0.863523  Candidate type III effector Hop protein
NH44784_008601  1  20  -0.795606  hypothetical protein
NH44784_008611  1  21  -0.499512  Type IV secretory pathway, VirD4 components
NH44784_008621  0  23  0.897102  FIG00460984: hypothetical protein
NH44784_008631  1  20  -1.364219  FIG016425: Soluble lytic murein transglycosylase
NH44784_008641  0  21  -2.222750  probable exported protein STY4558
NH44784_008651  1  22  -3.575799  FIG00978796: hypothetical protein
NH44784_008661  2  22  -4.003125  hypothetical protein
NH44784_008671  2  20  -3.985475  hypothetical protein
NH44784_008681  2  20  -4.022642  Superfamily II DNA/RNA helicases, SNF2 family
NH44784_008691  3  23  -3.945403  FIG00957490: hypothetical protein
NH44784_008701  3  26  -3.802388  FIG00954465: hypothetical protein
NH44784_008711  3  25  -3.091254  FIG023873: Plasmid related protein
NH44784_008721  4  25  -2.594535  FIG026997: Hypothetical protein
NH44784_008731  4  26  -2.881184  FIG041301: Hypothetical protein
NH44784_008741  4  24  -2.817278  FIG046709: Hypothetical protein
NH44784_008751  4  22  -3.164897  FIG036757: Plasmid-related protein
NH44784_008761  4  22  -3.474881  FIG034376: Hypothetical protein
NH44784_008771  4  22  -4.018616  FIG049434: Periplasmic protein TonB, links inner
NH44784_008781  3  22  -4.812833  FIG076210: Hypothetical protein
NH44784_008791  3  21  -4.849087  FIG076676: Hypothetical protein
NH44784_008801  1  21  -3.468252  hypothetical protein
NH44784_008811  2  21  -3.301925  FIG00902157: hypothetical protein
NH44784_008821  1  24  -1.538509  hypothetical protein
NH44784_008831  0  22  -1.286065  hypothetical protein
NH44784_008841  0  23  -1.514617  FIG01213233: hypothetical protein
NH44784_008851  0  24  -1.470616  DNA methyltransferase
NH44784_008861  1  24  -1.623392  hypothetical protein
NH44784_008871  1  25  -1.868815  hypothetical protein
NH44784_008881  2  22  -2.628033  DNA topoisomerase III
NH44784_008891  2  22  -2.138521  Single-stranded DNA-binding protein in
NH44784_008901  3  22  -1.785951  Integrase regulator R
NH44784_008911  2  21  -1.295158  FIG141694: hypothetical protein in PFGI-1-like
NH44784_008921  1  23  -1.698047  FIG141751: hypothetical protein in PFGI-1-like
NH44784_008931  0  23  -2.615632  FIG004780: hypothetical protein in PFGI-1-like
NH44784_008941  -1  23  -2.188473  Protein with ParB-like nuclease domain in
NH44784_008951  -2  26  -3.086050  FIG034647: hypothetical protein in PFGI-1-like
NH44784_008961  -2  21  -1.857058  Chromosome partitioning ATPase in PFGI-1-like
NH44784_008971  -2  20  -2.907505  Transcriptional regulator in PFGI-1-like
NH44784_008981  -1  22  -3.605860  FIG041388: hypothetical protein in PFGI-1-like
NH44784_008991  0  17  -1.909115  NAD(P)H dehydrogenase (quinone
NH44784_009001  1  17  -2.095734  Histone acetyltransferase HPA2 and related
NH44784_009011  1  18  -2.243085  Transcriptional regulator, GntR family domain /
NH44784_009021  3  21  -3.269788  4-carboxymuconolactone decarboxylase
NH44784_009041  3  22  -3.284214  *Integrase
NH44784_009051  1  23  -2.526464  Transposase
NH44784_009061  3  26  -4.137347  ISPsy4, transposition helper protein
NH44784_009071  3  34  -5.336927  Heavy metal RND efflux outer membrane
NH44784_009081  3  35  -6.614223  putative transposase
NH44784_009091  0  37  -6.967297  IS1480 transposase
NH44784_009101  -1  35  -6.634357  Transposase
NH44784_009111  1  33  -5.656793  transposase subfamily
NH44784_009121  1  35  -5.619711  LysR family transcriptional regulator YnfL
NH44784_009131  1  32  -4.837537  hypothetical protein
NH44784_009141  0  30  -4.452531  Copper resistance protein D
NH44784_009151  3  23  -3.922902  Copper resistance protein C precursor
NH44784_009161  3  26  -4.199573  Lead, cadmium, zinc and mercury transporting
NH44784_009171  1  33  -6.343731  Asl7591 protein
NH44784_009181  1  31  -6.154840  CopG protein
NH44784_009191  1  30  -6.013712  Copper resistance protein B
NH44784_009201  2  29  -5.990001  Multicopper oxidase
NH44784_009211  1  28  -5.683333  DNA-binding heavy metal response regulator
NH44784_009221  1  23  -5.246248  Heavy metal sensor histidine kinase
NH44784_009231  2  15  -2.758088  Transcriptional regulator, ArsR family
NH44784_009241  2  17  -1.426023  Lactoylglutathione lyase
NH44784_009251  3  16  -1.523703  Arsenate reductase
NH44784_009261  2  16  -1.321448  Arsenical-resistance protein ACR3
NH44784_009271  2  16  -0.163217  Arsenate reductase
NH44784_009281  1  15  -0.156866  Transcriptional regulator, PadR family
NH44784_009291  1  15  -0.145561  Chromate transport protein ChrA
NH44784_009301  1  16  -0.818040  GCN5-related N-acetyltransferase
NH44784_009311  0  20  -1.465848  HYPOTHETICAL/UNKNOWN PROTEIN
NH44784_009321  0  18  -1.739619  putative plasmid stabilization protein
NH44784_009331  1  23  -2.847721  RND efflux system, inner membrane transporter
NH44784_009341  2  23  -2.925141  RND efflux system, membrane fusion protein CmeA
NH44784_009351  2  24  -3.666589  hypothetical protein
NH44784_009361  2  24  -3.245239  Pyruvate/2-oxoglutarate dehydrogenase
NH44784_009371  2  21  -2.303216  hypothetical protein
NH44784_009381  3  22  -2.267967  FIG01210837: hypothetical protein
NH44784_009391  3  22  -1.987596  FIG00976384: hypothetical protein
NH44784_009401  3  21  -2.090735  hypothetical protein
NH44784_009411  2  20  -1.105070  FIG00958347: hypothetical protein
NH44784_009421  2  15  -1.150421  FIG00959181: hypothetical protein
NH44784_009431  1  15  -1.324190  FIG00956406: hypothetical protein
NH44784_009441  1  15  -0.828539  FIG00955362: hypothetical protein
NH44784_009451  1  15  -0.379360  DNA repair protein RadC
NH44784_009461  1  14  -0.012995  Protein-disulfide isomerase
NH44784_009471  2  14  0.183995  Type IV secretory pathway, VirB4 components
NH44784_009481  1  14  0.681256  putative lipoprotein
NH44784_009491  1  16  0.515724  hypothetical protein
NH44784_009501  2  19  1.089067  FIG00958851: hypothetical protein
NH44784_009511  0  20  -1.268605  Possible exported protein
NH44784_009521  0  24  -2.332080  hypothetical protein
NH44784_009531  0  27  -4.368491  hypothetical protein
NH44784_009541  2  32  -6.099307  hypothetical protein
NH44784_009551  1  30  -7.159745  Candidate type III effector Hop protein
NH44784_009561  2  32  -8.687616  Excisionase domain protein
NH44784_009571  1  28  -5.288142  hypothetical protein
NH44784_009581  1  27  -4.918244  AAA family ATPase, possible cell division
NH44784_009591  1  23  -4.055505  FIG00931459: hypothetical protein
NH44784_009601  0  22  -2.969966  hypothetical protein APECO1_2271
NH44784_009611  0  21  -1.563651  FIG00960841: hypothetical protein
NH44784_009621  0  18  -0.423001  ATPase involved in DNA repair
NH44784_009631  0  15  0.390456  hypothetical protein
NH44784_009641  -1  16  0.905383  Type IV secretory pathway, VirD4 components
NH44784_009651  1  17  0.443309  FIG00954712: hypothetical protein
NH44784_009661  1  18  -0.282612  FIG016425: Soluble lytic murein transglycosylase
NH44784_009671  2  18  -1.323774  probable exported protein STY4558
NH44784_009681  3  20  -1.977145  FIG00978796: hypothetical protein
NH44784_009691  3  20  -2.432671  PilL protein
NH44784_009701  4  21  -3.090660  Superfamily II DNA/RNA helicases, SNF2 family
NH44784_009711  3  23  -2.899501  FIG00957490: hypothetical protein
NH44784_009721  3  23  -3.173596  FIG00954465: hypothetical protein
NH44784_009731  2  23  -2.408966  FIG023873: Plasmid related protein
NH44784_009741  3  23  -3.048003  FIG026997: Hypothetical protein
NH44784_009751  4  25  -3.438832  FIG041301: Hypothetical protein
NH44784_009761  3  24  -2.213187  FIG046709: Hypothetical protein
NH44784_009771  3  24  -1.964680  FIG036757: Plasmid-related protein
NH44784_009781  2  21  -1.686440  FIG034376: Hypothetical protein
NH44784_009791  2  17  -0.819963  protein of unknown function DUF932
NH44784_009801  2  17  -0.320644  FIG049434: Periplasmic protein TonB, links inner
NH44784_009811  1  15  0.280461  FIG01218638: hypothetical protein
NH44784_009821  1  17  -0.793982  FIG076210: Hypothetical protein
NH44784_009831  1  16  -1.402231  TnpA transposase
NH44784_009841  2  17  -1.634357  Mercuric ion reductase
NH44784_009851  4  22  -3.572095  Periplasmic mercury(+2) binding protein
NH44784_009861  3  21  -3.767872  Mercuric transport protein, MerT
NH44784_009871  3  23  -4.338936  Mercuric resistance operon regulatory protein
NH44784_009881  3  23  -4.392531  FIG076676: Hypothetical protein
NH44784_009891  2  26  -4.694788  hypothetical protein
NH44784_009901  2  24  -4.312859  FIG00902157: hypothetical protein
NH44784_009911  2  20  -3.311630  FIG01213552: hypothetical protein
NH44784_009921  2  21  -3.633381  hypothetical protein
NH44784_009931  2  22  -3.634299  hypothetical protein
NH44784_009941  1  23  -3.689930  hypothetical protein
NH44784_009951  1  22  -3.041159  Lipoprotein signal peptidase
NH44784_009961  1  21  -2.923807  Lead, cadmium, zinc and mercury transporting
NH44784_009971  1  25  -3.398125  Transcriptional regulator, MerR family
NH44784_009981  1  23  -2.700469  COG3000: Sterol desaturase
NH44784_009991  0  23  -1.981579  Cobalt-zinc-cadmium resistance protein CzcD
NH44784_010001  2  21  -1.747912  DNA topoisomerase III
NH44784_010011  2  19  -0.623271  Putative single-stranded DNA binding protein
NH44784_010021  2  19  -0.240487  Integrase regulator R
NH44784_010031  2  20  0.256358  FIG141694: hypothetical protein in PFGI-1-like
NH44784_010041  2  21  0.354515  FIG141751: hypothetical protein in PFGI-1-like
NH44784_010051  2  23  -1.310763  FIG004780: hypothetical protein in PFGI-1-like
NH44784_010061  2  23  -2.282517  Protein with ParB-like nuclease domain in
NH44784_010071  2  24  -3.976332  FIG034647: hypothetical protein in PFGI-1-like
NH44784_010081  2  26  -4.857125  Chromosome partitioning ATPase in PFGI-1-like
NH44784_010091  2  29  -7.150813  Transcriptional regulator in PFGI-1-like
NH44784_010101  2  29  -6.920042  FIG041388: hypothetical protein in PFGI-1-like
NH44784_010121  4  36  -8.625123  *Integrase
NH44784_010131  3  41  -9.425449  Predicted dye-decolorizing peroxidase
NH44784_010141  3  43  -9.876508  LysR family transcriptional regulator YnfL
NH44784_010151  3  42  -9.543060  Threonine dehydrogenase and related Zn-dependent
NH44784_010161  3  42  -8.530118  hypothetical protein
NH44784_010171  2  41  -8.741448  Excinuclease ABC subunit B
NH44784_010181  1  33  -5.572533  Transcriptional regulator, LysR family
NH44784_010191  2  32  -5.938396  Isopropylmalate/homocitrate/citramalate
NH44784_010201  2  32  -6.089632  Histone acetyltransferase HPA2 and related
NH44784_010211  1  33  -6.536578  Carbon-nitrogen hydrolase
NH44784_010221  2  33  -6.703321  S-(hydroxymethyl)glutathione dehydrogenase
NH44784_010231  2  35  -5.980773  Lactoylglutathione lyase and related lyases
NH44784_010241  3  34  -4.664463  Mlr6914 protein
NH44784_010251  2  25  -2.727021  S-formylglutathione hydrolase
NH44784_010261  2  28  -2.758461  hypothetical protein
NH44784_010271  3  27  -2.573926  NADH dehydrogenase
NH44784_010281  3  26  -2.266359  ATPase, AAA family
NH44784_010291  2  26  -2.236306  Na+/H+ antiporter NhaA type
NH44784_010301  1  24  -3.321420  Excinuclease ABC subunit A paralog of unknown
NH44784_010311  2  32  -4.929208  Pyruvate/2-oxoglutarate dehydrogenase
NH44784_010321  1  27  -4.304919  COGs COG2002
NH44784_010331  2  27  -3.971026  hypothetical protein
NH44784_010341  2  25  -3.250763  hypothetical protein
NH44784_010351  3  26  -3.143130  hypothetical protein
NH44784_010361  3  25  -1.477227  FIG00958347: hypothetical protein
NH44784_010371  2  21  -2.108829  FIG00715740: hypothetical protein
NH44784_010381  1  20  -2.020940  FIG00956406: hypothetical protein
NH44784_010391  0  18  -1.892690  hypothetical protein
NH44784_010401  0  19  -1.334824  DNA repair protein RadC
NH44784_010411  1  18  -1.313316  Protein-disulfide isomerase
NH44784_010421  1  17  -1.322227  Type IV secretory pathway, VirB4 components
NH44784_010431  1  18  -0.573145  putative lipoprotein
NH44784_010441  2  20  -0.753378  hypothetical protein
NH44784_010451  3  22  0.277383  corresponds to STY4569 from Accession AL513382:
NH44784_010461  5  29  -2.970166  Possible exported protein
NH44784_010471  3  33  -4.867319  hypothetical protein
NH44784_010481  3  32  -6.667288  FIG00953614: hypothetical protein
NH44784_010491  2  37  -5.798711  hypothetical protein
NH44784_010501  2  34  -5.573346  Candidate type III effector Hop protein
NH44784_010511  1  31  -5.700157  hypothetical protein
NH44784_010521  0  29  -4.751653  FIG00715720: hypothetical protein
NH44784_010531  0  28  -3.658630  FIG00715078: hypothetical protein
NH44784_010541  0  27  -2.537586  ATPase involved in DNA repair
NH44784_010551  -1  20  -1.161323  hypothetical protein
NH44784_010561  -1  21  -0.647662  Type IV secretory pathway, VirD4 components
NH44784_010571  1  34  -3.338423  FIG00954712: hypothetical protein
NH44784_010581  2  44  -5.863941  FIG016425: Soluble lytic murein transglycosylase
NH44784_010591  2  40  -6.312183  probable exported protein STY4558
NH44784_010601  2  43  -7.156451  FIG00978796: hypothetical protein
NH44784_010611  2  44  -7.893894  PilL
NH44784_010621  3  42  -7.789541  FIG131328: Predicted ATP-dependent endonuclease
NH44784_010631  2  33  -5.889898  ATP-dependent DNA helicase UvrD/PcrA
NH44784_010641  4  24  -4.547876  Superfamily II DNA/RNA helicases, SNF2 family
NH44784_010651  4  25  -4.044840  FIG00957490: hypothetical protein
NH44784_010661  4  24  -3.721529  FIG00954465: hypothetical protein
NH44784_010671  3  23  -3.159126  FIG023873: Plasmid related protein
NH44784_010681  3  23  -3.773907  FIG026997: Hypothetical protein
NH44784_010691  3  28  -4.453574  FIG041301: Hypothetical protein
NH44784_010701  2  26  -4.918947  FIG046709: Hypothetical protein
NH44784_010711  1  26  -5.452317  FIG036757: Plasmid-related protein
NH44784_010721  2  25  -5.450875  FIG034376: Hypothetical protein
NH44784_010731  2  26  -5.953155  hypothetical protein
NH44784_010741  1  24  -4.507991  FIG049434: Periplasmic protein TonB, links inner
NH44784_010751  1  22  -4.681040  hypothetical protein
NH44784_010761  1  22  -4.194189  FIG00902157: hypothetical protein
NH44784_010771  2  20  -3.394795  hypothetical protein
NH44784_010781  1  24  -3.196980  hypothetical protein
NH44784_010791  1  25  -3.335063  DNA topoisomerase III
NH44784_010801  3  28  -2.711939  Single-stranded DNA-binding protein
NH44784_010811  3  29  -2.334346  Integrase regulator R
NH44784_010821  2  29  -2.075374  FIG141694: hypothetical protein in PFGI-1-like
NH44784_010831  2  31  -2.253209  FIG141751: hypothetical protein in PFGI-1-like
NH44784_010841  2  28  -3.155130  FIG004780: hypothetical protein in PFGI-1-like
NH44784_010851  1  26  -2.895879  Protein with ParB-like nuclease domain in
NH44784_010861  0  25  -4.545547  FIG034647: hypothetical protein in PFGI-1-like
NH44784_010871  1  19  -4.056190  Chromosome partitioning ATPase in PFGI-1-like
NH44784_010881  0  19  -4.071137  Transcriptional regulator in PFGI-1-like
NH44784_010891  1  19  -4.128096  FIG041388: hypothetical protein in PFGI-1-like
NH44784_010901  0  18  -3.323423  Transcriptional regulator, ArsR family
NH44784_010911  0  20  -3.604976  Arsenate reductase
NH44784_010921  1  20  -3.131344  Arsenical-resistance protein ACR3
NH44784_010931  1  23  -3.542425  FIG00460211: hypothetical protein
NH44784_010941  1  20  -3.596251  Arsenate reductase
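
Each row of the gene table above carries five fields: locus tag, dinucleotide bias (DNBias), codon bias (CDNBias), %GC bias and product. The following is a minimal Python sketch for reading such rows, whether or not the fields are separated by whitespace. The field widths it relies on (a six-digit locus tag suffix, a single-digit DNBias with optional sign, a two-digit CDNBias, six decimals in %GCBias) are assumptions read off the rows in this report, not a documented PredictBias format, and `BiasRow` and `parse_bias_row` are illustrative names.

```python
import re
from typing import NamedTuple, Optional


class BiasRow(NamedTuple):
    locus_tag: str
    dn_bias: int
    cdn_bias: int
    gc_bias: float
    product: str


# Assumed layout: tag, signed 1-digit DNBias, 2-digit CDNBias,
# %GCBias with exactly six decimals, then free-text product.
ROW_RE = re.compile(
    r"^(NH44784_\d{6})\s*(-?\d)\s*(\d{2})\s*(-?\d+\.\d{6})\s*(.*)$"
)


def parse_bias_row(line: str) -> Optional[BiasRow]:
    """Return the parsed row, or None for headers and other lines."""
    match = ROW_RE.match(line.strip())
    if match is None:
        return None
    tag, dn, cdn, gc, product = match.groups()
    return BiasRow(tag, int(dn), int(cdn), float(gc), product)


if __name__ == "__main__":
    for raw in (
        "NH44784_008341551-10.259469*Integrase",
        "NH44784_008541  4  21  0.013296  corresponds to STY4569 from Accession AL513382:",
    ):
        print(parse_bias_row(raw))
```

Fixing the widths is what lets a fused string such as `4210.013296` resolve to 4, 21 and 0.013296; a report with two-digit DNBias values would need a different pattern.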
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_008131  HTHFIS  29  0.027  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 28.6 bits (64), Expect = 0.027
Identities = 11/38 (28%), Positives = 17/38 (44%), Gaps = 1/38 (2%)

Query: 27 LKLAVNAALTLQRPLLIKGEPGTGKTMLAEEVARALDR 64
++ T L+I GE GTGK ++A + R
Sbjct: 150 YRVLARLMQT-DLTLMITGESGTGKELVARALHDYGKR 186
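
The per-hit statistics in these alignment blocks follow the usual BLAST text layout (`Score = ... bits (...), Expect = ...` and `Identities = n/m`). As a hedged sketch, the Python below extracts those numbers from one block and checks the hit against an E-value cutoff. The default of 0.05 mirrors the RPSBlast E-value listed in the input parameters; whether the tool compares with `<=` or `<` is an assumption, and `parse_hsp` and `passes_cutoff` are hypothetical helper names, not part of PredictBias.

```python
import re
from typing import Dict, Optional

SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*([\d.eE+-]+)"
)
IDENT_RE = re.compile(r"Identities\s*=\s*(\d+)/(\d+)")


def parse_hsp(block: str) -> Optional[Dict[str, float]]:
    """Extract score, E-value and identity counts from one alignment block."""
    score = SCORE_RE.search(block)
    ident = IDENT_RE.search(block)
    if score is None or ident is None:
        return None
    identical, aligned = int(ident.group(1)), int(ident.group(2))
    return {
        "bit_score": float(score.group(1)),
        "raw_score": int(score.group(2)),
        "evalue": float(score.group(3)),
        "identical": identical,
        "aligned": aligned,
        "identity": identical / aligned,
    }


def passes_cutoff(hsp: Dict[str, float], cutoff: float = 0.05) -> bool:
    # Assumption: a hit is kept when its E-value does not exceed the cutoff.
    return hsp["evalue"] <= cutoff


if __name__ == "__main__":
    example = (
        "Score = 28.6 bits (64), Expect = 0.027\n"
        "Identities = 11/38 (28%), Positives = 17/38 (44%), Gaps = 1/38 (2%)"
    )
    hsp = parse_hsp(example)
    print(hsp, passes_cutoff(hsp))
```

When a hit reports more than one alignment segment, only the first `Score` line in the supplied text is read, so each segment would need to be passed in separately.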


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_008151  SACTRNSFRASE  34  1e-04  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 34.1 bits (78), Expect = 1e-04
Identities = 16/68 (23%), Positives = 28/68 (41%), Gaps = 2/68 (2%)

Query: 87 AWPAYKYSVEHSVYVDGRFRGRGLGEALMRVLIERARANQVHLMIGGIDAANQGSIKLHE 146
W Y +E + V +R +G+G AL+ IE A+ N ++ N + +
Sbjct: 85 NWNGYAL-IED-IAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYA 142

Query: 147 KLGFVHAG 154
K F+
Sbjct: 143 KHHFIIGA 150


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_008211  SHAPEPROTEIN  141  3e-39  Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 141 bits (357), Expect = 3e-39
Identities = 84/388 (21%), Positives = 141/388 (36%), Gaps = 77/388 (19%)

Query: 2 SKIIGIDLGTTNSCVAVMDGGQVKIIENAEGART----TPSIVAYMDDGETLVGAPAKRQ 57
S + IDLGT N+ + V G V + R +P VA VG AK+
Sbjct: 10 SNDLSIDLGTANTLIYVKGQGIVLNEPSVVAIRQDRAGSPKSVA-------AVGHDAKQM 62

Query: 58 AVTNPKNTLYAVKRLIGRKFEEKAVQKDINLMPYAIVKADNGDAWVEVRGKKMAPPQVSA 117
P N + A++ + + +A V+
Sbjct: 63 LGRTPGN-IAAIRPM---------------------------------KDGVIADFFVTE 88

Query: 118 DVLRK-MKKTAEDYLGEEVTEAVITVPAYFNDSQRQATKDAGRIAGLEVKRIINEPTAAA 176
+L+ +K+ + ++ VP +R+A +++ + AG +I EP AAA
Sbjct: 89 KMLQHFIKQVHSNSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGAREVFLIEEPMAAA 148

Query: 177 LAFGLDKTEKGDRKIAVYDLGGGTFDVSIIEIADVDGEKQFEVLSTNGDTFLGGEDFDQR 236
+ GL +E V D+GGGT +V++I + V + +GG+ FD+
Sbjct: 149 IGAGLPVSE--ATGSMVVDIGGGTTEVAVISLNGV---------VYSSSVRIGGDRFDEA 197

Query: 237 IIDYIIGEFKKEQGVDLSKDVLALQRLKEAAEKAKIELSSS----QQTEINLPYITADAS 292
II+Y+ + G + AE+ K E+ S+ + EI +
Sbjct: 198 IINYVRRNYGSLIG-------------EATAERIKHEIGSAYPGDEVREIEVRGRNLAEG 244

Query: 293 GPKHLNLKITRAKLEALVEELIERTIEPCRVAIKDAGVKVSDIDD--VILVGGMTRMPKV 350
P+ L + LEAL E L + SDI + ++L GG + +
Sbjct: 245 VPRGFTLN-SNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNL 303

Query: 351 QEKVKEFFGKDPRKDVNPDEAVAAGAAI 378
+ E G +P VA G
Sbjct: 304 DRLLMEETGIPVVVAEDPLTCVARGGGK 331


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_008231  RTXTOXIND  30  0.006  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.8 bits (67), Expect = 0.006
Identities = 18/126 (14%), Positives = 41/126 (32%), Gaps = 6/126 (4%)

Query: 17 DADAAPAAQDAAQAELNELRAQLDAAQATVNEQQDQLLRVRAEAENVRRRAQEEVSKARK 76
+AD QA L + R Q+ + + ++L ++ E + EE
Sbjct: 133 EADTLKTQSSLLQARLEQTRYQILSRSI----ELNKLPELKLPDEPYFQNVSEEEVLRLT 188

Query: 77 FGIESFAESLVPVKDSLEAALAQADQTVDTLREGVEVTLKQLAAAFERNLLKEIAPVQGD 136
I+ + K E L + T+ + + + E++ L + + +
Sbjct: 189 SLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARIN--RYENLSRVEKSRLDDFSSLLHK 246

Query: 137 KFDPHL 142
+
Sbjct: 247 QAIAKH 252


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_008271  PF06057  34  3e-04  Type IV secretory pathway VirJ component
		>PF06057#Type IV secretory pathway VirJ component

Length = 243

Score = 34.4 bits (79), Expect = 3e-04
Identities = 18/62 (29%), Positives = 25/62 (40%), Gaps = 9/62 (14%)

Query: 40 DTARNTGLTEYPVATYEEIGKDAS-----LAVVMGGDGTVL----GAARHLAPYGVPVVG 90
+ A N GLT PV ++ +S L + + GDG L G PVVG
Sbjct: 24 EFADNLGLTLLPVEPSTQVNAASSHTKPPLVIFLSGDGGWATLDKAVGGILQQQGWPVVG 83

Query: 91 IN 92
+
Sbjct: 84 WS 85


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_008281  CHANLCOLICIN  35  6e-04  Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 35.4 bits (81), Expect = 6e-04
Identities = 51/262 (19%), Positives = 89/262 (33%), Gaps = 36/262 (13%)

Query: 129 HAHQSLMRPEAQRDLLDAHGGHGELRQGVAQAWKQWRALARQLELAEKDAAGLAAERERL 188
HA+ + M+ E +R L A+A ++ R A E A ++A E ER
Sbjct: 117 HANNAAMQAEDERLRL-------------AKAEEKARKEAEAAEKAFQEAEQRRKEIERE 163

Query: 189 QWQVDELDRLGLAPDEWDALQSEHTRLSHSQSLLDGATQILDALDGEGDSAHHRLTAANQ 248
+ + + +L A + LS ++ A + L A E + N
Sbjct: 164 KAETERQLKLAEA------EEKRLAALSEEAKAVEIAQKKLSAAQSEVVKMDGEIKTLNS 217

Query: 249 RIQQMLRHD----TGLQGIYDELESARIAISEAVSDLNNYVSRVDLDPRRLADVEARLSA 304
R+ + L G +EL A E + R + + EA
Sbjct: 218 RLSSSIHARDAEMKTLAGKRNELAQASAKYKELDELVKKLSPRANDPLQNRPFFEATRRR 277

Query: 305 V-----FETARKFRTEPEALCALRDSLHAELSALQAAADIDALRAQAQAAQAQYDAAAAK 359
V E +K T E + ++A+++ +Q A I + A A+ A
Sbjct: 278 VGAGKIREEKQKQVTASE---TRINRINADITQIQKA--ISQVSNNRNAGIARVHEAEEN 332

Query: 360 LTTARRKVAKDLGKQVTQAMQT 381
L ++ L Q+ A+
Sbjct: 333 L---KKAQNNLLNSQIKDAVDA 351


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_008291  ACRIFLAVINRP  28  0.012  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 28.3 bits (63), Expect = 0.012
Identities = 11/49 (22%), Positives = 21/49 (42%), Gaps = 5/49 (10%)

Query: 30 QRHLSAEDVYRALIGENVEIG----LATVYRVLTQFEQAGILARSQFDS 74
+ L+ DV L +N +I T Q + I+A+++F +
Sbjct: 195 KYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNAS-IIAQTRFKN 242


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_008361  TCRTETB  121  4e-32  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 121 bits (306), Expect = 4e-32
Identities = 77/391 (19%), Positives = 164/391 (41%), Gaps = 16/391 (4%)

Query: 26 FVVVLDTTITNVAMPTISGFLGVSTTEGTWIITAYAVAEAITVPLTGWLSRQFGQVKVFI 85
F VL+ + NV++P I+ W+ TA+ + +I + G LS Q G ++ +
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 86 VSVALFVLFSVCCGLSWSLPSLVLF-RVLQGFAGGPLIPLSATLLLSVFPEKKSNVALAL 144
+ + SV + S SL++ R +QG L ++ P++ A L
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 145 WGMMTVVAPIIGPILGGLISDNWRWQWVFYINIGFGLLVGLGSWWVLKGRETKIQRSRLD 204
G + + +GP +GG+I+ W ++ I + + + + + ++ + D
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPM---ITIITVPFLMKLLKKEVRIKGHFD 200

Query: 205 GVGLTLLVVFVTAFQVMLDKGRELDWFSSNVILTCAIVSAISLILFIIWELTDEQPVIDL 264
G+ L+ V + F + F+++ ++ IVS +S ++F+ P +D
Sbjct: 201 IKGIILMSVGIVFFML----------FTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDP 250

Query: 265 SVLKSRNWVVSTITLCLMYGIFFGNIVLTPLWLQQWMGYTATWAGLATAPMGILAV-VTS 323
+ K+ +++ + +++G G + + P ++ + G G ++V +
Sbjct: 251 GLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFG 310

Query: 324 PLVGRLLPKVDPRLLVTYGMGVLAASFVMRALMTSQVDFMSVAIPMFVLGAGIPACVITL 383
+ G L+ + P ++ G+ L+ SF+ + + + I +FVLG G+ +
Sbjct: 311 YIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLG-GLSFTKTVI 369

Query: 384 TSLGVSDLPPDRIAGGSGLQNFLRVLCMAIG 414
+++ S L G L NF L G
Sbjct: 370 STIVSSSLKQQEAGAGMSLLNFTSFLSEGTG 400


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_008371  RTXTOXIND  97  4e-24  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 97.2 bits (242), Expect = 4e-24
Identities = 75/413 (18%), Positives = 134/413 (32%), Gaps = 74/413 (17%)

Query: 25 KKMLIAFGAALLVVLACYFIWLIF--------FAGKTVTTDNAYTAVEVAQVTPLVSGPV 76
+ ++ L FI + GK + + + P+ + V
Sbjct: 54 SRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKE------IKPIENSIV 107

Query: 77 KEVKVVDTQAVHAGDVLVVLDDTDAKIALEQAEADLGRAR-------------------- 116
KE+ V + ++V GDVL+ L A+ + ++ L +AR
Sbjct: 108 KEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPE 167

Query: 117 ------------------RQVRQIVANDTTLAGQMDQRAAAIQSAQHE---VTRARSRYD 155
R I +T Q Q+ + + E V +RY+
Sbjct: 168 LKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYE 227

Query: 156 KAVLDEKRR----RNLVEGGAVSAQDFTDSQAELREATAALGQAEANLKAAGAASVSAKG 211
EK R +L+ A++ + + + EA L ++ L+ + +SAK
Sbjct: 228 NLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKE 287

Query: 212 SRQANEALF----LDSTVETNPIVVTAKAHADQARVNLDRTVLRAPVDGIVTQRSV-DIG 266
Q LF LD +T + + +V+RAPV V Q V G
Sbjct: 288 EYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEG 347

Query: 267 QQVQAGMRLMNIVPIQQIY-VDANFKEGQLRNVKPGQKARLTADIYGDDVEYDGRVEGFA 325
V LM IVP V A + + + GQ A + + + Y G + G
Sbjct: 348 GVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAF-PYTRY-GYLVG-- 403

Query: 326 GGSGSALSVIPAQNATGNWIKVVQRLPVRIRLDPEQLKKHPLRIGLSMEVSVD 378
+ I + +V + + I + + + M V+ +
Sbjct: 404 -----KVKNINLDAIEDQRLGLVFNVIISIEENCLSTGNKNIPLSSGMAVTAE 451


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_008421  PF03544  30  0.024  Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 29.9 bits (67), Expect = 0.024
Identities = 27/125 (21%), Positives = 35/125 (28%), Gaps = 8/125 (6%)

Query: 342 ERPAPFAGTVAIDAAPSD----QDADASASTPTIASKPPAENQEPPLWENMGGAAAPARP 397
E PAP AP+D Q P EPP + +P
Sbjct: 42 ELPAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKP 101

Query: 398 PAAQTSPDVVEDLLEMVGMADPSNASQDGEAASVDAPAPTSEAAMSATA----APPPPVS 453
VE V + AS A + T+ AA S + P +S
Sbjct: 102 KPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASGPRALS 161

Query: 454 MAAPQ 458
PQ
Sbjct: 162 RNQPQ 166


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_008631  FLGFLGJ  29  0.010  Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 28.9 bits (64), Expect = 0.010
Identities = 15/30 (50%), Positives = 21/30 (70%), Gaps = 1/30 (3%)

Query: 7 ELPPPAYQIAAQQAGVPSPVLFAVALQESG 36
+L PA Q+A+QQ+GVP ++ A A ESG
Sbjct: 155 QLSLPA-QLASQQSGVPHHLILAQAALESG 183


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_008701  56KDTSANTIGN  27  0.014  Rickettsia 56kDa type-specific antigen protein sign...
		>56KDTSANTIGN#Rickettsia 56kDa type-specific antigen protein signature.

Length = 533

Score = 27.2 bits (60), Expect = 0.014
Identities = 10/29 (34%), Positives = 13/29 (44%)

Query: 52 GETDEATRQANDVAIGADDPMISHFRITP 80
GE D D G D P+ F++TP
Sbjct: 108 GEVDSKGEIKADSGGGTDAPIRKPFKLTP 136


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_008941  ARGREPRESSOR  33  0.001  Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 33.3 bits (76), Expect = 0.001
Identities = 15/46 (32%), Positives = 21/46 (45%), Gaps = 12/46 (26%)

Query: 160 SQSELARRLAADGFPVQQSHISRM------------NDAVRYLLPA 193
+Q EL L DG+ V Q+ +SR N + +Y LPA
Sbjct: 21 TQDELVDILKKDGYNVTQATVSRDIKELHLVKVPTNNGSYKYSLPA 66


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_009001  SACTRNSFRASE  41  2e-07  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 41.5 bits (97), Expect = 2e-07
Identities = 21/86 (24%), Positives = 35/86 (40%), Gaps = 8/86 (9%)

Query: 60 ALAGYRFQENLI--------YGRFLYVDDLVVTERSRGARHGASLLQALERMAREAGCAK 111
A Y + N I + + ++D+ V + R G +LL A+E
Sbjct: 66 AAFLYYLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCG 125

Query: 112 LVLDTGLGNALAQRFYFRQGLLTGAM 137
L+L+T N A FY + + GA+
Sbjct: 126 LMLETQDINISACHFYAKHHFIIGAV 151


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_009061  HTHFIS  30  0.009  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 30.2 bits (68), Expect = 0.009
Identities = 15/94 (15%), Positives = 36/94 (38%), Gaps = 10/94 (10%)

Query: 34 MEPEHWMAELLQAESAERQVRAQANQMKAARFPVHRDLLGFDFAASPVDRALVERLHQGD 93
+ + + +A + + R + ++ L+G S + + L +
Sbjct: 106 FDLTELIGIIGRALAEPK--RRPSKLEDDSQDG--MPLVG----RSAAMQEIYRVLAR-- 155

Query: 94 FIHTPENVVLVGGPGTGKTHLATAIGIQAVRQHR 127
+ T +++ G GTGK +A A+ R++
Sbjct: 156 LMQTDLTLMITGESGTGKELVARALHDYGKRRNG 189


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_009111  DHBDHDRGNASE  27  0.014  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 26.6 bits (58), Expect = 0.014
Identities = 12/36 (33%), Positives = 19/36 (52%), Gaps = 1/36 (2%)

Query: 5 YTEEQII-GFLKQAAAGTPVKELCRKHGFSDASFYL 39
EQ+I G L+ G P+K+L + +DA +L
Sbjct: 203 NGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFL 238


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_009201  PF07132  31  0.011  Harpin protein (HrpN)
		>PF07132#Harpin protein (HrpN)

Length = 356

Score = 31.2 bits (70), Expect = 0.011
Identities = 21/67 (31%), Positives = 29/67 (43%), Gaps = 1/67 (1%)

Query: 330 MGGMMG-GMDHGGSGQAMEMTGMTGMTGMTGMTGMGSGSMSGMDHGAMAGMNHGAMNHGG 388
MG MMG G+ G G + G+ G G+ G S+ A+ G GA+ G
Sbjct: 61 MGSMMGGGLGGGLGGLGSSLGGLGGGLLGGGLGGGLGSSLGSGLGSALGGGLGGALGAGM 120

Query: 389 MAMDHSQ 395
AM+ S
Sbjct: 121 NAMNPSA 127


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_009211  HTHFIS  82  3e-20  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 82.2 bits (203), Expect = 3e-20
Identities = 39/136 (28%), Positives = 64/136 (47%), Gaps = 3/136 (2%)

Query: 2 KLLVVEDENKTADYVRQGLMEAGFVVDLARNGLDGHHLAMGETYDLVVLDVMLPDVDGWR 61
+LV +D+ + Q L AG+ V + N DLVV DV++PD + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 IVRALRDAGKQVPVLFLTARGGVDDRVKGLELGADDYLVKPFAFSELLARVRTLL---RR 118
++ ++ A +PVL ++A+ +K E GA DYL KPF +EL+ + L +R
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 119 GSAPSQPDRIQVADLV 134
+ + D LV
Sbjct: 125 RPSKLEDDSQDGMPLV 140


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_009301  SACTRNSFRASE  28  0.014  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 28.0 bits (62), Expect = 0.014
Identities = 10/52 (19%), Positives = 23/52 (44%), Gaps = 6/52 (11%)

Query: 91 ISVLPPRQGRGIGSRLMEQALSELRAMQAAGCVL------LGDPTYYTRFGF 136
I+V + +G+G+ L+ +A+ + G +L + +Y + F
Sbjct: 95 IAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_009331  ACRIFLAVINRP  124  4e-36  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 124 bits (312), Expect = 4e-36
Identities = 49/119 (41%), Positives = 79/119 (66%), Gaps = 5/119 (4%)

Query: 3 LSKFFMDRPIFAAVLSIVIFVSGLLAIPALPISEYPK-----VVVRAVYPGANPKVIAET 57
++ FF+ RPIFA VL+I++ ++G LAI LP+++YP V V A YPGA+ + + +T
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 58 VATPLEEQINGVEGMMYIKSVAGSDGVLVTTVTFRSGINADDATVRVQNRVAQAQARLP 116
V +E+ +NG++ +MY+ S + S G + T+TF+SG + D A V+VQN++ A LP
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLP 119


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_009341  RTXTOXIND  51  2e-09  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 51.4 bits (123), Expect = 2e-09
Identities = 25/121 (20%), Positives = 48/121 (39%), Gaps = 5/121 (4%)

Query: 2 SGYVERVAYREGDEVKKGDVLFMIDARSYRAELARAQAQLTRARSEAGRSR--SEAQRAK 59
+ V+ + +EG+ V+KGDVL + A A+ + Q+ L +AR E R + S +
Sbjct: 104 NSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELN 163

Query: 60 VLVEQQAVSTEAWEQRRAADESAQ---AEVQAAQAAVETAQLNLERTQVRSPINGRASRA 116
L E + ++ + + Q + + Q L + R+ +R
Sbjct: 164 KLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARI 223

Query: 117 R 117

Sbjct: 224 N 224



Score = 34.0 bits (78), Expect = 6e-04
Identities = 13/82 (15%), Positives = 34/82 (41%)

Query: 25 IDARSYRAELARAQAQLTRARSEAGRSRSEAQRAKVLVEQQAVSTEAWEQRRAADESAQA 84
++ RAE A++ R + + +S L+ +QA++ A ++ A
Sbjct: 207 LNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVN 266

Query: 85 EVQAAQAAVETAQLNLERTQVR 106
E++ ++ +E + + +
Sbjct: 267 ELRVYKSQLEQIESEILSAKEE 288


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_009391  OMPADOMAIN  30  0.003  OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 29.5 bits (66), Expect = 0.003
Identities = 15/34 (44%), Positives = 19/34 (55%), Gaps = 5/34 (14%)

Query: 14 GRTFGRGWRAY--ARGERRAS---NWLVSKGVPA 42
G T G AY ERRA ++L+SKG+PA
Sbjct: 259 GYTDRIGSDAYNQGLSERRAQSVVDYLISKGIPA 292


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_009811  PYOCINKILLER  26  0.039  Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 25.5 bits (55), Expect = 0.039
Identities = 12/27 (44%), Positives = 12/27 (44%)

Query: 47 TYTMPDEGGASLRAAGWRLIGARGGGA 73
TY MP G AAG LI G A
Sbjct: 249 TYAMPANGSVVATAAGRGLIQVAQGAA 275


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_009851  ENTEROVIROMP  27  0.007  Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 27.2 bits (60), Expect = 0.007
Identities = 13/29 (44%), Positives = 18/29 (62%)

Query: 1 MKKLTTLTTLIALAAALSAPAWAATKTVT 29
MKK+ L+ L A+ A + + AAT TVT
Sbjct: 1 MKKIACLSALAAVLAFTAGTSVAATSTVT 29


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_010061  ARGREPRESSOR  29  0.032  Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 28.7 bits (64), Expect = 0.032
Identities = 13/46 (28%), Positives = 20/46 (43%), Gaps = 12/46 (26%)

Query: 165 SQSELARRLAADGFPVQRSHITRMAD---AVR---------YLLPA 198
+Q EL L DG+ V ++ ++R V+ Y LPA
Sbjct: 21 TQDELVDILKKDGYNVTQATVSRDIKELHLVKVPTNNGSYKYSLPA 66


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_010441  PF02370  30  0.009  M protein repeat
		>PF02370#M protein repeat

Length = 168

Score = 30.5 bits (68), Expect = 0.009
Identities = 21/97 (21%), Positives = 43/97 (44%), Gaps = 3/97 (3%)

Query: 36 EEMKALGIEGDTPRDTVATLVAQVKQLRTELQTALSDNKSQREENQRLRQRENSIDQRIN 95
+ +AL E R ++++L E + + RE+ +R Q ++ +Q+
Sbjct: 48 PQYRALMGENQDLRKREGQYQDKIEELEKERKEKQERPER-REKFERQHQDKHYQEQQ-- 104

Query: 96 SALETERSNLRRDQQQAASERQQTEGLLADLQRRLEG 132
+ E+ L ++Q+ A E+Q ++ L R LE
Sbjct: 105 KKHQQEQQQLEAEKQKLAKEKQISDASRQGLNRDLEA 141


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_010451  SHIGARICIN  29  0.014  Ribosome inactivating protein family signature.
		>SHIGARICIN#Ribosome inactivating protein family signature.

Length = 289

Score = 29.4 bits (66), Expect = 0.014
Identities = 12/47 (25%), Positives = 20/47 (42%), Gaps = 4/47 (8%)

Query: 185 NLRRDLDLGTLVPTLPVRAVALASWRLEDQWVTAVRLTNSSGGWITL 231
NLR+ L + +P+ L S Q + LTN + I++
Sbjct: 43 NLRKALPYERKLYDIPL----LRSTLPGSQRYALIHLTNYADETISV 85


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_010851  ARGREPRESSOR  33  0.001  Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 33.3 bits (76), Expect = 0.001
Identities = 15/46 (32%), Positives = 20/46 (43%), Gaps = 12/46 (26%)

Query: 165 SQSELARRLAADGFPVQQSHISRMAD---AVR---------YLLPA 198
+Q EL L DG+ V Q+ +SR V+ Y LPA
Sbjct: 21 TQDELVDILKKDGYNVTQATVSRDIKELHLVKVPTNNGSYKYSLPA 66


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_010931  TCRTETB  29  0.024  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 29.5 bits (66), Expect = 0.024
Identities = 18/87 (20%), Positives = 33/87 (37%), Gaps = 5/87 (5%)

Query: 33 VLMQP--MQDELGLSKPAIVGAYSV--ALLISGLLSTAAGSIIDRIGGRLLMGSGALLAA 88
V M P M+D LS A +G+ + + + G ++DR G ++ G +
Sbjct: 276 VSMVPYMMKDVHQLS-TAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLS 334

Query: 89 VMLACLSRVHNATELYLVWAGIGVAMS 115
V S + T ++ + V
Sbjct: 335 VSFLTASFLLETTSWFMTIIIVFVLGG 361


11  NH44784_013391  NH44784_013581  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
NH44784_013391  2  12  0.387201  D-alanine--D-alanine ligase
NH44784_013401  2  12  0.920176  UDP-N-acetylmuramate--alanine ligase
NH44784_013411  0  11  0.548898  UDP-N-acetylglucosamine--N-acetylmuramyl-
NH44784_013421  0  11  -0.070825  Cell division protein FtsW
NH44784_013431  -2  11  0.116867  UDP-N-acetylmuramoylalanine--D-glutamate ligase
NH44784_013441  -3  13  -0.729113  Phospho-N-acetylmuramoyl-pentapeptide-
NH44784_013451  -1  9  -0.035623  UDP-N-acetylmuramoylalanyl-D-glutamate--2,
NH44784_013461  1  10  -1.178766  Cell division protein FtsI [Peptidoglycan
NH44784_013471  2  15  -1.865287  Cell division protein FtsL
NH44784_013481  2  15  -2.187831  rRNA small subunit methyltransferase H
NH44784_013491  1  15  -2.794912  Cell division protein MraZ
NH44784_013501  1  15  -2.914428  COG0488: ATPase components of ABC transporters
NH44784_013511  -1  22  -4.350814  Dihydroorotase
NH44784_013521  1  25  -5.150568  Catalyzes the cleavage of
NH44784_013531  -1  17  -5.214522  DNA topoisomerase III
NH44784_013541  0  15  -4.543894  Glycine cleavage system transcriptional
NH44784_013551  1  13  -3.836012  putative amino acid efflux protein
NH44784_013561  1  11  -3.150161  Probable transmembrane protein
NH44784_013581  1  9  -3.225093  Transcriptional regulator, AsnC family
12  NH44784_013781  NH44784_013981  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
NH44784_013781  0  15  -3.552269  Tricarboxylate transport transcriptional
NH44784_013791  3  16  -3.500459  Adenosine (5')-pentaphospho-(5'')-adenosine
NH44784_013801  4  17  -3.049457  prolyl-tRNA synthetase
NH44784_013811  5  22  -3.612185  4-hydroxybenzoyl-CoA thioesterase family active
NH44784_013821  4  20  -2.552006  MotA/TolQ/ExbB proton channel family protein
NH44784_013831  3  16  -1.445747  Tol biopolymer transport system, TolR protein
NH44784_013841  3  15  -0.802285  TolA protein
NH44784_013851  -2  13  -0.479521  tolB protein precursor, periplasmic protein
NH44784_013861  -1  10  1.357506  18K peptidoglycan-associated outer membrane
NH44784_013871  0  12  2.048612  TPR repeat containing exported protein; Putative
NH44784_013881  2  15  2.972739  Iron(III) dicitrate transport system permease
NH44784_013891  -1  12  1.059671  Ferrichrome transport ATP-binding protein FhuC
NH44784_013901  -1  12  0.628771  hypothetical protein
NH44784_013911  -1  11  0.435627  putative hydrolase
NH44784_013921  -2  9  0.302630  Leucine-responsive regulatory protein, regulator
NH44784_013931  -3  8  0.676516  Protein of unknown function DUF6, transmembrane
NH44784_013941  -2  9  -0.363270  Translation elongation factor G
NH44784_013951  -1  11  0.947008  Universal stress protein family
NH44784_013961  3  12  1.091396  hypothetical protein
NH44784_013971  3  11  1.298336  Transcriptional regulator, GntR family domain /
NH44784_013981  2  11  1.001053  Permease of the drug/metabolite transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_013781  HTHFIS  91  1e-23  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 91.4 bits (227), Expect = 1e-23
Identities = 33/126 (26%), Positives = 63/126 (50%), Gaps = 1/126 (0%)

Query: 2 RILLVEDERDMASWLMRALAQSGFVPDHAADARTAEAFMAGTEYDAIVMDLRLPDKHGLV 61
IL+ +D+ + + L +AL+++G+ ++A T ++A + D +V D+ +PD++
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 VLREMRNRDDRTPVLLLTAQGALQDRVRGLNLGADDFLTKPFALEE-LEARLTALVRRSR 120
+L ++ PVL+++AQ ++ GA D+L KPF L E + AL R
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 121 GRQHPR 126

Sbjct: 125 RPSKLE 130


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_013841  IGASERPTASE  72  8e-16  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 72.4 bits (177), Expect = 8e-16
Identities = 30/182 (16%), Positives = 61/182 (33%), Gaps = 4/182 (2%)

Query: 63 SPVSPPPQPQPEPSKPQPKPQPEPEPEKQPEPPPPPPKPEPQAQPKADEVDPEIALQEAK 122
+ ++ P Q + + ++ P PPP P P + A+ E E
Sbjct: 995 TNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKN 1054

Query: 123 KKKEKEEQARQAAEAAKEKARLEEERKQAELKEKKRLEQERQAAEKAAADKAAAEKAAAD 182
++ E A+ A + K+ ++ + E+ + +E Q E K A +
Sbjct: 1055 EQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTET----KETATVEKEE 1110

Query: 183 KAAAEKKAKEDAAKKAAAEKAAAEKAAAEKAAADKAAAEKAAAEKKAKEEAAKKAAADKA 242
KA E + ++ K + E++ + A+ A K + A +
Sbjct: 1111 KAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQ 1170

Query: 243 LK 244

Sbjct: 1171 PA 1172



Score = 54.7 bits (131), Expect = 3e-10
Identities = 28/195 (14%), Positives = 57/195 (29%), Gaps = 5/195 (2%)

Query: 62 NSPVSPPPQPQPEPSKPQPKPQPEPEPEKQPEPPPPPPKPEPQAQPKADEVDPEI--ALQ 119
+PV PP P + + E + + + Q + A E + Q
Sbjct: 1022 EAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQ 1081

Query: 120 EAKKKKEKEEQARQAAEAAKEKARLEEERKQAELKEKKRLEQERQAAEKAAADKAAAEKA 179
+ + E KE A +E+E K EK + + + +++ +
Sbjct: 1082 TNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQP 1141

Query: 180 AADKAAAEKKAKEDAAKKAAAE--KAAAEKAAAEKAAADKAAAEKAAAEKKAKEEAAKKA 237
A+ A E + + + A E+ A E ++ + ++
Sbjct: 1142 QAE-PARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPE 1200

Query: 238 AADKALKDAFRNDAL 252
A N
Sbjct: 1201 NTTPATTQPTVNSES 1215



Score = 53.5 bits (128), Expect = 7e-10
Identities = 26/149 (17%), Positives = 48/149 (32%), Gaps = 9/149 (6%)

Query: 103 PQAQPKADEVDPE-IALQEAKKKKEKEEQARQAAEAAKEKARLE-----EERKQAELK-- 154
P+ + + VD I + + A ++A + + E
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAE 1042

Query: 155 EKKRLEQERQAAEKAAADKAAAEKAAADKAAAEKKAKEDAAKKAAAEKAAAEKAAAE-KA 213
K+ + + E+ A + A + A +A + KA + A + E E K
Sbjct: 1043 NSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKE 1102

Query: 214 AADKAAAEKAAAEKKAKEEAAKKAAADKA 242
A EKA E + +E K +
Sbjct: 1103 TATVEKEEKAKVETEKTQEVPKVTSQVSP 1131



Score = 43.5 bits (102), Expect = 1e-06
Identities = 25/190 (13%), Positives = 53/190 (27%), Gaps = 23/190 (12%)

Query: 67 PPPQPQPEPSKPQPKPQPEPEP-----EKQPEPPPPPPKPEPQAQPKADEVDPEIALQEA 121
P Q Q E +PQ +P E +P E Q + +P + ++ P
Sbjct: 1130 SPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTV 1189

Query: 122 KKKKEKEEQARQAAEAAKEKARLEEERKQAELKEKKRLEQERQAAEKAAADKAAAEKAAA 181
E A + E + + + ++ + E A
Sbjct: 1190 NTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATT---------- 1239

Query: 182 DKAAAEKKAKEDAAKKAAAEKAAAEKAAAEKAAADKAAAEKAAAEKKAKEEAAKKAAADK 241
+ D + A + + A A KA K + ++ ++
Sbjct: 1240 --------SSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAVSQHISQLEMNNE 1291

Query: 242 ALKDAFRNDA 251
+ + ++
Sbjct: 1292 GQYNVWVSNT 1301



Score = 35.0 bits (80), Expect = 5e-04
Identities = 17/155 (10%), Positives = 42/155 (27%)

Query: 65 VSPPPQPQPEPSKPQPKPQPEPEPEKQPEPPPPPPKPEPQAQPKADEVDPEIALQEAKKK 124
V P +P E +P+ + + P + + E +
Sbjct: 1139 VQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVEN 1198

Query: 125 KEKEEQARQAAEAAKEKARLEEERKQAELKEKKRLEQERQAAEKAAADKAAAEKAAADKA 184
E A E + + R + ++ + + + A + + +
Sbjct: 1199 PENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTN 1258

Query: 185 AAEKKAKEDAAKKAAAEKAAAEKAAAEKAAADKAA 219
A A+ A A A + ++ ++
Sbjct: 1259 AVLSDARAKAQFVALNVGKAVSQHISQLEMNNEGQ 1293



Score = 33.5 bits (76), Expect = 0.002
Identities = 22/182 (12%), Positives = 51/182 (28%), Gaps = 11/182 (6%)

Query: 74 EPSKPQPKPQPEPEP-EKQPEPPPPPPKPEPQAQPKADEVDPEIALQEAKKKKEKEEQAR 132
E ++ PK + P ++Q E P +P E DP + ++E + +
Sbjct: 1117 EKTQEVPKVTSQVSPKQEQSETVQPQAEPAR-------ENDPTVNIKEPQSQTNTTADTE 1169

Query: 133 QAAEAAKEKARLEEERKQAELKEKKRLEQERQAAEKAAADKAAAEKAAADKAAAEKKAKE 192
Q A+ +E +E + K + +
Sbjct: 1170 QPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRS 1229

Query: 193 DAAKKAAAEKAAAEKAAAEKAAADKAAAEKAAAEKKAKEE---AAKKAAADKALKDAFRN 249
A ++ +++ ++ +AK + A + + N
Sbjct: 1230 VPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAVSQHISQLEMN 1289

Query: 250 DA 251
+
Sbjct: 1290 NE 1291


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_013861  OMPADOMAIN  92  9e-25  OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 91.5 bits (227), Expect = 9e-25
Identities = 24/113 (21%), Positives = 47/113 (41%), Gaps = 11/113 (9%)

Query: 61 SVYFDFDSYTVSDQYRGLVETHARYLASH--QQQKIKIEGNTDERGGAEYNLALGQRRAD 118
V F+F+ T+ + + ++ L++ + + + G TD G YN L +RRA
Sbjct: 220 DVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQ 279

Query: 119 AVRRMMTLLGVSDNQIETISFGKEKPKATGSSE---------ADFAENRRADI 162
+V + G+ ++I G+ P + + A +RR +I
Sbjct: 280 SVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEI 332


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_013891  PF05272  29  0.019  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.019
Identities = 11/22 (50%), Positives = 12/22 (54%)

Query: 37 VTALLGPNGSGKSTLLKALAGL 58
L G G GKSTL+ L GL
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_013941  TCRTETOQM  638  0.0  Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 638 bits (1647), Expect = 0.0
Identities = 177/681 (25%), Positives = 294/681 (43%), Gaps = 75/681 (11%)

Query: 11 RNIGISAHIDAGKTTTTERILFYTGITHKIGEVHNGAAVMDWMEQEQERGITITSAATTA 70
NIG+ AH+DAGKTT TE +L+ +G ++G V G D E++RGITI + T+
Sbjct: 4 INIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSF 63

Query: 71 FWKGMAGNYPEHRINIIDTPGHVDFTIEVERSMRVLDGAVMVYDAVGGVQPQSETVWRQA 130
W+ ++NIIDTPGH+DF EV RS+ VLDGA+++ A GVQ Q+ ++
Sbjct: 64 QWEN-------TKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHAL 116

Query: 131 NKYRVPRLAFVNKMDRVGADFLRVQRQISERLKGDAVPVQLPVGAEDGFEGVIDLVKMKA 190
K +P + F+NK+D+ G D V + I E+L + V Q V M
Sbjct: 117 RKMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQ----------KVELYPNMCV 166

Query: 191 IIWDDASQGVRFEYRDIPAALQAQAQEWHDKMVEKAAEANEALLEKYLGGEALTEAEIKQ 250
+ ++ Q + E N+ LLEKY+ G++L E++Q
Sbjct: 167 TNFTESEQ------------------------WDTVIEGNDDLLEKYMSGKSLEALELEQ 202

Query: 251 GLRQRTVANEIVPMLCGSAFKNKGVQAMLDAVIDYLPSPVDVPAIKGHDERDHEIERHPT 310
R + P+ GSA N G+ +++ + + S
Sbjct: 203 EESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSSTH------------------R 244

Query: 311 DKEPFSALAFKIMTDPFVGQLVFFRVYSGVVKSGDSVYNPVKEKKERLGRILQMHANERR 370
+ FKI +L + R+YSGV+ DSV KEK ++ + E
Sbjct: 245 GQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI-KITEMYTSINGELC 303

Query: 371 EITEVYAGDIAAAVG----VKDVTTGDTLTDPAHVIVLERMVFPEPVISQAVEPKTKADQ 426
+I + Y+G+I + V GDT P ER+ P P++ VEP +
Sbjct: 304 KIDKAYSGEIVILQNEFLKLNSVL-GDTKLLPQR----ERIENPLPLLQTTVEPSKPQQR 358

Query: 427 EKMGIALGRLAQEDPSFRVRTDEESGQTIISGMGELHLEILVDRMKREFGVEATVGKPQV 486
E + AL ++ DP R D + + I+S +G++ +E+ ++ ++ VE + +P V
Sbjct: 359 EMLLDALLEISDSDPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEKYHVEIEIKEPTV 418

Query: 487 AYRETIRKTCDEVEGKFVKQSGGRGQYGHVVLKLEPQEPGKGYEFVDAIKGGVVPREFIP 546
Y E K E + + + L + P G G ++ ++ G + + F
Sbjct: 419 IYMERPLKK---AEYTIHIEVPPNPFWASIGLSVSPLPLGSGMQYESSVSLGYLNQSFQN 475

Query: 547 AVDKGIREALNAGVLAGYPVVDVKATLFFGSYHDVDSNENAFKMAGSMAFKEGMRRADPV 606
AV +GIR G L G+ V D K +G Y+ S F+M + ++ +++A
Sbjct: 476 AVMEGIRYGCEQG-LYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQVLKKAGTE 534

Query: 607 LLEPMMQVEVETPEDFTGNVMGDLSSRRGMVQGMEDIAGGGGKVVRAEVPLAEMFGYSTS 666
LLEP + ++ P+++ D + + ++ E+P + Y +
Sbjct: 535 LLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ--LKNNEVILSGEIPARCIQEYRSD 592

Query: 667 LRSLTQGRATYTMEFKHYAEA 687
L T GR+ E K Y
Sbjct: 593 LTFFTNGRSVCLTELKGYHVT 613


13  NH44784_014501  NH44784_014661  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
NH44784_014501  2  12  -0.702907  hypothetical protein
NH44784_014511  1  12  -0.324652  Type III secretion inner membrane protein
NH44784_014521  2  13  1.225199  Type III secretion inner membrane protein
NH44784_014531  2  12  1.563107  Type III secretion inner membrane protein
NH44784_014541  3  12  1.761477  Type III secretion inner membrane protein
NH44784_014551  3  11  2.918170  Type III secretion inner membrane protein
NH44784_014561  2  12  2.043454  hypothetical protein
NH44784_014571  2  12  1.425159  Type III secretion spans bacterial envelope
NH44784_014581  1  12  1.676430  Flagellum-specific ATP synthase FliI
NH44784_014591  1  17  1.615339  Type III secretion cytoplasmic protein (YscL
NH44784_014601  0  14  0.823783  hypothetical protein
NH44784_014611  0  13  -0.395854  Type III secretion bridge between inner and
NH44784_014621  0  14  -0.517904  putative type III secretion protein
NH44784_014631  2  14  -0.485163  hypothetical protein
NH44784_014641  1  13  -2.130958  putative regulatory protein
NH44784_014651  2  12  -2.511953  putative outer protein B
NH44784_014661  2  10  -1.815602  putative outer protein D
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_014511  TYPE3IMSPROT  346  e-120  Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 346 bits (889), Expect = e-120
Identities = 151/344 (43%), Positives = 219/344 (63%)

Query: 5 MSGEKTEQPTQKKLRDARQKGDVAHSKDFTQTLLVLALFGYLLGNAHGIVEALGRLVLIP 64
MSGEKTEQPT KK+RDAR+KG VA SK+ T L++AL L+G + E +L+LIP
Sbjct: 1 MSGEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLIP 60

Query: 65 STLVGQSFEQALPIALDAALREAVYLVLPFVLIVLVVGMFSEFLQVGVVLAFEKLKPSAK 124
+ F QAL +D L E YL P + + ++ + S +Q G +++ E +KP K
Sbjct: 61 AEQSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIK 120

Query: 125 KLNVMSNLKNIFSKKNLVELLKSIIKIAFLSVLVTLVVRDALPELMAVPHSGLAGLEAGV 184
K+N + K IFS K+LVE LKSI+K+ LS+L+ ++++ L L+ +P G+ + +
Sbjct: 121 KINPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLL 180

Query: 185 GGMMRALIVNIAVAYVVISLADFVWQRMQYRKGLMMSKEDIKQEFKEMEGDPHIKHKRKH 244
G ++R L+V V +VVIS+AD+ ++ QY K L MSK++IK+E+KEMEG P IK KR+
Sbjct: 181 GQILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQ 240

Query: 245 LHQEMVMHGAVASTRKASVLVTNPTHLAVAIYYEPDETPLPVVLAKGEGALAEQMMRAAR 304
HQE+ + +++SV+V NPTH+A+ I Y+ ETPLP+V K A + + + A
Sbjct: 241 FHQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIAE 300

Query: 305 EAGVPVMQNIPLARALMASALPDQYIPSELIEPVAEVLRLVRKL 348
E GVP++Q IPLARAL AL D YIP+E IE AEVLR + +
Sbjct: 301 EEGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQ 344


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_014521  TYPE3IMRPROT  143  5e-44  Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 143 bits (363), Expect = 5e-44
Identities = 46/230 (20%), Positives = 96/230 (41%), Gaps = 4/230 (1%)

Query: 14 LSTLALTQPRILALCAMLPLFNRQLLPGMLRYALCAAIGLVLVPALAPRYAVIDLDAVEL 73
L+ R+LAL + P+ + + +P ++ L I + P+L +
Sbjct: 13 LNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPANDVPVFSFF--A 70

Query: 74 VLLVAKEVFIGLVMGFLVAIPFWIFEAVGFVVDNQRGASLGATINPATGNDSSPLGILFN 133
+ L +++ IG+ +GF + F G ++ Q G S ++PA+ + L + +
Sbjct: 71 LWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHLNMPVLARIMD 130

Query: 134 QAFMVFFLVGGGFMLMLTMLYDSFRLWDLWDWAPTLRRESVPLMLDQLGRFLRLTLLFAA 193
++ FL G + ++++L D+F + L + + L+ A
Sbjct: 131 MLALLLFLTFNGHLWLISLLVDTFHTLPI--GGEPLNSNAFLALTKAGSLIFLNGLMLAL 188

Query: 194 PAIISMFLAEVGLALVSRFAPQLQVFFLAMPIKSALALLVMVLYMSTLFE 243
P I + + L L++R APQL +F + P+ + + +M M +
Sbjct: 189 PLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAP 238


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_014531  TYPE3IMQPROT  71  3e-20  Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 71.3 bits (175), Expect = 3e-20
Identities = 36/79 (45%), Positives = 46/79 (58%)

Query: 5 DLVSYMTQALYLVLWLSLPPIAVAAIVGTLFSLFQALTQIQEQTLSFAVKLIAVFATIML 64
DLV +ALYLVL LS P VA I+G L LFQ +TQ+QEQTL F +KL+ V + L
Sbjct: 3 DLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLFL 62

Query: 65 TARWLSAELYNFTISVFDL 83
+ W L ++ V L
Sbjct: 63 LSGWYGEVLLSYGRQVIFL 81


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_014541  TYPE3IMPPROT  230  9e-79  Type III secretion system inner membrane P protein ...
		>TYPE3IMPPROT#Type III secretion system inner membrane P protein family signature.

Length = 224

Score = 230 bits (587), Expect = 9e-79
Identities = 87/224 (38%), Positives = 132/224 (58%), Gaps = 14/224 (6%)

Query: 5 DPISLAVVLALLALVPLAAVMTTSFLKIAVVLTLVRNALGVQQVPPNMALYGLALILSAY 64
+ ISL +LA L+P T F+K ++V +VRNALG+QQ+P NM L G+AL+LS +
Sbjct: 3 NDISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLSMF 62

Query: 65 VMGPVVMQIGDELRAPPAVAAPGTPEPDRLEGILEAVARGAEPMRAFMLKNSRAEQRDFF 124
VM P++ + + + + V G + R +++K S E FF
Sbjct: 63 VMWPIMHDAYVYFEDED-------VTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFF 115

Query: 125 LRTARGLWGEQQARN-------LKEDDLLVLIPSFLLSELTAAFQIGFLLYLPFVIIDLI 177
++ +++ + L+P++ LSE+ +AF+IGF LYLPFV++DL+
Sbjct: 116 ENAQLKRQYGEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLV 175

Query: 178 VSNILLAMGMMMVSPVTISMPLKLFLFVMVDGWTRLIQGLVLSY 221
VS++LLA+GMMM+SPVTIS P+KL LFV +DGWT L +GL+L Y
Sbjct: 176 VSSVLLALGMMMMSPVTISTPIKLVLFVALDGWTLLSKGLILQY 219


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_014551  TYPE3OMOPROT  74  8e-17  Type III secretion system outer membrane O protein ...
		>TYPE3OMOPROT#Type III secretion system outer membrane O protein family signature.

Length = 303

Score = 73.5 bits (180), Expect = 8e-17
Identities = 37/166 (22%), Positives = 70/166 (42%), Gaps = 10/166 (6%)

Query: 187 PAANEIDADAVPVRIAACLGWTELDAAQLRSLAPRDTVFLDHCLVSPEGELWLGAGAQGL 246
PA + + +G ++ + L + D + + E++ A G
Sbjct: 138 PAVGGGRPKMLRWPLRFVIGSSDTQRSLLGRIGIGDVLLIRTS----RAEVYCYAKKLGH 193

Query: 247 RVRRQDSSYLVTQGWTSLMTETPQSPQDADAGAQTPLDIDAIPVRLTFELGERLITLGEL 306
R + +V + E + + A+ ++ +PV+L F L + +TL EL
Sbjct: 194 F-NRVEGGIIVETLDIQHIEEENNTTETAET----LPGLNQLPVKLEFVLYRKNVTLAEL 248

Query: 307 RQLQPGETFDLARPLAEGPVLVRANGALVGSGELVEIDGRIGVTLH 352
+ + L AE V + ANG L+G+GELV+++ +GV +H
Sbjct: 249 EAMGQQQLLSLPTN-AELNVEIMANGVLLGNGELVQMNDTLGVEIH 293


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_014611  FLGMRINGFLIF  98  4e-25  Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 97.7 bits (243), Expect = 4e-25
Identities = 42/174 (24%), Positives = 73/174 (41%), Gaps = 4/174 (2%)

Query: 32 ALLLAACGARVELFSAANESEANEVLSVLLDAGIAAQKATTKTGVAVSVDGQQVARALDI 91
+L A LFS ++ + +++ L I + A+ V +V
Sbjct: 41 MVLWAKTPDYRTLFSNLSDQDGGAIVAQLTQMNIPYR--FANGSGAIEVPADKVHELRLR 98

Query: 92 LRSRGLPRERFDGMGQIFRKEGLVSSPLEERARYIYALSQELTNTLSQMDGVLAARVHVV 151
L +GLP+ G ++ +E S E+ Y AL EL T+ + V +ARVH+
Sbjct: 99 LAQQGLPKGGAVGF-ELLDQEKFGISQFSEQVNYQRALEGELARTIETLGPVKSARVHLA 157

Query: 152 LPERGGVGENTTPSTAAVFIKHQTGYNLDALQPQ-IRKLVTHAIPGLTEDRVSI 204
+P+ +A+V + + G LD Q + LV+ A+ GL V++
Sbjct: 158 MPKPSLFVREQKSPSASVTVTLEPGRALDEGQISAVVHLVSSAVAGLPPGNVTL 211


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_014641  SYCDCHAPRONE  56  8e-13  Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 56.5 bits (136), Expect = 8e-13
Identities = 29/137 (21%), Positives = 58/137 (42%), Gaps = 4/137 (2%)

Query: 13 EALYRGLEALPSNDRLTPEQLEVVYALAYAHVAQEQYAQALPVFAFLAQYGPTRKHYLIG 72
E+ +G + + ++ + LE +Y+LA+ +Y A VF L + +G
Sbjct: 16 ESFLKGGGTIAMLNEISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLG 75

Query: 73 LGLCLQMLGRLEEAINIFSLVLTLYPDSLSTALRIAECQLAARQFDQARR----TLQLLS 128
LG C Q +G+ + AI+ +S + AEC L + +A +L++
Sbjct: 76 LGACRQAMGQYDLAIHSYSYGAIMDIKEPRFPFHAAECLLQKGELAEAESGLFLAQELIA 135

Query: 129 AADVPSQVRARAEALLQ 145
++ R ++L+
Sbjct: 136 DKTEFKELSTRVSSMLE 152


14  NH44784_015071  NH44784_015471  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
NH44784_015071  2  19  -2.342960  probable iron binding protein from the
NH44784_015081  0  16  -1.983599  putative integral membrane protein
NH44784_015091  -2  15  -2.745222  N-acetyl-gamma-glutamyl-phosphate reductase
NH44784_015101  -2  13  -3.444733  SSU ribosomal protein S9p (S16e
NH44784_015111  -1  15  -3.270965  LSU ribosomal protein L13p (L13Ae
NH44784_015121  0  13  -1.925132  Branched-chain amino acid transport ATP-binding
NH44784_015131  0  15  -1.473847  Branched-chain amino acid transport ATP-binding
NH44784_015141  1  15  -1.558627  Branched-chain amino acid transport system
NH44784_015151  1  17  -1.391800  High-affinity branched-chain amino acid
NH44784_015161  3  17  -1.091015  Leucine-, isoleucine-, valine-, threonine-, and
NH44784_015171  1  13  -0.502072  TRAP transporter solute receptor, unknown
NH44784_015181  0  14  -1.026023  TldE/PmbA protein, part of proposed TldE/TldD
NH44784_015191  1  14  -0.901777  FIG138315: Putative alpha helix protein
NH44784_015201  0  14  1.241884  probable carboxylesterase
NH44784_015211  -1  13  1.291368  hypothetical protein
NH44784_015221  0  13  2.268625  Hydroxymethylpyrimidine ABC transporter, ATPase
NH44784_015231  0  12  2.329646  Hydroxymethylpyrimidine ABC
NH44784_015241  -1  12  3.040896  Nicotinamidase/isochorismatase family protein
NH44784_015251  -2  10  3.479170  Acetyl-CoA synthetase (ADP-forming) alpha and
NH44784_015261  -1  13  2.699982  Glutamyl-tRNA synthetase
NH44784_015271  -1  11  2.361855  putative membrane transport protein
NH44784_015281  -3  14  1.851754  putative lipoprotein
NH44784_015291  -2  16  0.174143  Transcriptional regulator, LysR family
NH44784_015301  0  18  -1.073343  hypothetical protein
NH44784_015311  2  18  -1.950715  hypothetical protein
NH44784_015321  1  14  -2.350755  Transcriptional regulator, AsnC family
NH44784_015331  0  15  -1.678475  Indolepyruvate ferredoxin oxidoreductase, alpha
NH44784_015341  1  15  -3.611194  Ribonucleotide reductase of class Ia
NH44784_015351  2  11  -3.390206  Ribonucleotide reductase of class Ia
NH44784_015361  0  11  -2.489442  histone protein
NH44784_015371  -2  11  -2.559525  Integral membrane protein YggT, involved in
NH44784_015381  -2  11  -2.424536  D-alanine--poly(phosphoribitol) ligase subunit
NH44784_015391  -2  12  -3.187928  D-alanyl transfer protein DltB
NH44784_015401  -1  13  -0.935499  ortholog of Bordetella pertussis (BX470248)
NH44784_015411  -1  13  -0.081163  Poly(glycerophosphate chain) D-alanine transfer
NH44784_015421  1  14  -0.085990  acyl carrier protein
NH44784_015431  2  14  -0.031921  Outer membrane lipoprotein
NH44784_015441  2  12  -0.430728  Sugar kinases, ribokinase family
NH44784_015451  3  14  -1.113218  Probable transmembrane protein
NH44784_015461  3  16  -1.741419  Ribosomal protein L11 methyltransferase
NH44784_015471  4  14  -2.310420  Biotin carboxylase of acetyl-CoA carboxylase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_015241  ISCHRISMTASE  68  1e-15  Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 68.1 bits (166), Expect = 1e-15
Identities = 52/205 (25%), Positives = 82/205 (40%), Gaps = 26/205 (12%)

Query: 28 RTAVVVIDMQQYFTLPGYQGECAAARDIIAPVNRLCDAVRAAGGTVVWV-QTASDNADA- 85
R +++ DMQ YF + + ++ A + +L + G VV+ Q S N D
Sbjct: 30 RAVLLIHDMQNYFVDA-FTAGASPVTELSANIRKLKNQCVQLGIPVVYTAQPGSQNPDDR 88

Query: 86 -----FWSHHHGVMLTPERSARRLETLRRDSPGFALHPDLRPADSDLRVTKRFYSAMATG 140
FW L + +L P D DL +TK YSA
Sbjct: 89 ALLTDFWG----------------PGLNSGPYEEKIITELAPEDDDLVLTKWRYSAFK-- 130

Query: 141 SSELEPLLRGRGVDTLLIAGTVTNVCCESTARDAMMRDFRTIMVDDALAAVTPAEHENAL 200
+ L ++R G D L+I G ++ C TA +A M D + V DA+A + +H+ AL
Sbjct: 131 RTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVADFSLEKHQMAL 190

Query: 201 HGWLLFFGDVLSVDEVSTRLRPAQA 225
+ D + +L+ A A
Sbjct: 191 EYAAGRCAFTVMTDSLLDQLQNAPA 215


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_015271  TCRTETB  69  9e-15  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 68.8 bits (168), Expect = 9e-15
Identities = 60/371 (16%), Positives = 127/371 (34%), Gaps = 44/371 (11%)

Query: 32 LPQLAAEFGATTGQAARAVTAFAVAYGVLQMFFGPVGDRYGKYRVVSVATFACALGSAGA 91
LP +A +F TAF + + + +G + D+ G R++ GS
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 92 FVAES-LDVLVFCRALSGAAGAGIVPLSMAWIGDTVPYERRQATLARFLTGTILGMAAGQ 150
FV S +L+ R + GA A L M + +P E R + +G G
Sbjct: 97 FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGP 156

Query: 151 LAGGLFADTLGWRWAFAALVVGYLVVGTLLQLEVRRQRALGLGVVDPNAPRQGFVAQARL 210
GG+ A + W + ++ ++ +++ ++ G D V
Sbjct: 157 AIGGMIAHYIHWSYLLLIPMI--TIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFF 214

Query: 211 VLGTP---WARVVLATVF------------------------------IEGLLVFGALA- 236
+L T + ++++ + + G ++FG +A
Sbjct: 215 MLFTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAG 274

Query: 237 ---FAPSYLHERFDISLTAAGALVAVYA-VGGLLYTVVAGRVLKRLGERGLAVAGGLVLG 292
P + + +S G+++ + +++ + G ++ R G + G L
Sbjct: 275 FVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLS 334

Query: 293 VAFLSYLLGPVW-LWSLLASVLAGFGYYLLHATLQTNATQMV--PSARGTAVAWFASCLF 349
V+FL+ W + ++ G T+ + G ++ F
Sbjct: 335 VSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSF 394

Query: 350 MGQAAGVALAG 360
+ + G+A+ G
Sbjct: 395 LSEGTGIAIVG 405


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_015301  PF06057  28  0.029  Type IV secretory pathway VirJ component
		>PF06057#Type IV secretory pathway VirJ component

Length = 243

Score = 28.3 bits (63), Expect = 0.029
Identities = 13/70 (18%), Positives = 26/70 (37%), Gaps = 15/70 (21%)

Query: 118 VVVGHSMGA--LPA----LLAASQRRVAGVVLMAPSPPANLP---------GALGLPPVP 162
+++G+S GA +P + A ++ V G VL++PS ++ +
Sbjct: 120 ILIGYSFGAEVIPFVLNEMPARYRKNVLGAVLLSPSQSSDFEIHVSEMVTSDNQSARYLT 179

Query: 163 ADAVRATPAA 172
V
Sbjct: 180 LPEVNKQTTV 189


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_015311  PHAGEIV  31  0.004  Gene IV protein signature.
		>PHAGEIV#Gene IV protein signature.

Length = 426

Score = 31.4 bits (71), Expect = 0.004
Identities = 14/50 (28%), Positives = 22/50 (44%), Gaps = 2/50 (4%)

Query: 54 RTFAQFLGKQTGQTVIVEDRPGAGGIVGTNAARNAAADGYTFLLSTNSTH 103
R F + KQTG++VIV P G V ++ + F +S +
Sbjct: 32 RDFVTWYSKQTGESVIVS--PDVKGTVTVYSSDVKPENLRDFFISVLRAN 79


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_015431  PF07132  29  0.014  Harpin protein (HrpN)
		>PF07132#Harpin protein (HrpN)

Length = 356

Score = 28.5 bits (63), Expect = 0.014
Identities = 26/87 (29%), Positives = 37/87 (42%)

Query: 31 GGCANRSASSGVYSYDQAQREQIVRTGTVTGVRPIVIQNDKSSGVGMLAGGALGGVAGNA 90
GG ++SA G S Q I+ T G G+G GG GG+ G
Sbjct: 32 GGSPSQSAFGGQRSNIAEQLSDIMTTMMFMGSMMGGGLGGGLGGLGSSLGGLGGGLLGGG 91

Query: 91 IGGGTGRTIATVGGAILGALAGNAVEN 117
+GGG G ++ + G+ LG G A+
Sbjct: 92 LGGGLGSSLGSGLGSALGGGLGGALGA 118


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_015451  PF03544  36  2e-04  Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 36.1 bits (83), Expect = 2e-04
Identities = 22/114 (19%), Positives = 33/114 (28%), Gaps = 5/114 (4%)

Query: 51 MPAAAIPASNTPIAPPA---PAAPQAIAAPVL--VTPPAPPVDDGAPPPVRRPSVEPREV 105
+PA A P S T +AP P A Q PV+ P P + PV +P+
Sbjct: 43 LPAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPK 102

Query: 106 APWDDLPDEPASPGIRPGASGASPAAAPSVAPTEEPAPRPLASAVPPAVLRGRD 159
+ + + AP + A+ P
Sbjct: 103 PKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASG 156



Score = 28.4 bits (63), Expect = 0.048
Identities = 20/97 (20%), Positives = 24/97 (24%), Gaps = 1/97 (1%)

Query: 63 IAPPAPAAPQAIAAPVLVTPPAPPVDDGAPPPVRRPSVEPREVAPWDDLPDEPASPGIRP 122
I PAPA P ++ P P PV P EP E P +P
Sbjct: 41 IELPAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEP-EPIPEPPKEAPVVIEKPKP 99

Query: 123 GASGASPAAAPSVAPTEEPAPRPLASAVPPAVLRGRD 159
P + P A P
Sbjct: 100 KPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPAR 136


15  NH44784_016121  NH44784_016211  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
NH44784_016121  0       13       3.159166    Maltose/maltodextrin transport ATP-binding
NH44784_016131  0       13       3.561870    probable sugar ABC transporter, permease
NH44784_016141  0       11       3.838819    putative sugar uptake ABC transporter permease
NH44784_016151  -1      9        4.041500    hypothetical protein
NH44784_016161  0       11       2.340578    transcriptional regulator, TetR family
NH44784_016171  2       15       1.964974    D-aminoacylase
NH44784_016181  2       15       1.592315    D-amino acid dehydrogenase small subunit
NH44784_016191  1       15       0.873775    hypothetical protein
NH44784_016201  1       15       0.824048    RND efflux system, outer membrane
NH44784_016211  3       16       0.731142    AcrB/AcrD/AcrF family protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_016161  HTHTETR       53     1e-10    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 52.7 bits (126), Expect = 1e-10
Identities = 23/130 (17%), Positives = 42/130 (32%), Gaps = 5/130 (3%)

Query: 3 RRENTERQILQALEDQIKETGMGGVGINAIAKRAGVSKELIYRYFDGMPGLMLAWMQEQ- 61
+ T + IL + G+ + IAK AGV++ IY +F L +
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 62 DFWTRNPGLLAADESSQRSPGELVLSMLRAQIDALAGNETLREVRRWELIERNEVSAPLA 121
A P ++ +L +++ E R + E+I
Sbjct: 68 SNIGELELEYQAKFP--GDPLSVLREILIHVLESTVTEERRRLLM--EIIFHKCEFVGEM 123

Query: 122 ERRERAARGF 131
++A R
Sbjct: 124 AVVQQAQRNL 133


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_016171  UREASE        45     4e-07    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 45.5 bits (108), Expect = 4e-07
Identities = 31/128 (24%), Positives = 46/128 (35%), Gaps = 27/128 (21%)

Query: 2 PAQPDPQEYDLIISGGLIADGLGGPARRADVAINGERIAAIGD--------------GSG 47
+ D +I+ LI D G +AD+ + RIAAIG G G
Sbjct: 60 QVTREGGAVDTVITNALILDHWG--IVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPG 117

Query: 48 WRARRRIDASGRVVAPGFIDCHTHDDRALLGSPLMRPKVSQGVTTVVTGNCGISLAPVHA 107
I G++V G +D H H + + + G+T ++ G G P H
Sbjct: 118 TEV---IAGEGKIVTAGGMDSHIH----FICPQQIEEALMSGLTCMLGGGTG----PAHG 166

Query: 108 PGDTVPPP 115
T P
Sbjct: 167 TLATTCTP 174


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_016181  adhesinmafb   30     0.026    Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 29.6 bits (66), Expect = 0.026
Identities = 15/44 (34%), Positives = 20/44 (45%), Gaps = 1/44 (2%)

Query: 9 GVVGMATAYYLHREGHRVSVVDASPGP-GLATSRANGAQLSYSF 51
V T Y L+ EGH DA GP G + GA+ Y++
Sbjct: 123 NVDEGFTVYRLNWEGHEHHPADAYDGPKGGNYPKPTGARDEYTY 166


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_016211  ACRIFLAVINRP  820    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 820 bits (2119), Expect = 0.0
Identities = 283/1032 (27%), Positives = 509/1032 (49%), Gaps = 27/1032 (2%)

Query: 7 FIVRPVATTLLSLAVVLAGMLSFFLLPVAPLPQMDIPTISVSASLPGASPETMASSVATP 66
FI RP+ +L++ +++AG L+ LPVA P + P +SVSA+ PGA +T+ +V
Sbjct: 5 FIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQV 64

Query: 67 LERSLGSIAGVTEMTSSS-SQGSTRVTLQFDLSRDINGAARDVQAAINAARSLLPTSLRS 125
+E+++ I + M+S+S S GS +TL F D + A VQ + A LLP ++
Sbjct: 65 IEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQ- 123

Query: 126 NPTYHKSNPSDAPIMTLAMTSDT--LSQGQMYDLASTIVAQKLSQVDGVGEVTVGGSSLP 183
S + +M SD +Q + D ++ V LS+++GVG+V + G+
Sbjct: 124 QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQY- 182

Query: 184 AVRVTVLPGALANRGVSLDEVRTALANANANRPKGVLENDQY------HWQIMVNDQLSR 237
A+R+ + L ++ +V L N G L + I+ +
Sbjct: 183 AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKN 242

Query: 238 AEQYRPLIV-AWRDGAPVRVSDVARVEDSVEDLFQTGFYNQRNAILMIVRRQADANIIET 296
E++ + + DG+ VR+ DVARVE E+ N + A + ++ AN ++T
Sbjct: 243 PEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALDT 302

Query: 297 VDAVRAQLPTLLALMPADVQLTVAQDRTPSIRASLHEAELTLIIAVGLVVLVVLLFLRRW 356
A++A+L L P +++ D TP ++ S+HE TL A+ LV LV+ LFL+
Sbjct: 303 AKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQNM 362

Query: 357 RAAIIPSVAVPVSLIGTFCIMYLCGFTLNTISLMALIVATGFVVDDAIVVLENIMRH-VE 415
RA +IP++AVPV L+GTF I+ G+++NT+++ +++A G +VDDAIVV+EN+ R +E
Sbjct: 363 RATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMME 422

Query: 416 NGMSPMRAALRGSREVGFTVLSMSLSLVAVFIPILLMGGVAGRLFREFAVTLSASIMVSL 475
+ + P A + ++ ++ +++ L AVFIP+ GG G ++R+F++T+ +++ +S+
Sbjct: 423 DKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSV 482

Query: 476 VVSLTLTPMMCARLLKAEPKEP-KPPGRLARWAERGFDWMQDGYRRSLAWALAHGRLMML 534
+V+L LTP +CA LLK E + G W FD + Y S+ L +L
Sbjct: 483 LVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYLL 542

Query: 535 VLAAAVGLNVYLYTVVPKGFFPQQDTGQLLGFFRVDQGTSFQATVPKLEALRKVVLADP- 593
+ A V V L+ +P F P++D G L ++ G + + T L+ + L +
Sbjct: 543 IYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNEK 602

Query: 594 ----AVQSMTGYAGGRGGSNSSFMQIQLKPLGER---KVSADQVINRLRARLQNMPGARM 646
+V ++ G++ N+ + LKP ER + SA+ VI+R + L + +
Sbjct: 603 ANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGFV 662

Query: 647 FLVAQQDIRVGGRQSQGSYDYTLMSG-DLQLLRAWMPKVQQAMAKVP-EITDVDTDVEDK 704
I G + ++ +G L ++ A+ P + V + +
Sbjct: 663 IPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGLED 722

Query: 705 GREINLVIDRDAATRLGVSMATISTVLNNSFSQRQVSVMYGPLNQYHVVLGVDQRFAQDI 764
+ L +D++ A LGVS++ I+ ++ + V+ + + D +F
Sbjct: 723 TAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRMLP 782

Query: 765 ESLKQVEVITSTGARVPMAAFARFENANAPLSVQHQGLFVADTVSFSLAPGVSLGQATAA 824
E + ++ V ++ G VP +AF ++ + + APG S G A A
Sbjct: 783 EDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAMAL 842

Query: 825 IDAAVARIGLPSDQIQAGFQGTAAVLQQTLAQQPWLILAALVTMYIVLGILYESFVHPLT 884
++ ++ LP+ I + G + + + Q P L+ + V +++ L LYES+ P++
Sbjct: 843 MENLASK--LPAG-IGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVS 899

Query: 885 ILSTLPSAGLGALLALLLVRTDFTLIALIGVFLLIGIVKKNAIMMVDFALEAERNQHMAP 944
++ +P +G LLA L + ++G+ IG+ KNAI++V+FA + +
Sbjct: 900 VMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGV 959

Query: 945 REAIFQACLTRFRPIMMTTMAAIFGALPLVLATGAGVEMRQPLGITIVGGLVLSQILTLY 1004
EA A R RPI+MT++A I G LPL ++ GAG + +GI ++GG+V + +L ++
Sbjct: 960 VEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIF 1019

Query: 1005 TTPVVYLYLDRF 1016
PV ++ + R
Sbjct: 1020 FVPVFFVVIRRC 1031


16  NH44784_017431  NH44784_017591  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
NH44784_017431  2       20       -0.670410   Organic hydroperoxide resistance protein
NH44784_017441  2       21       -1.535027   putative secreted protein
NH44784_017451  3       24       -2.147955   4-carboxymuconolactone decarboxylase
NH44784_017461  4       28       -2.167352   probable major facilitator superfamily (MFS)
NH44784_017471  6       32       -4.689683   Glycosyltransferase
NH44784_017481  5       30       -5.095113   Endonuclease relaxase
NH44784_017491  3       38       -7.165154   hypothetical protein
NH44784_017501  7       48       -9.092006   hypothetical protein
NH44784_017511  6       45       -9.167959   hypothetical protein
NH44784_017521  6       47       -9.241466   hypothetical protein
NH44784_017531  5       40       -7.222174   COG1961: Site-specific recombinases, DNA
NH44784_017541  5       49       -9.791320   hypothetical protein
NH44784_017551  5       49       -9.986751   hypothetical protein
NH44784_017561  4       45       -9.306591   putative antirestriction protein
NH44784_017571  3       30       -6.000032   hypothetical protein
NH44784_017581  2       23       -4.481443   VirD4 protein
NH44784_017591  1       19       -3.708181   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_017441  NUCEPIMERASE  41     2e-06    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 40.9 bits (96), Expect = 2e-06
Identities = 16/40 (40%), Positives = 20/40 (50%), Gaps = 6/40 (15%)

Query: 1 MKILVIGGTGLIGSKLVKLLIERGHDAVAASPATGVDTIT 40
MK LV G G IG + K L+E GH V G+D +
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVV------GIDNLN 34


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_017461  TCRTETA       86     1e-20    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 86.0 bits (213), Expect = 1e-20
Identities = 67/215 (31%), Positives = 92/215 (42%), Gaps = 16/215 (7%)

Query: 55 LISSYAFAYAIAAPILGHLSDRVDRRRLLLAGLLSFAVDGVGVALAPTLEVAMALRIFGG 114
L++ YA AP+LG LSDR RR +LL L AVD +A AP L V RI G
Sbjct: 48 LLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAG 107

Query: 115 LASAVIIPNAFALVADIMPRDRQAEAMGVVMLGMTAGIAMGPALAGLLTDWIGWRAPFLL 174
+ A A A +ADI D +A G + G+ GP L GL+ APF
Sbjct: 108 ITGAT-GAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGG-FSPHAPFFA 165

Query: 175 TSAGCLVACVVSGMSIPRRHAAR-------APRQRSAIHWLRARTIVRPLLAKG----LW 223
+A + + +P H A ++ W R T+V L+A L
Sbjct: 166 AAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLV 225

Query: 224 NGAGVAAFLLSGEILRARYQLDVAQVGLSATAFGI 258
A +++ GE R+ D +G+S AFGI
Sbjct: 226 GQVPAALWVIFGE---DRFHWDATTIGISLAAFGI 257


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_017571  CHANLCOLICIN  27     0.004    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 26.6 bits (58), Expect = 0.004
Identities = 9/49 (18%), Positives = 16/49 (32%), Gaps = 1/49 (2%)

Query: 4 AYLTKTWGRDFSAILVWILGAIEAYVIYAAVSFVLSAAL-VWAMTALAA 51
T W F + A +YV+ S + L +W + +
Sbjct: 454 IKDTGDWKPLFLTLEKKAADAGVSYVVALLFSLLAGTTLGIWGIAIVTG 502


17  NH44784_017921  NH44784_017971  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
NH44784_017921  2       18       -2.147098   Uroporphyrinogen III decarboxylase
NH44784_017931  3       19       -2.431770   ortholog of Bordetella pertussis (BX470248)
NH44784_017941  5       20       -3.696171   ATP synthase epsilon chain
NH44784_017951  5       22       -4.385067   ATP synthase beta chain
NH44784_017961  3       20       -4.698275   ATP synthase gamma chain
NH44784_017971  2       21       -4.563278   ATP synthase alpha chain
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_017931  HTHFIS        50     2e-09    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 50.2 bits (120), Expect = 2e-09
Identities = 20/99 (20%), Positives = 33/99 (33%), Gaps = 2/99 (2%)

Query: 147 AAAPRAVGAADVRPRVPRQVLQQGLAGMRYQPGREAQEPFRQAKARVVDGFESDYIRLAL 206
+P AA Q +++ + G RV+ E I AL
Sbjct: 388 PDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPS--GLYDRVLAEMEYPLILAAL 445

Query: 207 SRHQGNVARAARASSKHRRAFWALMRKHGIDAGPYRRQA 245
+ +GN +AA +R +R+ G+ R A
Sbjct: 446 TATRGNQIKAADLLGLNRNTLRKKIRELGVSVYRSSRSA 484


18  NH44784_018831  NH44784_018941  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
NH44784_018831  2       18       -0.225195   Ferredoxin
NH44784_018841  2       17       -0.560814   hypothetical protein
NH44784_018851  2       13       -0.034438   Ferredoxin
NH44784_018861  2       13       0.026122    Paralog of coenzyme PQQ synthesis protein C
NH44784_018871  1       11       1.172677    hypothetical protein
NH44784_018881  -1      8        0.777975    hypothetical protein
NH44784_018891  0       7        0.066514    hypothetical protein
NH44784_018901  1       7        -0.171231   Twin-arginine translocation pathway signal
NH44784_018911  2       9        -0.351551   probable sulfite:cytochrome c oxidoreductase
NH44784_018921  3       10       -0.557402   hypothetical protein
NH44784_018931  3       12       -0.075179   Site-specific recombinase
NH44784_018941  2       11       -0.089514   Sigma-fimbriae tip adhesin
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_018921  TCRTETB       47     1e-07    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 47.2 bits (112), Expect = 1e-07
Identities = 34/163 (20%), Positives = 67/163 (41%), Gaps = 6/163 (3%)

Query: 42 IIW--IANLFANLGTWAQSVAAAWVITTEQSSPLMVAMIQVAAAFPLVALSILTGVLADN 99
+IW I + F+ L +V+ + P + A + + G L+D
Sbjct: 16 LIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQ 75

Query: 100 YDRRKVMLVGMLLELAGGIFITILAYIGLLHPITLIISVFCIAIGSAIVTPAWQAAVGEQ 159
++++L G+++ G +++ ++G LI++ F G+A V
Sbjct: 76 LGIKRLLLFGIIINCFG----SVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARY 131

Query: 160 VPREHIGSAVLLNSVNYNVARAVGPAIGGLLLGAAGAPTVFLL 202
+P+E+ G A L + VGPAIGG++ + L+
Sbjct: 132 IPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLI 174



Score = 29.8 bits (67), Expect = 0.032
Identities = 23/102 (22%), Positives = 36/102 (35%), Gaps = 3/102 (2%)

Query: 90 SILTGVLADNYDRRKVMLVGMLLELAGGIFITILAYIGLLHPITLIISVFCIAIGSAIVT 149
+ G+L D V+ +G+ F+T + II VF + S T
Sbjct: 310 GYIGGILVDRRGPLYVLNIGVTFLSVS--FLTASFLLETTSWFMTIIIVFVLGGLSFTKT 367

Query: 150 PAWQAAVGEQVPREHIGSAVLLNSVNYNVARAVGPAIGGLLL 191
+E LLN ++ ++ G AI G LL
Sbjct: 368 VISTIVSSSLKQQEAGAGMSLLNFTSF-LSEGTGIAIVGGLL 408


19  NH44784_019381  NH44784_019521  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
NH44784_019381  -1      18       3.019602    hypothetical protein
NH44784_019391  0       24       1.975705    hypothetical protein
NH44784_019401  1       20       1.265488    Coenzyme F420-dependent N5,N10-methylene
NH44784_019411  0       18       0.947071    The iron-vibriobactin/enterobactin uptake
NH44784_019421  -2      23       -1.328557   ABC-type Fe3+-siderophore transport
NH44784_019431  -3      26       -2.732567   COG0609: ABC-type Fe3+-siderophore transport
NH44784_019441  -2      27       -3.690574   Ferric enterobactin-binding periplasmic protein
NH44784_019451  -2      30       -4.113507   Ferrichrome-iron receptor
NH44784_019461  -1      37       -4.237072   putative transcriptional regulator
NH44784_019471  -1      38       -4.810570   hypothetical protein
NH44784_019481  -1      38       -4.857139   hypothetical protein
NH44784_019491  -1      39       -4.821940   hypothetical protein
NH44784_019501  -2      29       -4.191629   UvrD/REP helicase
NH44784_019511  -3      24       -4.296394   hypothetical protein
NH44784_019521  -2      12       -4.108210   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_019381  CHANLCOLICIN  29     0.037    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 28.9 bits (64), Expect = 0.037
Identities = 37/161 (22%), Positives = 63/161 (39%), Gaps = 6/161 (3%)

Query: 161 NGTPAGTAEGADATGAGASTPTTAASRALAAYAAASDTALDSDADTADTPAADTAAQGAA 220
NGTP G+ G G+ + ++AA A A ++ A + A+ A A AQ A
Sbjct: 26 NGTPDGSGSGGGGGKGGSKSESSAAIHATAKWSTAQ--LKKTQAEQAARAKAAAEAQAKA 83

Query: 221 AAARADAAKAASASVDTDTAGDRNDKAGAPTESRHEPQAAQRRQDDAQAVRDAIRAMKEL 280
A R DA + + + + TE H AA + +D+ + A ++
Sbjct: 84 KANR-DALTQRLKDIVNEALRHNASRTPSATELAHANNAAMQAEDERLRLAKAEEKARKE 142

Query: 281 LAMLKSKLR---QHDKEAQKQVAEVRDALAEAEKTAQAIDG 318
+ + Q KE +++ AE L AE + +
Sbjct: 143 AEAAEKAFQEAEQRRKEIEREKAETERQLKLAEAEEKRLAA 183


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_019431  RTXTOXINA     28     0.042    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 28.4 bits (63), Expect = 0.042
Identities = 13/27 (48%), Positives = 18/27 (66%)

Query: 228 SGTAAAIVLLAGAATAAAGPLAFIGLA 254
S +AAA L+A A T A PL+F+ +A
Sbjct: 301 STSAAAAGLIASAVTLAISPLSFLSIA 327


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_019451  RTXTOXINC     29     0.042    Gram-negative bacterial RTX toxin-activating protein C...
		>RTXTOXINC#Gram-negative bacterial RTX toxin-activating protein C signature.

Length = 170

Score = 29.1 bits (65), Expect = 0.042
Identities = 16/52 (30%), Positives = 22/52 (42%), Gaps = 3/52 (5%)

Query: 264 YIKNTTFLGEPDWNTYDRNIWTAGWQLEHQFNDAWKVSQNARYTHVDSLYRA 315
Y+ + T L DW + DR W W F D + + R D L+RA
Sbjct: 70 YLNDVTSLVAEDWTSGDR-KWFIDWIAP--FGDNGALYKYMRKKFPDELFRA 118


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_019491  HTHFIS        32     0.004    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 32.1 bits (73), Expect = 0.004
Identities = 17/119 (14%), Positives = 39/119 (32%), Gaps = 10/119 (8%)

Query: 310 LRRSAERYLALSHTDPDAMAFCRECALVSNAEQLEHLLRR----------DGRVLRLVGR 359
++++ + L + D +A+ + N +LE+L+RR ++ R
Sbjct: 325 VQQAEKEGLDVKRFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVITREIIENELR 384

Query: 360 QVKPGSAFRRGMQNADTPDPSAALDPDDAGQADGVIAWPAGLSFRVRSLYALNRDLLEE 418
P S + + + S A++ + R L + L+
Sbjct: 385 SEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLILA 443


20  NH44784_019841  NH44784_019981  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
NH44784_019841  3       13       1.035186    Cytosine permease
NH44784_019851  5       13       0.926011    Allophanate hydrolase 2 subunit 2
NH44784_019861  4       12       -0.142379   Allophanate hydrolase 2 subunit 1
NH44784_019871  2       13       0.670282    Biotin carboxylase
NH44784_019881  0       10       1.520213    Biotin carboxyl carrier protein
NH44784_019891  0       9        1.210931    Lactam utilization protein LamB
NH44784_019901  -1      10       2.220911    Transcriptional regulator, LysR family
NH44784_019911  0       11       2.446211    Outer membrane protein W precursor
NH44784_019921  0       10       2.671199    hypothetical protein
NH44784_019931  0       10       1.835048    Possible MFS Superfamily transporter precursor
NH44784_019941  0       12       0.479312    hypothetical protein
NH44784_019951  2       12       0.536029    Enoyl-[acyl-carrier-protein] reductase [FMN]
NH44784_019961  3       14       -0.945914   Methionine ABC transporter ATP-binding protein
NH44784_019971  2       12       -1.610313   Uncharacterized ABC transporter, permease
NH44784_019981  3       13       -2.005276   Uncharacterized ABC transporter, periplasmic
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_019881  RTXTOXIND     41     1e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 40.6 bits (95), Expect = 1e-07
Identities = 15/54 (27%), Positives = 26/54 (48%), Gaps = 5/54 (9%)

Query: 26 VEVGATIKPGDVVGIVEVMKQFNQIEAEVGGVVAEILVADGDPVEPGQALLRIE 79
VE+ AT G + + +I+ +V EI+V +G+ V G LL++
Sbjct: 80 VEIVAT-----ANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLT 128


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_019911  OUTRMMBRANEA  34     4e-04    Outer membrane protein A signature.
		>OUTRMMBRANEA#Outer membrane protein A signature.

Length = 346

Score = 33.8 bits (77), Expect = 4e-04
Identities = 43/212 (20%), Positives = 73/212 (34%), Gaps = 23/212 (10%)

Query: 1 MYRRTRALALASACGATLMLAAPAQAHEAGDILFRVGATQVRPSSNNGSVLDGSVKLDVN 60
M + A+A+A A AT+ AAP D + GA ++ ++ +
Sbjct: 1 MKKTAIAIAVALAGFATVAQAAPK------DNTWYTGAKLGWSQYHDTGFINNNGP-THE 53

Query: 61 NNVRPSFTLAYMATRNIGIELLGAWPFEHDVRGSGLGKIGSSKQLPPTLSLQWHILPDSM 120
N + Y +G E+ W +GS ++ + T L + I D
Sbjct: 54 NQLGAGAFGGYQVNPYVGFEMGYDWLGRMPYKGSVENGAYKAQGVQLTAKLGYPITDDLD 113

Query: 121 VQPYIGVGINYTKFFDTKAEGALKGSDLKLGDSWGVAAQLGADIKISERWFMNADIRYID 180
+ +G + DTK+ K D + V A G + I+ + ++ +
Sbjct: 114 IYTRLGGMVWRA---DTKSNVYGKNHDTGVS---PVFA-GGVEYAITPEIATRLEYQWTN 166

Query: 181 IKSKVKLNGEHIGTARINPWVA--TLGVGYRF 210
N T P +LGV YRF
Sbjct: 167 -------NIGDAHTIGTRPDNGMLSLGVSYRF 191


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_019931  TCRTETA       33     0.002    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 33.3 bits (76), Expect = 0.002
Identities = 34/157 (21%), Positives = 57/157 (36%), Gaps = 5/157 (3%)

Query: 226 GAGYIIPATFLPAMAREIIPDPAVFGWAWPLFGLAAALSCLLAPRVAQARDD---KTVWR 282
G G I+P LP + R+++ V L L A + AP + D + V
Sbjct: 20 GIGLIMPV--LPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLL 77

Query: 283 GAQALMALGMLAVALWHDIAAVIVAALLVGGTFMVITQSGMLVAQRLAGSAAPRAAAVMT 342
+ A A+ +A + + + ++ G T +G +A G R M+
Sbjct: 78 VSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDGDERARHFGFMS 137

Query: 343 SAFAAGQIVGPLLASGAARWGATLPQVLAAGAALLAL 379
+ F G + GP+L + P AA L
Sbjct: 138 ACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNF 174


21  NH44784_020191  NH44784_020341  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
NH44784_020191  3       13       -3.197830   Outer membrane stress sensor protease DegS
NH44784_020201  4       20       -4.805138   FIG137478: Hypothetical protein YbgI
NH44784_020211  4       18       -5.812606   Large-conductance mechanosensitive channel
NH44784_020221  3       20       -4.505367   Ubiquinol-cytochrome C reductase iron-sulfur
NH44784_020231  -1      15       -3.544509   Ubiquinol--cytochrome c reductase, cytochrome B
NH44784_020241  0       12       -3.169542   ubiquinol cytochrome C oxidoreductase,cytochrome
NH44784_020251  -1      9        -1.108347   Stringent starvation protein A
NH44784_020261  0       10       0.330828    ClpXP protease specificity-enhancing factor /
NH44784_020271  1       10       1.414660    hypothetical protein
NH44784_020291  0       9        1.223986    *Gluconate transporter family protein
NH44784_020311  -1      9        2.149886    *hypothetical protein
NH44784_020321  0       7        3.885237    transcriptional regulator, LysR family
NH44784_020331  1       8        3.456696    hypothetical protein
NH44784_020341  2       7        1.654641    Cysteine desulfurase
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_020191  V8PROTEASE    70     2e-15    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 70.0 bits (171), Expect = 2e-15
Identities = 35/166 (21%), Positives = 62/166 (37%), Gaps = 32/166 (19%)

Query: 107 ASTSLGSGVVVNHDGYVLTNYHVVQAADAIEVALA------------DGRKDTAKVVGAD 154
T + SGVVV +LTN HVV A AL +G ++
Sbjct: 99 TGTFIASGVVVG-KDTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYS 157

Query: 155 PDTDLAVLKLATLR-------NLPAATLAPDRGLRVGDVVLAIGNPFG---VGQTTTQGI 204
+ DLA++K + + AT++ + +V + G P ++G
Sbjct: 158 GEGDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGDKPVATMWESKGK 217

Query: 205 VSALGRNGLGLNTYENFIQTDAAINPGNSGGALVDAQGNLVGINTA 250
++ L + Q D + GNSG + + + ++GI+
Sbjct: 218 ITYLKGEAM---------QYDLSTTGGNSGSPVFNEKNEVIGIHWG 254


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_020211  MECHCHANNEL   125    5e-40    Bacterial mechano-sensitive ion channel signature.
		>MECHCHANNEL#Bacterial mechano-sensitive ion channel signature.

Length = 136

Score = 125 bits (316), Expect = 5e-40
Identities = 66/146 (45%), Positives = 86/146 (58%), Gaps = 15/146 (10%)

Query: 6 GFVKEFRDFAVKGNAVDLAVGVIIGAAFGKIVDSLVKDIVMPLVNFVLGGSVDFSNKFLV 65
+KEFR+FA++GN VDLAVGVIIGAAFGKIV SLV DI+MP + ++GG +DF +
Sbjct: 2 SIIKEFREFAMRGNVVDLAVGVIIGAAFGKIVSSLVADIIMPPLGLLIGG-IDFKQFAVT 60

Query: 66 LSMPAGYNGPMTYADLTKAGANVFAWGNFVTIIINFVLLAFVIFWMVKAIYKARTKAEEA 125
L G A V +G F+ + +F+++AF IF +K I K K EE
Sbjct: 61 LRDAQG-----------DIPAVVMHYGVFIQNVFDFLIVAFAIFMAIKLINKLNRKKEEP 109

Query: 126 PAPAAPAATPEDVALLREIRDLLKKQ 151
A AP + LL EIRDLLK+Q
Sbjct: 110 AAAPAPTK---EEVLLTEIRDLLKEQ 132


22  NH44784_020871  NH44784_020971  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
NH44784_020871  2       10       -0.026744   probable glyoxalase
NH44784_020881  2       10       0.848989    transcriptional regulator, AraC family
NH44784_020891  2       8        1.240643    N-methylhydantoinase A
NH44784_020901  2       8        2.149414    N-methylhydantoinase B
NH44784_020911  2       10       3.774279    MFS general substrate transporter
NH44784_020921  1       10       3.847414    Transcriptional regulator, IclR family
NH44784_020931  1       11       3.604280    thiolase
NH44784_020941  1       12       3.469618    Protein of unknown function DUF35
NH44784_020951  0       11       3.109559    hypothetical protein
NH44784_020961  1       11       3.100392    Acetyl-CoA synthetase (ADP-forming) alpha and
NH44784_020971  0       14       3.104097    3-oxoacyl-[acyl-carrier protein] reductase
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_020911  TCRTETA       39     3e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 38.7 bits (90), Expect = 3e-05
Identities = 64/339 (18%), Positives = 124/339 (36%), Gaps = 19/339 (5%)

Query: 67 VVGHVADRYDRRKIVAICMAVETLATLLLAIAALHGVGGKTLIYATLIIMSSARAFEAPT 126
V+G ++DR+ RR ++ + +A + ++A A V +Y I+ A
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWV-----LYIGRIVAGITGA-TGAV 115

Query: 127 LTTLIPAIVPREWLPRATALSSSGGQIAQIAGPALGGIGYGLGAGWVYCVAAALYLCGFA 186
I I + R S+ +AGP LGG+ G + AAAL F
Sbjct: 116 AGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFL 175

Query: 187 CIASMSIDRAPPRREPTTWRT--LFAGITFIFQRRILLGTLSLDLFAVLLGGA-TALLPI 243
+ + R P A + ++ +++ L+G AL I
Sbjct: 176 TGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVI 235

Query: 244 FAKDILNAGPWALG-ALRAAPACGAMLMSLTLARLS--LGEHVGRLLFGALILFGLATTV 300
F +D + +G +L A ++ ++ ++ LGE R L +I G +
Sbjct: 236 FGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGER--RALMLGMIADGTGYIL 293

Query: 301 FGLSTSIPLSVGALVVLGAADAISVVVRSSLVQLNTPDHMLGRVSAVNTLFVGASNQLGE 360
+T ++ +V+L + + +++ + G++ ++ +G
Sbjct: 294 LAFATRGWMAFPIMVLLASGGIGMPAL-QAMLSRQVDEERQGQLQGSLAALTSLTSIVGP 352

Query: 361 FESGVMAALLGAVPAVVVGGLGTIAVAGLWMHWFPELRR 399
++ + A G IA A L++ P LRR
Sbjct: 353 ----LLFTAIYAASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_020971  DHBDHDRGNASE  121    1e-35    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 121 bits (305), Expect = 1e-35
Identities = 71/252 (28%), Positives = 122/252 (48%), Gaps = 10/252 (3%)

Query: 5 LKSKTALVTGAYSGLGRHFALRLAEAGARVALCGRRTELGETLATEIRAQGGQACVAGMD 64
++ K A +TGA G+G A LA GA +A E E + + ++A+ A D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 65 VTQPDSVLAAFTQVEQAFGPVDIVVNNAGIAMTRPALDISEDEWTGLIDVNLNGAWRVAQ 124
V ++ ++E+ GP+DI+VN AG+ +S++EW VN G + ++
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 125 CAARHFHGHGRPGAIINIASILGQRVASHVAAYTAAKAGLLHLTRALALEWARHGIRVNA 184
+++ R G+I+ + S + +AAY ++KA + T+ L LE A + IR N
Sbjct: 126 SVSKYMMDR-RSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNI 184

Query: 185 LAPGYIGTDLNRDFFATPAGEALVKR---------IPQRRLGRPQDLDGPLLLLASDASA 235
++PG TD+ +A G V + IP ++L +P D+ +L L S +
Sbjct: 185 VSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAG 244

Query: 236 YMTGAVIDVDGG 247
++T + VDGG
Sbjct: 245 HITMHNLCVDGG 256


23  NH44784_021101  NH44784_021151  Y  N  N  Genomic Island
LocusTag        DNBias  CDNBias  %GCBias     Product
NH44784_021101  1       22       -3.926023   Putative threonine efflux protein
NH44784_021111  1       26       -6.119739   hypothetical protein
NH44784_021121  1       30       -4.817366   protein of unknown function DUF1232
NH44784_021131  2       36       -4.551023   hypothetical protein
NH44784_021141  2       23       -1.012645   FIG00954026: hypothetical protein
NH44784_021151  2       18       0.405068    Hypothetical Protein
24  NH44784_022011  NH44784_022141  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
NH44784_022011  2       14       -3.228624   FIG002283: Isochorismatase family protein
NH44784_022021  2       14       -3.099125   FIG022886: Transcriptional regulator, LysR
NH44784_022031  3       13       -3.205959   DNA gyrase subunit B
NH44784_022041  2       13       -3.738697   DNA polymerase III beta subunit
NH44784_022051  1       11       -2.693852   Chromosomal replication initiator protein DnaA
NH44784_022061  3       14       -2.532844   hypothetical protein
NH44784_022071  0       13       -0.549983   LSU ribosomal protein L34p
NH44784_022081  -1      12       -0.104819   Ribonuclease P protein component
NH44784_022091  0       13       -0.342785   hypothetical protein
NH44784_022101  0       12       -0.557968   Inner membrane protein translocase component
NH44784_022111  -1      10       -0.021036   Catalase
NH44784_022121  1       10       0.576265    Cysteine synthase
NH44784_022131  1       12       -0.394505   hypothetical protein
NH44784_022141  2       13       -0.961216   Alpha-ketoglutarate-dependent taurine
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_022051  PYOCINKILLER  31     0.008    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 31.3 bits (70), Expect = 0.008
Identities = 12/45 (26%), Positives = 16/45 (35%)

Query: 43 TQGTTPRMPVAPVRAQPPVQTAPSAATPLGGAPPPGMPAQPAAPA 87
T P + + A PP PS+ TP+ P P P
Sbjct: 405 TTAEAPPLILTWTPASPPGNQNPSSTTPVVPKPVPVYEGATLTPV 449


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_022101  60KDINNERMP   464    e-161    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 464 bits (1194), Expect = e-161
Identities = 199/562 (35%), Positives = 311/562 (55%), Gaps = 36/562 (6%)

Query: 1 MIFSFSLLLLWNNWQIHNGKPSLFGAPTPTASTNANGTPAAANNATPSVPNAPPATAAAP 60
+ F ++W W+ A T +T A +AA
Sbjct: 10 IALLFVSFMIWQAWEQDKNPQPQ--AQQTTQTT------------------TTAAGSAAD 49

Query: 61 SAVPGAAAPVPTRSEEVVITTDVLRLTFDTLGAQLVRAELLKYPATGQADKPTVLLDRTA 120
VP + + + + + TDVL LT +T G + +A L YP + +P LL+ +
Sbjct: 50 QGVPASG-----QGKLISVKTDVLDLTINTRGGDVEQALLPAYPKELNSTQPFQLLETSP 104

Query: 121 GLNYVAQTGVVGAANGQNFPTHQTPFRVVSTDRQMTGD---SLAVAFE-AESGGVKVTKT 176
Y AQ+G+ G N P V D + + L V ++ G TKT
Sbjct: 105 QFIYQAQSGLTGRDGPDNPANGPRPLYNVEKDAYVLAEGQNELQVPMTYTDAAGNTFTKT 164

Query: 177 FTLHRGRYDIDVRHDLANVGAAPVTPALYLQLERDGNDP--ADTSS---FYHTFTGMAVY 231
F L RG Y ++V +++ N G P+ + + QL++ P DT S HTF G A
Sbjct: 165 FVLKRGDYAVNVNYNVQNAGEKPLEISSFGQLKQSITLPPHLDTGSSNFALHTFRGAAYS 224

Query: 232 SEQDKFQKMTFADIEKKKANYIKQADNGWIAVVQHYFATAWVPPQGKPRNNEVLEVQKNL 291
+ +K++K F I + N + GW+A++Q YFATAW+P N + +
Sbjct: 225 TPDEKYEKYKFDTIADNE-NLNISSKGGWVAMLQQYFATAWIPHNDGTNNFYTANLGNGI 283

Query: 292 YAARTIEAVGEIAPGAAARVDSHLWVGPQDQKAMAALAPGLELVVDYGWLTIIAKPLFTL 351
A + PG ++S LWVGP+ Q MAA+AP L+L VDYGWL I++PLF L
Sbjct: 284 AAIGYKSQPVLVQPGQTGAMNSTLWVGPEIQDKMAAVAPHLDLTVDYGWLWFISQPLFKL 343

Query: 352 MTWLHSILGNWGWTIVALTVLIKALFYPLAAASYRSMARMKQVAPRLQALKEKYGDDKQK 411
+ W+HS +GNWG++I+ +T +++ + YPL A Y SMA+M+ + P++QA++E+ GDDKQ+
Sbjct: 344 LKWIHSFVGNWGFSIIIITFIVRGIMYPLTKAQYTSMAKMRMLQPKIQAMRERLGDDKQR 403

Query: 412 LNAAMMEMYRTEKINPLGGCLPMLVQIPVFISLYWVLLASVEMRGAPWILWVHDLSIRDP 471
++ MM +Y+ EK+NPLGGC P+L+Q+P+F++LY++L+ SVE+R AP+ LW+HDLS +DP
Sbjct: 404 ISQEMMALYKAEKVNPLGGCFPLLIQMPIFLALYYMLMGSVELRQAPFALWIHDLSAQDP 463

Query: 472 YFILPAIMMATMFLQIKLNPTP-PDPVQAKVMMIMPLVFGGMMFFFPAGLVLYWCVNNTL 530
Y+ILP +M TMF K++PT DP+Q K+M MP++F +FP+GLVLY+ V+N +
Sbjct: 464 YYILPILMGVTMFFIQKMSPTTVTDPMQQKIMTFMPVIFTVFFLWFPSGLVLYYIVSNLV 523

Query: 531 SIAQQWSITRAMQRKTEAAANR 552
+I QQ I R ++++ + +
Sbjct: 524 TIIQQQLIYRGLEKRGLHSREK 545


25  NH44784_022271  NH44784_022401  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
NH44784_022271  -2      17       -3.292052   Tricarboxylate transport protein TctB
NH44784_022281  -2      16       -3.219952   Tricarboxylate transport membrane protein TctA
NH44784_022291  -2      14       -3.238556   Aspartate aminotransferase
NH44784_022301  -1      17       -2.841725   Ammonia monooxygenase
NH44784_022311  -1      16       -3.015997   Cold shock protein CspG
NH44784_022321  -2      13       -2.044513   tRNA uridine 5-carboxymethylaminomethyl
NH44784_022331  -1      11       -1.014742   rRNA small subunit methyltransferase, glucose
NH44784_022341  -2      11       -1.166231   Chromosome (plasmid) partitioning protein ParA /
NH44784_022351  -1      11       -0.392432   Putative acetyltransferase, GnaT family
NH44784_022361  -1      11       0.057608    Chromosome (plasmid) partitioning protein ParB /
NH44784_022371  2       10       0.467374    L-asparaginase
NH44784_022381  1       10       0.245609    LysR family transcriptional regulator lrhA
NH44784_022391  2       9        0.093778    hypothetical protein
NH44784_022401  2       10       0.391407    Cystathionine beta-lyase
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_022351  SACTRNSFRASE  34     3e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 33.8 bits (77), Expect = 3e-04
Identities = 14/75 (18%), Positives = 27/75 (36%), Gaps = 2/75 (2%)

Query: 184 CVAGGMAVRDG--ELAGLFDVVTDPGQRRKGHAATLVDHLLALCVEDGATTAYLQVEPQN 241
G + +R A + D+ R+KG L+ + E+ L+ + N
Sbjct: 75 NCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDIN 134

Query: 242 TAARALYGRYGFKDC 256
+A Y ++ F
Sbjct: 135 ISACHFYAKHHFIIG 149


26  NH44784_023291  NH44784_023371  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
NH44784_023291  -1      13       -3.002008   Deoxyguanosinetriphosphate triphosphohydrolase
NH44784_023301  2       28       -6.135196   Glutathione S-transferase
NH44784_023311  2       33       -7.274173   Excinuclease ABC subunit A
NH44784_023321  4       52       -10.818921  Putative transport protein
NH44784_023331  4       57       -12.286930  Single-stranded DNA-binding protein
NH44784_023341  4       60       -12.681852  putative enzyme; Integration, recombination
NH44784_023351  4       52       -11.293653  FIG00959392: hypothetical protein
NH44784_023361  1       31       -6.633463   Phosphonate ABC transporter permease protein
NH44784_023371  -1      23       -3.839910   Phosphonate ABC transporter phosphate-binding
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_023321  TCRTETA       61     3e-12    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 60.6 bits (147), Expect = 3e-12
Identities = 41/128 (32%), Positives = 58/128 (45%), Gaps = 2/128 (1%)

Query: 29 LGLFLLLPVFAVAARGL-PGGDDPARVGLALGMYGLTQAFMQIPFGLASDRWGRRPVVLF 87
+G+ L++PV R L D A G+ L +Y L Q G SDR+GRRPV+L
Sbjct: 19 VGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLV 78

Query: 88 GLLLFVAGSVVCAQADDVFWITIGRAIQG-AGAISAAVTAWLADATRDEVRTRAMAMVGG 146
L + A A ++ + IGR + G GA A A++AD T + R R +
Sbjct: 79 SLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDGDERARHFGFMSA 138

Query: 147 SIGLSFAA 154
G A
Sbjct: 139 CFGFGMVA 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_023331  PRTACTNFAMLY  28  0.024  Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 28.5 bits (63), Expect = 0.024
Identities = 19/75 (25%), Positives = 24/75 (32%)

Query: 122 DTGADRYSTEIVADQMQMLGGREDGGGGGGSYGGGGGYDDAPARQPQQRAPAQRPAPQQR 181
D G RY + L G + + G P QP+ AP +
Sbjct: 548 DIGTYRYRLAANGNGQWSLVGAKAPPAPKPAPQPGPQPPQPPQPQPEAPAPQPPAGRELS 607

Query: 182 PAPQAAPAGGGANLA 196
A AA GG LA
Sbjct: 608 AAANAAVNTGGVGLA 622


27  NH44784_023481  NH44784_023811  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
NH44784_023481317-0.846111hypothetical protein
NH44784_023491212-1.085453hypothetical protein
NH44784_023501110-1.0796034-hydroxybenzoate polyprenyltransferase
NH44784_023511-110-0.264920putative aminotransferase
NH44784_023521215-0.509993Free methionine-(R)-sulfoxide reductase,contains
NH44784_023531115-0.649092Dihydrodipicolinate synthase
NH44784_023541114-0.951702putative desaturase
NH44784_023551-113-0.592690putative fatty acid desaturase
NH44784_023561-113-0.9608393-oxoacyl-[acyl-carrier-protein] synthase III
NH44784_023571-111-1.443223FIG00431745: hypothetical protein
NH44784_023581-211-1.132338hypothetical protein
NH44784_023591-111-2.637515Type I antifreeze protein
NH44784_023601-115-3.808013Transporter
NH44784_023611123-5.446058Aspartyl-tRNA synthetase
NH44784_023621337-7.994825putative endonuclease/exonuclease/phosphatase
NH44784_023631241-9.381889Cardiolipin synthetase
NH44784_023641249-10.780437Permease of the drug/metabolite transporter
NH44784_023651249-11.552432Multidrug resistance protein B
NH44784_023661347-12.798740UDP-glucose dehydrogenase
NH44784_023671447-12.742196dTDP-glucose 4,6-dehydratase
NH44784_023681344-11.417711UDP-N-acetylglucosamine 4,6-dehydratase
NH44784_023691250-10.710861Bacillosamine/Legionaminic acid biosynthesis
NH44784_023701249-10.932023hypothetical protein
NH44784_023711252-10.248215Pseudaminic acid biosynthesis protein
NH44784_023721154-9.808826Imidazole glycerol phosphate synthase
NH44784_023731053-10.018041Imidazole glycerol phosphate synthase cyclase
NH44784_023741155-11.188458N-Acetylneuraminate cytidylyltransferase
NH44784_023751254-10.818289N-acetylneuraminate synthase
NH44784_023761153-10.034370ORF_14
NH44784_023771150-9.224188hypothetical protein
NH44784_023781150-8.627783hypothetical protein
NH44784_023791149-8.278984Putative
NH44784_023801046-6.932306Undecaprenyl-phosphate N-acetylglucosaminyl
NH44784_023811-139-5.784702UDP-N-acetylglucosamine 4,6-dehydratase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_023651  TCRTETB  103  7e-26  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 103 bits (257), Expect = 7e-26
Identities = 82/413 (19%), Positives = 155/413 (37%), Gaps = 37/413 (8%)

Query: 27 VVLAAFDSTIISTTLPRVAEALDGM-ALYAWVGTGYLLTTAASILIFGRLGDMFGRKSLM 85
+ + +++ +LP +A + A WV T ++LT + ++G+L D G K L+
Sbjct: 23 SFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLL 82

Query: 86 LVSVLIVALGSIACGLAQSM-EQLIAFRSLQGIGGGMMIATAFAAPADLFPDARQRVRWM 144
L ++I GS+ + S LI R +QG G A A P R +
Sbjct: 83 LFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKEN-RGKAF 141

Query: 145 ALISAAFAMASGIGPVLGGTATQVLGWRAAFFISPIAALAALALLARYFPRIPPPHAADR 204
LI + AM G+GP +GG + W I I
Sbjct: 142 GLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMI--TIITVPFLMKLL--KKEVRIKG 197

Query: 205 RIDWIGAILLVAAVGAPLAALELAFAPGDSGQPMLALCLAFIGVAAIALLIPIERRLASP 264
D G IL+ + + F + + L ++ + + + + R++ P
Sbjct: 198 HFDIKGIILMSVGI--------VFFMLFTTSYSISFLIVSVL---SFLIFVKHIRKVTDP 246

Query: 265 IFPLRVLASQEPRLLNLAATMVGAVMFVLIFYS--------PLLLQQVLGYTPSEAG-LL 315
+P L M+G + +IF + P +++ V + +E G ++
Sbjct: 247 FV--------DPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVI 298

Query: 316 LTPLVAAISVGSIINGRLYPKQSEPQRLMVFGSCLLATGTLAVLLISSATPAWWILAAFF 375
+ P ++ + I G L ++ P ++ G L+ L + T W++
Sbjct: 299 IFPGTMSVIIFGYIGGILVDRRG-PLYVLNIGVTFLSVSFLTASFLLETTS-WFMTIIIV 356

Query: 376 INGCALGFLLPNLTLFMQLLSERRDIGVASALVQTTRAIGSALGTAIVGILIS 428
L F ++ + ++++ G +L+ T + G AIVG L+S
Sbjct: 357 FVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLS 409


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_023671  NUCEPIMERASE  172  1e-53  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 172 bits (438), Expect = 1e-53
Identities = 78/336 (23%), Positives = 135/336 (40%), Gaps = 38/336 (11%)

Query: 3 RILVTGGAGFLGSHLCERLLNEGNDVLCVDN----YFTGNKDNIAHLMSNPHFEAMRHDV 58
+ LVTG AGF+G H+ +RLL G+ V+ +DN Y K L++ P F+ + D+
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 59 T-------IPLYVEVDEIYNLACPASPVHYQF-DPVQTTKTSVHGAINMLGLAKRVKAK- 109
+ + ++ + V Y +P +++ G +N+L + K +
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLA-VRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 110 ILQASTSEVYGDPEIHPQTEAYWGNVNPIGLRSCYDEGKRCAETLFFDYWRQYSVKIKVV 169
+L AS+S VYG P + + +P+ S Y K+ E + Y Y + +
Sbjct: 121 LLYASSSSVYGLNRKMPFSTDDSVD-HPV---SLYAATKKANELMAHTYSHLYGLPATGL 176

Query: 170 RIFNTYGPRMHPNDGRVVSNFIVQALRGDDITIYGDGSQTRSFCYVDDLIEAFIRAMGTP 229
R F YGP P+ + F L G I +Y G R F Y+DD+ EA IR
Sbjct: 177 RFFTVYGPWGRPD--MALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVI 234

Query: 230 DDFTGPV-----------------NIGNPAEFTMLELAEKVIALTGTASKIVFKPLPSDD 272
NIGN + +++ + + G +K PL D
Sbjct: 235 PHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLPLQPGD 294

Query: 273 PKQRQPNISLAKSAMNGWEPSVQLDQGLGKTIDYFR 308
+ + + G+ P + G+ ++++R
Sbjct: 295 VLETSADTKALYEVI-GFTPETTVKDGVKNFVNWYR 329


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_023681  NUCEPIMERASE  76  1e-17  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 75.6 bits (186), Expect = 1e-17
Identities = 50/267 (18%), Positives = 94/267 (35%), Gaps = 42/267 (15%)

Query: 6 SILITGGTGSFGKAFVRTVLERYPEVKRLVIFSR---DELKQFEMSQEFPEAKYPAVRYF 62
L+TG G G + +LE +V + + LKQ + P ++
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLEL----LAQPGFQFH 57

Query: 63 IGDVRDEARIRR--ALEGIDVVIHAAALKQVPAAEYNPFECIKTNVLGAQNLIEACLDSG 120
D+ D + A + V + V + NP +N+ G N++E C +
Sbjct: 58 KIDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNK 117

Query: 121 VKRVVALST---------------DKAAAPINLYGATKLCSDKLFVAANNIKGNRDIRFS 165
++ ++ S+ D P++LY ATK ++ + +++ G + +
Sbjct: 118 IQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYG---LPAT 174

Query: 166 VVRYGNVLGSRGS---VVPFFLNKRKSGVLPIT---DSDMTR---FNISLQEGVDMVLWS 216
+R+ V G G + F G I M R + + E + +
Sbjct: 175 GLRFFTVYGPWGRPDMALFKFTKAMLEGK-SIDVYNYGKMKRDFTYIDDIAEAIIRLQDV 233

Query: 217 I---ENAWGGEVLVPKIPS--YRVVDV 238
I + W E P YRV ++
Sbjct: 234 IPHADTQWTVETGTPAASIAPYRVYNI 260


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_023801  PREPILNPTASE  30  0.015  Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 29.8 bits (67), Expect = 0.015
Identities = 25/99 (25%), Positives = 37/99 (37%), Gaps = 1/99 (1%)

Query: 134 LGALLAAVALVWLLNLYNFMDGIDGIAAAEAVCVCSGGAILYLLTGRPEAGYAPLLLAAA 193
L L + L + D + G A A + + S LLTG+ GY L AA
Sbjct: 163 LPLLWGGLLFNLLGGFVSLGDAVIG-AMAGYLVLWSLYWAFKLLTGKEGMGYGDFKLLAA 221

Query: 194 AAGFLCWNFPPAKIFMGDVGSGFLGITLGVLALQAGWTA 232
+L W P + + + F+GI L +L
Sbjct: 222 LGAWLGWQALPIVLLLSSLVGAFMGIGLILLRNHHQSKP 260


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_023811  NUCEPIMERASE  78  1e-17  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 77.5 bits (191), Expect = 1e-17
Identities = 55/298 (18%), Positives = 110/298 (36%), Gaps = 42/298 (14%)

Query: 281 TVLVTGAGGSIGSELCRQLARFSPARLVLVEANEFALYNVEQWFHQHWPQLELVLLAGDV 340
LVTGA G IG + ++L + + N++ +++Q + Q D+
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 341 KDAARMEEVFQGWRPDVVFHAAAYKHVPLMEVANAWQAARNNVLGTLRVAECAVRYGASR 400
D M ++F + VF + V + N A +N+ G L + E
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVR-YSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 401 FVLIST---------------DKAVNPTNVMGATKRMAEMVCESLYRRHQTTQFSMVRFG 445
+ S+ D +P ++ ATK+ E++ + Y + +RF
Sbjct: 121 LLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHT-YSHLYGLPATGLRFF 179

Query: 446 NVLGSTGS---VIPKFQAQIARGGPVTV-THPEINRYFMSIPEAAQLVLQA--------- 492
V G G + KF + G + V + ++ R F I + A+ +++
Sbjct: 180 TVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVIPHADT 239

Query: 493 ---------ASMGAGGEIFVLDMGHPVKIVDLARNMIRLSGYSEAQIRIEFTGLRPGE 541
A+ A ++ + PV+++D + + G + + L+PG+
Sbjct: 240 QWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALG---IEAKKNMLPLQPGD 294


28  NH44784_023931  NH44784_024201  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
NH44784_023931-121-4.227442D-alanyl-D-alanine carboxypeptidase
NH44784_023941134-6.850198D-alanine aminotransferase
NH44784_023951240-7.939921Proposed lipoate regulatory protein YbeD
NH44784_023961243-8.067229Octanoate-[acyl-carrier-protein]-protein-N-
NH44784_023971247-9.593165FOG: TPR repeat, SEL1 subfamily
NH44784_023981253-11.314381hypothetical protein
NH44784_023991351-11.344550Glucose-1-phosphate thymidylyltransferase
NH44784_024001349-10.897288hypothetical protein; putative methyl
NH44784_024011351-11.261295hypothetical protein
NH44784_024021449-11.426023hypothetical protein
NH44784_024031350-11.2200123-oxoacyl-[acyl-carrier-protein]
NH44784_024041146-10.787863UDP-3-O-(3-hydroxymyristoyl) glucosamine
NH44784_024051040-9.955435hypothetical protein
NH44784_024061123-7.173979hypothetical protein
NH44784_024071119-6.187928putative transcription regulator transcription
NH44784_024081017-5.987874Lipoate synthase
NH44784_024091113-4.773822dTDP-glucose 4,6-dehydratase
NH44784_024101215-2.850924hypothetical protein
NH44784_024111013-2.681877ATP-dependent hsl protease ATP-binding subunit
NH44784_024121-112-1.804274ATP-dependent protease HslV
NH44784_024131013-1.575213C4-type zinc finger protein, DksA/TraR family
NH44784_024141112-1.202036Putative metal chaperone, involved in Zn
NH44784_0241512180.881352Zinc uptake regulation protein ZUR
NH44784_0241614191.146747Zinc ABC transporter, ATP-binding protein ZnuC
NH44784_0241714161.187613Zinc ABC transporter, inner membrane permease
NH44784_0241813140.910513Zinc ABC transporter, periplasmic-binding
NH44784_0241913130.902691hypothetical protein
NH44784_0242013111.678643hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_023931  BLACTAMASEA  39  2e-05  Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 38.6 bits (90), Expect = 2e-05
Identities = 35/152 (23%), Positives = 54/152 (35%), Gaps = 21/152 (13%)

Query: 57 IVIDVNSGQTLAASNPDMKVEPASLTKIMTAYVVFNALEEKRLTLEQTVPVSEHAWRTGG 116
I +D+ SG+TL A D + S K++ V ++ LE+ + +
Sbjct: 43 IEMDLASGRTLTAWRADERFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQD----- 97

Query: 117 SRMFIEPRKPVTVDELNQGM---------IVQSGNDASVALAEAVGGSEA--AFATLMNQ 165
+ PV+ L GM I S N A+ L VGG AF +
Sbjct: 98 ----LVDYSPVSEKHLADGMTVGELCAAAITMSDNSAANLLLATVGGPAGLTAFLRQIGD 153

Query: 166 EAERLGMRNTHFMNATGLPDPQHMTSTRDLAT 197
RL R +N D + T+ +A
Sbjct: 154 NVTRLD-RWETELNEALPGDARDTTTPASMAA 184


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_024091  NUCEPIMERASE  159  5e-48  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 159 bits (403), Expect = 5e-48
Identities = 85/352 (24%), Positives = 139/352 (39%), Gaps = 42/352 (11%)

Query: 1 MSIIVTGGAGFIGSNFVLDWLSVKDEAVTTLDKLT--YAGNLENL-ASLAGDSRHSFIHG 57
M +VTG AGFIG + L + V +D L Y +L+ L F
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQ-VVGIDNLNDYYDVSLKQARLELLAQPGFQFHKI 59

Query: 58 DIGDSRLVGDVLARHQPRAIINFAAESHVDRSIHAPAAFIQTNVVGTFQLLESVREYWSA 117
D+ D + D+ A + V S+ P A+ +N+ G +LE R
Sbjct: 60 DLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHN--K 117

Query: 118 LQEPARSNFRLLHVSTDEVYGSLDAHAAAFQETTAYAPNSPYAASKAASDHLMRAYHHTY 177
+Q LL+ S+ VYG L+ + + P S YAA+K A++ + Y H Y
Sbjct: 118 IQ-------HLLYASSSSVYG-LNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLY 169

Query: 178 GLPILTTNCSNNYGPFHFPEKLIPLLIHHALAGKPLPIYGDGQQIRDWLYVKDHCSALRR 237
GLP YGP+ P+ + L GK + +Y G+ RD+ Y+ D A+ R
Sbjct: 170 GLPATGLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIR 229

Query: 238 VLEAGR------------------PGQTYNVGGNNERTNLEVAHHICDLLDELQPRVDGR 279
+ + P + YN+ GN+ L D + L+ +
Sbjct: 230 LQDVIPHADTQWTVETGTPAASIAPYRVYNI-GNSSPVELM------DYIQALEDALGIE 282

Query: 280 SYRQQIVFVADRPGHDRRYAVDSRRLEQELNWTPAETFESGMLQTVRWYLAN 331
+ + + +PG + D++ L + + +TP T + G+ V WY
Sbjct: 283 A---KKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDF 331


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_024101  NEISSPPORIN  26  0.037  Neisseria sp. porin signature.
		>NEISSPPORIN#Neisseria sp. porin signature.

Length = 348

Score = 25.7 bits (56), Expect = 0.037
Identities = 15/37 (40%), Positives = 19/37 (51%)

Query: 57 ALDLPAMTLVYDADPALLAKIKAGDKVRFTAERKDGK 93
AL L A+ + AD L IKAG + + E DGK
Sbjct: 7 ALTLAALPVAAMADVTLYGAIKAGVQTYRSVEHTDGK 43


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_024181  adhesinb  262  1e-88  Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 262 bits (671), Expect = 1e-88
Identities = 96/325 (29%), Positives = 163/325 (50%), Gaps = 35/325 (10%)

Query: 13 ALLAAAGLWLSTAFGAAHAAEPLPVVASFSILGDIVREVGGDDIKLGTLVGPDGDAHEYE 72
A + A + +++ L VVA+ SI+ DI + + GD I L ++V D HEYE
Sbjct: 13 AFVGLAACSSQKSSTETGSSK-LNVVATNSIIADITKNIAGDKINLHSIVPVGQDPHEYE 71

Query: 73 PTPGDAKKLAAARILFVNGLDFE----AWLPRLVKAAGFTG--PTVVASKGVTPRKFAGH 126
P P D KK + A ++F NG++ E AW +LV+ A S+GV G
Sbjct: 72 PLPEDVKKTSQADLIFYNGINLETGGNAWFTKLVENAKKKENKDYYAVSEGVDVIYLEGQ 131

Query: 127 EHGTHDHGHGGKDHDHDHDHDHGAGEGHHHHGDADPHAWQNLANGVTYARNVAEGLAAAD 186
G DPHAW NL NG+ YA+N+A+ L+ D
Sbjct: 132 SEK----------------------------GKEDPHAWLNLENGIIYAQNIAKRLSEKD 163

Query: 187 PAHADAYRKRADAYIARLQAADAAARKAFAAIPAERRKVVTSHDAFGYFGDAYGVDFIAA 246
PA+ + Y K AY+ +L A D A++ F IP E++ +VTS F YF AY V
Sbjct: 164 PANKETYEKNLKAYVEKLSALDKEAKEKFNNIPGEKKMIVTSEGCFKYFSKAYNVPSAYI 223

Query: 247 MGVSTAAEPSAGDVARIIEQVKRDKVPAVFVENITSPRLVQQIARETGAKVGGTLYSDAL 306
++T E + + ++E++++ KVP++FVE+ R ++ ++++T + +++D++
Sbjct: 224 WEINTEEEGTPDQIKTLVEKLRKTKVPSLFVESSVDDRPMKTVSKDTNIPIYAKIFTDSV 283

Query: 307 SKPGQPGATYLEMFEWNVRQLTAAL 331
++ G+ G +Y M ++N+ ++ L
Sbjct: 284 AEKGEEGDSYYSMMKYNLEKIAEGL 308


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_024201  FLGMRINGFLIF  30  0.014  Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 30.3 bits (68), Expect = 0.014
Identities = 14/36 (38%), Positives = 22/36 (61%), Gaps = 2/36 (5%)

Query: 140 QATRWLELASY--ALVTLLGAWLVWIKAIRPLLRRR 173
Q + +L + L+ L+ AW++W KA+RP L RR
Sbjct: 451 QQSFIDQLLAAGRWLLVLVVAWILWRKAVRPQLTRR 486


29  NH44784_025911  NH44784_026281  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
NH44784_025911217-1.048461hypothetical protein
NH44784_025921115-1.103740putative transcriptional regulator
NH44784_025931217-1.729054Glutamate 5-kinase
NH44784_025941313-1.348684COG0536: GTP-binding protein Obg
NH44784_025951220-1.940162LSU ribosomal protein L27p
NH44784_025961116-0.513795LSU ribosomal protein L21p
NH44784_0259711141.221546Octaprenyl-diphosphate synthase
NH44784_0259912141.694710*hypothetical protein
NH44784_0260012131.504750Biotin carboxylase
NH44784_026011211-0.363411Hydrolases of the alpha/beta superfamily
NH44784_026021210-0.306919Nitrilase/cyanide hydratase and apolipoprotein
NH44784_02603119-0.035203Beta-lactamase
NH44784_026041190.063565FIG00732247: hypothetical protein
NH44784_0260511130.141427hypothetical protein
NH44784_0260610120.559871Malate Na(+) symporter
NH44784_0260711142.864210transcriptional regulator of protochatechuate
NH44784_0260811103.730635Quinone oxidoreductase
NH44784_026091-1163.335881IroE protein
NH44784_0261010152.842601transcriptional regulator, LysR family
NH44784_0261113133.011026Mercuric resistance operon regulatory protein
NH44784_0261212142.958061Non-heme chloroperoxidase
NH44784_0261311142.505356hypothetical protein
NH44784_0261412182.832683hypothetical protein
NH44784_0261512163.152108Ribulose-5-phosphate 4-epimerase and related
NH44784_0261611142.979899Hydroxypyruvate isomerase
NH44784_0261712122.539097Nucleoside-diphosphate-sugar epimerases
NH44784_0261812132.239997D-beta-hydroxybutyrate dehydrogenase
NH44784_026191-1111.729316Predicted pyridoxine biosynthesis protein
NH44784_026201-1101.268889regulatory protein GntR, HTH:GntR, C-terminal
NH44784_026211-1120.916319transcriptional regulator, LysR family
NH44784_0262210120.818505major facilitator superfamily MFS_1
NH44784_026231-1120.624547hypothetical protein
NH44784_026241-2120.158034Zinc-regulated outer membrane receptor
NH44784_026251214-0.023724Hydroxypyruvate isomerase
NH44784_026261118-3.461647hypothetical protein
NH44784_026271015-3.006570Putative DMT superfamily metabolite efflux
NH44784_026281-312-3.089788Lysine exporter protein (LYSE/YGGA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_025931  CARBMTKINASE  37  7e-05  Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 37.1 bits (86), Expect = 7e-05
Identities = 29/106 (27%), Positives = 39/106 (36%), Gaps = 7/106 (6%)

Query: 138 GVVPIVNENDTVVTDEIRLGDNDTLGALVTNLIEADTLIILTDQRGLYDSDPRKNPDAKF 197
G VP++ E+ + E + D D G + + AD +ILTD G +
Sbjct: 195 GGVPVILEDGEIKGVEAVI-DKDLAGEKLAEEVNADIFMILTDVNGAALYYGTEKEQWLR 253

Query: 198 MAHVQAGDPALEAMAGGAGSGIGTGGMLTKVLAAKRAAHSGAHTVI 243
V+ E AGS M KVLAA R G I
Sbjct: 254 EVKVEELRKYYEEGHFKAGS------MGPKVLAAIRFIEWGGERAI 293


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_025951  TYPE3IMRPROT  25  0.024  Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 25.5 bits (56), Expect = 0.024
Identities = 9/31 (29%), Positives = 13/31 (41%)

Query: 33 AGSIIVRQRGTRFHAGVNVGMGKDHTLFALV 63
AG II Q G F V+ + + A +
Sbjct: 98 AGEIIGLQMGLSFATFVDPASHLNMPVLARI 128


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_026031  BLACTAMASEA  47  3e-08  Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 46.7 bits (111), Expect = 3e-08
Identities = 59/254 (23%), Positives = 95/254 (37%), Gaps = 27/254 (10%)

Query: 31 ARVSLSLRDL-DGGTVLAHHADRVQPSASIIKVPILLALLEAMANGRYALEQPLAL---- 85
RV + DL G T+ A AD P S KV + A+L + G LE+ +
Sbjct: 38 GRVGMIEMDLASGRTLTAWRADERFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQD 97

Query: 86 -----PACGDRAGGTGILAQLPSVASLSLAELARLMIVLSDNAATNALIDLL-GFDAINQ 139
P +++ EL I +SDN+A N L+ + G +
Sbjct: 98 LVDYSPVSEKHLADG-----------MTVGELCAAAITMSDNSAANLLLATVGGPAGLTA 146

Query: 140 WSQRAGLAASRLQRRMMDAAAREAGRDNFTSADDATAA-LCWMLRDGMLPSPLRTFALD- 197
+ ++ G +RL R + G T+ + AA L +L L + + L
Sbjct: 147 FLRQIGDNVTRLDRWETELNEALPGDARDTTTPASMAATLRKLLTSQRLSARSQRQLLQW 206

Query: 198 LLADQRERAHFGAALPAPAHLASKAGQLP-GLRHDAGILTVADRS--VVLAVLADGFTDW 254
++ D+ + LPA +A K G G R +L +++ +V+ L D
Sbjct: 207 MVDDRVAGPLIRSVLPAGWFIADKTGAGERGARGIVALLGPNNKAERIVVIYLRDTPASM 266

Query: 255 QTAQTLQGGEGAAL 268
G GAAL
Sbjct: 267 AERNQQIAGIGAAL 280


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_026171  NUCEPIMERASE  91  4e-23  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 91.4 bits (227), Expect = 4e-23
Identities = 54/219 (24%), Positives = 90/219 (41%), Gaps = 28/219 (12%)

Query: 1 MKILITGGAGFLGQRLARKLLEQGRLALSGERVPISQID-----LLDVTRTDAINDTRVR 55
MK L+TG AGF+G ++++LLE G + V I ++ L R + + +
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGH-----QVVGIDNLNDYYDVSLKQARLELLAQPGFQ 55

Query: 56 SVEGDVADPDCLRSLIG-ADTAVIFHLAAIVSGQAEADFDLG-----MRINLDASRALLE 109
+ D+AD + + L +F + + L NL +LE
Sbjct: 56 FHKIDLADREGMTDLFASGHFERVFISPH----RLAVRYSLENPHAYADSNLTGFLNILE 111

Query: 110 ACRRQGHRPRVVFTSSVAVYGGA--LPETVRDDTALNPQSSYGTQKAIAELLLADYTRRG 167
CR + +++ SS +VYG +P + DD+ +P S Y K EL+ Y+
Sbjct: 112 GCRHNKIQ-HLLYASSSSVYGLNRKMPFST-DDSVDHPVSLYAATKKANELMAHTYSHLY 169

Query: 168 FVDGRALRLPTISVRPGRPNAAASSFASGIIREPLNGEP 206
+ LR T+ GRP+ A F + L G+
Sbjct: 170 GLPATGLRFFTVYGPWGRPDMALFKFTKAM----LEGKS 204


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_026201  DHBDHDRGNASE  31  0.002  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 31.2 bits (70), Expect = 0.002
Identities = 19/49 (38%), Positives = 23/49 (46%), Gaps = 5/49 (10%)

Query: 186 EARHADQMRAVLREHEAIAEAIRRQDEDAAREAARI-HMVNAAKRLRLA 233
EARHA+ A +R+ AI E R RE I +VN A LR
Sbjct: 55 EARHAEAFPADVRDSAAIDEITAR----IEREMGPIDILVNVAGVLRPG 99


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_026221  TCRTETB  29  0.029  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 29.5 bits (66), Expect = 0.029
Identities = 24/143 (16%), Positives = 54/143 (37%), Gaps = 15/143 (10%)

Query: 78 VGGMLIGSYADRHGRRAAMTLTLWLMGLGCALIAAAPTHAQMGLLGPVMMVLARLIQGFA 137
+G + G +D+ G + + + + G + H+ LL ++AR IQG
Sbjct: 64 IGTAVYGKLSDQLGIKRLLLFGIIINCFGS--VIGFVGHSFFSLL-----IMARFIQGAG 116

Query: 138 AGGEVGASTTLLVEHAPPSRRGFYSSWQFGSQSLGVALGAVVVGTLTAALSAEQMQAWGW 197
A ++ + P RG ++G +G + G + + W
Sbjct: 117 AAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYI--------HW 168

Query: 198 RVPFVIGILTVPVGAYIRRNLEE 220
+I ++T+ ++ + L++
Sbjct: 169 SYLLLIPMITIITVPFLMKLLKK 191


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_026241  PF00577  32  0.011  Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 31.7 bits (72), Expect = 0.011
Identities = 19/125 (15%), Positives = 33/125 (26%), Gaps = 14/125 (11%)

Query: 575 VEGELRHQFTP---VFSAAVFGDYVRGKLTGGDGNLPRIPA-------ARAGLRGNVKWQ 624
+ L H ++ D R G N+ + A A + L + +
Sbjct: 396 FQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHD 455

Query: 625 QWSGGIEYARVFSQKD----IAAYESSTPGYNLVNAVVAYRGRYGATGYEVYLRGTNLLN 680
S Y + ++ + Y ST GY R + +
Sbjct: 456 GQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIETQDGVIQVKPKF 515

Query: 681 KLAYN 685
YN
Sbjct: 516 TDYYN 520


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_026271  PF06580  29  0.021  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 29.1 bits (65), Expect = 0.021
Identities = 16/102 (15%), Positives = 34/102 (33%), Gaps = 5/102 (4%)

Query: 63 IALYGVTLACMNLLFYMALRTLPLGVAIAIEFTGPLTLAVVLSRRAIDFVWIACALAGLV 122
++ + ++ M L+ A R+ G + L V+ + I VW +
Sbjct: 41 SMIFNIAISLMGLVLTHAYRSFIKRQGWLKLNMGQIILRVLPACVVIGMVWFVANTSIWR 100

Query: 123 LLIPTGQSMHDLDPVGIAYALGAAVCWALYIIFGKMAGNVHG 164
LL PV L ++ + + ++ + G
Sbjct: 101 LLAFINTK-----PVAFTLPLALSIIFNVVVVTFMWSLLYFG 137


30  NH44784_026481  NH44784_026591  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
NH44784_0264812131.890726Permeases of the major facilitator superfamily
NH44784_0264911121.193024transcriptional regulator, LysR family
NH44784_0265011140.115115hypothetical protein
NH44784_026511-110-2.077224Transcriptional regulator, LysR family
NH44784_026521-212-3.982620hypothetical protein
NH44784_026531329-8.148556hypothetical protein
NH44784_026541331-9.150038transposase, IS4 family protein
NH44784_026551431-8.720757Glucosamine--fructose-6-phosphate
NH44784_026561537-9.171528Glycosyl transferase, group 2 family protein
NH44784_026571636-8.684234D amino acid oxidase (DAO) family
NH44784_026581635-8.436692hypothetical protein
NH44784_026591027-4.245131FIG01204965: hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_026481  TCRTETB  103  6e-26  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 103 bits (259), Expect = 6e-26
Identities = 86/405 (21%), Positives = 157/405 (38%), Gaps = 17/405 (4%)

Query: 16 VCLAALMFGLEISSVPVILPTLEQALHSGFQELQWIMNAYTIACTTVLMAAGTLADRYGR 75
+C+ + L + V LP + + W+ A+ + + G L+D+ G
Sbjct: 19 LCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGI 78

Query: 76 KRLLLASLAVFGLTSLVCGWAQDT-GVLIAGRALQGLSGGAMLICQVAVLSHRFQDGAAR 134
KRLLL + + S++ +LI R +QG +G A V V+ R+ R
Sbjct: 79 KRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQG-AGAAAFPALVMVVVARYIPKENR 137

Query: 135 GRAFGAWGIAFGVGLGFGPVIGSAIVAVASWQWVFLAHGPLALIALGLAWKGVDESRDPR 194
G+AFG G +G G GP IG I W ++ L P+ I + +
Sbjct: 138 GKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLI--PMITIITVPFLMKLLKKEVRI 195

Query: 195 QRKLDVAGIVSLSVAVFGLAWFITQGPSAGFASLATLASLGAAIVALVVFVIAERRCADP 254
+ D+ GI+ +SV + F T S F + ++++ ++FV R+ DP
Sbjct: 196 KGHFDIKGIILMSVGIVFFMLFTTSY-SISFLIV--------SVLSFLIFVKHIRKVTDP 246

Query: 255 MFDFAVFRIRRFSGALLASAAMNFSFWPFMIYLPLYFQNGLGYGGLGTGLS-LLAYTLPT 313
D + + F +L + + F+ +P ++ G + T+
Sbjct: 247 FVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSV 306

Query: 314 LVAPPLGEKLALRYRPDTVIPAGLFAICLGFALMKWGSGVAQASWLTMLPGCLLAGTGLG 373
++ +G L R P V+ G+ + + F + SW + + G GL
Sbjct: 307 IIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTA--SFLLETTSWFMTIIIVFVLG-GLS 363

Query: 374 LINTPVTNTTTGAVPGDRAGMASGIDISVRMISLSINIALMGFLL 418
T ++ + ++ AG + +S IA++G LL
Sbjct: 364 FTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLL 408


31  NH44784_027411  NH44784_027841  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
NH44784_0274112101.338536response regulator, NarL-family
NH44784_0274211100.963120two-component hybrid sensor and regulator
NH44784_027431090.541190Oxalate/formate antiporter
NH44784_0274410100.488219cupin domain protein
NH44784_027451-190.246886putative integrase (Remnant
NH44784_027461-190.579167hypothetical protein
NH44784_027471-1100.604949hypothetical protein
NH44784_027481214-1.107552putative phage repressor
NH44784_027491213-1.396074hypothetical protein
NH44784_027501315-2.540412hypothetical protein
NH44784_027511325-2.958965Phage tail assembly chaperone
NH44784_027521330-5.035921hypothetical protein
NH44784_027531226-5.155270COG5281: Phage-related minor tail protein
NH44784_027541326-4.448324hypothetical protein
NH44784_027551124-3.694505hypothetical protein
NH44784_027561212-2.500380hypothetical protein
NH44784_027571211-2.518448hypothetical protein
NH44784_02758119-1.725189Phage minor tail protein
NH44784_02759119-1.281306Phage tail assembly protein
NH44784_02760118-1.316116Phage tail assembly protein I
NH44784_02761128-0.863884Phage tail fiber protein
NH44784_02762118-0.461936hypothetical protein
NH44784_02763119-0.387668hypothetical protein
NH44784_027641110-0.341840Phage tail assembly protein I
NH44784_027651010-0.879202transcriptional regulator, LysR-family
NH44784_02766109-1.537730D-beta-hydroxybutyrate dehydrogenase
NH44784_027671116-2.088432Leucine-, isoleucine-, valine-, threonine-, and
NH44784_027681121-2.343164hypothetical protein AF1210
NH44784_027691346-7.044608hypothetical protein
NH44784_027701245-8.777351Peptidyl-prolyl cis-trans isomerase ppiC
NH44784_027711146-9.002815Glutamate Aspartate periplasmic binding protein
NH44784_027721155-10.563450hypothetical protein
NH44784_027731150-12.264631hypothetical protein
NH44784_027741049-11.827915hypothetical protein
NH44784_027751-138-10.341412Integral inner membrane protein of type IV
NH44784_027761-134-9.075691hypothetical protein
NH44784_027771-239-6.994063Minor pilin of type IV secretion complex (VirB5
NH44784_027781-143-6.305506FIG00431650: hypothetical protein
NH44784_027791-241-6.888765IncQ plasmid conjugative transfer DNA nicking
NH44784_027801-136-5.884132hypothetical 20.3 kDa protein
NH44784_027811-223-4.072488FIG00482846: hypothetical protein
NH44784_027821-219-3.477806Transcriptional regulator, GntR family domain /
NH44784_027831011-3.671043hypothetical protein
NH44784_027841010-3.553993FIG01212980: hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_027411  HTHFIS  70  2e-16  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 70.2 bits (172), Expect = 2e-16
Identities = 33/156 (21%), Positives = 60/156 (38%), Gaps = 4/156 (2%)

Query: 11 RVLLADDHGIVREGLKMVLQQAPAMIGSIDEAATGEQVLALLAAHGADVLVLDLGMPGVA 70
+L+ADD +R L L +A + + + +AA D++V D+ MP
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGY---DVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 71 GSGWVRALRDRHPALHIMVLTANTDARSRQAILEAGADAYLAKTGSSHELMAAIQR-LHQ 129
+ ++ P L ++V++A + E GA YL K EL+ I R L +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 130 DPADRPVRARPNHTPAPADTLTRREQQVLALAAQGA 165
+ P + Q++ + A+
Sbjct: 122 PKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLM 157


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_027421  PF06580  41  5e-06  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 41.4 bits (97), Expect = 5e-06
Identities = 41/191 (21%), Positives = 73/191 (38%), Gaps = 40/191 (20%)

Query: 258 ARRQQTVLDSLNRLLDGLLDVSRMDAHLLRIE----RRAVDLDQL-FDDIRLDFESLARA 312
+ + +L SL+ L+ L S L E + L + F+D RL FE
Sbjct: 190 PTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFED-RLQFE----- 243

Query: 313 RGLRLSVAPTSLALDTDPALLRRILDNLLTNALSYTPKGG-VLLSARRRRGGVLIQVWDT 371
+ P + + P L++ +++N + + ++ P+GG +LL + G V ++V +T
Sbjct: 244 ----NQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENT 299

Query: 372 GVGIAPELQEAVFTEFTRGAGAPASPPAGQGLGLGLAIVRRLAGLLHGE---ITLRSTPG 428
G + G GL VR +L+G I L G
Sbjct: 300 GSLALKN--------------------TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQG 339

Query: 429 KGSVFSLWLPA 439
K + + +P
Sbjct: 340 KVNAM-VLIPG 349


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_027431  TCRTETB  36  3e-04  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 36.0 bits (83), Expect = 3e-04
Identities = 29/129 (22%), Positives = 58/129 (44%), Gaps = 4/129 (3%)

Query: 65 LTWGVVQPFTGMIADRHGSAKVILGGLACYALGLAGMAHAGTVTAFMLSAGVCIGIALSG 124
LT+ + G ++D+ G +++L G+ G + + ++ A G
Sbjct: 60 LTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAG--A 117

Query: 125 TAF-AVIYGALSRLVPPERRGWALGVAGAVGGLGQFTMVPAAQWLIGSRGWVDALLIFAV 183
AF A++ ++R +P E RG A G+ G++ +G+ + PA +I LL+ +
Sbjct: 118 AAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGE-GVGPAIGGMIAHYIHWSYLLLIPM 176

Query: 184 VLAVVVPLA 192
+ + VP
Sbjct: 177 ITIITVPFL 185


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_027661  DHBDHDRGNASE  91     9e-24    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 90.9 bits (225), Expect = 9e-24
Identities = 65/254 (25%), Positives = 103/254 (40%), Gaps = 15/254 (5%)

Query: 18 LTGQRVIVTAGAAGIGAAIAGAFAARGAQVHICDVDERALAACPHPWS---------RAD 68
+ G+ +T A GIG A+A A++GA + D + L AD
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 69 VSQRDQIDRYMQEALARLGGLDVLVNNAGIAGPTAGIADIAPQELDATLDINLASQFHTV 128
V ID +G +D+LVN AG+ P I ++ +E +AT +N F+
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGL-IHSLSDEEWEATFSVNSTGVFNAS 124

Query: 129 RHALPALRQAGGGSIINISSVAGRMGVPMRTPYAATKWGVVGLTRSLAVELGGYGIRVNA 188
R + GSI+ + S + YA++K V T+ L +EL Y IR N
Sbjct: 125 RSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNI 184

Query: 189 LLPGL--VAGPRIDRVIEARARNMGVTVAEETRLELAGVSLGQFVRAADIANMALFLASP 246
+ PG E A + E + G+ L + + +DIA+ LFL S
Sbjct: 185 VSPGSTETDMQWSLWADENGAEQVIKGSLETFKT---GIPLKKLAKPSDIADAVLFLVSG 241

Query: 247 FGAMVSGQAISIDG 260
++ + +DG
Sbjct: 242 QAGHITMHNLCVDG 255


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits      Score  E-value  Comments
NH44784_027741  cdtoxina  29     0.045    Cytolethal distending toxin A signature.
		>cdtoxina#Cytolethal distending toxin A signature.

Length = 258

Score = 28.9 bits (64), Expect = 0.045
Identities = 14/40 (35%), Positives = 18/40 (45%), Gaps = 1/40 (2%)

Query: 276 SDTSPSPQPQPTAMSDNGTSLPTSKP-PPARPGTPPPSAL 314
T PSP + G +LPT+ P PGT P +L
Sbjct: 43 GPTVPSPDEPGLPLPGPGPALPTNGAIPIPEPGTAPAVSL 82


32  NH44784_028001  NH44784_028201  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
NH44784_028001  0       27       -3.886582   2-dehydro-3-deoxygalactonokinase
NH44784_028011  -1      30       -3.409389   D-galactonate regulator, IclR family
NH44784_028021  -1      30       -2.772825   FIG006045: Sigma factor, ECF subfamily
NH44784_028031  -1      32       -2.102289   Iron siderophore sensor protein
NH44784_028041  -1      29       -2.361734   TonB-dependent receptor
NH44784_028051  -1      25       1.021918    extracellular heme-binding protein
NH44784_028061  -1      20       2.065109    transport protein HasD
NH44784_028071  0       14       1.859463    ABC exporter for hemopore HasA, membrane fusion
NH44784_028081  0       14       0.970713    putative outer membrane protein
NH44784_028091  0       22       -3.800655   probable LysR-family transcriptional regulator
NH44784_028101  2       26       -5.005110   hypothetical protein
NH44784_028111  1       25       -4.910062   Aldehyde dehydrogenase
NH44784_028121  1       28       -5.978318   UDP-glucose 4-epimerase
NH44784_028131  2       32       -6.547770   Tricarboxylate transport protein TctC
NH44784_028141  4       36       -7.124869   hypothetical protein
NH44784_028151  2       17       -0.008399   hypothetical protein
NH44784_028161  2       14       0.496041    hypothetical protein
NH44784_028171  2       16       0.416147    Uncharacterized glutathione S-transferase-like
NH44784_028181  4       16       0.297186    N-acetylmuramoyl-L-alanine amidase
NH44784_028191  4       16       -0.203316   hypothetical protein
NH44784_028201  2       16       -0.477118   FIG001154: CcsA-related protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_028051  PF06438  63     2e-14    Heme acquisition protein HasAp
		>PF06438#Heme acquisition protein HasAp

Length = 205

Score = 62.7 bits (152), Expect = 2e-14
Identities = 44/140 (31%), Positives = 65/140 (46%), Gaps = 19/140 (13%)

Query: 37 GEHTGVPNSGGFFGGGFFSGTEYGYTSKTVDGYAFVASGDLNYYFPPFSGPKMEGATPHT 96
G+ N+GGF G F G++Y S D AF+A GDL+Y FS P HT
Sbjct: 35 GQVVDGSNTGGFNPGPF-DGSQYALKSTASD-AAFIAGGDLHYTL--FSNPS------HT 84

Query: 97 LWGTLESVTLGAGVDKAG-----HVVDPFITFTFDDPLHGDLADGRGNVTHDIIWGLMNG 151
LWG L+S+ LG + + ++F+ L +A GR H +++GLM+G
Sbjct: 85 LWGKLDSIALGDTLTGGASSGGYALDSQEVSFSNLG-LDSPIAQGRDGTVHKVVYGLMSG 143

Query: 152 ---SVEGADDKLGSGTHGGL 168
+++G D L L
Sbjct: 144 DSSALQGQIDALLKAVDPSL 163


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
NH44784_028071  RTXTOXIND  391    e-135    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 391 bits (1006), Expect = e-135
Identities = 93/418 (22%), Positives = 155/418 (37%), Gaps = 6/418 (1%)

Query: 33 ARFGWWVLALGFGGFLAWAAWAPLDNGVAMPGIVVVTGERQAVDSIEGGVVSALLVAEGD 92
R + + + ++ G + +G + + IE +V ++V EG+
Sbjct: 57 PRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGE 116

Query: 93 PVRAGQALVRLDDTRVRGEARSLRAQQAAVMAREARLLAERDGRDDLAPPDAQANAEAAV 152
VR G L++L + ++ + R + P+ + E
Sbjct: 117 SVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYF 176

Query: 153 AYAME------RQLFASRRAALAGELAGIEATLAGSRALAGGLESTLAHKRTQRALLREQ 206
E L + + + E L RA + + + + + +
Sbjct: 177 QNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSR 236

Query: 207 LGNLRDLAREGYVPRNRVLELERSLAQLDGDLASDLGALGQTRQQIAELALRGQQRRDAF 266
L + L + + ++ VLE E + +L L Q +I Q F
Sbjct: 237 LDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLF 296

Query: 267 QREVRTDLAETRVQREQLTQKLATAEFDLAHSEIRAPVSGTVVALATHTVGGVVQPGSRL 326
+ E+ L +T LT +LA E S IRAPVS V L HT GGVV L
Sbjct: 297 KNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETL 356

Query: 327 MEIVPEDVPLVVEGRLPVESIDKVRAGLPVELMFTAFDASRTPRLEGTVTLVSADRFEDD 386
M IVPED L V + + I + G + AF +R L G V ++ D ED
Sbjct: 357 MVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQ 416

Query: 387 RNGRPYYRLRADVAPGQLRRIGDAPLRAGMPVEVFVRTGERSLLNYLFKPLLDRARLA 444
R G + + + + PL +GM V ++TG RS+++YL PL + +
Sbjct: 417 RLGLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGMRSVISYLLSPLEESVTES 474


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_028121  NUCEPIMERASE  169    3e-52    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 169 bits (431), Expect = 3e-52
Identities = 79/346 (22%), Positives = 134/346 (38%), Gaps = 48/346 (13%)

Query: 16 RILIAGAASLVGSHTADALLAAGVREVILLDNFA----FGTPEAIAHLQGNPRVKVVRGD 71
+ L+ GAA +G H + LL AG +V+ +DN +A L P + + D
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAG-HQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 72 LMRLPDL--LAATEGVDGVLHLAAYMTLGFA-QTPWQAVDVNIRGAQNMLEACRANKVKK 128
L + L A+ + V + + ++ + P D N+ G N+LE CR NK++
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 129 FVFASSNAVYGYGSGIAGALAENGPF---HSVGAPPAAILYGASKIMGEQLCRQYYQKAG 185
++ASS++VYG L PF SV P LY A+K E + Y G
Sbjct: 121 LLYASSSSVYG--------LNRKMPFSTDDSVDHP--VSLYAATKKANELMAHTYSHLYG 170

Query: 186 LDYVVLRYSTVYGERQHYRAANSLYIMETYDRVRQGERPVLPGDGTDTKHFVHVSDVARA 245
L LR+ TVYG A + + +G+ + G + F ++ D+A A
Sbjct: 171 LPATGLRFFTVYGPWGRPDMALFKFT----KAMLEGKSIDVYNYGKMKRDFTYIDDIAEA 226

Query: 246 NVAAFQSDATDVA------------------VNVSGPAPITTGELVRLVLDYCKSDLEPE 287
+ N+ +P+ + ++ + D + +
Sbjct: 227 IIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKN 286

Query: 288 IRPDPPGTVRLTSGGAFHIPHDLAGQQIGWQPQVGMAEGVTRLLAW 333
+ P PG V T + IG+ P+ + +GV + W
Sbjct: 287 MLPLQPGDVLET-----SADTKALYEVIGFTPETTVKDGVKNFVNW 327


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
NH44784_028141  RTXTOXINA  33     0.007    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 32.6 bits (74), Expect = 0.007
Identities = 18/55 (32%), Positives = 24/55 (43%), Gaps = 3/55 (5%)

Query: 537 DDIYKNI--DYIIGSSYDDIIVGDMLGGVTMTGGAGNDTFVLHGGGNRVIFNDGD 589
D I N D + G +D + G G + GG GND + G N + DGD
Sbjct: 747 DLIEGNDGNDRLYGDKGNDTLSGGN-GDDQLYGGDGNDKLIGVAGNNYLNGGDGD 800


33  NH44784_028821  NH44784_028941  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
NH44784_028821  2       13       1.332224    Transcriptional regulator, GntR family domain /
NH44784_028831  3       14       1.374930    hypothetical protein
NH44784_028841  2       13       0.896672    putative Cytochrome bd2, subunit II
NH44784_028851  2       11       1.076420    putative Cytochrome bd2, subunit I
NH44784_028861  4       11       0.696324    Probable signal peptide protein
NH44784_028871  5       13       0.650000    Permease of the drug/metabolite transporter
NH44784_028881  4       13       0.589877    Transcriptional regulator, LysR family
NH44784_028891  3       12       1.499786    Acetoacetyl-CoA reductase
NH44784_028901  2       11       2.142881    hypothetical protein
NH44784_028911  2       11       2.188294    ABC-type multidrug transport system, permease
NH44784_028921  1       9        2.415860    ABC-type multidrug transport system, permease
NH44784_028931  -1      8        3.209041    secretion protein HlyD
NH44784_028941  0       9        3.055224    RND efflux system, outer membrane lipoprotein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_028891  DHBDHDRGNASE  122    5e-36    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 122 bits (307), Expect = 5e-36
Identities = 78/249 (31%), Positives = 111/249 (44%), Gaps = 10/249 (4%)

Query: 5 RTALVTGGTGCLGRAIARALLDAGHDVIVTCHASEAATRQWLDQEAAAGRRYDMVKVDVA 64
+ A +TG +G A+AR L G + + E + + A R + DV
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKV-VSSLKAEARHAEAFPADVR 67

Query: 65 DYDACQALARRLADDGRQIDILVNNAGITRDASLRKMSYENWNDVLRSNLDSMFNMTQPL 124
D A + R+ + IDILVN AG+ R + +S E W N +FN ++ +
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 125 CGPMADRGWGRIVNISSVNGSKGAFGQTNYAASKAGIHGFTKSLALELARKGVTVNTVSP 184
M DR G IV + S YA+SKA FTK L LELA + N VSP
Sbjct: 128 SKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSP 187

Query: 185 GYLATRMVEAV------PEDVLK---EKILPQIPLGRLGQPDEIAALVAFICSDAAAFMT 235
G T M ++ E V+K E IPL +L +P +IA V F+ S A +T
Sbjct: 188 GSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHIT 247

Query: 236 GSNVAMNGG 244
N+ ++GG
Sbjct: 248 MHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_028911  ABC2TRNSPORT  58     1e-11    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 57.6 bits (139), Expect = 1e-11
Identities = 46/176 (26%), Positives = 73/176 (41%), Gaps = 5/176 (2%)

Query: 197 AALIREREHGTIEHLLVMPVTPTEIMLAKV-WSMGLVVLVSAGLSLTFVVRGLLQVPVEG 255
AA R T E +L + +I+L ++ W+ L AG+ + G Q
Sbjct: 89 AAFGRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGYTQWL--- 145

Query: 256 SVALFLAGVALHLFATTSMGIFMATLARSMPQFGMLLVLVLLPLQMLSGGTTPRESMPDF 315
S+ L +AL A S+G+ + LA S F LV+ P+ LSG P + +P
Sbjct: 146 SLLYALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIV 205

Query: 316 VQNIMLAAPTTHFVELGQAILFRGAGLGVVWQPFLALALIGSVLFAFSLTRFRKTL 371
Q P +H ++L + I+ + V Q AL + + F S R+ L
Sbjct: 206 FQTAARFLPLSHSIDLIRPIMLGHPVVDVC-QHVGALCIYIVIPFFLSTALLRRRL 260


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
NH44784_028931  RTXTOXIND  70     2e-15    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 70.2 bits (172), Expect = 2e-15
Identities = 65/391 (16%), Positives = 131/391 (33%), Gaps = 84/391 (21%)

Query: 11 LLAIVAVAAAGYYGWRMLSDTGPGAGFVSGNGRIEATEVDVATKLAGRVQDVLVAEGDFV 70
+ V A + G ++ +GR + + V++++V EG+ V
Sbjct: 63 FIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKE----IKPIENSIVKEIIVKEGESV 118

Query: 71 SAGQPLARM------------QIDTLQAQREEAR-------------------------- 92
G L ++ Q LQA+ E+ R
Sbjct: 119 RKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQN 178

Query: 93 AQHQQAVNNAASASAQVAQRESDKLAAEAVVVQRESELDAARRRLARSE----------- 141
++ + + Q + ++ K E + ++ +E R+ R E
Sbjct: 179 VSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLD 238

Query: 142 ---TLSREGASSIQELDDDRARVRSAQAAVNAGRAQVKAAQAAIDAAKAAQVG------- 191
+L + A + + + + A + ++Q++ ++ I +AK
Sbjct: 239 DFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKN 298

Query: 192 ------AQSAVNAALAT--IARIEADIADSELRAPRDGRV-QYRVAQAGEVLGAGGKVLN 242
Q+ N L T +A+ E S +RAP +V Q +V G V+ ++
Sbjct: 299 EILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMV 358

Query: 243 MVDLADVY-MTFFLPEQAAGRVALGQDVRIVLDAAPQY---VIPAKVSFVASTAQFTPKT 298
+V D +T + + G + +GQ+ I ++A P + KV + A
Sbjct: 359 IVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDA------ 412

Query: 299 VETATERQKLMFRVKAQIAPELLRQHLRQVK 329
+R L+F V I L + +
Sbjct: 413 --IEDQRLGLVFNVIISIEENCLSTGNKNIP 441


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
NH44784_028941  RTXTOXIND  33     0.003    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 32.9 bits (75), Expect = 0.003
Identities = 22/171 (12%), Positives = 44/171 (25%), Gaps = 36/171 (21%)

Query: 86 DLRTALLRVEEARALYGIQRADQFPTIGAQADGSRGRTPGDLNLTGQPQVASQYQVGVGM 145
L A L + L ++ P + + P N++ + +
Sbjct: 142 SLLQARLEQTRYQILSRSIELNKLPELKLPDE------PYFQNVSEEEVL---------- 185

Query: 146 AAWELDFWGRVRSLKDAALENYLASDAAAEAATLSLIAQVADSYLTLRELDERLALTRAT 205
R+ SL + E A+ + R+
Sbjct: 186 ---------RLTSLIKEQFSTWQNQKYQKELNLDKKRAERL-------TVLARINRYENL 229

Query: 206 IASREESLRIFRRRYEVGSISKLDLTQVE----TLWQQARALGADLEQARA 252
+ L F +I+K + + E + R + LEQ +
Sbjct: 230 SRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIES 280


34  NH44784_029871  NH44784_030021  Y  N  Y  Genomic Island
LocusTag        DNBias  CDNBias  %GCBias     Product
NH44784_029871  3       13       -2.134723   LSU ribosomal protein L25p
NH44784_029881  1       13       -2.951222   Peptidyl-tRNA hydrolase
NH44784_029891  2       16       -5.241018   hypothetical protein
NH44784_029901  1       19       -6.989056   GTP-binding and nucleic acid-binding protein
NH44784_029911  1       27       -8.828690   putative integrase
NH44784_029921  1       27       -8.805636   Prophage P4 integrase
NH44784_029931  0       28       -8.851152   Bacterial regulatory proteins, AsnC family
NH44784_029941  0       29       -8.673545   type III restriction-modification
NH44784_029951  0       26       -7.397823   Adenine specific DNA methylase (Mod-related
NH44784_029961  -1      26       -5.621462   Sll1503 protein
NH44784_029971  -1      21       -3.295512   putative DNA helicase
NH44784_029981  -1      19       -1.717017   Integrase
NH44784_029991  -1      14       -0.781964   FIG141751: hypothetical protein in PFGI-1-like
NH44784_030001  0       11       -0.565592   TonB-dependent siderophore receptor
NH44784_030011  2       11       -0.389680   ABC transporter ATP-binding protein
NH44784_030021  2       9        -0.027737   Permeases of the major facilitator superfamily
35  NH44784_030381  NH44784_030671  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
NH44784_030381  3       14       -0.990417   2-aminomuconate deaminase
NH44784_030391  3       14       -1.348051   2-aminomuconate semialdehyde dehydrogenase
NH44784_030401  2       16       -3.125803   Transcriptional regulator, MarR family
NH44784_030411  4       16       -3.642153   serum resistance protein
NH44784_030421  4       23       -5.064250   Heat shock protein 60 family chaperone GroEL
NH44784_030431  3       28       -5.971478   Heat shock protein 60 family co-chaperone GroES
NH44784_030441  4       39       -10.371932  hypothetical protein
NH44784_030451  4       41       -11.468649  Integrase, catalytic region
NH44784_030461  5       42       -11.277839  transposase IS3/IS911 family protein
NH44784_030471  4       42       -11.052579  hypothetical protein
NH44784_030481  2       42       -10.980524  hypothetical protein
NH44784_030491  1       42       -10.334349  hypothetical protein
NH44784_030501  1       41       -9.313450   hypothetical protein
NH44784_030511  0       40       -8.458613   outer membrane efflux protein
NH44784_030521  1       37       -7.454476   HlyD family secretion protein
NH44784_030531  2       34       -6.631395   cyclolysin secretion ATP-binding protein
NH44784_030541  3       31       -6.086821   Alkaline phosphatase
NH44784_030551  1       18       -1.741481   hypothetical protein
NH44784_030561  1       17       -2.022149   hypothetical protein
NH44784_030571  1       17       -2.132510   Choline dehydrogenase
NH44784_030581  0       17       -3.569453   Betaine aldehyde dehydrogenase
NH44784_030591  0       17       -4.500244   Hydroxymethylpyrimidine ABC
NH44784_030601  1       15       -4.146675   Hydroxymethylpyrimidine ABC
NH44784_030611  1       18       -4.058777   ABC transporter (ATP-binding protein
NH44784_030621  2       18       -3.900837   Uncharacterized glutathione S-transferase-like
NH44784_030631  4       18       -3.943506   Endoribonuclease L-PSP
NH44784_030641  5       19       -3.948279   hypothetical protein
NH44784_030651  4       25       -4.205715   Gentisate 1,2-dioxygenase
NH44784_030661  3       30       -3.589766   GntR domain protein
NH44784_030671  3       26       -3.276190   N-methylhydantoinase B
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
NH44784_030521  RTXTOXIND  356    e-121    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 356 bits (915), Expect = e-121
Identities = 167/476 (35%), Positives = 260/476 (54%), Gaps = 8/476 (1%)

Query: 3 IQAAWDLLRRYAMVFHHAWSQRRQEDDVPLDRHEREFLPANLELMETPAHPAAHLTMRAI 62
+ + L RY +V+ W R+Q D ++ E EFLPA+LEL+ETP L I
Sbjct: 5 LMGFSEFLLRYKLVWSETWKIRKQLDTPVREKDENEFLPAHLELIETPVSRRPRLVAYFI 64

Query: 63 LLLIVVILLIAVFGRLDIVAVAKGKLIPSERVKTIQPAITGVVKGILVRDGERVVAGQPL 122
+ +V+ +++V G+++IVA A GKL S R K I+P +VK I+V++GE V G L
Sbjct: 65 MGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVL 124

Query: 123 IVLDATQAAADADKARSQRIDASLAAARAKALLEAQAEQRQPQVRQVEDTTAEKQSEAQR 182
+ L A A AD K +S + A L R + L + + P+++ ++ + SE +
Sbjct: 125 LKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEV 184

Query: 183 FATAV-----YQEYTDKLSVAQAELERREAELATTRRLIRKLEITVPIARREADDYKELA 237
+ + ++ + L+++ AE T I + E + + DD+ L
Sbjct: 185 LRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLL 244

Query: 238 RDRYVAQHDYLDKTKQAVEQEHELSMQRSRATELAAAIDQQRAVLSQTASQFRREQLRDL 297
+ +A+H L++ + VE +EL + +S+ ++ + I + F+ E L L
Sbjct: 245 HKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKL 304

Query: 298 EQSSQQALQSRNDETKANTRQQLLTLHAPVGGVVQQSAVHTLGGVVTTAQSLMVIVPED- 356
Q++ + K RQQ + APV VQQ VHT GGVVTTA++LMVIVPED
Sbjct: 305 RQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDD 364

Query: 357 VLEVEAHIENKDVGFVNEGDDTIVKIDAFPYTRYGYLTGKVVSVSSDAVQHRDRRAALTF 416
LEV A ++NKD+GF+N G + I+K++AFPYTRYGYL GKV +++ DA++ D+R L F
Sbjct: 365 TLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIE--DQRLGLVF 422

Query: 417 TARIRLQTNQMLIDGKSINLTPGMEVSAEIKTGKRAVAAYFFDPLLQVAQESMRER 472
I ++ N + K+I L+ GM V+AEIKTG R+V +Y PL + ES+RER
Sbjct: 423 NVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGMRSVISYLLSPLEESVTESLRER 478


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
NH44784_030541  RTXTOXINA  65     5e-12    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 64.6 bits (157), Expect = 5e-12
Identities = 69/308 (22%), Positives = 112/308 (36%), Gaps = 72/308 (23%)

Query: 1189 DDIFDGKGGDDVISGGGGNDVFLFERG-YGRLTIDQYRWSDQHRSTLRFGAGIGAADVAV 1247
D +F G I G G+DV +++ G LTID + ++ T+
Sbjct: 621 DKVFLS-AGSANIYAGKGHDVVYYDKTDTGYLTIDGTKATEAGNYTVT------------ 667

Query: 1248 SALGDPFGGSIALTMGADRVVLKGMALGVPGFGVQLIQFAD-------GTSWTAEQILAA 1300
G L + V+K + V G + Q+ G + T L +
Sbjct: 668 ---RVLGGDVKVL-----QEVVKEQEVSV-GKRTEKTQYRSYEFTHINGKNLTETDNLYS 718

Query: 1301 ARDVRGTADSEILYGTPWDDVFRGNGGNDVLFGNGGNDVFLFEKGDGQIEIVARSYVKVN 1360
++ GT ++ +G+ + D+F G G+D++ GN GND +KG+
Sbjct: 719 VEELIGTTRADKFFGSKFTDIFHGADGDDLIEGNDGNDRLYGDKGN-------------- 764

Query: 1361 NVVRLGAGIQVEDVSVTMSRDGEYDIFLNIGSDRIKLKKAGQVSLADAPTGFDLSVPRIE 1420
+ + G G D L G KL + + G D +
Sbjct: 765 DTLSGGNG----------------DDQLYGGDGNDKLIGVAGNNYLNGGDGDD----EFQ 804

Query: 1421 FADGTVWAPEHLLDLARRFEGNSGDDNLVGTYGADRFDGHGGDDYIAGHGGGDTFIFNQG 1480
++ G G+D L G+ GAD DG GDD + G G D + + G
Sbjct: 805 VQGNSLAKNVL--------FGGKGNDKLYGSEGADLLDGGEGDDLLKGGYGNDIYRYLSG 856

Query: 1481 YGRLVIDE 1488
YG +ID+
Sbjct: 857 YGHHIIDD 864



Score = 62.7 bits (152), Expect = 1e-11
Identities = 77/338 (22%), Positives = 126/338 (37%), Gaps = 48/338 (14%)

Query: 1842 GNDYIFGGGGGDTFIYKPGYGHLEI-YENDYAGTNTLKFGTGIDPGMMRIKGNSTSGIEI 1900
G+D +F G + G GH + Y+ G T+ + G + +++
Sbjct: 619 GDDKVFLSAG--SANIYAGKGHDVVYYDKTDTGYLTIDGTKATEAGNYTVTRVLGGDVKV 676

Query: 1901 I-DRISGGKVTLTNAMYGAGYGLQRVEFSDGTVWTASQLRSAVSNAVATTEDDALFGTEV 1959
+ + + +V++ Y +G T + +V + TT D FG++
Sbjct: 677 LQEVVKEQEVSVGKRTEKTQYRSYEFTHINGKNLTETDNLYSVEELIGTTRADKFFGSKF 736

Query: 1960 AEVFDGLGGGDYIQGGGGGDTFHYKVGYGRLEILEVESAGATNVLKFGPGISASSMQVVG 2019
++F G G D I+G G D + G L S G + +G + + V G
Sbjct: 737 TDIFHGADGDDLIEGNDGNDRLYGDKGNDTL------SGGNGDDQLYGGDGNDKLIGVAG 790

Query: 2020 ADDLSITLKTGATGDAVVLSMAMSPAAGVQRIDFEDGTSWTGDDVRALLRRATTGNDVLF 2079
+ L+ G GD D G+ + + GND L+
Sbjct: 791 NNYLN-----GGDGD--------------------DEFQVQGNSLAKNVLFGGKGNDKLY 825

Query: 2080 GSLLGPDRLDGLGGIDTVYGNGGNDTYVLGAGYGGQLTIFNGYMYGGADGKLEL-DVSAD 2138
GS G D LDG G D + G GND Y +GYG + GG + KL L D+
Sbjct: 826 GSE-GADLLDGGEGDDLLKGGYGNDIYRYLSGYGHHII----DDDGGKEDKLSLADIDFR 880

Query: 2139 RLWLQRSGQDL-------KVSIMGTDASAVLRDWYKDD 2169
+ +R G DL V +G R+W++ +
Sbjct: 881 DVAFKREGNDLIMYKGEGNVLSIGHKNGITFRNWFEKE 918



Score = 60.0 bits (145), Expect = 1e-10
Identities = 46/178 (25%), Positives = 67/178 (37%), Gaps = 39/178 (21%)

Query: 1694 IGSSGNDALTGTSDADYFDGQGGDDTVNGMGGGDTIIFNAGYGKLRISEPYWPALPENVV 1753
IG++ D G+ D F G GDD + G G D + + G N
Sbjct: 723 IGTTRADKFFGSKFTDIFHGADGDDLIEGNDGNDRLYGDKG----------------NDT 766

Query: 1754 VLGAGLSTAAAQIKALRNGDISLTFAGTSDQITFAATALFGGVGIPAIVFADGT---RWS 1810
+ G NGD L +D+ L G G + DG +
Sbjct: 767 LSGG-------------NGDDQLYGGDGNDK-------LIGVAGNNYLNGGDGDDEFQVQ 806

Query: 1811 GAEVFNAAMQLTPGDDVLFGNSSAEYFDGLGGNDYIFGGGGGDTFIYKPGYGHLEIYE 1868
G + + G+D L+G+ A+ DG G+D + GG G D + Y GYGH I +
Sbjct: 807 GNSLAKNVLFGGKGNDKLYGSEGADLLDGGEGDDLLKGGYGNDIYRYLSGYGHHIIDD 864



Score = 59.6 bits (144), Expect = 1e-10
Identities = 54/191 (28%), Positives = 71/191 (37%), Gaps = 28/191 (14%)

Query: 801 IVGTSAGEVLDGGGGNDYILGGGGADTYLFNAGYGLLTINQAEGAPGVISRLRFGAGIDA 860
++GT+ + G D G G D N G L ++ L G G D
Sbjct: 722 LIGTTRADKFFGSKFTDIFHGADGDDLIEGNDGNDRLYGDKGN------DTLSGGNGDDQ 775

Query: 861 RDITVKSDATGNLILLHANGTDKVVLELQASRPKYGVGVVEFADGSTWTRQQLLQMSLTG 920
GN L+ G + + G + S L G
Sbjct: 776 LY-----GGDGNDKLIGVAGNNYLN-------GGDGDDEFQVQGNSLAKNV------LFG 817

Query: 921 SGAMDNIYGTTGADFIDGKGGTDYLKGRGGNDIFYYASGYGHVIIDEEDFARDAQNVLRF 980
D +YG+ GAD +DG G D LKG GNDI+ Y SGYGH IID++ D L
Sbjct: 818 GKGNDKLYGSEGADLLDGGEGDDLLKGGYGNDIYRYLSGYGHHIIDDDGGKEDK---LSL 874

Query: 981 GAGISLTDLRF 991
A I D+ F
Sbjct: 875 -ADIDFRDVAF 884



Score = 58.4 bits (141), Expect = 3e-10
Identities = 45/177 (25%), Positives = 60/177 (33%), Gaps = 52/177 (29%)

Query: 1439 FEGNSGDDNLVGTYGADRFDGHGGDDYIAGHGGGDTFIFNQGYGRLVIDEFARSANEVNV 1498
F G GDD + G G DR G G+D ++G G D G +L
Sbjct: 740 FHGADGDDLIEGNDGNDRLYGDKGNDTLSGGNGDDQLYGGDGNDKL-------------- 785

Query: 1499 LRFGTGIAPQEVSMKIDEQGNVVMIIGTDQVVLSNMAAGLERNGVQRVEFADGTTWTRQQ 1558
I GN + G +G + +
Sbjct: 786 ---------------IGVAGNNYLNGG---------------DGDDEFQVQGNSLAKNV- 814

Query: 1559 IHDRLRTIQGDAGGQTLYGTDLADRIDGGGGPDVLYGKEGDDTYVYKLGYGIMTIAD 1615
+ G G LYG++ AD +DGG G D+L G G+D Y Y GYG I D
Sbjct: 815 -------LFGGKGNDKLYGSEGADLLDGGEGDDLLKGGYGNDIYRYLSGYGHHIIDD 864



Score = 56.1 bits (135), Expect = 2e-09
Identities = 73/318 (22%), Positives = 120/318 (37%), Gaps = 62/318 (19%)

Query: 950 GNDIFYYASGYGHVIIDEEDFARDAQNVLRFGAG--ISLTDLRFSVTSDGDVTITHGVPG 1007
G+D + ++G ++ +A +V+ + LT T G+ T+T + G
Sbjct: 619 GDDKVFLSAGSANI------YAGKGHDVVYYDKTDTGYLTIDGTKATEAGNYTVTRVLGG 672

Query: 1008 DEIVIRNMAGWDYSGVQR-----------LEFANGESISKPDFLARTQY--GSSGSDTLY 1054
D V++ + V + NG+++++ D L + G++ +D +
Sbjct: 673 DVKVLQEVVKEQEVSVGKRTEKTQYRSYEFTHINGKNLTETDNLYSVEELIGTTRADKFF 732

Query: 1055 GSRYVDSIDGLGGDDTVYARGGGGDVFVFNQGYGHLTIDSYDALRSDMSTLKLGPGILAT 1114
GS++ D G GDD + N G L D + TL G G
Sbjct: 733 GSKFTDIFHGADGDDLIEG----------NDGNDRLYGDKGN------DTLSGGNG---- 772

Query: 1115 NIQIRLSGPGGQDISVLMGTDTVTLIGMGADPSRSGVDRILFSDGTILNAAEVAARAHQI 1174
+L G G D LIG+ + +G D + + +
Sbjct: 773 --DDQLYGGDGND----------KLIGVAGNNYLNGGD-----GDDEFQVQGNSLAKNVL 815

Query: 1175 SGSAGSDTLTGTLVDDIFDGKGGDDVISGGGGNDVFLFERGYGRLTIDQYRWSDQHRSTL 1234
G G+D L G+ D+ DG GDD++ GG GND++ + GYG ID L
Sbjct: 816 FGGKGNDKLYGSEGADLLDGGEGDDLLKGGYGNDIYRYLSGYGHHIIDD---DGGKEDKL 872

Query: 1235 RFGAGIGAADVAVSALGD 1252
A I DVA G+
Sbjct: 873 SL-ADIDFRDVAFKREGN 889



Score = 53.4 bits (128), Expect = 1e-08
Identities = 81/380 (21%), Positives = 117/380 (30%), Gaps = 101/380 (26%)

Query: 1383 EYDIFLNI-GSDRIKLKKAGQVSLADAPTGFDLS--VPRIEFADGTVWAPEHLLDLARRF 1439
EY L + G D+ +K D +D S + + R
Sbjct: 566 EYITELLVKGVDKWTVKGVQ-----DKGAVYDYSNLIQHASVGNNQY--------REIRI 612

Query: 1440 EGNSGDDNLVGTYGADRFDGHGGDDYIAGHGGGDTFIFNQG-YGRLVID-EFARSANEVN 1497
E + GD + A + + G G D +++ G L ID A A
Sbjct: 613 ESHLGDGDDKVFLSAGSANIYAGK-------GHDVVYYDKTDTGYLTIDGTKATEAGNYT 665

Query: 1498 VLRFGTG--------IAPQEVSM-----KIDEQGNVVMIIGTDQVVLSNMAAGLER--NG 1542
V R G + QEVS+ K + I + ++ +E
Sbjct: 666 VTRVLGGDVKVLQEVVKEQEVSVGKRTEKTQYRSYEFTHINGKNLTETDNLYSVEELIGT 725

Query: 1543 VQRVEFADGTTWTRQQIHDRLRTIQGDAGGQTLYGTDLADRIDGGGGPDVLYGKEGDDTY 1602
+ +F D I+G+ G LYG D + GG G D LYG +G+D
Sbjct: 726 TRADKFFGSKFTDIFHGADGDDLIEGNDGNDRLYGDKGNDTLSGGNGDDQLYGGDGNDKL 785

Query: 1603 VYKLGYGIMTIADGEFGVDAANVLEFGAGIDASMITAKAGLDGSLTLLLGPSGGAVVLQN 1662
GV N L G G D + G + +N
Sbjct: 786 ---------------IGVAGNNYLNGGDGDDEFQV-----------------QGNSLAKN 813

Query: 1663 MLRGRYWGVQEVKFADGVHWNRDDVLALISTIGSSGNDALTGTSDADYFDGQGGDDTVNG 1722
+L G GND L G+ AD DG GDD + G
Sbjct: 814 VLFG-----------------------------GKGNDKLYGSEGADLLDGGEGDDLLKG 844

Query: 1723 MGGGDTIIFNAGYGKLRISE 1742
G D + +GYG I +
Sbjct: 845 GYGNDIYRYLSGYGHHIIDD 864



Score = 51.9 bits (124), Expect = 3e-08
Identities = 68/326 (20%), Positives = 115/326 (35%), Gaps = 57/326 (17%)

Query: 770 NGVQQVEFSDGVTWTREQVISHLSVGAGSDYIVGTSAGEVLDGGGGNDYILGGGGADTYL 829
N +Q + ++ SHL G G D + ++ + G G+D + Y
Sbjct: 595 NLIQHASVGNNQY-REIRIESHL--GDGDDKVFLSAGSANIYAGKGHDVV--------YY 643

Query: 830 FNAGYGLLTIN-QAEGAPGVISRLRFGAGIDARDITVKSDATGNLILLHANGTDKVVLEL 888
G LTI+ G + R G D+ V + + T+K +
Sbjct: 644 DKTDTGYLTIDGTKATEAGNYTVTRVLGG----DVKVLQEVVKEQEVSVGKRTEKT--QY 697

Query: 889 QASRPKYGVGVVEFADGSTWTRQQLLQMSLTGSGAMDNIYGTTGADFIDGKGGTDYLKGR 948
++ + G + ++ ++L+ G+ D +G+ D G G D ++G
Sbjct: 698 RSYEFTHINGKNLTETDNLYSVEELI-----GTTRADKFFGSKFTDIFHGADGDDLIEGN 752

Query: 949 GGNDIFYYASGYGHVIIDEEDFARDAQNVLRFGAGISLTDLRFSVTSDGDVTITHGVPGD 1008
GND Y G ++ + + L G G + G+ + G D
Sbjct: 753 DGNDRLYGDKG------NDTLSGGNGDDQLYGGDGNDKL-----IGVAGNNYLNGGDGDD 801

Query: 1009 EIVIRNMAGWDYSGVQRLEFANGESISKPDFLARTQYGSSGSDTLYGSRYVDSIDGLGGD 1068
E ++ + V G G+D LYGS D +DG GD
Sbjct: 802 EFQVQGNSLA--KNVLF--------------------GGKGNDKLYGSEGADLLDGGEGD 839

Query: 1069 DTVYARGGGGDVFVFNQGYGHLTIDS 1094
D + G G D++ + GYGH ID
Sbjct: 840 DLLKG-GYGNDIYRYLSGYGHHIIDD 864



Score = 50.0 bits (119), Expect = 1e-07
Identities = 28/75 (37%), Positives = 40/75 (53%), Gaps = 4/75 (5%)

Query: 792 LSVGAGSDYIVGTSAGEVLDGGGGNDYILGGGGADTYLFNAGYGLLTINQAEGAPGVISR 851
L G G+D + G+ ++LDGG G+D + GG G D Y + +GYG I+ G +
Sbjct: 815 LFGGKGNDKLYGSEGADLLDGGEGDDLLKGGYGNDIYRYLSGYGHHIIDDDGGKE---DK 871

Query: 852 LRFGAGIDARDITVK 866
L A ID RD+ K
Sbjct: 872 LSL-ADIDFRDVAFK 885



Score = 47.7 bits (113), Expect = 6e-07
Identities = 42/171 (24%), Positives = 65/171 (38%), Gaps = 60/171 (35%)

Query: 1824 GDDVLFGNSSAEYFDGLGGNDYIFGGGGGDTFIYKPGYGHLEIYENDYAGTNTLKFGTGI 1883
G+D L+G+ + G G+D ++GG G D I G+ N L G G
Sbjct: 754 GNDRLYGDKGNDTLSGGNGDDQLYGGDGNDKLI--GVAGN-----------NYLNGGDGD 800

Query: 1884 DPGMMRIKGNSTSGIEIIDRISGGKVTLTNAMYGAGYGLQRVEFSDGTVWTASQLRSAVS 1943
D +++GNS + + + GGK
Sbjct: 801 D--EFQVQGNSLAK----NVLFGGK----------------------------------- 819

Query: 1944 NAVATTEDDALFGTEVAEVFDGLGGGDYIQGGGGGDTFHYKVGYGRLEILE 1994
+D L+G+E A++ DG G D ++GG G D + Y GYG I +
Sbjct: 820 ------GNDKLYGSEGADLLDGGEGDDLLKGGYGNDIYRYLSGYGHHIIDD 864



Score = 35.7 bits (82), Expect = 0.002
Identities = 36/111 (32%), Positives = 47/111 (42%), Gaps = 17/111 (15%)

Query: 2074 GNDVLFGSLLGPDRLDGLGGIDTVYGNGGNDTYVLGAGYGGQLTIFNGYMYGGADGKLEL 2133
GND L+G G D L G G D +YG GND + AG N Y+ GG DG E
Sbjct: 754 GNDRLYGDK-GNDTLSGGNGDDQLYGGDGNDKLIGVAG--------NNYLNGG-DGDDEF 803

Query: 2134 DVSADRLWL-QRSGQDLKVSIMGTDASAVLRDWYKDDFRKLAAVQSGGEGS 2183
V + L G + G++ + +L DD K GG G+
Sbjct: 804 QVQGNSLAKNVLFGGKGNDKLYGSEGADLLDGGEGDDLLK------GGYGN 848
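
Each alignment segment of a hit carries its own Score/Expect line, as in the eleven segments listed for NH44784_030541 above. The following is a minimal sketch, not part of the PredictBias output, of how such lines could be collected from the report text and compared with the 0.05 E-value threshold given in the input parameters. The names STAT_LINE, parse_expect, collect_hsps and the sample string are illustrative assumptions; the only format detail relied on is that legacy BLAST prints very small E-values without a mantissa (for example "e-135").

```python
import re

# Hypothetical helper names; the pattern targets statistic lines such as
# "Score = 64.6 bits (157), Expect = 5e-12" as they appear in this report.
STAT_LINE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)

E_VALUE_THRESHOLD = 0.05  # taken from the "E-value (RPSBlast)" input parameter


def parse_expect(text):
    """Convert an Expect field to float, handling mantissa-free forms like 'e-135'."""
    return float("1" + text if text.startswith("e") else text)


def collect_hsps(report):
    """Return (bit_score, raw_score, e_value) for every statistic line found."""
    return [
        (float(bits), int(raw), parse_expect(expect))
        for bits, raw, expect in STAT_LINE.findall(report)
    ]


# Three statistic lines copied from the report, used here only as sample input.
sample = """
Score = 64.6 bits (157), Expect = 5e-12
Score = 35.7 bits (82), Expect = 0.002
Score = 391 bits (1006), Expect = e-135
"""

hsps = collect_hsps(sample)
significant = [h for h in hsps if h[2] <= E_VALUE_THRESHOLD]
```

Keeping one tuple per segment, rather than one per hit, preserves the weaker segments (such as the 0.002 one) so they can be inspected separately from the best-scoring segment.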


36  NH44784_031111  NH44784_031241  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
NH44784_031111  2       16       -0.570197   Transcriptional regulator, TetR family
NH44784_031121  3       14       0.008440    hypothetical protein
NH44784_031131  3       15       0.483958    hypothetical protein
NH44784_031141  3       16       0.586109    LysR family transcriptional regulator STM2281
NH44784_031151  1       13       1.307663    LysR family transcriptional regulator STM2281
NH44784_031161  2       15       1.404479    Aspartate aminotransferase
NH44784_031171  1       15       1.522502    similar to acetolactate synthase large subunit
NH44784_031181  -1      17       1.524490    hypothetical protein
NH44784_031191  0       15       0.931962    Acetylornithine deacetylase
NH44784_031201  0       14       0.553973    Permease of the drug/metabolite transporter
NH44784_031211  2       18       -0.059503   hypothetical protein
NH44784_031221  1       15       0.153257    L-carnitine dehydratase/bile acid-inducible
NH44784_031231  2       16       0.460301    hypothetical protein
NH44784_031241  3       16       -0.212204   Acyl-CoA dehydrogenase, short-chain specific
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_031111  HTHTETR  60     1e-13    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 60.4 bits (146), Expect = 1e-13
Identities = 23/171 (13%), Positives = 47/171 (27%), Gaps = 3/171 (1%)

Query: 1 MSLELSPRAIEIVEQTKLLLAAGGYHGFSYADVSERVHIGKASIHHHFPTKAELVLAVVA 60
E I++ L + G S ++++ + + +I+ HF K++L +
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 61 RHRAQAREGLAGLDRHV-EDPLARLTAY-TEYWAQCIREGSLP-MCICAMLAAELPLIPP 117
+ E DPL+ L + E + E
Sbjct: 65 LSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMA 124

Query: 118 EIADEVRAYFGELSGWLASVLEKGVATGHFRLRDGVQAEAQAFMSTVHGAM 168
+ R E + L+ + + A + G M
Sbjct: 125 VVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLM 175


37  NH44784_031341  NH44784_031401  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
NH44784_031341  2       10       2.904814    mRNA 3-end processing factor
NH44784_031351  3       9        2.861667    ATP-dependent DNA ligase
NH44784_031361  4       9        3.048998    FIG003033: Helicase domain protein
NH44784_031371  5       7        2.535417    FIG006285: hypothetical protein
NH44784_031381  5       7        2.753081    Inner membrane ABC-transporter YbtQ
NH44784_031391  4       8        2.515361    Putative ABC iron siderophore transporter, fused
NH44784_031401  3       10       2.249190    major facilitator superfamily MFS_1
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_031401  TCRTETA  33     0.002    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 32.9 bits (75), Expect = 0.002
Identities = 15/60 (25%), Positives = 20/60 (33%)

Query: 242 GWSLADIGLVLNVVGATAGLLVALAYGRLLKRWSRRGATLAAALLQAVGIGGLALPAMAW 301
W IG+ L G L A+ G + R R A + + G LA W
Sbjct: 242 HWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGW 301


38  NH44784_031531  NH44784_031681  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
NH44784_031531  2       14       2.920443    LysR family transcriptional regulator YbhD
NH44784_031541  2       11       2.180942    D-3-phosphoglycerate dehydrogenase
NH44784_031551  1       11       2.538813    3-hydroxyisobutyrate dehydrogenase
NH44784_031561  -1      11       1.884419    Histone acetyltransferase HPA2 and related
NH44784_031571  -2      10       2.002272    alpha/beta hydrolase fold
NH44784_031581  -2      10       1.739535    Biotin sulfoxide reductase
NH44784_031591  0       10       1.069215    Dihydrodipicolinate synthase
NH44784_031601  1       10       2.156352    hypothetical protein
NH44784_031611  2       10       2.357930    hypothetical protein
NH44784_031621  3       9        2.545822    Transcriptional regulator, AraC family
NH44784_031631  4       9        2.691514    Hypothetical Protein
NH44784_031641  5       10       2.658214    Probable MFS transporter
NH44784_031651  1       9        2.461056    transcriptional regulator
NH44784_031661  1       10       1.369792    Multidrug-efflux transporter, major facilitator
NH44784_031671  0       9        2.123949    ATP-dependent Clp protease proteolytic subunit
NH44784_031681  -1      11       3.186683    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_031561  SACTRNSFRASE  41     3e-07    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 41.1 bits (96), Expect = 3e-07
Identities = 19/55 (34%), Positives = 25/55 (45%), Gaps = 5/55 (9%)

Query: 72 MEALFVDPDARGTGVGRAL----VEDALRRHP-GLSTDVNEQNAQAIGFYERLGF 121
+E + V D R GVG AL +E A H GL + + N A FY + F
Sbjct: 92 IEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_031641  TCRTETA  96  5e-24  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 96.4 bits (240), Expect = 5e-24
Identities = 99/357 (27%), Positives = 152/357 (42%), Gaps = 37/357 (10%)

Query: 43 MPVIGPVVRGLRLAE---WHGGLVVTVAGVLWMLMARYWGGRSDRVGRRRVLLRAAAGYI 99
MPV+ ++R L + H G+++ + ++ A G SDR GRR VLL + AG
Sbjct: 25 MPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAA 84

Query: 100 VSYLLLAIGLDLMLREPPPAWLALLALMALRALLGAFYAAMPTVSAARIADITAPEQRGA 159
V Y ++A P W+ L R + G A V+ A IADIT ++R
Sbjct: 85 VDYAIMATA--------PFLWV----LYIGRIVAGITGATGA-VAGAYIADITDGDERAR 131

Query: 160 MMARLGAANGLGMVLGPAIGGLLVKDSLTLPLYIAAVLPLLGTLWLAAKLPSDAAHEPRV 219
+ A G GMV GP +GGL+ S P + AA L L L LP E R
Sbjct: 132 HFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRP 191

Query: 220 TPPLKLSDRRLRLPLAAMLLAMSAVITAQMIVGFYAMDALRQEP------------GPAA 267
L+ PLA+ A + A ++ F+ M + Q P A
Sbjct: 192 LRREALN------PLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDA 245

Query: 268 RTAGWAMTTVGVMLILVQVGLAK--ARRMNAARCLAGGALLSGVGFLLVPVLGGAGGLVA 325
T G ++ G++ L Q + A R+ R L G + G G++L+ G +
Sbjct: 246 TTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILL-AFATRGWMAF 304

Query: 326 CYAVMAAGMGLVFPSIQTLAANAVTADEQGIAAGSVIAVQGMAMVVAPLVCTLLYGV 382
V+ A G+ P++Q + + V + QG GS+ A+ + +V PL+ T +Y
Sbjct: 305 PIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAA 361


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_031661  TCRTETA  104  6e-27  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 104 bits (262), Expect = 6e-27
Identities = 78/364 (21%), Positives = 143/364 (39%), Gaps = 10/364 (2%)

Query: 25 RNLAVCFAGSFSTIVAMTLLLPFLPLYVEELGVQGHAAIVQWSGVAYGATFFAAALVAPL 84
R L V + V + L++P LP + +L G+ AP+
Sbjct: 5 RPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVT--AHYGILLALYALMQFACAPV 62

Query: 85 WGRLGDRYGRKLMLVRASFGMAICMSLTGMVQSVWQLVLLRLLVGLAGGYASGSTILVAM 144
G L DR+GR+ +L+ + G A+ ++ +W L + R++ G+ G + + +A
Sbjct: 63 LGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIAD 122

Query: 145 QAPKARSGWALGVLSAGIMAGNLVGPLIGGGLPPLIGIRATFLLAGGVIFLAFLATVFLI 204
G +SA G + GP++GG + A F A + L FL FL+
Sbjct: 123 ITDGDERARHFGFMSACFGFGMVAGPVLGGLMGG-FSPHAPFFAAAALNGLNFLTGCFLL 181

Query: 205 KE-MPRPAKPARQEGKRRG-GWAQIPDKRPVAAMLATGMLLMFATMSIEPIITVYVGQLV 262
E +P R+E + VAA++A ++ + ++
Sbjct: 182 PESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRF 241

Query: 263 DDPARVTLVSGVVMSAAALGAILSASRLGRLADRVGHWTVIIGALAVAALLLVPQAFVTA 322
A T + + + L ++ A G +A R+G ++ + + AF T
Sbjct: 242 HWDA--TTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATR 299

Query: 323 GWQLVALRFLMGLALGGL-LPCVTSVIRHNVPDGVGGNVLGMSISAQYAGQVAGPLLGGF 381
GW +A ++ LA GG+ +P + +++ V + G + G + + GPLL
Sbjct: 300 GW--MAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTA 357

Query: 382 IGGH 385
I
Sbjct: 358 IYAA 361


39  NH44784_031961  NH44784_032071  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
NH44784_031961  2  11  3.264480  hypothetical protein
NH44784_031971  3  13  3.640643  Transcriptional regulator, DeoR family
NH44784_031981  0  9  2.202938  Glutathione S-transferase
NH44784_031991  -1  8  0.674345  FIG00455408: hypothetical protein
NH44784_032001  -1  10  -0.209718  Transcriptional regulator, ArsR family
NH44784_032011  1  9  -0.117250  hypothetical protein
NH44784_032021  0  12  -2.049076  Transcriptional regulator, AraC family
NH44784_032031  2  13  -2.800788  outer membrane porin
NH44784_032041  3  11  -3.230438  outer membrane porin
NH44784_032051  3  11  -2.604056  outer membrane porin
NH44784_032061  2  12  -2.277366  Cytochrome oxidase biogenesis protein
NH44784_032071  2  12  -3.037398  Cytochrome O ubiquinol oxidase subunit IV
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_032011  TCRTETB  57  4e-11  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 57.2 bits (138), Expect = 4e-11
Identities = 39/169 (23%), Positives = 70/169 (41%), Gaps = 4/169 (2%)

Query: 5 WLVIALLALPQVAETILAPALPDLARHWRLDAAATQPVMGIFFVGFAAGVLLWGHLADTR 64
WL I + E +L +LPD+A + A+T V F + F+ G ++G L+D
Sbjct: 18 WLCILSFFSV-LNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQL 76

Query: 65 G-RRPAMLAGLALGLAGTLCALAAPAYPWLLAGRFLQALGLAACSVVTQTVLRDCLDGPR 123
G +R + + + + + L+ RF+Q G AA + V+ +
Sbjct: 77 GIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKEN 136

Query: 124 LTHYFVTLGTVLAWSPAVGPLTGQALAD--GHGYGGVLAAIAVVVALLL 170
F +G+++A VGP G +A Y ++ I ++ L
Sbjct: 137 RGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFL 185


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_032031  ECOLNEIPORIN  96  1e-24  E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 96.0 bits (239), Expect = 1e-24
Identities = 93/371 (25%), Positives = 139/371 (37%), Gaps = 43/371 (11%)

Query: 1 MKKTLLAAAMLTTFAGVAQAETSVTLYGVIDTGIGYNK-IKGNGYDGSKLGMINGI-QAG 58
MKK+L+A LT A A VTLYG I G+ ++ + NG + + GI G
Sbjct: 1 MKKSLIA---LTLAALPVAAMADVTLYGTIKAGVETSRSVAHNGAQAASVETGTGIVDLG 57

Query: 59 SRWGLRGSEDLGDGLRAVFTLESGFDSGNGTRGQSGRLFGRQATIGLANDAWGTIEFGRQ 118
S+ G +G EDLG+GL+A++ +E G RQ+ IGL +G + GR
Sbjct: 58 SKIGFKGQEDLGNGLKAIWQVEQK----ASIAGTDSGWGNRQSFIGLKGG-FGKLRVGRL 112

Query: 119 ATVGSNYLADIDPFYTSYTQSNLGLGFSAANTMRWDNMVMYRSPSMSGFQFAAGYSFNVD 178
+V + DI+P + S LG A V Y SP +G + Y+ N D
Sbjct: 113 NSVLKDT-GDINP-WDS-KSDYLG-VNKIAEPEARLISVRYDSPEFAGLSGSVQYALN-D 167

Query: 179 DTNNDETHFRTNDNSRGITAGLRYVEGPVNVTLTYDQLNGSNRASIDHDATPRQYAVGLS 238
+ NS AG Y G V + + + +
Sbjct: 168 NAGRH--------NSESYHAGFNYKNGGFFVQYGGAYKRHHQVQENVNIEKYQIHRLVSG 219

Query: 239 YDLEVVKLAAAYGRTTDGWFVGQDLPAGTPFSDEFGTNRYVDGFKANAYMLGATMP-VGG 297
YD + + A+ + D + + N + AY G P V
Sbjct: 220 YDNDAL-YASVAVQQQDA----------KLVEENYSHNSQTEVAATLAYRFGNVTPRVSY 268

Query: 298 ASSLFASWQHVSPSNDRLTGGDANMNVWSVGYTYDLSKRTNLYAYGSYGKDYAFIDGLKS 357
A S+ + + + VG YD SKRT+ + ++ S
Sbjct: 269 AHGFKGSFDAT--------NYNNDYDQVVVGAEYDFSKRTSALVSAGWLQEGKGESKFVS 320

Query: 358 TAAGVGIRHVF 368
TA GVG+RH F
Sbjct: 321 TAGGVGLRHKF 331


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_032041  ECOLNEIPORIN  85  2e-20  E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 84.9 bits (210), Expect = 2e-20
Identities = 89/400 (22%), Positives = 135/400 (33%), Gaps = 71/400 (17%)

Query: 1 MKKTLLAAAMLATFAGVAQAETSVTLYGIIDTGIGYNKVSGTEPGINLGTGKPQNVDVSG 60
MKK+L+A + A A VTLYG I G+ ++ N +
Sbjct: 1 MKKSLIALTLAAL---PVAAMADVTLYGTIKAGVETSR------------SVAHNGAQAA 45

Query: 61 SRIGMINGVQSGSRWGLKGSEDLGDGLRAMFQLESGFDSGNGDTTLDRLFGRQATVGLAN 120
S V GS+ G KG EDLG+GL+A++Q+E D+ RQ+ +GL
Sbjct: 46 SVETGTGIVDLGSKIGFKGQEDLGNGLKAIWQVEQKASIAGTDS---GWGNRQSFIGLKG 102

Query: 121 DAWGSVEFGRQATVGSNFLAEIDPFAASFTQANIGTGLSAANTMHWDNMIMYRSPWTDGF 180
+G + GR +V L + ++++ A + Y SP G
Sbjct: 103 G-FGKLRVGRLNSV----LKDTGDINPWDSKSDYLGVNKIAEPEARLISVRYDSPEFAGL 157

Query: 181 QFALGYSFNVDTDDGNQTGFRTADNARGITAGLRYVQGPLNVSLTYDQLNSSNKAYATGK 240
++ Y+ N + G N+ AG Y G V
Sbjct: 158 SGSVQYALN------DNAGR---HNSESYHAGFNYKNGGFFVQYGGAYKRH--------- 199

Query: 241 NGRPVFDDAGNRVALDNNITPRQYAVAVSYDLEVLKLAAAYGRTTDGWFVGQDLPEGSAS 300
V ++ + + + YD + L A+ + D V ++ S +
Sbjct: 200 ------HQVQENVNIEKY---QIHRLVSGYDNDAL-YASVAVQQQDAKLVEENYSHNSQT 249

Query: 301 NHFGTYRYAEGF--KANSYMLGATLRLDGASNLFGSWQHVSPSNDLLTGDDARMNIWSVG 358
T Y G SY G D + + VG
Sbjct: 250 EVAATLAYRFGNVTPRVSYAHGFKGSFDAT------------------NYNNDYDQVVVG 291

Query: 359 YTYDLSKRTSLYAYGSYGKNYAFIDGLKSTAGGVGMRHLF 398
YD SKRTS + + STAGGVG+RH F
Sbjct: 292 AEYDFSKRTSALVSAGWLQEGKGESKFVSTAGGVGLRHKF 331


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_032051  ECOLNEIPORIN  94  1e-23  E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 93.7 bits (233), Expect = 1e-23
Identities = 92/394 (23%), Positives = 140/394 (35%), Gaps = 68/394 (17%)

Query: 1 MKKKLLAAAVLTALASVAQAATSVTLYGLIDTGIGYNRITGDANGADYSGSRIGMINGVQ 60
MKK L+A LT A A VTLYG I G+ +R S I V
Sbjct: 1 MKKSLIA---LTLAALPVAAMADVTLYGTIKAGVETSRSVAHNGAQAASVETGTGI--VD 55

Query: 61 AGSRWGLRGTEDLGDGLRAVFRLENGFNSADGSRLQQGRMFGRQATIGLADDAWGSVDFG 120
GS+ G +G EDLG+GL+A++++E + A RQ+ IGL +G + G
Sbjct: 56 LGSKIGFKGQEDLGNGLKAIWQVEQKASIAGT----DSGWGNRQSFIGLKGG-FGKLRVG 110

Query: 121 RQTSVGSLLLADINPFRTTFTQASIGTTFSAANTMRWDNMVLYRSPWTDGFQFAVGYSFN 180
R SV DINP+ + + + V Y SP G +V Y+ N
Sbjct: 111 RLNSV-LKDTGDINPWDSKSDYLGVNKIAEPEARL---ISVRYDSPEFAGLSGSVQYALN 166

Query: 181 VDGTDEEQSGFRTADNARGITAGLRYANGPLNIVLTFDQLNGSNLASVDAFGDPVDHNAT 240
+ +G N+ AG Y NG + + + + +
Sbjct: 167 DN------AGR---HNSESYHAGFNYKNGGFFVQYGGAYKRHHQVQE-NVNIE--KYQI- 213

Query: 241 PRQYAVGMSYDLEVVKLAAAYGRTTDGWFVGQDLPAGVRGKQAAGAMRRSEKLFGTNRYE 300
+ + YD + + A+ + + E+ + N
Sbjct: 214 ---HRLVSGYDNDAL-YASVAVQQ--------------------QDAKLVEENYSHNSQT 249

Query: 301 EGFRANSYMLGATVP-----LGGGGSVFGAWQHASASSDALTGDDANMDIWSLGYTYDLS 355
E +Y G P G GS + + D +G YD S
Sbjct: 250 EVAATLAYRFGNVTPRVSYAHGFKGSFDAT------------NYNNDYDQVVVGAEYDFS 297

Query: 356 KRTNLYVYGSYGKNYAFIEGLKSTAGGVGIQHRF 389
KRT+ V + + STAGGVG++H+F
Sbjct: 298 KRTSALVSAGWLQEGKGESKFVSTAGGVGLRHKF 331


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_032071  ACRIFLAVINRP  27  0.038  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 26.7 bits (59), Expect = 0.038
Identities = 13/61 (21%), Positives = 24/61 (39%), Gaps = 11/61 (18%)

Query: 65 LAFAVVQIVVHIIYFLHMDTKSESGWNMLALIFTLVLVVITLSGSIWIMYHLN--SNMMP 122
L A++ + + + FL NM A + + V + L G+ I+ N +
Sbjct: 344 LFEAIMLVFLVMYLFLQ---------NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLT 394

Query: 123 M 123
M
Sbjct: 395 M 395


40  NH44784_032391  NH44784_032881  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
NH44784_032391  0  9  3.286081  Hydrogen peroxide-inducible genes activator
NH44784_032401  0  8  3.432029  RNA polymerase sigma-70 factor, ECF subfamily
NH44784_032411  0  8  3.730713  transmembrane sensor, putative
NH44784_032421  -1  8  3.314607  Ferrichrome-iron receptor
NH44784_032431  1  8  4.278412  transcriptional regulator, GntR family
NH44784_032441  3  10  4.501383  Aconitate hydratase
NH44784_032451  3  10  3.689755  hypothetical protein
NH44784_032461  3  12  3.927902  DNA-3-methyladenine glycosylase II
NH44784_032471  1  13  3.179425  Putative phosphatase YieH
NH44784_032481  2  12  3.151168  3-oxoacyl-[acyl-carrier protein] reductase
NH44784_032491  1  11  3.234564  putative Mg(2+) transporter
NH44784_032501  -1  10  3.332285  Mannose-6-phosphate isomerase
NH44784_032511  0  10  3.805035  Transcriptional regulator, AraC family
NH44784_032521  0  10  3.069988  Glyoxalase family protein
NH44784_032531  -1  9  2.990738  PvdO, pyoverdine responsive serine/threonine
NH44784_032541  -2  9  2.324990  Transcriptional regulator, AraC family
NH44784_032551  0  8  1.588507  Macrophage infectivity potentiator-related
NH44784_032561  -1  8  0.937076  Alkylated DNA repair protein AlkB
NH44784_032571  -1  8  1.437887  hypothetical protein
NH44784_032581  -2  10  1.965209  Cardiolipin synthetase
NH44784_032591  1  13  1.796973  Predicted transcriptional regulators
NH44784_032601  0  12  3.016582  Histone acetyltransferase HPA2 and related
NH44784_032611  1  12  3.628409  hypothetical protein
NH44784_032621  1  13  4.314158  Epoxide hydrolase
NH44784_032631  2  13  3.877007  Transcriptional regulator
NH44784_032641  4  13  3.549434  hypothetical protein
NH44784_032651  3  12  3.407310  short-chain dehydrogenase/reductase SDR
NH44784_032661  5  13  3.411999  hypothetical protein
NH44784_032671  5  13  3.398264  hypothetical protein
NH44784_032681  5  12  3.964708  Hydrogen peroxide-inducible genes activator
NH44784_032691  3  13  3.412378  MATE efflux family protein
NH44784_032701  -2  20  1.499147  probable transcriptional regulator
NH44784_032711  -2  20  1.482613  FIG00986051: hypothetical protein
NH44784_032721  -2  23  0.846704  Permease of the drug/metabolite transporter
NH44784_032731  -1  26  0.089128  Transcriptional regulator, AraC family
NH44784_032741  2  30  -1.569805  Galactose-binding protein regulator
NH44784_032751  1  23  -0.074672  Chloramphenicol acetyltransferase
NH44784_032761  0  10  1.920180  NADP-dependent malic enzyme
NH44784_032771  1  10  2.415455  hypothetical protein
NH44784_032781  -1  10  2.318470  hypothetical protein
NH44784_032791  -2  11  1.130991  hypothetical protein
NH44784_032801  -1  12  -0.868829  TonB-dependent receptor
NH44784_032811  2  15  -3.677159  Nitrogenase FeMo-cofactor synthesis molybdenum
NH44784_032821  2  17  -4.334582  hypothetical protein
NH44784_032831  2  15  -3.620663  Cytochrome O ubiquinol oxidase subunit IV
NH44784_032841  1  15  -2.899708  Cytochrome O ubiquinol oxidase subunit III
NH44784_032851  1  16  -1.916993  Cytochrome O ubiquinol oxidase subunit I
NH44784_032861  0  13  0.887101  Cytochrome O ubiquinol oxidase subunit II
NH44784_032871  0  15  2.557872  Dna binding response regulator PrrA (RegA
NH44784_032881  2  14  2.992752  Sensor histidine kinase PrrB (RegB
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_032481  DHBDHDRGNASE  108  2e-30  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 108 bits (270), Expect = 2e-30
Identities = 71/254 (27%), Positives = 111/254 (43%), Gaps = 13/254 (5%)

Query: 7 GKRVLITGASKGIGLACATAFAREGAEPILAARDDAALQAASQAILAQTGRQPRTVAVDL 66
GK ITGA++GIG A A A +GA I A + + L R D+
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAH-IAAVDYNPEKLEKVVSSLKAEARHAEAFPADV 66

Query: 67 AQPGA----AARVLERTGAIDILVNNAGAVPGGALDQVQDERWRAGWDLKVHGYIDLARH 122
A AR+ G IDILVN AG + G + + DE W A + + G + +R
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 123 YYPLMRAAGAGVIANIIGMAGSAPRADYICGAAANASLIAFTRALGGDGPRHGVRVFGVN 182
M +G I + PR A++ A+ + FT+ LG + + +R V+
Sbjct: 127 VSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVS 186

Query: 183 PSRTRTDRVLTLARQRAQARWGDESRWQETLND----LPFGRLMEPEEVADMVVFGASPR 238
P T TD + G E + +L +P +L +P ++AD V+F S +
Sbjct: 187 PGSTETD----MQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQ 242

Query: 239 AGYLSGTVIDLDGG 252
AG+++ + +DGG
Sbjct: 243 AGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_032601  SACTRNSFRASE  37  1e-05  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 37.2 bits (86), Expect = 1e-05
Identities = 19/90 (21%), Positives = 37/90 (41%), Gaps = 5/90 (5%)

Query: 66 ALLVAWQDGRIAGSVQLDCDTPPNQPHRAEIRKLMVHPDFRRQGIARALMRAAEAAAVVA 125
A + + + G +++ N A I + V D+R++G+ AL+ A A
Sbjct: 66 AAFLYYLENNCIGRIKIR----SNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKEN 121

Query: 126 GRSLITLDTRTGD-NAEPLYASLGYHTVGV 154
+ L+T+ + +A YA + V
Sbjct: 122 HFCGLMLETQDINISACHFYAKHHFIIGAV 151


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_032651  DHBDHDRGNASE  95  2e-25  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 95.1 bits (236), Expect = 2e-25
Identities = 55/184 (29%), Positives = 87/184 (47%), Gaps = 3/184 (1%)

Query: 5 KIWFVTGASGGLGLALVRMLLDQGHKVAATSRDVDALMDAVGA-KLEGRFL-PLEVNLAD 62
KI F+TGA+ G+G A+ R L QG +AA + + L V + K E R ++ D
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 63 EGNVRRAVDTTVAAFGRIDVVVNNAGHGLAGALETLSDAELRDSFDINVFGVLNVVRAVM 122
+ G ID++VN AG G + +LSD E +F +N GV N R+V
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 123 PRLRAQRGGHVFNISSILGFDGGSAGWGAYSAAKFAVSGLSETLALEAAPHGVRVSLVYP 182
+ +R G + + S AY+++K A ++ L LE A + +R ++V P
Sbjct: 129 KYMMDRRSGSIVTVGSNPA-GVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSP 187

Query: 183 GAMR 186
G+
Sbjct: 188 GSTE 191


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_032871  HTHFIS  96  8e-26  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 96.4 bits (240), Expect = 8e-26
Identities = 28/99 (28%), Positives = 52/99 (52%)

Query: 9 LLIDDDELYVRTLQRSLSRRGLETRTATGIAEALRVAEEIRPAFALVDLRLGEDSGLTLI 68
L+ DDD L ++LSR G + R + A R + D+ + +++ L+
Sbjct: 7 LVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDLL 66

Query: 69 RPLRALRADMRILLVTGYASVATAVEAIKRGADDYLPKP 107
++ R D+ +L+++ + TA++A ++GA DYLPKP
Sbjct: 67 PRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKP 105



Score = 50.2 bits (120), Expect = 9e-10
Identities = 16/40 (40%), Positives = 25/40 (62%)

Query: 134 LHRLEWEHIHQALHETGGNVSAAARLLGMHRRSLQRKLAK 173
L +E+ I AL T GN AA LLG++R +L++K+ +
Sbjct: 433 LAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRE 472


41  NH44784_033201  NH44784_033251  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
NH44784_033201  0  13  3.690165  hypothetical protein
NH44784_033211  0  15  4.192358  ADA regulatory protein /
NH44784_033221  2  13  3.417658  Methylated-DNA--protein-cysteine
NH44784_033231  1  13  3.930473  probable permease protein
NH44784_033241  -1  13  3.165384  hypothetical protein
NH44784_033251  0  14  3.061699  Probable two-component sensor, near polyamine
42  NH44784_033531  NH44784_033621  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
NH44784_033531  2  11  1.753893  hypothetical protein
NH44784_033541  1  8  2.199883  Propeptide, PepSY amd peptidase M4
NH44784_033551  1  9  2.680573  Two-component system response regulator QseB
NH44784_033561  1  9  2.868592  Sensor protein PhoQ
NH44784_033571  3  11  2.424362  Universal stress protein family
NH44784_033581  2  11  2.243466  putative universal stress protein
NH44784_033591  1  10  2.528779  Glutamate racemase
NH44784_033601  2  11  1.400064  LysR family transcriptional regulator STM3121
NH44784_033611  3  11  0.841550  Chromate transport protein ChrA
NH44784_033621  2  12  1.521553  Chromate transport protein ChrA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_033551  HTHFIS  95  1e-24  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 94.5 bits (235), Expect = 1e-24
Identities = 34/119 (28%), Positives = 59/119 (49%)

Query: 2 RILVVEDEPTLAAQLAEALRLAGYTVDTAADGADAHYMGEVETYDVVVLDLGLPVMDGLT 61
ILV +D+ + L +AL AGY V ++ A D+VV D+ +P +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 VLKQWRAAGRGMPVLILTARSNWHEKVAGIDAGADDYLTKPFHMEELLARVRALLRRYS 120
+L + + A +PVL+++A++ + + + GA DYL KPF + EL+ + L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_033561  PF06580  33  0.002  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 32.9 bits (75), Expect = 0.002
Identities = 41/219 (18%), Positives = 73/219 (33%), Gaps = 50/219 (22%)

Query: 271 ATREEGAFARLVRE-QVAMAQRQIDHH-----LARARAAAASGTAGGRTPLRAPLQALLR 324
A ++ A + +E Q+ + QI+ H L RA R L + L L+R
Sbjct: 147 AEIDQWKMASMAQEAQLMALKAQINPHFMFNALNNIRALILEDPTKAREMLTS-LSELMR 205

Query: 325 ------------VMQRLYAARE-LQLEMDEFADRL--EFRGEEQDLQEMVGNLL-----D 364
+ L LQL +F DRL E + + V +L +
Sbjct: 206 YSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVE 265

Query: 365 NACKWAAR------RVRIGAQATAPDRLSIVIDDDGPGISEDERERIFERGVRMDEQRPG 418
N K ++ + +++ +++ G ++ +E
Sbjct: 266 NGIKHGIAQLPQGGKILLKGTKDN-GTVTLEVENTGSLALKNTKE--------------S 310

Query: 419 SGLGLDIVRD-LAGTYGGEVSAG-PSPLGGLRVTLTLPA 455
+G GL VR+ L YG E G + + +P
Sbjct: 311 TGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLIPG 349


43  NH44784_035781  NH44784_036041  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
NH44784_035781  0  20  -6.608950  Phosphate ABC transporter, periplasmic
NH44784_035791  0  24  -6.891140  Phosphate transport system permease protein PstC
NH44784_035801  1  27  -7.000206  Phosphate transport system permease protein PstA
NH44784_035811  1  32  -7.341684  Phosphate transport ATP-binding protein PstB (TC
NH44784_035821  1  37  -6.976353  benzoate MFS transporter BenK
NH44784_035841  1  43  -8.137782  *type III restriction enzyme, res subunit
NH44784_035851  -2  23  -3.947497  Transcriptional regulator, LysR family
NH44784_035861  -2  22  -2.750988  hypothetical protein
NH44784_035871  -1  21  -1.678549  4-carboxymuconolactone decarboxylase
NH44784_035881  -1  24  -2.326924  4-carboxymuconolactone decarboxylase
NH44784_035891  -2  22  -2.885877  FIG01219199: hypothetical protein
NH44784_035901  -1  20  -3.197360  oxidoreductase, short-chain
NH44784_035911  0  20  -2.675202  hypothetical protein
NH44784_035921  -1  17  -3.256169  drug:proton antiporter
NH44784_035931  0  16  -3.821567  NAD(P)H oxidoreductase YRKL
NH44784_035941  0  16  -3.714756  Butyryl-CoA dehydrogenase
NH44784_035951  1  17  -3.307232  hypothetical protein
NH44784_035961  0  16  -2.330502  hypothetical protein
NH44784_035971  -1  18  -2.100885  Aldehyde dehydrogenase
NH44784_035981  0  18  -2.596695  putative thiolase
NH44784_035991  1  21  -2.920099  protein of unknown function DUF35
NH44784_036001  1  18  -3.233604  Enoyl-CoA hydratase
NH44784_036011  1  18  -2.892404  Enoyl-CoA hydratase
NH44784_036021  1  18  -3.911782  hypothetical protein
NH44784_036031  1  18  -3.860574  FIG01198985: hypothetical protein
NH44784_036041  0  15  -3.401455  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_035821  TCRTETA  56  1e-10  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 56.0 bits (135), Expect = 1e-10
Identities = 81/412 (19%), Positives = 144/412 (34%), Gaps = 38/412 (9%)

Query: 21 LFVCWLAILFEGYDVGVMGAVLPALADD--KSWNLSSIELGMLGSYALAGMFVGSFAVGT 78
L V + + +G++ VLP L D S ++++ +L YAL G
Sbjct: 7 LIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVL-GA 65

Query: 79 LSDLLGRRRMLLACVTLFSLTMIGAAWAPTPTWFAAFRFIGGLGLGGVIPVAAALTIEYS 138
LSD GRR +LL + ++ A AP R + G+ G VA A + +
Sbjct: 66 LSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGIT-GATGAVAGAYIADIT 124

Query: 139 PPAKRGFNYGVMYSGYSLGILCSAAVAMAVLQHWGWRAVILVGAAPLLLVPVLARLLPES 198
+R ++G M + + G++ + ++ + A AA L +
Sbjct: 125 DGDERARHFGFMSACFGFGMVAGPVLG-GLMGGFSPHAPFFAAAALNGLNFLTG------ 177

Query: 199 LEYLLAKGDEAGARAMAQRLGIASLESFRPREVTGEKTTWRDVMAAVFAPRHLRATLCFW 258
L G R +R + L SFR W M +
Sbjct: 178 --CFLLPESHKGERRPLRREALNPLASFR----------WARGMT----------VVAAL 215

Query: 259 GALFLGMMLVYGLNTWLPQIMRKNGYDLGSS---ISFLLVFSLASAIGGLVLGQFADKSE 315
A+F M LV + L I ++ + ++ IS L S ++ G A +
Sbjct: 216 MAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLG 275

Query: 316 PRRIVALSYLVGAIGIYCLTYNNPLAVNYLWVALAGIGSISSSLILTGYLAGYYPAHARG 375
RR + L + G L + + + + L G I L L+ +G
Sbjct: 276 ERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGMP-ALQAMLSRQVDEERQG 334

Query: 376 AATGWALSFARIGAMSGPIVGGYLAGLQLPLAWNFMAFSCAAVLAAVFVILI 427
G + + ++ GP++ + + WN A+ A L + + +
Sbjct: 335 QLQGSLAALTSLTSIVGPLLFTAIYAASIT-TWNGWAWIAGAALYLLCLPAL 385


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_035901  DHBDHDRGNASE  87  2e-22  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 86.6 bits (214), Expect = 2e-22
Identities = 64/250 (25%), Positives = 109/250 (43%), Gaps = 13/250 (5%)

Query: 2 IKDKVIIITGASSGIGEATAKLLASKGARVVLGARRADNLKRIADEIKQHGGQAVYRELD 61
I+ K+ ITGA+ GIGEA A+ LAS+GA + + L+++ +K A D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 62 VAKQADNDAIVKLAKEKFGRVDAIFLNAGLMPTSPLSALKTDDWHQMVDVNVKGVLNGVA 121
V A D I + + G +D + AG++ + +L ++W VN GV N
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 122 AVLPEFLAQKSGHVIATSSVAGLKAYPGSAVYGGTKWFVRDFMEVLRIESAMEGTNIRTA 181
+V + ++SG ++ S A Y +K F + L +E A NIR
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELA--EYNIRCN 183

Query: 182 TIYPAAINTELLATI--SDKQSLDQMRGLYDTY--GI------SPDRIANVVAF-AIDQP 230
+ P + T++ ++ + + ++G +T+ GI P IA+ V F Q
Sbjct: 184 IVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQA 243

Query: 231 DDTTINEFTV 240
T++ V
Sbjct: 244 GHITMHNLCV 253


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_035911  TCRTETB  36  2e-06  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 35.6 bits (82), Expect = 2e-06
Identities = 17/45 (37%), Positives = 22/45 (48%), Gaps = 1/45 (2%)

Query: 11 HRAQATLVAARLAQGVGAAATVPASMALIRQAYPDPLRRGHAVAL 55
H + L+ AR QG GAAA PA + ++ Y RG A L
Sbjct: 100 HSFFSLLIMARFIQGAGAAA-FPALVMVVVARYIPKENRGKAFGL 143


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_035921  TCRTETB  28  0.028  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 28.3 bits (63), Expect = 0.028
Identities = 12/65 (18%), Positives = 25/65 (38%), Gaps = 3/65 (4%)

Query: 10 GPVIGGWLVSLDWRWVFYLNLPVGLLTVAFLART--PVSPRRTAPFDMAGQAMATLAMGA 67
GP IGG +++ W + L +P+ + R FD+ G + ++ +
Sbjct: 155 GPAIGG-MIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVF 213

Query: 68 LIYGA 72
+
Sbjct: 214 FMLFT 218


44  NH44784_036161  NH44784_036401  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
NH44784_036161  0  17  -3.453246  Enoyl-CoA hydratase/isomerase
NH44784_036171  0  19  -3.982349  Long-chain-fatty-acid--CoA ligase
NH44784_036181  -1  22  -4.884670  Naphthoate synthase
NH44784_036191  -1  27  -5.029174  L-carnitine dehydratase/bile acid-inducible
NH44784_036201  -1  30  -6.020410  sigma54 specific transcriptional regulator, Fis
NH44784_036211  0  34  -6.305216  hypothetical protein
NH44784_036221  -1  28  -4.616990  Oxidoreductase, aldo/keto reductase family
NH44784_036231  0  23  -4.608258  3-oxoacyl-[acyl-carrier protein] reductase
NH44784_036241  0  24  -4.563278  D-beta-hydroxybutyrate dehydrogenase
NH44784_036251  0  23  -4.050878  Cupin 2 conserved barrel domain protein
NH44784_036261  1  27  -5.038191  4-carboxymuconolactone decarboxylase
NH44784_036271  0  29  -5.895630  hypothetical protein
NH44784_036281  1  31  -6.315745  conserved hypothetical signal peptide protein
NH44784_036291  2  39  -7.966108  hypothetical protein
NH44784_036301  2  38  -7.795071  Xanthine transporter,putative
NH44784_036311  2  33  -7.243359  Transcriptional regulator, LysR family
NH44784_036321  1  29  -6.311810  Outer membrane protein romA
NH44784_036331  1  26  -5.207049  DNA repair protein RadC
NH44784_036341  2  26  -4.668751  phage integrase, putative
NH44784_036351  -1  17  -2.245695  hypothetical protein
NH44784_036361  -2  13  -2.944746  putative transferase
NH44784_036371  1  12  -4.241237  hypothetical protein
NH44784_036381  0  12  -3.819986  Transporter, LysE family
NH44784_036391  -1  11  -3.742165  hypothetical protein
NH44784_036401  -2  8  -3.252232  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_036201  HTHFIS  358  e-120  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 358 bits (920), Expect = e-120
Identities = 132/365 (36%), Positives = 202/365 (55%), Gaps = 25/365 (6%)

Query: 206 ISRLRKEVNFYRRELSSMDQPNRGLDAIIGDSEPVARLKEQIVKIAPLDVPVLLLGESGV 265
I + + + +R S ++ ++ ++G S + + + ++ D+ +++ GESG
Sbjct: 112 IGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGT 171

Query: 266 GKDVVAHAIHMLGPRASDPMVVINAAALPVSLVESELFGYESGAFTGADRKGRPGKFEQA 325
GK++VA A+H G R + P V IN AA+P L+ESELFG+E GAFTGA + G+FEQA
Sbjct: 172 GKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTR-STGRFEQA 230

Query: 326 DGGSLFLDEIGDMPLEVQVKLLRTLQDGTFQRVGGESLHRSNFRLISASNRDFHRMLDSG 385
+GG+LFLDEIGDMP++ Q +LLR LQ G + VGG + RS+ R+++A+N+D + ++ G
Sbjct: 231 EGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQG 290

Query: 386 AFRLDLFYRISAVTLRLPSLRERLEDIPVLAGTFLKQFADRHGLPVKEIEPAALRFLQSL 445
FR DL+YR++ V LRLP LR+R EDIP L F++Q + GL VK + AL +++
Sbjct: 291 LFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQAE-KEGLDVKRFDQEALELMKAH 349

Query: 446 PWPGNVRQLQHAIERAAIFCDGDAIRVADFE-----------------------MAGEAR 482
PWPGNVR+L++ + R D I E ++
Sbjct: 350 PWPGNVRELENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVE 409

Query: 483 GAYHLDTARVATAAPRDGSMKEVKDRMEADLIVEYLQRFHGNKKKVAEALGISRSYLYKR 542
A A P G V ME LI+ L GN+ K A+ LG++R+ L K+
Sbjct: 410 ENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKK 469

Query: 543 LAEIN 547
+ E+
Sbjct: 470 IRELG 474


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_036231  DHBDHDRGNASE  103  1e-28  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 103 bits (257), Expect = 1e-28
Identities = 71/249 (28%), Positives = 120/249 (48%), Gaps = 16/249 (6%)

Query: 3 ENIKDKVVVITGASSGLGEATARLLAQNGAKLILAARRLDRLQALAHQLGLGA--DATVK 60
+ I+ K+ ITGA+ G+GEA AR LA GA + ++L+ + L A
Sbjct: 4 KGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 61 ADVTDRSQVEALIAHAVKKHGRVDVLLNNAGLMPSSMLENLHIDEWDRMIDVNIKGVLYG 120
ADV D + ++ + A ++ G +D+L+N AG++ ++ +L +EW+ VN GV
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 121 IAAVLPVMQRQKSGHIINVSSVAGHKVGPGGTVYAATKHAVRVISEGLRQEVKPYNIRTT 180
+V M ++SG I+ V S YA++K A + ++ L E+ YNIR
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCN 183

Query: 181 IISPGAVATNL-----IDTITDPAIAANMKKTYEKAIPAESFARVVAFAMSQPEDVDVNE 235
I+SPG+ T++ D + +T++ IP + A+ P D+ +
Sbjct: 184 IVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAK--------PSDI-ADA 234

Query: 236 VLFRPTSQA 244
VLF + QA
Sbjct: 235 VLFLVSGQA 243


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_036241  DHBDHDRGNASE  134  1e-40  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 134 bits (339), Expect = 1e-40
Identities = 83/253 (32%), Positives = 119/253 (47%), Gaps = 8/253 (3%)

Query: 6 EGKVALVTGASAGIGLATAKAFAEAGARVALAAHDKYKLEEAANTLSAAGHEVLAVTCDV 65
EGK+A +TGA+ GIG A A+ A GA +A ++ KLE+ ++L A A DV
Sbjct: 7 EGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADV 66

Query: 66 ADEAQVKAMVERTIKKFGRLDAAYNNAGVQSPVAETADADGKEFDHVQAVNLRGVWSCMK 125
D A + + R ++ G +D N AGV P +E++ +VN GV++ +
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRP-GLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 126 YELIQMRDQGSGAIVNCSSMGGLIGIAGRGAYHASKHGVIGLTKSAALEYAARGIRINAV 185
M D+ SG+IV S + AY +SK + TK LE A IR N V
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 186 CPGIIETPMVSRMLKREPEAMDILMKDQ-------PIGRLGRPEEIASAVLWLCSPGASF 238
PG ET M + E A ++ P+ +L +P +IA AVL+L S A
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGH 245

Query: 239 VIGQALAVDGGYT 251
+ L VDGG T
Sbjct: 246 ITMHNLCVDGGAT 258


45  NH44784_037561  NH44784_038201  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_037561  0       11       -3.099383  GTP-binding protein EngA
NH44784_037571  0       13       -2.897121  Histidinol-phosphate aminotransferase
NH44784_037581  0       15       -2.691425  RNA-binding protein Hfq
NH44784_037591  -1      14       -2.598393  GTP-binding protein HflX
NH44784_037601  -1      13       -2.791841  HflK protein
NH44784_037611  -1      13       -2.508700  HflC protein
NH44784_037621  1       13       -2.938134  ATP phosphoribosyltransferase regulatory
NH44784_037631  1       12       -3.463665  Adenylosuccinate synthetase
NH44784_037641  1       13       -3.432496  Xanthine-guanine phosphoribosyltransferase
NH44784_037651  1       12       -2.905133  SSU ribosomal protein S21p
NH44784_037661  0       11       -2.308140  DNA primase
NH44784_037671  -1      10       -2.568676  RNA polymerase sigma factor RpoD
NH44784_037691  -1      17       -1.910872  *hypothetical protein
NH44784_037701  -2      17       -1.938485  Putative lipopolysaccharide biosynthesis-related
NH44784_037711  0       18       -2.854929  hypothetical protein
NH44784_037721  1       18       -3.159727  Probable lipopolysaccharide modification
NH44784_037731  3       21       -3.758471  hypothetical protein
NH44784_037741  2       19       -2.987239  hypothetical protein
NH44784_037751  2       20       -3.016295  phage-related hypothetical protein
NH44784_037761  5       37       -7.137287  Recombinational DNA repair protein RecT
NH44784_037771  5       42       -7.164218  hypothetical protein
NH44784_037781  3       46       -7.287558  hypothetical protein
NH44784_037791  2       46       -7.240681  phage-related hypothetical protein
NH44784_037801  2       47       -7.654980  hypothetical protein
NH44784_037811  1       43       -6.131054  hypothetical protein
NH44784_037821  -1      30       0.146029   hypothetical protein
NH44784_037831  -2      28       -0.778774  hypothetical protein
NH44784_037841  -2      28       -0.869002  hypothetical protein
NH44784_037851  -1      25       -0.748266  hypothetical protein
NH44784_037861  1       24       -1.211108  hypothetical protein
NH44784_037871  2       23       -1.446792  hypothetical protein
NH44784_037881  2       19       -2.065462  hypothetical protein
NH44784_037891  2       18       -1.247388  FIG00956635: hypothetical protein
NH44784_037901  2       16       -1.012503  hypothetical protein
NH44784_037911  2       15       -1.359984  hypothetical protein
NH44784_037921  2       14       -0.829316  hypothetical protein
NH44784_037931  1       13       -0.767962  hypothetical protein
NH44784_037941  1       14       -1.075996  Mlr8006 protein
NH44784_037951  2       17       -1.289830  hypothetical protein
NH44784_037961  2       16       -1.732146  hypothetical protein
NH44784_037971  0       20       -1.605718  hypothetical protein
NH44784_037981  -1      18       -1.903493  hypothetical protein
NH44784_037991  0       18       -2.279942  hypothetical protein
NH44784_038001  -1      17       -2.012237  hypothetical protein
NH44784_038011  0       17       -2.300216  hypothetical protein
NH44784_038021  0       17       -2.399544  hypothetical protein
NH44784_038031  1       18       -2.214096  hypothetical protein
NH44784_038041  2       20       -1.966564  hypothetical protein
NH44784_038051  2       19       -1.880479  tail length tape measure protein
NH44784_038061  3       20       -1.471561  hypothetical protein
NH44784_038071  2       21       -1.899316  hypothetical protein
NH44784_038081  2       21       -1.799814  putative bacteriophage protein
NH44784_038091  1       25       -1.329684  hypothetical protein
NH44784_038101  1       27       -1.953801  hypothetical protein
NH44784_038111  1       29       -2.377555  Putative bacteriophage protein
NH44784_038121  3       33       -3.306606  hypothetical membrane associated protein
NH44784_038131  0       36       -0.594498  phage-related putative membrane protein
NH44784_038141  0       38       -0.262846  phage-related hypothetical protein
NH44784_038151  0       37       0.255658   hypothetical protein
NH44784_038161  0       32       0.546766   hypothetical protein
NH44784_038171  1       31       0.780478   integral membrane protein
NH44784_038181  1       31       0.889638   hypothetical protein
NH44784_038191  3       32       0.958502   hypothetical protein
NH44784_038201  3       30       0.111432   Flp pilus assembly protein RcpC/CpaB
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
NH44784_037941  RTXTOXIND  29     0.034    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 29.0 bits (65), Expect = 0.034
Identities = 19/146 (13%), Positives = 43/146 (29%), Gaps = 26/146 (17%)

Query: 217 DAEAARADAEKARADQAEQSIEQARQDAKGAALARVKLEAVATEHKVAFKADSTDRALRE 276
A A AD K ++ + +EQ R ++ KL + + + + E
Sbjct: 128 TALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPE------LKLPDEPYFQNVSE 181

Query: 277 GVIKAVRGDSADLADKSDGYIEAAFDLAVGEAKSRQDAVETQRRELAG---------GEP 327
+ + I+ F + ++ ++ +R E
Sbjct: 182 EEVLRLT-----------SLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLS 230

Query: 328 SAARQQRSDSAPPPKSASAARSAYLA 353
+ + D + + A+ A L
Sbjct: 231 RVEKSRLDDFSSLLHKQAIAKHAVLE 256


46  NH44784_038351  NH44784_038401  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_038351  5       15       -2.308974  diguanylate cyclase/phosphodiesterase (GGDEF &
NH44784_038361  6       18       -2.180683  Chorismate synthase
NH44784_038371  6       18       -2.025712  Zn-dependent protease with chaperone function
NH44784_038381  6       19       -2.155353  GGDEF and EAL domain proteins
NH44784_038391  6       19       -2.292396  FIGfam010717
NH44784_038401  7       20       -2.338950  T1SS secreted agglutinin (RTX
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits        Score  E-value  Comments
NH44784_038401  CABNDNGRPT  97     1e-22    NodO calcium binding signature.
		>CABNDNGRPT#NodO calcium binding signature.

Length = 479

Score = 97.0 bits (241), Expect = 1e-22
Identities = 46/192 (23%), Positives = 69/192 (35%), Gaps = 21/192 (10%)

Query: 2003 FLEMRDGHAPTNGDLYQYIKDHHADFNLADDPRGGDDTIHGGTGDDIIYGQGGNDTLYGD 2062
F D ++ DF+ + G GN ++
Sbjct: 281 FYTATDSSKALIFSVWDAGGTDTFDFSGYS---NNQRINLNEGSFSDVGGLKGNVSIAHG 337

Query: 2063 DGNDIIYGGAGDDKLYGGEGNDVLHGGSGNDTLEGGNGNDLLIGGKGDDTLIGGAGSDTF 2122
+ GG+G+D L G +++L GG+GND L GG G D L GG G DT + G+G D+
Sbjct: 338 VTIENAIGGSGNDILVGNSADNILQGGAGNDVLYGGAGADTLYGGAGRDTFVYGSGQDST 397

Query: 2123 KWELGDQGTTAKPAVDTIKDFSLDKPADGGDVLDLKDLLVGEKDGTLTQYLNFHKEGNNT 2182
A D I DF G D +DL E + Q F +G
Sbjct: 398 -----------VAAYDWIADFQK-----GIDKIDLSAFR-NEGQLSFVQ-DQFTGKGQEV 439

Query: 2183 VIDVNTQGKLGT 2194
++ + +
Sbjct: 440 MLQWDAANSITN 451


47  NH44784_038521  NH44784_038591  Y  N  Y  Genomic Island
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_038521  0       22       -3.760290  tRNA pseudouridine synthase A
NH44784_038531  1       22       -4.680793  TRAP transporter solute receptor, unknown
NH44784_038541  1       30       -7.059826  Phosphoribosylanthranilate isomerase
NH44784_038561  3       26       -6.461270  *Phage integrase
NH44784_038571  2       30       -5.566550  hypothetical protein
NH44784_038581  1       27       -3.893556  hypothetical protein
NH44784_038591  1       26       -3.591056  hypothetical protein
48  NH44784_038861  NH44784_039081  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_038861  2       15       -2.996257  hypothetical protein
NH44784_038871  1       13       -2.763045  Coproporphyrinogen III
NH44784_038881  -1      12       -2.575449  probable two-component response regulator
NH44784_038891  0       13       -3.100664  dTDP-4-dehydrorhamnose reductase
NH44784_038901  -3      10       -2.174110  Threonyl-tRNA synthetase
NH44784_038911  -1      9        -0.929400  Translation initiation factor 3
NH44784_038921  0       9        -0.584480  Glutathione synthetase
NH44784_038931  0       11       -0.333873  PTS system, mannose-specific IIA component
NH44784_038941  -1      11       -0.577967  phosphocarrier protein HPr
NH44784_038951  -1      10       1.869287   Phosphoenolpyruvate-protein phosphotransferase
NH44784_038961  1       10       1.092052   Ammonium transporter
NH44784_038971  3       15       1.575254   Nitrogen regulatory protein P-II
NH44784_038981  3       13       0.486975   hypothetical protein
NH44784_038991  1       9        0.557077   COGs COG2960
NH44784_039001  2       9        0.667950   MG(2+) CHELATASE FAMILY PROTEIN / ComM-related
NH44784_039011  3       11       -1.615178  LSU ribosomal protein L35p
NH44784_039021  0       13       -1.370406  LSU ribosomal protein L20p
NH44784_039031  0       13       -1.117916  Phenylalanyl-tRNA synthetase alpha chain
NH44784_039041  -1      13       -0.368893  Phenylalanyl-tRNA synthetase beta chain
NH44784_039051  -1      26       -2.384357  Integration host factor alpha subunit
NH44784_039061  1       27       -2.576918  Transcriptional regulator, MerR family
NH44784_039081  1       27       -3.182780  *hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
NH44784_038881  HTHFIS  54     8e-11    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 54.5 bits (131), Expect = 8e-11
Identities = 35/164 (21%), Positives = 65/164 (39%), Gaps = 11/164 (6%)

Query: 39 IRVAILDDHPVVALGVGAYLDSRPGFRVIHQETSARRLLEKLATSPCDVALIDFYLPQEP 98
+ + DD + + L SR G+ V ++A L +A D+ + D +P E
Sbjct: 4 ATILVADDDAAIRTVLNQAL-SRAGYDV-RITSNAATLWRWIAAGDGDLVVTDVVMPDE- 60

Query: 99 WDGVNYLRRLRRYHPSMAIITFSAGNRQETQYAAFRAGANGYLAKQWGMILLPEMIHGVL 158
+ + L R+++ P + ++ SA N T A GA YL K + L E+I +
Sbjct: 61 -NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFD---LTELIG--I 114

Query: 159 SRKDPFLSVQEGKIRPLSPRPPHALLTTSE--VEVLRHISQGMS 200
+ + + L+ S E+ R +++ M
Sbjct: 115 IGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQ 158


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_038951  PHPHTRNFRASE  570    0.0      Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.
Length = 572

Score = 570 bits (1471), Expect = 0.0
Identities = 230/555 (41%), Positives = 329/555 (59%), Gaps = 7/555 (1%)

Query: 3 GKGVARGYAIGRAVVMGAAALEVAHYRIAPEDVPAESSRLTEALASAQQELRQLADTLPA 62
G + G AI +A + +++ I DV E +LT AL +++ELR + D A
Sbjct: 7 GIAASSGVAIAKAFIHLEPNVDIEKTSI--TDVSTEIEKLTAALEKSKEELRAIKDQTEA 64

Query: 63 DAPRELGAMLNVHSLLLGDPLLAEQTLALIAERHYNAEWALTTQGQILGQQFDAMEDEYL 122
+ + H L+L DP L + I NAE+AL + F++M++EY+
Sbjct: 65 SMGADKAEIFAAHLLVLDDPELVDGIKGKIENEQMNAEYALKEVSDMFVSMFESMDNEYM 124

Query: 123 RERGADVRQVIERVLHVLSGTSAILPDMSHMGGDEALVVVAHDISPADMLRLRGGRFAAF 182
+ER AD+R V +RVL L G ++ + E V++A D++P+D +L F
Sbjct: 125 KERAADIRDVSKRVLGHLIGVE--TGSLATI--AEETVIIAEDLTPSDTAQLNKQFVKGF 180

Query: 183 VTDLGGPTSHTAIVARSMGVPAVVAMGNVRELVRDGDMLIIDGAAGAVVVNPSERILQEY 242
TD+GG TSH+AI++RS+ +PAVV V E ++ GDM+I+DG G V+VNP+E ++ Y
Sbjct: 181 ATDIGGRTSHSAIMSRSLEIPAVVGTKEVTEKIQHGDMVIVDGIEGIVIVNPTEEEVKAY 240

Query: 243 RRRQAAYADERAELGLLRDEPSVTLDGIDIVLHANIELPEEAALALASGAHGIGLFRSEF 302
++AA+ ++ E L EPS T DG + L ANI P++ LA+G GIGL+R+EF
Sbjct: 241 EEKRAAFEKQKQEWAKLVGEPSTTKDGAHVELAANIGTPKDVDGVLANGGEGIGLYRTEF 300

Query: 303 LFMGRPDLPGEEEQYEAYASVVKVMAGRPVTIRTLDIGSDKTLDG-EATVATNPALGQRA 361
L+M R LP EEEQ+EAY VV+ M G+PV IRTLDIG DK L + NP LG RA
Sbjct: 301 LYMDRDQLPTEEEQFEAYKEVVQRMDGKPVVIRTLDIGGDKELSYLQLPKELNPFLGFRA 360

Query: 362 IRYCLARPEMFATQLRAILRASAHGPVRLLIPMIAHMHEVVATRAAIESARRELDARGQA 421
IR CL + ++F TQLRA+LRAS +G ++++ PMIA + E+ +A ++ + +L + G
Sbjct: 361 IRLCLEKQDIFRTQLRALLRASTYGNLKVMFPMIATLEELRQAKAIMQEEKDKLLSEGVD 420

Query: 422 YAAHMEVGAMVEVPAIAIAIEPFAQALDFLSIGTNDLIQYTLAIDRGDHDVASLYDPLHP 481
+ +EVG MVE+P+ A+A FA+ +DF SIGTNDLIQYT+A DR + V+ LY P HP
Sbjct: 421 VSDSIEVGIMVEIPSTAVAANLFAKEVDFFSIGTNDLIQYTMAADRMNERVSYLYQPYHP 480

Query: 482 AVLRLVAHTINAGERAGKPVAVCGEMAGDSRMTRLLLGLGLTEFSMHPQQLLDVKREVRR 541
A+LRLV I A GK V +CGEMAGD LLLGLGL EFSM +L + ++ +
Sbjct: 481 AILRLVDMVIKAAHSEGKWVGMCGEMAGDEVAIPLLLGLGLDEFSMSATSILPARSQLLK 540

Query: 542 AHTNALRVKVASALN 556
L+ AL
Sbjct: 541 LSKEELKPFAQKALM 555


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
NH44784_039001  HTHFIS  34     0.002    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 33.7 bits (77), Expect = 0.002
Identities = 33/145 (22%), Positives = 48/145 (33%), Gaps = 13/145 (8%)

Query: 220 ARRALEVAAAGGHSLLMVGPPGAGKSMLAARLPGLLPPLARTQALEVAAIAALAGAPEAR 279
R L +L++ G G GK ++A L R +AA+ P
Sbjct: 149 IYRVLARLMQTDLTLMITGESGTGKELVARALHDYGK--RRNGPFVAINMAAI---PRDL 203

Query: 280 MGTPPFRAPHHSASPFALVGGGGRPRPGEISLAHHGVLFLDELPEFSRRALEALREPLET 339
+ + F H F G G A G LFLDE+ + A L L+
Sbjct: 204 IESELF---GHEKGAFT---GAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQ 257

Query: 340 GRVVIARALHTAQFPARFQLVAAMN 364
G + ++VAA N
Sbjct: 258 GE--YTTVGGRTPIRSDVRIVAATN 280


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_039051  DNABINDINGHU  117    2e-38    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 117 bits (296), Expect = 2e-38
Identities = 37/89 (41%), Positives = 53/89 (59%)

Query: 9 TKAELAELLFERVGLNKREAKDIVDTFFEEIRDALARGDSVKLSGFGNFQVRDKPPRPGR 68
K +L + E L K+++ VD F + LA+G+ V+L GFGNF+VR++ R GR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 69 NPKTGETIPIAARRVVTFHASQKLKSVVE 97
NP+TGE I I A +V F A + LK V+
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAVK 91


49  NH44784_039211  NH44784_039361  Y  N  N  Genomic Island
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_039211  1       20       -3.140309  putative lipoprotein
NH44784_039221  1       21       -1.680963  hypothetical protein
NH44784_039231  2       22       -1.748085  Transcriptional regulator BkdR of isoleucine and
NH44784_039241  3       19       -0.307455  Permease of the drug/metabolite transporter
NH44784_039251  0       13       -0.311287  hypothetical protein
NH44784_039261  0       14       0.028270   FIG00431532: hypothetical protein
NH44784_039271  -1      14       0.488391   Mannose-6-phosphate isomerase
NH44784_039281  -1      13       0.292190   FIG00431487: hypothetical protein
NH44784_039291  -3      15       0.946662   transcriptional regulator, LysR-family
NH44784_039301  -2      21       0.781614   hypothetical protein
NH44784_039311  -1      25       2.880889   putative hydrolase
NH44784_039321  -1      26       3.054471   hypothetical protein
NH44784_039331  1       22       3.451170   ABC-type nitrate/sulfonate/bicarbonate transport
NH44784_039341  2       24       3.684498   Alkanesulfonates transport system permease
NH44784_039351  2       20       3.830846   Alkanesulfonates ABC transporter ATP-binding
NH44784_039361  2       13       3.276742   hypothetical protein
50  NH44784_039581  NH44784_039901  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_039581  -1      16       3.597694   Transamidase GatB domain protein
NH44784_039591  1       12       4.559244   FIG138576: 3-oxoacyl-[ACP] synthase
NH44784_039601  2       10       5.332340   3-oxoacyl-[ACP] reductase
NH44784_039611  3       9        5.151842   3-hydroxydecanoyl-[ACP] dehydratase
NH44784_039621  3       9        5.127755   3-oxoacyl-[ACP] synthase
NH44784_039631  3       10       4.833074   FIG085779: Lipoprotein
NH44784_039641  3       11       4.750061   FIG021862: membrane protein, exporter
NH44784_039651  2       11       3.584459   FIG027190: Putative transmembrane protein
NH44784_039661  1       13       3.084577   Lysophospholipid acyltransferase
NH44784_039671  1       13       2.712221   FIG143263: Glycosyl transferase /
NH44784_039681  1       12       2.470081   FIGfam138462: Acyl-CoA synthetase, AMP-(fatty)
NH44784_039691  1       14       2.188791   FIG017861: hypothetical protein
NH44784_039701  0       11       1.351754   Acyl carrier protein
NH44784_039711  0       11       2.054472   Acyl carrier protein (ACP1
NH44784_039721  0       10       2.850418   FIG018329: 1-acyl-sn-glycerol-3-phosphate
NH44784_039731  0       10       2.429627   3-oxoacyl-[ACP] synthase
NH44784_039741  1       11       3.043964   4'-phosphopantetheinyl transferase superfamily
NH44784_039751  3       12       2.993769   Flavohemoprotein (Hemoglobin-like protein)
NH44784_039761  4       14       3.266443   Nitrite-sensitive transcriptional repressor
NH44784_039771  4       15       3.525667   D-amino acid dehydrogenase small subunit
NH44784_039781  3       15       4.449276   Permease of the drug/metabolite transporter
NH44784_039791  2       16       5.601372   Proline dehydrogenase
NH44784_039801  2       16       5.374595   Transcriptional regulator, AraC family
NH44784_039811  2       17       4.922247   Hydrolases of the alpha/beta superfamily
NH44784_039821  3       18       5.416412   Glycine cleavage system transcriptional
NH44784_039831  0       17       5.358100   Permeases of the major facilitator superfamily
NH44784_039841  -1      19       4.125527   Probable transcriptional regulator lrhA
NH44784_039851  0       21       4.194892   putative lysR-family transcriptional regulator
NH44784_039861  0       20       4.088101   hypothetical protein
NH44784_039871  1       17       4.260251   Glutathione S-transferase, unnamed subgroup
NH44784_039881  -1      12       3.789148   Phenylacetate-CoA oxygenase/reductase, PaaK
NH44784_039891  0       13       3.555377   Transcriptional regulator, HxlR family
NH44784_039901  0       13       3.248275   Permeases of the major facilitator superfamily
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits         Score  E-value  Comments
NH44784_039581  IGASERPTASE  32     0.001    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 31.6 bits (71), Expect = 0.001
Identities = 25/128 (19%), Positives = 49/128 (38%), Gaps = 12/128 (9%)

Query: 4 ATLKTRLSDAVKDAMRAKATERLATLRFLLAAVKQKEVD------ERRDLSDAEITAIIE 57
T++ DA + + + + A + A + EV + ++ + TA +E
Sbjct: 1049 KTVEKNEQDATETTAQNREVAKEAKSN-VKANTQTNEVAQSGSETKETQTTETKETATVE 1107

Query: 58 KQVKQRRESIAAFEQAGRT--ETAEQEKAELVVLQEFLPQAATPEEVAAAIDAALAEVAA 115
K+ K + E+ E T + +QE++E V Q + A + I ++
Sbjct: 1108 KEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQ---AEPARENDPTVNIKEPQSQTNT 1164

Query: 116 QGVTGAPA 123
T PA
Sbjct: 1165 TADTEQPA 1172


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_039601  DHBDHDRGNASE  102    2e-28    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.
Length = 261

Score = 102 bits (254), Expect = 2e-28
Identities = 67/248 (27%), Positives = 110/248 (44%), Gaps = 14/248 (5%)

Query: 6 ILVTGSSRGIGRAIALALADAGHDLVLHCRQQRAQAEAVQAEIAARGRQARVLQFDVADR 65
+TG+++GIG A+A LA G + + V + A R A DV D
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSL-KAEARHAEAFPADVRDS 69

Query: 66 AQCAAVLQADVDAHGAYYGVVLNAGLTRDGAFPALTGDDWDQVLRTNLDGFYNVLGPLAM 125
A + G +V AG+ R G +L+ ++W+ N G +N ++
Sbjct: 70 AAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSK 129

Query: 126 PMIRRRAPGRVVCMASVSGLIGNRGQVNYSASKAGLIGAAKALAVELAKRQITVNCVAPG 185
M+ RR+ G +V + S + Y++SKA + K L +ELA+ I N V+PG
Sbjct: 130 YMMDRRS-GSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPG 188

Query: 186 LIDTDMI-------DAHVPV-----EEILKAVPAQRMGRPEEVAATVAFLMSPGAAYITR 233
+TDM + V E +P +++ +P ++A V FL+S A +IT
Sbjct: 189 STETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHITM 248

Query: 234 QVIAVNGG 241
+ V+GG
Sbjct: 249 HNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_039641  ACRIFLAVINRP  41     3e-05    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 40.6 bits (95), Expect = 3e-05
Identities = 29/142 (20%), Positives = 48/142 (33%), Gaps = 26/142 (18%)

Query: 648 LVAAALLCLTL------GRAAAWRILAVPLAATACSLAALGYLGQPLTLFSLFGLLLVSA 701
A L+ L + RA +AVP+ + A L G + ++FG++L
Sbjct: 345 FEAIMLVFLVMYLFLQNMRATLIPTIAVPVVLLG-TFAILAAFGYSINTLTMFGMVLAIG 403

Query: 702 IGVDYAIFMFERVAGAAA-------------------SLVGIMLGAITTLLSFGLLAVSR 742
+ VD AI + E V +LVGI + + S
Sbjct: 404 LLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGST 463

Query: 743 TPAIANFGLAVALGVGFSLLWA 764
F + + + S+L A
Sbjct: 464 GAIYRQFSITIVSAMALSVLVA 485



Score = 30.6 bits (69), Expect = 0.031
Identities = 32/134 (23%), Positives = 56/134 (41%), Gaps = 18/134 (13%)

Query: 257 IAMAGIVLVLLLALRRWLALLALVP-VAVGL-LAGTVACVAVFG-SIHALTLVIGASLIG 313
A+ + LV+ L L+ A L+P +AV + L GT A +A FG SI+ LT+ IG
Sbjct: 346 EAIMLVFLVMYLFLQNMRA--TLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIG 403

Query: 314 VAVDFPM------HWLGKSYGMPDWRA-WPALRRVLPGLTISLAASLVGYVALAFTP--- 363
+ VD + + +P A ++ ++ L ++ +AF
Sbjct: 404 LLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGST 463

Query: 364 ---FPALTQTAVFS 374
+ + T V +
Sbjct: 464 GAIYRQFSITIVSA 477


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_039831  TCRTETA  34     8e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 34.4 bits (79), Expect = 8e-04
Identities = 84/377 (22%), Positives = 116/377 (30%), Gaps = 67/377 (17%)

Query: 53 GLLADRWGDRPVLLGGLGATAAALAWMSLFAAPAHGVAPALWLLALGLLLVGVMGGSVNG 112
G L+DR+G RPVLL L A A M AP LW+L +G ++ G+ G + G
Sbjct: 64 GALSDRFGRRPVLLVSLAGAAVDYAIM--------ATAPFLWVLYIGRIVAGITGAT--G 113

Query: 113 ASGRAVMA----------WFDEGERGLAMSIRQTAVPLGGGLGALVLPWLASRAGFGAVF 162
A A +A F G A P+ GGL P + A
Sbjct: 114 AVAGAYIADITDGDERARHF--GFMSACFGFGMVAGPVLGGLMGGFSP--HAPFFAAAAL 169

Query: 163 GLLAGLCAGAALLTLGWLREPDRAHPATAAGSTAASHAGDNTPTTSPRSRAATHTPIAAA 222
L L L E H G+ P A
Sbjct: 170 NGL------NFLTGCFLLPES---------------HKGERRPLRREALNPLASFRWARG 208

Query: 223 SSPLRDARVWRAAVAIGLLCCPQFAVLTFATVFLHDFSGAGIATLTAVMVAVQLGAMVAR 282
+ + AV + Q + F ++ L ++
Sbjct: 209 MTVV----AALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQA 264

Query: 283 IAGGRWTDRHGNRRAYLRGCVWLGCVL-FVALAAAAWLQGAVPGNRAALLAVPVLLAAAG 341
+ G R G RRA + G + G +A A W+ + VLLA+ G
Sbjct: 265 MITGPVAARLGERRALMLGMIADGTGYILLAFATRGWM----------AFPIMVLLASGG 314

Query: 342 ICASAWHGVAYTELVTLAGAARAGTALGLANTCGYLGLFLTPLALPRLVAA--ASWP-LA 398
I A + L R G G L + PL + AA +W A
Sbjct: 315 IGMPALQAM----LSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAASITTWNGWA 370

Query: 399 WLAAGGAMLAILPLLPR 415
W+A L LP L R
Sbjct: 371 WIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_039901  TCRTETB  37     1e-04    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 37.2 bits (86), Expect = 1e-04
Identities = 32/180 (17%), Positives = 64/180 (35%), Gaps = 3/180 (1%)

Query: 67 LVLLLAVACALSVANVYYAQPLLDAMGREFRLDEGAVGIVVTATQLGCALALVLVVPLGD 126
+++ L + SV N L + +F + V TA L ++ + L D
Sbjct: 15 ILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSD 74

Query: 127 LLDRRRLMLAQLGLLVLA-LAAVALASAAPWLLAGMLALGLLGTAMTQGLLALSAALAAP 185
L +RL+L + + + S L+ G A ++ + A
Sbjct: 75 QLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPK 134

Query: 186 GERGRVVGAVQGGVVIGLLLARTLAGAVADLWGWRAVYLVSAGLAALLALALARWLPRRA 245
RG+ G + V +G + + G +A W YL+ + ++ + L ++
Sbjct: 135 ENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKE 192


51  NH44784_040071  NH44784_040221  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_040071  2       13       3.684251   COG1801: Uncharacterized conserved protein
NH44784_040081  3       15       3.577917   Putative threonine efflux protein
NH44784_040091  2       13       3.963303   Transcriptional regulator, GntR family domain /
NH44784_040101  3       14       3.302617   GNAT family acetyltransferase PA5433
NH44784_040111  3       14       3.715821   Permease of the drug/metabolite transporter
NH44784_040121  2       16       3.517479   Permease of the drug/metabolite transporter
NH44784_040131  1       15       3.304328   putative thioredoxin
NH44784_040141  1       15       3.216834   N-carbamoyl-L-amino acid hydrolase
NH44784_040151  -1      13       2.420657   Na+-driven multidrug efflux pump
NH44784_040161  -3      13       2.121377   Methylglutaconyl-CoA hydratase
NH44784_040171  -2      16       2.061815   Citronellol and citronellal dehydrogenase
NH44784_040181  -2      16       1.821392   Predicted transcriptional regulator LiuQ of
NH44784_040191  -1      17       2.137351   Acetyl-coenzyme A synthetase
NH44784_040201  -1      18       2.761132   Phosphoenolpyruvate carboxykinase [GTP]
NH44784_040211  -1      16       3.236468   SAM-dependent methyltransferases
NH44784_040221  -3      14       3.088814   short-chain dehydrogenase/reductase SDR
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_040171  DHBDHDRGNASE  107    9e-30    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.
Length = 261

Score = 107 bits (268), Expect = 9e-30
Identities = 75/257 (29%), Positives = 112/257 (43%), Gaps = 18/257 (7%)

Query: 24 DGKTVMVTGGGSGLGRCTAHELASLGARLALVGRKPEKLEAARDELARI--YPEAADRIT 81
+GK +TG G+G A LAS GA +A V PEKLE L + EA
Sbjct: 7 EGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEA----- 61

Query: 82 LHACDIRGETGVRGAVADALAAHGAIDGLFNCAGGQFPAPLEQISFNGWNAVVQNNLHGT 141
D+R + A G ID L N AG P + +S W A N G
Sbjct: 62 -FPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGV 120

Query: 142 FLMAREVYTQHMRQHGGAIVNMLADIWGGMP--GMGHSGAARAGVWNLTETAACEWAHAG 199
F +R V M + G+IV + ++ G+P M +++A T+ E A
Sbjct: 121 FNASRSVSKYMMDRRSGSIVTVGSNP-AGVPRTSMAAYASSKAAAVMFTKCLGLELAEYN 179

Query: 200 VRVNAVAPGWIASSGMDS-YDEDYRA--VLRGLAAK----VPLQRFGTEAELAAAVVFLL 252
+R N V+PG + S + ++ A V++G +PL++ +++A AV+FL+
Sbjct: 180 IRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLV 239

Query: 253 SPAAAFINGSVIRVDGG 269
S A I + VDGG
Sbjct: 240 SGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_040181  HTHTETR  77     4e-19    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 76.6 bits (188), Expect = 4e-19
Identities = 36/205 (17%), Positives = 67/205 (32%), Gaps = 13/205 (6%)

Query: 13 PASAKSEQRIRDILRVARQVFSEAGFQAATTTEIAQRLGVSEGTVFTYFGGKRELCVRVI 72
++++ + IL VA ++FS+ G + + EIA+ GV+ G ++ +F K +L +
Sbjct: 4 KTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIW 63

Query: 73 SDWYDEIIAKVEDYLPHMQGA---------RAQLHYLVHAHLRHLLAEGTGLCAFILSEG 123
I +Y G L V R LL E F E
Sbjct: 64 ELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLME----IIFHKCEF 119

Query: 124 RARNDDFGEIYAGLQRRYTAPLMRILAQGQADGHVRQDMPLRLLRSAIYGPMEHVLWDAI 183
+ L + + L + D+ R + G + ++ + +
Sbjct: 120 VGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWL 179

Query: 184 LRGTRVDVAATADQLVDFLWQALQP 208
D+ A V L +
Sbjct: 180 FAPQSFDLKKEARDYVAILLEMYLL 204


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_040221  DHBDHDRGNASE  108    1e-30    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.
Length = 261

Score = 108 bits (271), Expect = 1e-30
Identities = 74/254 (29%), Positives = 111/254 (43%), Gaps = 12/254 (4%)

Query: 2 KGLADKTAIVTGGATLIGAGVVAALRAGGVRVAIFDIDAEGAARVAASDPDGIR---AWP 58
KG+ K A +TG A IG V L + G +A D + E +V +S R A+P
Sbjct: 4 KGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 59 LDITDDTQLARAVAEVAAHFGRIDYLVNLAATYLDDGAAS--GRADWLRALDINVVSAVM 116
D+ D + A + G ID LVN A L G +W +N
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVN-VAGVLRPGLIHSLSDEEWEATFSVNSTGVFN 122

Query: 117 AARAVHPHLVAAGGGAIVNFTSISSRVAQTGRWLYPVSKAALLQVTRSLAMDYAGDRIRV 176
A+R+V +++ G+IV S + V +T Y SKAA + T+ L ++ A IR
Sbjct: 123 ASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRC 182

Query: 177 NSVSPGWTWSRVMDELTHGDRAKTDRVAAD---FHL---LGRAGDPAEVAEVVAFLLSDH 230
N VSPG T + + L + + F L + P+++A+ V FL+S
Sbjct: 183 NIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQ 242

Query: 231 ASFVTGADYAVDGG 244
A +T + VDGG
Sbjct: 243 AGHITMHNLCVDGG 256


52  NH44784_040381  NH44784_040671  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_040381  2       17       -3.270355  hypothetical protein
NH44784_040391  0       22       -3.946259  ferripyoverdine receptor
NH44784_040401  0       23       -3.085774  Fe2+-dicitrate sensor, membrane component
NH44784_040411  1       24       -4.421562  heme uptake regulator
NH44784_040421  2       23       -2.330745  hypothetical protein
NH44784_040431  1       19       -0.271289  hypothetical protein
NH44784_040441  -1      16       1.652533   Ribose 5-phosphate isomerase B
NH44784_040451  -2      10       3.255625   Pirin
NH44784_040461  0       11       4.034797   G:T/U mismatch-specific uracil/thymine
NH44784_040471  0       10       4.288262   Cytochrome c heme lyase subunit CcmH
NH44784_040481  1       11       3.817172   Cytochrome c heme lyase subunit CcmL
NH44784_040491  1       11       3.154622   Cytochrome c-type biogenesis protein
NH44784_040501  2       10       3.254559   Cytochrome c heme lyase subunit CcmF
NH44784_040511  0       9        1.391587   Cytochrome c-type biogenesis protein CcmE, heme
NH44784_040521  -3      8        -0.515277  cytochrome c biogenesis protein
NH44784_040531  -2      10       -0.750140  Cytochrome c-type biogenesis protein
NH44784_040541  -2      9        -1.318248  ABC transporter involved in cytochrome c
NH44784_040551  -2      8        -0.900399  ABC transporter involved in cytochrome c
NH44784_040561  -2      8        0.134983   Cytochrome c-type protein NapC
NH44784_040571  -1      7        1.605233   Trimethylamine-N-oxide reductase
NH44784_040581  0       7        3.244123   Chaperone protein TorD
NH44784_040591  0       9        3.374961   Tetrathionate reductase subunit B
NH44784_040601  2       12       3.955982   Tetrathionate reductase subunit C
NH44784_040611  1       12       3.945463   Tetrathionate reductase subunit A
NH44784_040621  5       13       4.734058   Cytochrome c heme lyase subunit CcmH
NH44784_040631  5       15       3.744418   Cytochrome c heme lyase subunit CcmL
NH44784_040641  6       17       4.027729   Cytochrome c-type biogenesis protein
NH44784_040651  7       18       5.186239   Cytochrome c heme lyase subunit CcmF
NH44784_040661  4       15       3.679610   Cytochrome c-type biogenesis protein CcmE, heme
NH44784_040671  3       14       2.744289   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_040401  PRTACTNFAMLY  28     0.049    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 28.5 bits (63), Expect = 0.049
Identities = 28/107 (26%), Positives = 42/107 (39%), Gaps = 12/107 (11%)

Query: 166 ARVTRRPFIVSTPHGDIL---------PEEARFTVRLHDLATRV-EVLDQTATVVPGRAA 215
ARVT +S PHG+++ P+ A ++ L A + L P +
Sbjct: 327 ARVTVSGGSLSAPHGNVIETGGARRFAPQAAPLSITLQAGAHAQGKALLYRVLPEPVKLT 386

Query: 216 ASPAAYPRAVLLHANQAVSFDGNSAGPVGVASADAAAWLHGRLAVVD 262
+ A + ++ S G S GP+ VA A A W G VD
Sbjct: 387 LTGGADAQGDIVATELP-SIPGTSIGPLDVALASQARW-TGATRAVD 431


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_040431  SYCDCHAPRONE  35     1e-04    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.
Length = 168

Score = 34.5 bits (79), Expect = 1e-04
Identities = 12/64 (18%), Positives = 23/64 (35%), Gaps = 1/64 (1%)

Query: 100 QLFEQLDAEAFAFVACDFSLADGSPAPPYYLGNVLRRLDALDEASSRVRIKLDHNYQTGE 159
Q Q D ++ + P P++ L + L EA S + + + E
Sbjct: 81 QAMGQYDLAIHSYSYG-AIMDIKEPRFPFHAAECLLQKGELAEAESGLFLAQELIADKTE 139

Query: 160 NEKL 163
++L
Sbjct: 140 FKEL 143


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_040561  TCRTETA  30     0.022    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 29.8 bits (67), Expect = 0.022
Identities = 16/62 (25%), Positives = 25/62 (40%), Gaps = 11/62 (17%)

Query: 14 WMGALLLRPATRIGLGVL---VIGGFLGGVTFLALFNTGMNATNTEAFCISCHSMRDTVL 70
+M A G G++ V+GG +GG + A F A N F C + ++
Sbjct: 135 FMSAC-------FGFGMVAGPVLGGLMGGFSPHAPFFAA-AALNGLNFLTGCFLLPESHK 186

Query: 71 PE 72
E
Sbjct: 187 GE 188


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_040621  SYCDCHAPRONE  31     0.007    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.
Length = 168

Score = 30.7 bits (69), Expect = 0.007
Identities = 23/120 (19%), Positives = 44/120 (36%), Gaps = 7/120 (5%)

Query: 150 LAQRLRATPDDADGWYTLARSYETLGRYGDAAVAYREVLRLVPAQPQVLADLADAL--LS 207
+A + D + Y+LA + G+Y DA ++ + L + L +
Sbjct: 25 IAMLNEISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMG 84

Query: 208 ANQGAPDAASIAAVAQALAANPDQPKALALAGMMALRRGDAAQALEYWQRLQALLPPDSE 267
A + S A+ +P+ A L++G+ A+A Q L+ +E
Sbjct: 85 QYDLAIHSYSYGAIMD-----IKEPRFPFHAAECLLQKGELAEAESGLFLAQELIADKTE 139


53  NH44784_040971  NH44784_041091  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_040971  3       14       3.713665   Transcriptional Regulator, AraC family
NH44784_040981  2       13       3.917488   putative streptomycin phosphotransferase
NH44784_040991  2       15       3.158504   ADP-ribose pyrophosphatase
NH44784_041001  4       16       3.008484   Transcriptional regulator, TetR family
NH44784_041011  3       14       2.406154   Tetracycline efflux protein TetA
NH44784_041021  0       14       0.966407   transcriptional regulator, LysR-family
NH44784_041031  -1      13       1.984682   Ribulose-5-phosphate 4-epimerase and related
NH44784_041041  -1      13       2.174706   major facilitator superfamily MFS_1
NH44784_041051  -1      13       2.029261   2-hydroxychromene-2-carboxylate isomerase
NH44784_041061  -2      13       1.999807   Leucine-, isoleucine-, valine-, threonine-, and
NH44784_041071  -1      11       2.987980   putative transcriptional regulator
NH44784_041081  -1      12       3.717884   Long-chain-fatty-acid--CoA ligase
NH44784_041091  -1      12       3.359339   putative cyclase SCIF3.09c
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_041001  TETREPRESSOR  257    5e-90    Tetracycline repressor protein signature.
		>TETREPRESSOR#Tetracycline repressor protein signature.

Length = 218

Score = 257 bits (657), Expect = 5e-90
Identities = 97/201 (48%), Positives = 137/201 (68%), Gaps = 3/201 (1%)

Query: 1 MAKLQRDAVIAAALDLLNEVGVDGLTTRKLAERLGVQQPALYWHFKSKQALLDALSDAMM 60
MA+L R++VI AAL+LLNE G+DGLTTRKLA++LG++QP LYWH K+K+ALLDAL+ ++
Sbjct: 1 MARLNRESVIDAALELLNETGIDGLTTRKLAQKLGIEQPTLYWHVKNKRALLDALAVEIL 60

Query: 61 -REHMHTLPSPGGPWKEFLYANARSFRRALLAYRDGARIHAGTRPAEPQYDRVQAQIKLL 119
R H ++LP+ G W+ FL NA SFRRALL YRDGA++H GTRP E QYD V+ Q++ +
Sbjct: 61 ARHHDYSLPAAGESWQSFLRNNAMSFRRALLRYRDGAKVHLGTRPDEKQYDTVETQLRFM 120

Query: 120 CDAGFTPLRAANALVAISHYVVGSVMEQQAGEAAAPERVAPP--APPPALARTMKALDRQ 177
+ GF+ A+ A+SH+ +G+V+EQQ AA +R A P PP L ++ +D
Sbjct: 121 TENGFSLRDGLYAISAVSHFTLGAVLEQQEHTAALTDRPAAPDENLPPLLREALQIMDSD 180

Query: 178 GPDATFEYGLAMMLDGLSPDL 198
+ F +GL ++ G L
Sbjct: 181 DGEQAFLHGLESLIRGFEVQL 201


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_041011  TCRTETA  387    e-135    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 387 bits (996), Expect = e-135
Identities = 239/396 (60%), Positives = 287/396 (72%), Gaps = 1/396 (0%)

Query: 2 PDRAIALLLFIVTLDAAGAGLIMPVLPGLLDQLGDAAATPAHYGVLLSLYALAQCLAAPV 61
P+R + ++L V LDA G GLIMPVLPGLL L + AHYG+LL+LYAL Q APV
Sbjct: 3 PNRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPV 62

Query: 62 LGALSDRYGRRPVLLVSLAGAAVDYLVMAAAPALWVLYAGRILAGITGATGAVAGACIAD 121
LGALSDR+GRRPVLLVSLAGAAVDY +MA AP LWVLY GRI+AGITGATGAVAGA IAD
Sbjct: 63 LGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIAD 122

Query: 122 AGDPSRRARRFGQLSACFGLGMILGPALGGLAGLAGARMPFVAAAAANGVAFLTALAWLP 181
D RAR FG +SACFG GM+ GP LGGL G PF AAAA NG+ FLT LP
Sbjct: 123 ITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLP 182

Query: 182 ESRRGARAPWSWRALDPIGGLRQALGGKNLAGLLWVFLVMQMAGQVPGSLWVLYGQDRFQ 241
ES +G R P AL+P+ R A G +A L+ VF +MQ+ GQVP +LWV++G+DRF
Sbjct: 183 ESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFH 242

Query: 242 WDAAAVGLSLAGFGALHAVAQATLPGPLSARLGERGALVVGMAADAAGYVLLACATQGWM 301
WDA +G+SLA FG LH++AQA + GP++ARLGER AL++GM AD GY+LLA AT+GWM
Sbjct: 243 WDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWM 302

Query: 302 AAPLMLLLAAGGVGAPALQALLSARAGAGSQGQLQGAMNSLASAAAIAGPLAFTTLYAAS 361
A P+M+LLA+GG+G PALQA+LS + QGQLQG++ +L S +I GPL FT +YAAS
Sbjct: 303 AFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAAS 362

Query: 362 VGGWTGWPWVAGAALYLLCAPTLARLGAPGDAAPRA 397
+ W GW W+AGAALYLLC P L R G A RA
Sbjct: 363 ITTWNGWAWIAGAALYLLCLPAL-RRGLWSGAGQRA 397


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_041041  TCRTETB  31  0.007  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 31.4 bits (71), Expect = 0.007
Identities = 35/141 (24%), Positives = 63/141 (44%), Gaps = 22/141 (15%)

Query: 74 LGGVIFGHYGDRIGRKSMLLITLLLMGVPTILIGLIPSYEQIGYWAAVLLVLMRFLQGIA 133
+G ++G D++G K +LL +++ ++ IG +G+ LL++ RF+QG
Sbjct: 64 IGTAVYGKLSDQLGIKRLLLFGIIINCFGSV-IGF------VGHSFFSLLIMARFIQGA- 115

Query: 134 VGGEWGGAVLMAV--EHAPQGKKGFFGSLPQAGVAPGLILSSLAMGAVAGLPEQDMLS-- 189
G A++M V + P+ + G A GLI S +AMG G M++
Sbjct: 116 -GAAAFPALVMVVVARYIPKENR---------GKAFGLIGSIVAMGEGVGPAIGGMIAHY 165

Query: 190 WGWRLPFLASVVLLAVGWFIR 210
W L ++ + F+
Sbjct: 166 IHWSYLLLIPMITIITVPFLM 186


54  NH44784_041241  NH44784_041561  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_041241  0       20       3.247074   Long-chain-fatty-acid--CoA ligase
NH44784_041251  -2      16       1.917982   Gluconate 5-dehydrogenase
NH44784_041261  0       11       2.799661   hypothetical protein
NH44784_041271  0       11       3.092796   FIG01027227: hypothetical protein
NH44784_041281  0       11       3.105324   hypothetical protein
NH44784_041291  0       11       3.092603   transcriptional regulator, LysR-family
NH44784_041301  0       13       2.810682   Penicillin G acylase precursor
NH44784_041311  1       12       3.850093   Transport ATP-binding protein CydCD
NH44784_041321  1       10       4.025843   transcriptional regulator, LysR family
NH44784_041331  2       12       4.179393   Sarcosine oxidase beta subunit
NH44784_041341  2       12       3.917754   transcriptional regulator, LysR family
NH44784_041351  3       13       3.833019   N-formylglutamate deformylase
NH44784_041361  4       14       4.382734   Transcriptional regulator, LysR family
NH44784_041371  3       16       3.954554   Transcription regulatory protein opdE
NH44784_041381  0       13       2.218817   Major facilitator family transporter
NH44784_041391  -1      10       2.636030   transcriptional regulator, TetR family
NH44784_041401  -2      9        3.566094   hypothetical protein
NH44784_041411  -2      7        3.044720   Methyltransferase
NH44784_041421  -1      6        2.524170   LacI family transcriptional regulator
NH44784_041431  -1      5        2.460156   D-galactonate transporter
NH44784_041441  -1      6        3.364288   Aspartyl-tRNA(Asn) amidotransferase subunit A
NH44784_041451  0       8        1.952080   hypothetical protein
NH44784_041461  0       16       0.036743   Multiple antibiotic resistance protein MarC
NH44784_041471  -1      15       -0.076698  hypothetical protein
NH44784_041481  0       22       -2.160118  hypothetical protein
NH44784_041491  -1      21       -2.704867  DUF1275 domain-containing protein
NH44784_041501  0       25       -3.905794  hypothetical protein
NH44784_041511  -1      24       -5.404182  Methyltransferase type 12
NH44784_041521  -2      24       -5.963060  hypothetical protein
NH44784_041531  -1      27       -4.588416  hypothetical protein
NH44784_041541  1       20       -2.721068  hypothetical protein
NH44784_041551  0       22       -2.756287  hypothetical protein
NH44784_041561  -1      20       -3.858909  Archaeal seryl-tRNA synthetase-related sequence
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_041251  DHBDHDRGNASE  128  4e-38  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 128 bits (322), Expect = 4e-38
Identities = 82/249 (32%), Positives = 116/249 (46%), Gaps = 12/249 (4%)

Query: 8 VLVTGGASGIGLAVARGLLGAGRRVLVADVSQERLDEVRGL--APADRLGCVRMDVSDEA 65
+TG A GIG AVAR L G + D + E+L++V A A DV D A
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDSA 70

Query: 66 QVSEAIARLESDFGPIAGLVNSAGIGRDVPFLETDAALLRRILEVNLVGSFVVAREAARR 125
+ E AR+E + GPI LVN AG+ R VN G F +R ++
Sbjct: 71 AIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSKY 130

Query: 126 MRDRGRGSIVNIASVSGIRGNAGRAAYGASKGGVVTLTRVMAVELAAYGIRVNAVAPGPI 185
M DR GSIV + S AAY +SK V T+ + +ELA Y IR N V+PG
Sbjct: 131 MMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPGST 190

Query: 186 ETPLVSQMHTDEARAA---------WRRVVPQHRYAAPEELNGTIGWLLDESQSSYVTGQ 236
ET + + DE A ++ +P + A P ++ + +L+ Q+ ++T
Sbjct: 191 ETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSG-QAGHITMH 249

Query: 237 VICVDGGFT 245
+CVDGG T
Sbjct: 250 NLCVDGGAT 258


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_041281  ACRIFLAVINRP  28  0.009  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 27.9 bits (62), Expect = 0.009
Identities = 11/45 (24%), Positives = 22/45 (48%)

Query: 6 SAGIRLALAACLIFAALFAVVGGWTTGYSLESVIWLALTGAIFGA 50
A +A++ ++F L A+ W+ S+ V+ L + G + A
Sbjct: 871 QAPALVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAA 915


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_041371  TCRTETB  38  6e-05  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 37.9 bits (88), Expect = 6e-05
Identities = 26/145 (17%), Positives = 52/145 (35%), Gaps = 1/145 (0%)

Query: 49 IAASLDASTGRVGWLMVVPGLLAAL-CAPLVVMGARGVDRRRILCGLLLLLAGANLGSAL 107
IA + W+ L ++ A + + +R +L G+++ G+ +G
Sbjct: 40 IANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVG 99

Query: 108 APDMAWFLAARVLVGVCIGGIWAVAGGLAGRLVPPAAIGMATAVIFGGVAVASVLGVPLG 167
+ + AR + G A+ + R +P G A +I VA+ +G +G
Sbjct: 100 HSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIG 159

Query: 168 ALIGNLAGWRSAFGAMAVLSAAVLL 192
+I + W + V
Sbjct: 160 GMIAHYIHWSYLLLIPMITIITVPF 184


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_041381  TCRTETB  36  2e-04  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 36.0 bits (83), Expect = 2e-04
Identities = 31/160 (19%), Positives = 60/160 (37%), Gaps = 1/160 (0%)

Query: 25 LLALALAGFIAIMTETVPAGLLPQIGLGLGVSEALAGQLVTLFAAGSVLAAIPIIVATRG 84
L+ L + F +++ E V LP I A + T F + +
Sbjct: 16 LIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQ 75

Query: 85 WNRRPLLLLAIGGLCIFNVVTALS-AHYALALAARFGGGMAAGLLWGLLAGYARRMVATR 143
+ LLL I C +V+ + + ++L + ARF G A L+ R +
Sbjct: 76 LGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKE 135

Query: 144 QQGRALAVVGAGQPLALCLGVPAGAWLGSLMDWRGVFWLM 183
+G+A ++G+ + +G G + + W + +
Sbjct: 136 NRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIP 175


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_041391  HTHTETR  77  6e-20  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 77.4 bits (190), Expect = 6e-20
Identities = 31/157 (19%), Positives = 59/157 (37%), Gaps = 3/157 (1%)

Query: 13 DAAMQVFWRRGYAATSVQDLVDGTGLGRGSLYNAFGSKQGLYEAALRRYHELTAANLDLL 72
D A+++F ++G ++TS+ ++ G+ RG++Y F K L+
Sbjct: 18 DVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGELELEY 77

Query: 73 A--RPGNARERIGRLLDFIADDELREPSRRGCLVA-NASLEMAGQDEMVAELVRRNFQRL 129
PG+ + +L + + + E RR + E G+ +V + R
Sbjct: 78 QAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQRNLCLES 137

Query: 130 EKALEQILAEGQANGEIDAGRSPRALARFIVTTVQGL 166
+EQ L + A R A + + GL
Sbjct: 138 YDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGL 174


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_041431  TCRTETA  44  5e-07  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 44.4 bits (105), Expect = 5e-07
Identities = 67/359 (18%), Positives = 116/359 (32%), Gaps = 39/359 (10%)

Query: 48 GDWGAVLGYFGYGYMVGALFGGMLADRYGPRKVWIVAGVTWSIFEIATAWAGDFGLAFLG 107
+G +L + A G L+DR+G R V +V+ ++ A A + ++G
Sbjct: 43 AHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIG 102

Query: 108 GSALAGFATIRILFGAAEGPAYSIINKTIANWATPRERGFVVGFGLLSTPLGALLTAPVA 167
RI+ G G ++ IA+ ER FG +S G + A
Sbjct: 103 ----------RIVAGIT-GATGAVAGAYIADITDGDER--ARHFGFMSACFGFGMVAGPV 149

Query: 168 VGLLSLTGSWRAMFYILGAAGLLVLVLFMRIFTDRPDTNPRVSPAELREIQAARAEQAAA 227
+G L S A F+ AA L L F P +
Sbjct: 150 LGGLMGGFSPHAPFFA--AAALNGLNFLTGCFLLPESHKGERRPLRREALNPLA------ 201

Query: 228 GQGGDAPALPWWSFFRSKTLVLNTLGYFSFNYVNFLLLTWTPKYLQDTFGYSLSSLWYMG 287
+ W ++ +F V + + +D F + +++ G
Sbjct: 202 -------SFRWARGMTVVAALMAV--FFIMQLVGQVPAALWVIFGEDRFHWDATTI---G 249

Query: 288 MIPWTGACVTVLLGGRISDMLARRTGNLKIARSWFAAGCLLATTLCFLLVSQAQSVFAVI 347
+ + L I+ +A R G + G + T LL + A
Sbjct: 250 ISLAAFGILHSLAQAMITGPVAARLGERRA----LMLGMIADGTGYILLAFATRGWMAFP 305

Query: 348 ALMTLANALNAMPNSVYWAVVIDTAPASRVGTFSGLMHFFANIASILAPTLTGYLAARH 406
++ LA+ MP A++ R G G + ++ SI+ P L + A
Sbjct: 306 IMVLLASGGIGMP--ALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAAS 362


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_041451  IGASERPTASE  38  7e-05  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 38.1 bits (88), Expect = 7e-05
Identities = 26/188 (13%), Positives = 53/188 (28%), Gaps = 12/188 (6%)

Query: 14 PLLFALPAAPAVAGMDCARARTPTEKTLCADAALYRLDDELGAAYARLRAAQQPGQNEAL 73
P A P+ + ++ + T + DA + A A+ NE
Sbjct: 1027 PPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVA 1086

Query: 74 RQAQRGWLKQRDACDSDAECLRQRYDTRLAELQAQQSRALAYRPDDIDRLALEDLRQAIE 133
+ Q A ++ E + + + ++ E ++ E
Sbjct: 1087 QSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQ--SETVQPQAE 1144

Query: 134 AARQSNPEFAVETVLAARSLKAEATAIRNERAADGDGPARLPAARPAGVTEDEWAAVLAS 193
AR+++P ++ E + N A + VTE S
Sbjct: 1145 PARENDPTVNIK----------EPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNS 1194

Query: 194 DLESDAEE 201
+E+
Sbjct: 1195 VVENPENT 1202


55  NH44784_041951  NH44784_042001  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_041951  1       13       5.752317   hypothetical protein
NH44784_041961  3       13       5.525225   General secretion pathway protein N
NH44784_041971  1       14       4.902771   type II secretion system protein M (XcpZ-2);
NH44784_041981  0       13       4.090131   General secretion pathway protein L
NH44784_041991  0       12       4.584403   General secretion pathway protein K
NH44784_042001  -2      11       3.306057   General secretion pathway protein J
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_042001  PilS_PF08805  31  0.003  PilS N terminal
		>PilS_PF08805#PilS N terminal

Length = 185

Score = 30.7 bits (69), Expect = 0.003
Identities = 17/89 (19%), Positives = 33/89 (37%), Gaps = 8/89 (8%)

Query: 4 RRVAPHAQAGFTLIEVLVALALMALVSLMAWRGLDSVSSARDW--IARQADDTDAIVRAL 61
RR G TL+EVL+ + ++ +++ A++ V S A +++L
Sbjct: 19 RRKKEQ-DKGATLMEVLLVVGVIVVLAASAYKLYSMVQSNIQSSNEQNNVLTVIANMKSL 77

Query: 62 GQMGRDVEMAYNGPSFAAPGIDARVFTSG 90
GR Y ++ + S
Sbjct: 78 KFQGR-----YTDSNYIKTLYAQGLLPSD 101


56  NH44784_042351  NH44784_042911  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_042351  2       20       -1.813370  Nitrogen regulation protein NR(I
NH44784_042361  1       18       -1.332038  two-component sensor kinase
NH44784_042371  3       20       -1.376123  Pyruvate dehydrogenase E1 component
NH44784_042381  2       15       0.587729   Dihydrolipoamide acetyltransferase component of
NH44784_042391  3       17       -0.384357  Dihydrolipoamide dehydrogenase of pyruvate
NH44784_042401  3       15       -0.745354  Dehydrogenases with different specificities
NH44784_042411  3       17       -1.697815  LysR family transcriptional regulator PA2877
NH44784_042421  2       18       -2.312763  hypothetical protein
NH44784_042431  1       19       -2.683144  hypothetical protein
NH44784_042441  1       21       -4.402214  Flagellar biosynthesis protein FliC
NH44784_042451  0       18       -5.006270  hypothetical protein
NH44784_042461  0       14       -4.564932  Methyltransferase
NH44784_042471  0       14       -3.990126  RNA polymerase sigma factor for flagellar
NH44784_042481  -2      10       -3.012415  Flagellar transcriptional activator FlhD
NH44784_042491  -1      9        -2.923527  Flagellar transcriptional activator FlhC
NH44784_042501  0       9        -2.466733  Flagellar motor rotation protein MotA
NH44784_042511  2       8        -2.179754  Flagellar motor rotation protein MotB
NH44784_042521  1       10       -1.655803  Chemotaxis regulator-transmits chemoreceptor
NH44784_042531  2       10       -1.836233  Signal transduction histidine kinase CheA
NH44784_042541  2       11       -2.232316  Positive regulator of CheA protein activity
NH44784_042551  1       14       -0.884418  Methyl-accepting chemotaxis protein I (serine
NH44784_042561  -1      13       -0.758274  Chemotaxis protein methyltransferase CheR
NH44784_042571  1       10       -0.151298  Chemotaxis response regulator protein-glutamate
NH44784_042581  1       11       1.415812   Chemotaxis regulator-transmits chemoreceptor
NH44784_042591  1       10       1.739194   Chemotaxis response-phosphatase CheZ
NH44784_042601  1       10       1.941102   methyl-accepting chemotaxis sensory transducer
NH44784_042611  1       8        2.111555   Flagellar biosynthesis protein FlhB
NH44784_042621  2       8        2.221663   Flagellar biosynthesis protein FlhA
NH44784_042631  1       10       2.817483   Flagellar biosynthesis protein FlhF
NH44784_042641  -2      13       -0.851893  Flagellar biosynthesis protein FlgN
NH44784_042651  1       14       -1.375936  Negative regulator of flagellin synthesis FlgM
NH44784_042661  2       11       -1.358774  Flagellar basal-body P-ring formation protein
NH44784_042671  3       13       -2.220199  Flagellar basal-body rod protein FlgB
NH44784_042681  4       14       -2.033990  Flagellar basal-body rod protein FlgC
NH44784_042691  4       13       -1.590858  Flagellar basal-body rod modification protein
NH44784_042701  2       13       -1.391229  Flagellar hook protein FlgE
NH44784_042711  1       12       -1.744601  Flagellar basal-body rod protein FlgF
NH44784_042721  2       12       -2.091452  Flagellar basal-body rod protein FlgG
NH44784_042731  1       10       -2.266215  Flagellar L-ring protein FlgH
NH44784_042741  0       9        -2.059469  Flagellar P-ring protein FlgI
NH44784_042751  0       10       -1.530027  Flagellar protein FlgJ [peptidoglycan
NH44784_042761  1       11       -1.439947  Flagellar hook-associated protein FlgK
NH44784_042771  2       16       -0.807102  Flagellar hook-associated protein FlgL
NH44784_042781  3       16       -0.593506  hypothetical protein
NH44784_042791  2       13       -0.433960  Methyl-accepting chemotaxis protein I (serine
NH44784_042801  1       13       -0.166695  Methyl-accepting chemotaxis protein I (serine
NH44784_042811  -1      11       -1.246167  putative lipoprotein
NH44784_042821  0       12       -1.162519  hypothetical protein
NH44784_042831  1       12       -1.006851  Aerotaxis sensor receptor protein
NH44784_042841  2       12       -0.382908  methyl-accepting chemotaxis protein
NH44784_042851  1       15       -1.372433  Flagellar biosynthesis protein FliR
NH44784_042861  0       15       -1.137423  Flagellar biosynthesis protein FliQ
NH44784_042871  -2      14       -0.822705  Flagellar biosynthesis protein FliP
NH44784_042881  0       15       1.827884   Flagellar biosynthesis protein FliQ
NH44784_042891  0       15       1.080802   Flagellar motor switch protein FliN
NH44784_042901  -1      15       1.545475   Flagellar motor switch protein FliM
NH44784_042911  -1      17       3.250466   Flagellar biosynthesis protein FliL
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_042351  HTHFIS  117  5e-33  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 117 bits (294), Expect = 5e-33
Identities = 38/150 (25%), Positives = 70/150 (46%)

Query: 8 STVFIVDDDEAVRDSLRWLLEANGYRVRAYASGESFLEDYDPSQIGVLIADVRMPGMSGL 67
+T+ + DDD A+R L L GY VR ++ + +++ DV MP +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 68 ELQEQLIARNAPLPIVFITGHGDVPMAVSTMKKGAVDFLEKPFNESDLREIVARMLEQAT 127
+L ++ LP++ ++ A+ +KGA D+L KPF+ ++L I+ R L +
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 128 QRVSKHQAQKDHEAMLARLTAREQQVLERI 157
+R SK + L +A Q++ +
Sbjct: 124 RRPSKLEDDSQDGMPLVGRSAAMQEIYRVL 153


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_042381  IGASERPTASE  38  1e-04  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 37.7 bits (87), Expect = 1e-04
Identities = 33/156 (21%), Positives = 50/156 (32%), Gaps = 15/156 (9%)

Query: 79 AAPAAKEEPKAEAPKQAAAAAPAAKAEAAAPAASSGPVEIEVPDIGDFKEVEVIEVMVAV 138
A P+ E AE KQ + + +A A + V E V
Sbjct: 1031 ATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKE----------AKSNVKANT 1080

Query: 139 GDTIKAEQSLITVESDKALMEIPASQGGVVKEVKVKVGDKVAKGSVVVVVEGSAPAAAAA 198
A+ T E+ + A+ KE K KV + V V +P +
Sbjct: 1081 QTNEVAQSGSETKETQTTETKETATVE---KEEKAKV-ETEKTQEVPKVTSQVSPKQEQS 1136

Query: 199 PAAKAEAASARSEAPAAKAEAPAAP-ATPAVGSRPA 233
+ +A AR P + P + T A +PA
Sbjct: 1137 ETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPA 1172


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_042391  RTXTOXIND  34  0.002  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 34.0 bits (78), Expect = 0.002
Identities = 19/85 (22%), Positives = 32/85 (37%), Gaps = 8/85 (9%)

Query: 39 TVESDKASMEIPASTGGVVKSINVKVGDKVAEGSVVLEVEASDAAPAAKQAPKADAPKAE 98
+ S EI +VK I VK G+ V +G V+L++ A A AD K +
Sbjct: 89 KLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAE--------ADTLKTQ 140

Query: 99 APKAEAKPAAAAPAAATFKGSADAE 123
+ +A+ + +
Sbjct: 141 SSLLQARLEQTRYQILSRSIELNKL 165


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_042401  DHBDHDRGNASE  70  2e-16  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 70.1 bits (171), Expect = 2e-16
Identities = 71/266 (26%), Positives = 119/266 (44%), Gaps = 19/266 (7%)

Query: 1 MADHSIKGKVVLIAGGAKNLGGLIARDLAQHGAKAVAIHYNSAASKADADATVAAIQAAG 60
M I+GK+ I G A+ +G +AR LA GA A+ YN + V++++A
Sbjct: 1 MNAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKL----EKVVSSLKAEA 56

Query: 61 AKAVALQADLTTAGAVEKLFADAVAAVGRPDIAINTVGKVLKKPLTEISEAEYDEMSAVN 120
A A AD+ + A++++ A +G DI +N G + + +S+ E++ +VN
Sbjct: 57 RHAEAFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVN 116

Query: 121 AKTAFFFLKEAGRHVND--NGKVCTLVTSLLGAFTPFYAAYAGTKAPVEHYTRAASKEFG 178
+ F + +++ D +G + T+ ++ G AAYA +KA +T+ E
Sbjct: 117 STGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELA 176

Query: 179 ARGISVTAVGPGPMDTPFFYPAEGADAVAYHKTAAALSPFSKTGL--------TDIEDVV 230
I V PG +T + + A +L F KTG+ +DI D V
Sbjct: 177 EYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETF-KTGIPLKKLAKPSDIADAV 235

Query: 231 PFIRHLVSD-GWWITGQTILINGGYT 255
F LVS IT + ++GG T
Sbjct: 236 LF---LVSGQAGHITMHNLCVDGGAT 258


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_042431  IGASERPTASE  33  0.001  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 33.1 bits (75), Expect = 0.001
Identities = 18/80 (22%), Positives = 36/80 (45%), Gaps = 2/80 (2%)

Query: 137 APRTADAPPPQPGATDSVPPSAAPSPDDAAPTPHIQGQSAASAAPAAAAIPAREGAAPHA 196
P+ P+ +++V P A P+ ++ PT +I+ + + A PA+E + +
Sbjct: 1122 VPKVTSQVSPKQEQSETVQPQAEPAREN-DPTVNIKEPQSQTNTTADTEQPAKE-TSSNV 1179

Query: 197 DRPAMPKPTLRITEIPAENP 216
++P T+ ENP
Sbjct: 1180 EQPVTESTTVNTGNSVVENP 1199


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_042441  FLAGELLIN  217  3e-66  Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 217 bits (555), Expect = 3e-66
Identities = 237/564 (42%), Positives = 290/564 (51%), Gaps = 59/564 (10%)

Query: 2 AAVINTNYLSLVAQNNLNKSQSALGTAIERLSSGLRINSAKDDAAGMAIANRFTANVKGL 61
A VINTN LSL+ QNNLNKSQS+L +AIERLSSGLRINSAKDDAAG AIANRFT+N+KGL
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 TQAARNANDGISLAQTTEGAASEINTHLQRVRELSVQAANGSYSQEQLNSMQDEINQRLS 121
TQA+RNANDGIS+AQTTEGA +EIN +LQRVRELSVQA NG+ S L S+QDEI QRL
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 DIDRISQQTDFNGVKVLSGNAKPLTLQVGANDGETITLNLTEISVKTLGLDGFNVNGTGV 181
+IDR+S QT FNGVKVLS + + + +QVGANDGETIT++L +I VK+LGLDGFNVNG
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDNQ-MKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKE 179

Query: 182 TQNRTATVSDLQAAGGKAGAGAAANDWTVTTNHAAATAEQAFGKLENGNTVVVGGTTYTY 241
S G A A + A T A + G T
Sbjct: 180 ATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTD 239

Query: 242 DAANGNFTFQNTTKGATPADSTANLAKLASSLTPATGTSTGTYTNNASASTTFEVDATGN 301
DA N T +T + A A D G
Sbjct: 240 DAENNTAVDLFKTTKSTAGTAEAKAIAGAI----------------KGGKEGDTFDYKGV 283

Query: 302 LTIGGKAAYLAATGELSTNNPGGGAQATLTDV--LTTTSKAAAGTASISIGGKTFNSTGT 359
G++ST G T+ D+ AA +S ++ N T
Sbjct: 284 TFTIDTKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFT 343

Query: 360 VDEVTYTDTVAKDALLATFKAGAAGSEINLGQGITAAKLTFTTGTSTDTWVDGSGSFTRT 419
D+ T ++ L A
Sbjct: 344 FDDKTKNESAKLSDLEANNAVKG------------------------------------- 366

Query: 420 QKYDTTYTVDPNTGKATVKSGTGTGDYAPKVGATAYVNSSGKLTTETTSKGGKTSDPLKT 479
++ TV+ A T S + + + T++PL +
Sbjct: 367 ---ESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAAAAKKSTANPLAS 423

Query: 480 LDAAFTKLDKLTGELGAVQNRLESTIANLNNVVNNLSSARSRIQDADYATEVSNMSKAQI 539
+D+A +K+D + LGA+QNR +S I NL N V NL+SARSRI+DADYATEVSNMSKAQI
Sbjct: 424 IDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYATEVSNMSKAQI 483

Query: 540 LQQAGTSVLAQANQVPQTVLSLLR 563
LQQAGTSVLAQANQVPQ VLSLLR
Sbjct: 484 LQQAGTSVLAQANQVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_042501  ACRIFLAVINRP  37  1e-04  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 36.7 bits (85), Expect = 1e-04
Identities = 12/44 (27%), Positives = 23/44 (52%)

Query: 4 VIGFLVVIVSVIGSFVALGGHMGALYQPFELTLIFGAAFGAFLA 47
++G +V+ +V GG GA+Y+ F +T++ A +A
Sbjct: 442 LVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSVLVA 485


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_042511  OMPADOMAIN  44  4e-07  OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 44.2 bits (104), Expect = 4e-07
Identities = 34/126 (26%), Positives = 56/126 (44%), Gaps = 11/126 (8%)

Query: 167 FATGRAEVQPYMRDILRELGPVLNEL---PNKVSISGHTDASQYARGERAYSNWELSADR 223
F +A ++P + L +L L+ L V + G+TD G AY N LS R
Sbjct: 223 FNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDR----IGSDAY-NQGLSERR 277

Query: 224 ANASRQELVAGGMNESKVMRIQGLSSSMSLVKDDPYAAVNRRISLVVLNQSTQRRIENEN 283
A + L++ G+ K+ +G+ S + V + V +R +L+ + RR+E E
Sbjct: 278 AQSVVDYLISKGIPADKI-SARGMGES-NPVTGNTCDNVKQRAALIDC-LAPDRRVEIEV 334

Query: 284 AAAADV 289
DV
Sbjct: 335 KGIKDV 340


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_042521  HTHFIS  81  3e-21  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 80.6 bits (199), Expect = 3e-21
Identities = 27/104 (25%), Positives = 46/104 (44%), Gaps = 2/104 (1%)

Query: 6 TLTGAGWKVLTAGNGQEALEVAKSHPVDLVVSDWNMPVMGGLQLIQGLREQEQYLDVPVL 65
L+ AG+ V N + DLVV+D MP L+ ++ + D+PVL
Sbjct: 22 ALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDLLPRIK--KARPDLPVL 79

Query: 66 VLTTEDDVDSKMAARDLGVCGWLSKPVDPDVLVELASELLDEQS 109
V++ ++ + + A + G +L KP D L+ + L E
Sbjct: 80 VMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_042531  PF06580  43  2e-06  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 43.3 bits (102), Expect = 2e-06
Identities = 23/151 (15%), Positives = 51/151 (33%), Gaps = 52/151 (34%)

Query: 414 ELDKSLIERIIDPLT--HLVRNSLDHGIETPEKRVAAGKDPVGQLVLSAQHNGGNIVIEV 471
+++ ++++ + P+ LV N + HGI G+++L + G + +EV
Sbjct: 245 QINPAIMDVQVPPMLVQTLVENGIKHGIA--------QLPQGGKILLKGTKDNGTVTLEV 296

Query: 472 SDDGAGLNREKILKKAMAQGLPVNENSPDDEIWQLIFAPGFSTAEKVTDISGRGVGMDVV 531
+ G+ + G G+ V
Sbjct: 297 ENTGSLALKN--------------------------------------TKESTGTGLQNV 318

Query: 532 RRNIQDMGG---HVQLSCEPGNGTTTRIVLP 559
R +Q + G ++LS + G +++P
Sbjct: 319 RERLQMLYGTEAQIKLSEKQG-KVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_042571  HTHFIS  53  6e-10  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 53.3 bits (128), Expect = 6e-10
Identities = 36/152 (23%), Positives = 64/152 (42%), Gaps = 25/152 (16%)

Query: 27 ARELIKKHNPDVLTLDVEMPRMDGLDFLEKLMRLRP-MPVVMVSSLTERGGEITLRALEL 85
I + D++ DV MP + D L ++ + RP +PV+++S+ ++A E
Sbjct: 39 LWRWIAAGDGDLVVTDVVMPDENAFDLLPRIKKARPDLPVLVMSAQNT--FMTAIKASEK 96

Query: 86 GAIDFVTKPKLGIRDGLLEYTEIIADKIRAASRAKLRAPSPHAPAAAPAPMLRRPLSSSE 145
GA D++ KP + TE+I RA + K R + S +
Sbjct: 97 GAYDYLPKP--------FDLTELIGIIGRALAEPKRRPS-------------KLEDDSQD 135

Query: 146 KLVIVGASTGGTEAIREVLQPLPPDSPAILIT 177
+ +VG S E R + + + D ++IT
Sbjct: 136 GMPLVGRSAAMQEIYRVLARLMQTDLT-LMIT 166


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_042581  HTHFIS  85  2e-22  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 84.9 bits (210), Expect = 2e-22
Identities = 31/107 (28%), Positives = 53/107 (49%), Gaps = 3/107 (2%)

Query: 5 GIKILVVDDFPTMRRIIRNLLKELGFENVDEAEDGAIGLEKLRNGGFQFVVSDWNMPNLD 64
G ILV DD +R ++ L G++ V + A + G VV+D MP+ +
Sbjct: 3 GATILVADDDAAIRTVLNQALSRAGYD-VRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 65 GLEMLKQIRADASLASLPVLMVTAEAKKENIVAAAQAGANGYVVKPF 111
++L +I+ LPVL+++A+ + A++ GA Y+ KPF
Sbjct: 62 AFDLLPRIKKA--RPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPF 106


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_042601  CHANLCOLICIN  30  0.026  Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 30.0 bits (67), Expect = 0.026
Identities = 53/323 (16%), Positives = 112/323 (34%), Gaps = 36/323 (11%)

Query: 62 AQLAQEQARAAEALAASGAQIAALSA-----------STSTHAREIADVSGQNLRAAEQA 110
A+ A AAEA A + A AL+ ++ +++ N A +
Sbjct: 67 AEQAARAKAAAEAQAKAKANRDALTQRLKDIVNEALRHNASRTPSATELAHANNAAMQAE 126

Query: 111 LAELSEVK-ERVDRMTREMA--AFTDVVGQLTDRARSVGDISKLIKDIALQTQLLALNAG 167
L K E R E A AF + + + R + + +K + + LA A
Sbjct: 127 DERLRLAKAEEKARKEAEAAEKAFQEAEQRRKEIEREKAETERQLKLAEAEEKRLA--AL 184

Query: 168 VEAARAGD-AGRGFAVVASEVGRLAERVNAATSDIG-----------RHTGEMLELVDST 215
E A+A + A + + SEV ++ + S + G+ EL ++
Sbjct: 185 SEEAKAVEIAQKKLSAAQSEVVKMDGEIKTLNSRLSSSIHARDAEMKTLAGKRNELAQAS 244

Query: 216 QRQTGTLREDVDASGAVLDKTR-----QDFQHFVRDFDNMNRQVGEVVQAIGEVDATNHG 270
+ S D + + + V + +V + ++ N
Sbjct: 245 AKYKELDELVKKLSPRANDPLQNRPFFEATRRRVGAGKIREEKQKQVTASETRINRINAD 304

Query: 271 MSQDVSRIAALSADVRERVASMS-GEIDRIRRQTESVQEVLSDMRTGNTAF-DRLSEAL- 327
++Q I+ +S + +A + E + + Q + + D +F L+E
Sbjct: 305 ITQIQKAISQVSNNRNAGIARVHEAEENLKKAQNNLLNSQIKDAVDATVSFYQTLTEKYG 364

Query: 328 DAFRAAATRLLEQARARGLDVFD 350
+ + A L ++++ + + +
Sbjct: 365 EKYSKMAQELADKSKGKKIGNVN 387


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_042611  TYPE3IMSPROT  366  e-128  Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 366 bits (940), Expect = e-128
Identities = 93/343 (27%), Positives = 176/343 (51%), Gaps = 2/343 (0%)

Query: 8 EKTEAASPRRLEKAREEGQIARSRELGTFMMLAVGVGAIWAGGGTIYKGLSGVLRNGLAF 67
EKTE +P+++ AR++GQ+A+S+E+ + ++ + ++ S ++ +
Sbjct: 4 EKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLML--IPA 61

Query: 68 DQRVVADPGVMVEQAVNGFGHALMVILPIFGLLAVIAVLSSVLLGGIVISGKPLSPNFSK 127
+Q + + N + P+ + A++A+ S V+ G +ISG+ + P+ K
Sbjct: 62 EQSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIKK 121

Query: 128 LSLFAGLKRMFSAQTVVELIKALAKAMLVGGVAVWILWRYHDDMLGLMHVAPSAALTKAL 187
++ G KR+FS +++VE +K++ K +L+ + I+ +L L
Sbjct: 122 INPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLLG 181

Query: 188 SLVAMCCAFIVASLLVIVLLDVPWQIWSHLKKLRMSKEDVRQEHKESEGDPHTKARIRQQ 247
++ +VI + D ++ + ++K+L+MSK+++++E+KE EG P K++ RQ
Sbjct: 182 QILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQF 241

Query: 248 QRQAARRRMMSEVPKADVVVTNPTHYAVALKYEEDKNGAPRVLAKGTGLIAAKIRELAAE 307
++ R M V ++ VVV NPTH A+ + Y+ + P V K T +R++A E
Sbjct: 242 HQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIAEE 301

Query: 308 HRIPTLEAPPLARALHQHVELGQEIPAELYTAVAEVLAWVFQL 350
+P L+ PLARAL+ + IPAE A AEVL W+ +
Sbjct: 302 EGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQ 344


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_042631  PF03544  40  3e-05  Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 40.0 bits (93), Expect = 3e-05
Identities = 20/77 (25%), Positives = 26/77 (33%), Gaps = 3/77 (3%)

Query: 144 PEQEVLQEPVARQAPPAAVP---PAAPVVAAPAPAAPVRVEMPVLPPARPAPSVAPSPMA 200
L+ P A Q PP V P + P APV +E P P V
Sbjct: 55 VAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQP 114

Query: 201 RMPDAPLRNGPAMPPAA 217
+ P+ + PA P
Sbjct: 115 KRDVKPVESRPASPFEN 131



Score = 38.8 bits (90), Expect = 6e-05
Identities = 28/132 (21%), Positives = 38/132 (28%), Gaps = 7/132 (5%)

Query: 89 PPPAMPASYGLGAHAAMAPPPHTAPIAPPAAPQPGYAVPSRSIAAYQSAYATPGVPEQEV 148
P PA P S + A A + PP P P P A + + +
Sbjct: 44 PAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEP----IPEPPKEAPVVIEKPKP 99

Query: 149 LQEPVARQAPPAAVPPAAPVVAAPAPAAPVRVEMPVLPPARP--APSVAPSPMARMPDAP 206
+P + P PA+P P P + A + P
Sbjct: 100 KPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASGPRA 159

Query: 207 L-RNGPAMPPAA 217
L RN P P A
Sbjct: 160 LSRNQPQYPARA 171



Score = 38.0 bits (88), Expect = 1e-04
Identities = 22/118 (18%), Positives = 33/118 (27%), Gaps = 5/118 (4%)

Query: 71 AGTPAPQAQPAPVAPRLQPPPAMPASYGLGAHAAMAPPPHTAPIAPPAAPQPGYAVPSRS 130
A PQA P P ++P P A + P P P R
Sbjct: 58 ADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRD 117

Query: 131 IAAYQSAYATPGVPEQEVLQEPVARQAPPAAVPPAAPVVAAPAPAAPVRVEMPVLPPA 188
+ +S A+P P + A + PV + + + P P
Sbjct: 118 VKPVESRPASPFENTA-----PARPTSSTATAATSKPVTSVASGPRALSRNQPQYPAR 170



Score = 35.7 bits (82), Expect = 5e-04
Identities = 22/138 (15%), Positives = 33/138 (23%), Gaps = 7/138 (5%)

Query: 73 TPAPQAQPAPVAPRLQPPPAMPASYGLGAHAAMAPPPHTAPIAPPAAPQPGYAVPSRSIA 132
PA VAP PP + P P PI P P +
Sbjct: 45 APAQPISVTMVAPADLEPPQAVQP---PPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKP 101

Query: 133 AYQSAYATPGVPEQEVLQEPVARQAPPAAVPPAAPVVAAPAPAAPVRVEMPVLPPARPAP 192
+ + ++ +R A P AP + A + + P
Sbjct: 102 KPKPKPVKKVEQPKRDVKPVESRPASPF--ENTAPARPTSSTATAATSKPVTSVASGPRA 159

Query: 193 SVAPSPMARMPDAPLRNG 210
P + P
Sbjct: 160 LSRNQP--QYPARAQALR 175


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_042681  FLGHOOKAP1  30  0.004  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 29.5 bits (66), Expect = 0.004
Identities = 12/41 (29%), Positives = 20/41 (48%)

Query: 96 SMPNVDPVAETVNMIAASRSYQANVEVLNTAKSLMQKTLTI 136
S+ V+ E N+ + Y AN +VL TA ++ + I
Sbjct: 505 SISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_042701  FLGHOOKAP1  43  2e-06  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 43.0 bits (101), Expect = 2e-06
Identities = 24/102 (23%), Positives = 46/102 (45%), Gaps = 9/102 (8%)

Query: 386 QYTNGETSIVGTIVLAD-FANLQGLQPVGNNAWKETATSGQPILGQPGSNGLSKVVGQAT 444
+ ++ G D +A+L +GN TAT N ++++ Q
Sbjct: 453 DLQSNSKTVGGAKSFNDAYASLVSD--IGNK----TATLKT--SSATQGNVVTQLSNQQQ 504

Query: 445 ESSNVDMSKELVNMIIAQRTYQANSQTIKTQDEIMQVLMNLK 486
S V++ +E N+ Q+ Y AN+Q ++T + I L+N++
Sbjct: 505 SISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546



Score = 31.5 bits (71), Expect = 0.008
Identities = 15/52 (28%), Positives = 23/52 (44%), Gaps = 4/52 (7%)

Query: 1 MNAAAQNLDVIGNNIANSGTVGFKSSTASFAD----VYASSRVGLGTKVAAI 48
+NAA L+ NNI++ G+ T A + A VG G V+ +
Sbjct: 11 LNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSGV 62


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042721  FLGHOOKAP1    42     1e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 42.3 bits (99), Expect = 1e-06
Identities = 11/41 (26%), Positives = 22/41 (53%)

Query: 221 ETSNVNVAEELVNMITTQRAYEMNSKAVKTSDEMLARLTQL 261
S VN+ EE N+ Q+ Y N++ ++T++ + L +
Sbjct: 505 SISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545



Score = 39.6 bits (92), Expect = 8e-06
Identities = 15/78 (19%), Positives = 32/78 (41%), Gaps = 14/78 (17%)

Query: 4 SLWIAKTGLEGQQTSMDVISNNLANVSTNGFKRGRAVFQDLMYQTLRQPGAQVGDATQLP 63
+ A +GL Q +++ SNN+++ + G+ R + + L
Sbjct: 3 LINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI--------------MAQANSTLG 48

Query: 64 SGLQLGTGARVAATERIH 81
+G +G G V+ +R +
Sbjct: 49 AGGWVGNGVYVSGVQREY 66


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042731  FLGLRINGFLGH  197    3e-66    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 197 bits (503), Expect = 3e-66
Identities = 118/215 (54%), Positives = 148/215 (68%), Gaps = 6/215 (2%)

Query: 3 VAGCAMIPPEPIVTGPTTAAPPPPPMPAAQPTGSIY---QPTTYGNYPLFEDRRPRNVGD 59
+ GCA IP P+V G T+A P P P P A GSI+ QP YG PLFEDRRPRN+GD
Sbjct: 19 LTGCAWIPSTPLVQGATSAQPVPGPTPVA--NGSIFQSAQPINYGYQPLFEDRRPRNIGD 76

Query: 60 IVTIVLNEKTNASKNVATNTNRSGSASLGITAAPSFMDSW-ANANLNTDAKGGNVAQGKG 118
+TIVL E +ASK+ + N +R G + G P ++ NA + +A GGN GKG
Sbjct: 77 TLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNARADVEASGGNTFNGKG 136

Query: 119 DSTANNAFTGTITTTVVGVMSNGNLQVAGEKQIAINRGSEYVRFSGVVDPRSITGTNTVS 178
+ A+N F+GT+T TV V+ NGNL V GEKQIAIN+G+E++RFSGVV+PR+I+G+NTV
Sbjct: 137 GANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRFSGVVNPRTISGSNTVP 196

Query: 179 STQVADARIEYRSKGVMDEVQTMGWLQRFFLIASP 213
STQVADARIEY G ++E Q MGWLQRFFL SP
Sbjct: 197 STQVADARIEYVGNGYINEAQNMGWLQRFFLNLSP 231


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042741  FLGPRINGFLGI  381    e-134    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 381 bits (981), Expect = e-134
Identities = 154/357 (43%), Positives = 216/357 (60%), Gaps = 9/357 (2%)

Query: 14 LLAGASAAHAERLKDLASIQGVRGNQLIGYGLVVGLDGSGDQVRQTPFTQQSLTNMLSQL 73
L + A R+KD+AS+Q R NQLIGYGLVVGL G+GD +R +PFT+QS+ ML L
Sbjct: 19 LSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMRAMLQNL 78

Query: 74 GITVPAGSNMQLKNVAAVMVTTTLPAFARPGQTLDVVVSSMGNAKSLRGGTLLMTPLKGA 133
GIT G + KN+AAVMVT LP FA PG +DV VSS+G+A SLRGG L+MT L GA
Sbjct: 79 GITTQGGQS-NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLIMTSLSGA 137

Query: 134 DSQVYAIAQGNILVGGAGASAGGSSVQINQLNGGRISGGAIVERGVPTTFARDGYIYLEM 193
D Q+YA+AQG ++V G A +++ R+ GAI+ER +P+ F + L++
Sbjct: 138 DGQIYAVAQGALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDSVNLVLQL 197

Query: 194 NNTDFGTAQNVVNALNR----KFGQGTATALDGRVVQVRTPMEQASQARFLSQVEDLQVT 249
N DF TA V + +N ++G A D + + V+ P A R ++++E+L V
Sbjct: 198 RNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKP-RVADLTRLMAEIENLTVE 256

Query: 250 RAPTVAKVIINARTGSVVMNRTVMIEEAAVAHGNLSVIINRQNQVSQPDTPFTEGQTVVV 309
T AKV+IN RTG++V+ V I AV++G L+V + QV QP PF+ GQT V
Sbjct: 257 -TDTPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQP-APFSRGQTAVQ 314

Query: 310 PNTQIEVRQDNGSLQRVTTSANLADVVKGLNALGATPQDLLAILQAMKTAGALRAEL 366
P T I Q+ + + +L +V GLN++G ++AILQ +K+AGAL+AEL
Sbjct: 315 PQTDIMAMQEGSKV-AIVEGPDLRTLVAGLNSIGLKADGIIAILQGIKSAGALQAEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042751  FLGFLGJ       226    9e-75    Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 226 bits (578), Expect = 9e-75
Identities = 128/309 (41%), Positives = 186/309 (60%), Gaps = 13/309 (4%)

Query: 16 SVFDMGRLSDLKRDVTKDPANPGTEQQKQVAKQFEALFLQMMLKRMREATPKEGLFDSQQ 75
+ +D L++LK +DPA + VA+Q E +F+QMMLK MR+A PK+GLF S+
Sbjct: 11 AAWDAQSLNELKAKAGEDPAA----NIRPVARQVEGMFVQMMLKSMRDALPKDGLFSSEH 66

Query: 76 TQMLQSMADEQLALHL-ATPGIGLSQSILAQMQQGKPGDLPAEAVQRLGQGTDLDFQTGG 134
T++ SM D+Q+A + A G+GL++ ++ QM +P + + +T
Sbjct: 67 TRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPM----KFPLETVV 122

Query: 135 SRQVSALMDVMRNNRASDRALAAAEGAPEHVINFVSKMSRAATLASQQSGVPARLIMGQA 194
Q AL +++ + + P F++++S A LASQQSGVP LI+ QA
Sbjct: 123 RYQNQALSQLVQKAVPRN----YDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQA 178

Query: 195 ALESGWGQREIKHEDGRTSYNLFGIKAGPSWKGKVVNVLTTEYEDGVAKKVTQPFRAYSS 254
ALESGWGQR+I+ E+G SYNLFG+KA +WKG V + TTEYE+G AKKV FR YSS
Sbjct: 179 ALESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSS 238

Query: 255 YEESFADYARLIGNSPRYEAVTQARNEIEAARRIQSAGYATDPQYAQKLIGVMSQLRGAA 314
Y E+ +DY L+ +PRY AVT A + + A+ +Q AGYATDP YA+KL ++ Q++ +
Sbjct: 239 YLEALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSIS 298

Query: 315 AKVDISRQM 323
KV + M
Sbjct: 299 DKVSKTYSM 307


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042761  FLGHOOKAP1    327    e-108    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 327 bits (840), Expect = e-108
Identities = 208/556 (37%), Positives = 319/556 (57%), Gaps = 21/556 (3%)

Query: 7 ALGGLNASQAGLATTGHNINNATTVGYNRQRVMISTAGAQATTNGYIGRGVQVDTVVRSY 66
A+ GLNA+QA L T +NI++ GY RQ +++ A + G++G GV V V R Y
Sbjct: 7 AMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSGVQREY 66

Query: 67 DSFLYKQLVGAQGSGAQLQAQLDQVSQVNNLFADRTVGIAPGLTNFFTSMNTVASKPADP 126
D+F+ QL AQ + L A+ +Q+S+++N+ + T +A + +FFTS+ T+ S DP
Sbjct: 67 DAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLVSNAEDP 126

Query: 127 AARQDLLGKANSLTTQIRSAYQEMQNQRLGLNTQITTTVEQVNSYLTRINDLNSQISTAR 186
AARQ L+GK+ L Q ++ Q +++Q +N I +V+Q+N+Y +I LN QIS
Sbjct: 127 AARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLNDQISRLT 186

Query: 187 AKAGGNPPNDLLDQRDQAVSELNQLIGVT-TYEQGDKLSISLASGGQALLSGDTVYPLQA 245
G PN+LLDQRDQ VSELNQ++GV + + G +I++A+ G +L+ G T L A
Sbjct: 187 GVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMAN-GYSLVQGSTARQLAA 245

Query: 246 VSSAKDVSRTVIAYTLPAGSGGKTVAVELNDAEVTGGKLGGLLQFRATSLDFMQAQLGQM 305
V S+ D SRT +AY G +E+ + + G LGG+L FR+ LD + LGQ+
Sbjct: 246 VPSSADPSRTTVAYV-----DGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL 300

Query: 306 AVGLALSFNEQHRQGLDTAGNPGTDFFSIGKPQGVPNAGNKSGAQISGEFTNVNNINAKD 365
A+ A +FN QH+ G D G+ G DFF+IGKP + N NK I T+ + + A D
Sbjct: 301 ALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATD 360

Query: 366 YEISFDGTNYMVTRLPEGTQVYNGPATGTPPTATLDLN---AEMGVTLTIDSPPQAGDKW 422
Y+ISFD + VTR A+ T T T D N A G+ LT P D +
Sbjct: 361 YKISFDNNQWQVTR----------LASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSF 410

Query: 423 ALSPTRDAARDIKVIITDPEKVAAADTE-GGDANGKNALKLAQLQNTKVLGHGTMSITEM 481
L P DA ++ V+ITD K+A A E GD++ +N L LQ+ G S +
Sbjct: 411 TLKPVSDAIVNMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDA 470

Query: 482 FGQVVNTVGVQTAQIQSANTAQKNLIAQKTAAQQAVSGVNLNEEYVSLSLYQEQYQASAR 541
+ +V+ +G +TA +++++ Q N++ Q + QQ++SGVNL+EEY +L +Q+ Y A+A+
Sbjct: 471 YASLVSDIGNKTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQ 530

Query: 542 IIDVASTLFDTLLGLR 557
++ A+ +FD L+ +R
Sbjct: 531 VLQTANAIFDALINIR 546


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042771  FLAGELLIN     44     9e-07    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 43.9 bits (103), Expect = 9e-07
Identities = 60/363 (16%), Positives = 120/363 (33%), Gaps = 23/363 (6%)

Query: 1 MRLSTSMMYSNGLKGVLAQESDMNRLVEQVGSGKKFLTPADDPLSASLAINVAQTQSMNS 60
++T+ + + +S ++ +E++ SG + + DD +A AI T ++
Sbjct: 2 QVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDD--AAGQAIANRFTSNIKG 59

Query: 61 TYQLNRNT--AKTNLGQENNVLDSITTALADVRTRVVQAGNGTFADSDRQALSTALKNAR 118
Q +RN + L+ I L VR VQA NGT +DSD +++ ++
Sbjct: 60 LTQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRL 119

Query: 119 DALLGLANSTDGNGQYLFSGYQGGVVPYSQDTNGKI------VYSGATGERTVQVDQSRQ 172
+ + ++N T NG + S + + I + + G V+ ++
Sbjct: 120 EEIDRVSNQTQFNGVKVLSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKE 179

Query: 173 MSTSDIGSDIFNRANPGSQAYVSTAAQANTGTAQFSTVSVTPGSPNIGKDFRLQFESDPA 232
+ D+ S N + A + + + + T + P D
Sbjct: 180 ATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTV------------PDKV 227

Query: 233 TGNMGYRVITTDPNANPPVPPVTTPAPPAAPTVYTPDAAIDFGGVSVVIKGTPQNGDVID 292
N +TTD N + A T A G G
Sbjct: 228 YVNAANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFD-YKGVTFT 286

Query: 293 VQSVQSADVDMFNTLDSLIKTLDSPIAGDPVALAKLNNELATANKKLSTNYDNVQTVAAS 352
+ + D + + + + +A A ++ ++K + T+ N Q
Sbjct: 287 IDTKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDD 346

Query: 353 VGA 355

Sbjct: 347 KTK 349


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042791  PF03544       33     0.002    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 33.4 bits (76), Expect = 0.002
Identities = 26/122 (21%), Positives = 35/122 (28%), Gaps = 10/122 (8%)

Query: 518 DVIEVPARQLAQQHSAPRVAAAQTE----AQVSAARTAKPAAQAAAKPEPAAEP----DH 569
VIE+PA AQ S VA A E Q +P + PEP E +
Sbjct: 39 QVIELPAP--AQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEK 96

Query: 570 SPVTPPAAPAPRLAQPVRARPTANSGATAARPLRRPVVKTTDATDVKPVKPAPPAARRAP 629
P P P R + A P ++ P + +
Sbjct: 97 PKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASG 156

Query: 630 PA 631
P
Sbjct: 157 PR 158



Score = 31.1 bits (70), Expect = 0.009
Identities = 18/99 (18%), Positives = 27/99 (27%), Gaps = 1/99 (1%)

Query: 534 PRVAAAQTEAQVSAARTAKPAAQAAAKPEPAAEPDHSPVTPPAAPAPRLAQPVRARPTAN 593
P A + V+ A P A PEP EP+ P P P + +P
Sbjct: 44 PAPAQPISVTMVAPADLEPPQAVQP-PPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPK 102

Query: 594 SGATAARPLRRPVVKTTDATDVKPVKPAPPAARRAPPAD 632
+ + +P A R +
Sbjct: 103 PKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSST 141


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042801  PF05272       35     7e-04    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 35.4 bits (81), Expect = 7e-04
Identities = 13/54 (24%), Positives = 18/54 (33%)

Query: 510 ARLTHSAPRAKTATAEAASAARPPRRPAPRAAANDAKPAPAATRRQAPADDDWE 563
AR + + TA A A PP++ P A A P +W
Sbjct: 380 ARALLADVSSPTAAAGGAGGGEPPKKRDPSAGAGTDPGGPGGGDDGEDPFGEWL 433


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042851  TYPE3IMRPROT  167    3e-53    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 167 bits (425), Expect = 3e-53
Identities = 121/252 (48%), Positives = 179/252 (71%), Gaps = 1/252 (0%)

Query: 1 MFSFTIEQLNGWIGQFLWPFVRILALVGTAPLFSESTVPVKVKIGLAFVLAVAVSPALDP 60
M T EQ W+ + WP +R+LAL+ TAP+ SE +VP +VK+GLA ++ A++P+L P
Sbjct: 1 MLQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSL-P 59

Query: 61 APPIAPGSFAGLWMVMQQVLIGIAMGFTMRLVFAAVQTAGEFVGLQMGLSFASFFDPSTG 120
A + SF LW+ +QQ+LIGIA+GFTM+ FAAV+TAGE +GLQMGLSFA+F DP++
Sbjct: 60 ANDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASH 119

Query: 121 ANTAVLSRLFNIVAMLTFLALDGHLLVLAALVRSFDTLPVAAIQLHQNGWGVVVEWGKTV 180
N VL+R+ +++A+L FL +GHL +++ LV +F TLP+ L+ N + + + G +
Sbjct: 120 LNMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLI 179

Query: 181 FVSGLLLALPLICALLTINLAMGILNRAAPQLSVFSVGFPVSLIVGVVLLTVVLPNSGPF 240
F++GL+LALPLI LLT+NLA+G+LNR APQLS+F +GFP++L VG+ L+ ++P PF
Sbjct: 180 FLNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPF 239

Query: 241 LESLFESGLSAM 252
E LF + +
Sbjct: 240 CEHLFSEIFNLL 251


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042861  TYPE3IMQPROT  60     8e-16    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 60.2 bits (146), Expect = 8e-16
Identities = 26/78 (33%), Positives = 46/78 (58%)

Query: 4 ETVMTMTYQAMKIALAMAGPLLVITLIVGLVISIFQAATQINEMTLSFIPKLLAMCGVLV 63
+ ++ +A+ + L ++G ++ I+GL++ +FQ TQ+ E TL F KLL +C L
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 LLGPWLIGIMVDYIRQLI 81
LL W +++ Y RQ+I
Sbjct: 62 LLSGWYGEVLLSYGRQVI 79


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042871  FLGBIOSNFLIP  287    e-100    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 287 bits (735), Expect = e-100
Identities = 162/239 (67%), Positives = 190/239 (79%), Gaps = 2/239 (0%)

Query: 21 ALLGLALFPAGVVAQATLPALTATPGPGGSETYSLSMQTLLLMTSLSFLPAALLMMTGFT 80
A + L L AQ LP +T+ P PGG +++SL +QTL+ +TSL+F+PA LLMMT FT
Sbjct: 8 APVLLWLITPLAFAQ--LPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLMMTSFT 65

Query: 81 RIIIVLGLLRSAMGTAMSPPNHVLIGLALFLTFYTMSPVFDKIYTDAYKPLSEGSIQFEA 140
RIIIV GLLR+A+GT +PPN VL+GLALFLTF+ MSPV DKIY DAY+P SE I +
Sbjct: 66 RIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEKISMQE 125

Query: 141 AVERAAGPLRTFMLHQTRENDLALFANLAKQPALEDPSQVPLRILVPAFITSELKTAFQI 200
A+E+ A PLR FML QTRE DL LFA LA L+ P VP+RIL+PA++TSELKTAFQI
Sbjct: 126 ALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELKTAFQI 185

Query: 201 GFTIFIPFLIIDLVVASVLMALGMMMVPPVTVALPFKLMLFVLADGWNLLMGSLAQSFY 259
GFTIFIPFLIIDLV+ASVLMALGMMMVPP T+ALPFKLMLFVL DGW LL+GSLAQSFY
Sbjct: 186 GFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLAQSFY 244


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042891  FLGMOTORFLIN  143    2e-46    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 143 bits (361), Expect = 2e-46
Identities = 89/157 (56%), Positives = 105/157 (66%), Gaps = 28/157 (17%)

Query: 28 NPPTQADGLKPAQDD-WAAAMAEQTAATPPATAAPAPAAAAPAAPAPAAPAAQSAAQSVF 86
NP + G A DD WA A+ EQ A T +SAA +VF
Sbjct: 6 NPSDENTG---ALDDLWADALNEQKATT-----------------------TKSAADAVF 39

Query: 87 KPLAGSA-AGNGTDIDLIMDVPVQLTVELGRTRLTIKNLLQLGQGSVVELDGLAGEPMDI 145
+ L G +G DIDLIMD+PV+LTVELGRTR+TIK LL+L QGSVV LDGLAGEP+DI
Sbjct: 40 QQLGGGDVSGAMQDIDLIMDIPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDI 99

Query: 146 FVNGYLIAQGEVVVVEDKYGIRLTDIITPSERINRLN 182
+NGYLIAQGEVVVV DKYG+R+TDIITPSER+ RL+
Sbjct: 100 LINGYLIAQGEVVVVADKYGVRITDIITPSERMRRLS 136


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042901  FLGMOTORFLIM  272    4e-92    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 272 bits (697), Expect = 4e-92
Identities = 85/282 (30%), Positives = 147/282 (52%), Gaps = 5/282 (1%)

Query: 7 LSQDEVDALLAGV-TGESDSK-KESAGPNDGARAYDLSSPDRVVRRRMQTLELINERFAR 64
LSQDE+D LL + +G++ + YD PD+ + +M+TL L++E FAR
Sbjct: 5 LSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHETFAR 64

Query: 65 QMRSVLLNFMRRSADITVGSIKIQKYADFERNLPVPSNLNMVQMKPLRGTALFTYDPNLV 124
+ L +R + V S+ Y +F R++P PS L ++ M PL+G A+ DP++
Sbjct: 65 LTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEVDPSIT 124

Query: 125 FLVIDSLFGGDGRYHTRVEGRDFTTTEQRIIRRLLNLTLESYGKSWDPVYPIEFDYVRSE 184
F +ID LFGG G+ RD T E ++ ++ L + +SW V + + E
Sbjct: 125 FSIIDRLFGGTGQAAKVQ--RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRLGQIE 182

Query: 185 MHTKFASITGNNEVVVVTSFHIEFGATGGDLNICLPYSMIEPVRDLL-TRPLQETTLEEV 243
+ +FA I +E+VV+ + + G G +N C+PY IEP+ L ++ +
Sbjct: 183 TNPQFAQIVPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSVRRSS 242

Query: 244 DQRWSQQLSRQVRSADIDVVAEFARIPSSIRELMRMKVGDIL 285
++ L ++ + D+DVVAE + S+R+++ ++VGDI+
Sbjct: 243 TTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDII 284


57  NH44784_043211  NH44784_043331  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_043211  3       20       -3.678276  Methionine aminopeptidase
NH44784_043221  2       16       -2.766506  SSU ribosomal protein S2p (SAe
NH44784_043231  0       10       -1.263620  Translation elongation factor Ts
NH44784_043241  0       12       -0.849708  Uridylate kinase
NH44784_043251  1       11       -1.968261  Ribosome recycling factor
NH44784_043261  1       11       -2.239379  Undecaprenyl pyrophosphate synthetase
NH44784_043271  2       12       -2.129201  Phosphatidate cytidylyltransferase
NH44784_043281  3       12       -2.856025  1-deoxy-D-xylulose 5-phosphate reductoisomerase
NH44784_043291  3       12       -3.277875  Membrane-associated zinc metalloprotease
NH44784_043301  2       11       -3.068670  Outer membrane protein assembly factor YaeT
NH44784_043311  -1      12       -1.065548  Outer membrane chaperone Skp (OmpH) precursor @
NH44784_043321  1       12       0.596630   UDP-3-O-[3-hydroxymyristoyl] glucosamine
NH44784_043331  2       11       0.744845   (3R)-hydroxymyristoyl-[acyl carrier protein]
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_043241  CARBMTKINASE  28     0.025    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 28.3 bits (63), Expect = 0.025
Identities = 15/59 (25%), Positives = 22/59 (37%), Gaps = 13/59 (22%)

Query: 123 LEEGKVVIFAAGTGNPFFTT-------------DTAAALRGAEIGAEIVLKATKVDGIY 168
+E G +VI + G G P D A E+ A+I + T V+G
Sbjct: 183 VERGVIVIASGGGGVPVILEDGEIKGVEAVIDKDLAGEKLAEEVNADIFMILTDVNGAA 241


58  NH44784_043761  NH44784_044041  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_043761  0       10       3.234900   Leucine-responsive regulatory protein, regulator
NH44784_043771  2       11       4.325150   Glutathione peroxidase
NH44784_043781  4       12       4.865056   Major facilitator superfamily (MFS_1)
NH44784_043791  2       10       4.453296   FOG: TPR repeat
NH44784_043801  2       9        4.745207   Transcriptional regulator
NH44784_043811  3       10       4.281372   Probable pyrodoxal phosphate biosynthesis
NH44784_043821  2       11       3.492249   4-hydroxythreonine-4-phosphate dehydrogenase
NH44784_043831  0       11       3.161004   4-hydroxythreonine-4-phosphate dehydrogenase
NH44784_043841  1       10       2.675837   methyl-accepting chemotaxis protein
NH44784_043851  3       12       2.425238   Deoxyribodipyrimidine photolyase
NH44784_043861  3       12       1.478638   CAIB/BAIF family protein
NH44784_043871  4       16       -0.008254  Alkyl hydroperoxide reductase subunit C-like
NH44784_043881  3       15       0.659002   Exonuclease SbcC
NH44784_043891  1       15       0.505441   hypothetical protein
NH44784_043901  0       15       0.587987   hypothetical protein
NH44784_043911  -1      14       0.890803   Hypothetical protein YaeJ with similarity to
NH44784_043921  2       12       1.654112   Transcriptional regulator, ArsR family
NH44784_043931  1       10       3.709211   N-ethylmaleimide reductase
NH44784_043941  3       12       4.005029   CAAX amino terminal protease family protein
NH44784_043951  1       14       3.692656   probable membrane protein NMA1176
NH44784_043961  0       12       4.877557   hypothetical protein
NH44784_043971  -1      9        3.979382   COG3332
NH44784_043981  0       10       3.551286   integral membrane protein
NH44784_043991  0       12       3.198353   Putative uncharacterized protein YeaK
NH44784_044001  1       12       3.562414   two-component response regulator
NH44784_044011  1       11       3.170055   Outer membrane component of tripartite multidrug
NH44784_044021  0       10       1.873242   ABC-type multidrug transport system, permease
NH44784_044031  -2      9        2.640520   ABC-type multidrug transport system, ATPase
NH44784_044041  0       9        3.117779   Predicted membrane fusion protein (MFP)
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_043781  TCRTETB       46     2e-07    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 46.0 bits (109), Expect = 2e-07
Identities = 26/132 (19%), Positives = 54/132 (40%), Gaps = 1/132 (0%)

Query: 26 LPNVATDLDVSIDIAGLLITGYAMGVVIGAPIMAIITARLPRKATLIGLASTFVLGNLLC 85
LP++A D + + T + + IG + ++ +L K L+ G+++
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 86 ALAPGYGT-LMAARVFTAFCHGAFFGIGAVVAADLVPRHQRSSAIALMFTGLTLANVLGV 144
+ + + L+ AR AF + VV A +P+ R A L+ + + + +G
Sbjct: 97 FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGP 156

Query: 145 PLGTALGQVAGW 156
+G + W
Sbjct: 157 AIGGMIAHYIHW 168


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_043801  RTXTOXINC     28     0.029    Gram-negative bacterial RTX toxin-activating protein C...
		>RTXTOXINC#Gram-negative bacterial RTX toxin-activating protein C signature.

Length = 170

Score = 27.9 bits (62), Expect = 0.029
Identities = 19/50 (38%), Positives = 25/50 (50%), Gaps = 9/50 (18%)

Query: 152 PVHPDWIVTPLMAENMGPVMAPSLLPAFQAGDYVALSSRTRPRAWDDWLN 201
P+H +W V+ L A N +LPA QA YV L+ P A+ W N
Sbjct: 21 PLHRNWPVS-LFAIN--------VLPAIQANQYVLLTRDDYPVAYCSWAN 61


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_043831  FLGMOTORFLIM  32     0.002    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 32.2 bits (73), Expect = 0.002
Identities = 20/55 (36%), Positives = 29/55 (52%), Gaps = 3/55 (5%)

Query: 73 LEVVSQDRESELASVRAALELDV-DCL-LGGTHVDEALALLAGSRIRYYPFPGRV 125
++VV++ L SVR L L V D + L THV + L G+R ++ PG V
Sbjct: 259 MDVVAEVGSLRL-SVRDILGLRVGDIIRLHDTHVGDPFVLSIGNRKKFLCQPGVV 312


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_043881  PF03544       34     0.001    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 33.8 bits (77), Expect = 0.001
Identities = 14/94 (14%), Positives = 25/94 (26%), Gaps = 2/94 (2%)

Query: 317 PPAAQAPAQSQSRPPQ--SQSPGQPQSRPSQSPNQPQARPQQAQPQARPQQASRPGPAPD 374
PA P Q+ PP+ + +P+ P P + + + P
Sbjct: 56 APADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPK 115

Query: 375 NLAQGAPREHRAAPANRPEARPQGNPQHGGGRAA 408
+ + N ARP +
Sbjct: 116 RDVKPVESRPASPFENTAPARPTSSTATAATSKP 149



Score = 29.2 bits (65), Expect = 0.032
Identities = 13/96 (13%), Positives = 24/96 (25%), Gaps = 15/96 (15%)

Query: 317 PPAAQAPAQSQSRP----------PQSQSPGQPQSRPSQSPNQPQARP-----QQAQPQA 361
P A Q P + P P+ + +P P + + +P
Sbjct: 63 PQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVKPVE 122

Query: 362 RPQQASRPGPAPDNLAQGAPREHRAAPANRPEARPQ 397
+ AP + P + P+
Sbjct: 123 SRPASPFENTAPARPTSSTATAATSKPVTSVASGPR 158


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_044021  ABC2TRNSPORT  45     2e-07    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 44.9 bits (106), Expect = 2e-07
Identities = 39/176 (22%), Positives = 72/176 (40%), Gaps = 8/176 (4%)

Query: 208 AMTRERERGTMENLLAMPVRPLEVMTGKIVPYIAIGLIQSTIILLAAYFVFHVPF---LG 264
A R + T E +L +R +++ G++ + I + A + + + L
Sbjct: 90 AFGRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGYTQWLSLLY 149

Query: 265 SLVTVYLGALIFVAANLTVGITLSSLAQNQLQAMQLTMFYFLPNMLLSGFMFPFLGMPKW 324
+L + L L F + + V ++LA + + P + LSG +FP +P
Sbjct: 150 ALPVIALTGLAFASLGMVV----TALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIV 205

Query: 325 AQYLGNLLPLTHFNRLIRGILLKGNGWWDLWPAIWPMLLFTVVVMTAAVKFYRRTL 380
Q LPL+H LIR I+L D+ + + ++ V+ + RR L
Sbjct: 206 FQTAARFLPLSHSIDLIRPIMLGHPV-VDVCQHVGALCIYIVIPFFLSTALLRRRL 260


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_044041  RTXTOXIND     53     7e-10    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 52.9 bits (127), Expect = 7e-10
Identities = 49/360 (13%), Positives = 104/360 (28%), Gaps = 91/360 (25%)

Query: 43 IASSEAGRLDTLSVRRGQQVAAAAPLFALEAERETAARDQAGAQLAAASAQLEDIATGKR 102
I E + + V+ G+ V L L A A + + L A + R
Sbjct: 99 IKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSR 158

Query: 103 PPEVD---------------------------------VARAQLAQAEAASRRSAAQLAR 129
E++ + Q Q E + A+
Sbjct: 159 SIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLT 218

Query: 130 DTAQFRA----GGVARAQLDDSRE------------------------QAQSDAARVREL 161
A+ V +++LDD + + +++ ++
Sbjct: 219 VLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQI 278

Query: 162 RAQLEVANLPGRQ----------DQRRAQAAQVDAARATLAQADWALAQKRLAAPAAALV 211
+++ A + D+ R + LA+ + + AP + V
Sbjct: 279 ESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKV 338

Query: 212 FD-TLYRVGEWVPAGSPVVSLLPPGN-IKVRFFVPETVVGTLKPGQQAMVRCDGCGD--- 266
++ G V ++ ++P + ++V V +G + GQ A+++ +
Sbjct: 339 QQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRY 398

Query: 267 -PVPVRIDYISPQAEYTPPVIYSRETRGKLVYMVEAR------PAPDAATRLHPGQPVEA 319
+ ++ I+ A + R LV+ V + L G V A
Sbjct: 399 GYLVGKVKNINLDA--------IEDQRLGLVFNVIISIEENCLSTGNKNIPLSSGMAVTA 450


59  NH44784_044381  NH44784_044471  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_044381  2       11       3.627055   D-amino acid dehydrogenase small subunit
NH44784_044391  4       11       3.891680   DNA double-strand break repair protein Mre11
NH44784_044401  4       11       3.253262   DNA double-strand break repair Rad50 ATPase
NH44784_044411  3       16       -0.197803  hypothetical protein
NH44784_044421  2       11       0.572687   Permease of the drug/metabolite transporter
NH44784_044431  1       13       0.459107   hypothetical protein
NH44784_044441  0       12       0.503637   hypothetical protein
NH44784_044451  0       12       0.209697   hypothetical protein
NH44784_044461  1       13       -0.077561  hypothetical protein
NH44784_044471  3       15       0.691702   Na(+) H(+) antiporter subunit A; Na(+) H(+)
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_044401  RTXTOXIND     38     2e-04    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 37.5 bits (87), Expect = 2e-04
Identities = 23/188 (12%), Positives = 51/188 (27%), Gaps = 21/188 (11%)

Query: 479 ALKRELAQMQAASAALLARLGVAHAAEAEE------RHARGIALQRELDAMRKTLAIHAP 532
L R + + L + +E E + Q + L
Sbjct: 155 ILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKR- 213

Query: 533 KGVDALRGQRNEAQARRAQLAERLALLPPAAEADDVDPPAARQALRDAEASAAQAEQALA 592
+ + N + RL + A+ A+ + E +A L
Sbjct: 214 AERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAI----AKHAVLEQENKYVEAVNELR 269

Query: 593 AVQRALDADGARAQLLETQAAARGADLQSPERAAQ-RQDRAGRLAEARNGHDLLEQRLHQ 651
++QL + ++ A + + + +L + + LL L +
Sbjct: 270 V---------YKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAK 320

Query: 652 AQAALEAL 659
+ +A
Sbjct: 321 NEERQQAS 328



Score = 30.2 bits (68), Expect = 0.040
Identities = 31/213 (14%), Positives = 64/213 (30%), Gaps = 20/213 (9%)

Query: 494 LLARLGVAHAAEAEERHARGIALQRELDAMRKTLAIHAPKGVDALRGQRNEAQARRAQLA 553
+L +L A AEA+ + LQ L+ R L + +L
Sbjct: 123 VLLKL-TALGAEADTLKTQSSLLQARLEQTRY----------QILSRSIELNKLPELKLP 171

Query: 554 ERLALLPPAAEADDVDPPAARQALRDAEASAAQAEQALAAVQRALDADGARAQLLETQAA 613
+ + E ++ + Q E L + AR E +
Sbjct: 172 DEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSR 231

Query: 614 ARGADLQSPERAAQRQ--------DRAGRLAEARNGHDLLEQRLHQAQAALEALRPELLE 665
+ L +Q ++ + EA N + + +L Q ++ + + + E +
Sbjct: 232 VEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEE-YQ 290

Query: 666 QDAQRYEKSAAIERDAQHRRHSEIQQLLGKLEQ 698
Q ++ + + L K E+
Sbjct: 291 LVTQLFKNEILDKLRQTTDNIGLLTLELAKNEE 323


60  NH44784_046311  NH44784_046421  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_046311  0       14       3.088206   hypothetical protein
NH44784_046321  0       12       3.022783   hypothetical protein
NH44784_046331  1       15       2.551701   hypothetical protein
NH44784_046341  0       17       1.601844   hypothetical protein
NH44784_046351  0       15       1.275020   3-oxoacyl-[acyl-carrier protein] reductase
NH44784_046361  0       15       0.912463   hypothetical protein
NH44784_046371  2       15       1.904336   hypothetical protein
NH44784_046381  2       16       1.773381   DNA topoisomerase IB (poxvirus type
NH44784_046391  1       13       1.831730   ThiJ/PfpI family protein
NH44784_046401  1       11       3.013488   hypothetical protein
NH44784_046411  0       11       2.624733   hypothetical protein
NH44784_046421  2       10       2.916898   PepSY-associated TM helix domain protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_046341  RTXTOXIND     32     0.002    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.7 bits (72), Expect = 0.002
Identities = 18/165 (10%), Positives = 41/165 (24%), Gaps = 10/165 (6%)

Query: 16 FMLLAAPLAARAAQLSPADSHFLQAAAAADAFQLQAARLASERAASTEVRAFAEKMRSSY 75
F ++ R L F +L + +ER E +
Sbjct: 176 FQNVSEEEVLRLTSL--IKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 76 QERDAGLRQLARAKRITLPAKAEPGDQRA-----LEAMAGKTGEAFDSLYIEQVALQAHE 130
+ R L + I A E ++ L + + I +
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESE--ILSAKEEYQL 291

Query: 131 KSERLYRTAAEQSADAQVRDFALRHQPVLADQLALARALKRDPAA 175
++ ++ L + ++ ++ R P +
Sbjct: 292 VTQLFKNEILDKLRQTTDNIGLLTLELAKNEER-QQASVIRAPVS 335


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_046351  DHBDHDRGNASE  97     7e-26    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 97.0 bits (241), Expect = 7e-26
Identities = 71/237 (29%), Positives = 111/237 (46%), Gaps = 13/237 (5%)

Query: 50 LAGMAAIVTGGDSGIGRAVSVLFAREGADVAIVYLNEHEDARETERAVQAEGRRCLLIAG 109
+ G A +TG GIG AV+ A +GA +A V N E + +++AE R
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNP-EKLEKVVSSLKAEARHAEAFPA 64

Query: 110 DVRESAFCDRAVEQAAQAFGRLDVLVNNAAYQQHDEGLSAISDEKWDKTLRTNIYGYFYM 169
DVR+SA D + + G +D+LVN A + ++SDE+W+ T N G F
Sbjct: 65 DVRDSAAIDEITARIEREMGPIDILVN-VAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 170 ARAILPHLRA--GAAIINTGSVTGLRGSGGLLDYSTSKGAIHAFTRSLASNLASQGIRVN 227
+R++ ++ +I+ GS + Y++SK A FT+ L LA IR N
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCN 183

Query: 228 AVAPGPVWTPLNP---ADRDADE--IPSFGQDTELGRP----AQPEEISPAYVFLAA 275
V+PG T + AD + E I + + G P A+P +I+ A +FL +
Sbjct: 184 IVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVS 240


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_046401  PRTACTNFAMLY  27     0.003    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 26.9 bits (59), Expect = 0.003
Identities = 11/37 (29%), Positives = 13/37 (35%)

Query: 10 PPKPEPIPPGSPPGDLPVDPDTDEPEVDLPPLEPPPA 46
PP P+P P P P P + P P A
Sbjct: 572 PPAPKPAPQPGPQPPQPPQPQPEAPAPQPPAGRELSA 608


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_046411  PF06580       24     0.043    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 23.7 bits (51), Expect = 0.043
Identities = 8/44 (18%), Positives = 21/44 (47%)

Query: 1 MSTILLIVLILLLIGAVPAWPYSRGWGYYPSGLLGIVLIVLIVL 44
+ I + ++ L+L A ++ +GW G + + ++ V+
Sbjct: 43 IFNIAISLMGLVLTHAYRSFIKRQGWLKLNMGQIILRVLPACVV 86


61  NH44784_046591  NH44784_047251  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_046591  2       12       2.588575   Sensory histidine kinase in two-component
NH44784_046601  4       13       3.673479   hypothetical protein
NH44784_046611  4       12       3.796066   Chromate transport protein ChrA
NH44784_046621  4       10       2.104728   Enoyl-CoA hydratase
NH44784_046631  5       11       1.954144   transcriptional regulator, LysR family
NH44784_046641  3       11       1.034179   Integral membrane protein
NH44784_046651  2       12       0.533338   drug resistance transporter, Bcr/CflA subfamily
NH44784_046661  1       12       0.840849   hypothetical protein
NH44784_046671  2       13       2.502921   Tricarboxylate transport membrane protein TctA
NH44784_046681  4       17       4.830467   Tricarboxylate transport protein TctB
NH44784_046691  3       19       5.319339   hypothetical protein
NH44784_046701  2       13       4.875290   General secretion pathway protein G
NH44784_046711  2       13       5.034201   General secretion pathway protein K
NH44784_046721  -1      10       3.691753   General secretion pathway protein L
NH44784_046731  -1      9        -0.019204  General secretion pathway protein M
NH44784_046741  -1      12       -0.960138  hypothetical protein
NH44784_046751  -1      12       -1.111840  General secretion pathway protein E
NH44784_046761  -1      10       1.867218   General secretion pathway protein F
NH44784_046771  0       11       1.729936   Phosphate-binding DING protein (related to PstS
NH44784_046781  0       12       1.852157   Phosphate-binding DING protein (related to PstS)
NH44784_046791  0       11       3.089983   Phosphate-binding DING protein (related to PstS)
NH44784_046801  0       11       3.348162   hypothetical protein
NH44784_046811  0       11       3.265096   Filamentous haemagglutinin family outer membrane
NH44784_046821  0       12       2.340310   hypothetical protein
NH44784_046831  0       14       2.966134   Extracytoplasmic function (ECF) sigma factor
NH44784_046841  0       15       3.658722   Sigma factor regulator VreR (cytoplasmic
NH44784_046851  4       18       3.550420   Transposase
NH44784_046861  4       18       4.883610   Transposase and inactivated derivatives
NH44784_046871  2       17       4.886376   RNA polymerase sigma-70 factor, ECF subfamily
NH44784_046881  1       17       4.575913   General secretion pathway protein H
NH44784_046891  2       15       3.876204   General secretion pathway protein I
NH44784_046901  3       16       3.231405   General secretion pathway protein J
NH44784_046911  6       17       1.883329   hypothetical protein
NH44784_046921  3       14       0.217169   COGs COG3558
NH44784_046931  4       15       -0.011346  Transcriptional regulator, TetR family
NH44784_046941  3       15       -0.576550  hypothetical protein
NH44784_046951  2       17       -0.650078  hypothetical protein
NH44784_046961  -1      14       -0.387883  hypothetical protein
NH44784_046971  1       15       1.010783   diguanylate cyclase (GGDEF domain
NH44784_046981  -1      15       1.134683   Redox-sensitive transcriptional activator SoxR
NH44784_046991  0       15       2.115215   hypothetical protein
NH44784_047001  0       16       1.994414   hypothetical protein
NH44784_047011  0       16       3.285835   GGDEF family protein
NH44784_047021  0       17       4.029612   Phage-related integrase
NH44784_047031  0       16       2.938012   hypothetical protein
NH44784_047041  0       16       2.633403   hypothetical protein
NH44784_047051  0       14       2.521345   hypothetical protein
NH44784_047061  -1      12       2.930258   FIG00456855: hypothetical protein
NH44784_047071  -1      12       1.864223   ABC-type nitrate/sulfonate/bicarbonate transport
NH44784_047081  0       14       1.742776   FIG00905417: hypothetical protein
NH44784_047091  1       12       3.438842   putative taurine-binding periplasmic protein
NH44784_047101  -1      13       4.384922   Threonine dehydratase, catabolic
NH44784_047111  -1      15       4.019408   hypothetical protein
NH44784_047121  1       15       4.389021   FIG00538434: hypothetical protein
NH44784_047131  3       10       4.004466   hypothetical protein
NH44784_047141  3       9        3.977812   Carboxyl-terminal protease
NH44784_047151  4       9        2.764574   hypothetical protein
NH44784_047161  4       10       2.062132   hypothetical protein
NH44784_047171  3       10       2.114749   FIG00770418: hypothetical protein
NH44784_047181  2       14       1.912394   hypothetical protein
NH44784_047191  0       15       2.393840   tRNA pseudouridine synthase A
NH44784_047201  0       15       2.146880   Muramoyltetrapeptide carboxypeptidase
NH44784_047211  -1      12       2.101412   hypothetical protein
NH44784_047221  0       13       2.630109   hypothetical protein
NH44784_047231  0       12       2.829675   amino acid transporter
NH44784_047241  1       14       2.232036   hypothetical protein
NH44784_047251  2       16       1.709628   transcriptional regulator, XRE family with cupin
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_046651  TCRTETA  52     2e-09    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 52.1 bits (125), Expect = 2e-09
Identities = 57/284 (20%), Positives = 97/284 (34%), Gaps = 16/284 (5%)

Query: 17 LLMGLVTLTPMGIDIYLPSLPAMAAGFEQPMSALQASITLFIFAVGVGQVLIGP----LA 72
+++ V L +GI + +P LP + + + A + + + Q P L+
Sbjct: 9 VILSTVALDAVGIGLIMPVLPGLLRDL-VHSNDVTAHYGILLALYALMQFACAPVLGALS 67

Query: 73 DRYGRRPVALGGALAYLLGSALGAAATSLELFYAARVIQGLGACSASLVAFAAVRDRFSP 132
DR+GRRPV L + A+ A A L + Y R++ G+ + + VA A + D
Sbjct: 68 DRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGA-VAGAYIADITDG 126

Query: 133 AVGARVYSYLNGALCTVPALAPMLGGALAVHLGWRSTFVFMVLFALALALLLAWRFDETR 192
AR + +++ P+LGG + + F L + E+
Sbjct: 127 DERARHFGFMSACFGFGMVAGPVLGGLMG-GFSPHAPFFAAAALNGLNFLTGCFLLPESH 185

Query: 193 AAPPAGRAPLYSLRRYAPIITSGRFLYFAVFGMAGMAMILVFVSAAPVVLVQQLGYSELG 252
R PL V + + I+ V P L G
Sbjct: 186 ---KGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFH 242

Query: 253 FSA------WFGGNAAINIAAFFLAPVFIARFGRHAMLRVGMAA 290
+ A ++A + AR G L +GM A
Sbjct: 243 WDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIA 286


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_046701  BCTERIALGSPG  175    1e-59    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 175 bits (446), Expect = 1e-59
Identities = 67/140 (47%), Positives = 89/140 (63%), Gaps = 6/140 (4%)

Query: 16 RRQRGFTLIEIMVVIVIMGILAALIVPRLLDRPDQARAVAARQDISALMQALKLYRLDNG 75
+QRGFTL+EIMVVIVI+G+LA+L+VP L+ ++A A DI AL AL +Y+LDN
Sbjct: 5 DKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKLDNH 64

Query: 76 SYPNTEQGLRALVERPASGPGASSAWRA--YLDRLPNDPWGHPYQYLNPGTRGEIDVFSL 133
YP T QGL +LVE P P A++ + Y+ RLP DPWG+ Y +NPG G D+ S
Sbjct: 65 HYPTTNQGLESLVEAPTLPPLAAN-YNKEGYIKRLPADPWGNDYVLVNPGEHGAYDLLSA 123

Query: 134 GADGKPDGDAGNADIGSWQL 153
G DG+ + DI +W L
Sbjct: 124 GPDGEMGTE---DDITNWGL 140


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_046761  BCTERIALGSPF  352    e-121    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 352 bits (904), Expect = e-121
Identities = 168/406 (41%), Positives = 246/406 (60%), Gaps = 5/406 (1%)

Query: 1 MADFRFEAADALGRIQRGVLSADSARAARAQLRAQGLTALAV---RLSARRGAGSGELFA 57
MA + ++A DA G+ RG ADSAR AR LR +GL L+V R ++ +G
Sbjct: 1 MAQYHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLSLR 60

Query: 58 --ARLSDSDLAWATRQLASLLAAGLPLEGALSATVEQAEKKHIAATFSGVRADVRGGQRF 115
RLS SDLA TRQLA+L+AA +PLE AL A +Q+EK H++ + VR+ V G
Sbjct: 61 RKIRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSL 120

Query: 116 SDALSARPRDFPPIYRALIKAGEDSGDLARIMERLADYIEGRNALRSKVLTAFIYPAIVG 175
+DA+ P F +Y A++ AGE SG L ++ RLADY E R +RS++ A IYP ++
Sbjct: 121 ADAMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLT 180

Query: 176 VVSICIVVFLLAYVVPQVVSAFSQARQELPWITRAMLAASDYVRAWGGLNALVAALLFGL 235
VV+I +V LL+ VVP+VV F +Q LP TR ++ SD VR +G L F
Sbjct: 181 VVAIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMA 240

Query: 236 WRLSQRAEAARLAWHGRILRLPMIGRFALGVNTARFAATLAILADAGVPLLRALDAARQT 295
+R+ R E R+++H R+L LP+IGR A G+NTAR+A TL+IL + VPLL+A+ +
Sbjct: 241 FRVMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGDV 300

Query: 296 LANDCLRRAVLDATERIEGGSGLGAALRQQKVFPPLLCHLVASGEKTGTLARMLDRAAQT 355
++ND R + AT+ + G L AL Q +FPP++ H++ASGE++G L ML+RAA
Sbjct: 301 MSNDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAADN 360

Query: 356 LSADIERRAMAMTAMLEPLMILIMGGVVLTIVLAVMLPIMEINQMV 401
+ + + EPL+++ M VVL IVLA++ PI+++N ++
Sbjct: 361 QDREFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLM 406


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_046811  PF05860  62     5e-13    haemagglutination activity domain.
		>PF05860#haemagglutination activity domain.

Length = 117

Score = 62.1 bits (151), Expect = 5e-13
Identities = 18/93 (19%), Positives = 39/93 (41%), Gaps = 4/93 (4%)

Query: 136 TVVTVDQTADKAILNWETFNVGRNTTVSFKQQASSWSVLNRVNDPRARPSQIQGQIEAPG 195
+ Q +++ F+V + T F + ++++RV S I G I A
Sbjct: 23 IIERGTQAGSNLFHSFQEFSVPTSGTAFFNNPTNIQNIISRVTGGS--VSNIDGLIRANA 80

Query: 196 T--VMIVNRNGIVFSGSSQVNTRNLVAAAVGMS 226
T + ++N NGI+F +++++ +
Sbjct: 81 TANLFLINPNGIIFGQNARLDIGGSFVGSTANR 113


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits         Score  E-value  Comments
NH44784_046821  TONBPROTEIN  30     0.008    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 29.6 bits (66), Expect = 0.008
Identities = 10/46 (21%), Positives = 24/46 (52%)

Query: 131 RAAISFEVSPMGRVERARVLGSTGDRAKDDEIVLALDQAVFDKAPP 176
+ + F+V+P GRV+ ++L + + E+ A+ + ++ P
Sbjct: 175 QVKVKFDVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKP 220


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_046881  BCTERIALGSPH  47     4e-09    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 46.9 bits (111), Expect = 4e-09
Identities = 18/73 (24%), Positives = 34/73 (46%)

Query: 24 QRGFTLIELMVVLVIVGVATAALGLSIRSDPARQLRDDAQRLVERLAAAQSEVRIDGRAI 83
QRGFTL+E+M++L+++GV+ + L+ + R +L Q G+
Sbjct: 3 QRGFTLLEMMLILLLMGVSAGMVLLAFPASRDDSAAQTLARFEAQLRFVQQRGLQTGQFF 62

Query: 84 AWQADADGYRFVR 96
D ++F+
Sbjct: 63 GVSVHPDRWQFLV 75


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_046891  PilS_PF08805  31     0.001    PilS N terminal
		>PilS_PF08805#PilS N terminal

Length = 185

Score = 31.1 bits (70), Expect = 0.001
Identities = 10/38 (26%), Positives = 20/38 (52%)

Query: 18 AGFTLLEILIALAIVSVALAAVMRTTGMLTTNNGVLRE 55
G TL+E+L+ + ++ V A+ + M+ +N E
Sbjct: 26 KGATLMEVLLVVGVIVVLAASAYKLYSMVQSNIQSSNE 63


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_046901  PF05946  32     0.001    Toxin-coregulated pilus subunit TcpA
		>PF05946#Toxin-coregulated pilus subunit TcpA

Length = 199

Score = 32.2 bits (73), Expect = 0.001
Identities = 23/73 (31%), Positives = 37/73 (50%), Gaps = 10/73 (13%)

Query: 10 TLIEVVIAIMIMAVIS----LISWRAIDSVALTSRRLDQHTEEALALQRAFDQFERDIGA 65
TL+EV+I + IM V+S ++ RAIDS +T ++ +Q A Q R +G
Sbjct: 2 TLLEVIIVLGIMGVVSAGVVTLAQRAIDSQNMTKAAQSLNS-----IQVALTQTYRGLG- 55

Query: 66 RSPDLAESAAPAA 78
P A++ A +
Sbjct: 56 NYPATADATAASK 68


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_046931  HTHTETR  69     8e-17    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 68.9 bits (168), Expect = 8e-17
Identities = 35/199 (17%), Positives = 65/199 (32%), Gaps = 10/199 (5%)

Query: 1 MPSDTALPSPDTRQRLLQATERLVYAGGIHATGMDLIVRTSGVARKQVYRLYPNKDALVA 60
M T + +TRQ +L RL G+ +T + I + +GV R +Y + +K L +
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 AALRARDERWMQWFVAASSR-----AQAPRARLLAMFDALREWFGTDDFRGCAF--LNAA 113
+ + + ++ R L+ + ++ F
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 114 GEIGDEASPILAVAREHKARLLEYVRTLTRAAALP---DPDEAAAQLLVLIDGAIAVALV 170
GE+ + E R+ + ++ A LP AA + I G + L
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 171 TRDPAIADSAGRAAAALLG 189
R A+L
Sbjct: 181 APQSFDLKKEARDYVAILL 199


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_046981  HTHTETR  27     0.020    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 27.3 bits (60), Expect = 0.020
Identities = 16/69 (23%), Positives = 33/69 (47%), Gaps = 6/69 (8%)

Query: 14 TVGEVARRSGVPVSTIHFY-ESKGLIHSSRSDGNQRRFPGIVLRYIAIIKVAQRTGIPLE 72
++GE+A+ +GV I+++ + K + S + ++ + L Y A+ G PL
Sbjct: 33 SLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGELELEYQ-----AKFPGDPLS 87

Query: 73 EIREAMARF 81
+RE +
Sbjct: 88 VLREILIHV 96


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_047041  cloacin  27     0.018    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 26.6 bits (58), Expect = 0.018
Identities = 21/77 (27%), Positives = 32/77 (41%), Gaps = 1/77 (1%)

Query: 3 GGGHGTDSGAGLHGGSRAAGIAGAFADRLALAATPVFALMAL-LAAMREAGAMAALCGAG 61
GG +SG G G + +A A +TP +A+ ++A + A+A + A
Sbjct: 64 NGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGALSAAIADIMAAL 123

Query: 62 AWPFDGMAWMYALMSVF 78
PF W AL V
Sbjct: 124 KGPFKFGLWGVALYGVL 140


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
NH44784_047171  RTXTOXIND  38     7e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 37.5 bits (87), Expect = 7e-05
Identities = 30/204 (14%), Positives = 61/204 (29%), Gaps = 8/204 (3%)

Query: 98 REDAQRAEDQRLADLAAQLARREALLASRERDLEEALKVATGQLAATEARLRTAESQLRQ 157
D + + L + R + L S E + LK+ E + +
Sbjct: 133 EADTLKTQSS-LLQARLEQTRYQILSRSIELNKLPELKLPD------EPYFQNVSEEEVL 185

Query: 158 RDEALAHGQQQLADARAALLAQVADAERGRQEQAEALAAAHARHTAHERRWLNDLDAERG 217
R +L Q + D +R + A + + E+ L+D +
Sbjct: 186 RLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLH 245

Query: 218 NAKRLQARLEDLLAARQQQAEQSRQALAQAAERLHAAEAEAAERLRVAESEAAAQLRAAQ 277
+ + + + + R +Q + + + A E ++ ++
Sbjct: 246 KQAIAKHAVLEQENKYVEAVNELRVYKSQLEQ-IESEILSAKEEYQLVTQLFKNEILDKL 304

Query: 278 AEAQAEQGRRLAELATVREALQAS 301
+ G ELA E QAS
Sbjct: 305 RQTTDNIGLLTLELAKNEERQQAS 328



Score = 30.6 bits (69), Expect = 0.009
Identities = 21/144 (14%), Positives = 42/144 (29%), Gaps = 10/144 (6%)

Query: 200 RHTAHERRWLNDLDAERGNAKRLQARLEDLLAARQQQAEQSRQALAQAAERLHAAEAEAA 259
A + + L R R Q + + + + + Q E
Sbjct: 131 GAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVS------EEEV 184

Query: 260 ERLRVAESEAAAQLRAAQAEAQAEQGRRLAELATVREALQASRRREQELAQRLAAAEAQQ 319
RL E + + + + + ++ AE TV + + RL +
Sbjct: 185 LRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSS-- 242

Query: 320 AELMAQQKARDAQLLELTRHLISA 343
L+ +Q +LE + A
Sbjct: 243 --LLHKQAIAKHAVLEQENKYVEA 264



Score = 30.6 bits (69), Expect = 0.011
Identities = 25/131 (19%), Positives = 43/131 (32%), Gaps = 7/131 (5%)

Query: 73 AQAAAHFWEAALSAARADLAGAN---REREDAQRAEDQRLADLAAQLARREALLASRERD 129
Q + E L RA+ E+ R E RL D + L ++A+ +
Sbjct: 198 WQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFS-SLLHKQAIAKHAVLE 256

Query: 130 LEEALKVATGQLAATEARLRTAESQLRQRDEALAHGQQQLADARAALLAQVADAERGRQE 189
E A +L +++L ES++ E Q + +L ++
Sbjct: 257 QENKYVEAVNELRVYKSQLEQIESEILSAKEEY---QLVTQLFKNEILDKLRQTTDNIGL 313

Query: 190 QAEALAAAHAR 200
LA R
Sbjct: 314 LTLELAKNEER 324



Score = 29.8 bits (67), Expect = 0.018
Identities = 22/166 (13%), Positives = 52/166 (31%), Gaps = 11/166 (6%)

Query: 171 DARAALLAQVADAERGRQEQAEALAAAHARHTAHERRWLNDLDAERGNAKRLQARLEDLL 230
A A L + + R EQ + + L E + + L
Sbjct: 131 GAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPE--LKLPDEPYFQNVSEEEVLRLT 188

Query: 231 AARQQQAEQSRQALAQAAERLHAAEAEAAERLRVAESEAAAQLRAAQAEAQAEQGRRLAE 290
+ ++Q + Q L AE A++ + ++ E+ RL +
Sbjct: 189 SLIKEQFSTWQNQKYQKELNLDKKRAERL--------TVLARINRYENLSRVEK-SRLDD 239

Query: 291 LATVREALQASRRREQELAQRLAAAEAQQAELMAQQKARDAQLLEL 336
+++ ++ E + A + +Q + ++++L
Sbjct: 240 FSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSA 285


62  NH44784_048071  NH44784_048231  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_048071  3       12       3.425462   Alkanesulfonates transport system permease
NH44784_048081  3       12       3.400935   Alkanesulfonates transport system permease
NH44784_048091  4       12       4.126395   Alkanesulfonates ABC transporter ATP-binding
NH44784_048101  3       11       4.091362   Nitrilotriacetate monooxygenase component A
NH44784_048111  7       13       5.806924   GMP synthase [glutamine-hydrolyzing]
NH44784_048121  6       12       5.133034   L-lactate permease
NH44784_048131  3       18       3.654740   hypothetical protein
NH44784_048141  2       13       3.175789   Copper resistance protein D
NH44784_048151  1       10       2.420882   Protein yobA precursor
NH44784_048161  -2      10       2.236593   hypothetical protein
NH44784_048171  -2      11       2.411648   ProQ activator of osmoprotectant transporter
NH44784_048181  -2      10       2.723910   hypothetical protein
NH44784_048191  -2      9        2.974921   Gluconolactonase
NH44784_048201  -2      9        3.206279   D-galactarate dehydratase
NH44784_048211  -2      9        3.413876   5-dehydro-4-deoxyglucarate dehydratase
NH44784_048221  -1      9        3.763029   Ketoglutarate semialdehyde dehydrogenase
NH44784_048231  1       8        3.297396   two-component system sensor protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits         Score  E-value  Comments
NH44784_048131  ALARACEMASE  26     0.018    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 25.9 bits (57), Expect = 0.018
Identities = 14/58 (24%), Positives = 28/58 (48%), Gaps = 9/58 (15%)

Query: 1 MANAALILGPALPDAGWDYVRVLLLKIALALAMTGLALANRYRWIARIRAEPALALHA 58
++N+A L P+A +D+VR + + G + + ++R IA P + L +
Sbjct: 189 LSNSAATL--WHPEAHFDWVRP-------GIILYGASPSGQWRDIANTGLRPVMTLSS 237


63  NH44784_048331  NH44784_048651  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_048331  2       12       -0.397831  YbbM seven transmembrane helix protein
NH44784_048341  0       14       -0.925711  YbbL ABC transporter ATP-binding protein
NH44784_048351  0       14       -0.992151  Siderophore [Alcaligin] biosynthetic enzyme
NH44784_048361  0       12       0.366256   alcaligin biosynthesis protein
NH44784_048371  1       12       -0.185085  Siderophore [Alcaligin] biosynthesis
NH44784_048381  0       10       1.203427   Putative iron reductase in siderophore
NH44784_048391  1       8        1.765880   Iron-sulfur protein in siderophore [Alcaligin]
NH44784_048401  2       7        3.010522   Transcriptional regulator AlcR in siderophore
NH44784_048411  2       7        3.410108   Siderophore [Alcaligin] translocase AlcS
NH44784_048421  0       8        2.806555   Iron-siderophore [Alcaligin] receptor @
NH44784_048431  5       11       3.993880   Iron-siderophore [Alcaligin-like] transport
NH44784_048441  2       10       2.457280   transport system permease protein
NH44784_048451  0       10       2.530107   hypothetical protein
NH44784_048461  -2      10       2.020131   Ferric enterobactin transport ATP-binding
NH44784_048471  -2      10       1.852665   Glutamate transport ATP-binding protein
NH44784_048481  0       11       3.341856   ABC transporter membrane-spanning
NH44784_048491  0       11       3.455580   Amino acid ABC transporter, amino acid-binding
NH44784_048501  0       15       4.368310   Enoyl-[acyl-carrier-protein] reductase [FMN]
NH44784_048511  1       14       3.564192   putative thiol:disulfide interchange protein
NH44784_048521  0       14       3.059295   hypothetical protein
NH44784_048531  1       14       2.654138   FIG00460718: hypothetical protein
NH44784_048541  1       11       2.535113   Permeases of the major facilitator superfamily
NH44784_048551  1       10       3.140378   Long-chain-fatty-acid--CoA ligase
NH44784_048561  1       10       2.968670   LysR-type transcriptional regulator NahR
NH44784_048571  1       9        3.149140   Metallo-beta-lactamase superfamily protein
NH44784_048581  3       10       4.036532   N-acetyl-gamma-glutamyl-phosphate reductase
NH44784_048591  2       10       4.200454   Transcriptional regulators, LysR family
NH44784_048601  2       12       3.846000   Chloride channel protein
NH44784_048611  -1      15       0.133431   hypothetical protein
NH44784_048621  -1      17       -1.378652  Probable transmembrane protein
NH44784_048631  -1      12       -0.368253  histidine kinase
NH44784_048641  1       15       -1.905954  two component transcriptional regulator, LuxR
NH44784_048651  2       18       -2.167437  (Y14336) putative extracellular protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_048371  PF04183  545    0.0      IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 545 bits (1405), Expect = 0.0
Identities = 160/562 (28%), Positives = 274/562 (48%), Gaps = 20/562 (3%)

Query: 16 PEVWDKVNRLLVKKAISEYAHEWLLEPERLGPAEAPGFDRYRLVLADGHAEYQFDAQLMA 75
+ WD VNR LV K +SE +E + E G DRY + L A+++F A+
Sbjct: 3 HKDWDLVNRRLVAKMLSELEYEQVFHAESQGD------DRYCINLPG--AQWRFIAERGI 54

Query: 76 MRHWRIPPESISKTMAGTPAPLDALQFVIEIRDKLGMPEDRLPIYLDEITSTLHGSAYKH 135
I +++ P+ A +++++ L M + + ++ ++ +TL G
Sbjct: 55 WGWLWIDAQTLRCA----DEPVLAQTLLMQLKQVLSMSDATVAEHMQDLYATLLGDLQLL 110

Query: 136 -TRGAPAAAELALADYQTIETSMIEGHPSFVANNGRLGFNAEDYHVYAPECASPIRVMWL 194
R +A++L + ++ ++ GHP FV N GR G+ E YAPE A+ R+ WL
Sbjct: 111 KARRGLSASDLINLNADRLQ-CLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWL 169

Query: 195 AVHKDNAHFACLSTMDYDSLLRDELGADTVARHTQTLRDQGLDPADYYFMPSHPWQWFNK 254
AV +++ + C + MD LL + AR +Q ++ GLD ++ +P HPWQW K
Sbjct: 170 AVKREHMIWRCDNEMDIHQLLTAAMDPQEFARFSQVWQENGLDH-NWLPLPVHPWQWQQK 228

Query: 255 LSLAFAPYVATRKIVCLGYGDDDYLAQQSIRTFYNISQPGKHYVKTSLSILNMGFMRGLS 314
++ F A ++V LG D +LAQQS+RT N S+ G +K L+I N RG+
Sbjct: 229 IATDFIADFAEGRMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIP 288

Query: 315 PYYMSGTPAINEYLHGLISSDPWLRRNGFTILREAASLGFRNYYYEAAISVDTPYKKMFS 374
Y++ P + +L + ++D L ++G IL E A+ + Y A Y++M
Sbjct: 289 GRYIAAGPLASRWLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLG 348

Query: 375 ALWRENPTALIGANQRLMTMAALLHVDPQGGALLPELIRASGLDAAAWLERYVDAYLTPL 434
+WRENP + ++ + MA L+ D L I SGLDA WL + + PL
Sbjct: 349 VIWRENPCRWLKPDESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVPL 408

Query: 435 VHCFYAHDLVFMPHGENVILVVENHVPVRAFMKDIAEESSILNPN----AALPENARRLA 490
H + + + HG+N+ L ++ VP R +KD + ++ +LP+ R +
Sbjct: 409 YHLLCRYGVALIAHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMDSLPQEVRDVT 468

Query: 491 ADVPEEYKLLTIFVDVFEGYFRHLSQVLVDNGVMDEDAFWKLVAGRIIAYQEAHPERLEK 550
+ + +Y + + F R +S ++V GV E F++L+A + Y + HP+ E+
Sbjct: 469 SRLSADYLIHDLQTGHFVTVLRFISPLMVRLGV-PERRFYQLLAAVLSDYMKKHPQMSER 527

Query: 551 YRQYDLFAPEMIHSCLNRLQLA 572
+ + LF P++I LN ++L
Sbjct: 528 FALFSLFRPQIIRVVLNPVKLT 549


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_048411  TCRTETA  76     5e-17    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 75.6 bits (186), Expect = 5e-17
Identities = 94/377 (24%), Positives = 137/377 (36%), Gaps = 35/377 (9%)

Query: 36 LPAFGALGREFAVPQETVQLTLGVYMACYASMLLLH----GTLSDSLGRRRVVLAALGVY 91
+P L R+ V V G+ +A YA M G LSD GRR V+L +L
Sbjct: 25 MPVLPGLLRDL-VHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGA 83

Query: 92 VCGALMAALAPGFGWLLAGRAVQGLSAGAGIVVGQAIIRDCYDGAVARRAMSYLILVFNL 151
+ A AP L GR V G++ G V G A I D DG R ++ F
Sbjct: 84 AVDYAIMATAPFLWVLYIGRIVAGITGATGAVAG-AYIADITDGDERARHFGFMSACFGF 142

Query: 152 SPALAPILGGYLAAHQGWRSIFLLLTALAAAAWLLCARRLPETLPPDRRQPLAWRTLGGN 211
P+LGG + + F AL +L LPE+ +RR PL L
Sbjct: 143 GMVAGPVLGGLMGGF-SPHAPFFAAAALNGLNFLTGCFLLPESHKGERR-PLRREAL-NP 199

Query: 212 YARVLGNRRYAALGLAFSLLFAAQGFLIGAAPD-----FITNVLGLPETDFAYLFVPL-V 265
A R + ++ F Q L+G P F + T +
Sbjct: 200 LASFRWARGMTVVAALMAVFFIMQ--LVGQVPAALWVIFGEDRFHWDATTIGISLAAFGI 257

Query: 266 VGAMAGALLAARKAGRWSDGRITLLAYLLMGGSCLAYVLYMAAAEPPRLPWAVLLPGLFT 325
+ ++A A++ A R + R +L M Y+L A W +
Sbjct: 258 LHSLAQAMITGPVAARLGERRALMLG---MIADGTGYILLAFATR----GWMAFPIMVLL 310

Query: 326 CGLAMGVPAMTLRILGHVPDLSGTAASVLGFMQMLTFSVASGWGVPLVYGQPLRLAVAIL 385
+G+PA+ + V + G + LT S+ S G PL++ AI
Sbjct: 311 ASGGIGMPALQAMLSRQVDEERQGQLQ--GSLAALT-SLTSIVG-PLLFT-------AIY 359

Query: 386 ACVAASAAGWAWLHRCA 402
A + GWAW+ A
Sbjct: 360 AASITTWNGWAWIAGAA 376


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_048431  FERRIBNDNGPP  42     2e-06    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 41.9 bits (98), Expect = 2e-06
Identities = 57/263 (21%), Positives = 101/263 (38%), Gaps = 39/263 (14%)

Query: 85 PKRVVALGLGADDLVLSLGVVPVGVGRADWGGDSDHYWPWVRAAIEARGQPLPER---IT 141
P R+VAL +L+L+LG+VP GV D+ +Y WV PLP+ +
Sbjct: 35 PNRIVALEWLPVELLLALGIVPYGV------ADTINYRLWVSEP------PLPDSVIDVG 82

Query: 142 VYPELDIEKIIALRPDVVL-APFSGVSPEAYAQLSRLVPVVGYPEQPFLTPVD---TQID 197
+ E ++E + ++P ++ + G SPE L+R+ P G+ P+ +
Sbjct: 83 LRTEPNLELLTEMKPSFMVWSAGYGPSPE---MLARIAPGRGFNFSDGKQPLAMARKSLT 139

Query: 198 LIAAALGTREAAAPLKARIHDALAQAARRYPELSGKTFAYVRADLGSGNFAAYVAGDPRV 257
+A L + AA A+ D + R+ + + V G +
Sbjct: 140 EMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTL---IDPRHMLVFGPNSL 196

Query: 258 --DTLSAMGLTLAPSVRGLRASAGHFAHY----LGFEHADALGDADILVSWFYTPQERDR 311
+ L G+ A G + + + A D D+L ++ D
Sbjct: 197 FQEILDEYGIP--------NAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNSKDMDA 248

Query: 312 TAAMPLYASIPAVRRGAYVALSD 334
A PL+ ++P VR G + +
Sbjct: 249 LMATPLWQAMPFVRAGRFQRVPA 271


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_048461  PF05272  29     0.018    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.018
Identities = 14/64 (21%), Positives = 20/64 (31%), Gaps = 16/64 (25%)

Query: 48 VGPNGCGKSTLLRVLGGALAPS----------------QGEAFLDGANLASLRRKAVARR 91
G G GKSTL+ L G S G + + + + RR
Sbjct: 602 EGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYEQIAGIVAYELSEMTAFRRADAEAV 661

Query: 92 LAYL 95
A+
Sbjct: 662 KAFF 665


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_048541  TCRTETB  29     0.032    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 29.5 bits (66), Expect = 0.032
Identities = 35/146 (23%), Positives = 58/146 (39%), Gaps = 14/146 (9%)

Query: 292 SMFIELGTLVWFGHLSDRIGRRTVYLIGSAGLMLAAFPFFWLVQTREPVWIFLAFFLGNT 351
+ +GT V +G LSD++G + + L G + F + + + I F G
Sbjct: 59 MLTFSIGTAV-YGKLSDQLGIKRLLLFGIIINCFGSVIGF-VGHSFFSLLIMARFIQG-- 114

Query: 352 VCHAAMIGTQPAYFAELFPPEVRYTGLALGHELASVFAG-GLSPLIAMALLRKYESATPV 410
AA A P E R G A G + V G G+ P I + +
Sbjct: 115 AGAAAFPALVMVVVARYIPKENR--GKAFGLIGSIVAMGEGVGPAIGGMIAHY------I 166

Query: 411 AW-FLVGMAAITVITLLLTREPARQE 435
W +L+ + IT+IT+ + ++E
Sbjct: 167 HWSYLLLIPMITIITVPFLMKLLKKE 192


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_048551  PF06057  30     0.014    Type IV secretory pathway VirJ component
		>PF06057#Type IV secretory pathway VirJ component

Length = 243

Score = 30.2 bits (68), Expect = 0.014
Identities = 10/45 (22%), Positives = 14/45 (31%), Gaps = 4/45 (8%)

Query: 144 GLPAPATPVPPLPDAPIPAAELGPGDTLAILYT--SGTSGLSKGV 186
L PV P ++ L I + G + L K V
Sbjct: 28 NLGLTLLPVEPSTQVNAASS--HTKPPLVIFLSGDGGWATLDKAV 70


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_048601  ACRIFLAVINRP  31     0.022    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 30.6 bits (69), Expect = 0.022
Identities = 20/77 (25%), Positives = 38/77 (49%), Gaps = 8/77 (10%)

Query: 283 LGGLAIGIGGLIDPAALGVGYDNI-RHLLAGDLALQAVLLLLVVKV--AIWSIALGSGTS 339
+ G+ + IG L+D A + V +N+ R ++ L + + ++ A+ IA+
Sbjct: 395 MFGMVLAIGLLVDDAIVVV--ENVERVMMEDKLPPKEATEKSMSQIQGALVGIAM---VL 449

Query: 340 GGVLAPLLIFGGALGAL 356
V P+ FGG+ GA+
Sbjct: 450 SAVFIPMAFFGGSTGAI 466


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_048631  PF06580  30     0.025    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 29.8 bits (67), Expect = 0.025
Identities = 34/224 (15%), Positives = 69/224 (30%), Gaps = 36/224 (16%)

Query: 201 TPATWPWLPWLLWALTSALLVAGVA----ASRRSRAEARRQQEQARVAAMARLSTLGEMA 256
P + L ++ + + + + +Q ++A+MA+ + L +
Sbjct: 108 KPVAFTLPLALSIIFNVVVVTFMWSLLYFGWHFFKNYKQAEIDQWKMASMAQEAQLMALK 167

Query: 257 AGI-AHELNQPLTAILAQTRAAQRLLDDDEERPTVRRALLASAEQAKRAADIITRMRALV 315
A I H + L I A E +A +++T + L+
Sbjct: 168 AQINPHFMFNALNNIRALIL-----------------------EDPTKAREMLTSLSELM 204

Query: 316 QPSAPGRREALDPDALVASLRFLR---EPELARQGIQLSWRNASPQARPLADRVALEQIL 372
+ S L L + + + +L + N A + D ++
Sbjct: 205 RYSLRYSNARQVS--LADELTVVDSYLQLASIQFEDRLQFENQINPA--IMDVQVPPMLV 260

Query: 373 HNLVQNA-ADALAGASGARHIALEGRAEGATYVFSVSDNGPGIA 415
LV+N +A I L+G + T V + G
Sbjct: 261 QTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLAL 304


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
NH44784_048641  HTHFIS  95     7e-25    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 94.5 bits (235), Expect = 7e-25
Identities = 33/160 (20%), Positives = 61/160 (38%), Gaps = 7/160 (4%)

Query: 6 QSPLVYLVDDDDAVRDALALLLRTVGLRSESHADPQQFLAQLAPQAIGCVVLDIRMPGIS 65
+ + DDD A+R L L G ++ +A VV D+ MP +
Sbjct: 2 TGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 66 GLDVLARLADT-SDLPVVMLTGHANVDLCRRAFKGGAMEFLQKPVDDDVFLDAVQAAVRG 124
D+L R+ DLPV++++ +A + GA ++L KP D + + A+
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA- 120

Query: 125 HIARRERLAVTQAAADRLARL---STREHAVLERIVQGMS 161
+ R + + + L S + + + M
Sbjct: 121 --EPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQ 158


64  NH44784_049011  NH44784_049071  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_049011  2       10       2.431754   Dihydroxy-acid dehydratase
NH44784_049021  1       14       2.743102   transcriptional regulator, GntR family
NH44784_049031  1       15       3.163070   acetyltransferase (putative
NH44784_049041  0       14       3.757267   hypothetical protein
NH44784_049051  1       13       3.525452   hypothetical protein
NH44784_049061  1       12       3.664017   Alkanesulfonates transport system permease
NH44784_049071  1       11       3.142690   Nitrate ABC transporter, nitrate-binding
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_049031  SACTRNSFRASE  41     3e-07    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 41.1 bits (96), Expect = 3e-07
Identities = 19/102 (18%), Positives = 36/102 (35%), Gaps = 7/102 (6%)

Query: 29 PWLEPASLREADLDAQTEGERIWVASAADDVLAGFVSL---WEPDDFVHHLYVGRAWRRQ 85
P+ + + D+ E + ++ G + + W + + V + +R++
Sbjct: 45 PYFKQYEDDDMDVSYVEEEGKAAFLYYLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKK 104

Query: 86 GVARALLRALPGW----PATRYRLKCLRRNAAALAFYRACGF 123
GV ALL W L+ N +A FY F
Sbjct: 105 GVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


65  NH44784_049311  NH44784_049391  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_049311  0       10       4.269947   NnrS protein involved in response to NO
NH44784_049321  0       11       3.525567   hypothetical protein
NH44784_049331  3       11       3.567695   hypothetical protein
NH44784_049341  3       11       3.558944   monooxygenase, FAD-binding
NH44784_049351  3       11       3.352134   transcriptional regulator, LysR family
NH44784_049361  2       13       3.490926   HlyD family secretion protein
NH44784_049371  2       12       2.981003   Cobalt-zinc-cadmium resistance protein CzcA;
NH44784_049381  0       10       2.924467   RND multidrug efflux transporter; Acriflavin
NH44784_049391  -1      9        3.131125   Transcriptional regulator, LysR family
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
NH44784_049361  RTXTOXIND  40     1e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 39.8 bits (93), Expect = 1e-05
Identities = 16/140 (11%), Positives = 42/140 (30%), Gaps = 1/140 (0%)

Query: 35 VAVSVAAARSGPLPRDLHALGTITPLARV-VLRSQVDGQLQRLHYTEGQAVRRGQLLAEI 93
+ ++ + G + A G +T R ++ + ++ + EG++VR+G +L ++
Sbjct: 68 LVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKL 127

Query: 94 DPRPYQAALAAAEGELAHVEALLGNAEIDLRRYRQLARQEAVAGQQLDTAEAQARSYAAQ 153
+A + L +I R E +
Sbjct: 128 TALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRL 187

Query: 154 RQRLAAAVADARRLLALTRI 173
+ + + +
Sbjct: 188 TSLIKEQFSTWQNQKYQKEL 207



Score = 29.4 bits (66), Expect = 0.024
Identities = 18/64 (28%), Positives = 28/64 (43%), Gaps = 11/64 (17%)

Query: 60 LARVVLRSQVDGQLQRLH-YTEGQAVRRGQLLAEIDPRPYQAALAAAEGELAHVEALLGN 118
V+R+ V ++Q+L +TEG V + L I P E + V AL+ N
Sbjct: 325 QQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVP----------EDDTLEVTALVQN 374

Query: 119 AEID 122
+I
Sbjct: 375 KDIG 378


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_049371  ACRIFLAVINRP  791    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 791 bits (2045), Expect = 0.0
Identities = 291/1038 (28%), Positives = 482/1038 (46%), Gaps = 33/1038 (3%)

Query: 3 LSRPFILRPVATSFLMLALLLSGILAWRMLPVAALPQVDYPIIQVTTPYPGASPDVTARA 62
++ FI RP+ L + L+++G LA LPVA P + P + V+ YPGA
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 63 VTAPLERRFGQIPGLKQMSSSSGS-GISVITLQFSLDVSLGVAEQEVQAAISASGSLLPS 121
VT +E+ I L MSS+S S G ITL F +A+ +VQ + + LLP
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 122 DLPTPPVYRKVNPADVPILTLAVTSDSLPLPQ--VYDLVDTRMTQRLAQLSGVGMVSLAG 179
++ + + ++ SD+ Q + D V + + L++L+GVG V L G
Sbjct: 121 EVQQQGISV-EKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG 179

Query: 180 GQRPAVRVQANPMALAARGLQLSDLQEAIAKANSNQPKGSFDGPV------RSVIMDAND 233
Q A+R+ + L L D+ + N G G + + A
Sbjct: 180 AQY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQT 238

Query: 234 QLQSAQEYRDLIV-AWRNGAPVRLGEVATVEDGAEDRYLAAWVDRQPAVLVNIQRQPGAN 292
+ ++ +E+ + + +G+ VRL +VA VE G E+ + A ++ +PA + I+ GAN
Sbjct: 239 RFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGAN 298

Query: 293 VIAVADQVKALLPQLTASLPAAVQVRVLTDRTESIRASVRGVQWELAFAVGLVVLVTFLF 352
+ A +KA L +L P ++V D T ++ S+ V L A+ LV LV +LF
Sbjct: 299 ALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLF 358

Query: 353 LRNLPATLIPSLAVPLSLIGTFGVMHLAGFSTNNLTLMALTIGAGFVVDDAIVMLENIAR 412
L+N+ ATLIP++AVP+ L+GTF ++ G+S N LT+ + + G +VDDAIV++EN+ R
Sbjct: 359 LQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVER 418

Query: 413 Y-REQGHSPMAAALKGAGQIGFTLVSLTLSLIAVLIPLLFMEDVVGRLFREFAITLAVAI 471
E P A K QI LV + + L AV IP+ F G ++R+F+IT+ A+
Sbjct: 419 VMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAM 478

Query: 472 LISLAVSLTLTPMMCARLLPP----HAERPPGL-------LDRLQARYAGWLDVTLRHQR 520
+S+ V+L LTP +CA LL P H E G D Y + L
Sbjct: 479 ALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTG 538

Query: 521 LTLLAMLATVALTALLYLAVPKGFFPAQDGGVLQGVTQSAQSTSFEAMSQRQQALAQSLL 580
LL VA +L+L +P F P +D GV + Q + E + + L
Sbjct: 539 RYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYL 598

Query: 581 QD--PDVASLSSFIGIDGMNATLNTGRLLVNLKPWSERGAPLADIMARLDARARQVRGI- 637
++ +V S+ + G N G V+LKPW ER A + ++ I
Sbjct: 599 KNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIR 658

Query: 638 SLYLQPVQELNIEDRVSRGQYQFTLTS---PDAALLARWSRALAQRLDAAP-QLADISSD 693
++ P I + + + F L L + L P L + +
Sbjct: 659 DGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 694 LQGGGRQAYLEVSRDAAARLGLTMDDVAQALYNAFGQRQVATLFTQSNQYRVVLEVDPRL 753
Q LEV ++ A LG+++ D+ Q + A G V + ++ ++ D +
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 754 AASPEALERIHLKTADGQPIPLAALATVSERAVPLAVNHLSQFPAVNFSFNLPPGGSLGA 813
PE ++++++++A+G+ +P +A T + + P++ PG S G
Sbjct: 779 RMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGD 838

Query: 814 AIAAIEAARQDIAMPPSVELRLQGAAAAFEASLSNTLWLMLAAVVTMYLVLGMLYESAIH 873
A+A +E +P + G + S + L+ + V ++L L LYES
Sbjct: 839 AMALME--NLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSI 896

Query: 874 PVTILSTLPSATVGALLALLLTGRPLDLIAVIGIILLIGLVKKNGIMMVDFALESERSRG 933
PV+++ +P VG LLA L + D+ ++G++ IGL KN I++V+FA + G
Sbjct: 897 PVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEG 956

Query: 934 LAPREAIREAALLRLRPILMTTLAALFGALPLMLATGSGAELRQPLGWVMVGGLLVSQVL 993
EA A +RLRPILMT+LA + G LPL ++ G+G+ + +G ++GG++ + +L
Sbjct: 957 KGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLL 1016

Query: 994 TLFTTPAVYLFFHRLGAG 1011
+F P ++ R G
Sbjct: 1017 AIFFVPVFFVVIRRCFKG 1034


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_049381  ACRIFLAVINRP  733    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 733 bits (1893), Expect = 0.0
Identities = 291/1035 (28%), Positives = 492/1035 (47%), Gaps = 29/1035 (2%)

Query: 1 MIRALLHRPIACIFLALALTLLGAVAWRLLPVAPLPQVDFPTIEVRAELPGASPESMAST 60
M + RPI LA+ L + GA+A LPVA P + P + V A PGA +++ T
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VAAPLERALGGIAGVSAMSSSS-NQGATRVLLQFALDRDINAAARDVQAAINAARAELPS 119
V +E+ + GI + MSS+S + G+ + L F D + A VQ + A LP
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 120 GMPGNPTYRKVNPSQAPIMALALSS--PTRPAGRLYDLGATVLAQKLSQIVGVGEVTLGG 177
+ S + +M S P + D A+ + LS++ GVG+V L G
Sbjct: 121 EVQ-QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG 179

Query: 178 SSLPAVRVQVNPNALAHYGVALDDLRQSIADAAPMGPQGQLDSA----GQRWEVGTPGQP 233
+ A+R+ ++ + L Y + D+ + GQL GQ+ Q
Sbjct: 180 AQY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQT 238

Query: 234 RM--ADDYNGLIVR-HQDGATIRLAQIARVSDSVENRYSSGFHNHDPAVVLTISRQPGAN 290
R +++ + +R + DG+ +RL +ARV EN N PA L I GAN
Sbjct: 239 RFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGAN 298

Query: 291 IIETIAAINQALPGLRALMPADVDLTVVLDRSPGIHATLREAHVTLGLAVGLVILVVWLF 350
++T AI L L+ P + + D +P + ++ E TL A+ LV LV++LF
Sbjct: 299 ALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLF 358

Query: 351 LGSARAAAIPSVAIPVCLVATFAVMYLWGFSLNNLSLMALIVAAGLVVDDAIVVLENISR 410
L + RA IP++A+PV L+ TFA++ +G+S+N L++ +++A GL+VDDAIVV+EN+ R
Sbjct: 359 LQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVER 418

Query: 411 HL-ERGLGPRQAALRGVREVGFTLVAMTLALSVVFVSILFMGGLVERLFREFSITLVAAI 469
+ E L P++A + + ++ LV + + LS VF+ + F GG ++R+FSIT+V+A+
Sbjct: 419 VMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAM 478

Query: 470 LISLVVSVAIIPSLCARWLRP-PAEAQTRPSRLRAVFARVHQW----YGASLARVLGHAR 524
+S++V++ + P+LCA L+P AE F Y S+ ++LG
Sbjct: 479 ALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTG 538

Query: 525 LTLLLLAAVVALNAYLYAQAPKGFLPQQDTGQLMGFVRGDDGFSFQVMQPKIDVYRQLVL 584
LL+ A +VA L+ + P FLP++D G + ++ G + + Q +D L
Sbjct: 539 RYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYL 598

Query: 585 KHPAVQD-----VIGYNGGSLGISNSLFLIRLKPASER---RESSAQVIDWLRANAPPVP 636
K+ V G++ + + + LKP ER S+ VI + +
Sbjct: 599 KNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIR 658

Query: 637 GGMFFLNVDQDLRMPGGFGNSGDHELAIMASDVPALRQWSRRI-SRAMQDIPELRDVDAV 695
G + G + AL Q ++ A Q L V
Sbjct: 659 DGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 696 GDAATQQVVINIDRAAARRLGVDMRTIASVLGNSFSQRQVATLYDPMNQYRVVLELDPRY 755
G T Q + +D+ A+ LGV + I + + V D ++ ++ D ++
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 756 TEDPEVLERVQVVAADGNRVPLSAFSTYEHGLVNDRVFHDGLFAAVGVGFSLAEGVSLQQ 815
PE ++++ V +A+G VP SAF+T + R+ ++ + A G S
Sbjct: 779 RMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGD 838

Query: 816 GLAAIDAAMARLMVPAHIQTRLGGDARSFQQSLQDQPWLILGVLVAIYLVLGILYESPLH 875
+A ++ ++L PA I G + + S P L+ V ++L L LYES
Sbjct: 839 AMALMENLASKL--PAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSI 896

Query: 876 PLTILSTLPSAGVGALLALRLADIEFSLIALLGLFLLVGVVMKNAILMIDFALGLERREG 935
P++++ +P VG LLA L + + + ++GL +G+ KNAIL+++FA L +EG
Sbjct: 897 PVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEG 956

Query: 936 LTPEQAIHRAAMLRLRPIVMTNLAGLLGALPLVLGMGEGSELRRPLGVTIVGGLMISQFL 995
+A A +RLRPI+MT+LA +LG LPL + G GS + +G+ ++GG++ + L
Sbjct: 957 KGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLL 1016

Query: 996 TLYTTPIVYLALERL 1010
++ P+ ++ + R
Sbjct: 1017 AIFFVPVFFVVIRRC 1031


66  NH44784_050051  NH44784_050281  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_050051  0       19       3.451205   Transcriptional regulator, MarR family
NH44784_050061  0       20       3.444918   Inner membrane component of tripartite multidrug
NH44784_050071  -1      20       3.613116   hypothetical protein
NH44784_050081  -1      20       3.652916   Membrane fusion component of tripartite
NH44784_050091  -1      17       3.892158   Outer membrane component of tripartite multidrug
NH44784_050101  0       17       3.490898   hypothetical protein
NH44784_050111  0       16       3.080043   Acetolactate synthase large subunit
NH44784_050121  0       15       1.624482   hypothetical protein
NH44784_050131  0       14       1.340146   3-oxoacyl-[acyl-carrier protein] reductase
NH44784_050141  -1      13       0.975772   Transcriptional regulator, IclR family
NH44784_050151  -1      11       1.357506   FIG00638051: hypothetical protein
NH44784_050161  0       13       2.821944   hypothetical protein
NH44784_050171  0       16       2.917449   Lysine-specific permease
NH44784_050181  -1      15       3.780899   Methyl-accepting chemotaxis protein
NH44784_050191  -1      18       4.115581   hypothetical protein
NH44784_050201  -2      18       3.635213   RecA/RadA recombinase
NH44784_050211  -2      15       3.174149   DNA polymerase-like protein PA0670
NH44784_050221  -3      14       1.913409   DNA polymerase III alpha subunit
NH44784_050231  -2      14       1.010550   hypothetical protein
NH44784_050241  -2      13       0.776243   hypothetical protein
NH44784_050251  0       12       -0.056618  Albicidin resistance protein
NH44784_050261  1       10       -0.541900  Transcriptional regulator
NH44784_050271  1       15       -0.668889  Aldo-keto reductase
NH44784_050281  3       16       -1.673344  conserved hypothetical protein; putative
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
NH44784_050081  RTXTOXIND  53     8e-10    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 52.5 bits (126), Expect = 8e-10
Identities = 24/125 (19%), Positives = 48/125 (38%), Gaps = 13/125 (10%)

Query: 44 DGRVKAYVVQVAPDVSGLVTAVPVHDNQDVKAGDVLFEIDRARFQLAYDQAQASVRSQQV 103
GR K + P + +V + V + + V+ GDVL ++ + + Q+S+ ++
Sbjct: 93 SGRSKE----IKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARL 148

Query: 104 --AREQALRDAKRNRSLGQLV-------AAEALEQSQTKLQQTEAALAQAEVQLNTARLN 154
R Q L + L +L + E+ + + + Q LN
Sbjct: 149 EQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELN 208

Query: 155 LERSR 159
L++ R
Sbjct: 209 LDKKR 213


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
NH44784_050091  RTXTOXIND  33     0.002    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 33.3 bits (76), Expect = 0.002
Identities = 14/104 (13%), Positives = 26/104 (25%), Gaps = 5/104 (4%)

Query: 358 TGAVQARIAEAEANTKAAVARFDASVLNALRETESALVVYARQLDR-QAALQAARDQAAQ 416
E T +F N + E L + A + + +
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQ-NQKYQKELNLDKKRAERLTVLARINRYENLSRV 232

Query: 417 AASQ---ARQLFQYGKTDYLTVLDAERTLASNESALAAGQAELS 457
S+ L VL+ E + L +++L
Sbjct: 233 EKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLE 276



Score = 30.6 bits (69), Expect = 0.015
Identities = 22/166 (13%), Positives = 47/166 (28%), Gaps = 26/166 (15%)

Query: 153 RRAVEAAGADAQAAQAAYDATRVTVAAETARAYANLCSAGMQLASAEHSVKVQQESLDAV 212
+++ A+ Q Y ++ Q S E +++ +
Sbjct: 136 TLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQF 195

Query: 213 SRLQRAGRGTALDVTRARSQLAQLRASLPPFQAQQRTALYRLAALTGQTPAEIPTTLLQC 272
S Q Q Q +L +A++ T L R+ + E
Sbjct: 196 STWQN--------------QKYQKELNLDKKRAERLTVLARINRYENLSRVE-------- 233

Query: 273 AAAPQLSETIPVGDGAALLRRRPDIRQAERALAAATARIGVATADL 318
+L + + A+ + + + E A + V + L
Sbjct: 234 --KSRLDDFSSLLHKQAIAKHA--VLEQENKYVEAVNELRVYKSQL 275


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_050131  DHBDHDRGNASE  119    9e-35    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 119 bits (300), Expect = 9e-35
Identities = 82/261 (31%), Positives = 120/261 (45%), Gaps = 13/261 (4%)

Query: 5 MQGKTALVFGAGSSGPGWSNGRAAAALYAREGAHVFAVDLRVESAEETRRIILDEGGRCT 64
++GK A + GA G A A A +GAH+ AVD E E+ + E
Sbjct: 6 IEGKIAFITGAAQG-----IGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAE 60

Query: 65 ALAADVTDSAQILAAVRAMQAEAGRIDVLHNNVGITEMGDPIEASEESWHRVMDTNLTGV 124
A ADV DSA I ++ E G ID+L N G+ G S+E W N TGV
Sbjct: 61 AFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGV 120

Query: 125 FLTCKHVLPVMLAQGGGSIVNISSLASIQVNTYPYTSYYAAKAGLNHLTRALAVRYAPDN 184
F + V M+ + GSIV + S + V +Y ++KA T+ L + A N
Sbjct: 121 FNASRSVSKYMMDRRSGSIVTVGSNPA-GVPRTSMAAYASSKAAAVMFTKCLGLELAEYN 179

Query: 185 IRVNAVLPGVMDTPLIYTQIAGQFEDVEEMRRRRNAAS-----PMGRMGDAWDVAHAALF 239
IR N V PG +T + ++ A + + E + + + P+ ++ D+A A LF
Sbjct: 180 IRCNIVSPGSTETDMQWSLWADE--NGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLF 237

Query: 240 LASDEAKYITGVCLPVDGGKS 260
L S +A +IT L VDGG +
Sbjct: 238 LVSGQAGHITMHNLCVDGGAT 258


67  NH44784_050371  NH44784_050521  Y  N  N  Genomic Island
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_050371  -1      22       -3.010752  Glutamate Aspartate transport system permease
NH44784_050381  2       26       -4.657907  probable solute-binding protein
NH44784_050391  1       31       -4.914561  Aspartate aminotransferase
NH44784_050401  2       36       -6.299609  Dimethylmenaquinone methyltransferase family
NH44784_050411  3       41       -7.239384  hypothetical protein
NH44784_050421  1       38       -6.800392  possible DNA helicase
NH44784_050431  0       36       -6.349349  hypothetical protein
NH44784_050441  -2      35       -5.771151  Transcriptional regulator, GntR family domain /
NH44784_050451  -1      33       -6.596091  FIG00482846: hypothetical protein
NH44784_050461  1       33       -5.711738  cytochrome B561
NH44784_050471  1       24       -3.766784  Cobalt-zinc-cadmium resistance protein
NH44784_050481  3       26       -4.354671  Isochorismatase
NH44784_050491  4       26       -4.481370  hypothetical protein
NH44784_050501  3       25       -3.150161  hypothetical protein
NH44784_050511  3       20       -1.873451  hypothetical protein
NH44784_050521  3       16       -1.878728  Methyl-accepting chemotaxis protein I (serine
68  NH44784_050661  NH44784_050711  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_050661  1       11       3.364058   PhnB protein; putative DNA binding
NH44784_050671  2       12       4.194351   MFS family multidrug efflux protein in
NH44784_050681  2       11       4.154425   Probable transmembrane protein
NH44784_050691  1       11       4.618353   transcriptional regulator, AraC family
NH44784_050701  1       12       3.758085   TonB-dependent receptor
NH44784_050711  2       13       3.423475   Methyltransferase type 12
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_050671  TCRTETB  116    2e-30    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 116 bits (292), Expect = 2e-30
Identities = 83/409 (20%), Positives = 170/409 (41%), Gaps = 20/409 (4%)

Query: 14 LMVLCLGVLMIVLDTTIVNVALPSIREDLHFTETSLVWVVNAYMLTFGGFLLLGGRLGDL 73
L+ LC+ VL+ ++NV+LP I D + S WV A+MLTF + G+L D
Sbjct: 16 LIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQ 75

Query: 74 LGHRRMFLAGLVLFTVASLACGLARGQ-GLLIAARAAQGLGGAVVSAVSLSLIMNLFTEA 132
LG +R+ L G+++ S+ + LLI AR QG G A A+ + ++ +
Sbjct: 76 LGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKE 135

Query: 133 GERARAMGVYGFVCAGGGSLGVLLGGLLTSKLSWHWIFLVNIPIGVAVYALCLRLLPAAR 192
R +A G+ G + A G +G +GG++ + HW +L+ IP+ + L L +
Sbjct: 136 -NRGKAFGLIGSIVAMGEGVGPAIGGMIAHYI--HWSYLLLIPMITIITVPFLMKLL-KK 191

Query: 193 GAAGGGKLDVAGALTVTASLMLAVYAVVNGNEAGWTSAQSLGLLGAAALLMALFLAIEAR 252
G D+ G + ++ ++ + T++ S+ L + L +F+ +
Sbjct: 192 EVRIKGHFDIKGIILMSVGIVFFMLF---------TTSYSISFLIVSVLSFLIFVKHIRK 242

Query: 253 VAEPLMPLALFRLRNVATANVVGVLWAAGMFAWFFVSALYMQLVLGYDAMQVGLAFLPAN 312
V +P + L + + G + + + + M+ V ++G +
Sbjct: 243 VTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPG 302

Query: 313 LIMAAFSLGLSAKLVMRFGIRGPLATGLLMAALGLALFARAPVDGHFAADVLPGMLLLGL 372
+ + LV R G L G+ ++ + ++++ +
Sbjct: 303 TMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLL----ETTSWFMTIIIVFV 358

Query: 373 GAGIAFNPMLLA--AMSDVEPGQSGLASGVVNTAFMMGGALGLAVLASL 419
G++F +++ S ++ ++G ++N + G+A++ L
Sbjct: 359 LGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGL 407


69  NH44784_050801  NH44784_051351  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_050801  0       17       4.230542   hypothetical protein
NH44784_050811  2       14       3.881027   putative lipoprotein
NH44784_050821  1       12       3.858224   hypothetical protein
NH44784_050831  0       10       2.831643   Purine nucleoside phosphorylase
NH44784_050841  0       8        2.554076   FIG00453891: hypothetical protein
NH44784_050851  1       8        2.498991   Transcriptional regulator, AraC family
NH44784_050861  1       12       1.429307   Maebl
NH44784_050871  -1      16       0.041308   hypothetical protein
NH44784_050881  0       15       1.528732   Flagellar motor rotation protein MotA
NH44784_050891  2       13       2.761442   Flagellar motor rotation protein MotB
NH44784_050901  1       12       3.511166   PhnB protein
NH44784_050911  1       14       3.545391   PhnB protein
NH44784_050921  1       12       2.947587   Glyoxalase/bleomycin resistance
NH44784_050931  2       11       3.587904   RNA polymerase sigma-70 factor, ECF subfamily
NH44784_050941  3       12       3.281807   hypothetical protein
NH44784_050951  3       12       3.504825   L-rhamnose operon transcriptional activator
NH44784_050961  5       13       2.920932   FIG006442: Integral membrane protein
NH44784_050971  3       14       2.648299   hypothetical protein
NH44784_050981  4       17       3.813528   HAD-superfamily hydrolase, subfamily IA, variant
NH44784_050991  3       16       3.799090   major facilitator superfamily MFS_1
NH44784_051001  3       16       4.046275   probable transcriptional regulator
NH44784_051011  2       17       3.778438   COGs COG2043
NH44784_051021  3       15       3.144270   FIG00553879: hypothetical protein
NH44784_051031  2       14       3.241621   LysR family transcriptional regulator STM2281
NH44784_051041  -1      13       2.524193   Glutathione S-transferase
NH44784_051051  0       13       3.548407   hypothetical protein
NH44784_051061  -2      9        3.945129   Transcriptional regulator, MarR family
NH44784_051071  -2      9        3.778193   Hypothetical cytosolic protein
NH44784_051081  0       13       4.009874   GCN5-related N-acetyltransferase
NH44784_051091  1       14       3.876019   hypothetical protein
NH44784_051101  4       17       5.274332   Citrate synthase (si
NH44784_051111  2       12       4.687880   CoA transferase, CAIB/BAIF family
NH44784_051121  4       22       3.837135   hypothetical protein
NH44784_051131  4       8        3.966134   COG0666: FOG: Ankyrin repeat
NH44784_051141  5       11       4.148940   hypothetical protein
NH44784_051151  4       10       4.640724   hypothetical protein
NH44784_051161  1       12       3.550690   FOG: WD40-like repeat
NH44784_051171  2       11       2.689361   FIG00432598: hypothetical protein
NH44784_051181  0       10       1.309346   hypothetical protein
NH44784_051191  -2      16       -1.169461  Flagellar motor protein
NH44784_051201  0       20       -1.727228  hypothetical protein
NH44784_051211  1       25       -3.741649  Lignostilbene-alpha,beta-dioxygenase and related
NH44784_051221  2       36       -6.685410  hypothetical protein
NH44784_051231  3       35       -5.320835  hypothetical protein
NH44784_051241  1       22       -2.409741  hypothetical protein
NH44784_051251  0       15       -1.280859  hypothetical protein
NH44784_051261  0       12       -0.828310  hypothetical protein
NH44784_051271  0       11       -0.359549  COG1526: Uncharacterized protein required for
NH44784_051281  -1      11       1.967619   hypothetical protein
NH44784_051291  0       12       1.294407   VgrG protein
NH44784_051301  2       12       -0.178648  ClpB protein
NH44784_051311  5       12       -1.527084  Uncharacterized protein ImpH/VasB
NH44784_051321  4       11       -1.626827  Protein ImpG/VasA
NH44784_051331  7       15       -2.674463  Uncharacterized protein ImpF
NH44784_051341  5       12       -2.186401  Uncharacterized protein ImpD
NH44784_051351  5       12       -1.601053  Uncharacterized protein ImpC
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_050811  cloacin  27     0.038    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 27.4 bits (60), Expect = 0.038
Identities = 20/66 (30%), Positives = 28/66 (42%), Gaps = 1/66 (1%)

Query: 26 GYRGGHHGGYHHHHGGGGGGSGWWIGGAFALAATGLILAANSGPSYAQTGVYAPGVSYTS 85
G GG H +GGG G SG G L+A +A P+ + G VS ++
Sbjct: 51 GSGIHWGGGSGHGNGGGNGNSGGGSGTGGNLSAVAAPVAFGF-PALSTPGAGGLAVSISA 109

Query: 86 PPVYAA 91
+ AA
Sbjct: 110 GALSAA 115


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits        Score  E-value  Comments
NH44784_050891  OMPADOMAIN  74     4e-17    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 74.2 bits (182), Expect = 4e-17
Identities = 37/134 (27%), Positives = 64/134 (47%), Gaps = 11/134 (8%)

Query: 179 QSVSFRISNELLFPSGQATLSPSGLDVIKRLAAIL---NKNDYQVSVEGHSDPVPIQTRQ 235
Q+ F + +++LF +ATL P G + +L + L + D V V G++D I +
Sbjct: 211 QTKHFTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDR--IGSDA 268

Query: 236 FASNWELSSSRATSVLRELVRDGVAGSRLRAVGYAETRPIESNDTPAGRAANRRVELIMD 295
+ N LS RA SV+ L+ G+ ++ A G E+ P+ N + ++ +
Sbjct: 269 Y--NQGLSERRAQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCL-- 324

Query: 296 ITAPAKPVQAEAKG 309
AP + V+ E KG
Sbjct: 325 --APDRRVEIEVKG 336


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_050991  TCRTETB  38     6e-05    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 37.9 bits (88), Expect = 6e-05
Identities = 53/386 (13%), Positives = 116/386 (30%), Gaps = 69/386 (17%)

Query: 40 IAAPLLRRDLGLDLASIGTLTAVFSILGMVGGIAAGGIIARFGARRMLLWGLATVAAGTA 99
++ P + D AS + F + +G G + + G +R+LL+G+ G+
Sbjct: 35 VSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSV 94

Query: 100 AGALA-PGYGVLLASRVVEGLGFLLITVAGPAALQRLVLPSSRDLAFALWSCYMPAGMAI 158
G + + +L+ +R ++G G + R + +R AF L + G +
Sbjct: 95 IGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGV 154

Query: 159 AMLASQAFGDWHAYWWCAGATAVIALGCIAALAPPVPGGAALSWNGLRRDTTDTLGAAGP 218
+ + + + + + G
Sbjct: 155 GPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVR----------IKGHFDIKGI 204

Query: 219 MLLAASFTLY--------------SLMFFALFT--------------------------- 237
+L++ + S++ F +F
Sbjct: 205 ILMSVGIVFFMLFTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLC 264

Query: 238 -------------FLPVLLMDKLGLPLATAG-LYSAVASAANIIGNLGAGVLLSR-GWRR 282
+P ++ D L A G + + + II G+L+ R G
Sbjct: 265 GGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPLY 324

Query: 283 STLIACASVAMGVVALLIFQSVFAAMPTFLLCVLFSAVGGLIPATLLGTAPLVAPRPALT 342
I +++ + + + ++F G T++ T + +
Sbjct: 325 VLNIGVTFLSVSFLTASFL--LETTSWFMTIIIVFVLGGLSFTKTVISTIVSSSLKQQEA 382

Query: 343 AATVGLVMMGSNLGQVIGPVTVGGVI 368
A + L+ S L + G VGG++
Sbjct: 383 GAGMSLLNFTSFLSEGTGIAIVGGLL 408


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_051021  SACTRNSFRASE  36     1e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 36.5 bits (84), Expect = 1e-05
Identities = 12/50 (24%), Positives = 20/50 (40%)

Query: 80 VSDAARGSGVGRALEEAAEALARARGCDRIEVHCHERRVDAHRFYHRQGY 129
V+ R GVG AL A A+ + + + + A FY + +
Sbjct: 97 VAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
NH44784_051181  RTXTOXIND  38     2e-04    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 37.9 bits (88), Expect = 2e-04
Identities = 28/197 (14%), Positives = 60/197 (30%), Gaps = 11/197 (5%)

Query: 702 QSHADLQAALAARDKARLAAWHASLGDVAAELGRQWRQAGEQTAARHQELEQLLARTAHD 761
+ AD ++ +ARL + + EL + E+ + R
Sbjct: 131 GAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSL 190

Query: 762 LTAQTQAQAALLENLSTRLETAAGGVTQAWTEAQVRQELANEKLASDNQQALTTAAAAFE 821
+ Q L+ +A + + E L+ + L ++
Sbjct: 191 IKEQFSTWQNQKYQKELNLDKK-----RAERLTVLARINRYENLSRVEKSRLDDFSSLLH 245

Query: 822 QHSASLLQTLGQSQAGLQAELAARDQERLAAWTGELGRMAEALRQEWEQAGARSAALQQE 881
+ + + L Q ++A L + +L ++ + E+ + + E
Sbjct: 246 KQAIAKHAVLEQENKYVEA------VNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNE 299

Query: 882 ICATLAQTAQDIATQTE 898
I L QT +I T
Sbjct: 300 ILDKLRQTTDNIGLLTL 316


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits        Score  E-value  Comments
NH44784_051191  OMPADOMAIN  77     8e-19    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 76.9 bits (189), Expect = 8e-19
Identities = 43/142 (30%), Positives = 62/142 (43%), Gaps = 16/142 (11%)

Query: 70 ALAAPLAAGRVTLINGRIGISGSVLFALNSDQLQPEGREVLKSLIEPLSAYLKANDEILM 129
+ AP A + + VLF N L+PEG+ L L LS D ++
Sbjct: 198 PVVAPAPAPAPEVQTKHFTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNL-DPKDGSVV 256

Query: 130 VSGFTDDQQVREGNRRFADNWDLSAQRALTVTRALIDEGVPSSLVFAAAFGAEQPVASNA 189
V G+TD R G+ + N LS +RA +V LI +G+P+ + A G PV N
Sbjct: 257 VLGYTD----RIGSDAY--NQGLSERRAQSVVDYLISKGIPADKISARGMGESNPVTGNT 310

Query: 190 DEEGR---------AKNRRVEI 202
+ + A +RRVEI
Sbjct: 311 CDNVKQRAALIDCLAPDRRVEI 332


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
NH44784_051301  RTXTOXIND  41     2e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 40.6 bits (95), Expect = 2e-05
Identities = 16/109 (14%), Positives = 37/109 (33%), Gaps = 11/109 (10%)

Query: 436 ALIDDARHRLARQETERAALRREAAAGAAQSARLRELDEEMAATRQQLVDAEARLAQESE 495
A I+ + +++ A + E + + +L +++L Q
Sbjct: 221 ARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQ--- 277

Query: 496 LVRQIHALREELEAAGQEPESETTGKRRGGKAAEPTPAQAQLAELQQQL 544
+ +I + +EE + Q ++E K R + L +L
Sbjct: 278 IESEILSAKEEYQLVTQLFKNEILDKLR--------QTTDNIGLLTLEL 318


70  NH44784_051611  NH44784_051861  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_051611  -1      10       3.520433   MFS permease
NH44784_051621  -1      9        2.586504   Uncharacterized glutathione S-transferase-like
NH44784_051631  1       10       3.209992   Superoxide dismutase [Cu-Zn] precursor
NH44784_051641  1       11       3.577782   probable membrane protein STY4873
NH44784_051651  -1      11       3.004356   Glycine cleavage system transcriptional
NH44784_051661  -1      11       2.026832   Enolase
NH44784_051671  0       15       1.447946   hypothetical protein
NH44784_051681  -2      10       2.273926   hypothetical protein
NH44784_051691  -2      9        2.590990   hypothetical protein
NH44784_051701  -2      12       2.126476   hypothetical protein
NH44784_051711  -2      15       2.872686   Transcriptional regulator, LysR family
NH44784_051721  -2      14       3.432431   hypothetical protein
NH44784_051731  -2      17       3.017644   Enoyl-CoA hydratase /
NH44784_051741  -2      15       3.058341   3-ketoacyl-CoA thiolase
NH44784_051751  -1      13       2.662296   Acyl-CoA dehydrogenase, long-chain
NH44784_051761  -1      12       3.007788   Acyl-CoA dehydrogenase family protein
NH44784_051771  -1      11       2.499459   Long-chain-fatty-acid--CoA ligase
NH44784_051781  0       11       2.468225   hypothetical protein
NH44784_051791  1       10       2.617369   pyridoxal phosphate-dependent
NH44784_051801  1       11       1.334540   Transcriptional regulator, GntR family
NH44784_051811  1       11       3.180276   Universal stress protein family
NH44784_051821  1       13       3.454558   Transcriptional regulator, LysR family
NH44784_051831  0       10       3.081865   Permeases of the major facilitator superfamily
NH44784_051841  -1      10       2.708344   Transcriptional regulator, MarR family
NH44784_051851  0       9        3.463249   Niacin transporter NiaP
NH44784_051861  0       8        3.774394   Putative diheme cytochrome c-553
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_051611  TCRTETA  51     3e-09    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 51.4 bits (123), Expect = 3e-09
Identities = 70/350 (20%), Positives = 117/350 (33%), Gaps = 16/350 (4%)

Query: 38 GATAAQTGYLQTAQTLPFLLLSLPAGVLADRLPRRTLMTSAECVRAASLLALLALLALGG 97
A G L L + G L+DR RR ++ + A + L
Sbjct: 39 NDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWV 98

Query: 98 LNLGWLAALGFLGAVGTVAYNVAAPALVPTLVPAPALGQANRWLELARSAAFSAGPALGG 157
L +G + A G GA G V A A + + + ++ AGP LGG
Sbjct: 99 LYIGRIVA-GITGATGAV-----AGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGG 152

Query: 158 ALVGGIGAAAAYGLATLLSLLAA--GLLAGLPPQAPRAAARRHLWHELRDGAAFVTGHPL 215
L+GG A + A L+ L G R + G +
Sbjct: 153 -LMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTV 211

Query: 216 LRPVLVTAVFFNLAWFVLQAVYVVYAVERLGLGAAGVGVTLGVYGA-GMLAGALAAPWLA 274
+ ++ L V A++V++ +R A +G++L +G LA A+ +A
Sbjct: 212 VAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVA 271

Query: 275 GRLSFGALIAVGPLCALAASLILLSTLALPSGLWAAVGFFLFGAGPILWTIATTTLRQAI 334
RL + +G + A LA + W A + A + A +
Sbjct: 272 ARLGERRALMLG----MIADGTGYILLAFATRGWMAFPIMVLLASGGIGMPALQAMLSRQ 327

Query: 335 TPNALLGRVSAVILTATFGARPLGALIGAAL--ASRLGLEACLWVSSAGF 382
G++ + T +G L+ A+ AS W++ A
Sbjct: 328 VDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAASITTWNGWAWIAGAAL 377


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_051831  TCRTETB  194    2e-58    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 194 bits (494), Expect = 2e-58
Identities = 79/398 (19%), Positives = 159/398 (39%), Gaps = 14/398 (3%)

Query: 29 LLIVVDMTVLYTALPRLTHDLGVTASSKLWIVNIYALVVSGLLLGMGTLGDRLGHKRLFM 88
V++ VL +LP + +D +S W+ + L S G L D+LG KRL +
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 89 MGLAVFGAASLAAAYSPNPAA-LIAARAVLAVGAAMMMPATLSILRLTFADERERAIAIG 147
G+ + S+ + + LI AR + GAA PA + ++ + + R A G
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAF-PALVMVVVARYIPKENRGKAFG 142

Query: 148 VWASVASGGAALGPVIGGFLLEHFWWGSVFLINLPIVLLALPLARRYIPAGQPDRQRPWD 207
+ S+ + G +GP IGG + + W +L+ +P++ + + + + +D
Sbjct: 143 LIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKEVRIKGHFD 200

Query: 208 LVGSLQVMVGLILTAYALKELGRAQPSWLDAALACAGGAAMLVVFARRQRGRAHPLIDFA 267
+ G + + VG++ S ++ ++F + R P +D
Sbjct: 201 IKGIILMSVGIVFFMLFTTSY-----SISFLIVS----VLSFLIFVKHIRKVTDPFVDPG 251

Query: 268 IFRNRSFSSAVASALFAAAALLGMELVFSQRLQLVLGMSPIQA-ALYILPLPLAAFIAGP 326
+ +N F V + G + ++ V +S + ++ I P ++ I G
Sbjct: 252 LGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGY 311

Query: 327 LAGWLLPKVGSARLLFVALLTSGLGMGGYLFTYDGALAAQMASLCVLGVGIGATMTAASS 386
+ G L+ + G +L + + + F + + + G+ T T S+
Sbjct: 312 IGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVIST 371

Query: 387 TIMQSATPERAGMAASIEEVSYELGGALGVTLMGSILS 424
+ S + AG S+ + L G+ ++G +LS
Sbjct: 372 IVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLS 409


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_051851  TCRTETB  55     2e-10    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 55.3 bits (133), Expect = 2e-10
Identities = 33/116 (28%), Positives = 56/116 (48%), Gaps = 1/116 (0%)

Query: 45 SAPSIARTFGIPVPEALQTGTAFFVGMLIGAFAFGRLADRIGRRPVLMMAVVIDACAGVA 104
S P IA F P TAF + IG +G+L+D++G + +L+ ++I+ V
Sbjct: 36 SLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVI 95

Query: 105 SAFAPDFA-WLLLLRFVTGIGVGGTLPVDYTMMAEFLPSERRGRWLVLLESFWALG 159
F L++ RF+ G G + ++A ++P E RG+ L+ S A+G
Sbjct: 96 GFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMG 151


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
NH44784_051861  RTXTOXINA  35     0.001    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 35.3 bits (81), Expect = 0.001
Identities = 14/51 (27%), Positives = 24/51 (47%)

Query: 819 RATATATATATATATTTATTTATASAATSASVVAAPAVTTATAAAATINAV 869
+ T A+ T +T A+ ++ SAA + S+V AP A I+ +
Sbjct: 359 KETGAIDASLTTISTVLASVSSGISAAATTSLVGAPVSALVGAVTGIISGI 409


71  NH44784_052071  NH44784_052211  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_052071  0       11       3.399344   putative amidase
NH44784_052081  -1      13       3.214981   hypothetical protein
NH44784_052091  -1      14       2.934683   putative esterase
NH44784_052101  0       15       2.363104   Transcriptional regulator, LysR family
NH44784_052111  0       13       2.464071   Permeases of the major facilitator superfamily
NH44784_052121  -1      16       1.704535   hypothetical protein
NH44784_052131  -1      17       1.440219   DinG family ATP-dependent helicase CPE1197
NH44784_052141  2       17       0.783251   Hypothetical protein, restriction
NH44784_052151  3       19       -0.415922  hypothetical protein
NH44784_052161  2       20       -0.817359  Putative acetyltransferase
NH44784_052171  -2      14       -0.646947  FIG00433914: hypothetical protein
NH44784_052181  -1      10       -1.426023  hypothetical protein
NH44784_052191  1       10       -1.389048  Enoyl-[acyl-carrier-protein] reductase [NADPH]
NH44784_052201  2       9        -1.141447  Quaternary ammonium compound-resistance protein
NH44784_052211  2       10       -0.760757  putative TetR-family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_052111  TCRTETB  100    5e-25    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 100 bits (251), Expect = 5e-25
Identities = 76/407 (18%), Positives = 155/407 (38%), Gaps = 17/407 (4%)

Query: 19 LLAACLTAVLIPLCFTGPAVVLPAISHALGGTPVQLNWILNGYILAYGSVIMVSGSLTDL 78
L+ C+ + L V LP I++ P NW+ ++L + V G L+D
Sbjct: 16 LIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQ 75

Query: 79 YGPRRVWLGGLAWFCAFTFPIAWAPS-AAWIDFLRLMQGVGGAAAFAAAMSSLAPLFHGA 137
G +R+ L G+ C + S + + R +QG G AA A M +A
Sbjct: 76 LGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKE 135

Query: 138 ARARAFSLLGTTFGIGLSFGPLVSGWLVQTAGWRWVFHGTGLVGLLGLALVAASVRATAG 197
R +AF L+G+ +G GP + G + W ++ + + L+ +
Sbjct: 136 NRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLM--KLLKKEV 193

Query: 198 AASGRFDWRGALSFTGALGLFTYGMLLAPENGWRDAAVAGSLLASALLAIAFIATEMRAS 257
G FD +G + + + F ML L+ S L + F+ + +
Sbjct: 194 RIKGHFDIKGIILMSVGIVFF---MLFTTSYSI------SFLIVSVLSFLIFVKHIRKVT 244

Query: 258 HPMLDLSLFRGARFVGVQALAASPAFLFIALIAILPGRFIGIDGHSALRAGQLMMGLATP 317
P +D L + F+ ++++P + S G +++ T
Sbjct: 245 DPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTM 304

Query: 318 LLVV-PFLAALLTRRFSPAILSSIGLALAAAGLAWLARDLGGDAGRLWLPMSLIGIGIGL 376
+++ ++ +L R P + +IG+ + + L + ++ + ++ + GL
Sbjct: 305 SVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLL--ETTSWFMTIIIVFVLGGL 362

Query: 377 PWG--LMDAMAVSVAPPERVGMATGIFNTVRVSADGVAIAVISALLA 421
+ ++ + S + G + N ++G IA++ LL+
Sbjct: 363 SFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLS 409


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_052161  SACTRNSFRASE  48  1e-09  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 48.4 bits (115), Expect = 1e-09
Identities = 21/102 (20%), Positives = 43/102 (42%), Gaps = 10/102 (9%)

Query: 60 NLDARAVFIAEQDGQPIGFVCVQQEHDHPGEVLLDNLHVLPPYQGTGAGKLMVARAEAWA 119
+ +A F+ + IG + ++ + G L++++ V Y+ G G ++ +A WA
Sbjct: 61 EEEGKAAFLYYLENNCIGRIKIRSNWN--GYALIEDIAVAKDYRKKGVGTALLHKAIEWA 118

Query: 120 RARGAARLYLYALEGNTRAIAFYERQGWQYTGSEIDRIGGIE 161
+ L L + N A FY + + IG ++
Sbjct: 119 KENHFCGLMLETQDINISACHFYAKHHF--------IIGAVD 152


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_052191  DHBDHDRGNASE  128  3e-38  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 128 bits (322), Expect = 3e-38
Identities = 88/253 (34%), Positives = 132/253 (52%), Gaps = 15/253 (5%)

Query: 5 GKAALVTGGTRGIGAAIVRSLAQRGAAVAFTYANSDSAANELRRELGASGARVLGIKADS 64
GK A +TG +GIG A+ R+LA +GA +A N + ++ L A AD
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLE-KVVSSLKAEARHAEAFPADV 66

Query: 65 RDPAAVRAAVDHTARELGRLDILVNSAGVFPSGPIEDATLQEIDDTLAIHARAVFVASQA 124
RD AA+ RE+G +DILVN AGV G I + +E + T ++++ VF AS++
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 125 ALAHMGA--GGRIISIGSCFAQRVPYGGVTLYAMSKSALIGFTKGLAREVGERGITVNIV 182
+M G I+++GS VP + YA SK+A + FTK L E+ E I NIV
Sbjct: 127 VSKYMMDRRSGSIVTVGSN-PAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 183 DPGSTDTDMNPADGPAASAELALMA-----------VKRYARPAEIAAAVAYLASAESQF 231
PGST+TDM + + ++ +K+ A+P++IA AV +L S ++
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGH 245

Query: 232 ITGSSLAIDGGFT 244
IT +L +DGG T
Sbjct: 246 ITMHNLCVDGGAT 258


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_052211  HTHTETR  52  7e-11  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 52.3 bits (125), Expect = 7e-11
Identities = 29/161 (18%), Positives = 51/161 (31%), Gaps = 8/161 (4%)

Query: 1 MRTFWTQGYEGTSIQDLVAAMGVNKPSLYATFGCKEEIFREAVELYDRLEGRATSHSLAN 60
+R F QG TS+ ++ A GV + ++Y F K ++F E+++ E L
Sbjct: 21 LRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLF---SEIWELSESNIGELELEY 77

Query: 61 APTAREAVETMLRANARAYVVQDG-----PRGCMIVLSSLLGAPENESVRAFLASNRLNG 115
++LR + I+ E V+ + L
Sbjct: 78 QAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQRNLCLES 137

Query: 116 ETLLRERLAQGIAQGDLTATADIAQLAAFYTTVLEGLSIQA 156
+ + L I L A + A + GL
Sbjct: 138 YDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENW 178


72  NH44784_052601  NH44784_052761  Y  N  Y  Genomic Island
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_052601  2       25       0.297228   hypothetical protein
NH44784_052611  2       24       -0.215902  hypothetical protein
NH44784_052621  1       21       -0.629563  hypothetical protein
NH44784_052631  3       21       -0.180424  hypothetical protein
NH44784_052641  -1      14       -0.337691  hypothetical protein
NH44784_052651  -1      14       -0.293093  Transcriptional regulator, ArsR family
NH44784_052661  0       13       -2.212909  Transcriptional regulator, ArsR family
NH44784_052671  1       13       -2.673189  Bifunctional protein: zinc-containing alcohol
NH44784_052681  2       14       -3.662262  5,10-methylenetetrahydrofolate reductase
NH44784_052691  3       13       -4.916893  Transcriptional activator MetR
NH44784_052701  4       19       -5.871698  FMN reductase
NH44784_052711  3       20       -6.037742  hypothetical protein
NH44784_052721  4       24       -5.664594  5-methyltetrahydropteroyltriglutamate--
NH44784_052731  5       29       -5.859273  Transcriptional regulator, LysR family
NH44784_052741  3       18       -5.368837  putative transposase
NH44784_052751  3       17       -4.533244  Transcriptional regulator, AraC family
NH44784_052761  2       14       -3.463804  TonB-dependent siderophore receptor
73  NH44784_053631  NH44784_053801  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_053631  3       17       1.364012   Transcriptional regulator, LysR family
NH44784_053641  3       18       1.230324   putative hybrid sensor and regulator protein(
NH44784_053651  4       20       1.060546   two-component hybrid sensor and regulator
NH44784_053661  4       20       1.003476   hypothetical protein
NH44784_053671  4       21       1.086388   Hemolysin activation/secretion protein
NH44784_053681  4       21       1.039539   PUTATIVE HEMAGGLUTININ-RELATED PROTEIN
NH44784_053691  -2      10       0.261740   Peptide transport system permease protein sapC
NH44784_053701  -1      12       0.836581   putative acyl coenzyme A thioester hydrolase(
NH44784_053711  -1      12       0.737986   hypothetical protein
NH44784_053721  -2      13       0.816470   FIG00433859: hypothetical protein
NH44784_053731  -1      12       2.021115   Protein crcB homolog
NH44784_053741  0       12       2.316115   Acetylornithine
NH44784_053751  -1      11       2.521578   transcriptional regulator, LysR family
NH44784_053761  -1      10       3.187398   hypothetical protein
NH44784_053771  -1      11       2.326413   hypothetical protein
NH44784_053781  0       9        2.749452   putative two component sensor kinase
NH44784_053791  0       11       3.182098   Response regulators consisting of a CheY-like
NH44784_053801  2       14       2.792304   putative outer membrane (scaffolding) protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_053641  HTHFIS  68  5e-14  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 68.3 bits (167), Expect = 5e-14
Identities = 41/160 (25%), Positives = 71/160 (44%), Gaps = 6/160 (3%)

Query: 695 RIWLAEDTPEIREFLVEELASLGFLVESEADGRGMVARIQADDARRPDLILTDHRMEGAD 754
I +A+D IR L + L+ G+ V ++ + I A D DL++TD M +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGD---GDLVVTDVVMPDEN 61

Query: 755 GSAVLRAARARWPDLPVVAVSATPQETPPLDG--AGYDASLLKPINLVELRHLLGRLLHL 812
+L + PDLPV+ +SA + G L KP +L EL ++GR L
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 813 ATNPAANEPAPAASPLPRLSRAE-LEQVQRLLDIGGISDL 851
+ + +P + R+ ++++ R+L +DL
Sbjct: 122 PKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDL 161


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_053651  HTHFIS  73  1e-16  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 72.6 bits (178), Expect = 1e-16
Identities = 39/158 (24%), Positives = 65/158 (41%), Gaps = 11/158 (6%)

Query: 20 RVLVVEDRPEDRMLLVDFLTSQDCRVYVAEDGNDGYRKAQLVQPDIILMDVNMPVCDGLA 79
+LV +D R +L L+ V + + +R D+++ DV MP +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 80 A-CRLLKADPATRAIPVIFLTAASLPMERVAGLSAGAVDYVAKPFDFEEVRL---RLLIH 135
R+ KA P +PV+ ++A + M + GA DY+ KPFD E+ R L
Sbjct: 65 LLPRIKKARP---DLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 136 LRVTPVAQPASAAGPCEDGAPAAGTGSTLDGVLFRAAR 173
+ P + +DG P G + + + AR
Sbjct: 122 PKRRPSKLEDDS----QDGMPLVGRSAAMQEIYRVLAR 155


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_053671  PYOCINKILLER  31  0.017  Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 30.9 bits (69), Expect = 0.017
Identities = 18/66 (27%), Positives = 28/66 (42%), Gaps = 6/66 (9%)

Query: 22 GAEPGRQLPQPVMPESTPTAPPVTVQQGGATEAPAGADKLTFTLTDMRV-EGVTRYPADT 80
+ PG Q P +TP P GAT P A T+ + + +PAD+
Sbjct: 419 ASPPGNQNPS----STTPVVPKPVPVYEGATLTPVKATPETYPGVITLPEDLIIGFPADS 474

Query: 81 -LRPLY 85
++P+Y
Sbjct: 475 GIKPIY 480


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_053681  PF05860  71  4e-16  haemagglutination activity domain.
		>PF05860#haemagglutination activity domain.

Length = 117

Score = 71.0 bits (174), Expect = 4e-16
Identities = 31/109 (28%), Positives = 51/109 (46%), Gaps = 7/109 (6%)

Query: 44 PTGGNVVAGGATINDRGNGTLDINQSTGKAIINWKDFSIGANETVNFRQPGSNSITLNRV 103
P N+ G T Q+ ++++FS+ + T F P + ++RV
Sbjct: 10 PINSNITTEGNTRIIERG-----TQAGSNLFHSFQEFSVPTSGTAFFNNPTNIQNIISRV 64

Query: 104 VGNDPSAIFGRLNANGT--VMLVNPNGVLFGKGARIDVGGLVATTANIS 150
G S I G + AN T + L+NPNG++FG+ AR+D+GG +
Sbjct: 65 TGGSVSNIDGLIRANATANLFLINPNGIIFGQNARLDIGGSFVGSTANR 113


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_053711  FERRIBNDNGPP  33  0.001  Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 33.4 bits (76), Expect = 0.001
Identities = 21/58 (36%), Positives = 27/58 (46%), Gaps = 9/58 (15%)

Query: 2 PFTRRHALGALAALPLLPRLAFAQAQADFPTR-------PVRLL--VGFAPGGLTDIA 50
+RR L A+A PLL ++ A A A P R PV LL +G P G+ D
Sbjct: 6 LISRRRLLTAMALSPLLWQMNTAHAAAIDPNRIVALEWLPVELLLALGIVPYGVADTI 63


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_053791  HTHFIS  87  4e-22  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 87.2 bits (216), Expect = 4e-22
Identities = 32/167 (19%), Positives = 69/167 (41%), Gaps = 14/167 (8%)

Query: 3 VLLVEDDAMLGDALRSSMVGAGLRVDWVRDVPAARLALVEHGYGAVLLDLGLPGGSGLAV 62
+L+ +DDA + L ++ AG V + + V+ D+ +P + +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 63 LKHLRARYDATPVLIITARDRLSERVQGLDAGADDYIVKPFQLDELLARLRAVVRRSSNS 122
L ++ PVL+++A++ ++ + GA DY+ KPF L EL+ + +
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE---- 121

Query: 123 VVSVLRCRDVQLDPARRVVTRA-GAEVVLSASEYRTLLALMERAGQT 168
+ P++ G +V ++ + + ++ R QT
Sbjct: 122 ---------PKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQT 159


74  NH44784_054351  NH44784_054451  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_054351  2       14       0.495481   Transcriptional regulator, MarR family
NH44784_054361  3       15       0.605870   Hydroxyethylthiazole kinase
NH44784_054371  3       14       0.127595   Ferrichrome-iron receptor
NH44784_054381  2       15       -0.302571  2',3'-cyclic-nucleotide 2'-phosphodiesterase
NH44784_054391  2       14       -0.149361  Iron(III) dicitrate transport ATP-binding
NH44784_054401  0       13       0.840587   Petrobactin ABC transporter, permease protein
NH44784_054411  -1      10       1.258638   Ferric anguibactin transport system permease
NH44784_054421  -2      11       1.096006   Iron compound ABC transporter, periplasmic iron
NH44784_054431  -2      11       0.892817   TonB-dependent receptor; Outer membrane receptor
NH44784_054441  0       11       2.717006   Copper-sensing two-component system response
NH44784_054451  -1      13       3.074123   Copper sensory histidine kinase CpxA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_054421  FERRIBNDNGPP  58  6e-12  Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 58.4 bits (141), Expect = 6e-12
Identities = 52/209 (24%), Positives = 81/209 (38%), Gaps = 21/209 (10%)

Query: 23 ALGAASALSVA-QTMPVKHARGETAVPSNPAKTVVLDLAVLDTLHALGVDVTGVPSAAKL 81
L A ALS M HA P + V L+ ++ L ALG+ GV
Sbjct: 11 RLLTAMALSPLLWQMNTAHAAAID-----PNRIVALEWLPVELLLALGIVPYGVADTINY 65

Query: 82 ------PPQLAQYADKRYLKVGSMFEPNYEVIHAAQPQVIFVAGRSAPKYDELAKLAPTV 135
PP D VG EPN E++ +P + + P + LA++AP
Sbjct: 66 RLWVSEPPLPDSVID-----VGLRTEPNLELLTEMKPSFMVWSAGYGPSPEMLARIAPGR 120

Query: 136 DLTVNAKDLVGSVTRNT-ETLAAIYGKQAVAKEKLDALRASIAALHGQAAKAGSALIVLT 194
+ ++ R + +A + Q+ A+ L I ++ + K G+ ++LT
Sbjct: 121 GFNFSDGKQPLAMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLT 180

Query: 195 T---GGKMSAYGPGSRFGVIHDAFGFAPA 220
T M +GP S F I D +G A
Sbjct: 181 TLIDPRHMLVFGPNSLFQEILDEYGIPNA 209


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_054441  HTHFIS  84  7e-21  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 84.1 bits (208), Expect = 7e-21
Identities = 39/137 (28%), Positives = 60/137 (43%), Gaps = 1/137 (0%)

Query: 1 MATRRILLIDDDVDLALMLREYLEPTHIELTLAHNSAQGLPLAMQECFDLVLLDLMLPDG 60
M IL+ DDD + +L + L ++ + N+A DLV+ D+++PD
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 NGLDLLRHLR-RHSRRPVIMFTAHGGETDRVLGLELGADDYLTKPFGPRELKARIGAVLR 119
N DLL ++ PV++ +A + E GA DYL KPF EL IG L
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 120 RFEEKPEPAPPELSVGA 136
+ +P + G
Sbjct: 121 EPKRRPSKLEDDSQDGM 137


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_054451  PF06580  31  0.009  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.0 bits (70), Expect = 0.009
Identities = 12/44 (27%), Positives = 21/44 (47%)

Query: 360 IENVLRNALRFAPAGGAIQARLLVDAGEACLEIEDNGPGVPEKE 403
+EN +++ + P GG I + D G LE+E+ G +
Sbjct: 264 VENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNT 307


75  NH44784_054751  NH44784_055011  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_054751  2       15       -3.980656  Chaperone protein HscB
NH44784_054761  1       12       -3.750144  Iron binding protein IscA for iron-sulfur
NH44784_054771  0       12       -3.534394  Iron-sulfur cluster assembly scaffold protein
NH44784_054781  0       12       -2.708075  Cysteine desulfurase
NH44784_054791  -1      12       -1.968519  Iron-sulfur cluster regulator IscR
NH44784_054801  2       13       -2.809508  Low molecular weight protein tyrosine
NH44784_054811  3       14       -3.113787  Excinuclease ABC subunit B
NH44784_054821  4       18       -3.477305  hypothetical protein
NH44784_054831  6       20       -3.382143  Aromatic-amino-acid aminotransferase
NH44784_054841  5       19       -3.728747  SOS-response repressor and protease LexA
NH44784_054851  4       18       -4.233491  ATP-dependent protease La
NH44784_054861  4       19       -4.149677  ATP-dependent Clp protease ATP-binding subunit
NH44784_054871  2       19       -3.134790  ATP-dependent Clp protease proteolytic subunit
NH44784_054881  4       20       -3.460583  Cell division trigger factor
NH44784_054891  0       16       -4.659280  Putative metalloprotease yggG
NH44784_054901  1       17       -3.701793  hypothetical protein
NH44784_054911  4       18       -3.039740  cold shock-like protein
NH44784_054921  1       17       -1.962871  FIG027190: Putative transmembrane protein
NH44784_054931  1       22       -2.489288  Cold shock protein CspA
NH44784_054941  -2      17       -0.202782  FIG00431460: hypothetical protein
NH44784_054951  1       16       1.285841   hypothetical protein
NH44784_054961  2       18       0.410135   hypothetical protein
NH44784_054971  3       17       1.837537   hypothetical protein
NH44784_054981  3       17       2.004111   Endoribonuclease L-PSP
NH44784_054991  3       16       2.177580   METAL-ACTIVATED PYRIDOXAL ENZYME
NH44784_055001  3       16       2.651449   Transcriptional regulator, LysR family
NH44784_055011  2       14       2.435213   putative translational inhibitor
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_054851  GPOSANCHOR  34  0.002  Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 33.9 bits (77), Expect = 0.002
Identities = 24/156 (15%), Positives = 55/156 (35%), Gaps = 14/156 (8%)

Query: 128 SETEALRRAIVAQFEQYVKLNKKIPPEILTSLAGIDDAGRLADTIAAHLPLKLEQKQKML 187
++ E + K + E A + + + + +
Sbjct: 228 ADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKT-- 285

Query: 188 EIVGTSERLEGLLTQLETEIDILQVEKRIRGRVKKQMEKSQRDYYLNEQVKAIQKELGEG 247
+ LE LE + +L R +++ ++ S+ K ++ E +
Sbjct: 286 -LEAEKAALEAEKADLEHQSQVLNAN---RQSLRRDLDASREAK------KQLEAEHQKL 335

Query: 248 EEGADIEELEKKIIAAHM--PKEARKKADAELKKLK 281
EE I E ++ + + +EA+K+ +AE +KL+
Sbjct: 336 EEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLE 371


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_054861  HTHFIS  29  0.048  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 28.6 bits (64), Expect = 0.048
Identities = 12/46 (26%), Positives = 23/46 (50%), Gaps = 3/46 (6%)

Query: 109 VELSKSNIMLIGPTGSGKTLLAQTL---ARMLNVPFVMADATTLTE 151
+ + +M+ G +G+GK L+A+ L + N PFV + +
Sbjct: 156 LMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPR 201


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_054991  ALARACEMASE  42  4e-06  Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 41.7 bits (98), Expect = 4e-06
Identities = 34/182 (18%), Positives = 58/182 (31%), Gaps = 36/182 (19%)

Query: 41 TTASVTLDTLDTPCLLLDETRMTRNIERLNRLMA-------GHGVQLRPHLKTPKSIEVA 93
AS+ L L + R R+ + GHG+ +
Sbjct: 5 IQASLDLQALK------QNLSIVRQAATHARVWSVVKANAYGHGI-----------ERIW 47

Query: 94 RRVMARPEGPAAVSTLQEAEQFAAAGVTD--LLYAVGVAPAKLERVLALRRGGVDLTVVV 151
+ A A+ L+EA G L+ LE LT V
Sbjct: 48 SAIGAT--DGFALLNLEEAITLRERGWKGPILMLEGFFHAQDLEIYDQ-----HRLTTCV 100

Query: 152 DSVEAAQAVAARALEEGAAIPALIEIDCDGHRAGVQPGQAGQLLEIARALHAAGCLRGVM 211
S +A+ L A + ++++ +R G QP + + + RA+ G +M
Sbjct: 101 HSNWQLKALQNARL--KAPLDIYLKVNSGMNRLGFQPDRVLTVWQQLRAMANVGE-MTLM 157

Query: 212 TH 213
+H
Sbjct: 158 SH 159


76  NH44784_055181  NH44784_055721  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_055181  1       28       -3.064441  hypothetical protein
NH44784_055191  1       26       -3.453050  hypothetical protein
NH44784_055201  1       27       -2.969233  hypothetical protein
NH44784_055211  2       32       -5.317407  phage-related hypothetical protein
NH44784_055221  2       30       -4.433828  hypothetical protein
NH44784_055231  2       29       -4.079679  phage-related putative exported protein
NH44784_055241  3       31       -3.696766  phage-related hypothetical protein
NH44784_055251  4       32       -4.246305  phage-related putative membrane protein
NH44784_055261  1       32       -4.242458  Phage tail fiber protein
NH44784_055271  3       31       -3.313115  Phage tail assembly protein
NH44784_055281  4       30       -3.739767  Phage minor tail protein
NH44784_055291  3       31       -3.517723  EBNA-1
NH44784_055301  3       28       -4.240395  Phage minor tail protein #tail protein M
NH44784_055311  2       23       -3.957890  Phage tail length tape-measure protein 1
NH44784_055321  3       16       -3.582524  hypothetical protein
NH44784_055331  2       19       -2.910300  hypothetical protein
NH44784_055341  3       23       -3.079510  hypothetical protein
NH44784_055351  2       22       -3.557249  hypothetical protein
NH44784_055361  3       25       -3.504077  hypothetical protein
NH44784_055371  4       30       -3.553683  hypothetical protein
NH44784_055381  3       31       -3.970701  Prophage Clp protease-like protein
NH44784_055391  3       31       -4.495260  Phage portal
NH44784_055401  4       29       -4.678758  hypothetical protein
NH44784_055411  5       30       -4.133849  COG5525: Bacteriophage tail assembly protein
NH44784_055421  5       26       -2.993156  hypothetical protein
NH44784_055431  5       27       -2.916019  hypothetical protein
NH44784_055441  4       26       -1.924212  hypothetical protein
NH44784_055451  3       26       -1.671121  hypothetical protein
NH44784_055461  4       28       -2.176575  phage-related hypothetical protein
NH44784_055471  1       29       -3.233252  hypothetical protein
NH44784_055481  1       36       -6.934498  hypothetical protein
NH44784_055491  0       38       -7.932047  hypothetical protein
NH44784_055501  -1      35       -7.734267  hypothetical protein
NH44784_055511  1       39       -7.638145  hypothetical protein
NH44784_055521  1       40       -6.410447  hypothetical protein
NH44784_055531  1       39       -4.744927  hypothetical protein
NH44784_055541  1       36       -3.285527  hypothetical protein
NH44784_055551  1       34       -2.605415  hypothetical protein
NH44784_055561  0       35       -2.741147  hypothetical protein
NH44784_055571  0       31       -2.976411  hypothetical protein
NH44784_055581  0       28       -3.168651  hypothetical protein
NH44784_055591  2       24       -3.368444  hypothetical protein
NH44784_055601  3       24       -2.934581  hypothetical protein
NH44784_055611  4       24       -3.194608  hypothetical protein
NH44784_055621  3       22       -2.684918  Single-stranded DNA-binding protein
NH44784_055631  2       23       -2.503511  phage-related hypothetical protein
NH44784_055641  1       18       -1.753009  DNA recombination-dependent growth factor C
NH44784_055651  0       16       -1.601770  hypothetical protein
NH44784_055661  0       11       -0.920973  hypothetical protein
NH44784_055671  0       10       -0.533309  hypothetical protein
NH44784_055681  0       12       0.357469   hypothetical protein
NH44784_055701  2       15       0.724022   *N-succinyl-L,L-diaminopimelate aminotransferase
NH44784_055711  2       15       1.656277   2,3,4,5-tetrahydropyridine-2,6-dicarboxylate
NH44784_055721  2       13       1.259542   N-succinyl-L,L-diaminopimelate desuccinylase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_055231  BACINVASINB  25  0.029  Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 25.5 bits (55), Expect = 0.029
Identities = 12/25 (48%), Positives = 18/25 (72%)

Query: 8 IRGYVVAALAAVAAVVLVYLRGRRA 32
I G +VAA+A VA +V+V + G+ A
Sbjct: 409 IVGAIVAAIAMVAVIVVVAVVGKGA 433


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_055291  cloacin  33  0.003  Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 32.8 bits (74), Expect = 0.003
Identities = 34/126 (26%), Positives = 44/126 (34%), Gaps = 18/126 (14%)

Query: 287 GGAGRMSQGWVATTPGVAMAIVVGAGGGTATAGGASSFAGLNPAAGAAGKNAIAGGYGGA 346
GG GR +T G G G G + G+ + NP G +G GG G
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 347 GASGGGIAGADRTSLRGGSGGNGGASFFGAGGTGGEAASGSSGLAGSSPDAGAYGAGGGG 406
G GG GN G G GTGG ++ ++ +A P GAGG
Sbjct: 63 GNGGG--------------NGNSG----GGSGTGGNLSAVAAPVAFGFPALSTPGAGGLA 104

Query: 407 AGGGGD 412

Sbjct: 105 VSISAG 110



Score = 31.2 bits (70), Expect = 0.010
Identities = 31/99 (31%), Positives = 38/99 (38%), Gaps = 7/99 (7%)

Query: 326 GLNPAAGAAGKNAIAGGYGGAGASGGGIAGADRTSLRGGSGGNGGASFFGAGGTGGEAAS 385
G N A + N I GG G G GG G+ +S GG G S GG G
Sbjct: 8 GHNTGAHSTSGN-INGGPTGLGVGGGASDGSGWSSENNPWGG-GSGSGIHWGGGSGHGNG 65

Query: 386 GSSGLAGSSPDAGAYGAGGGGAGGGGDGAFGVGGAAGRG 424
G +G +G G G GG + AFG + G
Sbjct: 66 GGNGNSG-----GGSGTGGNLSAVAAPVAFGFPALSTPG 99



Score = 30.5 bits (68), Expect = 0.016
Identities = 23/84 (27%), Positives = 30/84 (35%), Gaps = 2/84 (2%)

Query: 344 GGAGASGGGIAGADRTSLRGGSGGNGGASFFGAGGTGGEAASGSSGLAGSSPDAGAYGAG 403
GG G A + ++ GG G G GA G ++ + GS G
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGG--GASDGSGWSSENNPWGGGSGSGIHWGGGS 60

Query: 404 GGGAGGGGDGAFGVGGAAGRGGVV 427
G G GGG + G G G V
Sbjct: 61 GHGNGGGNGNSGGGSGTGGNLSAV 84



Score = 29.3 bits (65), Expect = 0.031
Identities = 23/73 (31%), Positives = 27/73 (36%), Gaps = 13/73 (17%)

Query: 365 SGGNGGASFFGAGGTGGEAASGSSGLAGSS-------------PDAGAYGAGGGGAGGGG 411
SGG+G GA T G G +GL P G G+G GG G
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 412 DGAFGVGGAAGRG 424
G G G +G G
Sbjct: 62 HGNGGGNGNSGGG 74


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_055311  RTXTOXIND  48  2e-07  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 47.5 bits (113), Expect = 2e-07
Identities = 26/195 (13%), Positives = 70/195 (35%), Gaps = 14/195 (7%)

Query: 431 QLTEALEKNARLEAAIRAVNPEDDRISSKAIREREAETRKKFQDKDAVSAGQNSLSGQLA 490
L +A + R + R++ E +++ + + ++ +++
Sbjct: 142 SLLQARLEQTRYQILSRSI--ELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQ 199

Query: 491 AMQAQARMREEALRAETAALEGQ-----RAAGLLSEEAFIRRRAAAQRA-ALSDELDIAR 544
+ Q + + RAE + + + + ++A A L+
Sbjct: 200 NQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQEN 259

Query: 545 KQADIAGGKKQIAERERYTGRVQELEAQIARSQQQEATDIEKYQAKIRGALRATQLDIAN 604
K + E Y +++++E++I ++++ + ++ +I LR T +I
Sbjct: 260 KYVEAVN------ELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGL 313

Query: 605 YSETRALQESRQNNA 619
+ A E RQ +
Sbjct: 314 LTLELAKNEERQQAS 328


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_055411  PF03544  30  0.019  Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 30.3 bits (68), Expect = 0.019
Identities = 16/70 (22%), Positives = 22/70 (31%), Gaps = 1/70 (1%)

Query: 608 LWVAEHLGLTRKTEAWWDAMAAKLDALPPPADSVEADDAPPPAARGRRPSPAVPAEAAPA 667
V + + L + M A D PP A + P P P P EA
Sbjct: 35 TSVHQVIELPAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPE-PEPEPIPEPPKEAPVV 93

Query: 668 KPRAPPRPAR 677
+ P+P
Sbjct: 94 IEKPKPKPKP 103


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_055531  ABC2TRNSPORT  26  0.009  ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 25.7 bits (56), Expect = 0.009
Identities = 9/22 (40%), Positives = 17/22 (77%)

Query: 7 MLVMFVIGFGVGVIYGRMKGMS 28
++ +F +G G+GV+ GR+ G+S
Sbjct: 43 LIYLFGLGAGLGVMVGRVGGVS 64


77  NH44784_055821  NH44784_055951  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_055821  2       13       0.021927   chorismate mutase
NH44784_055831  1       13       0.278933   ATP-dependent DNA helicase UvrD/PcrA
NH44784_055841  2       15       0.505816   Fatty acid desaturase
NH44784_055851  2       13       0.870895   hypothetical protein
NH44784_055861  2       13       0.854463   Ribosomal RNA small subunit methyltransferase B
NH44784_055871  -1      14       -0.000691  hypothetical protein
NH44784_055881  -1      15       -0.019973  Phosphoribosylglycinamide formyltransferase
NH44784_055891  -2      15       0.145102   Riboflavin kinase / FMN adenylyltransferase
NH44784_055901  -1      15       0.451516   Isoleucyl-tRNA synthetase
NH44784_055911  2       13       1.606105   Lipoprotein signal peptidase
NH44784_055921  2       11       2.096704   Phosphopantothenoylcysteine decarboxylase
NH44784_055931  2       8        1.442829   probable extra-cytoplasmic solute receptor
NH44784_055941  4       10       1.569996   AroM protein
NH44784_055951  2       10       1.647570   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_055821  TYPE3IMSPROT  31  5e-04  Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 30.9 bits (70), Expect = 5e-04
Identities = 13/33 (39%), Positives = 18/33 (54%), Gaps = 4/33 (12%)

Query: 56 EDQQVQRIRRLAEEAGVP----PALAETILREV 84
D QVQ +R++AEE GVP LA + +
Sbjct: 288 TDAQVQTVRKIAEEEGVPILQRIPLARALYWDA 320


78  NH44784_056331  NH44784_056461  Y  N  N  Genomic Island
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_056331  4       16       -0.351600  ATP-dependent Clp protease adaptor protein ClpS
NH44784_056341  3       10       0.253046   Cold shock protein CspD
NH44784_056351  3       9        0.764900   exported protein
NH44784_056361  2       9        0.430125   Acetyl-CoA acetyltransferase
NH44784_056371  2       9        2.097467   hypothetical protein
NH44784_056381  2       9        2.396162   Chloride channel protein
NH44784_056391  2       8        2.364389   Methyl-accepting chemotaxis protein I (serine
NH44784_056401  4       13       2.078115   hypothetical protein
NH44784_056411  3       10       2.311841   Superoxide dismutase [Fe]
NH44784_056421  3       11       2.996254   Ankyrin
NH44784_056431  0       10       2.312053   hypothetical protein
NH44784_056441  -1      11       2.749095   Exodeoxyribonuclease VII large subunit
NH44784_056451  -1      13       2.355134   hypothetical protein
NH44784_056461  0       11       3.281646   MotA/TolQ/ExbB proton channel family protein
79  NH44784_057031  NH44784_057251  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_057031  -2      12       3.308825   Probable glutathione S-transferase-related
NH44784_057041  -1      11       3.422462   Ferredoxin
NH44784_057051  0       11       2.660880   ADP-ribose pyrophosphatase
NH44784_057061  0       10       2.157806   Transcriptional regulator, LysR family
NH44784_057071  1       10       2.227461   Putative heat shock protein YegD
NH44784_057081  1       10       2.729246   Inner-membrane permease FptX, ferripyochelin
NH44784_057091  1       10       2.283465   hypothetical protein
NH44784_057101  1       11       2.255842   Outer membrane receptor for ferric-pyochelin
NH44784_057111  -1      9        2.592146   Outer membrane receptor for ferric-pyochelin
NH44784_057121  0       10       4.226572   iron aquisition regulator
NH44784_057131  1       9        4.073589   Transcriptional regulator, AraC family
NH44784_057141  2       13       3.488327   major facilitator superfamily MFS_1
NH44784_057151  1       11       2.313293   hypothetical protein
NH44784_057161  3       11       2.509887   FIG00431968: hypothetical protein
NH44784_057171  1       10       2.441706   Isochorismatase
NH44784_057181  -1      10       1.854817   GCN5-related N-acetyltransferase
NH44784_057191  -1      13       2.213097   Uncharacterized glutathione S-transferase-like
NH44784_057201  -1      13       2.574907   OsmC/Ohr family protein
NH44784_057211  0       13       2.854305   Gfa-like protein
NH44784_057221  0       12       2.743113   Transcriptional regulator, IclR family
NH44784_057231  2       12       3.100039   hypothetical protein
NH44784_057241  2       13       3.357152   putative 2-pyrone-4,6-dicarboxylic acid
NH44784_057251  0       13       3.034943   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_057071  SHAPEPROTEIN  46  2e-07  Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 45.5 bits (108), Expect = 2e-07
Identities = 32/127 (25%), Positives = 53/127 (41%), Gaps = 21/127 (16%)

Query: 97 LLTRFIAELKRRAETAAGRDFTRAVLGRPVFFVDDNPAADQTAQDTLGEIARSVGFTDIE 156
+L FI ++ + R R ++ PV A Q + + E A+ G ++
Sbjct: 90 MLQHFIKQVHSNS---FMRPSPRVLVCVPV-------GATQVERRAIRESAQGAGAREVF 139

Query: 157 FQFEPLAAAFDYESQIDREELVLVIDIGGGTSDFSLIRLGPGRAAKPDRRDDILAYGGVH 216
EP+AAA + +V+DIGGGT++ ++I L ++ V
Sbjct: 140 LIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLN-----------GVVYSSSVR 188

Query: 217 IGGVDFD 223
IGG FD
Sbjct: 189 IGGDRFD 195


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_057141  TCRTETB  30  0.014  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 30.2 bits (68), Expect = 0.014
Identities = 17/96 (17%), Positives = 32/96 (33%), Gaps = 3/96 (3%)

Query: 273 FFAGALIQRFGLAKVLGAGMLLNVACALIAMASPSLPAFYAALFCLGVGWNFMFVGGTTL 332
+ G L+ R G VL G+ L A L + + + + + T
Sbjct: 311 YIGGILVDRRGPLYVLNIGVTFLSVSFLTASF---LLETTSWFMTIIIVFVLGGLSFTKT 367

Query: 333 LAQSYRPSERARAQGAAEMLRYAATAVATLAAGPAL 368
+ + S + + A M T+ + G A+
Sbjct: 368 VISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAI 403


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_057171  ISCHRISMTASE  40  2e-06  Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 40.0 bits (93), Expect = 2e-06
Identities = 31/132 (23%), Positives = 52/132 (39%), Gaps = 15/132 (11%)

Query: 3 SALIVIDVQRALFETTPEPAD-ASAVLARINDLAERARAAAAPVIYVQHEAAGSPLAQG- 60
+ L++ D+Q + A + + A I L + PV+Y + +P +
Sbjct: 31 AVLLIHDMQNYFVDAFTAGASPVTELSANIRKLKNQCVQLGIPVVYTAQPGSQNPDDRAL 90

Query: 61 -----EPGWQLD-------TRLRPADGDIRIRKTTPDSFLRTGLGDALSQAGVTQLVVCG 108
PG T L P D D+ + K +F RT L + + + G QL++ G
Sbjct: 91 LTDFWGPGLNSGPYEEKIITELAPEDDDLVLTKWRYSAFKRTNLLEMMRKEGRDQLIITG 150

Query: 109 -YASEFCVDTTV 119
YA C+ T
Sbjct: 151 IYAHIGCLVTAC 162


80  NH44784_057851  NH44784_058041  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_057851  1       10       3.020480   D-amino acid dehydrogenase small subunit
NH44784_057861  -1      9        3.015167   putative inner membrane protein
NH44784_057871  0       11       3.418657   Phenazine biosynthesis protein PhzF like
NH44784_057881  -2      10       3.196618   Transcriptional regulator, GntR family domain /
NH44784_057891  -1      10       3.329020   hypothetical protein
NH44784_057901  -1      9        1.698409   hypothetical protein
NH44784_057911  -2      10       1.649483   Predicted dye-decolorizing peroxidase
NH44784_057921  -1      12       1.786098   FIG00460227: hypothetical protein
NH44784_057931  -2      12       0.852077   Transcriptional regulator, LysR family
NH44784_057941  0       10       1.332597   permease
NH44784_057951  -1      8        -0.252511  transcriptional regulator, AraC family
NH44784_057961  0       11       -0.377578  hypothetical protein
NH44784_057971  2       11       0.097971   PhnB protein; putative DNA binding
NH44784_057981  2       13       -0.217042  hypothetical protein
NH44784_057991  4       16       -0.691298  Poly(3-hydroxybutyrate) depolymerase
NH44784_058001  2       22       -3.475646  hypothetical protein
NH44784_058011  2       15       -0.745751  hypothetical protein
NH44784_058021  3       15       1.016095   COG1272: Predicted membrane protein hemolysin
NH44784_058031  3       14       1.669215   ortholog of Bordetella pertussis (BX470248)
NH44784_058041  2       12       1.606741   Cold shock protein CspA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_057941  TCRTETB  96  1e-23  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 96.5 bits (240), Expect = 1e-23
Identities = 75/389 (19%), Positives = 138/389 (35%), Gaps = 12/389 (3%)

Query: 30 MATLDTSMTNTALPSMASQLGVGAADVIGVVTVYQLVMVATMLPLAALAGKIGHRRVFLP 89
+ L+ + N +LP +A+ A V T + L L+ ++G +R+ L
Sbjct: 25 FSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLF 84

Query: 90 ALWLFLAASVWCGAAQSLWS-LEAARAAQGLAAAALMGCNMALVSAIYRKEELGRGMGLN 148
+ + SV S +S L AR QG AAA M +V+ KE G+ GL
Sbjct: 85 GIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLI 144

Query: 149 AMIAAASLAGGPVLASAMLTVLSWHWLFYVNVPFCLVALMLAGRHLPRLPGDGKA-LDAG 207
I A GP + + + HW + + +P + + L + K D
Sbjct: 145 GSIVAMGEGVGPAIGGMIAHYI--HWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIK 202

Query: 208 AAVLCALAFGLAVQGLERAAQPDWGTALIWACCALSWALLLRRERLSAHAIVPLDLLRLA 267
+L ++ + + ++ LS+ + ++ R V L +
Sbjct: 203 GIILMSVGIVFFMLFTTSYS---ISFLIVS---VLSFLIFVKHIRKVTDPFVDPGLGKNI 256

Query: 268 PFSMAVLVCVLAFAAQGAAMVALPFLLNQTLGRGIAEVG-MLIAPWPVMGACLAPFSGKW 326
PF + VL + F + +P+++ AE+G ++I P + G
Sbjct: 257 PFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGIL 316

Query: 327 SDRMSASLLGAVGLALLAAGLLILSGLPAGAPIWAVMLPMALCGTGFGLFLSPNQRFIMF 386
DR + +G+ L+ L S L + ++ + + G + +
Sbjct: 317 VDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVIST-IVSS 375

Query: 387 SVPAHRASVASGLSGLARLLGQTMGAAFV 415
S+ A L L + G A V
Sbjct: 376 SLKQQEAGAGMSLLNFTSFLSEGTGIAIV 404


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_057981  LIPOLPP20  29  0.008  LPP20 lipoprotein precursor signature.
		>LIPOLPP20#LPP20 lipoprotein precursor signature.

Length = 175

Score = 28.6 bits (63), Expect = 0.008
Identities = 29/112 (25%), Positives = 52/112 (46%), Gaps = 9/112 (8%)

Query: 14 IRGLSVLATLLEKGAAHAAANGI---DPAELVNARLAPDMYPLSGQVQRASDASKFAVQR 70
I G+SV+A ++ G +HA +GI + A + APD + G +++ + K++
Sbjct: 8 ILGMSVVAAMVIVGCSHAPKSGISKSNKAYKEATKGAPDW--VVGDLEKVAKYEKYSGVF 65

Query: 71 LSQVE---SPRFPDEETTFEQLRQRVADTIAYLRSVPADKLDGAEARTITLS 119
L + E + D T + R A+ A L+S L+ + RT+ S
Sbjct: 66 LGRAEDLITNNDVDYSTNQATAKAR-ANLAANLKSTLQKDLENEKTRTVDAS 116


81  NH44784_058311  NH44784_058741  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
NH44784_058311  1  12  3.321528  Cyanate ABC transporter, ATP-binding protein
NH44784_058321  1  13  3.475937  Cyanate hydratase
NH44784_058331  4  14  4.405400  Pyridoxal kinase
NH44784_058341  4  16  2.370914  hypothetical protein
NH44784_058351  2  13  2.253241  hypothetical protein
NH44784_058361  2  13  1.809271  Survival protein SurA precursor (Peptidyl-prolyl
NH44784_058371  0  16  0.896121  FIG00958542: hypothetical protein
NH44784_058381  -1  16  1.792115  hypothetical protein
NH44784_058391  -1  15  0.860944  Putative outer membrane TonB-dependent receptor
NH44784_058401  0  10  0.930472  Ferric siderophore transport system, periplasmic
NH44784_058411  -1  11  3.235898  Biopolymer transport protein ExbD/TolR
NH44784_058421  0  12  3.606724  MotA/TolQ/ExbB proton channel family protein
NH44784_058431  0  12  3.556286  MotA/TolQ/ExbB proton channel family protein
NH44784_058441  0  13  3.708187  Hemolysin activation/secretion protein
NH44784_058451  1  12  3.996649  hypothetical protein
NH44784_058461  1  13  4.385017  Filamentous haemagglutinin family outer membrane
NH44784_058471  0  17  4.520375  Sigma factor regulator VreR (cytoplasmic
NH44784_058481  0  18  4.135049  RNA polymerase sigma-70 factor, ECF subfamily
NH44784_058491  1  19  4.380767  hypothetical protein
NH44784_058501  1  18  4.250571  Filamentous haemagglutinin family outer membrane
NH44784_058511  0  19  4.419989  Filamentous haemagglutinin family outer membrane
NH44784_058521  0  21  2.779452  Sigma factor regulator VreR (cytoplasmic
NH44784_058531  0  17  1.844304  RNA polymerase sigma-70 factor, ECF subfamily
NH44784_058541  2  18  2.521345  hypothetical protein
NH44784_058551  -1  20  1.964267  hypothetical protein
NH44784_058561  -1  19  0.566120  hypothetical protein
NH44784_058571  -1  19  0.383420  Transcriptional regulator, MerR family
NH44784_058581  -1  23  -0.100935  Ribulose-5-phosphate 4-epimerase and related
NH44784_058591  -1  25  -1.978815  glutathione S-transferase-like
NH44784_058601  0  29  -3.001191  uncharacterized protein UPF0065
NH44784_058611  -1  34  -4.697716  hypothetical protein
NH44784_058621  0  31  -4.426834  transcriptional regulator, LysR-family
NH44784_058631  0  31  -4.466891  2-Methylcitrate dehydratase( EC:4.2.1.79
NH44784_058641  0  30  -4.272870  hypothetical protein
NH44784_058651  0  30  -4.112160  Immune-responsive protein 1
NH44784_058661  0  27  -3.954896  Transcriptional regulator, LysR family
NH44784_058671  0  28  -3.976543  hypothetical protein
NH44784_058681  -1  28  -4.311771  hypothetical protein
NH44784_058691  0  29  -5.178062  3-oxoacyl-[acyl-carrier protein] reductase
NH44784_058701  0  30  -4.609880  hypothetical protein
NH44784_058711  -1  31  -4.660774  amidohydrolase 2
NH44784_058721  0  34  -5.145095  Gluconolactonase
NH44784_058731  0  35  -4.991271  hypothetical protein
NH44784_058741  0  36  -4.347743  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_058361  cdtoxina  28  0.049  Cytolethal distending toxin A signature.
		>cdtoxina#Cytolethal distending toxin A signature.

Length = 258

Score = 28.1 bits (62), Expect = 0.049
Identities = 14/45 (31%), Positives = 18/45 (40%), Gaps = 6/45 (13%)

Query: 33 SPAQSSESSPVPAKAAAAPEKGAAPAAAATAQPAPAKAPAVATLG 77
P+ P+P A P GA P P P APAV+ +
Sbjct: 46 VPSPDEPGLPLPGPGPALPTNGAIPI------PEPGTAPAVSLMN 84


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_058381  RTXTOXIND  36  6e-05  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 36.3 bits (84), Expect = 6e-05
Identities = 17/134 (12%), Positives = 38/134 (28%), Gaps = 2/134 (1%)

Query: 24 AAAALCAAGLSVALPAQAESMEERLRAQLRSTTQQLQQLQSEQAQVNAAKAAAEAQRDAA 83
+ + L + + + Q L + ++E+ V A E
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 84 QKELVALRSQLASAKGQAEKLAGQQEAVMESAQAQVAASHAQLGKFKGAYDEL-LTLSRA 142
+ L S L + Q+ +E A ++ +QL + +
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVE-AVNELRVYKSQLEQIESEILSAKEEYQLV 292

Query: 143 KEAERQTLARTLAQ 156
+ + + L Q
Sbjct: 293 TQLFKNEILDKLRQ 306


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_058401  TONBPROTEIN  42  6e-07  Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 42.3 bits (99), Expect = 6e-07
Identities = 16/47 (34%), Positives = 18/47 (38%)

Query: 57 PLPPPPPPPPEPEKPPEPEPPKEEPVAPPEPEPQPTPEPPKPQDEAP 103
L PP P PE EPEP E PP+ P +P P
Sbjct: 54 DLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKP 100



Score = 33.0 bits (75), Expect = 7e-04
Identities = 19/92 (20%), Positives = 31/92 (33%), Gaps = 1/92 (1%)

Query: 16 RRWLKLAAVAVALIAAAYALWRWANDMAGVRREAPKATAIIPLPPPPPPPPEPEKPPEPE 75
RR+ ++V + A A + + + AP + + P P P PE
Sbjct: 7 RRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVTPADLEPPQAVQPPPE 66

Query: 76 PPKEEPVAPPEPEPQPTPEP-PKPQDEAPPRP 106
P E P P P + + P+P
Sbjct: 67 PVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKP 98


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_058421  RTXTOXINA  30  0.008  Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 30.3 bits (68), Expect = 0.008
Identities = 14/50 (28%), Positives = 24/50 (48%), Gaps = 3/50 (6%)

Query: 97 KENQVLGAKLGSLSNAIAGGPYIGLLGTVLGIMVVFLGTAMAGDVNINAI 146
++ Q G LG + I +G G +L FLGTA++ + I+ +
Sbjct: 119 QKYQKAGNILGGGAENIGDN--LGKAGGILSTFQNFLGTALSS-MKIDEL 165


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_058441  PF00577  36  4e-04  Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 36.0 bits (83), Expect = 4e-04
Identities = 36/213 (16%), Positives = 62/213 (29%), Gaps = 27/213 (12%)

Query: 242 TDNAKVWSGSYSMPLDKQWSLQFTG--YKSDSNVATIGGTNVLGK--GYSFGMSAIYTLA 297
+ + + + L W++ + G G +G S M+ +
Sbjct: 390 QEKPRFFQSTLLHGLPAGWTI-YGGTQLADRYRAFNFGIGKNMGALGALSVDMTQANSTL 448

Query: 298 PQGDWYNSLSVGLDYKKFDETTRFGGNEDLIPLKYVPFTFSYNGYRYSEASQSSIGLSLV 357
P ++ SV Y K + GYRYS + + +
Sbjct: 449 PDDSQHDGQSVRFLYNKSLNESGT--------------NIQLVGYRYSTSGYFNFADTTY 494

Query: 358 GASRSFFGLGSDWKEFDDKRYRASPSFALL---RGDGTHTQNLFGDWQ-LGLRAGFQLAS 413
+ D ++ + A + T TQ L G L L Q
Sbjct: 495 SRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQL-GRTSTLYLSGSHQTYW 553

Query: 414 GALVSNEQFSAGGSTSVRGY---LAAERTGDDG 443
G +EQF AG +T+ L+ T +
Sbjct: 554 GTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAW 586


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_058461  PF05860  58  8e-12  haemagglutination activity domain.
		>PF05860#haemagglutination activity domain.

Length = 117

Score = 58.3 bits (141), Expect = 8e-12
Identities = 25/99 (25%), Positives = 47/99 (47%), Gaps = 6/99 (6%)

Query: 132 TEGGRQLVTI-EQTQSRAILNWDTFNVGRNTTLRFQQKAD-DAVLNRVVGASARPSQIQG 189
TEG +++ Q S ++ F+V + T F + +++RV G S S I G
Sbjct: 17 TEGNTRIIERGTQAGSNLFHSFQEFSVPTSGTAFFNNPTNIQNIISRVTGGS--VSNIDG 74

Query: 190 AIQADGT--VLVVNQNGVIFSGTSQVNARNLVVAAATMS 226
I+A+ T + ++N NG+IF ++++ V +
Sbjct: 75 LIRANATANLFLINPNGIIFGQNARLDIGGSFVGSTANR 113


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_058511  PF05860  60  1e-12  haemagglutination activity domain.
		>PF05860#haemagglutination activity domain.

Length = 117

Score = 60.2 bits (146), Expect = 1e-12
Identities = 25/99 (25%), Positives = 47/99 (47%), Gaps = 6/99 (6%)

Query: 132 TEGGRQLVTI-EQTQSRAILNWDTFNVGRNTTLRFQQKAD-DAVLNRVVGASARPSQIQG 189
TEG +++ Q S ++ F+V + T F + +++RV G S S I G
Sbjct: 17 TEGNTRIIERGTQAGSNLFHSFQEFSVPTSGTAFFNNPTNIQNIISRVTGGS--VSNIDG 74

Query: 190 AIQADGT--VLVVNQNGVIFSGTSQVNTRNLVVAAATMS 226
I+A+ T + ++N NG+IF ++++ V +
Sbjct: 75 LIRANATANLFLINPNGIIFGQNARLDIGGSFVGSTANR 113


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_058691  DHBDHDRGNASE  112  9e-32  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 112 bits (280), Expect = 9e-32
Identities = 85/264 (32%), Positives = 127/264 (48%), Gaps = 9/264 (3%)

Query: 2 SKRLEGKVAIVTGAGCVGPGWGNGRAVAVRFAQEGAKVFAVDKSADAMAETLLRAKDAGG 61
+K +EGK+A +TGA G G AVA A +GA + AVD + + + + + K
Sbjct: 3 AKGIEGKIAFITGAA-----QGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEAR 57

Query: 62 DITAWTCDATVSADVHAMVQACVERYGRVDILVNNVGGSRRGGPVDLSEADWDAQMDFNL 121
A+ D SA + + G +DILVN G R G LS+ +W+A N
Sbjct: 58 HAEAFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNS 117

Query: 122 KSVFLACKHVIPIMQGQGGGAIVNTAS-TSGIRWTGAAQVGYASAKAGIIQLSRVVAVEY 180
VF A + V M + G+IV S +G+ T A YAS+KA + ++ + +E
Sbjct: 118 TGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMA--AYASSKAAAVMFTKCLGLEL 175

Query: 181 AKHNIRVNTVVPGQMHTPMVEARLAGQRAGGNVDALLAER-QARIPLGFMGDGTDTANAA 239
A++NIR N V PG T M + A + V E + IPL + +D A+A
Sbjct: 176 AEYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAV 235

Query: 240 LFLASDEARFVTGTEIIVDGGMSV 263
LFL S +A +T + VDGG ++
Sbjct: 236 LFLVSGQAGHITMHNLCVDGGATL 259


82  NH44784_058931  NH44784_059071  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
NH44784_058931  2  20  1.818143  MFS transporter
NH44784_058941  2  20  1.187457  hypothetical protein
NH44784_058951  3  21  1.807656  putative secreted protein
NH44784_058961  4  19  0.957791  Response regulator of zinc sigma-54-dependent
NH44784_058971  2  22  0.850900  Sensor protein of zinc sigma-54-dependent
NH44784_058981  -2  15  0.059362  Lactoylglutathione lyase and related lyases
NH44784_058991  -3  14  0.197663  Ferredoxin
NH44784_059001  -3  12  1.591218  Transcriptional regulator, LysR family
NH44784_059011  -3  10  1.733397  Glutathione S-transferase
NH44784_059021  -2  9  1.973535  hypothetical protein
NH44784_059031  -1  9  2.179997  putative autotransporter
NH44784_059041  4  11  3.607740  Transcriptional repressor, BlaI/MecI family
NH44784_059051  3  12  3.621750  Murein-DD-endopeptidase
NH44784_059061  3  13  2.502843  hypothetical protein
NH44784_059071  2  11  2.800534  Beta-lactamase class D
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_058961  HTHFIS  475  e-168  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 475 bits (1223), Expect = e-168
Identities = 189/476 (39%), Positives = 255/476 (53%), Gaps = 37/476 (7%)

Query: 2 ARILIVDDDAAFRESLSETLHDLGHTALQASSTVAGLNRLRTEAVDLAIVDLRMPGEDGL 61
A IL+ DDDAA R L++ L G+ S+ + DL + D+ MP E+
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 VFLRKAAGIAP-VPCIMLTAYASGGNTIEAMRLGAFDHLTKPVARAALVQTLERALDSAR 120
L + P +P ++++A + I+A GA+D+L KP L+ + RAL +
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 121 VAGDDGADADTAPADEFELVSGSEAMREVFKRIGMAARGDATVLIQGETGTGKELVARAL 180
+ + D LV S AM+E+++ + + D T++I GE+GTGKELVARAL
Sbjct: 124 R---RPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARAL 180

Query: 181 HRNGARAARPFVAVNCAAIPADLMESELFGHVKGAYTGATGDRAGRFREAEGGTLFLDEI 240
H G R PFVA+N AAIP DL+ESELFGH KGA+TGA GRF +AEGGTLFLDEI
Sbjct: 181 HDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEI 240

Query: 241 GDMPLATQAKILRALQEREITPVGGARTVPVDVRIVAATHRDLPAAVAAGRFREDLWYRL 300
GDMP+ Q ++LR LQ+ E T VGG + DVRIVAAT++DL ++ G FREDL+YRL
Sbjct: 241 GDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRL 300

Query: 301 QVVPIALPPLRERLGDVLLLAEHFL---RRGGGTPKRLSAAAAQRLLAHNWPGNVRELRN 357
VVP+ LPPLR+R D+ L HF+ + G KR A + + AH WPGNVREL N
Sbjct: 301 NVVPLRLPPLRDRAEDIPDLVRHFVQQAEKEGLDVKRFDQEALELMKAHPWPGNVRELEN 360

Query: 358 AMERAAILSRGASIEPEHIGLQ------------------------------PLGIAQDG 387
+ R L I E I + A G
Sbjct: 361 LVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFG 420

Query: 388 FAIPLDGSLAQAVAALEGAMIRRALAAAAGNRAEAARRLGLSRQQLYRKLAEHGVE 443
A+P G + +A +E +I AL A GN+ +AA LGL+R L +K+ E GV
Sbjct: 421 DALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_059031  PERTACTIN  79  5e-17  Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 79.0 bits (194), Expect = 5e-17
Identities = 131/552 (23%), Positives = 198/552 (35%), Gaps = 69/552 (12%)

Query: 379 GTAFSGTSVFNKLGAGRLTLSGNSAAFTGNTQVQAGTLQVDGVLGGPVDVL--AGARLTG 436
G G ++ ++ + L+ A V + G GP+DV + AR TG
Sbjct: 362 GARAQGRALLYRVLPEPVKLTLAGGAQGQGDIVATELPPIPGASSGPLDVALASQARWTG 421

Query: 437 TGRV--GATTNKGTIAPGPRSGFGTLTIAGD---------YAAQGGNLEVRTRLGG---- 481
R + + T S G L +A D A + L V T G
Sbjct: 422 ATRAVDSLSIDNATWVMTDNSNVGALRLASDGSVDFQQPAEAGRFKVLMVDTLAGSGLFR 481

Query: 482 -----DDSPTDKLVITGATAGTTPVTVTNIGGAGAQTQRGIQVVQVNGLSAGQFNLANGD 536
D +DKLV+ +G + V N G A + +VQ SA F LAN D
Sbjct: 482 MNVFADLGLSDKLVVMRDASGQHRLWVRNSGSEPASGNT-MLLVQTPRGSAATFTLANKD 540

Query: 537 YVIGGRPALVAGAYGYTLQQDSGDGSWYLRSALTDPESPQTGGGSPAPAPGPLYQPGVPV 596
+ G Y Y L + G+G W L A P P P P P P P
Sbjct: 541 GKVD------IGTYRYRLAAN-GNGQWSLVGAKAPPAPKPAPQPGPQPGPQPPQPPQPPQ 593

Query: 597 YEAYANTLMQLSKLPTLRQRVGDRLYEPQGAGRN--------GVWARMEGSTS------R 642
+ + P + G L A N +W + S R
Sbjct: 594 PPQPPQPPQRQPEAPAPQPPAGRELSAAANAAVNTGGVGLASTLWYAESNALSKRLGELR 653

Query: 643 LEPAVS-----TTGQRQDIDD----------WKMQIGVDRVLAGQEDGSRLVGGLALHYG 687
L P QRQ +D+ ++G D +A G +GGLA +
Sbjct: 654 LNPDAGGAWGRGFAQRQQLDNRAGRRFDQKVAGFELGADHAVAVAG-GRWHLGGLAGY-- 710

Query: 688 TSDTRVSSVYGNGSIDTTRYGLTPTLTWYGSDGVYVDAQAQATWFDSDLK--SRLAGKLK 745
T R + G G D+ G T+ + G Y+DA +A+ ++D K +K
Sbjct: 711 TRGDRGFTGDGGGHTDSVHVG--GYATYIANSGFYLDATLRASRLENDFKVAGSDGYAVK 768

Query: 746 DGRKAQSYGLGLEAGKAFGLREGLALVPQAQLTYVSTRFDRFSDRFGARVESDKGDSLQG 805
+ G+ LEAG+ F +G L PQA+L + G RV + G S+ G
Sbjct: 769 GKYRTHGVGVSLEAGRRFAHADGWFLEPQAELAVFRVGGGAYRAANGLRVRDEGGSSVLG 828

Query: 806 RLGVALDYRSNGQETGSDRRSNVYGVINLKHEFLDGTRIQVANVPVVSRMSRTWGNLGVG 865
RLG+ + R E R+ Y ++ EF ++ + + + T LG+G
Sbjct: 829 RLGLEVGKRI---ELAGGRQVQPYIKASVLQEFDGAGTVRTNGIAHRTELRGTRAELGLG 885

Query: 866 ADYAWSKRYAVY 877
A + +++Y
Sbjct: 886 MAAALGRGHSLY 897


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_059051  BLACTAMASEA  42  4e-06  Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 41.7 bits (98), Expect = 4e-06
Identities = 31/115 (26%), Positives = 49/115 (42%), Gaps = 14/115 (12%)

Query: 346 VLVLDSRSGATLFGRAENDAVPIASLTKLVTAMVVLDTSP----DLSRRIRIDERDALAT 401
++ +D SG TL ++ P+ S K+V VL L R+I ++D +
Sbjct: 42 MIEMDLASGRTLTAWRADERFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVD- 100

Query: 402 ASGAASPLP---VGASVTLDTLLKLALMASDNRAAHALARAY--PGGETAFAEAL 451
SP+ + +T+ L A+ SDN AA+ L P G TAF +
Sbjct: 101 ----YSPVSEKHLADGMTVGELCAAAITMSDNSAANLLLATVGGPAGLTAFLRQI 151


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_059071  BLACTAMASEA  28  0.022  Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 28.2 bits (63), Expect = 0.022
Identities = 10/28 (35%), Positives = 12/28 (42%)

Query: 122 ADNLFEVGQADGWRLYGKTGTGSPGSNG 149
A L GW + KTG G G+ G
Sbjct: 213 AGPLIRSVLPAGWFIADKTGAGERGARG 240


83  NH44784_059311  NH44784_059511  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
NH44784_059311  2  14  3.773310  *Xanthine and CO dehydrogenases maturation
NH44784_059321  2  13  3.984798  CTP:molybdopterin cytidylyltransferase
NH44784_059331  2  12  3.858934  Isoquinoline 1-oxidoreductase beta subunit
NH44784_059341  1  14  4.354954  Isoquinoline 1-oxidoreductase alpha subunit
NH44784_059351  1  15  4.043254  Transcriptional regulator, AraC family
NH44784_059361  2  14  5.338004  Transcriptional regulator, GntR family domain /
NH44784_059371  1  12  4.580921  putative dioxygenase
NH44784_059381  3  12  4.804864  L-lysine permease
NH44784_059391  4  14  3.031570  Phenazine biosynthesis protein PhzF like
NH44784_059401  3  19  2.128845  hypothetical protein
NH44784_059411  2  18  3.416841  hypothetical protein
NH44784_059421  3  19  2.859691  hypothetical protein
NH44784_059431  6  18  2.411319  hypothetical protein
NH44784_059441  0  15  1.088262  hypothetical protein
NH44784_059451  0  15  2.105330  hypothetical protein
NH44784_059461  2  16  3.520882  hypothetical protein
NH44784_059471  3  17  3.441909  hypothetical protein
NH44784_059481  1  15  2.073893  hypothetical protein
NH44784_059491  1  15  2.573395  dTDP-glucose 4,6-dehydratase
NH44784_059501  1  16  3.311631  Glycosyl transferase, group 1
NH44784_059511  1  16  3.146319  ADP-heptose--lipooligosaccharide
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_059411  IGASERPTASE  28  0.011  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 28.5 bits (63), Expect = 0.011
Identities = 11/50 (22%), Positives = 24/50 (48%), Gaps = 2/50 (4%)

Query: 14 APPSPAAPAPRAPADSEDARGQAPGNAVPEEDIEPADATADDPHPAQAGR 63
PP+PA P+ +E+++ ++ E+D + TA + A+ +
Sbjct: 1026 PPPAPATPSETTETVAENSKQESKTVEKNEQD--ATETTAQNREVAKEAK 1073


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_059471  CHANLCOLICIN  29  0.006  Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 28.5 bits (63), Expect = 0.006
Identities = 15/33 (45%), Positives = 17/33 (51%)

Query: 88 PTPGQGTVKGSTANPAPDGRLPGGRAGSGGSKS 120
P +G V + N PDG GG G GGSKS
Sbjct: 13 PYDDKGQVIITLLNGTPDGSGSGGGGGKGGSKS 45


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_059491  NUCEPIMERASE  173  1e-53  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 173 bits (440), Expect = 1e-53
Identities = 78/334 (23%), Positives = 130/334 (38%), Gaps = 36/334 (10%)

Query: 10 RALVAGGAGFLGSNLCARLLADGWEVMCVDNFQTGRSRNVAM----LAAHPRFSVVRHDI 65
+ LV G AGF+G ++ RLL G +V+ +DN ++ L A P F + D+
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 66 ------VEPLADVQVDRIYNLACPASPVHYQ-ADPLKTLRTCVLGAMQLLELARRCG-AR 117
+ A +R++ + V Y +P + + G + +LE R
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLA-VRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 118 VLQASTSEVYGDPLEHPQREGYWGHVNPVGPRSCYDEGKRCAETIFMEYAAHEGVDVRIA 177
+L AS+S VYG + P + P S Y K+ E + Y+ G+
Sbjct: 121 LLYASSSSVYGLNRKMPFST----DDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGL 176

Query: 178 RIFNTYGPHMSPDDGRVVSNFIVQALAGKPLTIYGDGTQTRSFCYVDDLIDGLVRLMESV 237
R F YGP PD + F L GK + +Y G R F Y+DD+ + ++RL + +
Sbjct: 177 RFFTVYGPWGRPD--MALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVI 234

Query: 238 PP-----------------DVRPVNLGNPAELTMLEMADMVRTLTGANVPLVHRDLPVDD 280
P R N+GN + + +++ + G L D
Sbjct: 235 PHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLPLQPGD 294

Query: 281 PTHRCPDISLARERLGWTPLIAPVEGMARTVAYF 314
D E +G+TP +G+ V ++
Sbjct: 295 VLETSADTKALYEVIGFTPETTVKDGVKNFVNWY 328


84  NH44784_059601  NH44784_059801  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
NH44784_059601  2  15  5.106235  hypothetical protein
NH44784_059611  2  13  5.434039  hypothetical protein
NH44784_059621  2  14  5.792051  Glycosyl transferase, family 2
NH44784_059631  1  13  5.393178  No significant database matches
NH44784_059641  0  12  4.145270  Potassium efflux system KefA protein /
NH44784_059651  1  12  3.518382  hypothetical protein
NH44784_059661  0  13  2.610163  Transaldolase / Glucose-6-phosphate isomerase
NH44784_059671  3  12  -1.052138  hypothetical protein
NH44784_059681  1  10  -0.927395  Protein yciF
NH44784_059691  1  10  0.047165  hypothetical protein
NH44784_059701  2  13  1.206970  Multimeric flavodoxin WrbA
NH44784_059711  1  13  2.065413  hypothetical protein
NH44784_059721  0  10  3.172858  Calcium/proton antiporter
NH44784_059731  -1  9  3.222812  Endonuclease/Exonuclease/phosphatase family
NH44784_059741  -1  9  3.368002  Cardiolipin synthetase
NH44784_059751  -2  9  3.206517  hypothetical protein
NH44784_059761  -2  9  3.104036  Glycogen synthase, ADP-glucose transglucosylase
NH44784_059771  -1  10  3.574656  Alpha-amylase
NH44784_059781  0  10  3.611513  Trehalose synthase
NH44784_059791  1  9  3.472365  1,4-alpha-glucan (glycogen) branching
NH44784_059801  0  8  3.226052  Glycogen debranching enzyme
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_059741  STREPKINASE  30  0.019  Streptococcus streptokinase protein signature.
		>STREPKINASE#Streptococcus streptokinase protein signature.

Length = 440

Score = 30.1 bits (67), Expect = 0.019
Identities = 29/111 (26%), Positives = 50/111 (45%), Gaps = 5/111 (4%)

Query: 199 DNQGHPTDIERYYRAGLRAARQDVLIANAYFFPGYRLLHDLASTARRGVRVRLLLQGEPD 258
D+ +E+ A L A Q+ LIAN + Y + D AS A R + + D
Sbjct: 92 DSGAMSHKLEK---ADLLKAIQEQLIANVHSNDDYFEVIDFASDATITDRNGKVYFADKD 148

Query: 259 MLVAQLAASMLYDYLIDAGVEIFEYCKRPLHAKVACVDEDWSTVGSSNLDP 309
V L + ++L+ V + Y ++P+ + VD ++ TV + L+P
Sbjct: 149 GSVT-LPTQPVQEFLLSGHVRVRPYKEKPIQNQAKSVDVEY-TVQFTPLNP 197


85  NH44784_059951  NH44784_060001  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
NH44784_059951  2  12  3.655115  putative phosphotransferase
NH44784_059961  2  13  3.824042  putative aminotransferase
NH44784_059971  2  12  3.848582  Transcriptional regulator, GntR family domain /
NH44784_059981  -1  12  3.348631  Permeases of the major facilitator superfamily
NH44784_059991  -1  11  3.904859  hypothetical protein
NH44784_060001  -1  11  3.541416  Two-component response regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_059981  TCRTETA  30  0.018  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 30.2 bits (68), Expect = 0.018
Identities = 42/245 (17%), Positives = 80/245 (32%), Gaps = 28/245 (11%)

Query: 71 AVLFGHLGDRVGRKQSLVITLLLMGVATTLIGLLPSYHSIGLAAVVLLVALRLVQGIAVG 130
A + G L DR GR+ L+++L V ++ P + +L R+V GI
Sbjct: 60 APVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAP--------FLWVLYIGRIVAGITGA 111

Query: 131 GEWGGAVLIA---AEHAPPKWRTFLASAPQYGSPIGLILATGVFRLVSDLPKEDFLSWGW 187
IA + F+++ +G G++ + L+
Sbjct: 112 TGAVAGAYIADITDGDERARHFGFMSAC--FG--FGMVAGPVLGGLMGGF--------SP 159

Query: 188 RLPFIISGVLVLVAFVIRRGVNESPELEARLKEKKRESTAPISMVLRERKR--ALLLGIG 245
PF + L + F+ + R ++ S A L+ +
Sbjct: 160 HAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVF 219

Query: 246 LCLLGISGFYFVTTLMITYTTTYLKITRSQILDVVTWAGVVE-LISFPIASYIATRVGER 304
+ + L + + + I + G++ L I +A R+GER
Sbjct: 220 FIMQLVGQVP--AALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGER 277

Query: 305 RFLIW 309
R L+
Sbjct: 278 RALML 282


86  NH44784_060391  NH44784_060441  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
NH44784_060391  -1  15  3.720916  Outer membrane protein A precursor
NH44784_060401  1  14  4.932847  hypothetical protein
NH44784_060411  1  13  4.836149  Transcriptional regulator, LysR family
NH44784_060421  1  13  5.098348  Outer membrane component of tripartite multidrug
NH44784_060431  0  11  3.784302  Membrane fusion component of tripartite
NH44784_060441  -1  10  3.450076  Inner membrane component of tripartite multidrug
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_060391  OMPADOMAIN  85  7e-21  OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 85.0 bits (210), Expect = 7e-21
Identities = 41/107 (38%), Positives = 64/107 (59%), Gaps = 3/107 (2%)

Query: 212 VYFDTDKAEVKAESKAALDEM-GKLLNANPK-LKVYVVGHTDNQGTLAGNLELSQKRADA 269
V F+ +KA +K E +AALD++ +L N +PK V V+G+TD G+ A N LS++RA +
Sbjct: 221 VLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQS 280

Query: 270 VVKALEAGYKIPAARLSARGVASLAPVAANDAEAGRARNRRVELVAQ 316
VV L + IPA ++SARG+ PV N + + R ++ +A
Sbjct: 281 VVDYLISK-GIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAP 326


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_060431  RTXTOXIND  90  7e-22  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 90.3 bits (224), Expect = 7e-22
Identities = 69/422 (16%), Positives = 129/422 (30%), Gaps = 100/422 (23%)

Query: 4 TQTPDSRTSTRKSRLMLAGGATLLAGAGWAAWTFLAGAATQTTDNAYVNGHVVAITPQVG 63
+TP SR + ++ + + AT +G I P
Sbjct: 49 IETPVSRRPRLVAYFIMGFLVIAF---ILSVLGQVEIVAT-ANGKLTHSGRSKEIKPIEN 104

Query: 64 GAVSAIWADNADRIRAGQTLVEVDP-------ADTQIALAGARADLARARR--------- 107
V I + +R G L+++ TQ +L AR + R +
Sbjct: 105 SIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNK 164

Query: 108 ---------------------RVQALFASR--------DQAGAEVARRQAELDHAQADVS 138
R+ +L + Q + +++AE A ++
Sbjct: 165 LPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARIN 224

Query: 139 ARR-------------------GIAAQGAITSEEARHAADA--LNAARAALAVAQAAERA 177
A+ A+ +E ++ L ++ L ++ +
Sbjct: 225 RYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILS 284

Query: 178 AQAQVDGVTPDTHPDVT--LAQARLRAAALDAQ---------RTTIRAPVGGMVAQRSVQ 226
A+ + VT ++ L Q L + + IRAPV V Q V
Sbjct: 285 AKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVH 344

Query: 227 -LGKHVAIGDKLMAVVPLDQ-MWIDANFKEIQLAGICPGQPAVVTVDA-----HGAKVRY 279
G V + LM +VP D + + A + + I GQ A++ V+A +G
Sbjct: 345 TEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYG---YL 401

Query: 280 RGRVGDVMAGSGSAFALLPSQNATGNWIKVVQRVPVRINLDPKELADHPLRIGLSAEVTV 339
G+V ++ + G V+ + N + PL G++ +
Sbjct: 402 VGKVKNINLDA-------IEDQRLGLVFNVIISIEE--NCLSTGNKNIPLSSGMAVTAEI 452

Query: 340 DT 341
T
Sbjct: 453 KT 454


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_060441  TCRTETB  119  3e-31  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 119 bits (299), Expect = 3e-31
Identities = 83/380 (21%), Positives = 155/380 (40%), Gaps = 16/380 (4%)

Query: 27 SFMAVVDITIANVSVPTISGNLGVSPEIGEWAITFFAIANSICIPLTGWLSRRLGQVRLF 86
SF +V++ + NVS+P I+ + P W T F + SI + G LS +LG RL
Sbjct: 23 SFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLL 82

Query: 87 VLSVAAFTLASVLCGVAQNFESLLAF-RVLQGMVSGPIVPLSQALLVAIFPPDKRTLALS 145
+ + SV+ V +F SLL R +QG + L ++ P + R A
Sbjct: 83 LFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFG 142

Query: 146 MWAMTNMAGPVAGPVLGGWLTDDFSWPWIFLVNAPVGIAVVAVTATMFRGRDTPSTRLPV 205
+ G GP +GG + W ++ L+ P+ I ++ V M + +
Sbjct: 143 LIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLI--PM-ITIITVPFLMKLLKKEVRIKGHF 199

Query: 206 DLRGLILLAVAIGCLQLTLDRGRTLDWFASPFIVTTALLSALGFVFLVIWELGEAHPIVD 265
D++G+IL++V I L F + + ++ ++S L F+ V P VD
Sbjct: 200 DIKGIILMSVGIVFFML----------FTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVD 249

Query: 266 LSLFRHRNFAMGTLAVAVGFGLYFAALVLIPLWLQTDMRYTATWAGLATA-PMGVFGILL 324
L ++ F +G L + FG + ++P ++ + + G P + I+
Sbjct: 250 PGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIF 309

Query: 325 APLLGRWVQRGDARVFASLAFVAWSLVAWWRASMTTDVGVGLISLACLLQGIGIGLFLTP 384
+ G V R ++ V + V++ AS + +++ + G+ T
Sbjct: 310 GYIGGILVDRRGPLYVLNIG-VTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTV 368

Query: 385 LVSLSLAGLPPERIAAASGL 404
+ ++ + L + A L
Sbjct: 369 ISTIVSSSLKQQEAGAGMSL 388


87  NH44784_061191  NH44784_061321  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
NH44784_061191  0  15  3.197320  ABC transporter, ATP-binding/permease protein
NH44784_061201  -1  11  4.100138  hypothetical protein
NH44784_061211  -1  11  4.086599  2-keto-3-deoxygluconate permease (KDG permease
NH44784_061221  0  9  4.881753  FIG00432481: hypothetical protein
NH44784_061231  -1  11  3.573977  4-hydroxythreonine-4-phosphate dehydrogenase
NH44784_061241  0  15  3.175678  probable GntR-family transcriptional regulator
NH44784_061251  -1  17  2.631061  3-oxoadipate CoA-transferase subunit B
NH44784_061261  0  14  3.332976  3-oxoadipate CoA-transferase subunit A
NH44784_061271  2  15  4.114698  Enoyl-CoA hydratase
NH44784_061281  3  13  2.417491  CAIB/BAIF family protein
NH44784_061291  3  16  2.125651  Transcriptional regulator, IclR family
NH44784_061301  2  17  2.520240  Cysteine dioxygenase
NH44784_061311  3  15  1.989162  NAD-dependent protein deacetylase of SIR2
NH44784_061321  3  15  0.880454  hypothetical protein
88  NH44784_062921  NH44784_063211  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
NH44784_062921  2  13  2.976160  Gamma-glutamyltranspeptidase
NH44784_062931  2  13  2.416969  protease Do
NH44784_062941  3  13  2.263937  Arsenic efflux pump protein
NH44784_062951  0  11  1.638252  Thiol-disulfide isomerase and thioredoxins
NH44784_062961  4  13  2.874302  Cytochrome c-type biogenesis protein CcdA (DsbD
NH44784_062971  4  13  1.366298  hypothetical protein
NH44784_062981  4  14  0.921395  Transcriptional regulatory protein ompR
NH44784_062991  5  14  1.356242  sensor histidine kinase
NH44784_063001  4  15  1.073977  Organic hydroperoxide resistance protein
NH44784_063011  2  15  1.380178  transcriptional regulator, LysR family
NH44784_063021  0  16  1.657141  Dipeptide transport system permease protein DppB
NH44784_063031  0  15  2.029671  Dipeptide transport system permease protein DppC
NH44784_063041  -2  14  2.073056  Oligopeptide transport ATP-binding protein OppD
NH44784_063051  -2  12  2.635094  Oligopeptide transport ATP-binding protein OppF
NH44784_063061  -2  12  2.680050  Oligopeptide ABC transporter, periplasmic
NH44784_063071  0  10  2.573354  FIG00482872: hypothetical protein
NH44784_063081  1  12  2.356771  Outer membrane protein (porin
NH44784_063091  3  17  2.532921  hypothetical protein
NH44784_063101  3  19  2.458981  Transcriptional regulator, GntR family domain /
NH44784_063111  2  24  0.970732  FIG00840301: hypothetical protein
NH44784_063121  0  26  -0.675755  alpha/beta superfamily hydrolase
NH44784_063131  2  29  -1.860177  hypothetical protein
NH44784_063141  0  24  -1.585132  hypothetical protein
NH44784_063151  1  24  -0.998673  hypothetical protein
NH44784_063161  2  29  -2.943595  hypothetical protein
NH44784_063171  4  30  -3.695141  FIG00901576: hypothetical protein
NH44784_063181  5  34  -3.867857  hypothetical protein
NH44784_063191  3  30  -4.031536  hypothetical protein
NH44784_063201  3  26  -3.110316  hypothetical protein
NH44784_063211  1  23  -3.552149  diguanylate cyclase/phosphodiesterase (GGDEF &
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NH44784_062931  V8PROTEASE  75  6e-17  V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 75.0 bits (184), Expect = 6e-17
Identities = 35/167 (20%), Positives = 64/167 (38%), Gaps = 34/167 (20%)

Query: 72 GGRQSIGSGVIVDAAQGNILTNHHVVRGATSIRVSLQ------------DGRSFTATVVG 119
I SGV+V + +LTN HVV +L+ +G +
Sbjct: 98 PTGTFIASGVVV--GKDTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITK 155

Query: 120 SDPDTDLAVLRI--------PPDKLQALTLSDSSDLRVGDFVVAIGDPYGLG---QSASS 168
+ DLA+++ + ++ T+S++++ +V + G P S
Sbjct: 156 YSGEGDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGDKPVATMWESK 215

Query: 169 GIVSALERSSLRAAGYQNFIQTDASINPGNSGGALVNLNGELVGINT 215
G ++ L+ +Q D S GNSG + N E++GI+
Sbjct: 216 GKITYLK---------GEAMQYDLSTTGGNSGSPVFNEKNEVIGIHW 253


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_062981  HTHFIS        89     1e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 89.1 bits (221), Expect = 1e-22
Identities = 35/128 (27%), Positives = 61/128 (47%), Gaps = 1/128 (0%)

Query: 6 HILIVDDDREIRELAGNFLKKNGLNVTLAADGRQMRNLLETLTVDLIVLDIMMPGDDGLV 65
IL+ DDD IR + L + G +V + ++ + + DL+V D++MP ++
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 66 LCRELRAGKHRRTPILLLTARDEDMDRVLGLEMGADDYLVKPFVARELLARIKAILRRTR 125
L ++ P+L+++A++ M + E GA DYL KPF EL+ I L +
Sbjct: 65 LLPRIKK-ARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 126 MLPPNFQV 133
P +
Sbjct: 124 RRPSKLED 131


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_063081  ECOLNEIPORIN  71     7e-16    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 70.6 bits (173), Expect = 7e-16
Identities = 74/349 (21%), Positives = 129/349 (36%), Gaps = 67/349 (19%)

Query: 25 AAETSSVTLYGLVDLGTTYER---KDGASSLRQKSGN---QSGSRWGLRGSEDLGNGYKA 78
A + VTLYG + G R +GA + ++G GS+ G +G EDLGNG KA
Sbjct: 15 VAAMADVTLYGTIKAGVETSRSVAHNGAQAASVETGTGIVDLGSKIGFKGQEDLGNGLKA 74

Query: 79 VFRLESGFNANNGTQAQGRMFGRWAYVGLAGGFGEVRLGRQW-----VYGFE-WAGVGSP 132
++++E + R +++GL GGFG++R+GR W
Sbjct: 75 IWQVEQKASIAGT---DSGWGNRQSFIGLKGGFGKLRVGRLNSVLKDTGDINPWDSKSDY 131

Query: 133 FGTGWGQSSNN--ASLGYNDGDFGAG-------------GRVNNAVFYA--TPRLGGWQA 175
G S+ Y+ +F AG GR N+ ++A + GG+
Sbjct: 132 LGVNKIAEPEARLISVRYDSPEF-AGLSGSVQYALNDNAGRHNSESYHAGFNYKNGGFFV 190

Query: 176 GLGYSFESHD-GEAFATDAHDRVLTAGLRYNGGPVAAALTYERLNPNSQLPNKKTAGNLQ 234
G +++ H + ++ Y+ + A++ ++ + N +
Sbjct: 191 QYGGAYKRHHQVQENVNIEKYQIHRLVSGYDNDALYASVAVQQQDAKLVEENYSHNSQTE 250

Query: 235 VAGSYDFEWIKLHGTYGNLRNANTGPSAGYDRVNSYLAGVSMPTGKAGTLMASYQRATSS 294
VA + + +GN+ P Y A S+ +
Sbjct: 251 VAATLAYR-------FGNVT-----PRVSY----------------AHGFKGSFDATNYN 282

Query: 295 DITGWA-LGYQHDLSKRTNLY--AYVNRLDIRASHTLQT--SVGIRHLF 338
+ +G ++D SKRT+ A + S + T VG+RH F
Sbjct: 283 NDYDQVVVGAEYDFSKRTSALVSAGWLQEGKGESKFVSTAGGVGLRHKF 331


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_063111  FLGMRINGFLIF  28     0.030    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 28.0 bits (62), Expect = 0.030
Identities = 19/68 (27%), Positives = 28/68 (41%), Gaps = 14/68 (20%)

Query: 15 PDEDPWVRGKPAPEAVVVVPYDPDWPARYARLAQDIAAALGPAALR-IDHVGSTAVPGLP 73
P +VR + +P A V V +P AL + + H+ S+AV GLP
Sbjct: 159 PKPSLFVREQKSPSASVTVTLEPG-------------RALDEGQISAVVHLVSSAVAGLP 205

Query: 74 AKPVVDID 81
V +D
Sbjct: 206 PGNVTLVD 213


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_063181  IGASERPTASE   26     0.034    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 25.8 bits (56), Expect = 0.034
Identities = 12/46 (26%), Positives = 16/46 (34%)

Query: 7 NDPPRQPSPKTPEPSSEDQVDISSRVQEGLKQANQRPRPHGQTPAT 52
N P TP + SS + + + R PH PAT
Sbjct: 1193 NSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPAT 1238


89  NH44784_063411  NH44784_063501  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_063411  0       11       -3.064249  Type cbb3 cytochrome oxidase biogenesis protein
NH44784_063421  0       11       -2.119071  Cytochrome c oxidase subunit CcoP
NH44784_063431  1       10       -0.737232  Cytochrome c oxidase subunit CcoQ
NH44784_063441  0       9        0.676700   Cytochrome c oxidase subunit CcoO
NH44784_063451  1       9        1.761020   Cytochrome c oxidase subunit CcoN
NH44784_063461  3       10       4.397869   Type cbb3 cytochrome oxidase biogenesis protein
NH44784_063471  3       11       4.683883   Type cbb3 cytochrome oxidase biogenesis protein
NH44784_063481  2       13       4.875707   Adenosylmethionine-8-amino-7-oxononanoate
NH44784_063491  2       12       5.534822   8-amino-7-oxononanoate synthase
NH44784_063501  -1      13       3.692880   Biotin synthesis protein bioH
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_063441  PYOCINKILLER  31     0.005    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 30.9 bits (69), Expect = 0.005
Identities = 12/32 (37%), Positives = 15/32 (46%)

Query: 207 RKAQQAKKAEEARKAEEARKAAQPAAAAQPAA 238
KA++ AE RKAEE + AA A
Sbjct: 220 NKAREQAAAEAKRKAEEQARQQAAIRAANTYA 251


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_063491  PERTACTIN     30     0.014    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 30.5 bits (68), Expect = 0.014
Identities = 52/221 (23%), Positives = 80/221 (36%), Gaps = 9/221 (4%)

Query: 175 SAAPAARFIVTESVFSMDGDRTDVARLAALAERYQAFVYLDEAHATGVLGPGGMGLAGLA 234
S APAA F+ + ++DG R A +A A V+L A P G + G A
Sbjct: 214 SGAPAAVFVFGANELTVDGGHITGGRAAGVAAMDGAIVHLQRATIRRGDAPAGGAVPGGA 273

Query: 235 PDGIDLAMGTFSKALGGFGAYVAGSRA-LCDYLVNACSGFIYTTALPPAVL----GAMDA 289
G + G G +G V+ S L +V A A A + G++ A
Sbjct: 274 VPGGAVPGGFGPLLDGWYGVDVSDSTVDLAQSIVEAPQLGAAIRAGRGARVTVSGGSLSA 333

Query: 290 ALDLVPTLDAERARLAASGERLRVALRGLGLDSGDSSTQIVPAIVGDEGRALALAATLEQ 349
V R L + L+ + + + V E L LA +
Sbjct: 334 PHGNVIETGGGARRFPPPASPLSITLQ----AGARAQGRALLYRVLPEPVKLTLAGGAQG 389

Query: 350 RGLLAVAIRPPTVPAGTSRLRIALSAAHRDADIDRLIDGLT 390
+G + PP A + L +AL++ R R +D L+
Sbjct: 390 QGDIVATELPPIPGASSGPLDVALASQARWTGATRAVDSLS 430


90  NH44784_002931  NH44784_003101  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_002931  3       16       -2.302882  FIG00973797: hypothetical protein
NH44784_002941  5       16       -2.734435  GTP-binding protein TypA/BipA
NH44784_002951  4       16       -1.426023  tRNA pseudouridine synthase B
NH44784_002961  5       17       -1.426023  Ribosome-binding factor A
NH44784_002971  4       16       -1.470305  Translation initiation factor 2
NH44784_002981  4       21       -2.427884  Transcription termination protein NusA
NH44784_002991  4       26       -1.631422  COG0779: clustered with transcription
NH44784_003001  3       24       -1.065117  Ribosomal large subunit pseudouridine synthase
NH44784_003011  2       23       -1.640287  Segregation and condensation protein B
NH44784_003031  0       17       -1.278204  *LysR-family transcriptional regulator clustered
NH44784_003041  1       16       -0.990700  Putative metal chaperone, involved in Zn
NH44784_003051  0       10       -0.552116  Putative metal chaperone, involved in Zn
NH44784_003081  0       10       -0.511553  **Two-component system response regulator QseB
NH44784_003091  -1      10       -0.045353  Sensory histidine kinase QseC
NH44784_003101  0       11       -0.603486  Peptidase S1C, Do
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_002931  MICOLLPTASE   31     0.001    Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 30.8 bits (69), Expect = 0.001
Identities = 10/49 (20%), Positives = 15/49 (30%), Gaps = 2/49 (4%)

Query: 83 ARFIQEIYAAFGEI--LGELHECSYVHVIDARAAAYGYGGKTQEYRHQH 129
F I L EL + H + R G G+ + Y+
Sbjct: 480 GTFFTYERTPEESIYTLEELFRHEFTHYLQGRYVVPGMWGQGEFYQEGV 528


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_002941  TCRTETOQM     168    5e-47    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 168 bits (428), Expect = 5e-47
Identities = 94/435 (21%), Positives = 167/435 (38%), Gaps = 62/435 (14%)

Query: 5 LRNVAIIAHVDHGKTTLVDQLLRQSGTFRENQALTE--RVMDSNDLEKERGITILAKNCA 62
+ N+ ++AHVD GKTTL + LL SG E ++ + D+ LE++RGITI +
Sbjct: 3 IINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITS 62

Query: 63 VEYEGTHINIVDTPGHADFGGEVERVLSMVDGVLLLVDAVEGPMPQTIFVTRKALALGLK 122
++E T +NI+DTPGH DF EV R LS++DG +LL+ A +G QT + +G+
Sbjct: 63 FQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIP 122

Query: 123 PIVVVNKVDRPGART-------------DFVINATFDLFDKLGATDEQL----------- 158
I +NK+D+ G + VI +L+ + T+
Sbjct: 123 TIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEGN 182

Query: 159 -DFPVVYASG--LSG---YAGLTPDVREGDMRPLF--------------EAILQHVPQRD 198
D Y SG L + + P++ E I
Sbjct: 183 DDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSST 242

Query: 199 DDPNGPLQMQIISLDYNSYVGKIGVGRINRGRMRPGMEVAYKFGPEGQGGRGRINQVLKF 258
L ++ ++Y+ ++ R+ G + V + + +I ++
Sbjct: 243 HRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRIS-----EKEKIKITEMYTS 297

Query: 259 HGLERIVVDEAEAGDIVLINGIEDLGIGCTVTDPVTQDVLPMLRIDEPTLTMNFMVNTSP 318
E +D+A +G+IV++ E L + + D + P L +
Sbjct: 298 INGELCKIDKAYSGEIVILQN-EFLKLNSVLGDTKLLPQRERIENPLPLLQTTVEPSKPQ 356

Query: 319 LAGREGKFVTSRQLRDRLDRELKSNVALRVRDTGDDTVFEVSGRGELHLTILLETMRRE- 377
L D L S+ LR +S G++ + + ++ +
Sbjct: 357 QREM---------LLDALLEISDSDPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEKY 407

Query: 378 GYELAVSRPRVVFKE 392
E+ + P V++ E
Sbjct: 408 HVEIEIKEPTVIYME 422



Score = 33.7 bits (77), Expect = 0.003
Identities = 17/100 (17%), Positives = 32/100 (32%), Gaps = 1/100 (1%)

Query: 387 RVVFKEVDGVKCEPFESLTIDVEDAHQGGVMEELGRRKGDLQDMQPDGRGRTRLEYLIPA 446
V K+ EP+ S I + + + ++ D Q L IPA
Sbjct: 525 EQVLKKAGTELLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQL-KNNEVILSGEIPA 583

Query: 447 RGLIGFQNEFLTLTRGTGLMSHIFHEYAPIKEGSIGERRN 486
R + ++++ T G + Y + + R
Sbjct: 584 RCIQEYRSDLTFFTNGRSVCLTELKGYHVTTGEPVCQPRR 623


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_002971  TCRTETOQM     77     2e-16    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 76.8 bits (189), Expect = 2e-16
Identities = 69/277 (24%), Positives = 94/277 (33%), Gaps = 76/277 (27%)

Query: 525 VMGHVDHGKTSLLDYIRRAKVAAGEAG------------------GITQHIGAYHVETER 566
V+ HVD GKT+L + + A E G GIT G + E
Sbjct: 8 VLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQWEN 67

Query: 567 GMVTFLDTPGHEAFTAMRARGAKATDIVILVCAADDGVMPQTREAIHHAKAAGVPMVVAM 626
V +DTPGH F A R D IL+ +A DGV QTR H + G+P + +
Sbjct: 68 TKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIPTIFFI 127

Query: 627 TKIDKPSANPERVKQ--------------------------------------------- 641
KID+ + V Q
Sbjct: 128 NKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEGNDDLLE 187

Query: 642 ------ELVAEEVVPEEYGG--DVPFVPV---SAKTGEGIDALLENVLLQAELLELKAPV 690
L A E+ EE + PV SAK GID L+E + +
Sbjct: 188 KYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVIT--NKFYSSTHRG 245

Query: 691 DAQAKGLVIEARLDKGRGPVATILVQSGTLHRGDVVL 727
++ G V + + R +A I + SG LH D V
Sbjct: 246 QSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVR 282


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_003001  cloacin       32     0.009    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 31.6 bits (71), Expect = 0.009
Identities = 28/118 (23%), Positives = 36/118 (30%), Gaps = 10/118 (8%)

Query: 431 GEANGNRKGGKPTGGRGQAGAGGKSGGRGGKAGGGKARGVRAAAAGSAGAPEATVGAGRK 490
G+ G+ G T G G G G G G G + GS G+G
Sbjct: 4 GDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHG 63

Query: 491 PAGAKPGGKPAGARGGNAGRGNKPAGAGRAGNKAEGARAGGNKGPGAGGKPRAARGGS 548
G GN+G G+ G A PGAGG + G+
Sbjct: 64 NGGG----------NGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGA 111


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_003081  HTHFIS        87     3e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 87.2 bits (216), Expect = 3e-22
Identities = 32/125 (25%), Positives = 56/125 (44%), Gaps = 1/125 (0%)

Query: 2 RILLVEDDTMIGESVLDCLRAEHYAVDWVKDGHAAELALRTDPYDLVLLDLGLPRRDGLT 61
IL+ +DD I + L Y V + + DLV+ D+ +P +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 LLRELRARKDRTPVLIATARDAVSDRIAGLDAGADDYIVKPYDVDELLARM-RALIRRSA 120
LL ++ + PVL+ +A++ I + GA DY+ KP+D+ EL+ + RAL
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 121 GRAEP 125
++
Sbjct: 125 RPSKL 129


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_003091  56KDTSANTIGN  32     0.004    Rickettsia 56kDa type-specific antigen protein sign...
		>56KDTSANTIGN#Rickettsia 56kDa type-specific antigen protein signature.

Length = 533

Score = 32.2 bits (73), Expect = 0.004
Identities = 27/112 (24%), Positives = 48/112 (42%), Gaps = 12/112 (10%)

Query: 187 RQPEDLSPINVPDLP-------DEIRPLMQELNLLLERMRGAF--ALQKQFVGDAAHELR 237
+ D++ INVPD ++I+ +QEL LE +R +F + FV
Sbjct: 276 KPFADIAGINVPDTGLPNSASIEQIQSKIQELGDTLEELRDSFDGYINNAFVNQIHLNFV 335

Query: 238 SPLAALKLQLQSLRRAGDDASRRVAEE---RLAAGIERATRLVEQLLSMARH 286
P A + Q Q ++ ++ RL G ++ +L + L+ + RH
Sbjct: 336 MPPQAQQQQGQGQQQQAQATAQEAVAAAAVRLLNGSDQIAQLYKDLVKLQRH 387


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_003101  V8PROTEASE    85     3e-20    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 85.1 bits (210), Expect = 3e-20
Identities = 39/214 (18%), Positives = 72/214 (33%), Gaps = 39/214 (18%)

Query: 117 RGEGSGFIVSNDGIILTNAHVVQGAKEVTVKLTDRREFR------------AKVLGADTQ 164
SG +V +LTN HVV L ++ +
Sbjct: 101 TFIASGVVV-GKDTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGE 159

Query: 165 TDVAVIKIDAR--------NLPVVKIGDVNKLQVGEWVLAIGSPYGLENTATAGIVSAKG 216
D+A++K + + + + QV + + G P AT K
Sbjct: 160 GDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGDK-PVATMWESKGKI 218

Query: 217 RSLPDDTSVPFIQTDVAVNPGNSGGPLFNDRGEVVGINSQIYSRTGGFQGLSFSIPIDVA 276
L +Q D++ GNSG P+FN++ EV+GI+ G+ V
Sbjct: 219 TYLK----GEAMQYDLSTTGGNSGSPVFNEKNEVIGIHW---------GGVPNEFNGAVF 265

Query: 277 Y--KIKDQILEHGKVQHARLGVTVQEVNQDLANS 308
+++ + ++ ++ Q N D ++
Sbjct: 266 INENVRNFLKQN--IEDIHFANDDQPNNPDNPDN 297


91  NH44784_005791  NH44784_005881  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_005791  2       14       0.262025   Peptidyl-prolyl cis-trans isomerase
NH44784_005801  1       10       1.273390   DNA ligase
NH44784_005811  2       10       1.850224   hypothetical protein
NH44784_005821  1       10       0.952159   Chromosome partition protein smc
NH44784_005831  -1      8        0.181694   hypothetical protein
NH44784_005841  -2      7        -0.379686  Probable transmembrane protein
NH44784_005851  0       8        -0.072869  Uracil-DNA glycosylase, family 4
NH44784_005861  -1      8        -1.394651  Inactive homolog of metal-dependent
NH44784_005871  0       10       -2.401293  Two-component system response regulator OmpR
NH44784_005881  -2      9        -1.092514  Osmolarity sensory histidine kinase EnvZ
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_005791  SOPEPROTEIN   28     0.038    Salmonella type III secretion SopE effector protein ...
		>SOPEPROTEIN#Salmonella type III secretion SopE effector protein signature.

Length = 239

Score = 27.8 bits (61), Expect = 0.038
Identities = 14/40 (35%), Positives = 21/40 (52%)

Query: 16 PAFAQNVATVNGKAIPQKSLDQFVKLLVSQGATDSPQLRE 55
PA+A A+ K+ DQ LL+S+G +P L+E
Sbjct: 103 PAYASQTREAILSAVYSKNKDQCCNLLISKGINIAPFLQE 142


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_005821  GPOSANCHOR    52     1e-08    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 52.0 bits (124), Expect = 1e-08
Identities = 51/334 (15%), Positives = 108/334 (32%), Gaps = 4/334 (1%)

Query: 168 AGVSRYKERRRETENRLSDTRENLTRVEDILRELNSQLEKLEAQAEVATRYRELQADGEK 227
+ E+ +E ++ L L N L+ + + +
Sbjct: 46 RSQTDTLEKVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKND 105

Query: 228 KQHALWFLKETGAREERAKKAQEMAQAQNELEAAIAGLRAGEAALESRRQAHYAASDAVH 287
K + K +A + + A N A A ++ EA + A+
Sbjct: 106 KSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALE 165

Query: 288 TAQGALYEANAQVSRLEAEIRHVVDSRNRLQARRDQLQQQIAEWNSQREHCAEQIAQAEE 347
A +A++ LEAE + + L+ + +++ + + A
Sbjct: 166 GAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAA 225

Query: 348 DLAAGAARTEEARAAAEDAQAALPSVEQRVREAASGRDEMRAALARVEQNLALVAQTQRD 407
A E A + A + ++E + + E+ AL + +
Sbjct: 226 RKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKT 285

Query: 408 ADRQMQALEQRRERLQQELRELHSPDPVRLEQLAGDRVAGEEQLEEAQAELATLEGKVPD 467
+ + ALE + L+ + + L+ + L D A E ++ +AE LE +
Sbjct: 286 LEAEKAALEAEKADLEHQSQVLN----ANRQSLRRDLDASREAKKQLEAEHQKLEEQNKI 341

Query: 468 ADAERSRAQAAAQTDAQNLARLEARLSALVKLQE 501
++A R + + +LEA L + +
Sbjct: 342 SEASRQSLRRDLDASREAKKQLEAEHQKLEEQNK 375



Score = 50.8 bits (121), Expect = 3e-08
Identities = 44/322 (13%), Positives = 99/322 (30%), Gaps = 3/322 (0%)

Query: 168 AGVSRYKERRRETENRLSDTRENLTRVEDILRELNSQLEKLEAQAEVATRYRELQADGEK 227
A + + E S + L + L + LEK A + + +
Sbjct: 123 ADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLE 182

Query: 228 KQHALWFLKETGAREERAKKAQEMAQAQNELEAAIAGLRAGEAALESRRQAHYAASDAVH 287
+ A ++ + +++ A A A +A A +
Sbjct: 183 AEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFST 242

Query: 288 TAQGALYEANAQVSRLEAEIRHVVDSRNRLQARRDQLQQQIAEWNSQREHCAEQIAQAEE 347
+ A+ + LEA + + +I +++ + A E
Sbjct: 243 ADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEH 302

Query: 348 DLAAGAARTEEARAAAEDAQAALPSVEQRVREAASGRDEMRAALARVEQNLALVAQTQRD 407
A + R + ++ A +E ++ A+ + ++L + ++
Sbjct: 303 QSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQ 362

Query: 408 ADRQMQALE---QRRERLQQELRELHSPDPVRLEQLAGDRVAGEEQLEEAQAELATLEGK 464
+ + Q LE + E +Q LR +Q+ +L + LE
Sbjct: 363 LEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQVEKALEEANSKLAALEKLNKELEES 422

Query: 465 VPDADAERSRAQAAAQTDAQNL 486
+ E++ QA + +A+ L
Sbjct: 423 KKLTEKEKAELQAKLEAEAKAL 444



Score = 38.9 bits (90), Expect = 1e-04
Identities = 47/268 (17%), Positives = 103/268 (38%), Gaps = 8/268 (2%)

Query: 663 QQEIENLQREIKAQQLIADQARAAVARAEVAWQQVSQAIAPARQRVAEVTRRVHDIQLEH 722
+ L++ ++ + A + E ++ A + + +
Sbjct: 189 EARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKI 248

Query: 723 SRLQQQAEQSGQHAARLRQDLEEIAAHEEDLRATREEAEARFEALDEELAEHQSRFADAE 782
L+ + A L + LE A + EA AL+ E A+ + +
Sbjct: 249 KTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLN 308

Query: 783 MDGETLAAQAEAARARLRELERAAQEAEFAERGIQVRITDLQRNQQLAADQSQRGSVELE 842
+ ++L +A+R ++LE Q+ E + + L+R+ + + ++ E +
Sbjct: 309 ANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQ 368

Query: 843 QLLADLVDLDASASQAGLQDALEVRAEREEALSRARQEME-NLAALLRGADEDRLQQERT 901
+L +AS L+ L+ E ++ + +A +E LAAL + E ++ T
Sbjct: 369 KLEEQNKISEASR--QSLRRDLDASREAKKQVEKALEEANSKLAALEKLNKELEESKKLT 426

Query: 902 LEPRRARITELQ-----LQEQAARLAEE 924
+ + +L+ L+E+ A+ AEE
Sbjct: 427 EKEKAELQAKLEAEAKALKEKLAKQAEE 454


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_005851  PF05616       31     0.008    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 30.9 bits (69), Expect = 0.008
Identities = 21/55 (38%), Positives = 25/55 (45%), Gaps = 5/55 (9%)

Query: 59 TPGGAPPAPVRDAAPAPAAVASPVAARSAGPGPGLRNGPPPKPAPKPEVAADAAP 113
TPG A AP +A P P SP + P P G P P P P++ DA P
Sbjct: 316 TPGSAE-AP--NAQPLPEV--SPAENPANNPAPNENPGTRPNPEPDPDLNPDANP 365


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_005861  SACTRNSFRASE  47     2e-08    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 46.9 bits (111), Expect = 2e-08
Identities = 20/73 (27%), Positives = 35/73 (47%)

Query: 286 IAVARSLHRQGLGSVLLDWCEQQARERSLDGVLLEVRPSNTAAVEFYKARGYLQIGLRRG 345
IAVA+ ++G+G+ LL + A+E G++LE + N +A FY ++ +
Sbjct: 95 IAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGAVDTM 154

Query: 346 YYPAEKGGREDAL 358
Y E A+
Sbjct: 155 LYSNFPTANEIAI 167


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_005871  HTHFIS        102    4e-27    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 102 bits (255), Expect = 4e-27
Identities = 38/135 (28%), Positives = 69/135 (51%), Gaps = 1/135 (0%)

Query: 11 KILVVDDDPRLRDLLRRYLSEQGFNVFVAEDAKEMGKLWQREHFDLLVLDLMLPGEDGLS 70
ILV DDD +R +L + LS G++V + +A + + DL+V D+++P E+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 71 ICRRLRGGHDNTPIIMLTAKAEEIDRIVGLEMGADDYLSKPFNPRELLARI-NAILRRRG 129
+ R++ + P+++++A+ + I E GA DYL KPF+ EL+ I A+ +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 130 TEEHPGAPSQENESI 144
SQ+ +
Sbjct: 125 RPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_005881  PF06580       42     4e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 41.8 bits (98), Expect = 4e-06
Identities = 17/104 (16%), Positives = 37/104 (35%), Gaps = 22/104 (21%)

Query: 351 LIENARRYG-RSTDGMAHLVMTLQAEGGMIVIEVSDRGPGIAPEDVDRLLRPFSRGEAAR 409
L+EN ++G +++ + G + +EV + G +
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKE------------- 309

Query: 410 TGVSGAGLGLAIVERLLKHVGG---ALRMLPREGGGLTARIELP 450
G GL V L+ + G +++ ++G A + +P
Sbjct: 310 ----STGTGLQNVRERLQMLYGTEAQIKLSEKQGKV-NAMVLIP 348


92  NH44784_006101  NH44784_006141  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_006101  4       20       -4.027649  NADH-ubiquinone oxidoreductase chain C
NH44784_006111  3       18       -3.141289  NADH-ubiquinone oxidoreductase chain B
NH44784_006121  2       18       -1.557984  NADH ubiquinone oxidoreductase chain A
NH44784_006131  -1      15       -0.089877  outer membrane porin
NH44784_006141  -1      11       2.383500   Transcriptional regulator, TetR family
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_006101  PF04183       28     0.045    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 27.5 bits (61), Expect = 0.045
Identities = 16/72 (22%), Positives = 21/72 (29%), Gaps = 27/72 (37%)

Query: 100 WRLRVRTWAPDDEFPMV-ASLMEC----------------WPAVGWFER----------E 132
WR W DE P++ A+LMEC A W +
Sbjct: 351 WRENPCRWLKPDESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVPLYH 410

Query: 133 AFDLYGIVFEGH 144
YG+ H
Sbjct: 411 LLCRYGVALIAH 422


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_006121  HOKGEFTOXIC   25     0.037    Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 24.8 bits (54), Expect = 0.037
Identities = 10/30 (33%), Positives = 13/30 (43%)

Query: 4 QQYFPVLLFIVVATLIGFALLTAGSLLGPR 33
+ + IV TL+ F LT SL R
Sbjct: 5 RSSLVWCVLIVCLTLLIFTYLTRKSLCEIR 34


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_006131  ECOLNEIPORIN  101    2e-26    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 101 bits (253), Expect = 2e-26
Identities = 84/388 (21%), Positives = 141/388 (36%), Gaps = 58/388 (14%)

Query: 1 MKKTLLAAALLAGFAGVAQAETSVTLYGIIDTGIGYNK-ISGAGDAKNGSKIGMINGVQN 59
MKK+L+A L A A VTLYG I G+ ++ ++ G + G V
Sbjct: 1 MKKSLIALTLAAL---PVAAMADVTLYGTIKAGVETSRSVAHNGAQAASVETGT-GIVDL 56

Query: 60 GSRWGLRGSEDLGDGLRAVFQLESGFDSGNGKSGQNGRLFGRQATVGLASDSWGQLDFGR 119
GS+ G +G EDLG+GL+A++Q+E +G + RQ+ +GL +G+L GR
Sbjct: 57 GSKIGFKGQEDLGNGLKAIWQVEQK----ASIAGTDSGWGNRQSFIGLKGG-FGKLRVGR 111

Query: 120 QTNIASKYFGSIDPFGAGFNVANIGTGMSAANTQRYDNMVMYQTPSFSGFQFGVGYSFSA 179
++ G I+P+ + ++ A + V Y +P F+G V Y+ +
Sbjct: 112 LNSVLKDT-GDINPW---DSKSDYLGVNKIAEPEARLISVRYDSPEFAGLSGSVQYALND 167

Query: 180 DEGNADGSTKADPDRVGFKTADNVRAITTGLRYVNGPLNVALTYDQLNASHNQAQGEVDA 239
+ G + + G Y NG V Q ++
Sbjct: 168 NAGRHNS-----------------ESYHAGFNYKNGGFFVQYGGAYKRHHQVQENVNIE- 209

Query: 240 TPRSYMIGGSYDFEVVKLALAYARTTDGWFATSNPAGASGTITVDGTSRKLGFGANQFAD 299
+ + + YD + + ++ + D A S + + A +F +
Sbjct: 210 KYQIHRLVSGYDNDALYASV-AVQQQD---AKLVEENYSHNSQTEVAAT----LAYRFGN 261

Query: 300 GFKANSYLLGLSAPIGGASSLFGSWQRVDPSNNKLTGDDATMNVFSAGYTYDLSKRTNLY 359
SY G + + G YD SKRT+
Sbjct: 262 VTPRVSYAHGFKGSFDAT------------------NYNNDYDQVVVGAEYDFSKRTSAL 303

Query: 360 AYGSYSKNYAFNDGVKATAVGVGLRHRF 387
+ + +TA GVGLRH+F
Sbjct: 304 VSAGWLQEGKGESKFVSTAGGVGLRHKF 331


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_006141  HTHTETR       68     2e-16    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 68.1 bits (166), Expect = 2e-16
Identities = 33/150 (22%), Positives = 55/150 (36%), Gaps = 6/150 (4%)

Query: 8 ADRNERLSALRRQMILDAAQRVFERDGLEKTTIRAIAKEAGCTTGAIYPWFAGKEILYGA 67
A + ++ + RQ ILD A R+F + G+ T++ IAK AG T GAIY F K L+
Sbjct: 2 ARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE 61

Query: 68 LLDESLQRLHAHLQAATADC--APAAAARQAILAFFGYYAERRTDFSLGLYLFQ---GLG 122
+ + S + A P + R+ ++ L +F +G
Sbjct: 62 IWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVG 121

Query: 123 PRGLGRDMDEQLNGRLRQ-CVDVLGQALAR 151
+ + L L +
Sbjct: 122 EMAVVQQAQRNLCLESYDRIEQTLKHCIEA 151


93  NH44784_006231  NH44784_006351  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_006231  -1      14       1.870286   Transcription repressor of multidrug efflux pump
NH44784_006241  -1      15       1.208950   RND efflux system, membrane fusion protein CmeA
NH44784_006251  -1      15       1.238423   RND efflux system, inner membrane transporter
NH44784_006261  -2      12       0.720834   RND efflux system, outer membrane lipoprotein
NH44784_006271  -2      10       -0.186447  Transcriptional regulator, DeoR family
NH44784_006281  -1      9        0.391198   RND efflux system, outer membrane lipoprotein
NH44784_006291  0       13       -0.248293  RND multidrug efflux transporter; Acriflavin
NH44784_006301  -1      10       0.780242   Probable Co/Zn/Cd efflux system membrane fusion
NH44784_006311  -1      12       0.414850   High-affinity choline uptake protein BetT
NH44784_006321  -1      15       0.909156   Transcriptional regulator, TetR family
NH44784_006331  -1      13       0.526062   Probable Co/Zn/Cd efflux system membrane fusion
NH44784_006341  -2      12       -0.162448  Acriflavin resistance protein
NH44784_006351  -2      11       0.311380   RND efflux system, outer membrane
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_006231  HTHTETR       102    5e-29    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 102 bits (254), Expect = 5e-29
Identities = 60/213 (28%), Positives = 103/213 (48%), Gaps = 6/213 (2%)

Query: 1 MARKTKEESQRTRDRILDAAEHVFLSKGVASTTMSDIADFAGVSRGAVYGHYKNKIDVCI 60
MARKTK+E+Q TR ILD A +F +GV+ST++ +IA AGV+RGA+Y H+K+K D+
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 AMCDRA---LGEAVSLTRVSTDGEALESLYASMRQYVQIYAEEGSVQRVLEILYLKCERS 117
+ + + +GE + G+ L L + ++ E + ++EI++ KCE
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 118 DENAPLLRRRDLWERHALRTSEKLLRAAVSREDLPRALDVRLSNVYLHSLIEGVFGTICW 177
E A + + + + E+ L+ + + LP L R + + + I G+ W
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMEN--W 178

Query: 178 SDRLKGDIWPRVERML-RAAIDTLRLSPQLRTP 209
+ + R ++ L P LR P
Sbjct: 179 LFAPQSFDLKKEARDYVAILLEMYLLCPTLRNP 211


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_006241  RTXTOXIND     44     7e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 44.0 bits (104), Expect = 7e-07
Identities = 29/199 (14%), Positives = 58/199 (29%), Gaps = 15/199 (7%)

Query: 37 IGVIVAKATPTSVASELPGRLEPY-REAEVRARVAGIVTRRLYEEGQEVARGAPLFQIDP 95
I I++ + + G+L R E++ IV + +EG+ V +G L ++
Sbjct: 70 IAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTA 129

Query: 96 -------APLQAAYDSEAAALARAQANLSAAADKLRRYADLVSDRAISERDHAESVAEER 148
Q++ R Q LS + + + + D + E V
Sbjct: 130 LGAEADTLKTQSSLLQARLEQTRYQI-LSRSIELNKLPELKLPDEPYFQNVSEEEVLRLT 188

Query: 149 QARAEVALAKANLQSARLRLEYARVTSPIDGRARRALVTEGALVGEGQATPLTVVQQLDP 208
E N Q + L + + R E E
Sbjct: 189 SLIKEQFSTWQN-QKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRL-----DDFSS 242

Query: 209 IYVNFSQPAAEVMQLQKQI 227
+ + V++ + +
Sbjct: 243 LLHKQAIAKHAVLEQENKY 261



Score = 39.0 bits (91), Expect = 3e-05
Identities = 33/231 (14%), Positives = 68/231 (29%), Gaps = 30/231 (12%)

Query: 80 EGQEVARGAPLFQIDPAPLQAAYDSEAAALARAQANLSAAADKL----RRYADLVSDRAI 135
E + + L A + E A L +L + +
Sbjct: 233 EKSRLDDFSSLLHKQAIAKHAVLEQENK-YVEAVNELRVYKSQLEQIESEILSAKEEYQL 291

Query: 136 SERDHAESVAEE-RQARAEVALAKANLQSARLRLEYARVTSPIDGR-ARRALVTEGALVG 193
+ + ++ RQ + L L R + + + +P+ + + + TEG +V
Sbjct: 292 VTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVT 351

Query: 194 EGQA--------TPLTVVQQLDPIYVNFSQPAAEV---MQLQKQIRAGALEGVAPDKMRV 242
+ L V + + F ++ R G L G +V
Sbjct: 352 TAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVG------KV 405

Query: 243 RLLLPDGSEYGQGGTLSFADLAVDPGTDNVTMRALFDNPDRVLLPGMYVRV 293
+ + D E + G + ++++ N + L GM V
Sbjct: 406 KNINLDAIEDQRLGLVFNVIISIE------ENCLSTGNKNIPLSSGMAVTA 450


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_006251  ACRIFLAVINRP  1111   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1111 bits (2874), Expect = 0.0
Identities = 540/1030 (52%), Positives = 724/1030 (70%), Gaps = 6/1030 (0%)

Query: 1 MARFFIDRPVFAWVISLLIALVGLLSIRALPVAQYPDIAPPVVNIGASYPGASAKVVEEA 60
MA FFI RP+FAWV+++++ + G L+I LPVAQYP IAPP V++ A+YPGA A+ V++
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTAIIEREMNGAPGLMYTSSSSDSTGWASINLTFKQGTNPDIAAVEVQNRLKAVEPRLPE 120
VT +IE+ MNG LMY SS+SDS G +I LTF+ GT+PDIA V+VQN+L+ P LP+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 SVRRDGVRVEKAADNIQLVVSLKSD-GSLDDMQLGELAASNVLQALRRVEGVGKVQSFGA 179
V++ G+ VEK++ + +V SD + + ASNV L R+ GVG VQ FGA
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 180 EAAMRIWPDPAKLTALSLTPGDIVSALRSHNARVTIGELGNQAVPQDAPLNASIVAGESL 239
+ AMRIW D L LTP D+++ L+ N ++ G+LG LNASI+A
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 240 HTPEQFANIPLRAQPDGATLRLKDVARVELGGTDYMYLSRVNGMTGTGLGIKLAPGSNAV 299
PE+F + LR DG+ +RLKDVARVELGG +Y ++R+NG GLGIKLA G+NA+
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 300 ETTRRIRETMRELAQYFPPGVSWDIPYETSTFVEISIQKVLMTLLEAVALVFCVMYLFMQ 359
+T + I+ + EL +FP G+ PY+T+ FV++SI +V+ TL EA+ LVF VMYLF+Q
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 360 NLRATLIPTLVVPVALLGTLGVMLGLGYSINVLTMFGMVLAIGILVDDAIVVVENVERIM 419
N+RATLIPT+ VPV LLGT ++ GYSIN LTMFGMVLAIG+LVDDAIVVVENVER+M
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 420 AEEGLSPHDATVKAMGQISGAIVGITVVLVSVFVPMAFFDGAVGNIYRQFAVTLAVSIAF 479
E+ L P +AT K+M QI GA+VGI +VL +VF+PMAFF G+ G IYRQF++T+ ++A
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 480 SAFLALSLTPALCASLLKPIPAGHHE-KRGFFGWFNRAFARLTTRYTARVAGVLARPVRF 538
S +AL LTPALCA+LLKP+ A HHE K GFFGWFN F YT V +L R+
Sbjct: 481 SVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY 540

Query: 539 GLAYALVIGVAALLFARLPSSFLPDEDQGSFMAMVILPQGSPQAETMAVVKDVERYMMEH 598
L YAL++ +LF RLPSSFLP+EDQG F+ M+ LP G+ Q T V+ V Y +++
Sbjct: 541 LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN 600

Query: 599 EP--VQYVYSVNGFSQYGSGPNSAMFFVTLKDWKERRDASLHVDAVVKRINEAFADRKNL 656
E V+ V++VNGFS G N+ M FV+LK W+ER +AV+ R ++
Sbjct: 601 EKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDG 660

Query: 657 MVFALNSPPLPDLGSTSGFDFRLQDRGGLGYEALTQARQKLLAAAAEHPA-LTDVVFAGQ 715
V N P + +LG+ +GFDF L D+ GLG++ALTQAR +LL AA+HPA L V G
Sbjct: 661 FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGL 720

Query: 716 EEAPQLQLRIDRDKAQAMGVPIDEINTALAVMYGSDYIGDFMLNGQVRRVTVQADGKRRV 775
E+ Q +L +D++KAQA+GV + +IN ++ G Y+ DF+ G+V+++ VQAD K R+
Sbjct: 721 EDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRM 780

Query: 776 DVDDISRLHVRNLQGQMVPLSAFATLKWSMGPPQLNRYNGFPSFTINGSAAPGHSSGEAM 835
+D+ +L+VR+ G+MVP SAF T W G P+L RYNG PS I G AAPG SSG+AM
Sbjct: 781 LPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAM 840

Query: 836 RAMETLAAELPRGIGFDWSGQSYEERLSGNQAPVLFALSVLIVFLALAALYESWSIPLAV 895
ME LA++LP GIG+DW+G SY+ERLSGNQAP L A+S ++VFL LAALYESWSIP++V
Sbjct: 841 ALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV 900

Query: 896 ILVVPLGVIGALLGVTVRGMPNDIYFKVGLIATIGLSAKNAILIVEVAKDL-VRDGQGIL 954
+LVVPLG++G LL T+ ND+YF VGL+ TIGLSAKNAILIVE AKDL ++G+G++
Sbjct: 901 MLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVV 960

Query: 955 SATLEAARLRLRPIVMTSLAFGVGVLPLALASGAASGAQAAIGTGVLGGIITATVLAVFL 1014
ATL A R+RLRPI+MTSLAF +GVLPLA+++GA SGAQ A+G GV+GG+++AT+LA+F
Sbjct: 961 EATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFF 1020

Query: 1015 VPLFFLIVGR 1024
VP+FF+++ R
Sbjct: 1021 VPVFFVVIRR 1030


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_006291  ACRIFLAVINRP  800    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 800 bits (2067), Expect = 0.0
Identities = 309/1029 (30%), Positives = 519/1029 (50%), Gaps = 29/1029 (2%)

Query: 5 DLFVRRPVLALVVSTLLLLMGLRALSGLPIRQYPLTESTTITITTQYPGASPDLMQGFVT 64
+ F+RRP+ A V++ +L++ G A+ LP+ QYP ++++ YPGA +Q VT
Sbjct: 3 NFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVT 62

Query: 65 QPIAQSVATVEGIDYLSSSSTQ-GRSVITVRMKLNADSNKAMTEVMAKVNEVKYRLPQDA 123
Q I Q++ ++ + Y+SS+S G IT+ + D + A +V K+ LPQ+
Sbjct: 63 QVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEV 122

Query: 124 YDPVLVKSAGEATAVAYVGFSSST--MSLPALTDYLSRVVQPQLSSIDGVASVELYGGQK 181
+ ++ + GF S + ++DY++ V+ LS ++GV V+L+G Q
Sbjct: 123 QQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQY 182

Query: 182 LAMRVWLDPDRLAARGISAGELAEALRQNNVQAAPGQAKGLYVVS------NIKVNTDLV 235
AMR+WLD D L ++ ++ L+ N Q A GQ G + +I T
Sbjct: 183 -AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 236 NVAEFRDMVVKREGD-AIVRLGDVATVELGAASTDSSGLMDGEPAVYFGLNATPVGNPLV 294
N EF + ++ D ++VRL DVA VELG + + ++G+PA G+ N L
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 295 IVKRLNELLPNIKQNLPPGTSVQVPFELARFISASIDGVVHTLMEAIVIVVAVIFLCLGS 354
K + L ++ P G V P++ F+ SI VV TL EAI++V V++L L +
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 355 LRAVLIPVVTIPLSMLGGAAIMALFGFSVNLLTLLAMVLAIGLVVDDAIVVVENVHRHIE 414
+RA LIP + +P+ +LG AI+A FG+S+N LT+ MVLAIGL+VDDAIVVVENV R +
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 415 EGKS-PVRAALIGAREVAGPVIAMTITLAAVYAPIGLMGGLTGSLFKEFAFTLAGAVVVS 473
E K P A ++ G ++ + + L+AV+ P+ GG TG+++++F+ T+ A+ +S
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 474 GVIALTLSPVMSSFLLNSHVSE-----GWMARRAEHFFNRLGQAYGRLLDVSLHHRWVTG 528
++AL L+P + + LL +E G F+ Y + L
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYL 541

Query: 529 LVAVAVMASLPFLYNAAQRELAPVEDQSTVLTAIKSPQHANIDYVEKFGKK-WDDVMATL 587
L+ ++A + L+ P EDQ LT I+ P A + +K + D +
Sbjct: 542 LIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNE 601

Query: 588 PEQKGRWL-ING----SDGVSNSIGGVDFVDWQKRKR---SADQIQADMQSMVNQIEGSN 639
+NG + + V W++R SA+ + + + +I
Sbjct: 602 KANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGF 661

Query: 640 IFAFQLPP--LPGSTGGLPVQMVIMSAADYPVVYEAMENL-KQAARASGLFMVVDSDLDY 696
+ F +P G+ G +++ + + + +A L AA+ + V +
Sbjct: 662 VIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGLE 721

Query: 697 NNPVVQLKVDRAKANSLGVTMKAVGDTLAVLVGENYVNRFGMDGRSYDVIPQSLREQRIS 756
+ +L+VD+ KA +LGV++ + T++ +G YVN F GR + Q+ + R+
Sbjct: 722 DTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRML 781

Query: 757 PESLARYYVKSASGAQVPLSNIVSVSMGVEPNKLTQFNQLNAATFQAIPMPGVTMGDAVQ 816
PE + + YV+SA+G VP S + +L ++N L + Q PG + GDA+
Sbjct: 782 PEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAMA 841

Query: 817 FLTRQAQELPAGFSYDWQSDARQFSQEGSALMITFIFAIIVIYLVLAAQYESLRDPFIIL 876
+ A +LPAG YDW + Q G+ + +V++L LAA YES P ++
Sbjct: 842 LMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSVM 901

Query: 877 VSVPLSICGALIPLALGMATINIYTQIGLVTLIGLISKHGILMVEFANEMQVSHGLDRRS 936
+ VPL I G L+ L ++Y +GL+T IGL +K+ IL+VEFA ++ G
Sbjct: 902 LVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVVE 961

Query: 937 AMEHAARIRLRPILMTTAAMVVGLIPLLFASGAGAHSRYSLGLVIVVGMLVGTLFTLFVL 996
A A R+RLRPILMT+ A ++G++PL ++GAG+ ++ ++G+ ++ GM+ TL +F +
Sbjct: 962 ATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFFV 1021

Query: 997 PTVYTVLAR 1005
P + V+ R
Sbjct: 1022 PVFFVVIRR 1030


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
NH44784_006301  RTXTOXIND  54     5e-10    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 53.7 bits (129), Expect = 5e-10
Identities = 22/141 (15%), Positives = 56/141 (39%), Gaps = 7/141 (4%)

Query: 105 QGELARLQAQAKNARALLERTRRLVPQQAATREQLDQAQADYDQAAGEIKRIQALIEQKR 164
+ +L +++++ +A+ + +L + ++L Q + E+ + + +
Sbjct: 272 KSQLEQIESEILSAKEEYQLVTQLF--KNEILDKLRQTTDNIGLLTLELAKNEERQQASV 329

Query: 165 IKAPFDGVLGVRRVH-LGQFARAGDPLVSLT-DAASVYANITLPEQALGALRMGQPVSIT 222
I+AP + +VH G + L+ + + ++ + + +G + +GQ I
Sbjct: 330 IRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIK 389

Query: 223 VDAHAGR---VFEGTVTTIEP 240
V+A G V I
Sbjct: 390 VEAFPYTRYGYLVGKVKNINL 410



Score = 47.1 bits (112), Expect = 6e-08
Identities = 22/124 (17%), Positives = 48/124 (38%), Gaps = 7/124 (5%)

Query: 44 AVAPVLQADFPVALSGIASLEATRQVLVPAEVDGRVAQILFTPGERVRAGQLLVQLNDAP 103
+ V +G + + + P E + V +I+ GE VR G +L++L
Sbjct: 76 VLGQVEIV---ATANGKLTHSGRSKEIKPIE-NSIVKEIIVKEGESVRKGDVLLKLTALG 131

Query: 104 EQGELARLQAQAKNARALLERTRRLVPQQAATREQLDQAQADYDQ---AAGEIKRIQALI 160
+ + + Q+ AR R + L + + + + E+ R+ +LI
Sbjct: 132 AEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLI 191

Query: 161 EQKR 164
+++
Sbjct: 192 KEQF 195


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_006321  HTHTETR  78     4e-20    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 78.1 bits (192), Expect = 4e-20
Identities = 42/204 (20%), Positives = 83/204 (40%), Gaps = 11/204 (5%)

Query: 5 RLTREQSRDQTRQRLLDAAQSIFLTKGFVAASVEDIAELAGYTRGAFYSNFASKSELFLQ 64
R T+++++ +TRQ +LD A +F +G + S+ +IA+ AG TRGA Y +F KS+LF +
Sbjct: 3 RKTKQEAQ-ETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE 61

Query: 65 LLKRDHENVMSDMRAIFEA--GETRQQMEDSVLHY--YSNHFRDNECFLLWMEAKLQAAR 120
+ + N+ G+ + + ++H + + + K +
Sbjct: 62 IWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVG 121

Query: 121 DPEFRIGFIACM-GELRDATTEYIRQFSERVGTPLPLPARELAIGLLALSDGMQFSFAFD 179
+ + E D + ++ E P L R AI + G+ ++ F
Sbjct: 122 EMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFA 181

Query: 180 PQNVSAETT-----ESVLAGFFRR 198
PQ+ + +L +
Sbjct: 182 PQSFDLKKEARDYVAILLEMYLLC 205


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
NH44784_006331  RTXTOXIND  45     4e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 44.8 bits (106), Expect = 4e-07
Identities = 18/127 (14%), Positives = 44/127 (34%), Gaps = 5/127 (3%)

Query: 93 VGGKIIERKVRLGDTVKPGQVVARLDPADATKNAAAARAQLSAAQ-----HQLDYAKQQL 147
+ E V+ G++V+ G V+ +L A + ++ L A+ +Q+ +L
Sbjct: 103 ENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIEL 162

Query: 148 DRDRAQARENLIAANQLEQTRNAYASALAQRDQAAQQAALSADQLNYTTLVADHAGVITA 207
++ + + + ++L + + Q +LN A+ V+
Sbjct: 163 NKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLAR 222

Query: 208 EQADTGQ 214

Sbjct: 223 INRYENL 229



Score = 33.6 bits (77), Expect = 0.001
Identities = 25/198 (12%), Positives = 61/198 (30%), Gaps = 28/198 (14%)

Query: 120 ADATKNAAAARAQLSAAQHQLDYAKQQLDRDRAQARENLIAANQLEQTRNAYASALAQRD 179
+A ++QL + ++ AK++ + + +
Sbjct: 262 VEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKN---------EILDKLRQTTDNIG 312

Query: 180 QAAQQAALSADQLNYTTLVADHAGVITAEQADT-GQNVAAGQAVYNLAWSGD---VDALC 235
+ A + ++ + + A + + + T G V + + + D V AL
Sbjct: 313 LLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTAL- 371

Query: 236 DVPESVLAGLAIGQRATVTLAPLPGKTFSAV---LREIAPAADPQSRT---YRAKLTLES 289
V + + +GQ A + + P + + ++ I A R + +++E
Sbjct: 372 -VQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLGLVFNVIISIEE 430

Query: 290 PSPEVRL-------GMTA 300
GM
Sbjct: 431 NCLSTGNKNIPLSSGMAV 448



Score = 29.4 bits (66), Expect = 0.025
Identities = 19/113 (16%), Positives = 40/113 (35%), Gaps = 7/113 (6%)

Query: 96 KIIERKVRLGDTVKPGQVVARLDPADATKNAAAARAQLSAAQHQLDYAKQQLDRDRAQ-- 153
+ IE + + + + + + Q S Q+Q + LD+ RA+
Sbjct: 158 RSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERL 217

Query: 154 ----ARENLIAANQLEQTRNAYASALAQRDQAAQQAALSADQLNYTTLVADHA 202
+++E++R S+L QA + A+ + Y V +
Sbjct: 218 TVLARINRYENLSRVEKSRLDDFSSLLH-KQAIAKHAVLEQENKYVEAVNELR 269


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_006341  ACRIFLAVINRP  472    e-152    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 472 bits (1217), Expect = e-152
Identities = 241/1047 (23%), Positives = 442/1047 (42%), Gaps = 62/1047 (5%)

Query: 22 LSAWALRHQPLVIFLITLVTLFGVLSYSRLAQSEDPPFTFRVMVIKTLWPGATSQQVQEQ 81
++ + +R L ++ + G L+ +L ++ P + + +PGA +Q VQ+
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 82 VTDRIAKKLQETPSTDFLRSYS-RPGESLIFFTMKDSAPASEVANEWYQVRKKVGDIGAT 140
VT I + + + ++ S S G I T + QV+ K+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQV---QVQNKLQLATPL 117

Query: 141 LPQGVQGP-FFNDEFGDVYTNIYTLAGDG--FSPAQLRDYAD-NLRTVLLRVPGVAKVDY 196
LPQ VQ ++ Y + D + + DY N++ L R+ GV V
Sbjct: 118 LPQEVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQL 177

Query: 197 FGEQPEHVYIEISNTQLTRLGVSPQQIAEAINSQNAVASAGTLTTADD------RVFVRP 250
FG Q + I + L + ++P + + QN +AG L +
Sbjct: 178 FGAQYA-MRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIA 236

Query: 251 SGQFQDTRALAETLIRVN--GKSIRLGDIATIHRGYDDPPAEQMRTRGEPVLGIGITMQP 308
+F++ + +RVN G +RL D+A + G ++ R G+P G+GI +
Sbjct: 237 QTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENY-NVIARINGKPAAGLGIKLAT 295

Query: 309 GQDVVHLGKALEAKFGELKARLPAGLTLTEVSSMPKAVSFSVDEFLRSVAEAVAIVLIVS 368
G + + KA++AK EL+ P G+ + V S+ E ++++ EA+ +V +V
Sbjct: 296 GANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVM 355

Query: 369 LVSLG-LRTGMVVVISIPVVLAVTALFMDMFGIGLHKVSLGTLVLALGLLVDDAIIAVEM 427
+ L +R ++ I++PVVL T + FG ++ +++ +VLA+GLLVDDAI+ VE
Sbjct: 356 YLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVEN 415

Query: 428 MA-VKLEQGWSRARAAAFAYTSTAFPMLTGTLVTVAGFLPIALAKSSTGEYTRSIFQVSA 486
+ V +E A + + ++ +V A F+P+A STG R
Sbjct: 416 VERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIV 475

Query: 487 IALITSWFAAVVLIPLLGYRMLPERKREAHLPDDHEHDIYNTRFYQRLRGW---VAWCVD 543
A+ S A++L P L +L E H +NT F + + V +
Sbjct: 476 SAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILG 535

Query: 544 RRWRVLLATVLVFILSMAGFGLVPQQFFPSSDRTELLVDVRLQEGASFAATLRQVERLEK 603
R LL L+ + F +P F P D+ L ++L GA+ T + ++++
Sbjct: 536 STGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTD 595

Query: 604 VL--DGRPEIAHSVSFVGTGAPRFYLPLDQQLATPNFAQIVITANSVEDR---EKLAHWL 658
+ + + + G N ++ E+R E A +
Sbjct: 596 YYLKNEKANVESVFTVNGFSFSG---------QAQNAGMAFVSLKPWEERNGDENSAEAV 646

Query: 659 EPVLREQFPAIRSRLSRLENGPPV-------GF----QVQFRVSGDKIPEVRQVAEKVAA 707
+ + IR N P + GF Q + D + + R +AA
Sbjct: 647 IHRAKMELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAA 706

Query: 708 EVRADSRSVNVQFDWDEPSERSVRFDIDQQKAREVGVSSNDISSFLAMTLSGYTVTQYRE 767
+ A SV D + ++DQ+KA+ +GVS +DI+ ++ L G V + +
Sbjct: 707 QHPASLVSVRPNGLEDTAQ---FKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFID 763

Query: 768 RDKLINVSLRAPDSERIDPGRLATLAMPTPNG-PVPLGSLGQVRYGLEYGVIWERDRQPT 826
R ++ + ++A R+ P + L + + NG VP + + + + P+
Sbjct: 764 RGRVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPS 823

Query: 827 ITVQADVTAGAQGIDVTHAIDRKLDALRAELPVGYRIEVGGPVEESAKGQSSINAQMPIM 886
+ +Q + G D ++ L ++LP G + G + + A + I
Sbjct: 824 MEIQGEAAPGTSSGDAMALMEN----LASKLPAGIGYDWTGMSYQERLSGNQAPALVAIS 879

Query: 887 VVAVLTLLMVQLQSFARVLMVVLTAPLGLIGVVAALLLFGKPFGFVAMLGVIAMFGIIMR 946
V V L +S++ + V+L PLG++GV+ A LF + M+G++ G+ +
Sbjct: 880 FVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAK 939

Query: 947 NSVILVDQIEQDISA-GHKRVDAIVGATVRRFRPITLTAAAAVLALIPLLRSNFFG---- 1001
N++++V+ + + G V+A + A R RPI +T+ A +L ++PL SN G
Sbjct: 940 NAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQ 999

Query: 1002 -PMATALMGGITSATVLTLFFLPALYA 1027
+ +MGG+ SAT+L +FF+P +
Sbjct: 1000 NAVGIGVMGGMVSATLLAIFFVPVFFV 1026



Score = 93.0 bits (231), Expect = 3e-21
Identities = 63/335 (18%), Positives = 133/335 (39%), Gaps = 23/335 (6%)

Query: 726 SERSVRFDIDQQKAREVGVSSNDISSFL--------AMTLSGYTVTQYRERDKLINVSLR 777
++ ++R +D + ++ D+ + L A L G ++ + I R
Sbjct: 180 AQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 778 APDSERIDPGRLATLAMPTPNGPVPLGSLGQVRYGLE-YGVIWERDRQPTITVQADVTAG 836
+ E TL + + V L + +V G E Y VI + +P + + G
Sbjct: 240 FKNPEEFGK---VTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATG 296

Query: 837 AQGIDVTHAIDRKLDALRAELPVGYRIEVGGPVEESAKGQSSINA---QMPIMVVAVLTL 893
A +D AI KL L+ P G ++ P + + Q SI+ + ++ V +
Sbjct: 297 ANALDTAKAIKAKLAELQPFFPQGMKVLY--PYDTTPFVQLSIHEVVKTLFEAIMLVFLV 354

Query: 894 LMVQLQSFARVLMVVLTAPLGLIGVVAALLLFGKPFGFVAMLGVIAMFGIIMRNSVILVD 953
+ + LQ+ L+ + P+ L+G A L FG + M G++ G+++ +++++V+
Sbjct: 355 MYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVE 414

Query: 954 QIEQDISAGH-KRVDAIVGATVRRFRPITLTAAAAVLALIPLL-----RSNFFGPMATAL 1007
+E+ + +A + + + A IP+ + + +
Sbjct: 415 NVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITI 474

Query: 1008 MGGITSATVLTLFFLPALYAAWFRVRHDEREEPEG 1042
+ + + ++ L PAL A + E E +G
Sbjct: 475 VSAMALSVLVALILTPALCATLLKPVSAEHHENKG 509


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
NH44784_006351  RTXTOXIND  32     0.006    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 32.1 bits (73), Expect = 0.006
Identities = 37/206 (17%), Positives = 68/206 (33%), Gaps = 20/206 (9%)

Query: 235 DALQADQDAATLEASLPGLRAQ---WQATRHALAVLMGRSPDQAPPDLAFAMIAVPEEV- 290
AL A+ D ++SL R + +Q ++ L + P + F ++ E +
Sbjct: 128 TALGAEADTLKTQSSLLQARLEQTRYQILSRSIE-LNKLPELKLPDEPYFQNVSEEEVLR 186

Query: 291 -PVLVPSELLAARPDIHIAEAVLRAAAADVGVATAQLFPSLSLSASMG-----------K 338
L+ + + + E L A+ A++ +LS K
Sbjct: 187 LTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHK 246

Query: 339 GGFSWPTALSGAGSIWAVGASLTQPIFHGGALLAERSAAKERYEAAVLQYKQTVLTALRD 398
+ L L + +E +AKE Y+ +K +L LR
Sbjct: 247 QAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQ 306

Query: 399 VADTLAQLEADGQALASAEASRRAAE 424
D + L LA E ++A+
Sbjct: 307 TTDNIGLLT---LELAKNEERQQASV 329


94  NH44784_006511  NH44784_006631  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_006511  -1      14       -0.927998  Functional role page for Anaerobic nitric oxide
NH44784_006521  -1      20       -1.250585  hypothetical protein
NH44784_006531  0       17       -0.578260  Osmosensitive K+ channel histidine kinase KdpD
NH44784_006541  -1      14       0.242584   Two-component hybrid sensor and regulator
NH44784_006551  0       9        0.752759   UTP--glucose-1-phosphate uridylyltransferase
NH44784_006561  -2      9        0.267844   Chemotaxis protein methyltransferase CheR
NH44784_006571  -2      9        0.088586   Bis(5'-nucleosyl)-tetraphosphatase, symmetrical
NH44784_006581  -2      7        0.146935   3-oxoacyl-[acyl-carrier protein] reductase
NH44784_006591  -2      6        0.212084   Acetoacetyl-CoA reductase
NH44784_006601  -2      7        -0.020085  Uncharacterized transporter, similarity to
NH44784_006611  -1      7        0.180516   autotransporter
NH44784_006621  0       8        1.316593   two-component system regulatory protein
NH44784_006631  1       6        1.703614   Two-component system sensor protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
NH44784_006511  HTHFIS  360    e-122    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 360 bits (925), Expect = e-122
Identities = 132/397 (33%), Positives = 197/397 (49%), Gaps = 30/397 (7%)

Query: 128 ALEVGTFDAEAQS-DLRDMIVLVEAALRVTRLEAETRALRVARGALPADSALADDSEILG 186
A E G +D + DL ++I ++ AL + G ++G
Sbjct: 93 ASEKGAYDYLPKPFDLTELIGIIGRALAEPKRRPSKLEDDSQDGM-----------PLVG 141

Query: 187 QSEAIARLLHEIEVVADSELPILLLGETGVGKELFAHRVHRQSRRRGKPLVHVNCAALPE 246
+S A+ + + + ++L +++ GE+G GKEL A +H +RR P V +N AA+P
Sbjct: 142 RSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPR 201

Query: 247 SLAESELFGHAKGAFSGALTERPGRFEAANGGTLFLDEVGELPLAVQAKLLRTLQNGEIQ 306
L ESELFGH KGAF+GA T GRFE A GGTLFLDE+G++P+ Q +LLR LQ GE
Sbjct: 202 DLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYT 261

Query: 307 RLGDDRPRTVDVRIVAATNRDLRDRVRQGDFRADLYHRLSVYPVPIPPLRKRGRDVLLLA 366
+G P DVRIVAATN+DL+ + QG FR DLY+RL+V P+ +PPLR R D+ L
Sbjct: 262 TVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLV 321

Query: 367 GRYLELNRARLGLRSLRLSPEAEDMLSRYRWPGNVRELEHVISRAAIKAVSHGAHRNEIV 426
+++ + GL R EA +++ + WPGNVRELE+++ R R I
Sbjct: 322 RHFVQQAE-KEGLDVKRFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVITREIIE 380

Query: 427 TLGPGLLDLDDVAMPAPPPEAPALPAGAAQPLR-----------------VAVDACQRQA 469
+ + A + ++ + +R + +
Sbjct: 381 NELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLAEMEYPL 440

Query: 470 IRRALQAHQGSWAGAARALDIDSSNLHKLARRLGLKD 506
I AL A +G+ AA L ++ + L K R LG+
Sbjct: 441 ILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVSV 477


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
NH44784_006541  HTHFIS  76     9e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 76.0 bits (187), Expect = 9e-17
Identities = 24/115 (20%), Positives = 54/115 (46%), Gaps = 5/115 (4%)

Query: 412 TRKAVLLVEDNRMVGEVVSTLLTDLGQDLTWATDAESALGMLQERQGRFDIAILDIVLPG 471
T +L+ +D+ + V++ L+ G D+ ++A + + G D+ + D+V+P
Sbjct: 2 TGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWI--AAGDGDLVVTDVVMPD 59

Query: 472 MDGVQLAEMIKAQWPAIRVVLATGYSE---ALAGRDAGALEVMQKPYSIEALTRL 523
+ L IK P + V++ + + A+ + GA + + KP+ + L +
Sbjct: 60 ENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGI 114


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
NH44784_006561  HTHFIS  74     8e-19    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 74.5 bits (183), Expect = 8e-19
Identities = 28/119 (23%), Positives = 48/119 (40%), Gaps = 2/119 (1%)

Query: 2 TILLADDNQDAVETLAEILRLDHHQVHTAFEGRTALALAQAHRPDIVLLDISMPELDGYT 61
TIL+ADD+ L + L + V T A D+V+ D+ MP+ + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 LCQRLRREPWAAGLVIVALSGYGSPQDLERGRIAGFDRYFTKPADPGELLDYLNRCGGH 120
L R+++ L ++ +S + + G Y KP D EL+ + R
Sbjct: 65 LLPRIKK--ARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_006581  DHBDHDRGNASE  123    3e-36    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 123 bits (309), Expect = 3e-36
Identities = 79/256 (30%), Positives = 122/256 (47%), Gaps = 10/256 (3%)

Query: 4 LTGKVAIITGASSGIGYEAAKLFAREGASLVVVARRQAELEHLVAAIEADGGHAVPLAGD 63
+ GK+A ITGA+ GIG A+ A +GA + V +LE +V++++A+ HA D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 64 VRDEAVAEAAVTLAERRFGGLDIAFNNAGMTGGALPLHQVPPDAWRQILDTNLSSAFLGA 123
VRD A + ER G +DI N AG+ +H + + W N + F +
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPG-LIHSLSDEEWEATFSVNSTGVFNAS 124

Query: 124 RHQIPAMLRRGGGSLVFTSTFVGHTAGFPGMGAYAASKAGLIGLTQAIAAEYGAQGIRAN 183
R M+ R GS+V + M AYA+SKA + T+ + E IR N
Sbjct: 125 RSVSKYMMDRRSGSIVTVGSNPAGVPRT-SMAAYASSKAAAVMFTKCLGLELAEYNIRCN 183

Query: 184 ALLPGGTDTPMGRA--SMDSPEARRYIESLHA------LKRLASPAEIARSALYLASDAS 235
+ PG T+T M + + ++ + SL LK+LA P++IA + L+L S +
Sbjct: 184 IVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQA 243

Query: 236 SFTTGTALLVDGGVSI 251
T L VDGG ++
Sbjct: 244 GHITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_006591  DHBDHDRGNASE  89     1e-23    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 89.3 bits (221), Expect = 1e-23
Identities = 66/250 (26%), Positives = 112/250 (44%), Gaps = 13/250 (5%)

Query: 4 RIAYVTSGMGHTGTAICQALHHAGHRVIA-GCGPRSSRKDHWLKEQKSLGYDFIASEGDA 62
+IA++T G A+ + L G + A P K + K+ A D
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEK--VVSSLKAEARHAEAFPADV 66

Query: 63 TDWASTEAAFAKVRREVGEIDVLVNNAGAMLDMRFRQMSHADWSAVLRSNLDTLFNSTKQ 122
D A+ + A++ RE+G ID+LVN AG + +S +W A N +FN+++
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 123 VVDGMAERGWGRIINIGSVAAEKGQIGQINYATAKGAVIGFTRSLAQEVAARGVTVNLVS 182
V M +R G I+ +GS A + YA++K A + FT+ L E+A + N+VS
Sbjct: 127 VSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVS 186

Query: 183 PG----------FIADDTVKAFPPVLLDRIIESVPVGRLGTAEDLAGLCAWLASDEAAFV 232
PG + ++ + L+ +P+ +L D+A +L S +A +
Sbjct: 187 PGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHI 246

Query: 233 TGANYAINGG 242
T N ++GG
Sbjct: 247 TMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits        Score  E-value  Comments
NH44784_006611  SUBTILISIN  109    4e-28    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 109 bits (275), Expect = 4e-28
Identities = 67/343 (19%), Positives = 106/343 (30%), Gaps = 65/343 (18%)

Query: 56 WGLSAMNAQYAYARGLSGAGVKLGAVDSGYLPSHREFASRGIIALSVKGTYMNDGEQLDG 115
G+ + A + + G GVK+ +D+G H + +R
Sbjct: 24 RGVEMIQAPAVWNQT-RGRGVKVAVLDTGCDADHPDLKAR-------------------- 62

Query: 116 SKLAWRAGEAFDRAGAYVDKTDLTKRIGANDN-HGNHVSGTIAAAKNGQGMMGVSFGSQY 174
+ D + I + N HG HV+GTIAA +N G++GV+ +
Sbjct: 63 ----------IIGGRNFTDDDEGDPEIFKDYNGHGTHVAGTIAATENENGVVGVAPEADL 112

Query: 175 YTTNSNGTDSSRYGSNMDYNYFKAAYGNLAAAGVRVINSSWGSPPTADNYDTLSSFTQAY 234
S + V +I+ S G P
Sbjct: 113 LIIKVLNKQGSGQYDWI-----IQGIYYAIEQKVDIISMSLGGPEDVP------------ 155

Query: 235 LRLNGAGKKTWLDAAADVSLQYGVLHVWANGNAGVAHASTRAGLPYFRLELEKYWITATG 294
L A ++ +L + A GN G T I+
Sbjct: 156 ----------ELHEAVKKAVASQILVMCAAGNEGDGDDRT---DELGYPGCYNEVISVGA 202

Query: 295 LKPSGDQGFNKCGLAKYWCLAAPGYNILSANVTGDDKYKESNGTSMSAPHVTGALGILME 354
+ L APG +ILS G KY +GTSM+ PHV GAL ++ +
Sbjct: 203 INFDRHAS-EFSNSNNEVDLVAPGEDILSTVPGG--KYATFSGTSMATPHVAGALALIKQ 259

Query: 355 RYPYLGNEEIRTILLTTASHRGTGPADTPNEVFGWGVPDLRKG 397
++ L + T P ++ G G+ L
Sbjct: 260 LANASFERDLTEPELYAQLIKRTIPLGNSPKMEGNGLLYLTAV 302


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
NH44784_006621  HTHFIS  90     8e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 89.5 bits (222), Expect = 8e-23
Identities = 45/143 (31%), Positives = 65/143 (45%), Gaps = 4/143 (2%)

Query: 2 RILVIEDDEDLADALVRRLRRLGHAVDCQKDGLSADGVLQYETFDLVILDIGLPRMSGFE 61
ILV +DD + L + L R G+ V + + + DLV+ D+ +P + F+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 ILHRLRDRGSKTPVLALTARIDIEDRVHALDTGADDYLAKPFDFRELEARCRALL----R 117
+L R++ PVL ++A+ + A + GA DYL KPFD EL L R
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 118 RPSVQAAGVLRFGELVIDSAARQ 140
RPS LV SAA Q
Sbjct: 125 RPSKLEDDSQDGMPLVGRSAAMQ 147


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_006631  PF06580  43     2e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 42.9 bits (101), Expect = 2e-06
Identities = 20/103 (19%), Positives = 41/103 (39%), Gaps = 22/103 (21%)

Query: 367 LIDNALRHGAS---ADGHIDVRLERDDAGWRITVADQGPGISAALAQSAFERFSRGPNPR 423
L++N ++HG + G I ++ +D+ + V + G
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSL---------------ALKNT 307

Query: 424 TPGAGLGLSIIKRVVDIHHG---RLSLSNRPGGGLEAVVLLPA 463
G GL ++ + + +G ++ LS + G A+VL+P
Sbjct: 308 KESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKV-NAMVLIPG 349


95  NH44784_008211  NH44784_008291  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_008211  4       10       -3.517527  Chaperone protein DnaK
NH44784_008221  -1      11       -2.303824  putative thioredoxin
NH44784_008231  -1      10       -0.638340  Heat shock protein GrpE
NH44784_008241  -1      11       -1.352602  hypothetical protein
NH44784_008251  -1      12       -1.269529  Ferrochelatase, protoheme ferro-lyase
NH44784_008261  0       15       -0.706201  Heat-inducible transcription repressor HrcA
NH44784_008271  -2      12       0.615951   NAD kinase
NH44784_008281  0       13       -1.663103  DNA repair protein RecN
NH44784_008291  2       21       -4.761150  Ferric uptake regulation protein FUR
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_008211  SHAPEPROTEIN  141    3e-39    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 141 bits (357), Expect = 3e-39
Identities = 84/388 (21%), Positives = 141/388 (36%), Gaps = 77/388 (19%)

Query: 2 SKIIGIDLGTTNSCVAVMDGGQVKIIENAEGART----TPSIVAYMDDGETLVGAPAKRQ 57
S + IDLGT N+ + V G V + R +P VA VG AK+
Sbjct: 10 SNDLSIDLGTANTLIYVKGQGIVLNEPSVVAIRQDRAGSPKSVA-------AVGHDAKQM 62

Query: 58 AVTNPKNTLYAVKRLIGRKFEEKAVQKDINLMPYAIVKADNGDAWVEVRGKKMAPPQVSA 117
P N + A++ + + +A V+
Sbjct: 63 LGRTPGN-IAAIRPM---------------------------------KDGVIADFFVTE 88

Query: 118 DVLRK-MKKTAEDYLGEEVTEAVITVPAYFNDSQRQATKDAGRIAGLEVKRIINEPTAAA 176
+L+ +K+ + ++ VP +R+A +++ + AG +I EP AAA
Sbjct: 89 KMLQHFIKQVHSNSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGAREVFLIEEPMAAA 148

Query: 177 LAFGLDKTEKGDRKIAVYDLGGGTFDVSIIEIADVDGEKQFEVLSTNGDTFLGGEDFDQR 236
+ GL +E V D+GGGT +V++I + V + +GG+ FD+
Sbjct: 149 IGAGLPVSE--ATGSMVVDIGGGTTEVAVISLNGV---------VYSSSVRIGGDRFDEA 197

Query: 237 IIDYIIGEFKKEQGVDLSKDVLALQRLKEAAEKAKIELSSS----QQTEINLPYITADAS 292
II+Y+ + G + AE+ K E+ S+ + EI +
Sbjct: 198 IINYVRRNYGSLIG-------------EATAERIKHEIGSAYPGDEVREIEVRGRNLAEG 244

Query: 293 GPKHLNLKITRAKLEALVEELIERTIEPCRVAIKDAGVKVSDIDD--VILVGGMTRMPKV 350
P+ L + LEAL E L + SDI + ++L GG + +
Sbjct: 245 VPRGFTLN-SNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNL 303

Query: 351 QEKVKEFFGKDPRKDVNPDEAVAAGAAI 378
+ E G +P VA G
Sbjct: 304 DRLLMEETGIPVVVAEDPLTCVARGGGK 331


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
NH44784_008231  RTXTOXIND  30     0.006    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.8 bits (67), Expect = 0.006
Identities = 18/126 (14%), Positives = 41/126 (32%), Gaps = 6/126 (4%)

Query: 17 DADAAPAAQDAAQAELNELRAQLDAAQATVNEQQDQLLRVRAEAENVRRRAQEEVSKARK 76
+AD QA L + R Q+ + + ++L ++ E + EE
Sbjct: 133 EADTLKTQSSLLQARLEQTRYQILSRSI----ELNKLPELKLPDEPYFQNVSEEEVLRLT 188

Query: 77 FGIESFAESLVPVKDSLEAALAQADQTVDTLREGVEVTLKQLAAAFERNLLKEIAPVQGD 136
I+ + K E L + T+ + + + E++ L + + +
Sbjct: 189 SLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARIN--RYENLSRVEKSRLDDFSSLLHK 246

Query: 137 KFDPHL 142
+
Sbjct: 247 QAIAKH 252


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_008271  PF06057  34     3e-04    Type IV secretory pathway VirJ component
		>PF06057#Type IV secretory pathway VirJ component

Length = 243

Score = 34.4 bits (79), Expect = 3e-04
Identities = 18/62 (29%), Positives = 25/62 (40%), Gaps = 9/62 (14%)

Query: 40 DTARNTGLTEYPVATYEEIGKDAS-----LAVVMGGDGTVL----GAARHLAPYGVPVVG 90
+ A N GLT PV ++ +S L + + GDG L G PVVG
Sbjct: 24 EFADNLGLTLLPVEPSTQVNAASSHTKPPLVIFLSGDGGWATLDKAVGGILQQQGWPVVG 83

Query: 91 IN 92
+
Sbjct: 84 WS 85


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_008281  CHANLCOLICIN  35     6e-04    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 35.4 bits (81), Expect = 6e-04
Identities = 51/262 (19%), Positives = 89/262 (33%), Gaps = 36/262 (13%)

Query: 129 HAHQSLMRPEAQRDLLDAHGGHGELRQGVAQAWKQWRALARQLELAEKDAAGLAAERERL 188
HA+ + M+ E +R L A+A ++ R A E A ++A E ER
Sbjct: 117 HANNAAMQAEDERLRL-------------AKAEEKARKEAEAAEKAFQEAEQRRKEIERE 163

Query: 189 QWQVDELDRLGLAPDEWDALQSEHTRLSHSQSLLDGATQILDALDGEGDSAHHRLTAANQ 248
+ + + +L A + LS ++ A + L A E + N
Sbjct: 164 KAETERQLKLAEA------EEKRLAALSEEAKAVEIAQKKLSAAQSEVVKMDGEIKTLNS 217

Query: 249 RIQQMLRHD----TGLQGIYDELESARIAISEAVSDLNNYVSRVDLDPRRLADVEARLSA 304
R+ + L G +EL A E + R + + EA
Sbjct: 218 RLSSSIHARDAEMKTLAGKRNELAQASAKYKELDELVKKLSPRANDPLQNRPFFEATRRR 277

Query: 305 V-----FETARKFRTEPEALCALRDSLHAELSALQAAADIDALRAQAQAAQAQYDAAAAK 359
V E +K T E + ++A+++ +Q A I + A A+ A
Sbjct: 278 VGAGKIREEKQKQVTASE---TRINRINADITQIQKA--ISQVSNNRNAGIARVHEAEEN 332

Query: 360 LTTARRKVAKDLGKQVTQAMQT 381
L ++ L Q+ A+
Sbjct: 333 L---KKAQNNLLNSQIKDAVDA 351


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_008291  ACRIFLAVINRP  28     0.012    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 28.3 bits (63), Expect = 0.012
Identities = 11/49 (22%), Positives = 21/49 (42%), Gaps = 5/49 (10%)

Query: 30 QRHLSAEDVYRALIGENVEIG----LATVYRVLTQFEQAGILARSQFDS 74
+ L+ DV L +N +I T Q + I+A+++F +
Sbjct: 195 KYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNAS-IIAQTRFKN 242


96  NH44784_011401  NH44784_011501  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_011401  -1      9        2.045882   NAD-dependent epimerase/dehydratase
NH44784_011411  -2      12       1.811002   Threonine dehydratase, catabolic
NH44784_011421  -2      11       1.706411   Holliday junction DNA helicase RuvB
NH44784_011431  -1      11       1.549480   putative two-component sensor
NH44784_011441  -2      10       0.327054   Two-component system response regulator OmpR
NH44784_011451  -2      9        0.540369   ortholog of Bordetella pertussis (BX470248)
NH44784_011461  -2      10       -0.351570  ATP-dependent DNA helicase RecQ
NH44784_011471  -1      8        -1.173618  Cytochrome d ubiquinol oxidase subunit II
NH44784_011481  0       8        -1.643888  Cytochrome d ubiquinol oxidase subunit I
NH44784_011491  2       10       -2.192307  Glutaminyl-tRNA synthetase
NH44784_011501  3       15       -2.475204  Septum site-determining protein MinC
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_011401  NUCEPIMERASE  51     2e-09    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 50.6 bits (121), Expect = 2e-09
Identities = 28/130 (21%), Positives = 44/130 (33%), Gaps = 20/130 (15%)

Query: 1 MRILVIGGTGFIGRHLIARLSGGQHQILV---------PTRRYGQGRELQILPTVTLLAS 51
M+ LV G GFIG H+ RL HQ++ + + + EL P
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQAR-LELLAQPGFQFHKI 59

Query: 52 DVHDDDALDRLARDC--DVVVNLVGILHGNAGRPYGSDFAQAHVHLP----QRIARACRR 105
D+ D + + L + V Y + A+ I CR
Sbjct: 60 DLADREGMTDLFASGHFERVFISP----HRLAVRYSLENPHAYADSNLTGFLNILEGCRH 115

Query: 106 QGVRRLLHVS 115
++ LL+ S
Sbjct: 116 NKIQHLLYAS 125


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
NH44784_011431  PERTACTIN  33     0.003    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 33.2 bits (75), Expect = 0.003
Identities = 23/66 (34%), Positives = 25/66 (37%), Gaps = 7/66 (10%)

Query: 82 QWRLWSRVLPAEAKLQRFNGRRPPPRGTQRPPPPPPPPPSAQIQGPPLPPEMRDEHEVEQ 141
QW L P K P G Q P PP PP Q PP PP+ + E Q
Sbjct: 559 QWSLVGAKAPPAPK-------PAPQPGPQPGPQPPQPPQPPQPPQPPQPPQRQPEAPAPQ 611

Query: 142 ERAERE 147
A RE
Sbjct: 612 PPAGRE 617


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_011441  HTHFIS        96     3e-25    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 96.4 bits (240), Expect = 3e-25
Identities = 47/135 (34%), Positives = 73/135 (54%), Gaps = 1/135 (0%)

Query: 9 TKLLVVDDDPALRQLLADYLNRHGYDTLLAPDASDLPSRITRYAPDLLVLDRMLPGGDGA 68
+LV DDD A+R +L L+R GYD + +A+ L I DL+V D ++P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 69 DACRRLREQGEDIPVILLTARDEAVDRIIGLEAGADDYVGKPFDPRELLARIE-AVLRRK 127
D R+++ D+PV++++A++ + I E GA DY+ KPFD EL+ I A+ K
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 128 RGPSALTKDAPVAFG 142
R PS L D+
Sbjct: 124 RRPSKLEDDSQDGMP 138


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_011451  PF00577       59     2e-11    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 59.1 bits (143), Expect = 2e-11
Identities = 36/238 (15%), Positives = 61/238 (25%), Gaps = 40/238 (16%)

Query: 220 LAEGTFGYSSSLGRLNRMDPSATSGAVDYGASVGNSTVRYGLTPVLTLEGQMQSAPSLTT 279
EG YS + G A ST+ +GL T+ G Q A
Sbjct: 369 QREGHTRYSITAGEYRS------GNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRA 422

Query: 280 RGMGTTYTAGEYGTFQAGATQSSFDAA-----SAWRYRFGYS------------------ 316
G G G TQ++ RF Y+
Sbjct: 423 FNFGIGKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYS 482

Query: 317 ----VDVADAVNLGYTNEQIGAGFGDLSTYSGGVAAAPQTRN-----TLTAGVPITGWGT 367
+ AD I G + N LT + T
Sbjct: 483 TSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTST 542

Query: 368 LSGTYSGL-HEGSALAEQRLGLSHSM-FVAPKVKLAVGADRDIISGNYEWRANLSMPV 423
L + S + G++ +++ + F L+ ++ + L++ +
Sbjct: 543 LYLSGSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNI 600


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_011501  IGASERPTASE   33     0.002    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 33.1 bits (75), Expect = 0.002
Identities = 28/116 (24%), Positives = 40/116 (34%), Gaps = 16/116 (13%)

Query: 161 RPAQVIDTAPPNDVATPVPSVPAAAVEIA-------PAPTPAAE-------AQVDKAADK 206
+ + PN++ VPSVP+ EIA P P PA A+ K K
Sbjct: 990 QTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESK 1049

Query: 207 TTDKSADKAAAEPLPERAAASTTSAPSPTPAAPQSSSALVITKPLRSGQRVYARHT 262
T +K+ A R A A S A Q++ + Q + T
Sbjct: 1050 TVEKNEQDATETTAQNREVAK--EAKSNVKANTQTNEVAQSGSETKETQTTETKET 1103


97  NH44784_012021  NH44784_012071  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_012021  -2      12       0.698811   Macrolide-specific efflux protein MacA
NH44784_012031  -1      13       0.375231   putative inner membrane efflux protein
NH44784_012041  -1      14       -0.003518  Cation transport protein chaC
NH44784_012051  -2      11       0.283894   Pyridoxamine 5'-phosphate oxidase
NH44784_012061  -2      11       -0.486786  Nitrilotriacetate monooxygenase component B
NH44784_012071  -3      12       -0.615165  Aminobutyraldehyde dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_012021  RTXTOXIND     58     3e-11    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 57.5 bits (139), Expect = 3e-11
Identities = 32/227 (14%), Positives = 68/227 (29%), Gaps = 20/227 (8%)

Query: 97 LTQQNELRNTQAALQTRRAERAAKVATLKQAELAFRRQRQMLAADASSREAFESAEATLG 156
++ + + E + L+Q E ++L+A + + +
Sbjct: 248 AIAKHAVLEQENKYVEAVNELRVYKSQLEQIE------SEILSAKEEYQLVTQLFKNE-- 299

Query: 157 VTRADIAALDAQIVQAEIEVDKAKVNLGYTRIVSPIDGVVVAV-VTKEGQTVNAIQSAPT 215
+ I +E+ K + + I +P+ V + V EG V +
Sbjct: 300 -ILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAE--TL 356

Query: 216 IIKVAQVATMTIKAQISEADVTRVKPGLPVYFTILGEPDHRY-HATLRAVEPAPDSIQKD 274
++ V + T+ + A + D+ + G + P RY + + D+I+
Sbjct: 357 MVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQ 416

Query: 275 DAATSLTSSSSTSTSAAVYYNGLFDVPNPDEKLRISMTAQVSIVLGE 321
S + N + L M I G
Sbjct: 417 RLGLVFNVIISIEENCLS-------TGNKNIPLSSGMAVTAEIKTGM 456



Score = 52.5 bits (126), Expect = 2e-09
Identities = 28/180 (15%), Positives = 64/180 (35%), Gaps = 17/180 (9%)

Query: 1 MKKPRSKVQRYVLIAVVLLIIAFGVRAAFFSSPPPPTFAVAEVVRGNLEDSVLASGTMDA 60
++ P S+ R V ++ ++ + + G +E A+G +
Sbjct: 49 IETPVSRRPRLVAYFIMGFLVIAFILSVL----------------GQVEIVATANGKLTH 92

Query: 61 IERV-SVGAQATGQLKSLKVALGDRVTRGQLVAEIDDLTQQNELRNTQAALQTRRAERAA 119
R + +K + V G+ V +G ++ ++ L + + TQ++L R E+
Sbjct: 93 SGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTR 152

Query: 120 KVATLKQAELAFRRQRQMLAADASSREAFESAEATLGVTRADIAALDAQIVQAEIEVDKA 179
+ EL + ++ + E + + + Q Q E+ +DK
Sbjct: 153 YQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKK 212


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_012031  TCRTETA       51     4e-09    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 50.6 bits (121), Expect = 4e-09
Identities = 67/320 (20%), Positives = 117/320 (36%), Gaps = 9/320 (2%)

Query: 34 GTLVAVFAVLPMLFSVRAGRWVDRVGIVRPLVTGTTLVAVGTALPFLSQTQFALLVASCC 93
G L+A++A++ + G DR G L+ AV A+ + + L +
Sbjct: 46 GILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIV 105

Query: 94 IGIGFMLHQVATQDLLGHAEPTQRLRNFSLMSLALAGSGFSGPLIAGLAIDHLGTRLAF- 152
GI VA + + +R R+F MS +GP++ GL F
Sbjct: 106 AGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGG-FSPHAPFF 164

Query: 153 --GLLTLGPLLSAGGLYALRHQLKAVDSQLAGTKEAQRRRVTELLAVPALRRILMVNTIL 210
L L+ L H+ + + R + V A + L
Sbjct: 165 AAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQL 224

Query: 211 SGAWDTHLFVVPIFGV-AIGLSATTIGVILASFAAATFIIRLVL-PLIQRRVRSWTLVRV 268
G L+V IFG ATTIG+ LA+F + + ++ + R+ + +
Sbjct: 225 VGQVPAALWV--IFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALML 282

Query: 269 AMATAAIDFMLYPLFTDVGVLIVLSFILGLALGCCQPSMLSLLHQHSPPGRAAEAVGLRM 328
M ++L F G + +L + G P++ ++L + R + G
Sbjct: 283 GMIADGTGYILL-AFATRGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLA 341

Query: 329 ALINGSQVSLPLTFGALGAV 348
AL + + + PL F A+ A
Sbjct: 342 ALTSLTSIVGPLLFTAIYAA 361


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_012061  RTXTOXINA     28     0.015    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 28.4 bits (63), Expect = 0.015
Identities = 9/33 (27%), Positives = 16/33 (48%)

Query: 15 TALGRYATGVTVVTTADLDGAPIGLTVSSFNSV 47
T L ++G++ T L GAP+ V + +
Sbjct: 373 TVLASVSSGISAAATTSLVGAPVSALVGAVTGI 405


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_012071  PF04183       33     0.002    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 32.9 bits (75), Expect = 0.002
Identities = 12/71 (16%), Positives = 26/71 (36%), Gaps = 1/71 (1%)

Query: 48 SEVIHAGIYYPAGSLKARLCVRGKHLLYAYCAERGI-PHKRLGKLIVATSTEQAAQLEGI 106
SE+ + +++ R C+ + + AERGI + + + E +
Sbjct: 19 SELEYEQVFHAESQGDDRYCINLPGAQWRFIAERGIWGWLWIDAQTLRCADEPVLAQTLL 78

Query: 107 AQRARANGVDD 117
Q + + D
Sbjct: 79 MQLKQVLSMSD 89


98  NH44784_014491  NH44784_014551  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_014491  0       13       -1.739012  Type III secretion outermembrane pore forming
NH44784_014501  2       12       -0.702907  hypothetical protein
NH44784_014511  1       12       -0.324652  Type III secretion inner membrane protein
NH44784_014521  2       13       1.225199   Type III secretion inner membrane protein
NH44784_014531  2       12       1.563107   Type III secretion inner membrane protein
NH44784_014541  3       12       1.761477   Type III secretion inner membrane protein
NH44784_014551  3       11       2.918170   Type III secretion inner membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_014491  TYPE3OMGPROT  430    e-146    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 430 bits (1106), Expect = e-146
Identities = 175/540 (32%), Positives = 287/540 (53%), Gaps = 46/540 (8%)

Query: 1 MLKRMILATALSLGLAAGAQAAPNWPNVPYSYYARDENLQTLLREFAGGFSLSLQIGPNV 60
KR++ T L L + AQ W +PY Y A+ E+L+ LL +F + ++ + +
Sbjct: 8 FFKRVLTGTLLLLSSYSWAQELD-WLPIPYVYVAKGESLRDLLTDFGANYDATVVVSDKI 66

Query: 61 TGTVNGKFNANTPTEFMDRLGGVYGFNWFVYAGTLFVSRTNDMKTRSVSAMGSSISALRQ 120
V+G+F + P +F+ + +Y W+ L++ + +++ +R + S + L+Q
Sbjct: 67 NDKVSGQFEHDNPQDFLQHIASLYNLVWYYDGNVLYIFKNSEVASRLIRLQESEAAELKQ 126

Query: 121 ALQQLGVLDPRFGWGELPDQGIALVSGPPGYVDLVERTVAAL-------PMGAGGQQVAV 173
ALQ+ G+ +PRFGW + VSGPP Y++LVE+T AAL G + +
Sbjct: 127 ALQRSGIWEPRFGWRPDASNRLVYVSGPPRYLELVEQTAAALEQQTQIRSEKTGALAIEI 186

Query: 174 FRLKHASVNDRTISYRDQKVVTPGLSTVLRNLITGGGGGANNETLAAIAAPLRDNPPAFP 233
F LK+AS +DRTI YRD +V PG++T+L+ ++
Sbjct: 187 FPLKYASASDRTIHYRDDEVAAPGVATILQRVL--------------------------- 219

Query: 234 QVAGDGAAAPAGPAGNQPAAPASTGKSGLRLREPTVQADARLNAIIVQDIPDRIPIYRSL 293
A NQ A+T S + V+AD LNAIIV+D P+R+P+Y+ L
Sbjct: 220 ----SDATIQQVTVDNQRIPQAATRASA----QARVEADPSLNAIIVRDSPERMPMYQRL 271

Query: 294 IEQLDLPSTLVEIEAMIVDVNSDLVSELGVTWGARAGSTTFGYGNLGLGPSGGLPLESGA 353
I LD PS +E+ IVD+N+D ++ELGV W R G T + + +G +
Sbjct: 272 IHALDKPSARIEVALSIVDINADQLTELGVDW--RVGIRTGNNHQVVIKTTGDQSNIASN 329

Query: 354 ALSPGTIGVSVGNTLAARLRALQTKGQANILSQPSILTADNLGAMIDLSDTFYIQTRGER 413
+ + L AR+ L+ +G A ++S+P++LT +N A+ID S+T+Y++ G+
Sbjct: 330 GALGSLVDARGLDYLLARVNLLENEGSAQVVSRPTLLTQENAQAVIDHSETYYVKVTGKE 389

Query: 414 VATVTPVTVGTSLRVTPRHIESKNGARVELTVDIEDGR-IQEERQIDNLPTVRKSNISTL 472
VA + +T GT LR+TPR + + + + L + IEDG I+ +PT+ ++ + T+
Sbjct: 390 VAELKGITYGTMLRMTPRVLTQGDKSEISLNLHIEDGNQKPNSSGIEGIPTISRTVVDTV 449

Query: 473 AIVGNNQTLLIGGYNSSQNSEQVDKVPILGDIPGLGLLFSNKSKTVQRRERLFLIRPKVV 532
A VG+ Q+L+IGG + S + KVP+LGDIP +G LF KS+ +R RLF+I P+++
Sbjct: 450 ARVGHGQSLIIGGIYRDELSVALSKVPLLGDIPYIGALFRRKSELTRRTVRLFIIEPRII 509


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_014511  TYPE3IMSPROT  346    e-120    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 346 bits (889), Expect = e-120
Identities = 151/344 (43%), Positives = 219/344 (63%)

Query: 5 MSGEKTEQPTQKKLRDARQKGDVAHSKDFTQTLLVLALFGYLLGNAHGIVEALGRLVLIP 64
MSGEKTEQPT KK+RDAR+KG VA SK+ T L++AL L+G + E +L+LIP
Sbjct: 1 MSGEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLIP 60

Query: 65 STLVGQSFEQALPIALDAALREAVYLVLPFVLIVLVVGMFSEFLQVGVVLAFEKLKPSAK 124
+ F QAL +D L E YL P + + ++ + S +Q G +++ E +KP K
Sbjct: 61 AEQSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIK 120

Query: 125 KLNVMSNLKNIFSKKNLVELLKSIIKIAFLSVLVTLVVRDALPELMAVPHSGLAGLEAGV 184
K+N + K IFS K+LVE LKSI+K+ LS+L+ ++++ L L+ +P G+ + +
Sbjct: 121 KINPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLL 180

Query: 185 GGMMRALIVNIAVAYVVISLADFVWQRMQYRKGLMMSKEDIKQEFKEMEGDPHIKHKRKH 244
G ++R L+V V +VVIS+AD+ ++ QY K L MSK++IK+E+KEMEG P IK KR+
Sbjct: 181 GQILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQ 240

Query: 245 LHQEMVMHGAVASTRKASVLVTNPTHLAVAIYYEPDETPLPVVLAKGEGALAEQMMRAAR 304
HQE+ + +++SV+V NPTH+A+ I Y+ ETPLP+V K A + + + A
Sbjct: 241 FHQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIAE 300

Query: 305 EAGVPVMQNIPLARALMASALPDQYIPSELIEPVAEVLRLVRKL 348
E GVP++Q IPLARAL AL D YIP+E IE AEVLR + +
Sbjct: 301 EEGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQ 344


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_014521  TYPE3IMRPROT  143    5e-44    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 143 bits (363), Expect = 5e-44
Identities = 46/230 (20%), Positives = 96/230 (41%), Gaps = 4/230 (1%)

Query: 14 LSTLALTQPRILALCAMLPLFNRQLLPGMLRYALCAAIGLVLVPALAPRYAVIDLDAVEL 73
L+ R+LAL + P+ + + +P ++ L I + P+L +
Sbjct: 13 LNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPANDVPVFSFF--A 70

Query: 74 VLLVAKEVFIGLVMGFLVAIPFWIFEAVGFVVDNQRGASLGATINPATGNDSSPLGILFN 133
+ L +++ IG+ +GF + F G ++ Q G S ++PA+ + L + +
Sbjct: 71 LWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHLNMPVLARIMD 130

Query: 134 QAFMVFFLVGGGFMLMLTMLYDSFRLWDLWDWAPTLRRESVPLMLDQLGRFLRLTLLFAA 193
++ FL G + ++++L D+F + L + + L+ A
Sbjct: 131 MLALLLFLTFNGHLWLISLLVDTFHTLPI--GGEPLNSNAFLALTKAGSLIFLNGLMLAL 188

Query: 194 PAIISMFLAEVGLALVSRFAPQLQVFFLAMPIKSALALLVMVLYMSTLFE 243
P I + + L L++R APQL +F + P+ + + +M M +
Sbjct: 189 PLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAP 238


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_014531  TYPE3IMQPROT  71     3e-20    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 71.3 bits (175), Expect = 3e-20
Identities = 36/79 (45%), Positives = 46/79 (58%)

Query: 5 DLVSYMTQALYLVLWLSLPPIAVAAIVGTLFSLFQALTQIQEQTLSFAVKLIAVFATIML 64
DLV +ALYLVL LS P VA I+G L LFQ +TQ+QEQTL F +KL+ V + L
Sbjct: 3 DLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLFL 62

Query: 65 TARWLSAELYNFTISVFDL 83
+ W L ++ V L
Sbjct: 63 LSGWYGEVLLSYGRQVIFL 81


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_014541  TYPE3IMPPROT  230    9e-79    Type III secretion system inner membrane P protein ...
		>TYPE3IMPPROT#Type III secretion system inner membrane P protein family signature.

Length = 224

Score = 230 bits (587), Expect = 9e-79
Identities = 87/224 (38%), Positives = 132/224 (58%), Gaps = 14/224 (6%)

Query: 5 DPISLAVVLALLALVPLAAVMTTSFLKIAVVLTLVRNALGVQQVPPNMALYGLALILSAY 64
+ ISL +LA L+P T F+K ++V +VRNALG+QQ+P NM L G+AL+LS +
Sbjct: 3 NDISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLSMF 62

Query: 65 VMGPVVMQIGDELRAPPAVAAPGTPEPDRLEGILEAVARGAEPMRAFMLKNSRAEQRDFF 124
VM P++ + + + + V G + R +++K S E FF
Sbjct: 63 VMWPIMHDAYVYFEDED-------VTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFF 115

Query: 125 LRTARGLWGEQQARN-------LKEDDLLVLIPSFLLSELTAAFQIGFLLYLPFVIIDLI 177
++ +++ + L+P++ LSE+ +AF+IGF LYLPFV++DL+
Sbjct: 116 ENAQLKRQYGEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLV 175

Query: 178 VSNILLAMGMMMVSPVTISMPLKLFLFVMVDGWTRLIQGLVLSY 221
VS++LLA+GMMM+SPVTIS P+KL LFV +DGWT L +GL+L Y
Sbjct: 176 VSSVLLALGMMMMSPVTISTPIKLVLFVALDGWTLLSKGLILQY 219


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_014551  TYPE3OMOPROT  74     8e-17    Type III secretion system outer membrane O protein ...
		>TYPE3OMOPROT#Type III secretion system outer membrane O protein family signature.

Length = 303

Score = 73.5 bits (180), Expect = 8e-17
Identities = 37/166 (22%), Positives = 70/166 (42%), Gaps = 10/166 (6%)

Query: 187 PAANEIDADAVPVRIAACLGWTELDAAQLRSLAPRDTVFLDHCLVSPEGELWLGAGAQGL 246
PA + + +G ++ + L + D + + E++ A G
Sbjct: 138 PAVGGGRPKMLRWPLRFVIGSSDTQRSLLGRIGIGDVLLIRTS----RAEVYCYAKKLGH 193

Query: 247 RVRRQDSSYLVTQGWTSLMTETPQSPQDADAGAQTPLDIDAIPVRLTFELGERLITLGEL 306
R + +V + E + + A+ ++ +PV+L F L + +TL EL
Sbjct: 194 F-NRVEGGIIVETLDIQHIEEENNTTETAET----LPGLNQLPVKLEFVLYRKNVTLAEL 248

Query: 307 RQLQPGETFDLARPLAEGPVLVRANGALVGSGELVEIDGRIGVTLH 352
+ + L AE V + ANG L+G+GELV+++ +GV +H
Sbjct: 249 EAMGQQQLLSLPTN-AELNVEIMANGVLLGNGELVQMNDTLGVEIH 293


99  NH44784_015241  NH44784_015311  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_015241  -1      12       3.040896   Nicotinamidase/isochorismatase family protein
NH44784_015251  -2      10       3.479170   Acetyl-CoA synthetase (ADP-forming) alpha and
NH44784_015261  -1      13       2.699982   Glutamyl-tRNA synthetase
NH44784_015271  -1      11       2.361855   putative membrane transport protein
NH44784_015281  -3      14       1.851754   putative lipoprotein
NH44784_015291  -2      16       0.174143   Transcriptional regulator, LysR family
NH44784_015301  0       18       -1.073343  hypothetical protein
NH44784_015311  2       18       -1.950715  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_015241  ISCHRISMTASE  68     1e-15    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 68.1 bits (166), Expect = 1e-15
Identities = 52/205 (25%), Positives = 82/205 (40%), Gaps = 26/205 (12%)

Query: 28 RTAVVVIDMQQYFTLPGYQGECAAARDIIAPVNRLCDAVRAAGGTVVWV-QTASDNADA- 85
R +++ DMQ YF + + ++ A + +L + G VV+ Q S N D
Sbjct: 30 RAVLLIHDMQNYFVDA-FTAGASPVTELSANIRKLKNQCVQLGIPVVYTAQPGSQNPDDR 88

Query: 86 -----FWSHHHGVMLTPERSARRLETLRRDSPGFALHPDLRPADSDLRVTKRFYSAMATG 140
FW L + +L P D DL +TK YSA
Sbjct: 89 ALLTDFWG----------------PGLNSGPYEEKIITELAPEDDDLVLTKWRYSAFK-- 130

Query: 141 SSELEPLLRGRGVDTLLIAGTVTNVCCESTARDAMMRDFRTIMVDDALAAVTPAEHENAL 200
+ L ++R G D L+I G ++ C TA +A M D + V DA+A + +H+ AL
Sbjct: 131 RTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVADFSLEKHQMAL 190

Query: 201 HGWLLFFGDVLSVDEVSTRLRPAQA 225
+ D + +L+ A A
Sbjct: 191 EYAAGRCAFTVMTDSLLDQLQNAPA 215


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_015271  TCRTETB       69     9e-15    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 68.8 bits (168), Expect = 9e-15
Identities = 60/371 (16%), Positives = 127/371 (34%), Gaps = 44/371 (11%)

Query: 32 LPQLAAEFGATTGQAARAVTAFAVAYGVLQMFFGPVGDRYGKYRVVSVATFACALGSAGA 91
LP +A +F TAF + + + +G + D+ G R++ GS
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 92 FVAES-LDVLVFCRALSGAAGAGIVPLSMAWIGDTVPYERRQATLARFLTGTILGMAAGQ 150
FV S +L+ R + GA A L M + +P E R + +G G
Sbjct: 97 FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGP 156

Query: 151 LAGGLFADTLGWRWAFAALVVGYLVVGTLLQLEVRRQRALGLGVVDPNAPRQGFVAQARL 210
GG+ A + W + ++ ++ +++ ++ G D V
Sbjct: 157 AIGGMIAHYIHWSYLLLIPMI--TIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFF 214

Query: 211 VLGTP---WARVVLATVF------------------------------IEGLLVFGALA- 236
+L T + ++++ + + G ++FG +A
Sbjct: 215 MLFTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAG 274

Query: 237 ---FAPSYLHERFDISLTAAGALVAVYA-VGGLLYTVVAGRVLKRLGERGLAVAGGLVLG 292
P + + +S G+++ + +++ + G ++ R G + G L
Sbjct: 275 FVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLS 334

Query: 293 VAFLSYLLGPVW-LWSLLASVLAGFGYYLLHATLQTNATQMV--PSARGTAVAWFASCLF 349
V+FL+ W + ++ G T+ + G ++ F
Sbjct: 335 VSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSF 394

Query: 350 MGQAAGVALAG 360
+ + G+A+ G
Sbjct: 395 LSEGTGIAIVG 405


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_015301  PF06057       28     0.029    Type IV secretory pathway VirJ component
		>PF06057#Type IV secretory pathway VirJ component

Length = 243

Score = 28.3 bits (63), Expect = 0.029
Identities = 13/70 (18%), Positives = 26/70 (37%), Gaps = 15/70 (21%)

Query: 118 VVVGHSMGA--LPA----LLAASQRRVAGVVLMAPSPPANLP---------GALGLPPVP 162
+++G+S GA +P + A ++ V G VL++PS ++ +
Sbjct: 120 ILIGYSFGAEVIPFVLNEMPARYRKNVLGAVLLSPSQSSDFEIHVSEMVTSDNQSARYLT 179

Query: 163 ADAVRATPAA 172
V
Sbjct: 180 LPEVNKQTTV 189


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_015311  PHAGEIV       31     0.004    Gene IV protein signature.
		>PHAGEIV#Gene IV protein signature.

Length = 426

Score = 31.4 bits (71), Expect = 0.004
Identities = 14/50 (28%), Positives = 22/50 (44%), Gaps = 2/50 (4%)

Query: 54 RTFAQFLGKQTGQTVIVEDRPGAGGIVGTNAARNAAADGYTFLLSTNSTH 103
R F + KQTG++VIV P G V ++ + F +S +
Sbjct: 32 RDFVTWYSKQTGESVIVS--PDVKGTVTVYSSDVKPENLRDFFISVLRAN 79


100  NH44784_016161  NH44784_016261  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_016161  0       11       2.340578   transcriptional regulator, TetR family
NH44784_016171  2       15       1.964974   D-aminoacylase
NH44784_016181  2       15       1.592315   D-amino acid dehydrogenase small subunit
NH44784_016191  1       15       0.873775   hypothetical protein
NH44784_016201  1       15       0.824048   RND efflux system, outer membrane
NH44784_016211  3       16       0.731142   AcrB/AcrD/AcrF family protein
NH44784_016221  0       13       1.343873   AcrB/AcrD/AcrF family protein
NH44784_016231  -2      10       2.147684   Probable Co/Zn/Cd efflux system membrane fusion
NH44784_016241  -2      10       1.846534   FIG002842: hypothetical protein
NH44784_016251  -2      11       1.939143   Dephospho-CoA kinase
NH44784_016261  -1      11       1.421041   Leader peptidase (Prepilin peptidase)
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_016161  HTHTETR       53     1e-10    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 52.7 bits (126), Expect = 1e-10
Identities = 23/130 (17%), Positives = 42/130 (32%), Gaps = 5/130 (3%)

Query: 3 RRENTERQILQALEDQIKETGMGGVGINAIAKRAGVSKELIYRYFDGMPGLMLAWMQEQ- 61
+ T + IL + G+ + IAK AGV++ IY +F L +
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 62 DFWTRNPGLLAADESSQRSPGELVLSMLRAQIDALAGNETLREVRRWELIERNEVSAPLA 121
A P ++ +L +++ E R + E+I
Sbjct: 68 SNIGELELEYQAKFP--GDPLSVLREILIHVLESTVTEERRRLLM--EIIFHKCEFVGEM 123

Query: 122 ERRERAARGF 131
++A R
Sbjct: 124 AVVQQAQRNL 133


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_016171  UREASE        45     4e-07    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 45.5 bits (108), Expect = 4e-07
Identities = 31/128 (24%), Positives = 46/128 (35%), Gaps = 27/128 (21%)

Query: 2 PAQPDPQEYDLIISGGLIADGLGGPARRADVAINGERIAAIGD--------------GSG 47
+ D +I+ LI D G +AD+ + RIAAIG G G
Sbjct: 60 QVTREGGAVDTVITNALILDHWG--IVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPG 117

Query: 48 WRARRRIDASGRVVAPGFIDCHTHDDRALLGSPLMRPKVSQGVTTVVTGNCGISLAPVHA 107
I G++V G +D H H + + + G+T ++ G G P H
Sbjct: 118 TEV---IAGEGKIVTAGGMDSHIH----FICPQQIEEALMSGLTCMLGGGTG----PAHG 166

Query: 108 PGDTVPPP 115
T P
Sbjct: 167 TLATTCTP 174


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_016181  adhesinmafb   30     0.026    Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 29.6 bits (66), Expect = 0.026
Identities = 15/44 (34%), Positives = 20/44 (45%), Gaps = 1/44 (2%)

Query: 9 GVVGMATAYYLHREGHRVSVVDASPGP-GLATSRANGAQLSYSF 51
V T Y L+ EGH DA GP G + GA+ Y++
Sbjct: 123 NVDEGFTVYRLNWEGHEHHPADAYDGPKGGNYPKPTGARDEYTY 166


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_016211  ACRIFLAVINRP  820    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 820 bits (2119), Expect = 0.0
Identities = 283/1032 (27%), Positives = 509/1032 (49%), Gaps = 27/1032 (2%)

Query: 7 FIVRPVATTLLSLAVVLAGMLSFFLLPVAPLPQMDIPTISVSASLPGASPETMASSVATP 66
FI RP+ +L++ +++AG L+ LPVA P + P +SVSA+ PGA +T+ +V
Sbjct: 5 FIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQV 64

Query: 67 LERSLGSIAGVTEMTSSS-SQGSTRVTLQFDLSRDINGAARDVQAAINAARSLLPTSLRS 125
+E+++ I + M+S+S S GS +TL F D + A VQ + A LLP ++
Sbjct: 65 IEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQ- 123

Query: 126 NPTYHKSNPSDAPIMTLAMTSDT--LSQGQMYDLASTIVAQKLSQVDGVGEVTVGGSSLP 183
S + +M SD +Q + D ++ V LS+++GVG+V + G+
Sbjct: 124 QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQY- 182

Query: 184 AVRVTVLPGALANRGVSLDEVRTALANANANRPKGVLENDQY------HWQIMVNDQLSR 237
A+R+ + L ++ +V L N G L + I+ +
Sbjct: 183 AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKN 242

Query: 238 AEQYRPLIV-AWRDGAPVRVSDVARVEDSVEDLFQTGFYNQRNAILMIVRRQADANIIET 296
E++ + + DG+ VR+ DVARVE E+ N + A + ++ AN ++T
Sbjct: 243 PEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALDT 302

Query: 297 VDAVRAQLPTLLALMPADVQLTVAQDRTPSIRASLHEAELTLIIAVGLVVLVVLLFLRRW 356
A++A+L L P +++ D TP ++ S+HE TL A+ LV LV+ LFL+
Sbjct: 303 AKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQNM 362

Query: 357 RAAIIPSVAVPVSLIGTFCIMYLCGFTLNTISLMALIVATGFVVDDAIVVLENIMRH-VE 415
RA +IP++AVPV L+GTF I+ G+++NT+++ +++A G +VDDAIVV+EN+ R +E
Sbjct: 363 RATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMME 422

Query: 416 NGMSPMRAALRGSREVGFTVLSMSLSLVAVFIPILLMGGVAGRLFREFAVTLSASIMVSL 475
+ + P A + ++ ++ +++ L AVFIP+ GG G ++R+F++T+ +++ +S+
Sbjct: 423 DKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSV 482

Query: 476 VVSLTLTPMMCARLLKAEPKEP-KPPGRLARWAERGFDWMQDGYRRSLAWALAHGRLMML 534
+V+L LTP +CA LLK E + G W FD + Y S+ L +L
Sbjct: 483 LVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYLL 542

Query: 535 VLAAAVGLNVYLYTVVPKGFFPQQDTGQLLGFFRVDQGTSFQATVPKLEALRKVVLADP- 593
+ A V V L+ +P F P++D G L ++ G + + T L+ + L +
Sbjct: 543 IYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNEK 602

Query: 594 ----AVQSMTGYAGGRGGSNSSFMQIQLKPLGER---KVSADQVINRLRARLQNMPGARM 646
+V ++ G++ N+ + LKP ER + SA+ VI+R + L + +
Sbjct: 603 ANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGFV 662

Query: 647 FLVAQQDIRVGGRQSQGSYDYTLMSG-DLQLLRAWMPKVQQAMAKVP-EITDVDTDVEDK 704
I G + ++ +G L ++ A+ P + V + +
Sbjct: 663 IPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGLED 722

Query: 705 GREINLVIDRDAATRLGVSMATISTVLNNSFSQRQVSVMYGPLNQYHVVLGVDQRFAQDI 764
+ L +D++ A LGVS++ I+ ++ + V+ + + D +F
Sbjct: 723 TAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRMLP 782

Query: 765 ESLKQVEVITSTGARVPMAAFARFENANAPLSVQHQGLFVADTVSFSLAPGVSLGQATAA 824
E + ++ V ++ G VP +AF ++ + + APG S G A A
Sbjct: 783 EDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAMAL 842

Query: 825 IDAAVARIGLPSDQIQAGFQGTAAVLQQTLAQQPWLILAALVTMYIVLGILYESFVHPLT 884
++ ++ LP+ I + G + + + Q P L+ + V +++ L LYES+ P++
Sbjct: 843 MENLASK--LPAG-IGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVS 899

Query: 885 ILSTLPSAGLGALLALLLVRTDFTLIALIGVFLLIGIVKKNAIMMVDFALEAERNQHMAP 944
++ +P +G LLA L + ++G+ IG+ KNAI++V+FA + +
Sbjct: 900 VMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGV 959

Query: 945 REAIFQACLTRFRPIMMTTMAAIFGALPLVLATGAGVEMRQPLGITIVGGLVLSQILTLY 1004
EA A R RPI+MT++A I G LPL ++ GAG + +GI ++GG+V + +L ++
Sbjct: 960 VEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIF 1019

Query: 1005 TTPVVYLYLDRF 1016
PV ++ + R
Sbjct: 1020 FVPVFFVVIRRC 1031


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_016221  ACRIFLAVINRP  827    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 827 bits (2139), Expect = 0.0
Identities = 286/1033 (27%), Positives = 509/1033 (49%), Gaps = 25/1033 (2%)

Query: 4 SRLFILRPVATTLSMVAILIAGFIAYRMLPVSALPEVDYPTIQVTTLYPGASPDVMTSLV 63
+ FI RP+ + + +++AG +A LPV+ P + P + V+ YPGA + V
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 64 TSPLERQFGQMPGLNQMSSTS-SGGASVITMQFNLTLPLDVAEQQVQAAINAASNLLPSD 122
T +E+ + L MSSTS S G+ IT+ F D+A+ QVQ + A+ LLP +
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE 121

Query: 123 LPVPPTYNKVNPADAAVLTLAITS--PTMPLPQVRDLVDTRVAQKLSQIPGVGLVSVAGG 180
+ + + ++ S P + D V + V LS++ GVG V + G
Sbjct: 122 VQQQGIS-VEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 QRPAVRVQVNPQALAANGLGLTDLRAAVVGANVNQPKGNLDGPL------RSTTINANDQ 234
Q A+R+ ++ L L D+ + N G L G + +I A +
Sbjct: 181 QY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 235 LKSPTDYNDLII-AYKNNAPLRLSDVARAVEGAEDTRQAAWAGDKPAILLNIQRQPGANV 293
K+P ++ + + + + +RL DVAR G E+ A KPA L I+ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 294 IDVVNRIQALLPQLRAALPATLDVTVVSDRTQTIRDSVADVQFEMLLAVALVVMVTFVFL 353
+D I+A L +L+ P + V D T ++ S+ +V + A+ LV +V ++FL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 354 RSLTATLIPSVVVPLSLVGTFGIMYLAGFSINNLTLMALTIATGFVVDDAIVMIENIARH 413
+++ ATLIP++ VP+ L+GTF I+ G+SIN LT+ + +A G +VDDAIV++EN+ R
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 414 I-EEGETPLQAALKGAQQIGFTLISLTFSLIAVLIPLLFMTEVVGRLFREFAITLAVSIL 472
+ E+ P +A K QI L+ + L AV IP+ F G ++R+F+IT+ ++
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 473 ISLVVSLTLTPMMCARLLRAESEQKH---GRFHQVTGAFIDRTIAAYDRMLQKVLRHQPL 529
+S++V+L LTP +CA LL+ S + H G F D ++ Y + K+L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 530 TLLVALATFALTVLLYILVPKGFFPQQDTGLIQAITQAPQSISFTAMAERQQAAARLAL- 588
LL+ A V+L++ +P F P++D G+ + Q P + + L
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 589 -EDPDVQAVSSFIGVDGSNATLSAGRMQIALKPQAERNGDLRTVMARLQQAFGKQDGLTV 647
E +V++V + G S +AG ++LKP ERNGD + A + +A + +
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRD 659

Query: 648 -YMQPVQDLTIEDRVSRTQYQMTL---SNPDLKVLSEWTPKLIDRLRQVPG-LKDVTDDL 702
++ P I + + T + L + L++ +L+ Q P L V +
Sbjct: 660 GFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNG 719

Query: 703 QDDGLQTWVEIDRDAASRLGITAAVVDEALYDAFGQRLISTIFTQSNQYRVVLEVQPQFQ 762
+D Q +E+D++ A LG++ + +++ + A G ++ + ++ ++ +F+
Sbjct: 720 LEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFR 779

Query: 763 MNPAALGQIHVPTSTGAQVPLSSIARITEGKTVLAVNRLDQFPMVTVSFNLAPGASLSAA 822
M P + +++V ++ G VP S+ + R + P + + APG S A
Sbjct: 780 MLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDA 839

Query: 823 VEAITAAEAEIGMPASVETRFQGAALAFQNSLSSTLWLILAAIVTMYIVLGVLYESYIHP 882
+ + ++ +PA + + G + + S + L+ + V +++ L LYES+ P
Sbjct: 840 MALMENLASK--LPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIP 897

Query: 883 ITILSTLPSAGVGALLALLISGTELDMIGIIGIILLIGIVKKNAIMMIDFALDAERKRGL 942
++++ +P VG LLA + + D+ ++G++ IG+ KNAI++++FA D K G
Sbjct: 898 VSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGK 957

Query: 943 SPRAAIHEAALLRFRPILMTTLAALFGALPLMLSTGTGAELRQPLGLVMVGGLLLSQVLT 1002
A A +R RPILMT+LA + G LPL +S G G+ + +G+ ++GG++ + +L
Sbjct: 958 GVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLA 1017

Query: 1003 LFTTPVIYLMFDR 1015
+F PV +++ R
Sbjct: 1018 IFFVPVFFVVIRR 1030


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_016231  RTXTOXIND     43     1e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 43.3 bits (102), Expect = 1e-06
Identities = 23/147 (15%), Positives = 64/147 (43%), Gaps = 12/147 (8%)

Query: 89 VTAYNTVTVRSRVSGELVDVPFQEGQRVKAGDLLAQVDP-------RAFQVALDQARGTQ 141
+ + ++ + + ++ +EG+ V+ GD+L ++ Q +L QAR Q
Sbjct: 91 THSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQ 150

Query: 142 MQNLALLENARRDLQRYQALFKQ---DSIAKQQVDTQAALVRQYEGTIKSD--QANVDNA 196
+ L + + L + ++++++V +L+++ T ++ Q ++
Sbjct: 151 TRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLD 210

Query: 197 RLQLDYARITAPISGRLGLRQVDRGNL 223
+ + + + A I+ L +V++ L
Sbjct: 211 KKRAERLTVLARINRYENLSRVEKSRL 237


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_016261  PREPILNPTASE  240    8e-81    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 240 bits (613), Expect = 8e-81
Identities = 126/274 (45%), Positives = 164/274 (59%), Gaps = 3/274 (1%)

Query: 15 FVALAALAGLFAGGWLTRLTHRLPRMMEHEWQAQCQEAAGKPRSA---AAYGLLAPAAHC 71
+ +L L L G +L + HRLP M+E EWQA+ + Y L+ P + C
Sbjct: 15 YFSLVFLFSLMIGSFLNVVIHRLPIMLEREWQAEYRSYFNPDDEGVDEPPYNLMVPRSCC 74

Query: 72 PACQAPVTGWRRLPLLGWLALRGRCAACHAPIGWRYPAIESMTCILFAACAWRFGATPIA 131
P C P+T +PLL WL LRGRC C API RYP +E +T +L A A
Sbjct: 75 PHCNHPITALENIPLLSWLWLRGRCRGCQAPISARYPLVELLTALLSVAVAMTLAPGWGT 134

Query: 132 LCAMGLSAALVALAWIDLETTLLPDAITLPLAWAGLLVNLFDAFTPLSMAVVGAVAGYVF 191
L A+ L+ LVAL +IDL+ LLPD +TLPL W GLL NL F L AV+GA+AGY+
Sbjct: 135 LAALLLTWVLVALTFIDLDKMLLPDQLTLPLLWGGLLFNLLGGFVSLGDAVIGAMAGYLV 194

Query: 192 LWVIFHAFRLLTGREGMGFGDFKLLAALGAWFGVGALPMLLLAASVAGVVVGGALTLSGR 251
LW ++ AF+LLTG+EGMG+GDFKLLAALGAW G ALP++LL +S+ G +G L L
Sbjct: 195 LWSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALPIVLLLSSLVGAFMGIGLILLRN 254

Query: 252 ASRGQALPFGPYLALAGVAVLLLGGEQGLGWLLH 285
+ + +PFGPYLA+AG LL G +L +
Sbjct: 255 HHQSKPIPFGPYLAIAGWIALLWGDSITRWYLTN 288


101  NH44784_023801  NH44784_023931  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_023801  0       46       -6.932306  Undecaprenyl-phosphate N-acetylglucosaminyl
NH44784_023811  -1      39       -5.784702  UDP-N-acetylglucosamine 4,6-dehydratase
NH44784_023821  -1      32       -2.738934  Lipopolysaccharide heptosyltransferase I
NH44784_023831  -1      20       -0.358408  3-deoxy-D-manno-octulosonic-acid transferase
NH44784_023841  0       15       -0.217372  UDP-glucose 4-epimerase
NH44784_023851  0       11       1.612005   hypothetical protein
NH44784_023861  0       10       2.198008   Pantothenate kinase type III, CoaX-like
NH44784_023871  0       11       1.989195   Biotin-protein ligase
NH44784_023881  -1      12       1.387229   ABC-type transport system involved in resistance
NH44784_023891  1       11       0.768764   Methionine ABC transporter ATP-binding protein
NH44784_023901  -1      10       0.712425   Mammalian cell entry related domain protein
NH44784_023911  -2      11       0.457216   Probable lipoprotein
NH44784_023921  -3      9        -0.538651  Alpha/beta hydrolase
NH44784_023931  -1      21       -4.227442  D-alanyl-D-alanine carboxypeptidase
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_023801  PREPILNPTASE  30     0.015    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 29.8 bits (67), Expect = 0.015
Identities = 25/99 (25%), Positives = 37/99 (37%), Gaps = 1/99 (1%)

Query: 134 LGALLAAVALVWLLNLYNFMDGIDGIAAAEAVCVCSGGAILYLLTGRPEAGYAPLLLAAA 193
L L + L + D + G A A + + S LLTG+ GY L AA
Sbjct: 163 LPLLWGGLLFNLLGGFVSLGDAVIG-AMAGYLVLWSLYWAFKLLTGKEGMGYGDFKLLAA 221

Query: 194 AAGFLCWNFPPAKIFMGDVGSGFLGITLGVLALQAGWTA 232
+L W P + + + F+GI L +L
Sbjct: 222 LGAWLGWQALPIVLLLSSLVGAFMGIGLILLRNHHQSKP 260


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_023811  NUCEPIMERASE  78     1e-17    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 77.5 bits (191), Expect = 1e-17
Identities = 55/298 (18%), Positives = 110/298 (36%), Gaps = 42/298 (14%)

Query: 281 TVLVTGAGGSIGSELCRQLARFSPARLVLVEANEFALYNVEQWFHQHWPQLELVLLAGDV 340
LVTGA G IG + ++L + + N++ +++Q + Q D+
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 341 KDAARMEEVFQGWRPDVVFHAAAYKHVPLMEVANAWQAARNNVLGTLRVAECAVRYGASR 400
D M ++F + VF + V + N A +N+ G L + E
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVR-YSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 401 FVLIST---------------DKAVNPTNVMGATKRMAEMVCESLYRRHQTTQFSMVRFG 445
+ S+ D +P ++ ATK+ E++ + Y + +RF
Sbjct: 121 LLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHT-YSHLYGLPATGLRFF 179

Query: 446 NVLGSTGS---VIPKFQAQIARGGPVTV-THPEINRYFMSIPEAAQLVLQA--------- 492
V G G + KF + G + V + ++ R F I + A+ +++
Sbjct: 180 TVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVIPHADT 239

Query: 493 ---------ASMGAGGEIFVLDMGHPVKIVDLARNMIRLSGYSEAQIRIEFTGLRPGE 541
A+ A ++ + PV+++D + + G + + L+PG+
Sbjct: 240 QWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALG---IEAKKNMLPLQPGD 294


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_023841  NUCEPIMERASE  81     2e-19    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 80.6 bits (199), Expect = 2e-19
Identities = 71/349 (20%), Positives = 126/349 (36%), Gaps = 61/349 (17%)

Query: 1 MRIAVTGSTGYIAQALIKHLANQENEVLAI---------SRGQPITALATLPGVQW---- 47
M+ VTG+ G+I + K L ++V+ I S Q L PG Q+
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 48 LSTADGIPSAAALRGCHAVVHLAGRAHTRVAHEGGRDLFDESNRVLALNCSDASREAGVD 107
L+ +G+ A V R R + E D SN LN + R +
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYAD-SNLTGFLNILEGCRHNKIQ 119

Query: 108 RFVFISTLGVHGGSSPKPVTENSPMTG-VSPYARSKIEAE---QQLAENYRNTPETLCIV 163
++ S+ V+G + P + + + VS YA +K E + Y P T +
Sbjct: 120 HLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLY-GLPAT--GL 176

Query: 164 RPPMVYGPSCPGN--FPRLINLVRRGLPLP-FASIHSRRSFIHVDNLA------------ 208
R VYGP + + + G + + +R F ++D++A
Sbjct: 177 RFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVIPH 236

Query: 209 ------HFLGAVSSCQVPGGTYVIGDESDFTVPELIRMIGQAFGLPTR--LIPFPPSLLL 260
G ++ P Y IG+ S + + I+ + A G+ + ++P P
Sbjct: 237 ADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLPLQP---- 292

Query: 261 AAAHLVGRRGEMESLTQAMEVDWSRARDFAGWTPPIPGRQALQDTLENY 309
G++ T A D + G+TP + +++ + Y
Sbjct: 293 ---------GDVLE-TSA---DTKALYEVIGFTPETTVKDGVKNFVNWY 328


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_023861  PF03309  96     2e-25    Bvg accessory factor
		>PF03309#Bvg accessory factor

Length = 271

Score = 95.6 bits (238), Expect = 2e-25
Identities = 57/287 (19%), Positives = 90/287 (31%), Gaps = 58/287 (20%)

Query: 1 MIILIDSGNSRLKVGWLDPGSPD--------MPREPAAVAFDGLDLDALDRWLGALPRRP 52
M++ ID N+ VG + + EP A D L + L
Sbjct: 1 MLLAIDVRNTHTVVGLISGSGDHAKVVQQWRIRTEPEVTA------DELALTIDGL---- 50

Query: 53 SRALGVNVAGAERGAAIAAV-----------LQRH-GCPVTWATSQATTLGLVNRYKTPS 100
+G + A GA+ + L+++ G+ P
Sbjct: 51 ---IGDD-AERLTGASGLSTVPSVLHEVRVMLEQYWPNVPHVLIEPGVRTGIPLLVDNPK 106

Query: 101 QLGADRWASLLGVLSHLPGAHPPFLLASFGTATTIDTVGPDNVFAGGLILPGPAMMRNAL 160
++GADR + L A ++ FG++ +D V F GG I PG + +A
Sbjct: 107 EVGADRIVNCLAAYHKYGTAA---IVVDFGSSICVDVVSAKGEFLGGAIAPGVQVSSDAA 163

Query: 161 AHGTANLP---IADGQVVPYPTDTHEAIASGIAAAQAGAV----VRQWLTGRQRYGQAPQ 213
A +A L + + V +T E + +G AG V R G
Sbjct: 164 AARSAALRRVELTRPRSV-IGKNTVECMQAGAVFGFAGLVDGLVNRIRDDVDGFSGADVA 222

Query: 214 IFAAGGGWPEVHQEIERLLADAGGAFGAAPVPVYLDHPVLDGLAAIA 260
+ A G L+ Y H LDGL +
Sbjct: 223 VVATGHT--------APLVLPDLRTV-----EHYDRHLTLDGLRLVF 256


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_023901  PF07201  30     0.012    Hypersensitivity response secretion protein HrpJ
		>PF07201#Hypersensitivity response secretion protein HrpJ

Length = 293

Score = 29.8 bits (67), Expect = 0.012
Identities = 22/123 (17%), Positives = 45/123 (36%), Gaps = 6/123 (4%)

Query: 132 RIEQRGGRL---ISNVEDATEQLRRLLSEQNVQALTASLQNATDITHSLKEASRDLGPAL 188
++ R+ V ++ L +QNV L + L N+ +I+ S +A +
Sbjct: 72 KLSDSQARVSDVEEQVNQYLSKVPELEQKQNVSELLSLLSNSPNISLSQLKAYLE---GK 128

Query: 189 AKLGPLVDSLGSTSRQADRAAREVADLAQQARQSLARLNAPDGALAAATRSLNDIAFAAS 248
++ + R A + E+A L+ Q+L + G + A+ S
Sbjct: 129 SEEPSEQFKMLCGLRDALKGRPELAHLSHLVEQALVSMAEEQGETIVLGARITPEAYRES 188

Query: 249 RLD 251
+
Sbjct: 189 QSG 191


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_023921  PF06057  28     0.023    Type IV secretory pathway VirJ component
		>PF06057#Type IV secretory pathway VirJ component

Length = 243

Score = 28.3 bits (63), Expect = 0.023
Identities = 24/129 (18%), Positives = 46/129 (35%), Gaps = 27/129 (20%)

Query: 107 PWVLAGFSFGTAVAAQTYAAL-ADQGDTVLPSALMLMGPAVNRFQSH------------- 152
+L G+SFG V + A VL + L+ + + F+ H
Sbjct: 118 KVILIGYSFGAEVIPFVLNEMPARYRKNVLGAVLLSPSQSSD-FEIHVSEMVTSDNQSAR 176

Query: 153 -------EVQVPGDTLMVHG-EEDEVVPLSEAMDWARPRSIPVVVIPGASHFFHGKLLVL 204
Q L ++G E+D + L + + ++ V+ + G F +
Sbjct: 177 YLTLPEVNKQTTVPMLCLYGKEDDAPLHLCPEV---KQPNVTVMELSGGHS-FDDDYDKV 232

Query: 205 RQLVQAHLK 213
+L++ LK
Sbjct: 233 VKLIKGWLK 241


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits         Score  E-value  Comments
NH44784_023931  BLACTAMASEA  39     2e-05    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 38.6 bits (90), Expect = 2e-05
Identities = 35/152 (23%), Positives = 54/152 (35%), Gaps = 21/152 (13%)

Query: 57 IVIDVNSGQTLAASNPDMKVEPASLTKIMTAYVVFNALEEKRLTLEQTVPVSEHAWRTGG 116
I +D+ SG+TL A D + S K++ V ++ LE+ + +
Sbjct: 43 IEMDLASGRTLTAWRADERFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQD----- 97

Query: 117 SRMFIEPRKPVTVDELNQGM---------IVQSGNDASVALAEAVGGSEA--AFATLMNQ 165
+ PV+ L GM I S N A+ L VGG AF +
Sbjct: 98 ----LVDYSPVSEKHLADGMTVGELCAAAITMSDNSAANLLLATVGGPAGLTAFLRQIGD 153

Query: 166 EAERLGMRNTHFMNATGLPDPQHMTSTRDLAT 197
RL R +N D + T+ +A
Sbjct: 154 NVTRLD-RWETELNEALPGDARDTTTPASMAA 184


102  NH44784_026171  NH44784_026271  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_026171  2       12       2.539097   Nucleoside-diphosphate-sugar epimerases
NH44784_026181  2       13       2.239997   D-beta-hydroxybutyrate dehydrogenase
NH44784_026191  -1      11       1.729316   Predicted pyridoxine biosynthesis protein
NH44784_026201  -1      10       1.268889   regulatory protein GntR, HTH:GntR, C-terminal
NH44784_026211  -1      12       0.916319   transcriptional regulator, LysR family
NH44784_026221  0       12       0.818505   major facilitator superfamily MFS_1
NH44784_026231  -1      12       0.624547   hypothetical protein
NH44784_026241  -2      12       0.158034   Zinc-regulated outer membrane receptor
NH44784_026251  2       14       -0.023724  Hydroxypyruvate isomerase
NH44784_026261  1       18       -3.461647  hypothetical protein
NH44784_026271  0       15       -3.006570  Putative DMT superfamily metabolite efflux
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_026171  NUCEPIMERASE  91     4e-23    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 91.4 bits (227), Expect = 4e-23
Identities = 54/219 (24%), Positives = 90/219 (41%), Gaps = 28/219 (12%)

Query: 1 MKILITGGAGFLGQRLARKLLEQGRLALSGERVPISQID-----LLDVTRTDAINDTRVR 55
MK L+TG AGF+G ++++LLE G + V I ++ L R + + +
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGH-----QVVGIDNLNDYYDVSLKQARLELLAQPGFQ 55

Query: 56 SVEGDVADPDCLRSLIG-ADTAVIFHLAAIVSGQAEADFDLG-----MRINLDASRALLE 109
+ D+AD + + L +F + + L NL +LE
Sbjct: 56 FHKIDLADREGMTDLFASGHFERVFISPH----RLAVRYSLENPHAYADSNLTGFLNILE 111

Query: 110 ACRRQGHRPRVVFTSSVAVYGGA--LPETVRDDTALNPQSSYGTQKAIAELLLADYTRRG 167
CR + +++ SS +VYG +P + DD+ +P S Y K EL+ Y+
Sbjct: 112 GCRHNKIQ-HLLYASSSSVYGLNRKMPFST-DDSVDHPVSLYAATKKANELMAHTYSHLY 169

Query: 168 FVDGRALRLPTISVRPGRPNAAASSFASGIIREPLNGEP 206
+ LR T+ GRP+ A F + L G+
Sbjct: 170 GLPATGLRFFTVYGPWGRPDMALFKFTKAM----LEGKS 204


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_026201  DHBDHDRGNASE  31     0.002    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 31.2 bits (70), Expect = 0.002
Identities = 19/49 (38%), Positives = 23/49 (46%), Gaps = 5/49 (10%)

Query: 186 EARHADQMRAVLREHEAIAEAIRRQDEDAAREAARI-HMVNAAKRLRLA 233
EARHA+ A +R+ AI E R RE I +VN A LR
Sbjct: 55 EARHAEAFPADVRDSAAIDEITAR----IEREMGPIDILVNVAGVLRPG 99


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_026221  TCRTETB  29     0.029    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 29.5 bits (66), Expect = 0.029
Identities = 24/143 (16%), Positives = 54/143 (37%), Gaps = 15/143 (10%)

Query: 78 VGGMLIGSYADRHGRRAAMTLTLWLMGLGCALIAAAPTHAQMGLLGPVMMVLARLIQGFA 137
+G + G +D+ G + + + + G + H+ LL ++AR IQG
Sbjct: 64 IGTAVYGKLSDQLGIKRLLLFGIIINCFGS--VIGFVGHSFFSLL-----IMARFIQGAG 116

Query: 138 AGGEVGASTTLLVEHAPPSRRGFYSSWQFGSQSLGVALGAVVVGTLTAALSAEQMQAWGW 197
A ++ + P RG ++G +G + G + + W
Sbjct: 117 AAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYI--------HW 168

Query: 198 RVPFVIGILTVPVGAYIRRNLEE 220
+I ++T+ ++ + L++
Sbjct: 169 SYLLLIPMITIITVPFLMKLLKK 191


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_026241  PF00577  32     0.011    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 31.7 bits (72), Expect = 0.011
Identities = 19/125 (15%), Positives = 33/125 (26%), Gaps = 14/125 (11%)

Query: 575 VEGELRHQFTP---VFSAAVFGDYVRGKLTGGDGNLPRIPA-------ARAGLRGNVKWQ 624
+ L H ++ D R G N+ + A A + L + +
Sbjct: 396 FQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHD 455

Query: 625 QWSGGIEYARVFSQKD----IAAYESSTPGYNLVNAVVAYRGRYGATGYEVYLRGTNLLN 680
S Y + ++ + Y ST GY R + +
Sbjct: 456 GQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIETQDGVIQVKPKF 515

Query: 681 KLAYN 685
YN
Sbjct: 516 TDYYN 520


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_026271  PF06580  29     0.021    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 29.1 bits (65), Expect = 0.021
Identities = 16/102 (15%), Positives = 34/102 (33%), Gaps = 5/102 (4%)

Query: 63 IALYGVTLACMNLLFYMALRTLPLGVAIAIEFTGPLTLAVVLSRRAIDFVWIACALAGLV 122
++ + ++ M L+ A R+ G + L V+ + I VW +
Sbjct: 41 SMIFNIAISLMGLVLTHAYRSFIKRQGWLKLNMGQIILRVLPACVVIGMVWFVANTSIWR 100

Query: 123 LLIPTGQSMHDLDPVGIAYALGAAVCWALYIIFGKMAGNVHG 164
LL PV L ++ + + ++ + G
Sbjct: 101 LLAFINTK-----PVAFTLPLALSIIFNVVVVTFMWSLLYFG 137


103  NH44784_026621  NH44784_026741  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_026621  -2      10       0.126264   glycosyltransferase( EC:2.4.-
NH44784_026631  -3      10       1.733828   transposase, IS4 family protein
NH44784_026641  -3      12       1.565190   hypothetical protein
NH44784_026651  -2      16       1.746781   Glutathione S-transferase family protein
NH44784_026661  -1      13       1.419077   FIG01197967: hypothetical protein
NH44784_026671  -1      14       0.394018   3-oxoacyl-[acyl-carrier protein] reductase
NH44784_026681  -2      15       -0.418749  3-oxoacyl-[acyl-carrier protein] reductase
NH44784_026691  -1      14       -0.369881  Branched-chain amino acid transport ATP-binding
NH44784_026701  -1      8        -0.996518  Branched-chain amino acid transport ATP-binding
NH44784_026711  -1      9        -1.254300  Branched-chain amino acid transport system
NH44784_026721  -1      11       -1.579129  High-affinity branched-chain amino acid
NH44784_026731  -2      15       -1.549847  ABC branched chain amino acid family
NH44784_026741  -3      14       -0.495069  3-oxoacyl-[acyl-carrier protein] reductase
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_026621  AUTOINDCRSYN  29     0.020    Autoinducer synthesis protein signature.
		>AUTOINDCRSYN#Autoinducer synthesis protein signature.

Length = 216

Score = 28.7 bits (64), Expect = 0.020
Identities = 7/19 (36%), Positives = 11/19 (57%)

Query: 99 LRCAYAFALDQGYEGIVTI 117
++ D+GY+GI TI
Sbjct: 124 FLSMINYSKDKGYDGIYTI 142


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_026671  DHBDHDRGNASE  113    2e-32    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 113 bits (283), Expect = 2e-32
Identities = 70/250 (28%), Positives = 115/250 (46%), Gaps = 17/250 (6%)

Query: 10 ALVTGGASGMGLAIVERLARDGFRVVMADRNAALADKETQALRAQGLDVDYRAVDLADEH 69
A +TG A G+G A+ LA G + D N +K +L+A+ + D+ D
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDSA 70

Query: 70 ATRALV----RELAPLAALVNNAGLFDERKFFDVTSDDYRRMYDVNLLAVATLTQEAARD 125
A + RE+ P+ LVN AG+ ++ +++ + VN V ++ ++
Sbjct: 71 AIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSKY 130

Query: 126 MAA--GGKIVNIASRAYLGAR-NHPHYVASKAALVGYTRASAMELAPRGILVNAIAPGLI 182
M G IV + S R + Y +SKAA V +T+ +ELA I N ++PG
Sbjct: 131 MMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPGST 190

Query: 183 DTPLLRNLSPERLAAQLALQ----------PTGRAGQPRDVANAVSFLASPQMDFITGQV 232
+T + +L + A+ ++ P + +P D+A+AV FL S Q IT
Sbjct: 191 ETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHITMHN 250

Query: 233 IFVDGGKSLG 242
+ VDGG +LG
Sbjct: 251 LCVDGGATLG 260


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_026681  DHBDHDRGNASE  121    2e-35    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 121 bits (304), Expect = 2e-35
Identities = 75/252 (29%), Positives = 116/252 (46%), Gaps = 10/252 (3%)

Query: 5 MQDKRVLVTGAGRGLGATIAQGFAREGASVIVADLDPALARASAQAIAAAGGRAIDAALD 64
++ K +TGA +G+G +A+ A +GA + D +P ++ A A D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 65 VTDAEAVRAFAAECEARHGAIDVLVNNAGISARAPFDDPQTPQIWERVMNVNLQGTFNVT 124
V D+ A+ A E G ID+LVN AG+ + WE +VN G FN +
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEE-WEATFSVNSTGVFNAS 124

Query: 125 HAFVEQLKARR-GAIVNLCSIVAYGCGISTAGYVVSKGGVRSFTEVLARDLAPHGVRVNA 183
+ + + RR G+IV + S A S A Y SK FT+ L +LA + +R N
Sbjct: 125 RSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNI 184

Query: 184 VAPGLMETEMTAGQRAQAHGTDWYMRRA--------PMARAGRADEIVGPVLFLASDMAS 235
V+PG ET+M A +G + ++ + P+ + + +I VLFL S A
Sbjct: 185 VSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAG 244

Query: 236 YVNGVVLPVDGG 247
++ L VDGG
Sbjct: 245 HITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_026691  YERSSTKINASE  29     0.029    Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 28.6 bits (63), Expect = 0.029
Identities = 29/105 (27%), Positives = 47/105 (44%), Gaps = 11/105 (10%)

Query: 107 LRQDRQAVKADLDMV-------CDTFPRLRERFEQKAATLSGGEQQMLATGRAMMARPRM 159
L+Q +++ KA L ++ D + +RF+ + G +Q A R MMA
Sbjct: 611 LQQQQESAKAQLSILINRSGSWADVARQSLQRFDSTRPVVKFGTEQYTAIHRQMMAAHAA 670

Query: 160 ILLDEPSM---GLSPLVVEQIFDIVLRLNREQGITILLVEQNVKL 201
I L E S + V+ I ++++L R + LVEQ KL
Sbjct: 671 ITLQEVSEFTDDMRNFTVDSI-PLLIQLGRSSLMDEHLVEQREKL 714


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_026741  DHBDHDRGNASE  105    1e-29    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 105 bits (264), Expect = 1e-29
Identities = 82/266 (30%), Positives = 126/266 (47%), Gaps = 26/266 (9%)

Query: 8 AAGPAPRLALVTGGGGGIGATICQELARAGYRVVVSDVDAQRARRVADELGAP---HSAH 64
A G ++A +TG GIG + + LA G + D + ++ +V L A A
Sbjct: 3 AKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAF 62

Query: 65 AFDVSDESAVETAFDQIEQAHGPIAVLVSAAGLLLFQPNGERPLIKDTTLDIWERSFAVN 124
DV D +A++ +IE+ GPI +LV+ AG+L + WE +F+VN
Sbjct: 63 PADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSD------EEWEATFSVN 116

Query: 125 ARGVFLCGRA---YLRRREAAPLKHGRIVTFTSVAAQLGGYRSSASYIAAKSAVLGLTKA 181
+ GVF R+ Y+ R + G IVT S A + S A+Y ++K+A + TK
Sbjct: 117 STGVFNASRSVSKYMMDR-----RSGSIVTVGSNPAGVP-RTSMAAYASSKAAAVMFTKC 170

Query: 182 MARESAHLGVTVNGIAPGLIDTDMLRSTVTS--------SGALAAAAQAIPLGRIGTVDD 233
+ E A + N ++PG +TDM S G+L IPL ++ D
Sbjct: 171 LGLELAEYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSD 230

Query: 234 VAAAVRFLASEEAGYITGSVIDVNGG 259
+A AV FL S +AG+IT + V+GG
Sbjct: 231 IADAVLFLVSGQAGHITMHNLCVDGG 256


104  NH44784_027341  NH44784_027431  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias   Product
NH44784_027341  0       9        0.892909  Transcriptional regulator, PadR family
NH44784_027351  0       9        0.560443  iron-chelator utilization protein
NH44784_027361  -1      10       0.532797  Transcriptional regulator, LysR family
NH44784_027371  -2      9        0.550387  Methylisocitrate lyase
NH44784_027381  -2      10       0.696209  hypothetical protein
NH44784_027391  -1      10       0.463094  hypothetical protein
NH44784_027401  1       12       0.937245  putative RNA polymerase sigma factor
NH44784_027411  2       10       1.338536  response regulator, NarL-family
NH44784_027421  1       10       0.963120  two-component hybrid sensor and regulator
NH44784_027431  0       9        0.541190  Oxalate/formate antiporter
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_027341  cloacin  30     0.014    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 29.7 bits (66), Expect = 0.014
Identities = 18/55 (32%), Positives = 21/55 (38%), Gaps = 2/55 (3%)

Query: 60 GHGRGARYATGDWHEGGRGRHGRHAGGGGGGWFG--GPWGGRGGDADDSFGGDGR 112
GH GA +G+ + G G G GW PWGG G GG G
Sbjct: 8 GHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits        Score  E-value  Comments
NH44784_027351  OMPADOMAIN  29     0.026    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 28.7 bits (64), Expect = 0.026
Identities = 5/31 (16%), Positives = 16/31 (51%), Gaps = 1/31 (3%)

Query: 234 AKAVREIMVAQHGVDKSRIRAASYWKRGAIA 264
A++V + ++++ G+ +I A + +
Sbjct: 278 AQSVVDYLISK-GIPADKISARGMGESNPVT 307


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_027391  PYOCINKILLER  31     0.041    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 30.5 bits (68), Expect = 0.041
Identities = 31/99 (31%), Positives = 39/99 (39%), Gaps = 7/99 (7%)

Query: 365 ARDAAGPTSEADAATEADAATQADAATAATAAAEADAAIETAAADTVTAPANIDVAQDPA 424
AA + EA AA +A A+A A A AAI AA+T PAN V A
Sbjct: 206 TLTAAKASIEAAAANKAREQAAAEAKRKAEEQARQQAAIR--AANTYAMPANGSVVATAA 263

Query: 425 -----QVEADAQAVADAAAAGAEAAAAAADAATSADAGG 458
QV A ++A A + +A S A G
Sbjct: 264 GRGLIQVAQGAASLAQAISDAIAVLGRVLASAPSVMAVG 302


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
NH44784_027411  HTHFIS  70     2e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 70.2 bits (172), Expect = 2e-16
Identities = 33/156 (21%), Positives = 60/156 (38%), Gaps = 4/156 (2%)

Query: 11 RVLLADDHGIVREGLKMVLQQAPAMIGSIDEAATGEQVLALLAAHGADVLVLDLGMPGVA 70
+L+ADD +R L L +A + + + +AA D++V D+ MP
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGY---DVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 71 GSGWVRALRDRHPALHIMVLTANTDARSRQAILEAGADAYLAKTGSSHELMAAIQR-LHQ 129
+ ++ P L ++V++A + E GA YL K EL+ I R L +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 130 DPADRPVRARPNHTPAPADTLTRREQQVLALAAQGA 165
+ P + Q++ + A+
Sbjct: 122 PKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLM 157


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_027421  PF06580  41     5e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 41.4 bits (97), Expect = 5e-06
Identities = 41/191 (21%), Positives = 73/191 (38%), Gaps = 40/191 (20%)

Query: 258 ARRQQTVLDSLNRLLDGLLDVSRMDAHLLRIE----RRAVDLDQL-FDDIRLDFESLARA 312
+ + +L SL+ L+ L S L E + L + F+D RL FE
Sbjct: 190 PTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFED-RLQFE----- 243

Query: 313 RGLRLSVAPTSLALDTDPALLRRILDNLLTNALSYTPKGG-VLLSARRRRGGVLIQVWDT 371
+ P + + P L++ +++N + + ++ P+GG +LL + G V ++V +T
Sbjct: 244 ----NQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENT 299

Query: 372 GVGIAPELQEAVFTEFTRGAGAPASPPAGQGLGLGLAIVRRLAGLLHGE---ITLRSTPG 428
G + G GL VR +L+G I L G
Sbjct: 300 GSLALKN--------------------TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQG 339

Query: 429 KGSVFSLWLPA 439
K + + +P
Sbjct: 340 KVNAM-VLIPG 349


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_027431  TCRTETB  36     3e-04    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 36.0 bits (83), Expect = 3e-04
Identities = 29/129 (22%), Positives = 58/129 (44%), Gaps = 4/129 (3%)

Query: 65 LTWGVVQPFTGMIADRHGSAKVILGGLACYALGLAGMAHAGTVTAFMLSAGVCIGIALSG 124
LT+ + G ++D+ G +++L G+ G + + ++ A G
Sbjct: 60 LTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAG--A 117

Query: 125 TAF-AVIYGALSRLVPPERRGWALGVAGAVGGLGQFTMVPAAQWLIGSRGWVDALLIFAV 183
AF A++ ++R +P E RG A G+ G++ +G+ + PA +I LL+ +
Sbjct: 118 AAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGE-GVGPAIGGMIAHYIHWSYLLLIPM 176

Query: 184 VLAVVVPLA 192
+ + VP
Sbjct: 177 ITIITVPFL 185


105  NH44784_028891  NH44784_028961  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias   Product
NH44784_028891  3       12       1.499786  Acetoacetyl-CoA reductase
NH44784_028901  2       11       2.142881  hypothetical protein
NH44784_028911  2       11       2.188294  ABC-type multidrug transport system, permease
NH44784_028921  1       9        2.415860  ABC-type multidrug transport system, permease
NH44784_028931  -1      8        3.209041  secretion protein HlyD
NH44784_028941  0       9        3.055224  RND efflux system, outer membrane lipoprotein
NH44784_028951  0       10       1.919625  Polyhydroxyalkanoic acid synthase
NH44784_028961  -2      12       0.787091  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_028891  DHBDHDRGNASE  122    5e-36    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 122 bits (307), Expect = 5e-36
Identities = 78/249 (31%), Positives = 111/249 (44%), Gaps = 10/249 (4%)

Query: 5 RTALVTGGTGCLGRAIARALLDAGHDVIVTCHASEAATRQWLDQEAAAGRRYDMVKVDVA 64
+ A +TG +G A+AR L G + + E + + A R + DV
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKV-VSSLKAEARHAEAFPADVR 67

Query: 65 DYDACQALARRLADDGRQIDILVNNAGITRDASLRKMSYENWNDVLRSNLDSMFNMTQPL 124
D A + R+ + IDILVN AG+ R + +S E W N +FN ++ +
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 125 CGPMADRGWGRIVNISSVNGSKGAFGQTNYAASKAGIHGFTKSLALELARKGVTVNTVSP 184
M DR G IV + S YA+SKA FTK L LELA + N VSP
Sbjct: 128 SKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSP 187

Query: 185 GYLATRMVEAV------PEDVLK---EKILPQIPLGRLGQPDEIAALVAFICSDAAAFMT 235
G T M ++ E V+K E IPL +L +P +IA V F+ S A +T
Sbjct: 188 GSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHIT 247

Query: 236 GSNVAMNGG 244
N+ ++GG
Sbjct: 248 MHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_028911  ABC2TRNSPORT  58     1e-11    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 57.6 bits (139), Expect = 1e-11
Identities = 46/176 (26%), Positives = 73/176 (41%), Gaps = 5/176 (2%)

Query: 197 AALIREREHGTIEHLLVMPVTPTEIMLAKV-WSMGLVVLVSAGLSLTFVVRGLLQVPVEG 255
AA R T E +L + +I+L ++ W+ L AG+ + G Q
Sbjct: 89 AAFGRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGYTQWL--- 145

Query: 256 SVALFLAGVALHLFATTSMGIFMATLARSMPQFGMLLVLVLLPLQMLSGGTTPRESMPDF 315
S+ L +AL A S+G+ + LA S F LV+ P+ LSG P + +P
Sbjct: 146 SLLYALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIV 205

Query: 316 VQNIMLAAPTTHFVELGQAILFRGAGLGVVWQPFLALALIGSVLFAFSLTRFRKTL 371
Q P +H ++L + I+ + V Q AL + + F S R+ L
Sbjct: 206 FQTAARFLPLSHSIDLIRPIMLGHPVVDVC-QHVGALCIYIVIPFFLSTALLRRRL 260


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
NH44784_028931  RTXTOXIND  70     2e-15    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 70.2 bits (172), Expect = 2e-15
Identities = 65/391 (16%), Positives = 131/391 (33%), Gaps = 84/391 (21%)

Query: 11 LLAIVAVAAAGYYGWRMLSDTGPGAGFVSGNGRIEATEVDVATKLAGRVQDVLVAEGDFV 70
+ V A + G ++ +GR + + V++++V EG+ V
Sbjct: 63 FIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKE----IKPIENSIVKEIIVKEGESV 118

Query: 71 SAGQPLARM------------QIDTLQAQREEAR-------------------------- 92
G L ++ Q LQA+ E+ R
Sbjct: 119 RKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQN 178

Query: 93 AQHQQAVNNAASASAQVAQRESDKLAAEAVVVQRESELDAARRRLARSE----------- 141
++ + + Q + ++ K E + ++ +E R+ R E
Sbjct: 179 VSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLD 238

Query: 142 ---TLSREGASSIQELDDDRARVRSAQAAVNAGRAQVKAAQAAIDAAKAAQVG------- 191
+L + A + + + + A + ++Q++ ++ I +AK
Sbjct: 239 DFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKN 298

Query: 192 ------AQSAVNAALAT--IARIEADIADSELRAPRDGRV-QYRVAQAGEVLGAGGKVLN 242
Q+ N L T +A+ E S +RAP +V Q +V G V+ ++
Sbjct: 299 EILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMV 358

Query: 243 MVDLADVY-MTFFLPEQAAGRVALGQDVRIVLDAAPQY---VIPAKVSFVASTAQFTPKT 298
+V D +T + + G + +GQ+ I ++A P + KV + A
Sbjct: 359 IVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDA------ 412

Query: 299 VETATERQKLMFRVKAQIAPELLRQHLRQVK 329
+R L+F V I L + +
Sbjct: 413 --IEDQRLGLVFNVIISIEENCLSTGNKNIP 441


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
NH44784_028941  RTXTOXIND  33     0.003    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 32.9 bits (75), Expect = 0.003
Identities = 22/171 (12%), Positives = 44/171 (25%), Gaps = 36/171 (21%)

Query: 86 DLRTALLRVEEARALYGIQRADQFPTIGAQADGSRGRTPGDLNLTGQPQVASQYQVGVGM 145
L A L + L ++ P + + P N++ + +
Sbjct: 142 SLLQARLEQTRYQILSRSIELNKLPELKLPDE------PYFQNVSEEEVL---------- 185

Query: 146 AAWELDFWGRVRSLKDAALENYLASDAAAEAATLSLIAQVADSYLTLRELDERLALTRAT 205
R+ SL + E A+ + R+
Sbjct: 186 ---------RLTSLIKEQFSTWQNQKYQKELNLDKKRAERL-------TVLARINRYENL 229

Query: 206 IASREESLRIFRRRYEVGSISKLDLTQVE----TLWQQARALGADLEQARA 252
+ L F +I+K + + E + R + LEQ +
Sbjct: 230 SRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIES 280


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_028961  HTHTETR  64     5e-15    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 64.3 bits (156), Expect = 5e-15
Identities = 29/140 (20%), Positives = 48/140 (34%), Gaps = 8/140 (5%)

Query: 5 QILDAALQEFSAAGYTGARMDDIALRAGLSKGGLYAHFASKEEVFEALLARYLCPPRLDA 64
ILD AL+ FS G + + +IA AG+++G +Y HF K ++F + +
Sbjct: 15 HILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE--SNIGE 72

Query: 65 RALVEGSATPRHLAERLVNHLYASLANPAMISAMRLLLAESLRVPHLA------RRWREQ 118
L + P L L L + RLL+ ++ +
Sbjct: 73 LELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQRN 132

Query: 119 TADAHQADIAQLLELARARG 138
I Q L+
Sbjct: 133 LCLESYDRIEQTLKHCIEAK 152


106  NH44784_032011  NH44784_032071  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_032011  1       9        -0.117250  hypothetical protein
NH44784_032021  0       12       -2.049076  Transcriptional regulator, AraC family
NH44784_032031  2       13       -2.800788  outer membrane porin
NH44784_032041  3       11       -3.230438  outer membrane porin
NH44784_032051  3       11       -2.604056  outer membrane porin
NH44784_032061  2       12       -2.277366  Cytochrome oxidase biogenesis protein
NH44784_032071  2       12       -3.037398  Cytochrome O ubiquinol oxidase subunit IV
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_032011  TCRTETB  57     4e-11    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 57.2 bits (138), Expect = 4e-11
Identities = 39/169 (23%), Positives = 70/169 (41%), Gaps = 4/169 (2%)

Query: 5 WLVIALLALPQVAETILAPALPDLARHWRLDAAATQPVMGIFFVGFAAGVLLWGHLADTR 64
WL I + E +L +LPD+A + A+T V F + F+ G ++G L+D
Sbjct: 18 WLCILSFFSV-LNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQL 76

Query: 65 G-RRPAMLAGLALGLAGTLCALAAPAYPWLLAGRFLQALGLAACSVVTQTVLRDCLDGPR 123
G +R + + + + + L+ RF+Q G AA + V+ +
Sbjct: 77 GIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKEN 136

Query: 124 LTHYFVTLGTVLAWSPAVGPLTGQALAD--GHGYGGVLAAIAVVVALLL 170
F +G+++A VGP G +A Y ++ I ++ L
Sbjct: 137 RGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFL 185


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_032031  ECOLNEIPORIN  96     1e-24    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 96.0 bits (239), Expect = 1e-24
Identities = 93/371 (25%), Positives = 139/371 (37%), Gaps = 43/371 (11%)

Query: 1 MKKTLLAAAMLTTFAGVAQAETSVTLYGVIDTGIGYNK-IKGNGYDGSKLGMINGI-QAG 58
MKK+L+A LT A A VTLYG I G+ ++ + NG + + GI G
Sbjct: 1 MKKSLIA---LTLAALPVAAMADVTLYGTIKAGVETSRSVAHNGAQAASVETGTGIVDLG 57

Query: 59 SRWGLRGSEDLGDGLRAVFTLESGFDSGNGTRGQSGRLFGRQATIGLANDAWGTIEFGRQ 118
S+ G +G EDLG+GL+A++ +E G RQ+ IGL +G + GR
Sbjct: 58 SKIGFKGQEDLGNGLKAIWQVEQK----ASIAGTDSGWGNRQSFIGLKGG-FGKLRVGRL 112

Query: 119 ATVGSNYLADIDPFYTSYTQSNLGLGFSAANTMRWDNMVMYRSPSMSGFQFAAGYSFNVD 178
+V + DI+P + S LG A V Y SP +G + Y+ N D
Sbjct: 113 NSVLKDT-GDINP-WDS-KSDYLG-VNKIAEPEARLISVRYDSPEFAGLSGSVQYALN-D 167

Query: 179 DTNNDETHFRTNDNSRGITAGLRYVEGPVNVTLTYDQLNGSNRASIDHDATPRQYAVGLS 238
+ NS AG Y G V + + + +
Sbjct: 168 NAGRH--------NSESYHAGFNYKNGGFFVQYGGAYKRHHQVQENVNIEKYQIHRLVSG 219

Query: 239 YDLEVVKLAAAYGRTTDGWFVGQDLPAGTPFSDEFGTNRYVDGFKANAYMLGATMP-VGG 297
YD + + A+ + D + + N + AY G P V
Sbjct: 220 YDNDAL-YASVAVQQQDA----------KLVEENYSHNSQTEVAATLAYRFGNVTPRVSY 268

Query: 298 ASSLFASWQHVSPSNDRLTGGDANMNVWSVGYTYDLSKRTNLYAYGSYGKDYAFIDGLKS 357
A S+ + + + VG YD SKRT+ + ++ S
Sbjct: 269 AHGFKGSFDAT--------NYNNDYDQVVVGAEYDFSKRTSALVSAGWLQEGKGESKFVS 320

Query: 358 TAAGVGIRHVF 368
TA GVG+RH F
Sbjct: 321 TAGGVGLRHKF 331


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_032041  ECOLNEIPORIN  85     2e-20    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 84.9 bits (210), Expect = 2e-20
Identities = 89/400 (22%), Positives = 135/400 (33%), Gaps = 71/400 (17%)

Query: 1 MKKTLLAAAMLATFAGVAQAETSVTLYGIIDTGIGYNKVSGTEPGINLGTGKPQNVDVSG 60
MKK+L+A + A A VTLYG I G+ ++ N +
Sbjct: 1 MKKSLIALTLAAL---PVAAMADVTLYGTIKAGVETSR------------SVAHNGAQAA 45

Query: 61 SRIGMINGVQSGSRWGLKGSEDLGDGLRAMFQLESGFDSGNGDTTLDRLFGRQATVGLAN 120
S V GS+ G KG EDLG+GL+A++Q+E D+ RQ+ +GL
Sbjct: 46 SVETGTGIVDLGSKIGFKGQEDLGNGLKAIWQVEQKASIAGTDS---GWGNRQSFIGLKG 102

Query: 121 DAWGSVEFGRQATVGSNFLAEIDPFAASFTQANIGTGLSAANTMHWDNMIMYRSPWTDGF 180
+G + GR +V L + ++++ A + Y SP G
Sbjct: 103 G-FGKLRVGRLNSV----LKDTGDINPWDSKSDYLGVNKIAEPEARLISVRYDSPEFAGL 157

Query: 181 QFALGYSFNVDTDDGNQTGFRTADNARGITAGLRYVQGPLNVSLTYDQLNSSNKAYATGK 240
++ Y+ N + G N+ AG Y G V
Sbjct: 158 SGSVQYALN------DNAGR---HNSESYHAGFNYKNGGFFVQYGGAYKRH--------- 199

Query: 241 NGRPVFDDAGNRVALDNNITPRQYAVAVSYDLEVLKLAAAYGRTTDGWFVGQDLPEGSAS 300
V ++ + + + YD + L A+ + D V ++ S +
Sbjct: 200 ------HQVQENVNIEKY---QIHRLVSGYDNDAL-YASVAVQQQDAKLVEENYSHNSQT 249

Query: 301 NHFGTYRYAEGF--KANSYMLGATLRLDGASNLFGSWQHVSPSNDLLTGDDARMNIWSVG 358
T Y G SY G D + + VG
Sbjct: 250 EVAATLAYRFGNVTPRVSYAHGFKGSFDAT------------------NYNNDYDQVVVG 291

Query: 359 YTYDLSKRTSLYAYGSYGKNYAFIDGLKSTAGGVGMRHLF 398
YD SKRTS + + STAGGVG+RH F
Sbjct: 292 AEYDFSKRTSALVSAGWLQEGKGESKFVSTAGGVGLRHKF 331


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_032051  ECOLNEIPORIN  94     1e-23    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 93.7 bits (233), Expect = 1e-23
Identities = 92/394 (23%), Positives = 140/394 (35%), Gaps = 68/394 (17%)

Query: 1 MKKKLLAAAVLTALASVAQAATSVTLYGLIDTGIGYNRITGDANGADYSGSRIGMINGVQ 60
MKK L+A LT A A VTLYG I G+ +R S I V
Sbjct: 1 MKKSLIA---LTLAALPVAAMADVTLYGTIKAGVETSRSVAHNGAQAASVETGTGI--VD 55

Query: 61 AGSRWGLRGTEDLGDGLRAVFRLENGFNSADGSRLQQGRMFGRQATIGLADDAWGSVDFG 120
GS+ G +G EDLG+GL+A++++E + A RQ+ IGL +G + G
Sbjct: 56 LGSKIGFKGQEDLGNGLKAIWQVEQKASIAGT----DSGWGNRQSFIGLKGG-FGKLRVG 110

Query: 121 RQTSVGSLLLADINPFRTTFTQASIGTTFSAANTMRWDNMVLYRSPWTDGFQFAVGYSFN 180
R SV DINP+ + + + V Y SP G +V Y+ N
Sbjct: 111 RLNSV-LKDTGDINPWDSKSDYLGVNKIAEPEARL---ISVRYDSPEFAGLSGSVQYALN 166

Query: 181 VDGTDEEQSGFRTADNARGITAGLRYANGPLNIVLTFDQLNGSNLASVDAFGDPVDHNAT 240
+ +G N+ AG Y NG + + + + +
Sbjct: 167 DN------AGR---HNSESYHAGFNYKNGGFFVQYGGAYKRHHQVQE-NVNIE--KYQI- 213

Query: 241 PRQYAVGMSYDLEVVKLAAAYGRTTDGWFVGQDLPAGVRGKQAAGAMRRSEKLFGTNRYE 300
+ + YD + + A+ + + E+ + N
Sbjct: 214 ---HRLVSGYDNDAL-YASVAVQQ--------------------QDAKLVEENYSHNSQT 249

Query: 301 EGFRANSYMLGATVP-----LGGGGSVFGAWQHASASSDALTGDDANMDIWSLGYTYDLS 355
E +Y G P G GS + + D +G YD S
Sbjct: 250 EVAATLAYRFGNVTPRVSYAHGFKGSFDAT------------NYNNDYDQVVVGAEYDFS 297

Query: 356 KRTNLYVYGSYGKNYAFIEGLKSTAGGVGIQHRF 389
KRT+ V + + STAGGVG++H+F
Sbjct: 298 KRTSALVSAGWLQEGKGESKFVSTAGGVGLRHKF 331


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_032071  ACRIFLAVINRP  27     0.038    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 26.7 bits (59), Expect = 0.038
Identities = 13/61 (21%), Positives = 24/61 (39%), Gaps = 11/61 (18%)

Query: 65 LAFAVVQIVVHIIYFLHMDTKSESGWNMLALIFTLVLVVITLSGSIWIMYHLN--SNMMP 122
L A++ + + + FL NM A + + V + L G+ I+ N +
Sbjct: 344 LFEAIMLVFLVMYLFLQ---------NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLT 394

Query: 123 M 123
M
Sbjct: 395 M 395


107  NH44784_041371  NH44784_041451  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_041371  3       16       3.954554   Transcription regulatory protein opdE
NH44784_041381  0       13       2.218817   Major facilitator family transporter
NH44784_041391  -1      10       2.636030   transcriptional regulator, TetR family
NH44784_041401  -2      9        3.566094   hypothetical protein
NH44784_041411  -2      7        3.044720   Methyltransferase
NH44784_041421  -1      6        2.524170   LacI family transcriptional regulator
NH44784_041431  -1      5        2.460156   D-galactonate transporter
NH44784_041441  -1      6        3.364288   Aspartyl-tRNA(Asn) amidotransferase subunit A
NH44784_041451  0       8        1.952080   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_041371  TCRTETB  38     6e-05    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 37.9 bits (88), Expect = 6e-05
Identities = 26/145 (17%), Positives = 52/145 (35%), Gaps = 1/145 (0%)

Query: 49 IAASLDASTGRVGWLMVVPGLLAAL-CAPLVVMGARGVDRRRILCGLLLLLAGANLGSAL 107
IA + W+ L ++ A + + +R +L G+++ G+ +G
Sbjct: 40 IANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVG 99

Query: 108 APDMAWFLAARVLVGVCIGGIWAVAGGLAGRLVPPAAIGMATAVIFGGVAVASVLGVPLG 167
+ + AR + G A+ + R +P G A +I VA+ +G +G
Sbjct: 100 HSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIG 159

Query: 168 ALIGNLAGWRSAFGAMAVLSAAVLL 192
+I + W + V
Sbjct: 160 GMIAHYIHWSYLLLIPMITIITVPF 184


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_041381  TCRTETB  36     2e-04    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 36.0 bits (83), Expect = 2e-04
Identities = 31/160 (19%), Positives = 60/160 (37%), Gaps = 1/160 (0%)

Query: 25 LLALALAGFIAIMTETVPAGLLPQIGLGLGVSEALAGQLVTLFAAGSVLAAIPIIVATRG 84
L+ L + F +++ E V LP I A + T F + +
Sbjct: 16 LIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQ 75

Query: 85 WNRRPLLLLAIGGLCIFNVVTALS-AHYALALAARFGGGMAAGLLWGLLAGYARRMVATR 143
+ LLL I C +V+ + + ++L + ARF G A L+ R +
Sbjct: 76 LGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKE 135

Query: 144 QQGRALAVVGAGQPLALCLGVPAGAWLGSLMDWRGVFWLM 183
+G+A ++G+ + +G G + + W + +
Sbjct: 136 NRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIP 175


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_041391  HTHTETR  77     6e-20    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 77.4 bits (190), Expect = 6e-20
Identities = 31/157 (19%), Positives = 59/157 (37%), Gaps = 3/157 (1%)

Query: 13 DAAMQVFWRRGYAATSVQDLVDGTGLGRGSLYNAFGSKQGLYEAALRRYHELTAANLDLL 72
D A+++F ++G ++TS+ ++ G+ RG++Y F K L+
Sbjct: 18 DVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGELELEY 77

Query: 73 A--RPGNARERIGRLLDFIADDELREPSRRGCLVA-NASLEMAGQDEMVAELVRRNFQRL 129
PG+ + +L + + + E RR + E G+ +V + R
Sbjct: 78 QAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQRNLCLES 137

Query: 130 EKALEQILAEGQANGEIDAGRSPRALARFIVTTVQGL 166
+EQ L + A R A + + GL
Sbjct: 138 YDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGL 174


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_041431  TCRTETA  44     5e-07    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 44.4 bits (105), Expect = 5e-07
Identities = 67/359 (18%), Positives = 116/359 (32%), Gaps = 39/359 (10%)

Query: 48 GDWGAVLGYFGYGYMVGALFGGMLADRYGPRKVWIVAGVTWSIFEIATAWAGDFGLAFLG 107
+G +L + A G L+DR+G R V +V+ ++ A A + ++G
Sbjct: 43 AHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIG 102

Query: 108 GSALAGFATIRILFGAAEGPAYSIINKTIANWATPRERGFVVGFGLLSTPLGALLTAPVA 167
RI+ G G ++ IA+ ER FG +S G + A
Sbjct: 103 ----------RIVAGIT-GATGAVAGAYIADITDGDER--ARHFGFMSACFGFGMVAGPV 149

Query: 168 VGLLSLTGSWRAMFYILGAAGLLVLVLFMRIFTDRPDTNPRVSPAELREIQAARAEQAAA 227
+G L S A F+ AA L L F P +
Sbjct: 150 LGGLMGGFSPHAPFFA--AAALNGLNFLTGCFLLPESHKGERRPLRREALNPLA------ 201

Query: 228 GQGGDAPALPWWSFFRSKTLVLNTLGYFSFNYVNFLLLTWTPKYLQDTFGYSLSSLWYMG 287
+ W ++ +F V + + +D F + +++ G
Sbjct: 202 -------SFRWARGMTVVAALMAV--FFIMQLVGQVPAALWVIFGEDRFHWDATTI---G 249

Query: 288 MIPWTGACVTVLLGGRISDMLARRTGNLKIARSWFAAGCLLATTLCFLLVSQAQSVFAVI 347
+ + L I+ +A R G + G + T LL + A
Sbjct: 250 ISLAAFGILHSLAQAMITGPVAARLGERRA----LMLGMIADGTGYILLAFATRGWMAFP 305

Query: 348 ALMTLANALNAMPNSVYWAVVIDTAPASRVGTFSGLMHFFANIASILAPTLTGYLAARH 406
++ LA+ MP A++ R G G + ++ SI+ P L + A
Sbjct: 306 IMVLLASGGIGMP--ALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAAS 362


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits         Score  E-value  Comments
NH44784_041451  IGASERPTASE  38     7e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 38.1 bits (88), Expect = 7e-05
Identities = 26/188 (13%), Positives = 53/188 (28%), Gaps = 12/188 (6%)

Query: 14 PLLFALPAAPAVAGMDCARARTPTEKTLCADAALYRLDDELGAAYARLRAAQQPGQNEAL 73
P A P+ + ++ + T + DA + A A+ NE
Sbjct: 1027 PPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVA 1086

Query: 74 RQAQRGWLKQRDACDSDAECLRQRYDTRLAELQAQQSRALAYRPDDIDRLALEDLRQAIE 133
+ Q A ++ E + + + ++ E ++ E
Sbjct: 1087 QSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQ--SETVQPQAE 1144

Query: 134 AARQSNPEFAVETVLAARSLKAEATAIRNERAADGDGPARLPAARPAGVTEDEWAAVLAS 193
AR+++P ++ E + N A + VTE S
Sbjct: 1145 PARENDPTVNIK----------EPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNS 1194

Query: 194 DLESDAEE 201
+E+
Sbjct: 1195 VVENPENT 1202


108  NH44784_042001  NH44784_042031  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_042001  -2      11       3.306057   General secretion pathway protein J
NH44784_042011  0       12       2.830943   General secretion pathway protein I
NH44784_042021  1       13       1.556154   General secretion pathway protein H
NH44784_042031  1       13       1.361263   General secretion pathway protein G
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042001  PilS_PF08805  31     0.003    PilS N terminal
		>PilS_PF08805#PilS N terminal

Length = 185

Score = 30.7 bits (69), Expect = 0.003
Identities = 17/89 (19%), Positives = 33/89 (37%), Gaps = 8/89 (8%)

Query: 4 RRVAPHAQAGFTLIEVLVALALMALVSLMAWRGLDSVSSARDW--IARQADDTDAIVRAL 61
RR G TL+EVL+ + ++ +++ A++ V S A +++L
Sbjct: 19 RRKKEQ-DKGATLMEVLLVVGVIVVLAASAYKLYSMVQSNIQSSNEQNNVLTVIANMKSL 77

Query: 62 GQMGRDVEMAYNGPSFAAPGIDARVFTSG 90
GR Y ++ + S
Sbjct: 78 KFQGR-----YTDSNYIKTLYAQGLLPSD 101


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042011  BCTERIALGSPG  44     7e-09    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 44.5 bits (105), Expect = 7e-09
Identities = 22/63 (34%), Positives = 38/63 (60%), Gaps = 3/63 (4%)

Query: 7 RRRGRQRGFTLIEVLVALAIIAVAMGAALRATGVMAANNRALQDKTLA-LLAAQNALTQL 65
R +QRGFTL+E++V + II V A+L +M +A + K ++ ++A +NAL
Sbjct: 2 RATDKQRGFTLLEIMVVIVIIGVL--ASLVVPNLMGNKEKADKQKAVSDIVALENALDMY 59

Query: 66 RLE 68
+L+
Sbjct: 60 KLD 62


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042021  BCTERIALGSPH  31     8e-04    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 31.1 bits (70), Expect = 8e-04
Identities = 13/72 (18%), Positives = 23/72 (31%), Gaps = 1/72 (1%)

Query: 1 MVVLVIVAIAASMVSLSVGKGA-DPLRDDAQRLLDAFTVAEGEARSDGRALRWSPASGGW 59
M++L+++ ++A MV L+ D R + G+ S W
Sbjct: 12 MLILLLMGVSAGMVLLAFPASRDDSAAQTLARFEAQLRFVQQRGLQTGQFFGVSVHPDRW 71

Query: 60 SFERAGRSDAPT 71
F D
Sbjct: 72 QFLVLEARDGAD 83


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042031  BCTERIALGSPG  171    3e-58    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 171 bits (436), Expect = 3e-58
Identities = 64/142 (45%), Positives = 90/142 (63%), Gaps = 8/142 (5%)

Query: 6 RGLARQQGFTLIEIMVVIVIMGILAALIVPRVLDRPDQARQVAARQDIGGIMQALKLYRL 65
R +Q+GFTL+EIMVVIVI+G+LA+L+VP ++ ++A + A DI + AL +Y+L
Sbjct: 2 RATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKL 61

Query: 66 DNGRYPTTAQGLRALVEKPEGA---SNWR--GYLDKLPNDPWGHPYQYLSPGVKGEIDVF 120
DN YPTT QGL +LVE P +N+ GY+ +LP DPWG+ Y ++PG G D+
Sbjct: 62 DNHHYPTTNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLVNPGEHGAYDLL 121

Query: 121 SFGADNKPGGEGGDADIGSWEL 142
S G D + G E DI +W L
Sbjct: 122 SAGPDGEMGTED---DITNWGL 140


109  NH44784_042161  NH44784_042201  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_042161  2       10       2.721238   Transcriptional regulator, TetR family
NH44784_042171  3       9        3.528633   Putative oxidoreductase YncB
NH44784_042181  1       8        3.390110   Transcriptional regulator, TetR family
NH44784_042191  -1      8        2.188969   Short-chain dehydrogenase/reductase SDR
NH44784_042201  -1      12       0.979475   Permeases of the major facilitator superfamily
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_042161  HTHTETR  71     2e-17    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 70.8 bits (173), Expect = 2e-17
Identities = 43/184 (23%), Positives = 65/184 (35%), Gaps = 18/184 (9%)

Query: 3 TPAAPSDVRQHILDVARPLLLARGYTAVGLAEVLAAAKVPKGSFYHYFASKDAFGVALLE 62
T + RQHILDVA L +G ++ L E+ AA V +G+ Y +F K + E
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 63 DYFKGYLAYMDGLMAGPGNG-----RERLVRYFHEWRETQGCESAHSRCLVVKLGAEVCD 117
A RE L+ + R L++++ C+
Sbjct: 65 LSESNIGELELEYQAKFPGDPLSVLREILIHVLES------TVTEERRRLLMEIIFHKCE 118

Query: 118 LSDPMRTVLQAGTRA---ITERIERGIEAGRADGSLPPA--PGALAPAILAA--NLYEHW 170
M V QA +RIE+ ++ LP A + L E+W
Sbjct: 119 FVGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENW 178

Query: 171 LGAS 174
L A
Sbjct: 179 LFAP 182


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_042181  HTHTETR  55     2e-11    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 54.6 bits (131), Expect = 2e-11
Identities = 20/97 (20%), Positives = 40/97 (41%), Gaps = 1/97 (1%)

Query: 1 MAGPAS-RKELSHDRIVEAAARAIRREGYAGVGVADVMKEAGLTHGGFYAHFPSRDAMLV 59
MA + + I++ A R ++G + + ++ K AG+T G Y HF + +
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 60 AAMERAGRDGAARMAQGMARRRAEGASPLRALVESYL 96
E + + + A+ + S LR ++ L
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVL 97


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042191  DHBDHDRGNASE  50     1e-09    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 50.0 bits (119), Expect = 1e-09
Identities = 47/207 (22%), Positives = 85/207 (41%), Gaps = 15/207 (7%)

Query: 3 LKNATVLITGANRGIGLAFAREALARGARKVYAGARDPASISLPGLQAIKLDVTSDE--- 59
++ ITGA +GIG A AR ++GA + A +P + ++
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGA-HIAAVDYNPEKLEKVVSSLKAEARHAEAFPA 64

Query: 60 DVAAAAALAKDVT----------LVINNAGIAATGGFLAGDSIESARRHLETNLLGPLRV 109
DV +AA+ + +++N AG+ G + S E N G
Sbjct: 65 DVRDSAAIDEITARIEREMGPIDILVNVAGVLRPG-LIHSLSDEEWEATFSVNSTGVFNA 123

Query: 110 AQAFAPVLAANGGGALLNVLSIASWINRPLLGVYGMSKSAAWALTNGLRHELREQGTQVL 169
+++ + + G+++ V S + + R + Y SK+AA T L EL E +
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCN 183

Query: 170 GLHMGFVDTDLTRGLDAPKSTPESVVR 196
+ G +TD+ L A ++ E V++
Sbjct: 184 IVSPGSTETDMQWSLWADENGAEQVIK 210


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_042201  TCRTETB  118    4e-31    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 118 bits (297), Expect = 4e-31
Identities = 93/423 (21%), Positives = 162/423 (38%), Gaps = 31/423 (7%)

Query: 25 IFWIASVAVFLVSLDATMLYAAFGAMRAGFPEASAADMSWVLNAYTVVYAAMLIPSGGLA 84
+ W+ + F L+ +L + + F A +WV A+ + ++ G L+
Sbjct: 16 LIWLC-ILSFFSVLNEMVLNVSLPDIANDF-NKPPASTNWVNTAFMLTFSIGTAVYGKLS 73

Query: 85 DTHGRKRMFMIGVLLFLAASAACGLAGT-VGWLIAARVAQAVGAALLTPASLSIVLAAF- 142
D G KR+ + G+++ S + + LI AR Q GAA PA + +V+A +
Sbjct: 74 DQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAF-PALVMVVVARYI 132

Query: 143 PAAQRSVAVSLWGAVAALAAAVGPSLGAFVVDAAGWPWAFYINLPLGALSLWFGAARLVE 202
P R A L G++ A+ VGP++G + W + + +P+ + L++
Sbjct: 133 PKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITI---ITVPFLMK 187

Query: 203 SVRPEARRR--VDGVGMALLMVGVGALALAIVQSESPHWSRGELWMVAAVGLVSLVAFVG 260
++ E R + D G+ L+ VG+ L V ++S + FV
Sbjct: 188 LLKKEVRIKGHFDIKGIILMSVGIVFFMLFTTSY---------SISFLIVSVLSFLIFVK 238

Query: 261 WARSTPHPLVDLALFRHRTYRYVNLATLSFGIAFAMM--FFAFFFYMTG-VWHYSLPRAG 317
R P VD L ++ + + L GI F + F + YM V S G
Sbjct: 239 HIRKVTDPFVDPGLGKNIPFM---IGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIG 295

Query: 318 LAVT-PGPLLVMPTAIITGKLAARMGHRPFLVGGALVYAASGLWFLLVPGAEPAYLAHWL 376
+ PG + V+ I G L R G L G + S L + ++ +
Sbjct: 296 SVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMT-II 354

Query: 377 PGLLLSGLGVGMVLPSLAGAAVNRLPPQHYAVGSAVNQATRQIGAVLGVALTVLLLGKGA 436
+L GL ++ + L Q G ++ T + G+A+ LL
Sbjct: 355 IVFVLGGLSFTKT--VISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPL 412

Query: 437 LSH 439
L
Sbjct: 413 LDQ 415


110  NH44784_042321  NH44784_042441  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_042321  -1      13       -1.465668  IcsA
NH44784_042331  0       22       -0.794945  Oligopeptidase A
NH44784_042341  1       21       -1.639425  Methylenetetrahydrofolate dehydrogenase (NADP+
NH44784_042351  2       20       -1.813370  Nitrogen regulation protein NR(I
NH44784_042361  1       18       -1.332038  two-component sensor kinase
NH44784_042371  3       20       -1.376123  Pyruvate dehydrogenase E1 component
NH44784_042381  2       15       0.587729   Dihydrolipoamide acetyltransferase component of
NH44784_042391  3       17       -0.384357  Dihydrolipoamide dehydrogenase of pyruvate
NH44784_042401  3       15       -0.745354  Dehydrogenases with different specificities
NH44784_042411  3       17       -1.697815  LysR family transcriptional regulator PA2877
NH44784_042421  2       18       -2.312763  hypothetical protein
NH44784_042431  1       19       -2.683144  hypothetical protein
NH44784_042441  1       21       -4.402214  Flagellar biosynthesis protein FliC
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042321  PRTACTNFAMLY  54     5e-09    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 53.9 bits (129), Expect = 5e-09
Identities = 58/231 (25%), Positives = 88/231 (38%), Gaps = 28/231 (12%)

Query: 1234 DDAWRLGVMAGYGSQRSKARGSMSGYQSRGDITGYSAGVYGTWYQDARTRAGLYVDGWAL 1293
W LG +AGY ++ +G G G Y T+ D+ G Y+D
Sbjct: 687 GGRWHLGGLAGY----TRGDRGFTGDGG-GHTDSVHVGGYATYIADS----GFYLDATLR 737

Query: 1294 FNRFDNTVKGDGLD----KEQYTSQGVTASVEAGYLFEAGSHTTGSGRENRFYVRPQAQV 1349
+R +N K G D K +Y + GV AS+EAG F + +++ PQA++
Sbjct: 738 ASRLENDFKVAGSDGYAVKGKYRTHGVGASLEAGRRFT---------HADGWFLEPQAEL 788

Query: 1350 LWSGVTAGDYTERSGTKVQGGSSGNVRTRLGARLSLASTRPSAGPGRTGQVEVFLDANWL 1409
G Y +G +V+ +V RLG R GR QV+ ++ A+ L
Sbjct: 789 AVFRAGGGAYRAANGLRVRDEGGSSVLGRLGLE---VGKRIELAGGR--QVQPYIKASVL 843

Query: 1410 HAAHPYG-VTMDDTRSVVQGGRNVVELRAGVEGRLTDRLSLSADMTQRQGS 1459
G V + + EL G+ L SL A +G
Sbjct: 844 QEFDGAGTVHTNGIAHRTELRGTRAELGLGMAAALGRGHSLYASYEYSKGP 894



Score = 31.6 bits (71), Expect = 0.032
Identities = 29/101 (28%), Positives = 31/101 (30%), Gaps = 8/101 (7%)

Query: 1056 PDDPVSPGAGTAGGGDAAGGGTAGGGDAAGGGTAGGGDAAGGDAAGGGTAGGGDA----- 1110
P V A+G A A T GG GG AAG G
Sbjct: 197 PPSRVVLRDTNVTAVPASGAPAAVSVLGASELTLDGGHITGGRAAGVAAMQGAVVHLQRA 256

Query: 1111 ---AGGGTAGGGTAGGGTAGGGTAGGGDAAGGGTAGGGTHG 1148
G AGG GG GG GG G G G +G
Sbjct: 257 TIRRGDAPAGGAVPGGAVPGGAVPGGFGPGGFGPVLDGWYG 297


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
NH44784_042351  HTHFIS  117    5e-33    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 117 bits (294), Expect = 5e-33
Identities = 38/150 (25%), Positives = 70/150 (46%)

Query: 8 STVFIVDDDEAVRDSLRWLLEANGYRVRAYASGESFLEDYDPSQIGVLIADVRMPGMSGL 67
+T+ + DDD A+R L L GY VR ++ + +++ DV MP +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 68 ELQEQLIARNAPLPIVFITGHGDVPMAVSTMKKGAVDFLEKPFNESDLREIVARMLEQAT 127
+L ++ LP++ ++ A+ +KGA D+L KPF+ ++L I+ R L +
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 128 QRVSKHQAQKDHEAMLARLTAREQQVLERI 157
+R SK + L +A Q++ +
Sbjct: 124 RRPSKLEDDSQDGMPLVGRSAAMQEIYRVL 153


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits         Score  E-value  Comments
NH44784_042381  IGASERPTASE  38     1e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 37.7 bits (87), Expect = 1e-04
Identities = 33/156 (21%), Positives = 50/156 (32%), Gaps = 15/156 (9%)

Query: 79 AAPAAKEEPKAEAPKQAAAAAPAAKAEAAAPAASSGPVEIEVPDIGDFKEVEVIEVMVAV 138
A P+ E AE KQ + + +A A + V E V
Sbjct: 1031 ATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKE----------AKSNVKANT 1080

Query: 139 GDTIKAEQSLITVESDKALMEIPASQGGVVKEVKVKVGDKVAKGSVVVVVEGSAPAAAAA 198
A+ T E+ + A+ KE K KV + V V +P +
Sbjct: 1081 QTNEVAQSGSETKETQTTETKETATVE---KEEKAKV-ETEKTQEVPKVTSQVSPKQEQS 1136

Query: 199 PAAKAEAASARSEAPAAKAEAPAAP-ATPAVGSRPA 233
+ +A AR P + P + T A +PA
Sbjct: 1137 ETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPA 1172


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
NH44784_042391  RTXTOXIND  34     0.002    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 34.0 bits (78), Expect = 0.002
Identities = 19/85 (22%), Positives = 32/85 (37%), Gaps = 8/85 (9%)

Query: 39 TVESDKASMEIPASTGGVVKSINVKVGDKVAEGSVVLEVEASDAAPAAKQAPKADAPKAE 98
+ S EI +VK I VK G+ V +G V+L++ A A AD K +
Sbjct: 89 KLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAE--------ADTLKTQ 140

Query: 99 APKAEAKPAAAAPAAATFKGSADAE 123
+ +A+ + +
Sbjct: 141 SSLLQARLEQTRYQILSRSIELNKL 165


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042401  DHBDHDRGNASE  70     2e-16    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 70.1 bits (171), Expect = 2e-16
Identities = 71/266 (26%), Positives = 119/266 (44%), Gaps = 19/266 (7%)

Query: 1 MADHSIKGKVVLIAGGAKNLGGLIARDLAQHGAKAVAIHYNSAASKADADATVAAIQAAG 60
M I+GK+ I G A+ +G +AR LA GA A+ YN + V++++A
Sbjct: 1 MNAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKL----EKVVSSLKAEA 56

Query: 61 AKAVALQADLTTAGAVEKLFADAVAAVGRPDIAINTVGKVLKKPLTEISEAEYDEMSAVN 120
A A AD+ + A++++ A +G DI +N G + + +S+ E++ +VN
Sbjct: 57 RHAEAFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVN 116

Query: 121 AKTAFFFLKEAGRHVND--NGKVCTLVTSLLGAFTPFYAAYAGTKAPVEHYTRAASKEFG 178
+ F + +++ D +G + T+ ++ G AAYA +KA +T+ E
Sbjct: 117 STGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELA 176

Query: 179 ARGISVTAVGPGPMDTPFFYPAEGADAVAYHKTAAALSPFSKTGL--------TDIEDVV 230
I V PG +T + + A +L F KTG+ +DI D V
Sbjct: 177 EYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETF-KTGIPLKKLAKPSDIADAV 235

Query: 231 PFIRHLVSD-GWWITGQTILINGGYT 255
F LVS IT + ++GG T
Sbjct: 236 LF---LVSGQAGHITMHNLCVDGGAT 258


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits         Score  E-value  Comments
NH44784_042431  IGASERPTASE  33     0.001    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 33.1 bits (75), Expect = 0.001
Identities = 18/80 (22%), Positives = 36/80 (45%), Gaps = 2/80 (2%)

Query: 137 APRTADAPPPQPGATDSVPPSAAPSPDDAAPTPHIQGQSAASAAPAAAAIPAREGAAPHA 196
P+ P+ +++V P A P+ ++ PT +I+ + + A PA+E + +
Sbjct: 1122 VPKVTSQVSPKQEQSETVQPQAEPAREN-DPTVNIKEPQSQTNTTADTEQPAKE-TSSNV 1179

Query: 197 DRPAMPKPTLRITEIPAENP 216
++P T+ ENP
Sbjct: 1180 EQPVTESTTVNTGNSVVENP 1199


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
NH44784_042441  FLAGELLIN  217    3e-66    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 217 bits (555), Expect = 3e-66
Identities = 237/564 (42%), Positives = 290/564 (51%), Gaps = 59/564 (10%)

Query: 2 AAVINTNYLSLVAQNNLNKSQSALGTAIERLSSGLRINSAKDDAAGMAIANRFTANVKGL 61
A VINTN LSL+ QNNLNKSQS+L +AIERLSSGLRINSAKDDAAG AIANRFT+N+KGL
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 TQAARNANDGISLAQTTEGAASEINTHLQRVRELSVQAANGSYSQEQLNSMQDEINQRLS 121
TQA+RNANDGIS+AQTTEGA +EIN +LQRVRELSVQA NG+ S L S+QDEI QRL
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 DIDRISQQTDFNGVKVLSGNAKPLTLQVGANDGETITLNLTEISVKTLGLDGFNVNGTGV 181
+IDR+S QT FNGVKVLS + + + +QVGANDGETIT++L +I VK+LGLDGFNVNG
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDNQ-MKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKE 179

Query: 182 TQNRTATVSDLQAAGGKAGAGAAANDWTVTTNHAAATAEQAFGKLENGNTVVVGGTTYTY 241
S G A A + A T A + G T
Sbjct: 180 ATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTD 239

Query: 242 DAANGNFTFQNTTKGATPADSTANLAKLASSLTPATGTSTGTYTNNASASTTFEVDATGN 301
DA N T +T + A A D G
Sbjct: 240 DAENNTAVDLFKTTKSTAGTAEAKAIAGAI----------------KGGKEGDTFDYKGV 283

Query: 302 LTIGGKAAYLAATGELSTNNPGGGAQATLTDV--LTTTSKAAAGTASISIGGKTFNSTGT 359
G++ST G T+ D+ AA +S ++ N T
Sbjct: 284 TFTIDTKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFT 343

Query: 360 VDEVTYTDTVAKDALLATFKAGAAGSEINLGQGITAAKLTFTTGTSTDTWVDGSGSFTRT 419
D+ T ++ L A
Sbjct: 344 FDDKTKNESAKLSDLEANNAVKG------------------------------------- 366

Query: 420 QKYDTTYTVDPNTGKATVKSGTGTGDYAPKVGATAYVNSSGKLTTETTSKGGKTSDPLKT 479
++ TV+ A T S + + + T++PL +
Sbjct: 367 ---ESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAAAAKKSTANPLAS 423

Query: 480 LDAAFTKLDKLTGELGAVQNRLESTIANLNNVVNNLSSARSRIQDADYATEVSNMSKAQI 539
+D+A +K+D + LGA+QNR +S I NL N V NL+SARSRI+DADYATEVSNMSKAQI
Sbjct: 424 IDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYATEVSNMSKAQI 483

Query: 540 LQQAGTSVLAQANQVPQTVLSLLR 563
LQQAGTSVLAQANQVPQ VLSLLR
Sbjct: 484 LQQAGTSVLAQANQVPQNVLSLLR 507


111  NH44784_042501  NH44784_042631  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_042501  0       9        -2.466733  Flagellar motor rotation protein MotA
NH44784_042511  2       8        -2.179754  Flagellar motor rotation protein MotB
NH44784_042521  1       10       -1.655803  Chemotaxis regulator-transmits chemoreceptor
NH44784_042531  2       10       -1.836233  Signal transduction histidine kinase CheA
NH44784_042541  2       11       -2.232316  Positive regulator of CheA protein activity
NH44784_042551  1       14       -0.884418  Methyl-accepting chemotaxis protein I (serine
NH44784_042561  -1      13       -0.758274  Chemotaxis protein methyltransferase CheR
NH44784_042571  1       10       -0.151298  Chemotaxis response regulator protein-glutamate
NH44784_042581  1       11       1.415812   Chemotaxis regulator-transmits chemoreceptor
NH44784_042591  1       10       1.739194   Chemotaxis response-phosphatase CheZ
NH44784_042601  1       10       1.941102   methyl-accepting chemotaxis sensory transducer
NH44784_042611  1       8        2.111555   Flagellar biosynthesis protein FlhB
NH44784_042621  2       8        2.221663   Flagellar biosynthesis protein FlhA
NH44784_042631  1       10       2.817483   Flagellar biosynthesis protein FlhF
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042501  ACRIFLAVINRP  37     1e-04    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 36.7 bits (85), Expect = 1e-04
Identities = 12/44 (27%), Positives = 23/44 (52%)

Query: 4 VIGFLVVIVSVIGSFVALGGHMGALYQPFELTLIFGAAFGAFLA 47
++G +V+ +V GG GA+Y+ F +T++ A +A
Sbjct: 442 LVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSVLVA 485


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits        Score  E-value  Comments
NH44784_042511  OMPADOMAIN  44     4e-07    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 44.2 bits (104), Expect = 4e-07
Identities = 34/126 (26%), Positives = 56/126 (44%), Gaps = 11/126 (8%)

Query: 167 FATGRAEVQPYMRDILRELGPVLNEL---PNKVSISGHTDASQYARGERAYSNWELSADR 223
F +A ++P + L +L L+ L V + G+TD G AY N LS R
Sbjct: 223 FNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDR----IGSDAY-NQGLSERR 277

Query: 224 ANASRQELVAGGMNESKVMRIQGLSSSMSLVKDDPYAAVNRRISLVVLNQSTQRRIENEN 283
A + L++ G+ K+ +G+ S + V + V +R +L+ + RR+E E
Sbjct: 278 AQSVVDYLISKGIPADKI-SARGMGES-NPVTGNTCDNVKQRAALIDC-LAPDRRVEIEV 334

Query: 284 AAAADV 289
DV
Sbjct: 335 KGIKDV 340


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
NH44784_042521  HTHFIS  81     3e-21    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 80.6 bits (199), Expect = 3e-21
Identities = 27/104 (25%), Positives = 46/104 (44%), Gaps = 2/104 (1%)

Query: 6 TLTGAGWKVLTAGNGQEALEVAKSHPVDLVVSDWNMPVMGGLQLIQGLREQEQYLDVPVL 65
L+ AG+ V N + DLVV+D MP L+ ++ + D+PVL
Sbjct: 22 ALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDLLPRIK--KARPDLPVL 79

Query: 66 VLTTEDDVDSKMAARDLGVCGWLSKPVDPDVLVELASELLDEQS 109
V++ ++ + + A + G +L KP D L+ + L E
Sbjct: 80 VMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_042531  PF06580  43     2e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 43.3 bits (102), Expect = 2e-06
Identities = 23/151 (15%), Positives = 51/151 (33%), Gaps = 52/151 (34%)

Query: 414 ELDKSLIERIIDPLT--HLVRNSLDHGIETPEKRVAAGKDPVGQLVLSAQHNGGNIVIEV 471
+++ ++++ + P+ LV N + HGI G+++L + G + +EV
Sbjct: 245 QINPAIMDVQVPPMLVQTLVENGIKHGIA--------QLPQGGKILLKGTKDNGTVTLEV 296

Query: 472 SDDGAGLNREKILKKAMAQGLPVNENSPDDEIWQLIFAPGFSTAEKVTDISGRGVGMDVV 531
+ G+ + G G+ V
Sbjct: 297 ENTGSLALKN--------------------------------------TKESTGTGLQNV 318

Query: 532 RRNIQDMGG---HVQLSCEPGNGTTTRIVLP 559
R +Q + G ++LS + G +++P
Sbjct: 319 RERLQMLYGTEAQIKLSEKQG-KVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
NH44784_042571  HTHFIS  53     6e-10    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 53.3 bits (128), Expect = 6e-10
Identities = 36/152 (23%), Positives = 64/152 (42%), Gaps = 25/152 (16%)

Query: 27 ARELIKKHNPDVLTLDVEMPRMDGLDFLEKLMRLRP-MPVVMVSSLTERGGEITLRALEL 85
I + D++ DV MP + D L ++ + RP +PV+++S+ ++A E
Sbjct: 39 LWRWIAAGDGDLVVTDVVMPDENAFDLLPRIKKARPDLPVLVMSAQNT--FMTAIKASEK 96

Query: 86 GAIDFVTKPKLGIRDGLLEYTEIIADKIRAASRAKLRAPSPHAPAAAPAPMLRRPLSSSE 145
GA D++ KP + TE+I RA + K R + S +
Sbjct: 97 GAYDYLPKP--------FDLTELIGIIGRALAEPKRRPS-------------KLEDDSQD 135

Query: 146 KLVIVGASTGGTEAIREVLQPLPPDSPAILIT 177
+ +VG S E R + + + D ++IT
Sbjct: 136 GMPLVGRSAAMQEIYRVLARLMQTDLT-LMIT 166


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
NH44784_042581  HTHFIS  85     2e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 84.9 bits (210), Expect = 2e-22
Identities = 31/107 (28%), Positives = 53/107 (49%), Gaps = 3/107 (2%)

Query: 5 GIKILVVDDFPTMRRIIRNLLKELGFENVDEAEDGAIGLEKLRNGGFQFVVSDWNMPNLD 64
G ILV DD +R ++ L G++ V + A + G VV+D MP+ +
Sbjct: 3 GATILVADDDAAIRTVLNQALSRAGYD-VRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 65 GLEMLKQIRADASLASLPVLMVTAEAKKENIVAAAQAGANGYVVKPF 111
++L +I+ LPVL+++A+ + A++ GA Y+ KPF
Sbjct: 62 AFDLLPRIKKA--RPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPF 106


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042601  CHANLCOLICIN  30     0.026    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 30.0 bits (67), Expect = 0.026
Identities = 53/323 (16%), Positives = 112/323 (34%), Gaps = 36/323 (11%)

Query: 62 AQLAQEQARAAEALAASGAQIAALSA-----------STSTHAREIADVSGQNLRAAEQA 110
A+ A AAEA A + A AL+ ++ +++ N A +
Sbjct: 67 AEQAARAKAAAEAQAKAKANRDALTQRLKDIVNEALRHNASRTPSATELAHANNAAMQAE 126

Query: 111 LAELSEVK-ERVDRMTREMA--AFTDVVGQLTDRARSVGDISKLIKDIALQTQLLALNAG 167
L K E R E A AF + + + R + + +K + + LA A
Sbjct: 127 DERLRLAKAEEKARKEAEAAEKAFQEAEQRRKEIEREKAETERQLKLAEAEEKRLA--AL 184

Query: 168 VEAARAGD-AGRGFAVVASEVGRLAERVNAATSDIG-----------RHTGEMLELVDST 215
E A+A + A + + SEV ++ + S + G+ EL ++
Sbjct: 185 SEEAKAVEIAQKKLSAAQSEVVKMDGEIKTLNSRLSSSIHARDAEMKTLAGKRNELAQAS 244

Query: 216 QRQTGTLREDVDASGAVLDKTR-----QDFQHFVRDFDNMNRQVGEVVQAIGEVDATNHG 270
+ S D + + + V + +V + ++ N
Sbjct: 245 AKYKELDELVKKLSPRANDPLQNRPFFEATRRRVGAGKIREEKQKQVTASETRINRINAD 304

Query: 271 MSQDVSRIAALSADVRERVASMS-GEIDRIRRQTESVQEVLSDMRTGNTAF-DRLSEAL- 327
++Q I+ +S + +A + E + + Q + + D +F L+E
Sbjct: 305 ITQIQKAISQVSNNRNAGIARVHEAEENLKKAQNNLLNSQIKDAVDATVSFYQTLTEKYG 364

Query: 328 DAFRAAATRLLEQARARGLDVFD 350
+ + A L ++++ + + +
Sbjct: 365 EKYSKMAQELADKSKGKKIGNVN 387


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042611  TYPE3IMSPROT  366    e-128    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 366 bits (940), Expect = e-128
Identities = 93/343 (27%), Positives = 176/343 (51%), Gaps = 2/343 (0%)

Query: 8 EKTEAASPRRLEKAREEGQIARSRELGTFMMLAVGVGAIWAGGGTIYKGLSGVLRNGLAF 67
EKTE +P+++ AR++GQ+A+S+E+ + ++ + ++ S ++ +
Sbjct: 4 EKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLML--IPA 61

Query: 68 DQRVVADPGVMVEQAVNGFGHALMVILPIFGLLAVIAVLSSVLLGGIVISGKPLSPNFSK 127
+Q + + N + P+ + A++A+ S V+ G +ISG+ + P+ K
Sbjct: 62 EQSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIKK 121

Query: 128 LSLFAGLKRMFSAQTVVELIKALAKAMLVGGVAVWILWRYHDDMLGLMHVAPSAALTKAL 187
++ G KR+FS +++VE +K++ K +L+ + I+ +L L
Sbjct: 122 INPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLLG 181

Query: 188 SLVAMCCAFIVASLLVIVLLDVPWQIWSHLKKLRMSKEDVRQEHKESEGDPHTKARIRQQ 247
++ +VI + D ++ + ++K+L+MSK+++++E+KE EG P K++ RQ
Sbjct: 182 QILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQF 241

Query: 248 QRQAARRRMMSEVPKADVVVTNPTHYAVALKYEEDKNGAPRVLAKGTGLIAAKIRELAAE 307
++ R M V ++ VVV NPTH A+ + Y+ + P V K T +R++A E
Sbjct: 242 HQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIAEE 301

Query: 308 HRIPTLEAPPLARALHQHVELGQEIPAELYTAVAEVLAWVFQL 350
+P L+ PLARAL+ + IPAE A AEVL W+ +
Sbjct: 302 EGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQ 344


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042631  PF03544       40     3e-05    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 40.0 bits (93), Expect = 3e-05
Identities = 20/77 (25%), Positives = 26/77 (33%), Gaps = 3/77 (3%)

Query: 144 PEQEVLQEPVARQAPPAAVP---PAAPVVAAPAPAAPVRVEMPVLPPARPAPSVAPSPMA 200
L+ P A Q PP V P + P APV +E P P V
Sbjct: 55 VAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQP 114

Query: 201 RMPDAPLRNGPAMPPAA 217
+ P+ + PA P
Sbjct: 115 KRDVKPVESRPASPFEN 131



Score = 38.8 bits (90), Expect = 6e-05
Identities = 28/132 (21%), Positives = 38/132 (28%), Gaps = 7/132 (5%)

Query: 89 PPPAMPASYGLGAHAAMAPPPHTAPIAPPAAPQPGYAVPSRSIAAYQSAYATPGVPEQEV 148
P PA P S + A A + PP P P P A + + +
Sbjct: 44 PAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEP----IPEPPKEAPVVIEKPKP 99

Query: 149 LQEPVARQAPPAAVPPAAPVVAAPAPAAPVRVEMPVLPPARP--APSVAPSPMARMPDAP 206
+P + P PA+P P P + A + P
Sbjct: 100 KPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASGPRA 159

Query: 207 L-RNGPAMPPAA 217
L RN P P A
Sbjct: 160 LSRNQPQYPARA 171



Score = 38.0 bits (88), Expect = 1e-04
Identities = 22/118 (18%), Positives = 33/118 (27%), Gaps = 5/118 (4%)

Query: 71 AGTPAPQAQPAPVAPRLQPPPAMPASYGLGAHAAMAPPPHTAPIAPPAAPQPGYAVPSRS 130
A PQA P P ++P P A + P P P R
Sbjct: 58 ADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRD 117

Query: 131 IAAYQSAYATPGVPEQEVLQEPVARQAPPAAVPPAAPVVAAPAPAAPVRVEMPVLPPA 188
+ +S A+P P + A + PV + + + P P
Sbjct: 118 VKPVESRPASPFENTA-----PARPTSSTATAATSKPVTSVASGPRALSRNQPQYPAR 170



Score = 35.7 bits (82), Expect = 5e-04
Identities = 22/138 (15%), Positives = 33/138 (23%), Gaps = 7/138 (5%)

Query: 73 TPAPQAQPAPVAPRLQPPPAMPASYGLGAHAAMAPPPHTAPIAPPAAPQPGYAVPSRSIA 132
PA VAP PP + P P PI P P +
Sbjct: 45 APAQPISVTMVAPADLEPPQAVQP---PPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKP 101

Query: 133 AYQSAYATPGVPEQEVLQEPVARQAPPAAVPPAAPVVAAPAPAAPVRVEMPVLPPARPAP 192
+ + ++ +R A P AP + A + + P
Sbjct: 102 KPKPKPVKKVEQPKRDVKPVESRPASPF--ENTAPARPTSSTATAATSKPVTSVASGPRA 159

Query: 193 SVAPSPMARMPDAPLRNG 210
P + P
Sbjct: 160 LSRNQP--QYPARAQALR 175


112  NH44784_042681  NH44784_043031  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_042681  4       14       -2.033990  Flagellar basal-body rod protein FlgC
NH44784_042691  4       13       -1.590858  Flagellar basal-body rod modification protein
NH44784_042701  2       13       -1.391229  Flagellar hook protein FlgE
NH44784_042711  1       12       -1.744601  Flagellar basal-body rod protein FlgF
NH44784_042721  2       12       -2.091452  Flagellar basal-body rod protein FlgG
NH44784_042731  1       10       -2.266215  Flagellar L-ring protein FlgH
NH44784_042741  0       9        -2.059469  Flagellar P-ring protein FlgI
NH44784_042751  0       10       -1.530027  Flagellar protein FlgJ [peptidoglycan
NH44784_042761  1       11       -1.439947  Flagellar hook-associated protein FlgK
NH44784_042771  2       16       -0.807102  Flagellar hook-associated protein FlgL
NH44784_042781  3       16       -0.593506  hypothetical protein
NH44784_042791  2       13       -0.433960  Methyl-accepting chemotaxis protein I (serine
NH44784_042801  1       13       -0.166695  Methyl-accepting chemotaxis protein I (serine
NH44784_042811  -1      11       -1.246167  putative lipoprotein
NH44784_042821  0       12       -1.162519  hypothetical protein
NH44784_042831  1       12       -1.006851  Aerotaxis sensor receptor protein
NH44784_042841  2       12       -0.382908  methyl-accepting chemotaxis protein
NH44784_042851  1       15       -1.372433  Flagellar biosynthesis protein FliR
NH44784_042861  0       15       -1.137423  Flagellar biosynthesis protein FliQ
NH44784_042871  -2      14       -0.822705  Flagellar biosynthesis protein FliP
NH44784_042881  0       15       1.827884   Flagellar biosynthesis protein FliQ
NH44784_042891  0       15       1.080802   Flagellar motor switch protein FliN
NH44784_042901  -1      15       1.545475   Flagellar motor switch protein FliM
NH44784_042911  -1      17       3.250466   Flagellar biosynthesis protein FliL
NH44784_042921  -1      18       2.324943   hypothetical protein
NH44784_042931  -2      20       2.016575   Flagellar hook-length control protein FliK
NH44784_042941  -2      21       0.541015   Flagellar protein FliJ
NH44784_042951  -2      18       0.694885   Flagellum-specific ATP synthase FliI
NH44784_042961  0       18       -0.102616  Flagellar assembly protein FliH
NH44784_042971  -1      16       -0.450000  Flagellar motor switch protein FliG
NH44784_042981  0       16       1.705359   Flagellar M-ring protein FliF
NH44784_042991  -1      16       1.894539   Flagellar hook-basal body complex protein FliE
NH44784_043001  0       17       1.795786   hypothetical protein
NH44784_043011  1       22       -0.354128  Flagellar biosynthesis protein FlhB
NH44784_043021  2       22       -0.824716  ortholog of Bordetella pertussis (BX470248)
NH44784_043031  1       19       -1.349920  ortholog of Bordetella pertussis (BX470248)
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042681  FLGHOOKAP1    30     0.004    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 29.5 bits (66), Expect = 0.004
Identities = 12/41 (29%), Positives = 20/41 (48%)

Query: 96 SMPNVDPVAETVNMIAASRSYQANVEVLNTAKSLMQKTLTI 136
S+ V+ E N+ + Y AN +VL TA ++ + I
Sbjct: 505 SISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042701  FLGHOOKAP1    43     2e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 43.0 bits (101), Expect = 2e-06
Identities = 24/102 (23%), Positives = 46/102 (45%), Gaps = 9/102 (8%)

Query: 386 QYTNGETSIVGTIVLAD-FANLQGLQPVGNNAWKETATSGQPILGQPGSNGLSKVVGQAT 444
+ ++ G D +A+L +GN TAT N ++++ Q
Sbjct: 453 DLQSNSKTVGGAKSFNDAYASLVSD--IGNK----TATLKT--SSATQGNVVTQLSNQQQ 504

Query: 445 ESSNVDMSKELVNMIIAQRTYQANSQTIKTQDEIMQVLMNLK 486
S V++ +E N+ Q+ Y AN+Q ++T + I L+N++
Sbjct: 505 SISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546



Score = 31.5 bits (71), Expect = 0.008
Identities = 15/52 (28%), Positives = 23/52 (44%), Gaps = 4/52 (7%)

Query: 1 MNAAAQNLDVIGNNIANSGTVGFKSSTASFAD----VYASSRVGLGTKVAAI 48
+NAA L+ NNI++ G+ T A + A VG G V+ +
Sbjct: 11 LNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSGV 62


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042721  FLGHOOKAP1    42     1e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 42.3 bits (99), Expect = 1e-06
Identities = 11/41 (26%), Positives = 22/41 (53%)

Query: 221 ETSNVNVAEELVNMITTQRAYEMNSKAVKTSDEMLARLTQL 261
S VN+ EE N+ Q+ Y N++ ++T++ + L +
Sbjct: 505 SISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545



Score = 39.6 bits (92), Expect = 8e-06
Identities = 15/78 (19%), Positives = 32/78 (41%), Gaps = 14/78 (17%)

Query: 4 SLWIAKTGLEGQQTSMDVISNNLANVSTNGFKRGRAVFQDLMYQTLRQPGAQVGDATQLP 63
+ A +GL Q +++ SNN+++ + G+ R + + L
Sbjct: 3 LINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI--------------MAQANSTLG 48

Query: 64 SGLQLGTGARVAATERIH 81
+G +G G V+ +R +
Sbjct: 49 AGGWVGNGVYVSGVQREY 66


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042731  FLGLRINGFLGH  197    3e-66    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 197 bits (503), Expect = 3e-66
Identities = 118/215 (54%), Positives = 148/215 (68%), Gaps = 6/215 (2%)

Query: 3 VAGCAMIPPEPIVTGPTTAAPPPPPMPAAQPTGSIY---QPTTYGNYPLFEDRRPRNVGD 59
+ GCA IP P+V G T+A P P P P A GSI+ QP YG PLFEDRRPRN+GD
Sbjct: 19 LTGCAWIPSTPLVQGATSAQPVPGPTPVA--NGSIFQSAQPINYGYQPLFEDRRPRNIGD 76

Query: 60 IVTIVLNEKTNASKNVATNTNRSGSASLGITAAPSFMDSW-ANANLNTDAKGGNVAQGKG 118
+TIVL E +ASK+ + N +R G + G P ++ NA + +A GGN GKG
Sbjct: 77 TLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNARADVEASGGNTFNGKG 136

Query: 119 DSTANNAFTGTITTTVVGVMSNGNLQVAGEKQIAINRGSEYVRFSGVVDPRSITGTNTVS 178
+ A+N F+GT+T TV V+ NGNL V GEKQIAIN+G+E++RFSGVV+PR+I+G+NTV
Sbjct: 137 GANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRFSGVVNPRTISGSNTVP 196

Query: 179 STQVADARIEYRSKGVMDEVQTMGWLQRFFLIASP 213
STQVADARIEY G ++E Q MGWLQRFFL SP
Sbjct: 197 STQVADARIEYVGNGYINEAQNMGWLQRFFLNLSP 231


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042741  FLGPRINGFLGI  381    e-134    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 381 bits (981), Expect = e-134
Identities = 154/357 (43%), Positives = 216/357 (60%), Gaps = 9/357 (2%)

Query: 14 LLAGASAAHAERLKDLASIQGVRGNQLIGYGLVVGLDGSGDQVRQTPFTQQSLTNMLSQL 73
L + A R+KD+AS+Q R NQLIGYGLVVGL G+GD +R +PFT+QS+ ML L
Sbjct: 19 LSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMRAMLQNL 78

Query: 74 GITVPAGSNMQLKNVAAVMVTTTLPAFARPGQTLDVVVSSMGNAKSLRGGTLLMTPLKGA 133
GIT G + KN+AAVMVT LP FA PG +DV VSS+G+A SLRGG L+MT L GA
Sbjct: 79 GITTQGGQS-NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLIMTSLSGA 137

Query: 134 DSQVYAIAQGNILVGGAGASAGGSSVQINQLNGGRISGGAIVERGVPTTFARDGYIYLEM 193
D Q+YA+AQG ++V G A +++ R+ GAI+ER +P+ F + L++
Sbjct: 138 DGQIYAVAQGALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDSVNLVLQL 197

Query: 194 NNTDFGTAQNVVNALNR----KFGQGTATALDGRVVQVRTPMEQASQARFLSQVEDLQVT 249
N DF TA V + +N ++G A D + + V+ P A R ++++E+L V
Sbjct: 198 RNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKP-RVADLTRLMAEIENLTVE 256

Query: 250 RAPTVAKVIINARTGSVVMNRTVMIEEAAVAHGNLSVIINRQNQVSQPDTPFTEGQTVVV 309
T AKV+IN RTG++V+ V I AV++G L+V + QV QP PF+ GQT V
Sbjct: 257 -TDTPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQP-APFSRGQTAVQ 314

Query: 310 PNTQIEVRQDNGSLQRVTTSANLADVVKGLNALGATPQDLLAILQAMKTAGALRAEL 366
P T I Q+ + + +L +V GLN++G ++AILQ +K+AGAL+AEL
Sbjct: 315 PQTDIMAMQEGSKV-AIVEGPDLRTLVAGLNSIGLKADGIIAILQGIKSAGALQAEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042751  FLGFLGJ       226    9e-75    Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 226 bits (578), Expect = 9e-75
Identities = 128/309 (41%), Positives = 186/309 (60%), Gaps = 13/309 (4%)

Query: 16 SVFDMGRLSDLKRDVTKDPANPGTEQQKQVAKQFEALFLQMMLKRMREATPKEGLFDSQQ 75
+ +D L++LK +DPA + VA+Q E +F+QMMLK MR+A PK+GLF S+
Sbjct: 11 AAWDAQSLNELKAKAGEDPAA----NIRPVARQVEGMFVQMMLKSMRDALPKDGLFSSEH 66

Query: 76 TQMLQSMADEQLALHL-ATPGIGLSQSILAQMQQGKPGDLPAEAVQRLGQGTDLDFQTGG 134
T++ SM D+Q+A + A G+GL++ ++ QM +P + + +T
Sbjct: 67 TRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPM----KFPLETVV 122

Query: 135 SRQVSALMDVMRNNRASDRALAAAEGAPEHVINFVSKMSRAATLASQQSGVPARLIMGQA 194
Q AL +++ + + P F++++S A LASQQSGVP LI+ QA
Sbjct: 123 RYQNQALSQLVQKAVPRN----YDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQA 178

Query: 195 ALESGWGQREIKHEDGRTSYNLFGIKAGPSWKGKVVNVLTTEYEDGVAKKVTQPFRAYSS 254
ALESGWGQR+I+ E+G SYNLFG+KA +WKG V + TTEYE+G AKKV FR YSS
Sbjct: 179 ALESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSS 238

Query: 255 YEESFADYARLIGNSPRYEAVTQARNEIEAARRIQSAGYATDPQYAQKLIGVMSQLRGAA 314
Y E+ +DY L+ +PRY AVT A + + A+ +Q AGYATDP YA+KL ++ Q++ +
Sbjct: 239 YLEALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSIS 298

Query: 315 AKVDISRQM 323
KV + M
Sbjct: 299 DKVSKTYSM 307


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042761  FLGHOOKAP1    327    e-108    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 327 bits (840), Expect = e-108
Identities = 208/556 (37%), Positives = 319/556 (57%), Gaps = 21/556 (3%)

Query: 7 ALGGLNASQAGLATTGHNINNATTVGYNRQRVMISTAGAQATTNGYIGRGVQVDTVVRSY 66
A+ GLNA+QA L T +NI++ GY RQ +++ A + G++G GV V V R Y
Sbjct: 7 AMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSGVQREY 66

Query: 67 DSFLYKQLVGAQGSGAQLQAQLDQVSQVNNLFADRTVGIAPGLTNFFTSMNTVASKPADP 126
D+F+ QL AQ + L A+ +Q+S+++N+ + T +A + +FFTS+ T+ S DP
Sbjct: 67 DAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLVSNAEDP 126

Query: 127 AARQDLLGKANSLTTQIRSAYQEMQNQRLGLNTQITTTVEQVNSYLTRINDLNSQISTAR 186
AARQ L+GK+ L Q ++ Q +++Q +N I +V+Q+N+Y +I LN QIS
Sbjct: 127 AARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLNDQISRLT 186

Query: 187 AKAGGNPPNDLLDQRDQAVSELNQLIGVT-TYEQGDKLSISLASGGQALLSGDTVYPLQA 245
G PN+LLDQRDQ VSELNQ++GV + + G +I++A+ G +L+ G T L A
Sbjct: 187 GVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMAN-GYSLVQGSTARQLAA 245

Query: 246 VSSAKDVSRTVIAYTLPAGSGGKTVAVELNDAEVTGGKLGGLLQFRATSLDFMQAQLGQM 305
V S+ D SRT +AY G +E+ + + G LGG+L FR+ LD + LGQ+
Sbjct: 246 VPSSADPSRTTVAYV-----DGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL 300

Query: 306 AVGLALSFNEQHRQGLDTAGNPGTDFFSIGKPQGVPNAGNKSGAQISGEFTNVNNINAKD 365
A+ A +FN QH+ G D G+ G DFF+IGKP + N NK I T+ + + A D
Sbjct: 301 ALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATD 360

Query: 366 YEISFDGTNYMVTRLPEGTQVYNGPATGTPPTATLDLN---AEMGVTLTIDSPPQAGDKW 422
Y+ISFD + VTR A+ T T T D N A G+ LT P D +
Sbjct: 361 YKISFDNNQWQVTR----------LASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSF 410

Query: 423 ALSPTRDAARDIKVIITDPEKVAAADTE-GGDANGKNALKLAQLQNTKVLGHGTMSITEM 481
L P DA ++ V+ITD K+A A E GD++ +N L LQ+ G S +
Sbjct: 411 TLKPVSDAIVNMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDA 470

Query: 482 FGQVVNTVGVQTAQIQSANTAQKNLIAQKTAAQQAVSGVNLNEEYVSLSLYQEQYQASAR 541
+ +V+ +G +TA +++++ Q N++ Q + QQ++SGVNL+EEY +L +Q+ Y A+A+
Sbjct: 471 YASLVSDIGNKTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQ 530

Query: 542 IIDVASTLFDTLLGLR 557
++ A+ +FD L+ +R
Sbjct: 531 VLQTANAIFDALINIR 546


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042771  FLAGELLIN     44     9e-07    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 43.9 bits (103), Expect = 9e-07
Identities = 60/363 (16%), Positives = 120/363 (33%), Gaps = 23/363 (6%)

Query: 1 MRLSTSMMYSNGLKGVLAQESDMNRLVEQVGSGKKFLTPADDPLSASLAINVAQTQSMNS 60
++T+ + + +S ++ +E++ SG + + DD +A AI T ++
Sbjct: 2 QVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDD--AAGQAIANRFTSNIKG 59

Query: 61 TYQLNRNT--AKTNLGQENNVLDSITTALADVRTRVVQAGNGTFADSDRQALSTALKNAR 118
Q +RN + L+ I L VR VQA NGT +DSD +++ ++
Sbjct: 60 LTQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRL 119

Query: 119 DALLGLANSTDGNGQYLFSGYQGGVVPYSQDTNGKI------VYSGATGERTVQVDQSRQ 172
+ + ++N T NG + S + + I + + G V+ ++
Sbjct: 120 EEIDRVSNQTQFNGVKVLSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKE 179

Query: 173 MSTSDIGSDIFNRANPGSQAYVSTAAQANTGTAQFSTVSVTPGSPNIGKDFRLQFESDPA 232
+ D+ S N + A + + + + T + P D
Sbjct: 180 ATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTV------------PDKV 227

Query: 233 TGNMGYRVITTDPNANPPVPPVTTPAPPAAPTVYTPDAAIDFGGVSVVIKGTPQNGDVID 292
N +TTD N + A T A G G
Sbjct: 228 YVNAANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFD-YKGVTFT 286

Query: 293 VQSVQSADVDMFNTLDSLIKTLDSPIAGDPVALAKLNNELATANKKLSTNYDNVQTVAAS 352
+ + D + + + + +A A ++ ++K + T+ N Q
Sbjct: 287 IDTKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDD 346

Query: 353 VGA 355

Sbjct: 347 KTK 349


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042791  PF03544       33     0.002    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 33.4 bits (76), Expect = 0.002
Identities = 26/122 (21%), Positives = 35/122 (28%), Gaps = 10/122 (8%)

Query: 518 DVIEVPARQLAQQHSAPRVAAAQTE----AQVSAARTAKPAAQAAAKPEPAAEP----DH 569
VIE+PA AQ S VA A E Q +P + PEP E +
Sbjct: 39 QVIELPAP--AQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEK 96

Query: 570 SPVTPPAAPAPRLAQPVRARPTANSGATAARPLRRPVVKTTDATDVKPVKPAPPAARRAP 629
P P P R + A P ++ P + +
Sbjct: 97 PKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASG 156

Query: 630 PA 631
P
Sbjct: 157 PR 158



Score = 31.1 bits (70), Expect = 0.009
Identities = 18/99 (18%), Positives = 27/99 (27%), Gaps = 1/99 (1%)

Query: 534 PRVAAAQTEAQVSAARTAKPAAQAAAKPEPAAEPDHSPVTPPAAPAPRLAQPVRARPTAN 593
P A + V+ A P A PEP EP+ P P P + +P
Sbjct: 44 PAPAQPISVTMVAPADLEPPQAVQP-PPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPK 102

Query: 594 SGATAARPLRRPVVKTTDATDVKPVKPAPPAARRAPPAD 632
+ + +P A R +
Sbjct: 103 PKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSST 141


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042801  PF05272       35     7e-04    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 35.4 bits (81), Expect = 7e-04
Identities = 13/54 (24%), Positives = 18/54 (33%)

Query: 510 ARLTHSAPRAKTATAEAASAARPPRRPAPRAAANDAKPAPAATRRQAPADDDWE 563
AR + + TA A A PP++ P A A P +W
Sbjct: 380 ARALLADVSSPTAAAGGAGGGEPPKKRDPSAGAGTDPGGPGGGDDGEDPFGEWL 433


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042851  TYPE3IMRPROT  167    3e-53    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 167 bits (425), Expect = 3e-53
Identities = 121/252 (48%), Positives = 179/252 (71%), Gaps = 1/252 (0%)

Query: 1 MFSFTIEQLNGWIGQFLWPFVRILALVGTAPLFSESTVPVKVKIGLAFVLAVAVSPALDP 60
M T EQ W+ + WP +R+LAL+ TAP+ SE +VP +VK+GLA ++ A++P+L P
Sbjct: 1 MLQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSL-P 59

Query: 61 APPIAPGSFAGLWMVMQQVLIGIAMGFTMRLVFAAVQTAGEFVGLQMGLSFASFFDPSTG 120
A + SF LW+ +QQ+LIGIA+GFTM+ FAAV+TAGE +GLQMGLSFA+F DP++
Sbjct: 60 ANDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASH 119

Query: 121 ANTAVLSRLFNIVAMLTFLALDGHLLVLAALVRSFDTLPVAAIQLHQNGWGVVVEWGKTV 180
N VL+R+ +++A+L FL +GHL +++ LV +F TLP+ L+ N + + + G +
Sbjct: 120 LNMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLI 179

Query: 181 FVSGLLLALPLICALLTINLAMGILNRAAPQLSVFSVGFPVSLIVGVVLLTVVLPNSGPF 240
F++GL+LALPLI LLT+NLA+G+LNR APQLS+F +GFP++L VG+ L+ ++P PF
Sbjct: 180 FLNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPF 239

Query: 241 LESLFESGLSAM 252
E LF + +
Sbjct: 240 CEHLFSEIFNLL 251


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042861  TYPE3IMQPROT  60     8e-16    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 60.2 bits (146), Expect = 8e-16
Identities = 26/78 (33%), Positives = 46/78 (58%)

Query: 4 ETVMTMTYQAMKIALAMAGPLLVITLIVGLVISIFQAATQINEMTLSFIPKLLAMCGVLV 63
+ ++ +A+ + L ++G ++ I+GL++ +FQ TQ+ E TL F KLL +C L
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 LLGPWLIGIMVDYIRQLI 81
LL W +++ Y RQ+I
Sbjct: 62 LLSGWYGEVLLSYGRQVI 79


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042871  FLGBIOSNFLIP  287    e-100    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 287 bits (735), Expect = e-100
Identities = 162/239 (67%), Positives = 190/239 (79%), Gaps = 2/239 (0%)

Query: 21 ALLGLALFPAGVVAQATLPALTATPGPGGSETYSLSMQTLLLMTSLSFLPAALLMMTGFT 80
A + L L AQ LP +T+ P PGG +++SL +QTL+ +TSL+F+PA LLMMT FT
Sbjct: 8 APVLLWLITPLAFAQ--LPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLMMTSFT 65

Query: 81 RIIIVLGLLRSAMGTAMSPPNHVLIGLALFLTFYTMSPVFDKIYTDAYKPLSEGSIQFEA 140
RIIIV GLLR+A+GT +PPN VL+GLALFLTF+ MSPV DKIY DAY+P SE I +
Sbjct: 66 RIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEKISMQE 125

Query: 141 AVERAAGPLRTFMLHQTRENDLALFANLAKQPALEDPSQVPLRILVPAFITSELKTAFQI 200
A+E+ A PLR FML QTRE DL LFA LA L+ P VP+RIL+PA++TSELKTAFQI
Sbjct: 126 ALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELKTAFQI 185

Query: 201 GFTIFIPFLIIDLVVASVLMALGMMMVPPVTVALPFKLMLFVLADGWNLLMGSLAQSFY 259
GFTIFIPFLIIDLV+ASVLMALGMMMVPP T+ALPFKLMLFVL DGW LL+GSLAQSFY
Sbjct: 186 GFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLAQSFY 244


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042891  FLGMOTORFLIN  143    2e-46    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 143 bits (361), Expect = 2e-46
Identities = 89/157 (56%), Positives = 105/157 (66%), Gaps = 28/157 (17%)

Query: 28 NPPTQADGLKPAQDD-WAAAMAEQTAATPPATAAPAPAAAAPAAPAPAAPAAQSAAQSVF 86
NP + G A DD WA A+ EQ A T +SAA +VF
Sbjct: 6 NPSDENTG---ALDDLWADALNEQKATT-----------------------TKSAADAVF 39

Query: 87 KPLAGSA-AGNGTDIDLIMDVPVQLTVELGRTRLTIKNLLQLGQGSVVELDGLAGEPMDI 145
+ L G +G DIDLIMD+PV+LTVELGRTR+TIK LL+L QGSVV LDGLAGEP+DI
Sbjct: 40 QQLGGGDVSGAMQDIDLIMDIPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDI 99

Query: 146 FVNGYLIAQGEVVVVEDKYGIRLTDIITPSERINRLN 182
+NGYLIAQGEVVVV DKYG+R+TDIITPSER+ RL+
Sbjct: 100 LINGYLIAQGEVVVVADKYGVRITDIITPSERMRRLS 136


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042901  FLGMOTORFLIM  272    4e-92    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 272 bits (697), Expect = 4e-92
Identities = 85/282 (30%), Positives = 147/282 (52%), Gaps = 5/282 (1%)

Query: 7 LSQDEVDALLAGV-TGESDSK-KESAGPNDGARAYDLSSPDRVVRRRMQTLELINERFAR 64
LSQDE+D LL + +G++ + YD PD+ + +M+TL L++E FAR
Sbjct: 5 LSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHETFAR 64

Query: 65 QMRSVLLNFMRRSADITVGSIKIQKYADFERNLPVPSNLNMVQMKPLRGTALFTYDPNLV 124
+ L +R + V S+ Y +F R++P PS L ++ M PL+G A+ DP++
Sbjct: 65 LTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEVDPSIT 124

Query: 125 FLVIDSLFGGDGRYHTRVEGRDFTTTEQRIIRRLLNLTLESYGKSWDPVYPIEFDYVRSE 184
F +ID LFGG G+ RD T E ++ ++ L + +SW V + + E
Sbjct: 125 FSIIDRLFGGTGQAAKVQ--RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRLGQIE 182

Query: 185 MHTKFASITGNNEVVVVTSFHIEFGATGGDLNICLPYSMIEPVRDLL-TRPLQETTLEEV 243
+ +FA I +E+VV+ + + G G +N C+PY IEP+ L ++ +
Sbjct: 183 TNPQFAQIVPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSVRRSS 242

Query: 244 DQRWSQQLSRQVRSADIDVVAEFARIPSSIRELMRMKVGDIL 285
++ L ++ + D+DVVAE + S+R+++ ++VGDI+
Sbjct: 243 TTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDII 284


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042931  FLGHOOKFLIK   59     1e-11    Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 59.1 bits (142), Expect = 1e-11
Identities = 79/313 (25%), Positives = 121/313 (38%), Gaps = 6/313 (1%)

Query: 147 AAVVQAALTANDAIATPRTPVPATAQQAEAAAALASRDAQVAQPVVTAAVKAGSAPAVEA 206
+ +V A AN I TP +Q+ + ++ +A K A +
Sbjct: 66 SDIVSDAQQANLLIPVDETPPVINDEQSTSTPLTTAQTMALAAVADKNTTKDEKADDLNE 125

Query: 207 APVKTAPAAELPRPAPEAAAPRAANPRVAAPEAAPALPQHARTEE-DPAAPLPSTAQQFQ 265
+ A P + P P P L +E+ A P + Q
Sbjct: 126 DVTASLSALFAMLPGFDNTPKVTDAPSTVLPTEKPTLFTKLTSEQLTTAQPDDAPGTPAQ 185

Query: 266 LPQHATNAQDHLAQAIAAASQRQQPA-PVSNPASVAVPNVPVMLQVATPVGGTHWGTELG 324
A+ I+ S A P+ P ++ P+G W L
Sbjct: 186 PLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEWQQSLS 245

Query: 325 QQMVVMSNNVRQGTQTAELRLDPPDLGPLRVSINLADGVASASFVSAHASVRQAIETAIP 384
Q + + + RQG Q+AELRL P DLG +++S+ + D A VS H VR A+E A+P
Sbjct: 246 QHISLFT---RQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAALP 302

Query: 385 QLQQALAQAGISLGQTSV-GEQAAQQEFAQHNGGGSQRQGGGNGAAVADAAIDGQAATVT 443
L+ LA++GI LGQ+++ GE + Q+ A SQR A D ++
Sbjct: 303 VLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVSLQ 362

Query: 444 AARNANALVDTFA 456
N+ VD FA
Sbjct: 363 GRVTGNSGVDIFA 375


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042941  FLGFLIJ       68     3e-17    Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 67.5 bits (164), Expect = 3e-17
Identities = 48/145 (33%), Positives = 82/145 (56%)

Query: 1 MPSQLPLDMLINLAKESTDEAARLLGRLSTERTNAERQLGMLQDYRQDYLERLQQAMTTG 60
M L L +LA++ ++AARLLG + AE QL ML DY+ +Y L M+ G
Sbjct: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60

Query: 61 MSASDCHNYQRFIGTLDDAIGQQQNVLRQADENLAKGRLYWQQEKRKLNSYDALAQRELR 120
++++ NYQ+FI TL+ AI Q + L Q + + W+++K++L ++ L +R+
Sbjct: 61 ITSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120

Query: 121 AQAVVESRREQRANDEYSARLVQRQ 145
A + E+R +Q+ DE++ R R+
Sbjct: 121 AALLAENRLDQKKMDEFAQRAAMRK 145


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042961  FLGFLIH       89     1e-23    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 89.1 bits (220), Expect = 1e-23
Identities = 63/217 (29%), Positives = 108/217 (49%), Gaps = 8/217 (3%)

Query: 21 WQMLSFDE--PPPVEVEP--EPEPDPGPDPEVVMAQLRAQAIAEGREEGYAQGHAAGLEA 76
W+ + D+ PP E P EPE + E + Q AQ + E+GY G A G +
Sbjct: 7 WKTWTPDDLAPPQAEFVPIVEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGIAEGRQQ 66

Query: 77 GRAEGHEAGLAAGREAGYAEGLAQAREQGAQEAQHLHALVQACAESVGSLEEKMGQSLLT 136
G +G++ GLA G E +GLA+A+ Q A + LV ++ +L+ + L+
Sbjct: 67 GHKQGYQEGLAQGLE----QGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRLMQ 122

Query: 137 LALDIAQQVVRTTLAEQPETVASAVREVLHINPTAGGQMRLWANPADIELIRLHLADELK 196
+AL+ A+QV+ T + ++++L P G+ +L +P D++ + L L
Sbjct: 123 MALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGATLS 182

Query: 197 EGHWRVLADESIARGGCRAETPFGDIDATLQTRWRRV 233
WR+ D ++ GGC+ GD+DA++ TRW+ +
Sbjct: 183 LHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQEL 219


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042971  FLGMOTORFLIG  301    e-103    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 301 bits (772), Expect = e-103
Identities = 113/333 (33%), Positives = 188/333 (56%), Gaps = 2/333 (0%)

Query: 2 KNDSKPMDGMTRSAVLMMSLGEDAAAEVFKYLTAREVQQVGAAMASLKQVTRNDVAVVLE 61
D + G ++A+L++S+G + +++VFKYL+ E++ + +A L+ +T VL
Sbjct: 9 ILDVSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLL 68

Query: 62 EFRQEADQFMAVTLGSDDYIRTVLTKALGSDRAAGLIEDILEAGEGGSGIDALNWLDPHT 121
EF++ + G DY R +L K+LG+ +A +I + L + + + DP
Sbjct: 69 EFKELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINN-LGSALQSRPFEFVRRADPAN 127

Query: 122 VAELIGDEHPQIIATILVHLERDRAAGVLALLTDRLRNDVMLRIATFGGVQPAALSELTE 181
+ I EHPQ IA IL +L+ +A+ +L+ L ++ +V RIA P + E+
Sbjct: 128 ILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVER 187

Query: 182 VLNSVLAGQGA-KRSKMGGVRTAAEILNMMNSAEEETVVASLRERDNDLAQKIIDEMFVF 240
VL LA + + GGV EI+NM + E+ ++ SL E D +LA++I +MFVF
Sbjct: 188 VLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVF 247

Query: 241 DNLLDIEDRAIQLILKEIDNDTLMVALKGAQEELRAKFLRNMSSRAAEMLREDLEAQGPI 300
++++ ++DR+IQ +L+EID L ALK ++ K +NMS RAA ML+ED+E GP
Sbjct: 248 EDIVLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPT 307

Query: 301 RMSKVETEQKKILQIARRLAESGQIVLGNSGDD 333
R VE Q+KI+ + R+L E G+IV+ G++
Sbjct: 308 RRKDVEESQQKIVSLIRKLEEQGEIVISRGGEE 340


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042981  FLGMRINGFLIF  459    e-159    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 459 bits (1183), Expect = e-159
Identities = 245/555 (44%), Positives = 345/555 (62%), Gaps = 25/555 (4%)

Query: 2 LEKIRALPKPVLLGAAAALVALIVAVAMWSSEPKYKVLFSNLDDRDGGAIVTALGTMNVP 61
L ++RA P+ L+ A +A VA++VA+ +W+ P Y+ LFSNL D+DGGAIV L MN+P
Sbjct: 16 LNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQDGGAIVAQLTQMNIP 75

Query: 62 YKYNDTGTALLVPADRVYDARLQLASQGLPRGGSVGFELMDNARFGASQFAEQINYQRGL 121
Y++ + A+ VPAD+V++ RL+LA QGLP+GG+VGFEL+D +FG SQF+EQ+NYQR L
Sbjct: 76 YRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQFSEQVNYQRAL 135

Query: 122 EGELARSIESMHTVQHARVHLAMPRQSLFVRERQPPTASVLLNVYPGRSLSDAQVSAISW 181
EGELAR+IE++ V+ ARVHLAMP+ SLFVRE++ P+ASV + + PGR+L + Q+SA+
Sbjct: 136 EGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRALDEGQISAVVH 195

Query: 182 LVASSVPELTAENVSIVDQNGRLLSAPTGEGRGMDADQMRMVRETEQRTVERILTILNPL 241
LV+S+V L NV++VDQ+G LL+ GR ++ Q++ + E R RI IL+P+
Sbjct: 196 LVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDLNDAQLKFANDVESRIQRRIEAILSPI 255

Query: 242 VGPGNVHAQASAEMDFSRREETSEVYRPNQEPGQAAIRSQQTSESNQRGINPAQGVPGAL 301
VG GNVHAQ +A++DF+ +E+T E Y PN + +A +RS+Q + S Q G GVPGAL
Sbjct: 256 VGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVGAGYPGGVPGAL 315

Query: 302 SNQPPANAQAPIANPPQTQPPRPGQPQQQGQQQQQQTTTGAGAQGAGGTVSDRRDATTNY 361
SNQP +APIA PP QQ Q QT+T + AG S +R+ T+NY
Sbjct: 316 SNQPAPPNEAPIATPPT---------NQQNAQNTPQTSTSTNSNSAGPR-STQRNETSNY 365

Query: 362 EVDRTISHIKQPVGMLKRLSVAVVVNYVRDKDGEPQALPPEELSKLTNLVREAMGYSESR 421
EVDRTI H K VG ++RLSVAVVVNY DG+P L +++ ++ +L REAMG+S+ R
Sbjct: 366 EVDRTIRHTKMNVGDIERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMGFSDKR 425

Query: 422 GDSLNLVNSQFNDGPPAV---PMWRDPEMISLFKTILAWLVGGVLALWLYRKVRRSVGDY 478
GD+LN+VNS F+ P W+ I WL+ V+A L+RK R
Sbjct: 426 GDTLNVVNSPFSAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVRPQLTR 485

Query: 479 LYPPVDP-----------EVAEAERIEAQREAQDLARAKETDRYQDNLERARTMANKDPR 527
E A R+ + Q RA + + +R R M++ DPR
Sbjct: 486 RVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQR-RANQRLGAEVMSQRIREMSDNDPR 544

Query: 528 AVAMVLRTWMTKDEK 542
VA+V+R WM+ D +
Sbjct: 545 VVALVIRQWMSNDHE 559


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_042991  FLGHOOKFLIE   60     2e-15    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 60.1 bits (145), Expect = 2e-15
Identities = 45/107 (42%), Positives = 64/107 (59%), Gaps = 4/107 (3%)

Query: 4 SGLSGIESMLQQMRVAVTKAETGNLAAGEAVAQPDGFAAELQRSIRRVTSAQNAATAQAK 63
S + GIE ++ Q++ A ++ FA +L ++ R++ Q AA QA+
Sbjct: 1 SAIQGIEGVISQLQATAMSARAQESLPQPTIS----FAGQLHAALDRISDTQTAARTQAE 56

Query: 64 AFELGAPDVSLNDVMIDLQKASIGFQTAVQVRNKLVAAYKEISSMAV 110
F LG P V+LNDVM D+QKAS+ Q +QVRNKLVAAY+E+ SM V
Sbjct: 57 KFTLGEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_043011  TYPE3IMSPROT  69     2e-17    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein

family signature.
Length = 354

Score = 69.4 bits (170), Expect = 2e-17
Identities = 23/78 (29%), Positives = 34/78 (43%), Gaps = 2/78 (2%)

Query: 18 AVAISYD-DKDAAPRVVAKGYGQLADTIVRAAEENGLYVHESRELV-GLLMQVDLDAHIP 75
A+ I Y + P V K T+ + AEE G+ + + L L +D +IP
Sbjct: 268 AIGILYKRGETPLPLVTFKYTDAQVQTVRKIAEEEGVPILQRIPLARALYWDALVDHYIP 327

Query: 76 PQLYVAVAELLAWLYRLE 93
+ A AE+L WL R
Sbjct: 328 AEQIEATAEVLRWLERQN 345


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_043031  PF05616  30     0.005    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 30.5 bits (68), Expect = 0.005
Identities = 20/70 (28%), Positives = 25/70 (35%)

Query: 129 QPASQPGAPGQPTAAPAAGQAAAPAPAPPPRRHWPRLPRPVPTPPPRHRPPTGPRAQPDR 188
+P PG+ P A P + A PA P + RP P P P P P
Sbjct: 312 RPDLTPGSAEAPNAQPLPEVSPAENPANNPAPNENPGTRPNPEPDPDLNPDANPDTDGQP 371

Query: 189 AKAPPAPGTP 198
P +P P
Sbjct: 372 GTRPDSPAVP 381


113  NH44784_046211  NH44784_046411  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_046211  2       18       1.531770   Permeases of the major facilitator superfamily
NH44784_046221  0       16       0.797281   Lactoylglutathione lyase
NH44784_046231  0       13       0.626632   hypothetical protein
NH44784_046241  0       12       0.333059   FIG00433089: hypothetical protein
NH44784_046251  -1      11       -0.124563  two-component sensor histidine kinase
NH44784_046261  -1      12       -0.445631  hypothetical protein
NH44784_046271  -1      11       -0.522282  RNA polymerase sigma-54 factor RpoN
NH44784_046281  0       11       0.644529   Response regulator of zinc sigma-54-dependent
NH44784_046291  1       14       1.518421   Putative transcriptional regulatory protein
NH44784_046301  1       14       2.263422   hypothetical protein
NH44784_046311  0       14       3.088206   hypothetical protein
NH44784_046321  0       12       3.022783   hypothetical protein
NH44784_046331  1       15       2.551701   hypothetical protein
NH44784_046341  0       17       1.601844   hypothetical protein
NH44784_046351  0       15       1.275020   3-oxoacyl-[acyl-carrier protein] reductase
NH44784_046361  0       15       0.912463   hypothetical protein
NH44784_046371  2       15       1.904336   hypothetical protein
NH44784_046381  2       16       1.773381   DNA topoisomerase IB (poxvirus type
NH44784_046391  1       13       1.831730   ThiJ/PfpI family protein
NH44784_046401  1       11       3.013488   hypothetical protein
NH44784_046411  0       11       2.624733   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_046211  TCRTETB  51     3e-09    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 51.4 bits (123), Expect = 3e-09
Identities = 71/371 (19%), Positives = 115/371 (30%), Gaps = 62/371 (16%)

Query: 56 MPLFTQAFGVSPAESSLVLSLCTGLLAIAIFLVGLFSQALPRKRIMALSLLASAILGTAA 115
+P F PA ++ V + +I + G S L KR++ ++ +
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 116 AIAPDWHSLLAL-RALQGVAMGGVPAVAMAYLAEEVEPDTLGFAMGLYISGSAFGGLSGR 174
+ + SLL + R +QG PA+ M +A + + G A GL S A G G
Sbjct: 97 FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGP 156

Query: 175 VITGLVSDHFGWRAAL------------------------GTLGLLGVV---AAVLFIWL 207
I G+++ + W L G + G++ ++F L
Sbjct: 157 AIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFML 216

Query: 208 LPASRRFT------------PRRPRGWAGVAGDVRAGIGAHLRNGPLCGLFAMGGLLMGA 255
S + + R D G +N P GG++ G
Sbjct: 217 FTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGLG-----KNIPFMIGVLCGGIIFGT 271

Query: 256 FVTVYNYVGFRLLAP-----PFSLSQTFIG--FIFVVYLVGIFASTYFGRLADRHGRGAM 308
GF + P LS IG IF + I G L DR G +
Sbjct: 272 VA------GFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYV 325

Query: 309 LAAATGLALAGLLLT--LSDALPLLIAGIVVFTFGFFAAHAVASGWVGQMS--RGYKALA 364
L L L + + I+VF G + + S +
Sbjct: 326 LNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAG 385

Query: 365 ASLYLLVYYVG 375
SL ++
Sbjct: 386 MSLLNFTSFLS 396


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
NH44784_046251  HTHFIS  75     6e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 74.9 bits (184), Expect = 6e-16
Identities = 29/116 (25%), Positives = 50/116 (43%), Gaps = 1/116 (0%)

Query: 875 VMVVEDEANVREMVCECLAELGMRVLAAEDGDTGLERLRANADIDLLISDVGLPGLNGRQ 934
++V +D+A +R ++ + L+ G V + T + A D DL+++DV +P N
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAA-GDGDLVVTDVVMPDENAFD 64

Query: 935 LADAARATHPGLKVLLMTGYAESAARGSGFLDTGMELIVKPFALDALAQRVQRMLE 990
L + P L VL+M+ + + KPF L L + R L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits        Score  E-value  Comments
NH44784_046261  GPOSANCHOR  33     2e-04    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 33.1 bits (75), Expect = 2e-04
Identities = 15/70 (21%), Positives = 23/70 (32%), Gaps = 5/70 (7%)

Query: 40 AVQPNRPVPPGQAAPAPNATPSTQSMPPNVDKRGTTPSSDKRSGEMPNRNSEANDNVPRT 99
A Q + + TP + V +G P + + PN+N R
Sbjct: 449 AKQAEELAKLRAGKASDSQTPDAKPGNKAVPGKGQAPQAGTK----PNQNKAPMKETKRQ 504

Query: 100 -PAAGATPPP 108
P+ G T P
Sbjct: 505 LPSTGETANP 514


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
NH44784_046281  HTHFIS  328    e-112    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 328 bits (843), Expect = e-112
Identities = 126/344 (36%), Positives = 178/344 (51%), Gaps = 32/344 (9%)

Query: 2 DGIMNVSGLSPVMQKLATQVRLAAETDASVFIVGESGAGKEVVARAIHEASERRDHPFIA 61
M + G S MQ++ + +TD ++ I GESG GKE+VARA+H+ +RR+ PF+A
Sbjct: 134 QDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVA 193

Query: 62 VNCGAISGSLAHAELFGHEKGSFTGAVAQNAGYFESASGGTLFLDEVTEMPPDMQVHFLR 121
+N AI L +ELFGHEKG+FTGA ++ G FE A GGTLFLDE+ +MP D Q LR
Sbjct: 194 INMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLR 253

Query: 122 VLETGTYQRVGGTDLLRANARILCASNRDPAASVADGRLRQDFLHRLLVIPLRVPPLRER 181
VL+ G Y VGG +R++ RI+ A+N+D S+ G R+D +RL V+PLR+PPLR+R
Sbjct: 254 VLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDR 313

Query: 182 DGDAVTLARHFLNELNARHGTSKTFSPRMLQAIGRHEWPGNVRELRNAVQRAYILCDEEL 241
D L RHF+ + K F L+ + H WPGNVREL N V+R L +++
Sbjct: 314 AEDIPDLVRHFVQQAEKEGLDVKRFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDV 373

Query: 242 ----------DAELPRARAAPH---------------------AEFKDGA-LSFPVGTAL 269
+E+P + A F D S L
Sbjct: 374 ITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVL 433

Query: 270 GNAQRDFILATLAHHGGDKRLTAETLGVSLKTLYNRLDVYEKDV 313
+ ILA L G++ A+ LG++ TL ++ V
Sbjct: 434 AEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVSV 477


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
NH44784_046291  HTHFIS  491    e-174    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 491 bits (1266), Expect = e-174
Identities = 175/471 (37%), Positives = 255/471 (54%), Gaps = 34/471 (7%)

Query: 2 PHLLIVDDDDAIRETLAELGRDSGFTVALAASVKDALIQLERQAPDLVLTDVRLPGGSGM 61
+L+ DDD AIR L + +G+ V + ++ + DLV+TDV +P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 DIFKNVAVA--SAEVVVMTGHGTVDNAVQALRLGATDYLVKPICMERLNGILARIMSNTG 119
D+ + A V+VM+ T A++A GA DYL KP + L GI+ R ++
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 120 RELPGTPFEAPGRFGKMYGSSAPMRELYRQIGRVAPTGVTVLLVGESGTGKELAAHAIHE 179
R + + G SA M+E+YR + R+ T +T+++ GESGTGKEL A A+H+
Sbjct: 124 RRPSKLE-DDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHD 182

Query: 180 LSTRRQRPFIAVNCGAISPHLIESEMFGHERGSFTGADRQHKGYFERADGGTLFLDEVTE 239
RR PF+A+N AI LIESE+FGHE+G+FTGA + G FE+A+GGTLFLDE+ +
Sbjct: 183 YGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGD 242

Query: 240 MPLDLQVKLLRVLETGQFMRVGTNREIGCDIRIVAATNRNPEQAVQEGKLREDLYYRLNV 299
MP+D Q +LLRVL+ G++ VG I D+RIVAATN++ +Q++ +G REDLYYRLNV
Sbjct: 243 MPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNV 302

Query: 300 FPLELPPLRERGEDILLLADRFLQAQNEETGRSKAFSARAAAALSQYEWPGNVRELKNFV 359
PL LPPLR+R EDI L F+Q +E K F A + + WPGNVREL+N V
Sbjct: 303 VPLRLPPLRDRAEDIPDLVRHFVQQAEKEGLDVKRFDQEALELMKAHPWPGNVRELENLV 362

Query: 360 RRAFIMAEGDELDADLLAPQVS--------PSGDVSGGQVSVPV---------------- 395
RR + D + +++ ++ G +S+
Sbjct: 363 RRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDA 422

Query: 396 -------GETLAEADRRLILATLERCKGVKKQAAAVLGISPKTLYNRLEEY 439
LAE + LILA L +G + +AA +LG++ TL ++ E
Sbjct: 423 LPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIREL 473


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_046301  INFPOTNTIATR  26     0.041    Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 26.1 bits (57), Expect = 0.041
Identities = 32/119 (26%), Positives = 47/119 (39%), Gaps = 17/119 (14%)

Query: 8 LSLTLIPAAALGLSALALPAAAQTASGASSAVR-------------SNEQID-NQYKMDK 53
+ + L+ AA +GL+ AA S + + N+ ID N + K
Sbjct: 1 MKMKLVTAAIMGLAMSTAMAATDATSLTTDKDKLSYSIGADLGKNFKNQGIDINPDVLAK 60

Query: 54 KQCDAMKGNQKDVCEQQAQATRDKARADAKAGKEKAEANHDAA--KARNEADYKVGKEK 110
D M G Q + E+Q + K + D A K AE N A KA+ +A K K
Sbjct: 61 GMQDGMSGAQLILTEEQMKDVLSKFQKDLMA-KRSAEFNKKAEENKAKGDAFLSANKSK 118


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
NH44784_046341  RTXTOXIND  32     0.002    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 31.7 bits (72), Expect = 0.002
Identities = 18/165 (10%), Positives = 41/165 (24%), Gaps = 10/165 (6%)

Query: 16 FMLLAAPLAARAAQLSPADSHFLQAAAAADAFQLQAARLASERAASTEVRAFAEKMRSSY 75
F ++ R L F +L + +ER E +
Sbjct: 176 FQNVSEEEVLRLTSL--IKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 76 QERDAGLRQLARAKRITLPAKAEPGDQRA-----LEAMAGKTGEAFDSLYIEQVALQAHE 130
+ R L + I A E ++ L + + I +
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESE--ILSAKEEYQL 291

Query: 131 KSERLYRTAAEQSADAQVRDFALRHQPVLADQLALARALKRDPAA 175
++ ++ L + ++ ++ R P +
Sbjct: 292 VTQLFKNEILDKLRQTTDNIGLLTLELAKNEER-QQASVIRAPVS 335


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_046351  DHBDHDRGNASE  97     7e-26    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 97.0 bits (241), Expect = 7e-26
Identities = 71/237 (29%), Positives = 111/237 (46%), Gaps = 13/237 (5%)

Query: 50 LAGMAAIVTGGDSGIGRAVSVLFAREGADVAIVYLNEHEDARETERAVQAEGRRCLLIAG 109
+ G A +TG GIG AV+ A +GA +A V N E + +++AE R
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNP-EKLEKVVSSLKAEARHAEAFPA 64

Query: 110 DVRESAFCDRAVEQAAQAFGRLDVLVNNAAYQQHDEGLSAISDEKWDKTLRTNIYGYFYM 169
DVR+SA D + + G +D+LVN A + ++SDE+W+ T N G F
Sbjct: 65 DVRDSAAIDEITARIEREMGPIDILVN-VAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 170 ARAILPHLRA--GAAIINTGSVTGLRGSGGLLDYSTSKGAIHAFTRSLASNLASQGIRVN 227
+R++ ++ +I+ GS + Y++SK A FT+ L LA IR N
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCN 183

Query: 228 AVAPGPVWTPLNP---ADRDADE--IPSFGQDTELGRP----AQPEEISPAYVFLAA 275
V+PG T + AD + E I + + G P A+P +I+ A +FL +
Sbjct: 184 IVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVS 240


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_046401  PRTACTNFAMLY  27     0.003    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 26.9 bits (59), Expect = 0.003
Identities = 11/37 (29%), Positives = 13/37 (35%)

Query: 10 PPKPEPIPPGSPPGDLPVDPDTDEPEVDLPPLEPPPA 46
PP P+P P P P P + P P A
Sbjct: 572 PPAPKPAPQPGPQPPQPPQPQPEAPAPQPPAGRELSA 608


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_046411  PF06580  24     0.043    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 23.7 bits (51), Expect = 0.043
Identities = 8/44 (18%), Positives = 21/44 (47%)

Query: 1 MSTILLIVLILLLIGAVPAWPYSRGWGYYPSGLLGIVLIVLIVL 44
+ I + ++ L+L A ++ +GW G + + ++ V+
Sbjct: 43 IFNIAISLMGLVLTHAYRSFIKRQGWLKLNMGQIILRVLPACVV 86


114  NH44784_046881  NH44784_046931  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_046881  1       17       4.575913   General secretion pathway protein H
NH44784_046891  2       15       3.876204   General secretion pathway protein I
NH44784_046901  3       16       3.231405   General secretion pathway protein J
NH44784_046911  6       17       1.883329   hypothetical protein
NH44784_046921  3       14       0.217169   COGs COG3558
NH44784_046931  4       15       -0.011346  Transcriptional regulator, TetR family
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_046881  BCTERIALGSPH  47     4e-09    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H

signature.
Length = 170

Score = 46.9 bits (111), Expect = 4e-09
Identities = 18/73 (24%), Positives = 34/73 (46%)

Query: 24 QRGFTLIELMVVLVIVGVATAALGLSIRSDPARQLRDDAQRLVERLAAAQSEVRIDGRAI 83
QRGFTL+E+M++L+++GV+ + L+ + R +L Q G+
Sbjct: 3 QRGFTLLEMMLILLLMGVSAGMVLLAFPASRDDSAAQTLARFEAQLRFVQQRGLQTGQFF 62

Query: 84 AWQADADGYRFVR 96
D ++F+
Sbjct: 63 GVSVHPDRWQFLV 75


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_046891  PilS_PF08805  31     0.001    PilS N terminal
		>PilS_PF08805#PilS N terminal

Length = 185

Score = 31.1 bits (70), Expect = 0.001
Identities = 10/38 (26%), Positives = 20/38 (52%)

Query: 18 AGFTLLEILIALAIVSVALAAVMRTTGMLTTNNGVLRE 55
G TL+E+L+ + ++ V A+ + M+ +N E
Sbjct: 26 KGATLMEVLLVVGVIVVLAASAYKLYSMVQSNIQSSNE 63


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_046901  PF05946  32     0.001    Toxin-coregulated pilus subunit TcpA
		>PF05946#Toxin-coregulated pilus subunit TcpA

Length = 199

Score = 32.2 bits (73), Expect = 0.001
Identities = 23/73 (31%), Positives = 37/73 (50%), Gaps = 10/73 (13%)

Query: 10 TLIEVVIAIMIMAVIS----LISWRAIDSVALTSRRLDQHTEEALALQRAFDQFERDIGA 65
TL+EV+I + IM V+S ++ RAIDS +T ++ +Q A Q R +G
Sbjct: 2 TLLEVIIVLGIMGVVSAGVVTLAQRAIDSQNMTKAAQSLNS-----IQVALTQTYRGLG- 55

Query: 66 RSPDLAESAAPAA 78
P A++ A +
Sbjct: 56 NYPATADATAASK 68


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_046931  HTHTETR  69     8e-17    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 68.9 bits (168), Expect = 8e-17
Identities = 35/199 (17%), Positives = 65/199 (32%), Gaps = 10/199 (5%)

Query: 1 MPSDTALPSPDTRQRLLQATERLVYAGGIHATGMDLIVRTSGVARKQVYRLYPNKDALVA 60
M T + +TRQ +L RL G+ +T + I + +GV R +Y + +K L +
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 AALRARDERWMQWFVAASSR-----AQAPRARLLAMFDALREWFGTDDFRGCAF--LNAA 113
+ + + ++ R L+ + ++ F
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 114 GEIGDEASPILAVAREHKARLLEYVRTLTRAAALP---DPDEAAAQLLVLIDGAIAVALV 170
GE+ + E R+ + ++ A LP AA + I G + L
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 171 TRDPAIADSAGRAAAALLG 189
R A+L
Sbjct: 181 APQSFDLKKEARDYVAILL 199


115  NH44784_049361  NH44784_049411  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias   Product
NH44784_049361  2       13       3.490926  HlyD family secretion protein
NH44784_049371  2       12       2.981003  Cobalt-zinc-cadmium resistance protein CzcA;
NH44784_049381  0       10       2.924467  RND multidrug efflux transporter; Acriflavin
NH44784_049391  -1      9        3.131125  Transcriptional regulator, LysR family
NH44784_049401  -1      12       1.888894  Acetylornithine
NH44784_049411  -2      12       0.865643  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
NH44784_049361  RTXTOXIND  40     1e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 39.8 bits (93), Expect = 1e-05
Identities = 16/140 (11%), Positives = 42/140 (30%), Gaps = 1/140 (0%)

Query: 35 VAVSVAAARSGPLPRDLHALGTITPLARV-VLRSQVDGQLQRLHYTEGQAVRRGQLLAEI 93
+ ++ + G + A G +T R ++ + ++ + EG++VR+G +L ++
Sbjct: 68 LVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKL 127

Query: 94 DPRPYQAALAAAEGELAHVEALLGNAEIDLRRYRQLARQEAVAGQQLDTAEAQARSYAAQ 153
+A + L +I R E +
Sbjct: 128 TALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRL 187

Query: 154 RQRLAAAVADARRLLALTRI 173
+ + + +
Sbjct: 188 TSLIKEQFSTWQNQKYQKEL 207



Score = 29.4 bits (66), Expect = 0.024
Identities = 18/64 (28%), Positives = 28/64 (43%), Gaps = 11/64 (17%)

Query: 60 LARVVLRSQVDGQLQRLH-YTEGQAVRRGQLLAEIDPRPYQAALAAAEGELAHVEALLGN 118
V+R+ V ++Q+L +TEG V + L I P E + V AL+ N
Sbjct: 325 QQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVP----------EDDTLEVTALVQN 374

Query: 119 AEID 122
+I
Sbjct: 375 KDIG 378


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_049371  ACRIFLAVINRP  791    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 791 bits (2045), Expect = 0.0
Identities = 291/1038 (28%), Positives = 482/1038 (46%), Gaps = 33/1038 (3%)

Query: 3 LSRPFILRPVATSFLMLALLLSGILAWRMLPVAALPQVDYPIIQVTTPYPGASPDVTARA 62
++ FI RP+ L + L+++G LA LPVA P + P + V+ YPGA
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 63 VTAPLERRFGQIPGLKQMSSSSGS-GISVITLQFSLDVSLGVAEQEVQAAISASGSLLPS 121
VT +E+ I L MSS+S S G ITL F +A+ +VQ + + LLP
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 122 DLPTPPVYRKVNPADVPILTLAVTSDSLPLPQ--VYDLVDTRMTQRLAQLSGVGMVSLAG 179
++ + + ++ SD+ Q + D V + + L++L+GVG V L G
Sbjct: 121 EVQQQGISV-EKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG 179

Query: 180 GQRPAVRVQANPMALAARGLQLSDLQEAIAKANSNQPKGSFDGPV------RSVIMDAND 233
Q A+R+ + L L D+ + N G G + + A
Sbjct: 180 AQY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQT 238

Query: 234 QLQSAQEYRDLIV-AWRNGAPVRLGEVATVEDGAEDRYLAAWVDRQPAVLVNIQRQPGAN 292
+ ++ +E+ + + +G+ VRL +VA VE G E+ + A ++ +PA + I+ GAN
Sbjct: 239 RFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGAN 298

Query: 293 VIAVADQVKALLPQLTASLPAAVQVRVLTDRTESIRASVRGVQWELAFAVGLVVLVTFLF 352
+ A +KA L +L P ++V D T ++ S+ V L A+ LV LV +LF
Sbjct: 299 ALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLF 358

Query: 353 LRNLPATLIPSLAVPLSLIGTFGVMHLAGFSTNNLTLMALTIGAGFVVDDAIVMLENIAR 412
L+N+ ATLIP++AVP+ L+GTF ++ G+S N LT+ + + G +VDDAIV++EN+ R
Sbjct: 359 LQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVER 418

Query: 413 Y-REQGHSPMAAALKGAGQIGFTLVSLTLSLIAVLIPLLFMEDVVGRLFREFAITLAVAI 471
E P A K QI LV + + L AV IP+ F G ++R+F+IT+ A+
Sbjct: 419 VMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAM 478

Query: 472 LISLAVSLTLTPMMCARLLPP----HAERPPGL-------LDRLQARYAGWLDVTLRHQR 520
+S+ V+L LTP +CA LL P H E G D Y + L
Sbjct: 479 ALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTG 538

Query: 521 LTLLAMLATVALTALLYLAVPKGFFPAQDGGVLQGVTQSAQSTSFEAMSQRQQALAQSLL 580
LL VA +L+L +P F P +D GV + Q + E + + L
Sbjct: 539 RYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYL 598

Query: 581 QD--PDVASLSSFIGIDGMNATLNTGRLLVNLKPWSERGAPLADIMARLDARARQVRGI- 637
++ +V S+ + G N G V+LKPW ER A + ++ I
Sbjct: 599 KNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIR 658

Query: 638 SLYLQPVQELNIEDRVSRGQYQFTLTS---PDAALLARWSRALAQRLDAAP-QLADISSD 693
++ P I + + + F L L + L P L + +
Sbjct: 659 DGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 694 LQGGGRQAYLEVSRDAAARLGLTMDDVAQALYNAFGQRQVATLFTQSNQYRVVLEVDPRL 753
Q LEV ++ A LG+++ D+ Q + A G V + ++ ++ D +
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 754 AASPEALERIHLKTADGQPIPLAALATVSERAVPLAVNHLSQFPAVNFSFNLPPGGSLGA 813
PE ++++++++A+G+ +P +A T + + P++ PG S G
Sbjct: 779 RMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGD 838

Query: 814 AIAAIEAARQDIAMPPSVELRLQGAAAAFEASLSNTLWLMLAAVVTMYLVLGMLYESAIH 873
A+A +E +P + G + S + L+ + V ++L L LYES
Sbjct: 839 AMALME--NLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSI 896

Query: 874 PVTILSTLPSATVGALLALLLTGRPLDLIAVIGIILLIGLVKKNGIMMVDFALESERSRG 933
PV+++ +P VG LLA L + D+ ++G++ IGL KN I++V+FA + G
Sbjct: 897 PVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEG 956

Query: 934 LAPREAIREAALLRLRPILMTTLAALFGALPLMLATGSGAELRQPLGWVMVGGLLVSQVL 993
EA A +RLRPILMT+LA + G LPL ++ G+G+ + +G ++GG++ + +L
Sbjct: 957 KGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLL 1016

Query: 994 TLFTTPAVYLFFHRLGAG 1011
+F P ++ R G
Sbjct: 1017 AIFFVPVFFVVIRRCFKG 1034


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_049381  ACRIFLAVINRP  733    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 733 bits (1893), Expect = 0.0
Identities = 291/1035 (28%), Positives = 492/1035 (47%), Gaps = 29/1035 (2%)

Query: 1 MIRALLHRPIACIFLALALTLLGAVAWRLLPVAPLPQVDFPTIEVRAELPGASPESMAST 60
M + RPI LA+ L + GA+A LPVA P + P + V A PGA +++ T
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VAAPLERALGGIAGVSAMSSSS-NQGATRVLLQFALDRDINAAARDVQAAINAARAELPS 119
V +E+ + GI + MSS+S + G+ + L F D + A VQ + A LP
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 120 GMPGNPTYRKVNPSQAPIMALALSS--PTRPAGRLYDLGATVLAQKLSQIVGVGEVTLGG 177
+ S + +M S P + D A+ + LS++ GVG+V L G
Sbjct: 121 EVQ-QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG 179

Query: 178 SSLPAVRVQVNPNALAHYGVALDDLRQSIADAAPMGPQGQLDSA----GQRWEVGTPGQP 233
+ A+R+ ++ + L Y + D+ + GQL GQ+ Q
Sbjct: 180 AQY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQT 238

Query: 234 RM--ADDYNGLIVR-HQDGATIRLAQIARVSDSVENRYSSGFHNHDPAVVLTISRQPGAN 290
R +++ + +R + DG+ +RL +ARV EN N PA L I GAN
Sbjct: 239 RFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGAN 298

Query: 291 IIETIAAINQALPGLRALMPADVDLTVVLDRSPGIHATLREAHVTLGLAVGLVILVVWLF 350
++T AI L L+ P + + D +P + ++ E TL A+ LV LV++LF
Sbjct: 299 ALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLF 358

Query: 351 LGSARAAAIPSVAIPVCLVATFAVMYLWGFSLNNLSLMALIVAAGLVVDDAIVVLENISR 410
L + RA IP++A+PV L+ TFA++ +G+S+N L++ +++A GL+VDDAIVV+EN+ R
Sbjct: 359 LQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVER 418

Query: 411 HL-ERGLGPRQAALRGVREVGFTLVAMTLALSVVFVSILFMGGLVERLFREFSITLVAAI 469
+ E L P++A + + ++ LV + + LS VF+ + F GG ++R+FSIT+V+A+
Sbjct: 419 VMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAM 478

Query: 470 LISLVVSVAIIPSLCARWLRP-PAEAQTRPSRLRAVFARVHQW----YGASLARVLGHAR 524
+S++V++ + P+LCA L+P AE F Y S+ ++LG
Sbjct: 479 ALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTG 538

Query: 525 LTLLLLAAVVALNAYLYAQAPKGFLPQQDTGQLMGFVRGDDGFSFQVMQPKIDVYRQLVL 584
LL+ A +VA L+ + P FLP++D G + ++ G + + Q +D L
Sbjct: 539 RYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYL 598

Query: 585 KHPAVQD-----VIGYNGGSLGISNSLFLIRLKPASER---RESSAQVIDWLRANAPPVP 636
K+ V G++ + + + LKP ER S+ VI + +
Sbjct: 599 KNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIR 658

Query: 637 GGMFFLNVDQDLRMPGGFGNSGDHELAIMASDVPALRQWSRRI-SRAMQDIPELRDVDAV 695
G + G + AL Q ++ A Q L V
Sbjct: 659 DGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 696 GDAATQQVVINIDRAAARRLGVDMRTIASVLGNSFSQRQVATLYDPMNQYRVVLELDPRY 755
G T Q + +D+ A+ LGV + I + + V D ++ ++ D ++
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 756 TEDPEVLERVQVVAADGNRVPLSAFSTYEHGLVNDRVFHDGLFAAVGVGFSLAEGVSLQQ 815
PE ++++ V +A+G VP SAF+T + R+ ++ + A G S
Sbjct: 779 RMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGD 838

Query: 816 GLAAIDAAMARLMVPAHIQTRLGGDARSFQQSLQDQPWLILGVLVAIYLVLGILYESPLH 875
+A ++ ++L PA I G + + S P L+ V ++L L LYES
Sbjct: 839 AMALMENLASKL--PAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSI 896

Query: 876 PLTILSTLPSAGVGALLALRLADIEFSLIALLGLFLLVGVVMKNAILMIDFALGLERREG 935
P++++ +P VG LLA L + + + ++GL +G+ KNAIL+++FA L +EG
Sbjct: 897 PVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEG 956

Query: 936 LTPEQAIHRAAMLRLRPIVMTNLAGLLGALPLVLGMGEGSELRRPLGVTIVGGLMISQFL 995
+A A +RLRPI+MT+LA +LG LPL + G GS + +G+ ++GG++ + L
Sbjct: 957 KGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLL 1016

Query: 996 TLYTTPIVYLALERL 1010
++ P+ ++ + R
Sbjct: 1017 AIFFVPVFFVVIRRC 1031


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_049411  PRTACTNFAMLY  29     0.038    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 28.9 bits (64), Expect = 0.038
Identities = 24/80 (30%), Positives = 37/80 (46%), Gaps = 8/80 (10%)

Query: 15 ALALQATLAAPAAHAAYPEQAIKLVVPFTPG----GATDAVARLLAN---RLSGKLGQAV 67
A+AL A AAPAAHA + Q+I G G+ R + ++SG+ Q +
Sbjct: 20 AMALGALGAAPAAHADWNNQSIVKTGERQHGIHIQGSDPGGVRTASGTTIKVSGRQAQGI 79

Query: 68 IVENRPGASTVIGANAVAQA 87
++EN P A +V +
Sbjct: 80 LLEN-PAAELQFRNGSVTSS 98


116  NH44784_049621  NH44784_049691  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias   Product
NH44784_049621  -1      11       1.597160  autotransporter
NH44784_049631  -2      11       0.722279  hypothetical protein
NH44784_049641  -1      8        1.253441  Non-specific DNA-binding protein Dps /
NH44784_049651  -1      10       2.526749  Xaa-Pro aminopeptidase
NH44784_049661  1       11       2.793386  Transcriptional regulator YbiH, TetR family
NH44784_049671  1       13       2.680835  Transcriptional regulator, TetR family
NH44784_049681  0       14       1.324559  FIG01111792: hypothetical protein
NH44784_049691  1       13       1.510218  Major facilitator superfamily MFS_1
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
NH44784_049621  PERTACTIN  32     0.010    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 32.4 bits (73), Expect = 0.010
Identities = 54/183 (29%), Positives = 66/183 (36%), Gaps = 27/183 (14%)

Query: 640 GASPLWAQAFGNWSTLGGDGNASSVKQSAGGFFVGGDGAV---GGGWRLGGALGYT-GSH 695
A W + F L + Q GF +G D AV GG W LGG GYT G
Sbjct: 657 DAGGAWGRGFAQRQQLD-NRAGRRFDQKVAGFELGADHAVAVAGGRWHLGGLAGYTRGDR 715

Query: 696 SSIADAASRTDVDSYTATLFGGRNFAAGPGHVRFMAGAAYTWHDIDTKRNVAAGSLNQQL 755
D TD + GG +A + F A ++ V AGS +
Sbjct: 716 GFTGDGGGHTD-----SVHVGG--YATYIANSGFYLDATLRASRLENDFKV-AGSDGYAV 767

Query: 756 KSSYRASSTQVFTELGYHLPLNDAYAIEPFAGL-----------AWNQLRTRDFEESGGS 804
K YR V E G D + +EP A L A N LR RD GGS
Sbjct: 768 KGKYRTHGVGVSLEAGRRFAHADGWFLEPQAELAVFRVGGGAYRAANGLRVRD---EGGS 824

Query: 805 AAL 807
+ L
Sbjct: 825 SVL 827


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits         Score  E-value  Comments
NH44784_049641  HELNAPAPROT  93     1e-26    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family

signature.
Length = 153

Score = 93.0 bits (231), Expect = 1e-26
Identities = 41/137 (29%), Positives = 67/137 (48%), Gaps = 1/137 (0%)

Query: 35 ISAVLNQVLADVFALYLKTKNFHWHVSGPHFRDYHLLLDEQGDQLFAMTDPIAERIRKIG 94
+ LN L++ F LY K FHW+V GPHF H +E D D IAER+ IG
Sbjct: 13 VENSLNTQLSNWFLLYSKLHRFHWYVKGPHFFTLHEKFEELYDHAAETVDTIAERLLAIG 72

Query: 95 GTTLRSIGHISRSQRIADNDADYVQPLDMLAELREDNKTLAATLREAHNVTDEHRDIASS 154
G + ++ + I D + +M+ L D K +++ + + +E++D A++
Sbjct: 73 GQPVATVKEYTEHASITDGGNETSAS-EMVQALVNDYKQISSESKFVIGLAEENQDNATA 131

Query: 155 SLIENWIDETERRTWFL 171
L I+E E++ W L
Sbjct: 132 DLFVGLIEEVEKQVWML 148


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_049661  HTHTETR  56     8e-12    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 55.8 bits (134), Expect = 8e-12
Identities = 31/144 (21%), Positives = 55/144 (38%), Gaps = 12/144 (8%)

Query: 7 ESPATERAADTRDRLLRAGLALFSQLGLEGVRTRQLAQAAGVNQSAIPYHFGGKEGVYAA 66
+ A +TR +L L LFSQ G+ ++A+AAGV + AI +HF K +++
Sbjct: 2 ARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE 61

Query: 67 VLEQTAQAIAE--------RLDWPDTAPRTAADAALQL---EAIMRGFAAALL-DSEASA 114
+ E + I E P + R L+ E R + E
Sbjct: 62 IWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVG 121

Query: 115 ARSLLLAREQLQPTPKFEAIHAVL 138
+++ ++ ++ I L
Sbjct: 122 EMAVVQQAQRNLCLESYDRIEQTL 145


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_049671  HTHTETR  49     2e-09    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 49.2 bits (117), Expect = 2e-09
Identities = 23/144 (15%), Positives = 52/144 (36%), Gaps = 4/144 (2%)

Query: 12 RPARPDAEL-RAAALEAATWLLLNQGYAAATLEAVAKRAGMAKKTVYRFAANREDLVAQV 70
R + +A+ R L+ A L QG ++ +L +AK AG+ + +Y ++ DL +++
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 71 VRGWTDTFEPVMAQDVGSAADVLPALARVLQAIADRALSADAVGMFRLLTADFPARAALL 130
+ + ++ A VL+ I L + R L +
Sbjct: 63 WEL---SESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEF 119

Query: 131 AVYQENGIERGTAMLAAWFEKLAA 154
+ + ++++
Sbjct: 120 VGEMAVVQQAQRNLCLESYDRIEQ 143


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_049691  TCRTETB  32     0.004    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 32.2 bits (73), Expect = 0.004
Identities = 23/130 (17%), Positives = 51/130 (39%), Gaps = 4/130 (3%)

Query: 61 TPVVAHVADQRGGAHGLIFAALAASLALFICYPFSQAPAWLFGVTLLLNAVFPAVLPLLD 120
T V ++DQ G L+F + I + + L + + A P L
Sbjct: 66 TAVYGKLSDQLGIKRLLLFGIIINCFGSVIGF-VGHSFFSLLIMARFIQGAGAAAFPALV 124

Query: 121 RMAIA---SGRGQGNSYTIIRACGSLGFALVTVAGGYLIKTFGADWVMWLSMLLIVACLA 177
+ +A +G ++ +I + ++G + GG + +++ + M+ I+
Sbjct: 125 MVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPF 184

Query: 178 CVRLLPQAAR 187
++LL + R
Sbjct: 185 LMKLLKKEVR 194


117  NH44784_049801  NH44784_049871  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_049801  -1      20       -3.157821  Taurine transport ATP-binding protein TauB
NH44784_049811  -2      19       -2.915141  Taurine transport system permease protein TauC
NH44784_049821  -1      21       -3.148021  Type I secretion outer membrane protein, TolC
NH44784_049831  -1      21       -2.975001  HlyD family secretion protein
NH44784_049841  -1      21       -3.094984  cyclolysin secretion ATP-binding protein
NH44784_049851  -1      20       -2.581435  Alkaline phosphatase
NH44784_049861  -1      15       -1.731571  Transcriptional regulator, GntR family domain /
NH44784_049871  0       17       -1.496432  Per-activated serine protease autotransporter
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_049801  PF05272  31     0.006    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.8 bits (69), Expect = 0.006
Identities = 15/41 (36%), Positives = 22/41 (53%), Gaps = 4/41 (9%)

Query: 41 EPG---EFVVAL-GASGCGKTTLLSLLAGFLSPTDGEITLG 77
EPG ++ V L G G GK+TL++ L G +D +G
Sbjct: 590 EPGCKFDYSVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIG 630


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_049831  RTXTOXIND     378    e-129    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 378 bits (972), Expect = e-129
Identities = 161/477 (33%), Positives = 270/477 (56%), Gaps = 8/477 (1%)

Query: 5 HAMGAWRDLARRYGAVFAYSWRERKSRERTRYRGDEAEFLPAALSLQEQTVSPAPRVAMW 64
+ + + RY V++ +W+ RK + DE EFLPA L L E VS PR+ +
Sbjct: 3 TWLMGFSEFLLRYKLVWSETWKIRKQLDTPVREKDENEFLPAHLELIETPVSRRPRLVAY 62

Query: 65 LLIVFAALALAWSVLGRVDIVATAQGKIIPSEGTQVVQPIGTATIKAIHVREGQAVRAGD 124
++ F +A SVLG+V+IVATA GK+ S ++ ++PI + +K I V+EG++VR GD
Sbjct: 63 FIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGD 122

Query: 125 LLIELDGTVTRAEQQRIVNELQTWALQAAHAEAMLIAIDQGSAPVLDHVPADSRMT---- 180
+L++L A+ + + L L+ + + +I+ P L +P +
Sbjct: 123 VLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELK-LPDEPYFQNVSE 181

Query: 181 ---EDARRLARAEFLALQAKLAGIDAEIVRRQAELQTTQALTRKLERTAPIARQRAESLK 237
L + +F Q + + + +++AE T A + E + + + R +
Sbjct: 182 EEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFS 241

Query: 238 ELAGDQYVARNAYLERERERIEQESDLVMQRSRLKEIEATIAEARQQRETVRADALHASL 297
L Q +A++A LE+E + +E ++L + +S+L++IE+ I A+++ + V + L
Sbjct: 242 SLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEIL 301

Query: 298 NRLNEARQRMALLEQDLVKARQSEHLTRLNAPVSGTVQQLAVRTVGGVVTEAQPLMLVVP 357
++L + + LL +L K + + + + APVS VQQL V T GGVVT A+ LM++VP
Sbjct: 302 DKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVP 361

Query: 358 QEHVLEIEAFLENKDVGFVVPGQDAEVKVETFPYTKYGIVPSTVKSVSDDAINDEKRGLV 417
++ LE+ A ++NKD+GF+ GQ+A +KVE FPYT+YG + VK+++ DAI D++ GLV
Sbjct: 362 EDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLGLV 421

Query: 418 YSMRIEMARSTIAVNGADIRLAPGMAVTAEIKTGRRRVIEYFLDPLLQYGAESIRER 474
+++ I + + ++ +I L+ GMAVTAEIKTG R VI Y L PL + ES+RER
Sbjct: 422 FNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGMRSVISYLLSPLEESVTESLRER 478


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_049851  RTXTOXINA     133    4e-33    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 133 bits (336), Expect = 4e-33
Identities = 79/255 (30%), Positives = 110/255 (43%), Gaps = 49/255 (19%)

Query: 1660 KLEGSAGNDVLIGGAGDDTLYGHGGDDKLHGGDGDDTLYGGDGIDHLYGGAGRNTLVGGA 1719
K GS D+ G GDD + G+ G+D+L+G G+DTL GG+G D LYGG G + L+G A
Sbjct: 730 KFFGSKFTDIFHGADGDDLIEGNDGNDRLYGDKGNDTLSGGNGDDQLYGGDGNDKLIGVA 789

Query: 1720 GSDVYHVDSADDVIIELPDQGTDRVFATVSYTLADNVEHLHLEGDQAIDGTGNGGNNLLL 1779
G++ + DD E QG + N+L
Sbjct: 790 GNNYLNGGDGDD---EFQVQG------------------------------NSLAKNVLF 816

Query: 1780 GNGAANTLNGGGGDDTLNGGGGDDTLIGGDGNDRYVIARDSGMDRVIEDDDTAGNNDVVA 1839
G + L G G D L+GG GDD L GG GND Y G + +D G D ++
Sbjct: 817 GGKGNDKLYGSEGADLLDGGEGDDLLKGGYGNDIYRYLSGYGHHIIDDD---GGKEDKLS 873

Query: 1840 FGAGVAADQLWFSRLGNDL-------EVRIIGTAGGVTISNWFL-----GARYRVEQFTT 1887
A + + F R GNDL V IG G+T NWF + + +EQ
Sbjct: 874 L-ADIDFRDVAFKREGNDLIMYKGEGNVLSIGHKNGITFRNWFEKESGDISNHEIEQIFD 932

Query: 1888 TQGAILREDKVEALV 1902
G I+ D ++ +
Sbjct: 933 KSGRIITPDSLKKAL 947



Score = 111 bits (279), Expect = 2e-26
Identities = 76/253 (30%), Positives = 111/253 (43%), Gaps = 51/253 (20%)

Query: 1946 GTAGDDILEGASNNDVLWGLDGDDILIGNDGDDHLHGGTGANTLIGGKGNDVYYIDSPLD 2005
GT D G+ D+ G DGDD++ GNDG+D L+G G +TL GG G+D Y
Sbjct: 724 GTTRADKFFGSKFTDIFHGADGDDLIEGNDGNDRLYGDKGNDTLSGGNGDDQLY------ 777

Query: 2006 RIIELAGEGGDRVESTISFTLPEHVEGLALKGSGAIN-GTGNDATNTLLGNNAANVLSAG 2064
G+G D++ + + G+ +N G G+D + A NVL G
Sbjct: 778 -----GGDGNDKL--------------IGVAGNNYLNGGDGDDEFQVQGNSLAKNVLFGG 818

Query: 2065 KGND---------LINGMDGDDILIGGLGSDRYLFAQEFGQDRIIETESDPGDVDVAVLG 2115
KGND L++G +GDD+L GG G+D Y + +G I + D G D L
Sbjct: 819 KGNDKLYGSEGADLLDGGEGDDLLKGGYGNDIYRYLSGYGHHIIDD---DGGKEDKLSLA 875

Query: 2116 GRSFDQLWFRRVGEDL-------EVSVIGTQNRVTVERWYA-----GEQHRIEK-FQAGA 2162
F + F+R G DL V IG +N +T W+ H IE+ F
Sbjct: 876 DIDFRDVAFKREGNDLIMYKGEGNVLSIGHKNGITFRNWFEKESGDISNHEIEQIFDKSG 935

Query: 2163 RFLLDTKVENLVN 2175
R + ++ +
Sbjct: 936 RIITPDSLKKALE 948



Score = 110 bits (276), Expect = 4e-26
Identities = 70/233 (30%), Positives = 106/233 (45%), Gaps = 20/233 (8%)

Query: 1401 GYLHTGNLTGSSYDDVLLGDSQNNVIQGGAGNDRIAGGAGNNTLDGGAGIDTVDYSGAGA 1460
G GS + D+ G +++I+G GNDR+ G GN+TL GG G D + Y G G
Sbjct: 724 GTTRADKFFGSKFTDIFHGADGDDLIEGNDGNDRLYGDKGNDTLSGGNGDDQL-YGGDGN 782

Query: 1461 GVVMDLATGSAQNGLGGVDSLSNFENVTGSAYADKLSGNALDNVLDGGRGNDILDGRGGN 1520
L + N L G D F+ S + L G ++ L G G D+LDG G+
Sbjct: 783 DK---LIGVAGNNYLNGGDGDDEFQVQGNSLAKNVLFGGKGNDKLYGSEGADLLDGGEGD 839

Query: 1521 DVLIGGAGNDTYLFDLGYGVDRIVENDLTAGNTDTVLFGAGIRVQQLWFSRAGDDLIVAM 1580
D+L GG GND Y + GYG I ++ G D + A I + + F R G+DLI+
Sbjct: 840 DLLKGGYGNDIYRYLSGYGHHIIDDD---GGKEDKLSL-ADIDFRDVAFKREGNDLIMYK 895

Query: 1581 P-------GTADRLTVTNWFL-----GAQYQVEIFQTASGALLHAAGVSALVD 1621
G + +T NWF + +++E SG ++ + ++
Sbjct: 896 GEGNVLSIGHKNGITFRNWFEKESGDISNHEIEQIFDKSGRIITPDSLKKALE 948



Score = 95.0 bits (236), Expect = 2e-21
Identities = 57/219 (26%), Positives = 96/219 (43%), Gaps = 31/219 (14%)

Query: 718 NELIASNYG-DTLIGGAWKTVLRGGAGHDTLVVTGTGYYMDGGGGSDTVSYAQWQRAVSA 776
++LI N G D L G L GG G D L GG G+D + ++
Sbjct: 746 DDLIEGNDGNDRLYGDKGNDTLSGGNGDDQL---------YGGDGNDKLIGVAGNNYLNG 796

Query: 777 SLVSGSDDLGSTLAGIENLVGSGWA--DRLTGNGGGNRLDGGAGDDLLVGGGGNDVYVFG 834
G D+ + V G D+L G+ G + LDGG GDDLL GG GND+Y +
Sbjct: 797 G--DGDDEFQVQGNSLAKNVLFGGKGNDKLYGSEGADLLDGGEGDDLLKGGYGNDIYRYL 854

Query: 835 RGYGSDTVQNGIAANGGASSMIRVSAGIGIGDLWYERRGDDLL-------IRILGTKDTL 887
GYG I +GG + ++ I D+ ++R G+DL+ + +G K+ +
Sbjct: 855 SGYGHHI----IDDDGGKEDKLSLA-DIDFRDVAFKREGNDLIMYKGEGNVLSIGHKNGI 909

Query: 888 ALQGWYQEAFR-----KVAILELQGGLRLDAAAIESLVE 921
+ W+++ ++ + + G + +++ +E
Sbjct: 910 TFRNWFEKESGDISNHEIEQIFDKSGRIITPDSLKKALE 948



Score = 59.2 bits (143), Expect = 2e-10
Identities = 43/184 (23%), Positives = 66/184 (35%), Gaps = 53/184 (28%)

Query: 1425 VIQGGAGNDRIAGGAGNNTLDGGAGIDTVDYSGAGAGVVMDLATGSAQNG---------- 1474
G G+D++ AG+ + G G D V Y G + T + + G
Sbjct: 613 ESHLGDGDDKVFLSAGSANIYAGKGHDVVYYDKTDTGYLTIDGTKATEAGNYTVTRVLGG 672

Query: 1475 -------------------------------------LGGVDSLSNFENVTGSAYADKLS 1497
L D+L + E + G+ ADK
Sbjct: 673 DVKVLQEVVKEQEVSVGKRTEKTQYRSYEFTHINGKNLTETDNLYSVEELIGTTRADKFF 732

Query: 1498 GNALDNVLDGGRGNDILDGRGGNDVLIGGAGNDTYLFDLGYGVDRIVENDLTAGNTDTVL 1557
G+ ++ G G+D+++G GND L G GNDT G G D++ GN D ++
Sbjct: 733 GSKFTDIFHGADGDDLIEGNDGNDRLYGDKGNDT--LSGGNGDDQL---YGGDGN-DKLI 786

Query: 1558 FGAG 1561
AG
Sbjct: 787 GVAG 790



Score = 45.0 bits (106), Expect = 4e-06
Identities = 41/186 (22%), Positives = 68/186 (36%), Gaps = 19/186 (10%)

Query: 1669 VLIGGAGDDTLYGHGGDDKLHGGDGDDTLYGG---------DGIDHLYGGAGRNTLVGGA 1719
G GDD ++ G ++ G G D +Y DG G T V G
Sbjct: 613 ESHLGDGDDKVFLSAGSANIYAGKGHDVVYYDKTDTGYLTIDGTKATEAGNYTVTRVLGG 672

Query: 1720 GSDVYH--VDSADDVIIELPDQGTDRVFATVSYTLADNVEHLHLEGDQAIDGTGNGGNNL 1777
V V + + + ++ R + + E +L + + GT +
Sbjct: 673 DVKVLQEVVKEQEVSVGKRTEKTQYRSYEFTHINGKNLTETDNLYSVEELIGTTR--ADK 730

Query: 1778 LLGNGAANTLNGGGGDDTLNGGGGDDTLIGGDGNDRYVIARDSGMDRVIEDDDTAGNNDV 1837
G+ + +G GDD + G G+D L G GND ++ +G D++ ND
Sbjct: 731 FFGSKFTDIFHGADGDDLIEGNDGNDRLYGDKGND--TLSGGNGDDQL----YGGDGNDK 784

Query: 1838 VAFGAG 1843
+ AG
Sbjct: 785 LIGVAG 790



Score = 41.9 bits (98), Expect = 4e-05
Identities = 28/95 (29%), Positives = 39/95 (41%), Gaps = 14/95 (14%)

Query: 1938 VNGPLTLSGTAGDDILE------------GASNNDVLWGLDGDDILIGNDGDDHLHGGTG 1985
V G L+G GDD + G ND L+G +G D+L G +GDD L GG G
Sbjct: 788 VAGNNYLNGGDGDDEFQVQGNSLAKNVLFGGKGNDKLYGSEGADLLDGGEGDDLLKGGYG 847

Query: 1986 ANTLIGGK--GNDVYYIDSPLDRIIELAGEGGDRV 2018
+ G+ + D + + LA V
Sbjct: 848 NDIYRYLSGYGHHIIDDDGGKEDKLSLADIDFRDV 882



Score = 31.9 bits (72), Expect = 0.043
Identities = 32/140 (22%), Positives = 56/140 (40%), Gaps = 18/140 (12%)

Query: 1966 DGDDILIGNDGDDHLHGGTGANTLIGGKGNDVYY-IDSP---------LDRIIELAGEGG 2015
DGDD + + G +++ G G + + K + Y ID + R++ +
Sbjct: 618 DGDDKVFLSAGSANIYAGKGHDVVYYDKTDTGYLTIDGTKATEAGNYTVTRVLGGDVKVL 677

Query: 2016 DRVESTISFTLPEHVEGLALKGS--GAINGTGNDATNTL------LGNNAANVLSAGKGN 2067
V ++ + E + ING T+ L +G A+ K
Sbjct: 678 QEVVKEQEVSVGKRTEKTQYRSYEFTHINGKNLTETDNLYSVEELIGTTRADKFFGSKFT 737

Query: 2068 DLINGMDGDDILIGGLGSDR 2087
D+ +G DGDD++ G G+DR
Sbjct: 738 DIFHGADGDDLIEGNDGNDR 757


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_049871  IGASERPTASE   460    e-137    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 460 bits (1185), Expect = e-137
Identities = 232/916 (25%), Positives = 368/916 (40%), Gaps = 118/916 (12%)

Query: 52 ALPLLIAAALPGTARAGYASIEIPYQTYRDFAENKGAFRPGALGIPIYDKSGSLRG-SLS 110
+ L +A AL A ++ YQ +RDFAENKG F GA + + DK+ G +L
Sbjct: 10 FIALTVAYALTPYTEAALVRDDVDYQIFRDFAENKGKFSVGATNVLVKDKNNKDLGTALP 69

Query: 111 DVIPFIDFSSVSDNNAIGTLIAPQYIVGVAHNG---------------------PHDSTR 149
+ IP IDFS V + I TLI PQY+VGV H H
Sbjct: 70 NGIPMIDFSVVDVDKRIATLINPQYVVGVKHVSNGVSELHFGNLNGNMNNGNAKAHRDVS 129

Query: 150 LAGATYSQIRRDVHPD-------YHRERDWTQEWDFDVTRLGKLVTDATPIATFPVPRTS 202
Y + ++ +P ++ + D+ + RL K VT+ PI
Sbjct: 130 SEENRYFSVEKNEYPTKLNGKTVTTEDQTQKRREDYYMPRLDKFVTEVAPIE-------- 181

Query: 203 TPLLSLGDPDLWTEEEKRKFPLFYRVGMGIQFVGDHPEHGHNPMTGALEPG--LHELGYA 260
S D T ++ K+P F R+G G QF+ ++ + G L +G A
Sbjct: 182 ---ASTASSDAGTYNDQNKYPAFVRLGSGSQFIYKKGDNYSLILNNHEVGGNNLKLVGDA 238

Query: 261 YKWATGGIVTPTWSNETFVIARGDNGL-------------LPSFVQHGDSGSPLFAWDAE 307
Y + G +I G++ L ++ GDSGSPLF +D E
Sbjct: 239 YTYGIAGTPYKVNHENNGLIGFGNSKEEHSDPKGILSQDPLTNYAVLGDSGSPLFVYDRE 298

Query: 308 HRRWVLAGVAADLKIPSGEPGYVYWSHLPADFLHSVFDKDNDPEVVFTEGKGPLRWAFDS 367
+W+ G D + + W+ + F V +KD+ K +++ S
Sbjct: 299 KGKWLFLGSY-DFWAGYNKKSWQEWNIYKSQFTKDVLNKDSA--GSLIGSKTD--YSWSS 353

Query: 368 AQGAGHLTQQGRQFAMHGYKKTPGLAYAGGNGALDFAGLDAGRNLSLRSEGGKGVGQITL 427
+T + + + G++++ +G G +TL
Sbjct: 354 NGKTSTITGGEKSLNVDLADGKDKP--------------NHGKSVTF-----EGSGTLTL 394

Query: 428 QDSVNQGAGSLTFHNSYVVSPETD-QTWMGGGIDVREGATVVWRVNGVAGDNLHKIGPGT 486
++++QGAG L F Y V +D TW G G+ V EG TV W+V+ D L KIG GT
Sbjct: 395 NNNIDQGAGGLFFEGDYEVKGTSDNTTWKGAGVSVAEGKTVTWKVHNPQYDRLAKIGKGT 454

Query: 487 LTIEGKGVNPGGLKVGDGQVILSQRPDDDGHVQAFDGITLSSGRAEVVLGDGRQVDPDHI 546
L +EG G N G LKVGDG VIL Q+ + G AF + + SGR+ +VL D +QVDP+ I
Sbjct: 455 LIVEGTGDNKGSLKVGDGTVILKQQTNGSGQ-HAFASVGIVSGRSTLVLNDDKQVDPNSI 513

Query: 547 KWGARGGVLNLNGNDITFNRLHANAKDHGAAITNTATKTAA-MTI---NLTPMPPPTVSD 602
+G RGG L+LNGN +TF+ + D GA + N A+ +TI +L P
Sbjct: 514 YFGFRGGRLDLNGNSLTFDHIRNI--DDGARLVNHNMTNASNITITGESLITDPNTITPY 571

Query: 603 SIRMPVRA------GTGQRGDLYKQG--LSYFVLKQD-SFGAVPTYGSQASDDYWEYAGQ 653
+I P G LY +Y+ L++ S + S S++ W Y G+
Sbjct: 572 NIDAPDEDNPYAFRRIKDGGQLYLNLENYTYYALRKGASTRSELPKNSGESNENWLYMGK 631

Query: 654 SFAAALEKANERKREYFPGQNEFIF---SGTVRGNIDVGV-ASPSKTVFIADGGMDLGEN 709
+ A F G GN++V + F+ GG +L
Sbjct: 632 TSDEAKRNVMNHINNERMNGFNGYFGEEEGKNNGNLNVTFKGKSEQNRFLLTGGTNL-NG 690

Query: 710 TFTQRQGELVLQGHPVIHAINTPDEARKLLDLGDDSVRTQSVSFDQPDWESRKFAIRKLA 769
T +G L L G P HA + + D + + DW +R F +
Sbjct: 691 DLTVEKGTLFLSGRPTPHARDIAGISST---KKDPHFAENNEVVVEDDWINRNFKATTMN 747

Query: 770 LS-DTTFRLSRN-ATLLGDIVA-ERARVVLGSPL--LYLNKNDG-GELPQEPVKGTSIAN 823
++ + + RN A + +I A +A+V +G ++D G + K + A
Sbjct: 748 VTGNASLYSGRNVANITSNITASNKAQVHIGYKTGDTVCVRSDYTGYVTCTTDKLSDKAL 807

Query: 824 TDADKSRYQGTVALSDHSTLDIHEK--FAGAVDARDSAINVFSDQA-ALTGYSR-----F 875
+ + +G V L++ + + + F +S + + + LTG S
Sbjct: 808 NSFNPTNLRGNVNLTESANFVLGKANLFGTIQSRGNSQVRLTENSHWHLTGNSDVHQLDL 867

Query: 876 TASALTLHAGAHLQGT 891
+ L++ +
Sbjct: 868 ANGHIHLNSADNSNNV 883


118  NH44784_051961  NH44784_052041  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_051961  -1      8        1.805740   outer membrane protein (porin
NH44784_051971  0       10       2.330030   ABC transporter ATP-binding protein
NH44784_051981  0       12       2.328736   Ferric iron ABC transporter, iron-binding
NH44784_051991  0       14       2.860050   Ferric iron ABC transporter, permease protein
NH44784_052001  -1      14       2.285831   diguanylate cyclase (GGDEF domain) with PAS/PAC
NH44784_052011  0       13       1.901899   Chemotaxis response regulator protein-glutamate
NH44784_052021  0       13       2.181700   Sensor histidine kinase/response regulator
NH44784_052031  -2      12       2.549958   Chemotaxis signal transduction protein
NH44784_052041  -1      13       2.870086   Chemotaxis protein methyltransferase CheR
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_051961  ECOLNEIPORIN  57     3e-11    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 57.1 bits (138), Expect = 3e-11
Identities = 82/373 (21%), Positives = 135/373 (36%), Gaps = 73/373 (19%)

Query: 21 PALAATSTSSSVQLYGLIDTGLQYLSN----GPSGNSKTGMSTGNLSGSRWGLRGTEDLG 76
AL + + V LYG I G++ + G S + GS+ G +G EDLG
Sbjct: 11 AALPVAAMAD-VTLYGTIKAGVETSRSVAHNGAQAASVETGTGIVDLGSKIGFKGQEDLG 69

Query: 77 DGLSTVFVLE-----NGFDSGNGTTMQGGRLFGRQAYVGLSSKSLGRLTLGRHNTLMIDW 131
+GL ++ +E G DSG G RQ+++GL G+L +GR N+++ D
Sbjct: 70 NGLKAIWQVEQKASIAGTDSGWGN---------RQSFIGL-KGGFGKLRVGRLNSVLKDT 119

Query: 132 MSKYNPFDNAN-----FSIKRPDPAF-SDRTDNAVMYVGKFGPVSVGGYYSFGWNNEQSF 185
NP+D+ + I P+ S R D+ +F +S Y+ N+
Sbjct: 120 -GDINPWDSKSDYLGVNKIAEPEARLISVRYDSP-----EFAGLSGSVQYAL--NDNAGR 171

Query: 186 DDKTLGRMLGGGVRYQSGGLDAGVLYHSKNADKPAKGANSDNREDRIVAGLSYDFEGV-- 243
+ G Y++GG K + + N + + YD + +
Sbjct: 172 HN---SESYHAGFNYKNGGFFVQYGGAYKRHHQVQENVNIEKYQIH-RLVSGYDNDALYA 227

Query: 244 KVYAGYRWLEQKLTQRTYKNNL-TWLGVTYL-PVQNNRLSLAVYHLNDSVCDDMNNAVCP 301
V + + KL + Y +N T + T N ++ H
Sbjct: 228 SVAVQQQ--DAKLVEENYSHNSQTEVAATLAYRFGNVTPRVSYAHGFK------------ 273

Query: 302 AAQAAGTGQKSTM--FVLGNEYDLSKRTTVYAVAAYAVNDDKSAQSVVGGKYGANVEPGK 359
T + V+G EYD SKRT+ A + K+ +
Sbjct: 274 -GSFDATNYNNDYDQVVVGAEYDFSKRTSALVSAGWLQEGKGE------SKFVSTA---- 322

Query: 360 NQLGLNLGLRHRF 372
+GLRH+F
Sbjct: 323 ----GGVGLRHKF 331


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_051971  PF05272       32     0.002    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.6 bits (71), Expect = 0.002
Identities = 11/23 (47%), Positives = 14/23 (60%)

Query: 39 TVLALLGPSGCGKSTLLKSLAGL 61
+ L G G GKSTL+ +L GL
Sbjct: 597 YSVVLEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_052001  HTHFIS        66     4e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 66.0 bits (161), Expect = 4e-14
Identities = 34/162 (20%), Positives = 68/162 (41%), Gaps = 17/162 (10%)

Query: 17 AAMVLLVDDQVMVGEAIRRALATEGNIDFHYCSDPYKALTVAVQTRPTVILQDLVLPGVD 76
A +L+ DD + + +AL+ G D S+ +++ D+V+P +
Sbjct: 3 GATILVADDDAAIRTVLNQALSRAG-YDVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 77 GLSLVQEYRSHPVTRDIPIIVLSTKEDPAIKSAAFAAGANDYLVK---LPDSIELVAR-- 131
L+ + D+P++V+S + A GA DYL K L + I ++ R
Sbjct: 62 AFDLLPRIKKA--RPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119

Query: 132 ---------IRYHSRSYLALVQRDEAYQALRQSQQQLLESNM 164
+ S+ + LV R A Q + + +L+++++
Sbjct: 120 AEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDL 161


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_052011  HTHFIS        44     5e-07    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 44.1 bits (104), Expect = 5e-07
Identities = 36/172 (20%), Positives = 65/172 (37%), Gaps = 20/172 (11%)

Query: 6 ETLRRVIALEPGLEVAWVASNGLEAVQQCAQDRPDVVLMDLVMPVMDGVEATRRIMAETP 65
L + ++ G +V + SN + A D+V+ D+VMP + + RI P
Sbjct: 17 TVLNQALSRA-GYDVR-ITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDLLPRIKKARP 74

Query: 66 -CAIVVVTVDVARHTARVFDAMGYGALDAADTPVVGG----------GDLRSAAAPLLRK 114
++V++ TA A GA D P + + + L
Sbjct: 75 DLPVLVMSAQNTFMTA--IKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRRPSKLEDD 132

Query: 115 IRNIGWLIGRYGNNRTALTPVVRDAAQAGPSQ-GLLVIGASAGGPATLAQLL 165
++ L+GR A+ + R A+ + L++ G S G +A+ L
Sbjct: 133 SQDGMPLVGR----SAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_052021  HTHFIS        75     4e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 74.9 bits (184), Expect = 4e-16
Identities = 35/117 (29%), Positives = 55/117 (47%), Gaps = 3/117 (2%)

Query: 629 RKRVLVVDDSLTVRELERKLLLNRGFDVAVAVDGMDGWNMLRSEAFDLVVTDVDMPRMDG 688
+LV DD +R + + L G+DV + + W + + DLVVTDV MP +
Sbjct: 3 GATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 689 IELVSRIKADPKLQSLPVMVVSYKDREEDRRRGLDAGADYYLAKGSFHDDALLDAVE 745
+L+ RIK LPV+V+S ++ + + GA YL K F L+ +
Sbjct: 63 FDLLPRIKK--ARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPK-PFDLTELIGIIG 116


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_052041  PF03544       28     0.047    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 28.4 bits (63), Expect = 0.047
Identities = 19/115 (16%), Positives = 30/115 (26%), Gaps = 1/115 (0%)

Query: 218 TQERAVTVLRRFARHDGILFVGPAET-SLMTGRRLPAVPLARAFAFRAEPAPPTEAAPAR 276
T V L A+ + V PA+ + P + P PP EA
Sbjct: 35 TSVHQVIELPAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVI 94

Query: 277 TASAASGRATPPPVVAPRAAQPWTPAARPQPYTPAAPARPHAASLAPHGTTAQAG 331
+ P PV + +P +P P + +
Sbjct: 95 EKPKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKP 149


119  NH44784_053581  NH44784_053711  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_053581  0       15       2.830503   3-oxoacyl-[acyl-carrier protein] reductase
NH44784_053591  0       12       2.756171   putative Hydrolase, alpha/beta fold family
NH44784_053601  0       11       2.762888   Transcriptional regulator, TetR family
NH44784_053611  0       12       2.115195   O-methyltransferase
NH44784_053621  -1      10       2.027883   Permeases of the major facilitator superfamily
NH44784_053631  3       17       1.364012   Transcriptional regulator, LysR family
NH44784_053641  3       18       1.230324   putative hybrid sensor and regulator protein(
NH44784_053651  4       20       1.060546   two-component hybrid sensor and regulator
NH44784_053661  4       20       1.003476   hypothetical protein
NH44784_053671  4       21       1.086388   Hemolysin activation/secretion protein
NH44784_053681  4       21       1.039539   PUTATIVE HEMAGGLUTININ-RELATED PROTEIN
NH44784_053691  -2      10       0.261740   Peptide transport system permease protein sapC
NH44784_053701  -1      12       0.836581   putative acyl coenzyme A thioester hydrolase(
NH44784_053711  -1      12       0.737986   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_053581  DHBDHDRGNASE  78     2e-19    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 78.2 bits (192), Expect = 2e-19
Identities = 69/257 (26%), Positives = 111/257 (43%), Gaps = 26/257 (10%)

Query: 2 ALVTGSTSGIGAAIARRLSRDGYAVVLHSRQSAEAGRALAAELGTAVYVQADLAVDADRV 61
A +TG+ GIG A+AR L+ G A + + E + + L D
Sbjct: 11 AFITGAAQGIGEAVARTLASQG-AHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDS 69

Query: 62 RLVREAVAAW----GRLDVLVNNAGISRVIAHGDLAAATPAVWHEMHEVHVVAPFRLVAE 117
+ E A G +D+LVN AG+ R G + + + W V+ F
Sbjct: 70 AAIDEITARIEREMGPIDILVNVAGVLRP---GLIHSLSDEEWEATFSVNSTGVFNA--- 123

Query: 118 AQAALSAAAARGRPGCVVNVSSHAGVRPKGASIPYAAAKAALNHVTRLLAVSLAPS-IRV 176
+++ R R G +V V S+ P+ + YA++KAA T+ L + LA IR
Sbjct: 124 SRSVSKYMMDR-RSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRC 182

Query: 177 NAVAPGLVDTPLTR-----DWTQAQQL------WRERAPMRRAAQPDDIAQAVAWLA--Q 223
N V+PG +T + + Q + ++ P+++ A+P DIA AV +L Q
Sbjct: 183 NIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQ 242

Query: 224 SDYLTGEILLSDGGLNL 240
+ ++T L DGG L
Sbjct: 243 AGHITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_053601  HTHTETR       59     8e-13    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 58.9 bits (142), Expect = 8e-13
Identities = 21/155 (13%), Positives = 57/155 (36%), Gaps = 3/155 (1%)

Query: 14 QPQQARSAELVAAILQAGAQVLAKEGAARFTTARVAEKAGVSIGSLYQYFPNKAAILYRL 73
+ + + E IL ++ +++G + + +A+ AGV+ G++Y +F +K+ + +
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 74 QTQEWQQTSDLLRRLLEDARHAPPQRLRNLVHAFVR---SECEEAVMRVALNDAAPLYRD 130
+L P LR ++ + +E ++ + +
Sbjct: 63 WELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGE 122

Query: 131 APEARATKEEGGRIVLAFLEEALPNASDATRVLAG 165
+ + +E+ L + +A + A
Sbjct: 123 MAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPAD 157


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_053621  TCRTETB       100    7e-25    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 100 bits (251), Expect = 7e-25
Identities = 82/387 (21%), Positives = 139/387 (35%), Gaps = 14/387 (3%)

Query: 47 IALPAIARDLGGSPVALNWITNAFMLVFGSSLLVAGALADRYGRRRVFLAGAAIFIAASL 106
++LP IA D P + NW+ AFML F V G L+D+ G +R+ L G I S+
Sbjct: 35 VSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSV 94

Query: 107 -GLVHAPDIVAFDLLRAAQGLGGAAIFSGGAAALAQAFDGPGRMRAFSLLGASFGAGLSF 165
G V + R QG G AA + +A+ R +AF L+G+ G
Sbjct: 95 IGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGV 154

Query: 166 GPIASALLIERYGWRSIFLLVIGLAAVALLLGARCLRESRDPAAGRLDWPGAATFSAALA 225
GP ++ W + L+ + L +E R G D G S +
Sbjct: 155 GPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVR--IKGHFDIKGIILMSVGIV 212

Query: 226 LLTWGIMRAPQSGWGDPSTLAMLAAAVALFAAFGVIELRAVRPMLDLTLFRYPRFLGVQL 285
S L +V F F + P +D L + F+ L
Sbjct: 213 FFMLFTTSYSIS---------FLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPFMIGVL 263

Query: 286 LAAAPAYAFVVLLILLPVRFIAVDGLGEIEAGR-MMIALSAPLLLLPIVAGQLTRWLAPA 344
+ ++P V L E G ++ + +++ + G L P
Sbjct: 264 CGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPL 323

Query: 345 LICGAGLLLAAAGLAWLSLAMTAGQVTALVGPMLLIGIGISLPWGLMDGLAVSVVPVERA 404
+ G+ + S + + + ++G G+S ++ + S + + A
Sbjct: 324 YVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLG-GLSFTKTVISTIVSSSLKQQEA 382

Query: 405 GMATGIFSTTRVAGEGVALAVVSALLT 431
G + + T EG +A+V LL+
Sbjct: 383 GAGMSLLNFTSFLSEGTGIAIVGGLLS 409


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_053641  HTHFIS        68     5e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 68.3 bits (167), Expect = 5e-14
Identities = 41/160 (25%), Positives = 71/160 (44%), Gaps = 6/160 (3%)

Query: 695 RIWLAEDTPEIREFLVEELASLGFLVESEADGRGMVARIQADDARRPDLILTDHRMEGAD 754
I +A+D IR L + L+ G+ V ++ + I A D DL++TD M +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGD---GDLVVTDVVMPDEN 61

Query: 755 GSAVLRAARARWPDLPVVAVSATPQETPPLDG--AGYDASLLKPINLVELRHLLGRLLHL 812
+L + PDLPV+ +SA + G L KP +L EL ++GR L
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 813 ATNPAANEPAPAASPLPRLSRAE-LEQVQRLLDIGGISDL 851
+ + +P + R+ ++++ R+L +DL
Sbjct: 122 PKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDL 161


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_053651  HTHFIS        73     1e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 72.6 bits (178), Expect = 1e-16
Identities = 39/158 (24%), Positives = 65/158 (41%), Gaps = 11/158 (6%)

Query: 20 RVLVVEDRPEDRMLLVDFLTSQDCRVYVAEDGNDGYRKAQLVQPDIILMDVNMPVCDGLA 79
+LV +D R +L L+ V + + +R D+++ DV MP +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 80 A-CRLLKADPATRAIPVIFLTAASLPMERVAGLSAGAVDYVAKPFDFEEVRL---RLLIH 135
R+ KA P +PV+ ++A + M + GA DY+ KPFD E+ R L
Sbjct: 65 LLPRIKKARP---DLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 136 LRVTPVAQPASAAGPCEDGAPAAGTGSTLDGVLFRAAR 173
+ P + +DG P G + + + AR
Sbjct: 122 PKRRPSKLEDDS----QDGMPLVGRSAAMQEIYRVLAR 155


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_053671  PYOCINKILLER  31     0.017    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 30.9 bits (69), Expect = 0.017
Identities = 18/66 (27%), Positives = 28/66 (42%), Gaps = 6/66 (9%)

Query: 22 GAEPGRQLPQPVMPESTPTAPPVTVQQGGATEAPAGADKLTFTLTDMRV-EGVTRYPADT 80
+ PG Q P +TP P GAT P A T+ + + +PAD+
Sbjct: 419 ASPPGNQNPS----STTPVVPKPVPVYEGATLTPVKATPETYPGVITLPEDLIIGFPADS 474

Query: 81 -LRPLY 85
++P+Y
Sbjct: 475 GIKPIY 480


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_053681  PF05860       71     4e-16    haemagglutination activity domain.
		>PF05860#haemagglutination activity domain.

Length = 117

Score = 71.0 bits (174), Expect = 4e-16
Identities = 31/109 (28%), Positives = 51/109 (46%), Gaps = 7/109 (6%)

Query: 44 PTGGNVVAGGATINDRGNGTLDINQSTGKAIINWKDFSIGANETVNFRQPGSNSITLNRV 103
P N+ G T Q+ ++++FS+ + T F P + ++RV
Sbjct: 10 PINSNITTEGNTRIIERG-----TQAGSNLFHSFQEFSVPTSGTAFFNNPTNIQNIISRV 64

Query: 104 VGNDPSAIFGRLNANGT--VMLVNPNGVLFGKGARIDVGGLVATTANIS 150
G S I G + AN T + L+NPNG++FG+ AR+D+GG +
Sbjct: 65 TGGSVSNIDGLIRANATANLFLINPNGIIFGQNARLDIGGSFVGSTANR 113


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_053711  FERRIBNDNGPP  33     0.001    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 33.4 bits (76), Expect = 0.001
Identities = 21/58 (36%), Positives = 27/58 (46%), Gaps = 9/58 (15%)

Query: 2 PFTRRHALGALAALPLLPRLAFAQAQADFPTR-------PVRLL--VGFAPGGLTDIA 50
+RR L A+A PLL ++ A A A P R PV LL +G P G+ D
Sbjct: 6 LISRRRLLTAMALSPLLWQMNTAHAAAIDPNRIVALEWLPVELLLALGIVPYGVADTI 63


120  NH44784_058361  NH44784_058461  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_058361  2       13       1.809271   Survival protein SurA precursor (Peptidyl-prolyl
NH44784_058371  0       16       0.896121   FIG00958542: hypothetical protein
NH44784_058381  -1      16       1.792115   hypothetical protein
NH44784_058391  -1      15       0.860944   Putative outer membrane TonB-dependent receptor
NH44784_058401  0       10       0.930472   Ferric siderophore transport system, periplasmic
NH44784_058411  -1      11       3.235898   Biopolymer transport protein ExbD/TolR
NH44784_058421  0       12       3.606724   MotA/TolQ/ExbB proton channel family protein
NH44784_058431  0       12       3.556286   MotA/TolQ/ExbB proton channel family protein
NH44784_058441  0       13       3.708187   Hemolysin activation/secretion protein
NH44784_058451  1       12       3.996649   hypothetical protein
NH44784_058461  1       13       4.385017   Filamentous haemagglutinin family outer membrane
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_058361  cdtoxina      28     0.049    Cytolethal distending toxin A signature.
		>cdtoxina#Cytolethal distending toxin A signature.

Length = 258

Score = 28.1 bits (62), Expect = 0.049
Identities = 14/45 (31%), Positives = 18/45 (40%), Gaps = 6/45 (13%)

Query: 33 SPAQSSESSPVPAKAAAAPEKGAAPAAAATAQPAPAKAPAVATLG 77
P+ P+P A P GA P P P APAV+ +
Sbjct: 46 VPSPDEPGLPLPGPGPALPTNGAIPI------PEPGTAPAVSLMN 84


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_058381  RTXTOXIND     36     6e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 36.3 bits (84), Expect = 6e-05
Identities = 17/134 (12%), Positives = 38/134 (28%), Gaps = 2/134 (1%)

Query: 24 AAAALCAAGLSVALPAQAESMEERLRAQLRSTTQQLQQLQSEQAQVNAAKAAAEAQRDAA 83
+ + L + + + Q L + ++E+ V A E
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 84 QKELVALRSQLASAKGQAEKLAGQQEAVMESAQAQVAASHAQLGKFKGAYDEL-LTLSRA 142
+ L S L + Q+ +E A ++ +QL + +
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVE-AVNELRVYKSQLEQIESEILSAKEEYQLV 292

Query: 143 KEAERQTLARTLAQ 156
+ + + L Q
Sbjct: 293 TQLFKNEILDKLRQ 306


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_058401  TONBPROTEIN   42     6e-07    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 42.3 bits (99), Expect = 6e-07
Identities = 16/47 (34%), Positives = 18/47 (38%)

Query: 57 PLPPPPPPPPEPEKPPEPEPPKEEPVAPPEPEPQPTPEPPKPQDEAP 103
L PP P PE EPEP E PP+ P +P P
Sbjct: 54 DLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKP 100



Score = 33.0 bits (75), Expect = 7e-04
Identities = 19/92 (20%), Positives = 31/92 (33%), Gaps = 1/92 (1%)

Query: 16 RRWLKLAAVAVALIAAAYALWRWANDMAGVRREAPKATAIIPLPPPPPPPPEPEKPPEPE 75
RR+ ++V + A A + + + AP + + P P P PE
Sbjct: 7 RRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVTPADLEPPQAVQPPPE 66

Query: 76 PPKEEPVAPPEPEPQPTPEP-PKPQDEAPPRP 106
P E P P P + + P+P
Sbjct: 67 PVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKP 98


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_058421  RTXTOXINA     30     0.008    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 30.3 bits (68), Expect = 0.008
Identities = 14/50 (28%), Positives = 24/50 (48%), Gaps = 3/50 (6%)

Query: 97 KENQVLGAKLGSLSNAIAGGPYIGLLGTVLGIMVVFLGTAMAGDVNINAI 146
++ Q G LG + I +G G +L FLGTA++ + I+ +
Sbjct: 119 QKYQKAGNILGGGAENIGDN--LGKAGGILSTFQNFLGTALSS-MKIDEL 165


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_058441  PF00577       36     4e-04    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 36.0 bits (83), Expect = 4e-04
Identities = 36/213 (16%), Positives = 62/213 (29%), Gaps = 27/213 (12%)

Query: 242 TDNAKVWSGSYSMPLDKQWSLQFTG--YKSDSNVATIGGTNVLGK--GYSFGMSAIYTLA 297
+ + + + L W++ + G G +G S M+ +
Sbjct: 390 QEKPRFFQSTLLHGLPAGWTI-YGGTQLADRYRAFNFGIGKNMGALGALSVDMTQANSTL 448

Query: 298 PQGDWYNSLSVGLDYKKFDETTRFGGNEDLIPLKYVPFTFSYNGYRYSEASQSSIGLSLV 357
P ++ SV Y K + GYRYS + + +
Sbjct: 449 PDDSQHDGQSVRFLYNKSLNESGT--------------NIQLVGYRYSTSGYFNFADTTY 494

Query: 358 GASRSFFGLGSDWKEFDDKRYRASPSFALL---RGDGTHTQNLFGDWQ-LGLRAGFQLAS 413
+ D ++ + A + T TQ L G L L Q
Sbjct: 495 SRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQL-GRTSTLYLSGSHQTYW 553

Query: 414 GALVSNEQFSAGGSTSVRGY---LAAERTGDDG 443
G +EQF AG +T+ L+ T +
Sbjct: 554 GTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAW 586


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_058461  PF05860       58     8e-12    haemagglutination activity domain.
		>PF05860#haemagglutination activity domain.

Length = 117

Score = 58.3 bits (141), Expect = 8e-12
Identities = 25/99 (25%), Positives = 47/99 (47%), Gaps = 6/99 (6%)

Query: 132 TEGGRQLVTI-EQTQSRAILNWDTFNVGRNTTLRFQQKAD-DAVLNRVVGASARPSQIQG 189
TEG +++ Q S ++ F+V + T F + +++RV G S S I G
Sbjct: 17 TEGNTRIIERGTQAGSNLFHSFQEFSVPTSGTAFFNNPTNIQNIISRVTGGS--VSNIDG 74

Query: 190 AIQADGT--VLVVNQNGVIFSGTSQVNARNLVVAAATMS 226
I+A+ T + ++N NG+IF ++++ V +
Sbjct: 75 LIRANATANLFLINPNGIIFGQNARLDIGGSFVGSTANR 113


121  NH44784_059161  NH44784_059201  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_059161  -2      9        0.487001   efflux transporter, RND family, MFP subunit
NH44784_059171  -2      9        -0.405502  RND multidrug efflux transporter; Acriflavin
NH44784_059181  -2      8        0.227508   RND efflux system, outer membrane lipoprotein
NH44784_059191  -2      11       -0.399488  hypothetical protein
NH44784_059201  -2      10       -0.830051  TRAP-type C4-dicarboxylate transport
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_059161  RTXTOXIND     46     1e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 46.4 bits (110), Expect = 1e-07
Identities = 22/130 (16%), Positives = 45/130 (34%), Gaps = 7/130 (5%)

Query: 23 RTVELRSRVGGTVDAVSVPEGSLVRQGQPLFQIDPRPFQVALDTALAQLRQAQALADQAQ 82
R+ E++ V + V EG VR+G L ++ + + L QA+ + Q
Sbjct: 95 RSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQ 154

Query: 83 TDFDRAER------LVSTGAVSRKVHDEAASTRRARLAQVQVAKAAVAAARLDLSYARIT 136
E + + V +E + L + Q + + +L+ +
Sbjct: 155 ILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTS-LIKEQFSTWQNQKYQKELNLDKKR 213

Query: 137 APIAGRVDRV 146
A + R+
Sbjct: 214 AERLTVLARI 223



Score = 32.5 bits (74), Expect = 0.003
Identities = 18/100 (18%), Positives = 42/100 (42%), Gaps = 3/100 (3%)

Query: 61 QVALDTALAQLRQAQALADQAQTDFDRAERLVSTGAVSRKVHDEAASTRRARLAQVQVAK 120
+ A+ +LR ++ +Q +++ A+ V++ +E R + +
Sbjct: 258 ENKYVEAVNELRVYKSQLEQIESEILSAKEEYQL--VTQLFKNEILDKLRQTTDNIGLLT 315

Query: 121 AAVAAARLDLSYARITAPIAGRVDRVMV-TEGNLVSGGTA 159
+A + I AP++ +V ++ V TEG +V+
Sbjct: 316 LELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAET 355


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_059171  ACRIFLAVINRP  1055   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1055 bits (2731), Expect = 0.0
Identities = 433/1041 (41%), Positives = 649/1041 (62%), Gaps = 19/1041 (1%)

Query: 6 FFIARPIFAIVLSLLMLLAGAIAFWQLPLSEYPAVTPPTVQVTASYPGANPEVIADTVAA 65
FFI RPIFA VL++++++AGA+A QLP+++YP + PP V V+A+YPGA+ + + DTV
Sbjct: 4 FFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQ 63

Query: 66 PLEQVINGVEGMLYMNSQMATDGRMVLTIAFKQGTDPDMAQIQVQNRVSRALPRLPEEVQ 125
+EQ +NG++ ++YM+S + G + +T+ F+ GTDPD+AQ+QVQN++ A P LP+EVQ
Sbjct: 64 VIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQ 123

Query: 126 RIGVVTQKTSPDVLMVVHIVSPQKRYDSLYLSNFAIRQVRDELARLPGVGDVLVWGAGEY 185
+ G+ +K+S LMV VS +S++ V+D L+RL GVGDV ++G +Y
Sbjct: 124 QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG-AQY 182

Query: 186 SMRVWLDPSRVATRGLTASDVVAALREQNVQVAAGSVGQQPDTSA-AFQVTVNTLGRLAS 244
+MR+WLD + LT DV+ L+ QN Q+AAG +G P ++ R +
Sbjct: 183 AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKN 242

Query: 245 EEQFGDIVVKTGADGQVTRLRDIARVSLGADAYTLRSLIDGESAPALQIIQSPGANAIDV 304
E+FG + ++ +DG V RL+D+ARV LG + Y + + I+G+ A L I + GANA+D
Sbjct: 243 PEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALDT 302

Query: 305 SNAVRAKMETLQQGFPQDISYRIAYDPTVFVRASLQSVAVTLLEAVILVVLVVVLFLQTW 364
+ A++AK+ LQ FPQ + YD T FV+ S+ V TL EA++LV LV+ LFLQ
Sbjct: 303 AKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQNM 362

Query: 365 RASIIPLVAVPVSLVGTFAVMHMFGFSLNTLSLFGLVLSIGIVVDDAIVVVENVERHMA- 423
RA++IP +AVPV L+GTFA++ FG+S+NTL++FG+VL+IG++VDDAIVVVENVER M
Sbjct: 363 RATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMME 422

Query: 424 LGERPIDAARKAMDEVTGPILAITSVLAAVFIPSAFLSGLQGEFYRQFALTIAISTVLSA 483
P +A K+M ++ G ++ I VL+AVFIP AF G G YRQF++TI + LS
Sbjct: 423 DKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSV 482

Query: 484 INSLTLSPALAAMLLRPHHEAARADWLTRVIDRALGGFFRRFNRFFDRASQAYVVAVRRA 543
+ +L L+PAL A LL+P ++ GGFF FN FD + Y +V +
Sbjct: 483 LVALILTPALCATLLKP---------VSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKI 533

Query: 544 VRGSAIVLLLYAGFTGLTWLGFNQVPNGFVPAQDKYFLVGIAQLPSGASLDRTEAVVKQM 603
+ + LL+YA + F ++P+ F+P +D+ + + QLP+GA+ +RT+ V+ Q+
Sbjct: 534 LGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQV 593

Query: 604 SEIALA--EPGVESVVAFPGLSVNGPVNVPNSALMFAMLKPFDERQDAALSANAIAGKLM 661
++ L + VESV G S +G N+ + F LKP++ER SA A+ +
Sbjct: 594 TDYYLKNEKANVESVFTVNGFSFSG--QAQNAGMAFVSLKPWEERNGDENSAEAVIHRAK 651

Query: 662 GKFSQIPDGFVGIFPPPPVPGLGAMGGFKLQIEDRAALGFEALAQAQGQIMARAMQAP-E 720
+ +I DGFV F P + LG GF ++ D+A LG +AL QA+ Q++ A Q P
Sbjct: 652 MELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPAS 711

Query: 721 LANMLASFQTNAPQLQVDVDRVKAKSLGVSLTDVFETLQINLGSLYVNDFNRFGRTYRVM 780
L ++ + + Q +++VD+ KA++LGVSL+D+ +T+ LG YVNDF GR ++
Sbjct: 712 LVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLY 771

Query: 781 AQADAPFRMQAEDIGLLKVRNADGHMIPLSAFVTLTRGSGPDRVIHYNGFPSADISGGPA 840
QADA FRM ED+ L VR+A+G M+P SAF T G R+ YNG PS +I G A
Sbjct: 772 VQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAA 831

Query: 841 PGYSSGQATDAIEQIVRETLPDGMVYEWTDLVYQEKQAGNSALYIFPLAVLLAFLILAAQ 900
PG SSG A +E + + LP G+ Y+WT + YQE+ +GN A + ++ ++ FL LAA
Sbjct: 832 PGTSSGDAMALMENLASK-LPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAAL 890

Query: 901 YNSWSLPFAVLLIAPMALLSAIGGVWLSGGDNNIFTQIGFVVLVGLAAKNAILIVEFAR- 959
Y SWS+P +V+L+ P+ ++ + L N+++ +G + +GL+AKNAILIVEFA+
Sbjct: 891 YESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKD 950

Query: 960 AKEDEGADPLAAVLEAARLRLRPILMTSFAFIAGVVPLVLASGAGAEMRHAMGIAVFAGM 1019
E EG + A L A R+RLRPILMTS AFI GV+PL +++GAG+ ++A+GI V GM
Sbjct: 951 LMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGM 1010

Query: 1020 LGVTLFGLLLTPVFYVVVRKL 1040
+ TL + PVF+VV+R+
Sbjct: 1011 VSATLLAIFFVPVFFVVIRRC 1031



Score = 98.4 bits (245), Expect = 7e-23
Identities = 76/435 (17%), Positives = 140/435 (32%), Gaps = 44/435 (10%)

Query: 643 FDERQDAALSANAIAGKLMGKFSQIPDGFVGIFPPPPVPGLGAMGGFKLQIEDRAALGFE 702
F D ++ + KL +P + + + + GF
Sbjct: 94 FQSGTDPDIAQVQVQNKLQLATPLLPQEV----QQQGISVEKSSSSYLMVA------GFV 143

Query: 703 ALAQAQGQIMARAMQAPELANML------ASFQTNAPQLQ--VDVDRVKAKSLGVSLTDV 754
+ Q A + + L Q Q + +D ++ DV
Sbjct: 144 SDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQYAMRIWLDADLLNKYKLTPVDV 203

Query: 755 FETLQIN--------LGSLYVNDFNRFGRTYRVMAQADAPFRMQAEDIGLLKVR-NADGH 805
L++ LG + + + P E+ G + +R N+DG
Sbjct: 204 INQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKNP-----EEFGKVTLRVNSDGS 258

Query: 806 MIPLSAFVTLTRGSGPDRVI-HYNGFPSADISGGPAPGYSSGQATDAIEQIV---RETLP 861
++ L + G VI NG P+A + A G ++ AI+ + + P
Sbjct: 259 VVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALDTAKAIKAKLAELQPFFP 318

Query: 862 DGM----VYEWTDLVYQEKQAGNSALYIFPLAVLLAFLILAAQYNSWSLPFAVLLIAPMA 917
GM Y+ T V L+ A++L FL++ + + P+
Sbjct: 319 QGMKVLYPYDTTPFVQLSIHEVVKTLFE---AIMLVFLVMYLFLQNMRATLIPTIAVPVV 375

Query: 918 LLSAIGGVWLSGGDNNIFTQIGFVVLVGLAAKNAILIVE-FARAKEDEGADPLAAVLEAA 976
LL + G N T G V+ +GL +AI++VE R ++ P A ++
Sbjct: 376 LLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSM 435

Query: 977 RLRLRPILMTSFAFIAGVVPLVLASGAGAEMRHAMGIAVFAGMLGVTLFGLLLTPVFYVV 1036
++ + A +P+ G+ + I + + M L L+LTP
Sbjct: 436 SQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSVLVALILTPALCAT 495

Query: 1037 VRKLALRRAPSRARA 1051
+ K
Sbjct: 496 LLKPVSAEHHENKGG 510


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_059181  PYOCINKILLER  30     0.023    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 30.1 bits (67), Expect = 0.023
Identities = 29/130 (22%), Positives = 44/130 (33%), Gaps = 5/130 (3%)

Query: 209 LRLTTDTLATQQRSYDLTAQLARVGNATQLDLRLAEITLRAAEADRAAYTRQAARDRNAL 268
L++ +TL + S + A A R AE R A RAA T + + +
Sbjct: 200 LQIRMNTLTAAKASIEAAAANKAREQAAAEAKRKAEEQARQQAAIRAANTYAMPANGSVV 259

Query: 269 VLLLGQPLTPALARQLDQANTLPDDI-----VPATLPSGLPAQLLARRPDLRAAEQRLRA 323
G+ L A + D I V A+ PS + + R AEQ
Sbjct: 260 ATAAGRGLIQVAQGAASLAQAISDAIAVLGRVLASAPSVMAVGFASLTYSSRTAEQWQDQ 319

Query: 324 ANANIGAARA 333
++ A
Sbjct: 320 TPDSVRYALG 329


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_059201  PF03544  33     0.003    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 33.0 bits (75), Expect = 0.003
Identities = 24/148 (16%), Positives = 43/148 (29%), Gaps = 9/148 (6%)

Query: 297 MTLALAPLAAAGLLVGAVHYAVTQPEVVYDIPAELRLGSDEFQS-GLSEPPLGGLAEPPT 355
L+ ++ G ++ +V Q V ++PA + S + EPP P
Sbjct: 16 WPTLLSVCIHGAVVAGLLYTSVHQ---VIELPAPAQPISVTMVAPADLEPPQAVQPPPEP 72

Query: 356 DGLAEPPADGLAEPPASGVTEPAASGLAEPPGT-----DGGGQAATAPGAPQRAGADAAS 410
EP + + EPP P + P + A +
Sbjct: 73 VVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFENT 132

Query: 411 GSGAPADTASTSPTGGEKPAPRGFWIFL 438
P + +T+ T + L
Sbjct: 133 APARPTSSTATAATSKPVTSVASGPRAL 160


122  NH44784_061441  NH44784_061541  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_061441  -2      10       0.983728   Transcriptional regulator, TetR family
NH44784_061451  -2      11       0.839864   Acid phosphatase
NH44784_061461  -2      11       1.032860   possible MutT-like domain
NH44784_061471  -2      11       0.932120   Selenophosphate-dependent tRNA 2-selenouridine
NH44784_061481  -2      11       0.612501   outer membrane autotransporter
NH44784_061491  -3      11       -0.305967  Signal transduction histidine kinase
NH44784_061501  -1      16       -0.098002  Probable response regulator
NH44784_061511  0       14       0.301988   periplasmic sensor hybrid histidine kinase
NH44784_061521  3       16       0.000002   hypothetical protein
NH44784_061531  1       13       0.087854   hypothetical protein
NH44784_061541  1       10       -1.274705  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_061441  HTHTETR  68     2e-16    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 68.5 bits (167), Expect = 2e-16
Identities = 39/194 (20%), Positives = 72/194 (37%), Gaps = 11/194 (5%)

Query: 24 AEVRLDELMGAAQALFLEKGVEATTISEITEAAGVAKGTFYVYFPSKQEMLVALGERYVR 83
A+ ++ A LF ++GV +T++ EI +AAGV +G Y +F K ++ + E
Sbjct: 9 AQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSES 68

Query: 84 EFVERLERAVAACAEDDWEGRLRAWIHANVSMYLRTYR--VHDIVFTHPHHHA------- 134
E A D IH S R + +I+F
Sbjct: 69 NIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQ 128

Query: 135 REGHTENQVVTQLQGILRAGAQAGAWRLD-NPRMMALLIYAGVHCATDDAILSRQS-DAT 192
+ + + +++ L+ +A D R A+++ + ++ + + QS D
Sbjct: 129 AQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFAPQSFDLK 188

Query: 193 DFARGIADAYLRML 206
AR L M
Sbjct: 189 KEARDYVAILLEMY 202


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_061481  PRTACTNFAMLY  94     1e-21    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 94.3 bits (234), Expect = 1e-21
Identities = 206/914 (22%), Positives = 316/914 (34%), Gaps = 126/914 (13%)

Query: 202 TLHGGKITTTGNYGDGISVRGANALVSVYGTEINVGAGVGARLEGGRFVMDGGSLTAGGS 261
T G I +G GI + A E+ G V G L+ G
Sbjct: 63 TASGTTIKVSGRQAQGILLENPAA-------ELQFRNGS---------VTSSGQLSDDGI 106

Query: 262 AVMLSSGTGAASSVDIRNATLTST----GDFGYGININAKDTSAAVENVSISALG--YYG 315
L + T A + +ATL + D G + + + A++ + ++ G
Sbjct: 107 RRFLGTVTVKAGKLVADHATLANVGDTWDDDGIALYVAGEQAQASIADSTLQGAGGVQIE 166

Query: 316 TGVWLPSTGTSFTATGFDLTSSHVGVDNRAG--RVTLVDGNVTTRDANAHGLYVSREYGS 373
G + ++ G + + RV L D NVT A+ VS S
Sbjct: 167 RGANVTVQRSAIVDGGLHIGALQSLQPEDLPPSRVVLRDTNVTAVPASGAPAAVSVLGAS 226

Query: 374 SATIKATKVNVKTRGDGAVGVLARASGAAIELLDASVTTEGASAYGLFASGS----GVSL 429
T+ + G A GV A GA + L A++ A A G G+ V
Sbjct: 227 ELTLDGGHIT----GGRAAGVAA-MQGAVVHLQRATIRRGDAPAGGAVPGGAVPGGAVPG 281

Query: 430 SARNTQISTSGANATGLSVSNRAAVTLDNTGFTLAGAGAHGIWSYVTAAGVSNTVTLRNG 489
G+ VS ++V L + GA G VT+ G
Sbjct: 282 GFGPGGFGPVLDGWYGVDVSG-SSVELAQSIVEAPELGAAI------RVGRGARVTVSGG 334

Query: 490 SRIDTQDGVGLLASGGGHTFLVSDS--DITARAGGDVGSGVLLHSRAVTVTSGGVSTVIE 547
S + G ++ +GG F + IT +AG LL+ T+
Sbjct: 335 S-LSAPHG-NVIETGGARRFAPQAAPLSITLQAGAHAQGKALLYRVLPEPVK---LTLTG 389

Query: 548 SEQVTLDATNARLTGDVLADSGTVDVSLANGSTLNGALVQRGTGRINGLTLDGTSTWIVR 607
D L G +DV+LA+ + GA T ++ L++D +TW++
Sbjct: 390 GADAQGDIVATELPSIPGTSIGPLDVALASQARWTGA-----TRAVDSLSIDN-ATWVMT 443

Query: 608 GDSSLATLS--NAGTVAFAAPAGGAGFKTLTVNNYVGGGTLVLNTQLGDDASPTDKLVID 665
+S++ L + G+V F PA FK LTVN G G +N D +DKLV+
Sbjct: 444 DNSNVGALRLASDGSVDFQQPAEAGRFKVLTVNTLAGSGLFRMNVFA--DLGLSDKLVVM 501

Query: 666 GGTTSGNTALRIVNAGGSGGQTTYGIRVVETINGGTTTADAFHLDSGSTGYRASARTVSL 725
+ + V GS + + +V+T G T + D V +
Sbjct: 502 QDASGQHRLW--VRNSGSEPASANTLLLVQTPLGSAATFTLANKDG----------KVDI 549

Query: 726 NGYDYSLVRGGNGAVAPDWYLTSDYTPPVTPPDPVDPINPTDPGDPTGPVTPPDPVLPGG 785
Y Y L GNG W L PP P P P P P P P P G
Sbjct: 550 GTYRYRLAANGNGQ----WSLVGAKAPP--APKPAPQPGPQPPQPPQPQPEAPAPQPPAG 603

Query: 786 PGFKNVSPESGAYAGNRLAATRLFTHS--LHDRVPA-YADGDADARHGRGLWARVQGRHD 842
+ + G LA+T + S L R+ + DA GRG R Q +
Sbjct: 604 RELSAAANAAVNTGGVGLASTLWYAESNALSKRLGELRLNPDAGGAWGRGFAQRQQLDNR 663

Query: 843 SGLRMSEGRVDVDTDSAMVQLGGDLLKAPLGREGALYAGLMGGYGDARSRSVSTLMLPGA 902
+G R D A +LG D A G + G + GY
Sbjct: 664 AGRRF-------DQKVAGFELGAD--HAVAVAGGRWHLGGLAGYTRGD------------ 702

Query: 903 TQSTHARARGKVSGYAGGVYGTFYANDLTRLGAYADTWIQYGRYTNQ---VSSELGSVR- 958
+ G G Y T+ A+ G Y D ++ R N S+ +V+
Sbjct: 703 -RGFTGDGGGHTDSVHVGGYATYIADS----GFYLDATLRASRLENDFKVAGSDGYAVKG 757

Query: 959 -YRSNVWSASLEGGYALKPFAAGSALEALVIEPNAQLVY-----SRYRAQDATLQGTRMR 1012
YR++ ASLE G + + +EP A+L YRA + G R+R
Sbjct: 758 KYRTHGVGASLEAGRRF------THADGWFLEPQAELAVFRAGGGAYRAAN----GLRVR 807

Query: 1013 SGDNGSWQSRVGVRLYPQVTPQPGESSVRPFLETNWLHRSD-DPTVRMGSTTLQAQPSRN 1071
S R+G+ + ++ G V+P+++ + L D TV + +
Sbjct: 808 DEGGSSVLGRLGLEVGKRIELAGG-RQVQPYIKASVLQEFDGAGTVHTNGIAHRTELRGT 866

Query: 1072 ALELKVGAEGRVGK 1085
EL +G +G+
Sbjct: 867 RAELGLGMAAALGR 880


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
NH44784_061491  HTHFIS  68     8e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 68.3 bits (167), Expect = 8e-14
Identities = 32/160 (20%), Positives = 64/160 (40%), Gaps = 8/160 (5%)

Query: 852 RILVAEDNPINQVTMCDQLEQLGCQVTVAPDGAEALAQWNIEPFDVVLTDVNMPRMNGYE 911
ILVA+D+ + + L + G V + + A D+V+TDV MP N ++
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 912 LAGALRAQGVTVPIIGVTANALKDEEARCKAAGMSNWLVKPIKLSLLWSQLRAVRGEPAA 971
L ++ +P++ ++A + G ++L KP L+ L + + A
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTEL---IGIIGRALAE 121

Query: 972 PRTAGAAEAASPRAAPPIVGKYREVFLETMTQDLARMEQA 1011
P+ + + P+VG+ M + + +
Sbjct: 122 PKRRPSKLEDDSQDGMPLVGRS-----AAMQEIYRVLARL 156


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
NH44784_061501  HTHFIS  63     8e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 62.9 bits (153), Expect = 8e-13
Identities = 28/153 (18%), Positives = 53/153 (34%), Gaps = 6/153 (3%)

Query: 8 KVLVLDDHALQCLHLKDMLQQAGFGHVDTVESAGAALDRISAEGYHLVLMDISMPGMDGV 67
+LV DD A L L +AG+ V +A I+A LV+ D+ MP +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGY-DVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 68 QFIHELARLNLRPILAVVTACSRRMANSVGLMAKENGFSMLGTFVKPVTGEQIASLADRL 127
+ + + + V++A N+ K + KP ++ + R
Sbjct: 64 DLLPRIKKARPDLPVLVMSA-----QNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA 118

Query: 128 RRRAPDDAQEPQAHRGDTEGLLDRASVESALRD 160
+ + D L+ R++ +
Sbjct: 119 LAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYR 151


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
NH44784_061511  HTHFIS  70     1e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 70.2 bits (172), Expect = 1e-14
Identities = 25/102 (24%), Positives = 46/102 (45%)

Query: 802 RILVVEDNVINQLILREQLEHLGCAVTLAGNGEEALQRWRGERFDLVLTDLNMPVVNGYE 861
ILV +D+ + +L + L G V + N + DLV+TD+ MP N ++
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 862 LARALREGGYDGPLIGLTSSSAAEIAQRGVAAGMTQVLCKPL 903
L +++ D P++ +++ + A + G L KP
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPF 106


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
NH44784_061541  RTXTOXINA  29     0.003    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 29.2 bits (65), Expect = 0.003
Identities = 13/57 (22%), Positives = 24/57 (42%), Gaps = 7/57 (12%)

Query: 21 AGPALAAEKKPTAGDISRAI---AYADVMRAFAYSKQATEFKERFDKIGPGPDGLEA 74
+ A A ++ AI ++ + F + + E+ +RF K+G D L A
Sbjct: 303 SAAAAGL----IASAVTLAISPLSFLSIADKFKRANKIEEYSQRFKKLGYDGDSLLA 355


123  NH44784_062431  NH44784_062571  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_062431  -1      14       1.271708   Protein secretion chaperonin CsaA
NH44784_062441  -1      13       0.859241   Co-activator of prophage gene expression IbrA
NH44784_062451  0       10       0.686094   hypothetical protein
NH44784_062461  -1      9        0.467741   Sensory histidine kinase QseC
NH44784_062471  1       10       -0.508936  Two-component system response regulator QseB
NH44784_062481  0       10       -0.464107  probable membrane protein
NH44784_062491  0       11       -1.790521  Putative fimbrial-like protein
NH44784_062501  -1      11       -2.003277  FIG036507: Fimbriae usher protein StdB
NH44784_062511  -2      12       -2.797830  Probable fimbrial chaperone protein
NH44784_062521  -3      12       -2.488738  hypothetical protein
NH44784_062531  -3      12       -2.784607  hypothetical protein
NH44784_062541  -2      11       -2.973970  two component response regulator
NH44784_062551  -1      12       -3.129289  kinase sensor protein
NH44784_062561  0       14       -2.419401  DNA-binding response regulator, LuxR family
NH44784_062571  0       13       -1.991364  ABC opine/polyamine transporter, periplasmic
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_062431  PF07675  27     0.017    Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 27.4 bits (60), Expect = 0.017
Identities = 13/33 (39%), Positives = 15/33 (45%)

Query: 91 VRSEVLVLGLADDAGDTVLVTPEFDVPNGARLT 123
V S + D LVTPE +PNG LT
Sbjct: 677 VSSASYINFEGPQNPDNYLVTPELSLPNGGTLT 709


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_062461  PF06580  30     0.019    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 29.8 bits (67), Expect = 0.019
Identities = 20/118 (16%), Positives = 38/118 (32%), Gaps = 27/118 (22%)

Query: 295 LPDALLDCAVRN-----LVDNALKY----SPDDKPVEVDAGIVADWLVVTVRDHGSGLTP 345
+ A++D V LV+N +K+ P + + + + V + GS
Sbjct: 246 INPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALK 305

Query: 346 AQCAQAAEPFWRGHRQVEGAGLGLSIV-ATIAARHGG--MLELTPAVGGGLCGRLSLP 400
E G GL V + +G ++L+ G + +P
Sbjct: 306 --------------NTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKV-NAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
NH44784_062471  HTHFIS  97     2e-25    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 96.8 bits (241), Expect = 2e-25
Identities = 41/135 (30%), Positives = 61/135 (45%), Gaps = 4/135 (2%)

Query: 2 RILLVEDDASIASAIQAGLGLQGFVVDAVGSLRQADLAVQTSHASACVLDLCLPDGDGVT 61
IL+ +DDA+I + + L G+ V + + V D+ +PD +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 LLAKWRARRQDLPVLVLTARSAIAQRVAALRAGADDYVLKPFDLDELAAR----LQALIR 117
LL + + R DLPVLV++A++ + A GA DY+ KPFDL EL L R
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 118 RAAGHSADRIDHGRL 132
R + D D L
Sbjct: 125 RPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_062501  PF00577  596    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 596 bits (1537), Expect = 0.0
Identities = 228/847 (26%), Positives = 378/847 (44%), Gaps = 65/847 (7%)

Query: 2 CTAQATEFSTGFLNTKDKNNIDLSTFSRDGYIAPGSYLLDIYLDQRLIQGQTLVKAVPVI 61
++ F+ FL + DLS F + PG+Y +DIYL+ + + +
Sbjct: 42 LSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNNGYMATRDVTFNTGDS 101

Query: 62 GDGTVFCVTPGMVDMLSLKDEFRARLAQVHTAEGGPCIDLDT--PDSRVVYSAEHQSLTL 119
G V C+T + + L + + + C+ L + D+ Q L L
Sbjct: 102 EQGIVPCLTRAQLASMGLNTASVSGMNLLAD---DACVPLTSMIHDATAQLDVGQQRLNL 158

Query: 120 TVPQAWLQYQDPDWVPPARWSNGVNGVILDYNVLANRYMPRQGAGSASYTLYGTGGLNLG 179
T+PQA++ + ++PP W G+N +L+YN N R G S L GLN+G
Sbjct: 159 TIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGNSHYAYLNLQSGLNIG 218

Query: 180 AWRLRSDYQYNRYDSGGQS--QARFSLPQTYLFRALPQWRSKLTLGQTYLASAIFDPFRF 237
AWRLR + ++ S S + ++ T+L R + RS+LTLG Y IFD F
Sbjct: 219 AWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLGDGYTQGDIFDGINF 278

Query: 238 AGVTLASDERMLPPSLQGYAPQITGIATSNAEVTVSQNGRLLYQTRVSPGPFVLPALNQY 297
G LASD+ MLP S +G+AP I GIA A+VT+ QNG +Y + V PGPF + +
Sbjct: 279 RGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAA 338

Query: 298 -ISGNLDVSLRESDGTKRSWQVSTASVPFMTRKGNLRYQVSLGRPLFGGPSGNHVAKPRF 356
SG+L V+++E+DG+ + + V +SVP + R+G+ RY ++ G KPRF
Sbjct: 339 GNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYR---SGNAQQEKPRF 395

Query: 357 MAGEATWGAFNNTSLYGGLIVTDDNYQALALGAGQNMGALGALSADVTRSDARLPYSSAP 416
G ++YGG + D Y+A G G+NMGALGALS D+T++++ LP
Sbjct: 396 FQSTLLHGLPAGWTIYGGTQLADR-YRAFNFGIGKNMGALGALSVDMTQANSTLP--DDS 452

Query: 417 RRTGYSYRVNYAKSFDDLGSTLAFMGYRFSGRHFLSMREFI------------------- 457
+ G S R Y KS ++ G+ + +GYR+S + + +
Sbjct: 453 QHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIETQDGVIQVK 512

Query: 458 VRSALHGGDFRDEKQSYTVSYSQYIQPLELSVSLSLSRLNYWNDAAANNHYMLSFNKNAR 517
+ + +++ ++ +Q + ++ LS S YW + + + N
Sbjct: 513 PKFTDYYNLAYNKRGKLQLTVTQQL-GRTSTLYLSGSHQTYWGTSNVDEQFQAGLNTA-- 569

Query: 518 FGPLRNVSLSLSLARTQSVYG-PTQNQVYASLSIPLGDSR-----------QLSYGYQNS 565
+++ +LS + T++ + + +++IP SY +
Sbjct: 570 ---FEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHASASYSMSHD 626

Query: 566 GGGRMQQNVGY--TDFSNPDTTWNLSAS-DDRDAPDRHQSVSGNIQSRTPYGRAAGDFTL 622
GRM G T + + ++++ + + + R YG A ++
Sbjct: 627 LNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIGYSH 686

Query: 623 QPGQYRSVGLNWYGSVTATAEGAAFGPPSSGNEPRMMIDTDGIAGVQIENGDGV-TNRFG 681
+ + G V A A G G P N+ +++ G ++EN GV T+ G
Sbjct: 687 SDD-IKQLYYGVSGGVLAHANGVTLGQPL--NDTVVLVKAPGAKDAKVENQTGVRTDWRG 743

Query: 682 IAVVNGVSSYRESNLAVDVNALPDGVDVADAMISQVLTEGAIGYTRVGARHGEQVLGRVE 741
AV+ + YRE+ +A+D N L D VD+ +A+ + V T GAI AR G ++L +
Sbjct: 744 YAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIKLLMTLT 803

Query: 742 LADGSHPPLGALVLSTRSGKTAGMVADGGLVYLNVEPDDREALTVTWGGARECR----LA 797
+ P GA+V S S +++G+VAD G VYL+ P + V WG
Sbjct: 804 -HNNKPLPFGAMVTS-ESSQSSGIVADNGQVYLSGMP-LAGKVQVKWGEEENAHCVANYQ 860

Query: 798 LPAAAAI 804
LP +
Sbjct: 861 LPPESQQ 867


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_062521  BCTERIALGSPF  24     0.028    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 23.6 bits (51), Expect = 0.028
Identities = 8/41 (19%), Positives = 17/41 (41%), Gaps = 1/41 (2%)

Query: 2 ALAAVMAWCLSLSESACVRYGGWIAGGTLLGMLGLGLWVSR 42
AL + +S+ A +G W+ L G + + + +
Sbjct: 208 ALPLSTRVLMGMSD-AVRTFGPWMLLALLAGFMAFRVMLRQ 247


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
NH44784_062541  HTHFIS  67     3e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 67.2 bits (164), Expect = 3e-14
Identities = 30/140 (21%), Positives = 52/140 (37%), Gaps = 7/140 (5%)

Query: 1 MLSYRVLIVDDQLLQREYLKNLFLQAGIAHVETAENGSHALRLLEKQRYDLVLSDLLMPE 60
M +L+ DD R L +AG V N + R + DLV++D++MP+
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYD-VRITSNAATLWRWIAAGDGDLVVTDVVMPD 59

Query: 61 LDGVQLIQRLSTYETPPLLAVISSSPKRLIAGASLVAKTLGMRVIDQLSKPAQPQAIIAL 120
+ L+ R+ + V+S+ K D L KP +I +
Sbjct: 60 ENAFDLLPRIKKARPDLPVLVMSAQ-----NTFMTAIKASEKGAYDYLPKPFDLTELIGI 114

Query: 121 VAK-LKAASRLKAESLVSGQ 139
+ + L R ++ Q
Sbjct: 115 IGRALAEPKRRPSKLEDDSQ 134


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
NH44784_062551  HTHFIS  77     1e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 77.2 bits (190), Expect = 1e-16
Identities = 27/117 (23%), Positives = 53/117 (45%)

Query: 846 RVLVAEDNPVNRTLLVRQLEELGCEVTLCVDGVETLARWNSDEFDVLITDMNMPLMNGYE 905
+LVA+D+ RT+L + L G +V + + + + D+++TD+ MP N ++
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 906 LVRTLRARHTKAPIIGLTADALQTQRDAGFAAGLNAWVVKPVDLRTLHDALSTACAR 962
L+ ++ P++ ++A G ++ KP DL L + A A
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
NH44784_062561  HTHFIS  58     5e-12    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 57.5 bits (139), Expect = 5e-12
Identities = 38/183 (20%), Positives = 70/183 (38%), Gaps = 13/183 (7%)

Query: 2 TRILLADDHQVVILGVRNVLEGVARHHIVGEALSATELIRKARELTPDIIITDYNMPSED 61
IL+ADD + + L + V +A L R D+++TD MP
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAG--YDVRITSNAATLWRWIAAGDGDLVVTDVVMPD-- 59

Query: 62 GYGDGLALVQYLRRNFPRVRILIHTMISSPVIVASLYEEGVSGVLFKSGDLAEIL----T 117
+ L+ +++ P + +L+ + ++ + E+G L K DL E++
Sbjct: 60 --ENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117

Query: 118 ALGALENNQIYRSTPQVLAANHPGGA---QDAATRIARLSPKELEVLRHFLSGTSVGDIA 174
AL + G + Q+ +ARL +L ++ SGT +A
Sbjct: 118 ALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVA 177

Query: 175 RIM 177
R +
Sbjct: 178 RAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
NH44784_062571  MALTOSEBP  29     0.038    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 28.5 bits (63), Expect = 0.038
Identities = 14/46 (30%), Positives = 25/46 (54%), Gaps = 4/46 (8%)

Query: 134 ALPAMYDSMTLVYAKSAFAQPPRSWREM----WDPKYRGKVAVATN 175
A P ++++L+Y K PP++W E+ + K +GK A+ N
Sbjct: 131 AYPIAVEALSLIYNKDLLPNPPKTWEEIPALDKELKAKGKSALMFN 176


124  NH44784_064141  NH44784_064261  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
NH44784_064141  1       10       0.877953   Multidrug resistance transporter, Bcr/CflA
NH44784_064151  1       10       1.015432   LSU ribosomal protein L28p
NH44784_064161  0       9        1.099003   LSU ribosomal protein L33p
NH44784_064171  1       10       1.726366   branched-chain amino acid-binding protein
NH44784_064181  1       12       1.920235   inner-membrane translocator
NH44784_064191  2       12       0.972967   Branched-chain amino acid transport ATP-binding
NH44784_064201  2       12       0.444725   Sulfate permease
NH44784_064211  0       11       0.248060   Phosphonopyruvate hydrolase
NH44784_064221  1       11       0.206732   hypothetical protein
NH44784_064231  0       16       -0.933412  D-alanyl-D-alanine carboxypeptidase
NH44784_064241  0       19       -0.965276  histone-like protein
NH44784_064251  -1      16       -0.032463  Transcriptional regulator, MarR family
NH44784_064261  -2      17       0.346418   MFS permease protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_064141  TCRTETB  60     6e-12    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 59.9 bits (145), Expect = 6e-12
Identities = 32/157 (20%), Positives = 62/157 (39%), Gaps = 1/157 (0%)

Query: 18 AFPTIAANLGVPRGDVERTLAAYLIGLALAQVFYGPMADRYGRKPPLMVGLALFMVASLG 77
+ P IA + P A+++ ++ YG ++D+ G K L+ G+ + S+
Sbjct: 36 SLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVI 95

Query: 78 CALAGS-VQALTGWRVLQAMGGAAGIVIPRAVIRDHYETHEAARAMSLLMLIMGLAPILA 136
+ S L R +Q G AA + V+ + +A L+ I+ + +
Sbjct: 96 GFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVG 155

Query: 137 PLAGGQLLGIASWRSLFWVMLAGGAMLMTAVVLIMKE 173
P GG + W L + + + + L+ KE
Sbjct: 156 PAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKE 192


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits         Score  E-value  Comments
NH44784_064161  IGASERPTASE  24     0.030    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 24.3 bits (52), Expect = 0.030
Identities = 8/30 (26%), Positives = 15/30 (50%)

Query: 9 IKLESTAGTGHFYTTTKNKRNMPEKMLIKK 38
+ + S +G G FY T +K+++ K
Sbjct: 889 LTVNSLSGNGSFYYLTDLSNKQGDKVVVTK 918


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_064191  PF05272  28     0.041    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.1 bits (62), Expect = 0.041
Identities = 9/21 (42%), Positives = 11/21 (52%)

Query: 35 MLALIGPNGAGKSTCFNVVGG 55
+ L G G GKST N + G
Sbjct: 598 SVVLEGTGGIGKSTLINTLVG 618


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_064201  FLGBIOSNFLIP  29     0.029    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 29.4 bits (66), Expect = 0.029
Identities = 25/93 (26%), Positives = 39/93 (41%), Gaps = 8/93 (8%)

Query: 87 SLALFAVLTPLAVAGSPGYIELALAVTVLVGLMQWLVGALRLGSLAHF-ISPSALFGFTS 145
+ L ++TPLA A PG L W + L + P+ L TS
Sbjct: 8 APVLLWLITPLAFAQLPGITSQPLPGGG----QSWSLPVQTLVFITSLTFIPAILLMMTS 63

Query: 146 GAALLIAVHALKDALGLPAPASHGAGALLIGLA 178
++I L++ALG P+ + +L+GLA
Sbjct: 64 FTRIIIVFGLLRNALGTPSAPPN---QVLLGLA 93


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_064221  PF05272  30     0.005    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.005
Identities = 15/77 (19%), Positives = 24/77 (31%), Gaps = 7/77 (9%)

Query: 53 QVAAAAVVRYFQTAALHHHEDEEEDLFPALIESMAGSDAVCLHALVDGLVADHARLAALW 112
Q+ A A+ Y ++EE F E + + L+ AA
Sbjct: 728 QLFAEALHLYLAGERYFPSPEDEEIYF--RPEQELRLVETGVQGRLWALLTREGAPAAEG 785

Query: 113 AP-----LRQTLEAVAD 124
A + T +AD
Sbjct: 786 AAQKGYSVNTTFVTIAD 802


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits         Score  E-value  Comments
NH44784_064231  BLACTAMASEA  32     0.002    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 32.5 bits (74), Expect = 0.002
Identities = 23/117 (19%), Positives = 45/117 (38%), Gaps = 4/117 (3%)

Query: 21 LLDETSGQVIASHAATARIEPASLTKIMTAYVVFGAIHNKELSPDQKVVISTRAWKVPPG 80
+D SG+ + + A R S K++ V + + ++K+ + +
Sbjct: 44 EMDLASGRTLTAWRADERFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQD--LVDY 101

Query: 81 SSKMFLEPGSRVSVDQLLRGLMIQSGNDAAIALAEAVSG--SVEAFVARMNDTAVQL 135
S ++V +L + S N AA L V G + AF+ ++ D +L
Sbjct: 102 SPVSEKHLADGMTVGELCAAAITMSDNSAANLLLATVGGPAGLTAFLRQIGDNVTRL 158


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
NH44784_064241  DNABINDINGHU  35     2e-05    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 35.1 bits (81), Expect = 2e-05
Identities = 29/97 (29%), Positives = 45/97 (46%), Gaps = 8/97 (8%)

Query: 56 NKTQLIAYIVEQSGVEAKSVKAVLASLETSVLSSVDKKGAGEFTLPGLFKVAVQKVPAKA 115
NK LIA + E + + K A + ++ ++V S + K + G F+V +A
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVR-----ERA 57

Query: 116 KRFGKDPFTGEERWFPAKPASVKVKVRPLKKLKDAAQ 152
R G++P TGEE AS + K LKDA +
Sbjct: 58 ARKGRNPQTGEE---IKIKASKVPAFKAGKALKDAVK 91


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
NH44784_064261  TCRTETA  52     2e-09    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 51.7 bits (124), Expect = 2e-09
Identities = 74/368 (20%), Positives = 132/368 (35%), Gaps = 30/368 (8%)

Query: 35 PLLHTISQQFSLSNATAGSIVMIAQLSYALGLILLVPLG----DLFERRGLIVLMTLLSS 90
P+L + + SN ++ L YAL P+ D F RR ++++ ++
Sbjct: 26 PVLPGLLRDLVHSNDVTAHYGILLAL-YALMQFACAPVLGALSDRFGRRPVLLVSLAGAA 84

Query: 91 GGLLLSAFAPNIQMLMLGTAVTGMLSVVAQVLVPFAATLAAPHERGKAVGTVMSGLLLGI 150
+ A AP + +L +G V G+ V + A + ER + G + + G+
Sbjct: 85 VDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGM 144

Query: 151 LLARTAAGVLADLGSWRTVYWVAAILMLAMSAALWKVLPRHKSHAAMSYPRLLGSIFRLF 210
+ G++ + AA L L +SH P L
Sbjct: 145 VAGPVLGGLMGGFSPHAPFFAAAA---LNGLNFLTGCFLLPESHKGERRP-LRREALNPL 200

Query: 211 AEEPLLRGRSLLGGLLFAAFSM-----LWTPLTFLLSGPDYGFNNTTIGL-FGLAGAAGA 264
A RG +++ L+ F M + L + + ++ TTIG+ G +
Sbjct: 201 ASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHS 260

Query: 265 YA-ANRFGRLADRGLGNQATRIGLLLLLASWGLMAFGQAS-----VIALLVGILIQDLAI 318
A A G +A R +A +G++ + L+AF ++ LL I A+
Sbjct: 261 LAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGMPAL 320

Query: 319 QGVHVTNTSSLYRTRPEARSRLTAGYMTSYFIGGATGSLVSSWLYSHY--GWPGVV-TAG 375
Q + E + +L + G L+ + +Y+ W G AG
Sbjct: 321 QAMLSRQVDE------ERQGQLQGSLAALTSLTSIVGPLLFTAIYAASITTWNGWAWIAG 374

Query: 376 AVLAAMTL 383
A L + L
Sbjct: 375 AALYLLCL 382



 
Contact Sachin Pundhir for Bugs/Comments.