PredictBias

identification of genomic and pathogenicity islands in prokaryotic genome
A) Input parameters
Genome: NC_007146.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
 
C) Potential GIs and PAIs in NC_007146
S.No  Start     End       Bias  Virulence  Insertion elements  Prediction
1     NTHI0046  NTHI0063  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
NTHI0046    0       13       -3.112764  rod shape-determining protein MreC
NTHI0047    -2      12       -3.488191  rod shape-determining protein MreD
NTHI0048    -2      14       -3.296485  hypothetical protein
NTHI0049    -2      11       -2.813126  exonuclease III
NTHI0050    -1      14       -2.602015  pseudouridine synthase
NTHI0051    0       15       -2.661778  hypothetical protein
NTHI0052    0       15       -3.090535  FtsH-interacting integral membrane protein
NTHI0053    2       18       -3.550228  *hypothetical protein
NTHI0054    4       18       -4.164331  PhnA-like protein
NTHI0055    5       20       -5.084395  keto-hydroxyglutarate-aldolase/keto-deoxy-
NTHI0056    7       23       -5.422482  glucuronate isomerase
NTHI0057    8       27       -6.337093  D-mannonate oxidoreductase
NTHI0060    7       25       -5.316951  TRAP-type C4-dicarboxylate transport system,
NTHI0061    7       23       -5.110326  TRAP-type C4-dicarboxylate transport system,
NTHI0062    4       16       -3.286287  TRAP-type C4-dicarboxylate transport system,
NTHI0063    2       13       -1.502356  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
NTHI0057    DHBDHDRGNASE  99     7e-27    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 99.4 bits (247), Expect = 7e-27
Identities = 72/269 (26%), Positives = 120/269 (44%), Gaps = 21/269 (7%)

Query: 13 LENKLIIITGAGGVLCSFLAKQLAYTKANIALLDLNFEAADKAAKEINQSGGKAKAYKTN 72
+E K+ ITGA + +A+ LA A+IA +D N E +K + A+A+ +
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 73 VLELENIKEVRDQIAIDFGTCDILINGAGGNNPKATTDNEFHQFDLNETTRTFFDLDKSG 132
V + I E+ +I + G DIL+N AG P H L
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLI-----HS------------LSDEE 108

Query: 133 IEFVFNLNYLGSLLPTQVFSKDMLGKQGANIINISSMNAFTPLTKIPAYSGAKAAISNFT 192
E F++N G ++ SK M+ ++ +I+ + S A P T + AY+ +KAA FT
Sbjct: 109 WEATFSVNSTGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFT 168

Query: 193 QWLAVYFSKVGIRCNAIAPGFLVSNQNRTLLFDTEGKP---TDRANKILTNTPMGRFGES 249
+ L + ++ IRCN ++PG ++ +L D G T P+ + +
Sbjct: 169 KCLGLELAEYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKP 228

Query: 250 EELLGALLFLIDENYSAFVNGVVLPVDGG 278
++ A+LFL+ + + L VDGG
Sbjct: 229 SDIADAVLFLVSGQ-AGHITMHNLCVDGG 256

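Each alignment block in this report opens with a `Score = ... Expect = ...` line followed by an `Identities = ...` line. The following is a minimal sketch of how those two lines could be read programmatically. The function names `parse_expect` and `parse_hsp_stats` are hypothetical helpers introduced here for illustration and are not part of PredictBias or BLAST. The sketch assumes the line layout shown in this report, including the `e-142` shorthand that appears further down for `1e-142`.

```python
import re

# Patterns for the two statistics lines that open each alignment block.
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)
IDENT_RE = re.compile(
    r"Identities\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)"
    r"(?:,\s*Positives\s*=\s*(\d+)/(\d+)\s*\((\d+)%\))?"
    r"(?:,\s*Gaps\s*=\s*(\d+)/(\d+)\s*\((\d+)%\))?"
)


def parse_expect(text):
    """Convert an Expect string to float; 'e-142' is read as '1e-142'."""
    if text.lower().startswith("e"):
        text = "1" + text
    return float(text)


def parse_hsp_stats(score_line, ident_line):
    """Return the numeric fields from one Score/Identities line pair."""
    s = SCORE_RE.search(score_line)
    i = IDENT_RE.search(ident_line)
    if s is None or i is None:
        raise ValueError("not a Score/Identities line pair")
    return {
        "bit_score": float(s.group(1)),
        "raw_score": int(s.group(2)),
        "expect": parse_expect(s.group(3)),
        "identities": int(i.group(1)),
        "align_length": int(i.group(2)),
        # The Gaps field is omitted by the report when there are no gaps.
        "gaps": int(i.group(7)) if i.group(7) else 0,
    }
```

Parsed this way, the hits can be filtered against the E-value (RPSBlast) threshold listed under the input parameters, for example by keeping only records whose `expect` is at or below 0.05.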

ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
NTHI0061    PF06580       27     0.039    Sensor histidine kinase
>PF06580#Sensor histidine kinase

Length = 349

Score = 27.1 bits (60), Expect = 0.039
Identities = 20/125 (16%), Positives = 37/125 (29%), Gaps = 8/125 (6%)

Query: 39 SVLRYFFDSGIAFSEEFSRICFVYMIFFGIILVAKDKAHLTVEIIISALPEQYRKIVLIV 98
L F + + S + + F I +++ L +I+L V
Sbjct: 23 YTLTGFGFASLYGSPKLHSMIFNIAISLMGLVLTHAYRSFIKRQGWLKLNMG--QIILRV 80

Query: 99 ANICVLIAMIFIAYGALQLMSLTYTQQMPATGISSSFL------YLAAVISAVSYFFIVM 152
CV+I M++ L + P L + + ++ YF
Sbjct: 81 LPACVVIGMVWFVANTSIWRLLAFINTKPVAFTLPLALSIIFNVVVVTFMWSLLYFGWHF 140

Query: 153 FSMIK 157
F K
Sbjct: 141 FKNYK 145


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
NTHI0063    TYPE3IMSPROT  31     0.006    Type III secretion system inner membrane S protein ...
>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 31.3 bits (71), Expect = 0.006
Identities = 28/105 (26%), Positives = 45/105 (42%), Gaps = 22/105 (20%)

Query: 190 AIPILVDILEQRLEYAKSLGIEH--IVNPHKEDD----IK-RIK----EITSGRMAEVVM 238
+ + D + +Y K L + I +KE + IK + + EI S M E V
Sbjct: 196 VVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQFHQEIQSRNMRENVK 255

Query: 239 EASGANISIKNTLHYTSFAGRIALTGWPKTETPLPTNLITFKELN 283
+S + + N H I + + + ETPLP L+TFK +
Sbjct: 256 RSS---VVVANPTHIA-----IGI-LYKRGETPLP--LVTFKYTD 289


2     NTHI0100  NTHI0118  Y     N          Y                   Genomic Island
LocusTag    DNBias  CDNBias  %GCBias    Product
NTHI0100    2       11       -1.186184  cystathionine gamma-synthase
NTHI0101    4       14       -1.253870  ***chromosome partitioning ATPase
NTHI0102    4       15       -1.013300  replicative DNA helicase
NTHI0103    4       15       -0.790828  hypothetical protein
NTHI0104    4       17       -0.383635  hypothetical protein
NTHI0106    5       22       -0.920403  hypothetical protein
NTHI0107    3       21       -0.232912  hypothetical protein
NTHI0108    3       19       -0.387455  hypothetical protein
NTHI0109    2       19       -0.459224  single-stranded DNA-binding protein
NTHI0110    3       20       -0.876285  lipoprotein
NTHI0111    3       20       -1.256076  hypothetical protein
NTHI0112    3       20       -1.410716  DNA topoisomerase III
NTHI0113    5       20       -2.463126  hypothetical protein
NTHI0114    5       22       -2.868185  hypothetical protein
NTHI0115    3       21       -2.039667  hypothetical protein
NTHI0116    3       21       -1.634953  hypothetical protein
NTHI0117    2       21       -0.560269  DNA repair radC-like protein
NTHI0118    2       19       0.095837   hypothetical protein
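In the island summary rows of this report, every row with Bias = Y and Virulence = Y is labelled "Pathogenicity Island (biased-composition)", and every row with Bias = Y and Virulence = N is labelled "Genomic Island", whatever the Insertion elements flag says. The sketch below encodes only that observed pattern. `classify_island` is a hypothetical name, and the rule is inferred from this output, not taken from the PredictBias implementation.

```python
def classify_island(bias, virulence):
    """Return the Prediction label observed in this report for a flag pair.

    Only the (Bias, Virulence) combinations that occur in this output are
    covered; any other combination returns None rather than guessing.
    """
    if bias == "Y" and virulence == "Y":
        return "Pathogenicity Island (biased-composition)"
    if bias == "Y" and virulence == "N":
        return "Genomic Island"
    return None


# Flags for islands 1 and 2 as listed in the summary rows above.
for number, bias, virulence in [(1, "Y", "Y"), (2, "Y", "N")]:
    print(number, classify_island(bias, virulence))
```

Returning `None` for unseen combinations keeps the sketch from asserting behaviour that this report does not show.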
3     NTHI0132  NTHI0162  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
NTHI0132    3       16       1.827971   hypothetical protein
NTHI0133    4       17       1.536737   lipoprotein
NTHI0134    4       28       5.513262   hypothetical protein
NTHI0135    4       34       7.205129   hypothetical protein
NTHI0136    5       38       8.235839   hypothetical protein
NTHI0137    5       40       9.086780   hypothetical protein
NTHI0139    6       47       10.621493  hypothetical protein
NTHI0141    5       45       10.320679  transposon Tn3 transposase
NTHI0140    4       26       3.893164   hypothetical protein
NTHI0142    3       24       3.208709   transposon Tn3 resolvase
NTHI2055    3       22       1.864549   beta-lactamase TEM
NTHI0143    3       18       -0.063059  hypothetical protein
NTHI0144    4       18       0.298551   beta-lactamase TEM
NTHI0145    4       19       0.418575   hypothetical protein
NTHI0146    4       22       -0.185943  hypothetical protein
NTHI0147    2       18       -0.943356  hypothetical protein
NTHI0148    3       18       -1.279885  hypothetical protein
NTHI0149    4       18       -1.660679  hypothetical protein
NTHI0150    3       18       -0.563183  hypothetical protein
NTHI0151    0       20       0.696095   hypothetical protein
NTHI0152    -1      21       0.943033   hypothetical protein
NTHI0153    -2      26       0.850005   hypothetical protein
NTHI0154    -1      29       1.013096   antirestriction protein
NTHI0155    -1      30       1.203758   type I restriction enzyme M subunit
NTHI0156    3       28       -1.791828  hypothetical protein
NTHI0157    8       27       -5.290398  hypothetical protein
NTHI0158    11      25       -5.394723  hypothetical protein
NTHI0159    12      25       -6.363763  hypothetical protein
NTHI0160    8       21       -4.976394  resolvase/integrase-like protein
NTHI0161    6       19       -4.304637  hypothetical protein
NTHI0162    3       14       -2.220087  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
NTHI2055    BLACTAMASEA   396    e-142    Beta-lactamase class A signature.
>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 396 bits (1018), Expect = e-142
Identities = 186/284 (65%), Positives = 228/284 (80%)

Query: 3 IQHFRVALIPFFAAFCLPVFAHPETLVKVKDAEDQLGARVGYIELDLNSGKILESFRPEE 62
+++ R+ +I A L V A P+ L ++K +E QL RVG IE+DL SG+ L ++R +E
Sbjct: 1 MRYIRLCIISLLATLPLAVHASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRADE 60

Query: 63 RFPMMSTFKVLLCGAVLSRVDAGQEQLGRRIHYSQNDLVEYSPVTEKHLTDGMTVRELCS 122
RFPMMSTFKV+LCGAVL+RVDAG EQL R+IHY Q DLV+YSPV+EKHL DGMTV ELC+
Sbjct: 61 RFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGELCA 120

Query: 123 AAITMSDNTAANLLLTTIGGPKELTAFLHNMGDHVTRLDRWEPELNEAIPNDERDTTMPA 182
AAITMSDN+AANLLL T+GGP LTAFL +GD+VTRLDRWE ELNEA+P D RDTT PA
Sbjct: 121 AAITMSDNSAANLLLATVGGPAGLTAFLRQIGDNVTRLDRWETELNEALPGDARDTTTPA 180

Query: 183 AMATTLRKLLTGELLTLASRQQLIDWMEADKVAGPLLRSALPAGWFIADKSGAGERGSRG 242
+MA TLRKLLT + L+ S++QL+ WM D+VAGPL+RS LPAGWFIADK+GAGERG+RG
Sbjct: 181 SMAATLRKLLTSQRLSARSQRQLLQWMVDDRVAGPLIRSVLPAGWFIADKTGAGERGARG 240

Query: 243 IIAALGPDGKPSRIVVIYTTGSQATMDERNRQIAEIGASLIKHW 286
I+A LGP+ K RIVVIY + A+M ERN+QIA IGA+LI+HW
Sbjct: 241 IVALLGPNNKAERIVVIYLRDTPASMAERNQQIAGIGAALIEHW 284


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
NTHI0154    TACYTOLYSIN   28     0.024    Bacterial thiol-activated pore-forming cytolysin sig...
>TACYTOLYSIN#Bacterial thiol-activated pore-forming cytolysin signature.

Length = 574

Score = 28.0 bits (62), Expect = 0.024
Identities = 12/35 (34%), Positives = 16/35 (45%), Gaps = 5/35 (14%)

Query: 142 TYKAGAITEMGRKCYV-----YWDFVAYDDDGNEI 171
Y +G I + YV WD + YDD G E+
Sbjct: 462 EYTSGKINLSHQGAYVAQYEILWDEINYDDKGKEV 496


4     NTHI0247  NTHI0284  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
NTHI0247    2       16       -0.200596  3-ketoacyl-ACP reductase
NTHI0248    0       16       -0.623446  ACP S-malonyltransferase
NTHI0249    1       13       -0.093305  3-oxoacyl-ACP synthase
NTHI0250    1       15       -0.470975  50S ribosomal protein L32
NTHI0251    1       16       -0.483455  hypothetical protein
NTHI0252    2       18       -0.765922  phosphatidylserine decarboxylase
NTHI0254    2       23       -0.139533  glutathione reductase
NTHI0255    2       20       -0.119293  lipoprotein
NTHI0256    0       20       -1.119327  Na(+)-translocating NADH-quinone reductase
NTHI0257    -1      17       -0.814711  Na(+)-translocating NADH-quinone reductase
NTHI0258    -1      14       -0.420006  Na(+)-translocating NADH-quinone reductase
NTHI0259    -1      9        0.351138   Na(+)-translocating NADH-quinone reductase
NTHI0260    -2      10       -0.170265  Na(+)-translocating NADH-quinone reductase
NTHI0261    -1      10       -0.407095  Na(+)-translocating NADH-quinone reductase
NTHI0262    -1      12       -0.228786  thiamine biosynthesis lipoprotein ApbE
NTHI0263    2       17       0.155174   hypothetical protein
NTHI0264    3       18       0.045910   tRNA-specific 2-thiouridylase MnmA
NTHI0266    3       18       -0.113040  hypothetical protein
NTHI0267    4       16       0.492075   23S rRNA pseudouridine synthase D
NTHI0268    3       13       1.273053   hypothetical protein
NTHI0269    2       15       1.367666   hypothetical protein
NTHI0270    1       12       1.190019   pyruvate formate lyase-activating enzyme 1
NTHI0273    0       13       1.112151   formate acetyltransferase
NTHI0274    0       12       0.355700   formate transporter
NTHI0275    2       12       -0.034723  N-acetyl-D-glucosamine kinase
NTHI0278    2       15       -2.103888  Na+/alanine symporter
NTHI0279    1       15       -2.000266  esterase
NTHI0280    -1      14       -2.574345  HTH-type transcriptional regulator
NTHI0282    -2      14       -2.712518  Sec-independent protein translocase protein
NTHI0283    -2      14       -2.773335  sec-independent translocase
NTHI0284    -2      12       -4.278091  Sec-independent protein translocase protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
NTHI0254    RTXTOXIND     34     7e-04    Gram-negative bacterial RTX secretion protein D signat...
>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 34.4 bits (79), Expect = 7e-04
Identities = 12/44 (27%), Positives = 21/44 (47%), Gaps = 4/44 (9%)

Query: 16 PAQVIHSGNAVNQVAILGEEYVGMRPSMKVREGDVVKKGQVLFE 59
++ HSG + I + + V+EG+ V+KG VL +
Sbjct: 87 NGKLTHSGRSKEIKPIEN----SIVKEIIVKEGESVRKGDVLLK 126


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
NTHI0274    PF04335       29     0.031    VirB8 type IV secretion protein
>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 29.0 bits (65), Expect = 0.031
Identities = 10/38 (26%), Positives = 17/38 (44%), Gaps = 1/38 (2%)

Query: 405 FVYFGAVRSGNVVWNFADTVMAVMAIINLIAILMLSPI 442
A RS + W A V +A ++A+ L+P+
Sbjct: 23 DKLAAAERSKKLAWVVA-GVAGALATAGVVAVAALTPL 59


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
NTHI0280    TATBPROTEIN   170    1e-56    Bacterial sec-independent translocation TatB protein...
>TATBPROTEIN#Bacterial sec-independent translocation TatB protein signature.

Length = 171

Score = 170 bits (431), Expect = 1e-56
Identities = 71/171 (41%), Positives = 100/171 (58%), Gaps = 4/171 (2%)

Query: 1 MFDIGFSELILLMVLGLVVLGPKRLPIAIRTVMDWVKTIRGLAANVQNELKQELKLQELQ 60
MFDIGFSEL+L+ ++GLVVLGP+RLP+A++TV W++ +R LA VQNEL QELKLQE Q
Sbjct: 1 MFDIGFSELLLVFIIGLVVLGPQRLPVAVKTVAGWIRALRSLATTVQNELTQELKLQEFQ 60

Query: 61 DSIKKAESLNLQALSPELSKTVEELKAQADKMK----AELEDKAAQAGTTVEDQIKEIKN 116
DS+KK E +L L+PEL +++EL+ A+ MK A +KA+ T+ + + +
Sbjct: 61 DSLKKVEKASLTNLTPELKASMDELRQAAESMKRSYVANDPEKASDEAHTIHNPVVKDNE 120

Query: 117 AAENAEKPQNAISVEEAAETLSEAEKTPTDLTALETHEKVELNTHLSSYYP 167
AA P A + + E E P A + K + SS P
Sbjct: 121 AAHEGVTPAAAQTQASSPEQKPETTPEPVVKPAADAEPKTAAPSPSSSDKP 171


5     NTHI0415  NTHI0422  Y     N          Y                   Genomic Island
LocusTag    DNBias  CDNBias  %GCBias    Product
NTHI0415    1       21       -4.279080  apolipoprotein N-acyltransferase
NTHI0416    0       23       -5.426615  16S ribosomal RNA methyltransferase RsmE
NTHI0419    2       24       -6.364927  hypothetical protein
NTHI0420    5       26       -7.861408  Holliday junction resolvase-like protein
NTHI0421    0       19       -5.455551  prophage CP4-57-like integrase
NTHI0422    0       14       -3.176562  hypothetical protein
6     NTHI0469  NTHI0476  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
NTHI0469    2       17       0.300099   cytochrome C-type protein NapC
NTHI0471    2       19       0.762169   adenylate kinase
NTHI0472    3       21       0.337952   integral membrane signal transducer protein
NTHI0473    4       22       2.257948   UDP-glucose 4-epimerase
NTHI0474    2       15       0.828769   CMP-Neu5Ac--lipooligosaccharide alpha 2-3
NTHI0475    2       10       -0.456539  ABC-type nitrate/sulfonate/bicarbonate transport
NTHI0476    2       11       -0.827740  ABC-type nitrate/sulfonate/bicarbonate transport
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
NTHI0471    NUCEPIMERASE  176    1e-54    Nucleotide sugar epimerase signature.
>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 176 bits (447), Expect = 1e-54
Identities = 82/350 (23%), Positives = 152/350 (43%), Gaps = 37/350 (10%)

Query: 1 MAILVTGGAGYIGSHTVVELLNAGKEVVVLDNLCNSSPKSLE--RVKQITGKEAKFYEGD 58
M LVTG AG+IG H LL AG +VV +DNL + SL+ R++ + +F++ D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 59 ILDRALLQKIFAENEINSVIHFAGLKAVGESVQKPTEYYMNNVAGTLVLIQEMKKAGVWN 118
+ DR + +FA V AV S++ P Y +N+ G L +++ + + +
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 119 FVFSSSATVYGDPKIIPITEDCEVGGTTNPYGTSKYMVEQILCDTAKAEPKFSMTILRYF 178
+++SS++VYG + +P + D V + Y +K E ++ T T LR+F
Sbjct: 121 LLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANE-LMAHTYSHLYGLPATGLRFF 179

Query: 179 NPVGAHESGLIGEDPNGIPNNLLP-YISQVAIGKLAQLSVFGSDYDTHDGTGVRDYIHVV 237
G P G P+ L + + GK + V+ G RD+ ++
Sbjct: 180 TVYG----------PWGRPDMALFKFTKAMLEGK--SIDVYN------YGKMKRDFTYID 221

Query: 238 DLAVGHLKALQ---------------RHENDAGLHIYNLGTGHGYSVLDMVKAFEKANNI 282
D+A ++ + A +YN+G ++D ++A E A I
Sbjct: 222 DIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGI 281

Query: 283 TIAYKLVERRAGDIATCYSDPSLAAKELGWVAERGLEKMMQDTWNWQKNN 332
++ + GD+ +D + +G+ E ++ +++ NW ++
Sbjct: 282 EAKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDF 331


7     NTHI0496  NTHI0507  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
NTHI0496    2       21       -1.754962  hypothetical protein
NTHI0497    2       19       -0.154026  co-chaperone HscB
NTHI0498    2       19       0.946832   hypothetical protein
NTHI0499    3       20       0.974896   scaffold protein
NTHI0500    3       18       1.223600   cysteine desulfurase
NTHI0501    3       16       2.070613   hypothetical protein
NTHI0502    2       15       0.895256   tRNA/rRNA methyltransferase
NTHI0503    4       14       0.084513   **outer membrane protein P6
NTHI0504    -1      14       -1.246367  translocation protein TolB
NTHI0505    -2      10       -0.734305  cell envelope integrity inner membrane protein
NTHI0506    -2      12       -1.053622  colicin uptake protein TolR
NTHI0507    -1      12       -3.109919  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
NTHI0501    OMPADOMAIN    107    6e-31    OMPA domain signature.
>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 107 bits (269), Expect = 6e-31
Identities = 37/117 (31%), Positives = 54/117 (46%), Gaps = 8/117 (6%)

Query: 42 ADLQQRYNT----VYFGFDKYDITGEYVQILDAHAAYLNAT--PAAKVLVEGNTDERGTP 95
++Q ++ T V F F+K + E LD + L+ V+V G TD G+
Sbjct: 208 PEVQTKHFTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSD 267

Query: 96 EYNIALGQRRADAVKGYLAGKGVDAGKLGTVSYGEEKPAVLGHDEAAYSKNRRAVLA 152
YN L +RRA +V YL KG+ A K+ GE P V G+ K R A++
Sbjct: 268 AYNQGLSERRAQSVVDYLISKGIPADKISARGMGESNP-VTGN-TCDNVKQRAALID 322


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
NTHI0503    IGASERPTASE   65     4e-13    IgA-specific serine endopeptidase (S6) signature.
>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 64.7 bits (157), Expect = 4e-13
Identities = 31/190 (16%), Positives = 65/190 (34%), Gaps = 3/190 (1%)

Query: 77 QKRPEPVVEEKPPEPNQEEIKHQQEVQRQEEIKHQQEVQRQEELKRQQEQQRQQEIKKQQ 136
+KR + V PN + EEI E + + + +
Sbjct: 986 EKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSK 1045

Query: 137 EQARQEALEKQKQAE-EARAKQAAEAAKLKADAEAKRLAAAAKQAEEEAKAKAAEIAAQK 195
++++ +Q E A+ ++ A+ AK A + A+ E + + E +
Sbjct: 1046 QESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQT-NEVAQSGSETKETQTTE-TKET 1103

Query: 196 AKQEAETKAKLEAEAKAKAAAEAKAKAEAEAKAKAAAEAKAKAEAEAKAKAAAEAKAKAD 255
A E E KAK+E E + + + +++ A E +++ +
Sbjct: 1104 ATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTN 1163

Query: 256 AEAKAKAAAE 265
A + A+
Sbjct: 1164 TTADTEQPAK 1173



Score = 52.4 bits (125), Expect = 2e-09
Identities = 30/239 (12%), Positives = 70/239 (29%), Gaps = 10/239 (4%)

Query: 43 EGEGDVIGAVIVDTGTAAQEWGRIQQQKKGQADKQKRPEPVVEEKPPEPNQEEIKHQQEV 102
E + + T Q + + PV P P++ +
Sbjct: 986 EKRNQTVDTTNITTPNNIQA-DVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENS 1044

Query: 103 QRQEEIKHQQEVQRQEELKRQQEQQRQQEIKKQQEQARQEALEKQKQAEEARAKQAAEAA 162
+++ + + E E + +E ++ + + E + + +E + + E A
Sbjct: 1045 KQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETA 1104

Query: 163 KLKADAEAKRLAAAAKQAEEEAKAKAAEIAAQ-KAKQEAETKAKLEAEAKAKAAAEAKAK 221
++ + +AK E E + ++ +Q KQE + +AE + K
Sbjct: 1105 TVEKEEKAK--------VETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIK 1156

Query: 222 AEAEAKAKAAAEAKAKAEAEAKAKAAAEAKAKADAEAKAKAAAEAKAAAEAKRKADQAS 280
A + E + + + E A + + S
Sbjct: 1157 EPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSES 1215



Score = 52.0 bits (124), Expect = 4e-09
Identities = 29/183 (15%), Positives = 56/183 (30%), Gaps = 9/183 (4%)

Query: 110 HQQEVQRQEELKRQQEQQRQQEIKKQQEQARQEALEKQKQAEEARAKQAAEAAKLKA-DA 168
+ EV+++ + I+ E AR +A A +
Sbjct: 981 YNPEVEKRNQTVDTTNITTPNNIQADVPS------VPSNNEEIARVDEAPVPPPAPATPS 1034

Query: 169 EAKRLAAAAKQAEEEAKAKAAEIAAQKAKQ--EAETKAKLEAEAKAKAAAEAKAKAEAEA 226
E A + E + K + A + Q E +AK +A + A++ +E +
Sbjct: 1035 ETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKE 1094

Query: 227 KAKAAAEAKAKAEAEAKAKAAAEAKAKADAEAKAKAAAEAKAAAEAKRKADQASLDDFFN 286
+ A E E KAK E + + + ++ + D N
Sbjct: 1095 TQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVN 1154

Query: 287 GGD 289
+
Sbjct: 1155 IKE 1157



Score = 49.7 bits (118), Expect = 2e-08
Identities = 35/234 (14%), Positives = 66/234 (28%), Gaps = 17/234 (7%)

Query: 56 TGTAAQEWGRIQQQKKGQADKQKRPEPVVEEKPPE--PNQEEIKHQQEVQRQ-EEIKHQQ 112
T T A+ + + + E E N + EV + E K Q
Sbjct: 1037 TETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQ 1096

Query: 113 EVQRQEELKRQQEQQRQQEIKKQQEQARQEALEK---------QKQAEEARAKQAAEAAK 163
+ +E ++E++ + E +K QE + + Q QAE AR K
Sbjct: 1097 TTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIK 1156

Query: 164 LKADAEAKRLAAAAKQAEEEAKAKAAEIAAQ----KAKQEAETKAKLEAEAKAKAAAEAK 219
+ ++ A + A+E + + E
Sbjct: 1157 -EPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSES 1215

Query: 220 AKAEAEAKAKAAAEAKAKAEAEAKAKAAAEAKAKADAEAKAKAAAEAKAAAEAK 273
+ ++ E + A D + A + A A+A+
Sbjct: 1216 SNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQ 1269



Score = 32.3 bits (73), Expect = 0.004
Identities = 25/188 (13%), Positives = 55/188 (29%), Gaps = 11/188 (5%)

Query: 57 GTAAQEWGRIQQQKKGQAD---KQKRPEPVVEEKPPEPNQEEIKHQQEVQRQEEIKHQQE 113
A E + Q+ K + KQ++ E V + P + + +E Q Q E
Sbjct: 1110 EKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTE 1169

Query: 114 VQRQE-------ELKRQQEQQRQQEIKKQQEQARQEALEKQKQAEEARAKQAAEAAKLKA 166
+E + + + E + +E + + +++
Sbjct: 1170 QPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRS 1229

Query: 167 DAEAKRLAA-AAKQAEEEAKAKAAEIAAQKAKQEAETKAKLEAEAKAKAAAEAKAKAEAE 225
A ++ A +A KA+ A KA ++ ++ E
Sbjct: 1230 VPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAVSQHISQLEMN 1289

Query: 226 AKAKAAAE 233
+ +
Sbjct: 1290 NEGQYNVW 1297


8     NTHI0531  NTHI0541  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
NTHI0531    2       9        0.213848   acetyl-CoA carboxylase carboxyltransferase
NTHI0532    3       10       0.461785   high-affinity zinc uptake system membrane
NTHI0534    2       10       1.713085   ABC transporter ATP-binding protein
NTHI0535    1       12       2.881985   metalloprotease
NTHI0536    1       13       3.247628   transcriptional regulatory protein TyrR
NTHI0538    0       15       4.359848   RNA-binding protein Hfq
NTHI0539    1       25       5.979599   23S rRNA pseudouridylate synthase C
NTHI0540    1       20       4.709934   ribonuclease E
NTHI0541    1       16       3.476211   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
NTHI0534    HTHFIS        280    9e-94    FIS bacterial regulatory protein HTH signature.
>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 280 bits (719), Expect = 9e-94
Identities = 99/338 (29%), Positives = 164/338 (48%), Gaps = 37/338 (10%)

Query: 15 FIVQSEAMKSAVENAKRFAMFDVPLLIQGETGTGKDLLAKACHYQSLRRDKKFIAVNCAG 74
+ +S AM+ R D+ L+I GE+GTGK+L+A+A H RR+ F+A+N A
Sbjct: 139 LVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAA 198

Query: 75 LPDEDAESEMFGRKVG-----DSETIGFFEYANEGTVLLDGIAELSLGLQAKLLRFLTDG 129
+P + ESE+FG + G + + G FE A GT+ LD I ++ + Q +LLR L G
Sbjct: 199 IPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQG 258

Query: 130 SFRRVGEEKEHYANVRVICTSQVPLHLLVEQGKVRADLFHRLNVLTLNVPPLRERMTDIE 189
+ VG ++VR++ + L + QG R DL++RLNV+ L +PPLR+R DI
Sbjct: 259 EYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAEDIP 318

Query: 190 PLAQGFLQEISEELKIAKPTFDKDFLLYLQKYDWKGNVRELYNTLYRACSLVQDNHLTIE 249
L + F+Q+ +E K FD++ L ++ + W GNVREL N + R +L + +T E
Sbjct: 319 DLVRHFVQQAEKEGLDVK-RFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVITRE 377

Query: 250 -------------------------SLNLAPPQSAVISLDEFGN-----KTLDEIIGSYE 279
S++ A ++ FG+ D ++ E
Sbjct: 378 IIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLAEME 437

Query: 280 AQVLKLFYAEYPSTR-KLAQRLGVSHTAIANKLKQYGI 316
++ + K A LG++ + K+++ G+
Sbjct: 438 YPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGV 475


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
NTHI0536    RTXTOXIND     28     0.049    Gram-negative bacterial RTX secretion protein D signat...
>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 28.3 bits (63), Expect = 0.049
Identities = 20/128 (15%), Positives = 46/128 (35%), Gaps = 28/128 (21%)

Query: 136 GVIEALRALRPEARFLELVHRLDRDTSGILLIAKKRSALRNLHEQLRVKTVQKDYLALVR 195
I L E +++E V+ L S + + S + + + + V + + +
Sbjct: 247 QAIAKHAVLEQENKYVEAVNELRVYKSQL---EQIESEILSA--KEEYQLVTQLFKNEIL 301

Query: 196 GQWQSHIKVIQAPLLKNELSSGERIVRVSEQGKPSETRFSIEERYINATLVKASPVTGRT 255
+ + Q L+ + + E+ + S R +PV+ +
Sbjct: 302 DKLR------QTTDNIGLLT--LELAKNEERQQASVIR---------------APVSVKV 338

Query: 256 HQIRVHTQ 263
Q++VHT+
Sbjct: 339 QQLKVHTE 346


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
NTHI0539    OUTRMMBRANEA  35     7e-06    Outer membrane protein A signature.
>OUTRMMBRANEA#Outer membrane protein A signature.

Length = 346

Score = 34.9 bits (80), Expect = 7e-06
Identities = 23/72 (31%), Positives = 32/72 (44%), Gaps = 5/72 (6%)

Query: 4 GGDVKADQETSGSRSIKRIGFG--FIGGIGYDITPNITLDLDYRY-NDWGRLE--NVRFK 58
G +AD +++ G F GG+ Y ITP I L+Y++ N+ G R
Sbjct: 120 GMVWRADTKSNVYGKNHDTGVSPVFAGGVEYAITPEIATRLEYQWTNNIGDAHTIGTRPD 179

Query: 59 THEASFGVRYRF 70
S GV YRF
Sbjct: 180 NGMLSLGVSYRF 191


9     NTHI0605  NTHI0616  Y     N          N                   Genomic Island
LocusTag    DNBias  CDNBias  %GCBias    Product
NTHI0605    2       16       0.085518   imidazole glycerol phosphate synthase subunit
NTHI0606    2       17       -0.769715  1-(5-phosphoribosyl)-5-[(5-
NTHI0606_1  2       20       -0.142321  imidazole glycerol phosphate synthase subunit
NTHI0607    2       19       0.391224   bifunctional phosphoribosyl-AMP
NTHI0608    3       21       0.195790   hypothetical protein
NTHI0609    3       24       0.271758   tyrosine-specific transport protein 1
NTHI0610    0       18       -0.591170  ATP synthase F0F1 subunit epsilon
NTHI0611    1       19       -1.320402  ATP synthase F0F1 subunit beta
NTHI0612    1       16       -3.578080  ATP synthase F0F1 subunit gamma
NTHI0613    1       16       -4.101384  ATP synthase F0F1 subunit alpha
NTHI0614    -1      14       -3.430318  ATP synthase F0F1 subunit delta
NTHI0615    -1      18       -3.443790  ATP synthase F0F1 subunit B
NTHI0616    -1      20       -3.384946  ATP synthase F0F1 subunit C
10    NTHI0636  NTHI0641  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
NTHI0636    2       21       1.379360   ribose operon repressor
NTHI0637    3       24       1.269671   hypothetical protein
NTHI0638    4       27       1.376747   ribonuclease activity regulator protein RraA
NTHI0639    4       29       1.162494   1,4-dihydroxy-2-naphthoate
NTHI0640    4       32       1.675373   hypothetical protein
NTHI0641    2       26       1.016450   potassium-tellurite ethidium and proflavin
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
NTHI0641    GPOSANCHOR    35     0.002    Gram-positive coccus surface protein anchor signature.
>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 35.0 bits (80), Expect = 0.002
Identities = 21/102 (20%), Positives = 31/102 (30%), Gaps = 4/102 (3%)

Query: 937 RDGVEKDKRALEIEEMQLREAKKDLTEELEILEAGLFARVRNLLISSGADAAQLDKVDRT 996
+ +EK K L E L A A + L GA +
Sbjct: 192 QAELEKALEGAMNFSTADSAKIKTLEAEKAALAARK-ADLEKAL--EGAMNFSTADSAKI 248

Query: 997 KWLEQTIAD-EEKQNQLEQLAEQYEELRKEFEHKLEVKRKKI 1037
K LE A E +Q +LE+ E K++ +
Sbjct: 249 KTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEK 290


11    NTHI0660  NTHI0670  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
NTHI0660    0       20       4.024727   DNA primase
NTHI0661    0       24       5.432642   RNA polymerase sigma factor RpoD
NTHI0662    1       21       5.499506   aspartate ammonia-lyase
NTHI0663    1       21       5.278040   urease accessory protein UreH
NTHI0664    0       20       3.890244   urease accessory protein
NTHI0665    0       24       3.892229   urease accessory protein UreF
NTHI0666    1       31       3.002161   urease accessory protein UreE
NTHI0667    1       27       2.422114   urease subunit alpha
NTHI0668    4       29       2.424884   urease subunit beta
NTHI0669    3       27       2.328332   urease subunit gamma
NTHI0670    2       22       2.018470   co-chaperonin GroES
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
NTHI0665    UREASE        1045   0.0      Urea amidohydrolase (urease) protein signature.
>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 1045 bits (2703), Expect = 0.0
Identities = 362/575 (62%), Positives = 436/575 (75%), Gaps = 8/575 (1%)

Query: 1 MALTISRAQYVATYGPTVGDKVRLGDTNLWATIEQDLLTKGDECKFGGGKSVRDGMAQSG 60
M+ +SRA Y +GPTVGDKVRL DT L+ +E+D T G+E KFGGGK +RDGM QS
Sbjct: 1 MSYRMSRAAYANMFGPTVGDKVRLADTELFIEVEKDFTTHGEEVKFGGGKVIRDGMGQS- 59

Query: 61 TATRDNPNVLDFVITNVMIIDAKLGIIKADIGIRDGRIVGIGQAGNPDTMDNVTPNMIIG 120
TR+ +D VITN +I+D GI+KADIG++DGRI IG+AGNPD V +I+G
Sbjct: 60 QVTREG-GAVDTVITNALILDH-WGIVKADIGLKDGRIAAIGKAGNPDMQPGV--TIIVG 115

Query: 121 ASTEVHNGAHLIATAGGIDTHIHFICPQQAQHAIESGVTTLIGGGTGPADGTHATTCTPG 180
TEV G I TAGG+D+HIHFICPQQ + A+ SG+T ++GGGTGPA GT ATTCTPG
Sbjct: 116 PGTEVIAGEGKIVTAGGMDSHIHFICPQQIEEALMSGLTCMLGGGTGPAHGTLATTCTPG 175

Query: 181 AWYMERMFQAAEALPVNVGFFGKGNCSTLDPLREQIEAGALGLKIHEDWGATPAVIDSAL 240
W++ RM +AA+A P+N+ F GKGN S L E + GA LK+HEDWG TPA ID L
Sbjct: 176 PWHIARMIEAADAFPMNLAFAGKGNASLPGALVEMVLGGATSLKLHEDWGTTPAAIDCCL 235

Query: 241 KVADEMDIQVAIHTDTLNESGFLEDTMKAIDGRVIHTFHTEGAGGGHAPDIIKAAMYSNV 300
VADE D+QV IHTDTLNESGF+EDT+ AI GR IH +HTEGAGGGHAPDII+ NV
Sbjct: 236 SVADEYDVQVMIHTDTLNESGFVEDTIAAIKGRTIHAYHTEGAGGGHAPDIIRICGQPNV 295

Query: 301 LPASTNPTRPFTKNTIDEHLDMLMVCHHLDKRVPEDVAFADSRIRPETIAAEDILHDMGV 360
+P+STNPTRP+T NT+ EHLDMLMVCHHL +PED+AFA+SRIR ETIAAEDILHD+G
Sbjct: 296 IPSSTNPTRPYTVNTLAEHLDMLMVCHHLSPTIPEDIAFAESRIRKETIAAEDILHDIGA 355

Query: 361 FSIMSSDSQAMGRIGEVVIRTWQTADKMKMQRGELGNE--GNDNFRIKRYIAKYTINPAI 418
FSI+SSDSQAMGR+GEV IRTWQTADKMK QRG L E NDNFR+KRYIAKYTINPAI
Sbjct: 356 FSIISSDSQAMGRVGEVAIRTWQTADKMKRQRGRLKEETGDNDNFRVKRYIAKYTINPAI 415

Query: 419 AHGIAEHIGSLEVGKIADIVLWKPMFFGVKPEVVIKKGFISYAKMGDPNASIPTPQPVFY 478
AHG++ IGSLEVGK AD+VLW P FFGVKP++V+ G I+ A MGDPNASIPTPQPV Y
Sbjct: 416 AHGLSHEIGSLEVGKRADLVLWNPAFFGVKPDMVLLGGTIAAAPMGDPNASIPTPQPVHY 475

Query: 479 RPMYGAQGLATAQTAVFFVSQAAEKADIREKFGLHKETIAVKGCR-NVGKKDLVHNDVTP 537
RPM+GA G + ++V FVSQA+ A + + G+ KE +AV+ R +GK ++HN +TP
Sbjct: 476 RPMFGAYGRSRTNSSVTFVSQASLDAGLAGRLGVAKELVAVQNTRGGIGKASMIHNSLTP 535

Query: 538 NITVDAERYEVRVDGELITCEPVDSVPLGQRYFLF 572
+I VD E YEVR DGEL+TCEP +P+ QRYFLF
Sbjct: 536 HIEVDPETYEVRADGELLTCEPATVLPMAQRYFLF 570


12    NTHI0706  NTHI0713  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
NTHI0706    2       19       -3.488089  DNA-binding transcriptional regulator OxyR
NTHI0707    3       26       -0.741186  peroxiredoxin/glutaredoxin
NTHI0708    3       25       -1.564909  hypothetical protein
NTHI0709    2       22       -2.102638  FKBP-type peptidylprolyl isomerase
NTHI0710    3       20       -1.741465  hypothetical protein
NTHI0711    3       16       -1.091267  sulfur transfer complex subunit TusD
NTHI0712    3       16       -1.106750  sulfur relay protein TusC
NTHI0713    1       16       -3.380749  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
NTHI0707    INFPOTNTIATR  121    7e-36    Macrophage infectivity potentiator signature.
>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 121 bits (304), Expect = 7e-36
Identities = 74/230 (32%), Positives = 117/230 (50%), Gaps = 9/230 (3%)

Query: 9 IAALMVSAVISGQVFAEDNTF---DEKAASYAVGTLMGSQMKDLVDSHKEVIKYDNARIL 65
+ A ++ +S + A D T D+ SY++G +G K + I + +
Sbjct: 6 VTAAIMGLAMSTAMAATDATSLTTDKDKLSYSIGADLGKNFK------NQGIDINPDVLA 59

Query: 66 DGLKDALEGKVDVRKDEKIQKTLESIEAKLVAASKAKAEAIAKQAKEEGDKFRAEFAKGK 125
G++D + G + +E+++ L + L+A A+ A++ K +GD F +
Sbjct: 60 KGMQDGMSGAQLILTEEQMKDVLSKFQKDLMAKRSAEFNKKAEENKAKGDAFLSANKSKP 119

Query: 126 DVKTTQSGLMYKIESAGKGDTIKSTDTVKVHYTGKLPNGKVFDSSVERGQPVEFQLDQVI 185
+ SGL YKI AG G +DTV V YTG L +G VFDS+ + G+P FQ+ QVI
Sbjct: 120 GIVVLPSGLQYKIIDAGTGAKPGKSDTVTVEYTGTLIDGTVFDSTEKAGKPATFQVSQVI 179

Query: 186 KGWTEGLQLVKKGGKIQLVIAPELGYGEQGAGASIPPNSTLIFDVEVLDV 235
GWTE LQL+ G ++ + +L YG + G I PN TLIF + ++ V
Sbjct: 180 PGWTEALQLMPAGSTWEVFVPADLAYGPRSVGGPIGPNETLIFKIHLISV 229


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
NTHI0712    TCRTETOQM     83     2e-19    Tetracycline resistance protein TetO/TetQ/TetM family ...
>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 83.4 bits (206), Expect = 2e-19
Identities = 58/194 (29%), Positives = 91/194 (46%), Gaps = 25/194 (12%)

Query: 13 VNVGTIGHVDHGKTTLT-------AAITTVLAKHYGGAARAFDQIDNAPEEKARGITINT 65
+N+G + HVD GKTTLT AIT + + G + DN E+ RGITI T
Sbjct: 4 INIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTT-----RTDNTLLERQRGITIQT 58

Query: 66 SHVEYDTPTRHYAHVDCPGHADYVKNMITGAAQMDGAILVVAATDGPMPQTREHILLGRQ 125
+ +D PGH D++ + + +DGAIL+++A DG QTR R+
Sbjct: 59 GITSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRK 118

Query: 126 VGVPYIIVFLNKCDMVDDEELLELVEMEVRELLSQ--------YDFPGDDTPIVRGSALQ 177
+G+P I F+NK D + L V +++E LS +P + + + +
Sbjct: 119 MGIP-TIFFINKIDQNGID--LSTVYQDIKEKLSAEIVIKQKVELYP--NMCVTNFTESE 173

Query: 178 ALNGVAEWEEKILE 191
+ V E + +LE
Sbjct: 174 QWDTVIEGNDDLLE 187


13    NTHI0860  NTHI0865  Y     N          N                   Genomic Island
LocusTag    DNBias  CDNBias  %GCBias    Product
NTHI0860    2       11       0.202986   uroporphyrin-III C-methyltransferase
NTHI0861    2       12       -0.414984  adenylate cyclase
NTHI0862    2       12       -0.717350  NAD(P)H-dependent glycerol-3-phosphate
NTHI0863    2       12       -0.854986  serine acetyltransferase
NTHI0864    2       14       0.633442   shikimate 5-dehydrogenase
NTHI0865    2       14       -0.095576  di- and tricarboxylate transporter
14    NTHI1164  NTHI1179  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
NTHI1164    2       11       -1.452012  isopropylmalate isomerase large subunit
NTHI1165    1       13       -1.845218  isopropylmalate isomerase small subunit
NTHI1166    2       15       -2.154423  IgA-specific serine endopeptidase
NTHI1167    3       17       -2.538412  recombination protein F
NTHI1168    6       20       -3.207047  DNA polymerase III subunit beta
NTHI1169    5       17       -3.119428  chromosome replication initiator DnaA
NTHI1171    -1      18       -2.437481  transferrin-binding protein 1
NTHI1172    0       18       -2.282645  transferrin-binding protein 2
NTHI1173    -1      15       -3.565003  hypothetical protein
NTHI1174    -1      15       -3.661642  50S ribosomal protein L34
NTHI1175    -1      13       -3.030154  ribonuclease P
NTHI1178    -1      14       -2.954414  hypothetical protein
NTHI1179    -2      13       -3.435349  inner membrane protein translocase component
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
NTHI1164    IGASERPTASE   1706   0.0      IgA-specific serine endopeptidase (S6) signature.
>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 1706 bits (4420), Expect = 0.0
Identities = 1186/1798 (65%), Positives = 1309/1798 (72%), Gaps = 261/1798 (14%)

Query: 1 MLNKKFKLNFIALTVAYALTPYTEAALVRNDVDYQIFRDFAENKGKFSVGATNVEVRDNK 60
MLNKKFKLNFIALTVAYALTPYTEAALVR+DVDYQIFRDFAENKGKFSVGATNV V+D
Sbjct: 1 MLNKKFKLNFIALTVAYALTPYTEAALVRDDVDYQIFRDFAENKGKFSVGATNVLVKDKN 60

Query: 61 NNNLGSALPKDIPMIDFSAVDVDKRIATLVNPQYVVGVKHVGNGVGELHFGNLNGNWNPK 120
N +LG+ALP IPMIDFS VDVDKRIATL+NPQYVVGVKHV NGV ELHFGNLNGN N
Sbjct: 61 NKDLGTALPNGIPMIDFSVVDVDKRIATLINPQYVVGVKHVSNGVSELHFGNLNGNMN-- 118

Query: 121 FGNSIQHRDVSWEENRYYTVEKNNFSSELNGKTQNNEKDKQYTSNKKDVPSELYGQALVK 180
GN+ HRDVS EENRY++VEKN + P++L G+ +
Sbjct: 119 NGNAKAHRDVSSEENRYFSVEKNEY------------------------PTKLNGKTVTT 154

Query: 181 EQQNQKRREDYYMPRLDKFVTEVAPIEASTTSSDAGTYNDQNKYPAFVRLGSGSQFIYKK 240
E Q QKRREDYYMPRLDKFVTEVAPIEAST SSDAGTYNDQNKYPAFVRLGSGSQFIYKK
Sbjct: 155 EDQTQKRREDYYMPRLDKFVTEVAPIEASTASSDAGTYNDQNKYPAFVRLGSGSQFIYKK 214

Query: 241 GSHYELILEEKNEKKEIIHRWDVGGDNLKLVGNAYTYGIAGTPYKVNHTDDGLIGFGDST 300
G +Y LIL +VGG+NLKLVG+AYTYGIAGTPYKVNH ++GLIGFG+S
Sbjct: 215 GDNYSLILNN----------HEVGGNNLKLVGDAYTYGIAGTPYKVNHENNGLIGFGNSK 264

Query: 301 EDHNDPKEILSRKPLTNYAVLGDSGSPLFVYDKSKEKWLFLGAYDFWGGYKKKSWQEWNI 360
E+H+DPK ILS+ PLTNYAVLGDSGSPLFVYD+ K KWLFLG+YDFW GY KKSWQEWNI
Sbjct: 265 EEHSDPKGILSQDPLTNYAVLGDSGSPLFVYDREKGKWLFLGSYDFWAGYNKKSWQEWNI 324

Query: 361 YKPQFAENILKKDSAGLLKG-NTQYNWTSKGNTSLISGTSESLSVDLVDNK-NLNHGKNV 418
YK QF +++L KDSAG L G T Y+W+S G TS I+G +SL+VDL D K NHGK+V
Sbjct: 325 YKSQFTKDVLNKDSAGSLIGSKTDYSWSSNGKTSTITGGEKSLNVDLADGKDKPNHGKSV 384

Query: 419 TFEGSGNLTLNNNIDQGAGGLFFEGDYEVKGTSENTTWKGAGISVAEGKTVKWKVHNPQF 478
TFEGSG LTLNNNIDQGAGGLFFEGDYEVKGTS+NTTWKGAG+SVAEGKTV WKVHNPQ+
Sbjct: 385 TFEGSGTLTLNNNIDQGAGGLFFEGDYEVKGTSDNTTWKGAGVSVAEGKTVTWKVHNPQY 444

Query: 479 DRLAKIGKGKLIVEGRGDNKGSLKVGDGTVVLKQQT-TTGQHAFASVGIVSGRSTVVLND 537
DRLAKIGKG LIVEG GDNKGSLKVGDGTV+LKQQT +GQHAFASVGIVSGRST+VLND
Sbjct: 445 DRLAKIGKGTLIVEGTGDNKGSLKVGDGTVILKQQTNGSGQHAFASVGIVSGRSTLVLND 504

Query: 538 DNQVDPNSIYFGFRGGRLDANGNNLTFEHIRNIDDGARLVNHNMTNASNITITGAGLITN 597
D QVDPNSIYFGFRGGRLD NGN+LTF+HIRNIDDGARLVNHNMTNASNITITG LIT+
Sbjct: 505 DKQVDPNSIYFGFRGGRLDLNGNSLTFDHIRNIDDGARLVNHNMTNASNITITGESLITD 564

Query: 598 PSQVTIYTPAITADDDNYYYVPSIPRGKDLYFSNTCYKYYALKQGGSPTAEMPCYSSEKS 657
P+ +T Y D+DN Y I G LY + Y YYAL++G S +E+P +S +S
Sbjct: 565 PNTITPYN-IDAPDEDNPYAFRRIKDGGQLYLNLENYTYYALRKGASTRSELP-KNSGES 622

Query: 658 DANWEFMGDNQNDAQKKAMVYINNRRMNGFNGYFGEEATKADQNGKLNVTFSGKSDQNRF 717
+ NW +MG ++A++ M +INN RMNGFNGYFGEE K NG LNVTF GKS+QNRF
Sbjct: 623 NENWLYMGKTSDEAKRNVMNHINNERMNGFNGYFGEEEGK--NNGNLNVTFKGKSEQNRF 680

Query: 718 LLTGGTNLNGELKVEKGTLFLSGRPTPHARDIANISSTEKDKHFAENNEVVVEDDWINRT 777
LLTGGTNLNG+L VEKGTLFLSGRPTPHARDIA ISST+KD HFAENNEVVVEDDWINR
Sbjct: 681 LLTGGTNLNGDLTVEKGTLFLSGRPTPHARDIAGISSTKKDPHFAENNEVVVEDDWINRN 740

Query: 778 FKATNINVTNNATLYSGRNVESITSNITASNKAKVHIGYKAGDTVCVRSDYTGYVTCHND 837
FKAT +NVT NA+LYSGRNV +ITSNITASNKA+VHIGYK GDTVCVRSDYTGYVTC D
Sbjct: 741 FKATTMNVTGNASLYSGRNVANITSNITASNKAQVHIGYKTGDTVCVRSDYTGYVTCTTD 800

Query: 838 TLSTKALNSFNPTNLRGNVNLTESANFTLGKANLFGTINSTENSQVNLKENSHWYLTGNS 897
LS KALNSFNPTNLRGNVNLTESANF LGKANLFGTI S NSQV L ENSHW+LTGNS
Sbjct: 801 KLSDKALNSFNPTNLRGNVNLTESANFVLGKANLFGTIQSRGNSQVRLTENSHWHLTGNS 860

Query: 898 DVHQLDLANGHIHLNNVSDATKETKYHTLNISNLSGNGSFYYWVDFTKNQGDKVVVTKSA 957
DVHQLDLANGHIHLN+ ++ TKY+TL +++LSGNGSFYY D + QGDKVVVTKSA
Sbjct: 861 DVHQLDLANGHIHLNSADNSNNVTKYNTLTVNSLSGNGSFYYLTDLSNKQGDKVVVTKSA 920

Query: 958 KGTFTLQVANKTGEPNHNELTLFDASNATERSGLNVSLANGKVDRGAWSYTLKENSGRYY 1017
G FTLQVA+KTGEPNHNELTLFDAS A R LNVSL VD GAW Y L+ +GRY
Sbjct: 921 TGNFTLQVADKTGEPNHNELTLFDASKAQ-RDHLNVSLVGNTVDLGAWKYKLRNVNGRYD 979

Query: 1018 LHNPEVERRNQTVDTPSIATANNMQADVPSVSNNHEETARV-EAPIPLPAPPAPATGSAM 1076
L+NPEVE+RNQTVDT +I T NN+QADVPSV +N+EE ARV EAP+P PPAPAT
Sbjct: 980 LYNPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVP---PPAPAT---- 1032

Query: 1077 ANEQPETRPAETVQPTMEDTNTTHPSGSEPQADTTQADDPNSESVPSETIEKVAENSPQE 1136
PSET E VAENS QE
Sbjct: 1033 ---------------------------------------------PSETTETVAENSKQE 1047

Query: 1137 SETVAKNEQKATETTAQNDEVAKEAKPTVEANTQTNELAQNGSETEETQEAETARQSEIN 1196
S+TV KNEQ ATETTAQN EVAKEAK V+ANTQTNE+AQ+GSET+ETQ ET + +
Sbjct: 1048 SKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVE 1107

Query: 1197 STEETVVEDDPTISEPKSRPRRSISSSSNNINLAGTEDTAKVETEKTQEAPQVAFQASPK 1256
E+ AKVETEKTQE P+V Q SPK
Sbjct: 1108 KEEK-----------------------------------AKVETEKTQEVPKVTSQVSPK 1132

Query: 1257 QEEPEMAKQQEQPKTVQSQAQPETTTQQAEPARENVSTVNNVKEAQPQAKPTTVAAKETT 1316
QE Q ET QAEPAREN TVN KE
Sbjct: 1133 QE------------------QSETVQPQAEPARENDPTVN---------------IKEPQ 1159

Query: 1317 ASNSEQKETAQPVANPKTAENKAENPQSTETTDENIHQPEAHTAVASTEVVTPENATTPI 1376
+ + +T QP
Sbjct: 1160 SQTNTTADTEQPA----------------------------------------------- 1172

Query: 1377 KPVENKTTEAEQPVTETTTVSTENPVVKNPENTTPATTQSTVNSEAVQSETATTEAVVSQ 1436
+ ++ EQPVTE+TTV+T N VV+NPENTTPATTQ TVNSE
Sbjct: 1173 ---KETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSE--------------- 1214

Query: 1437 SKVTSAEETTVASTQETTVDNSGSTPQPRSRRTRRSAQNSYEPVELHTENAENPQSGNDV 1496
S + P+ R RR+ RS ++ EP + +
Sbjct: 1215 ---------------------SSNKPKNRHRRSVRSVPHNVEPATTSSNDRST------- 1246

Query: 1497 ATQLVLRDLTSTNTNAVISDAMAKAQFVALNVGKAVSQHISQLEMNNEGQYNVWVSNTSM 1556
+ L DLTSTNTNAV+SDA AKAQFVALNVGKAVSQHISQLEMNNEGQYNVWVSNTSM
Sbjct: 1247 ---VALCDLTSTNTNAVLSDARAKAQFVALNVGKAVSQHISQLEMNNEGQYNVWVSNTSM 1303

Query: 1557 KENYSSSQYRHFSSKSAQTQLGWDQTISSNVQLGGVFTYVRNSNNFDKASSKNTLAQANL 1616
+NYSSSQYR FSSKS QTQLGWDQTIS+NVQLGGVFTYVRNSNNFDKA+SKNTLAQ N
Sbjct: 1304 NKNYSSSQYRRFSSKSTQTQLGWDQTISNNVQLGGVFTYVRNSNNFDKATSKNTLAQVNF 1363

Query: 1617 YSKYYMDNHWYLAVDLGYGNFQSNLQTNHNAKFARHTAQFGLTAGKAFNLGNFAVKPTVG 1676
YSKYY DNHWYL +DLGYG FQS LQTNHNAKFARHTAQFGLTAGKAFNLGNF + P VG
Sbjct: 1364 YSKYYADNHWYLGIDLGYGKFQSKLQTNHNAKFARHTAQFGLTAGKAFNLGNFGITPIVG 1423

Query: 1677 VRYSYLSNANFALAKDRIKVNPISVKTAFAQVDLSYTYHLGEFSITPILSARYDANQGSG 1736
VRYSYLSNA+FAL + RIKVNPISVKTAFAQVDLSYTYHLGEFS+TPILSARYDANQGSG
Sbjct: 1424 VRYSYLSNADFALDQARIKVNPISVKTAFAQVDLSYTYHLGEFSVTPILSARYDANQGSG 1483

Query: 1737 KINVDRYDFAYNVENQQQYNAGLKLKYHNVKLSLIGGLTKAKQAEKQKTAEVKLSFSF 1794
KINV+ YDFAYNVENQQQYNAGLKLKYHNVKLSLIGGLTKAKQAEKQKTAE+KLSFSF
Sbjct: 1484 KINVNGYDFAYNVENQQQYNAGLKLKYHNVKLSLIGGLTKAKQAEKQKTAELKLSFSF 1541


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
NTHI1175  60KDINNERMP   779    0.0      60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 779 bits (2013), Expect = 0.0
Identities = 304/551 (55%), Positives = 397/551 (72%), Gaps = 16/551 (2%)

Query: 1 MDSRRSLLVLALIFISFLVYQQWQLDKNPPVQTEQTTSITATSDVPASSPSNSQAIADSQ 60
MDS+R+LLV+AL+F+SF+++Q W+ DKNP Q +QTT T T+ A S ++ A Q
Sbjct: 1 MDSQRNLLVIALLFVSFMIWQAWEQDKNPQPQAQQTTQTTTTA---AGSAADQGVPASGQ 57

Query: 61 TRGRIITLENDVFRLKIDTLGGDVISSELLKYDAELDSKTPFELLKDTKEHIYIAQSGLI 120
G++I+++ DV L I+T GGDV + L Y EL+S PF+LL+ + + IY AQSGL
Sbjct: 58 --GKLISVKTDVLDLTINTRGGDVEQALLPAYPKELNSTQPFQLLETSPQFIYQAQSGLT 115

Query: 121 GKNGIDTRSG--RAQYQIEGDNFKLAEGQESLSVPLLF-EKDGVTYQKIFVLKRGSYDLG 177
G++G D + R Y +E D + LAEGQ L VP+ + + G T+ K FVLKRG Y +
Sbjct: 116 GRDGPDNPANGPRPLYNVEKDAYVLAEGQNELQVPMTYTDAAGNTFTKTFVLKRGDYAVN 175

Query: 178 VDYKIDNQSGQAIEVEPYGQLKHSII------ESSGNVAMPTYTGGAYSSSETNYKKYSF 231
V+Y + N + +E+ +GQLK SI S N A+ T+ G AYS+ + Y+KY F
Sbjct: 176 VNYNVQNAGEKPLEISSFGQLKQSITLPPHLDTGSSNFALHTFRGAAYSTPDEKYEKYKF 235

Query: 232 ADMQDN-NLSIDTKAGWVAVLQHYFVSAWIPNQDVNNQLYTITDSKNNVASIGYRGPVVT 290
+ DN NL+I +K GWVA+LQ YF +AWIP+ D N YT + N +A+IGY+ V
Sbjct: 236 DTIADNENLNISSKGGWVAMLQQYFATAWIPHNDGTNNFYT-ANLGNGIAAIGYKSQPVL 294

Query: 291 IPAGSQETITSSLWTGPKLQNQMATVANNLDLTVDYGWAWFIAKPLFWLLTFIQGIVSNW 350
+ G + S+LW GP++Q++MA VA +LDLTVDYGW WFI++PLF LL +I V NW
Sbjct: 295 VQPGQTGAMNSTLWVGPEIQDKMAAVAPHLDLTVDYGWLWFISQPLFKLLKWIHSFVGNW 354

Query: 351 GLAIICVTIVVKAILYPLTKAQYTSMAKMRILQPKMQEMRERFGDDRQRMSQEMMKLYKE 410
G +II +T +V+ I+YPLTKAQYTSMAKMR+LQPK+Q MRER GDD+QR+SQEMM LYK
Sbjct: 355 GFSIIIITFIVRGIMYPLTKAQYTSMAKMRMLQPKIQAMRERLGDDKQRISQEMMALYKA 414

Query: 411 EKVNPLGGCLPILLQMPIFIALYWTFLEAVELRHAPFFGWIQDLSAQDPYYILPILMGIS 470
EKVNPLGGC P+L+QMPIF+ALY+ + +VELR APF WI DLSAQDPYYILPILMG++
Sbjct: 415 EKVNPLGGCFPLLIQMPIFLALYYMLMGSVELRQAPFALWIHDLSAQDPYYILPILMGVT 474

Query: 471 MFLLQKMSPTPVTDPTQQKVMNFMPLVFMFFFLWFPSGLVLYWLVSNLITIAQQQLIYRG 530
MF +QKMSPT VTDP QQK+M FMP++F FFLWFPSGLVLY++VSNL+TI QQQLIYRG
Sbjct: 475 MFFIQKMSPTTVTDPMQQKIMTFMPVIFTVFFLWFPSGLVLYYIVSNLVTIIQQQLIYRG 534

Query: 531 LEKKGLHSRKK 541
LEK+GLHSR+K
Sbjct: 535 LEKRGLHSREK 545


15  NTHI1194  NTHI1216  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
NTHI1194  0       12       -3.004889  transketolase
NTHI1195  1       13       -3.736772  phosphoserine phosphatase
NTHI1196  2       15       -3.723173  nucleotide-binding protein
NTHI1197  1       15       -3.762304  magnesium/nickel/cobalt transporter CorA
NTHI1198  1       15       -3.588248  hypothetical protein
NTHI1199  1       16       -2.859922  hypothetical protein
NTHI1201  3       16       -1.860470  hypothetical protein
NTHI1203  3       14       -0.326085  ATPase
NTHI1204  4       14       -0.824507  hypothetical protein
NTHI1205  4       13       -0.421390  ferredoxin
NTHI1206  2       12       0.300502   twin-arginine leader-binding protein DmsD
NTHI1207  1       11       -0.328469  anaerobic dimethyl sulfoxide reductase chain C
NTHI1208  -1      14       -0.253117  anaerobic dimethyl sulfoxide reductase chain B
NTHI1209  -2      12       -1.130210  anaerobic dimethyl sulfoxide reductase chain A
NTHI1211  0       14       -0.623686  hypothetical protein
NTHI1212  0       15       -0.037137  mercuric transporter MerT
NTHI1213  3       16       -0.203341  copper chaperone MerP-like protein
NTHI1214  3       19       0.744767   ABC transporter ATP-binding protein
NTHI1215  3       20       0.666544   transcriptional regulator
NTHI1216  2       17       0.166406   hypothetical protein
16  NTHI1268  NTHI1283  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
NTHI1268  2       13       -2.663789  NAD-dependent DNA ligase LigA
NTHI1269  1       14       -2.592739  cell division protein ZipA
NTHI1272  2       17       -3.283060  sulfate transport protein CysZ
NTHI1273  3       16       -3.595580  cysteine synthase
NTHI1274  2       16       -3.387690  ADP-heptose--lipooligosaccharide
NTHI1275  1       14       -3.123522  xylose operon regulatory protein
NTHI1276  -1      11       0.230996   Na(+)/H(+) antiporter
NTHI1277  -1      13       0.972650   aspartate aminotransferase
NTHI1278  0       14       1.269596   xylose isomerase
NTHI1279  -1      14       1.503311   xylulose kinase
NTHI1280  -1      14       2.216407   ADP-L-glycero-D-manno-heptose-6-epimerase
NTHI1282  0       16       2.903061   thioredoxin-like protein
NTHI1283  2       18       2.394215   deoxyribose-phosphate aldolase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
NTHI1268  60KDINNERMP   28     0.034    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 28.4 bits (63), Expect = 0.034
Identities = 9/51 (17%), Positives = 23/51 (45%), Gaps = 4/51 (7%)

Query: 43 GLFWLFISQISSAIDWVMNFIPDWLSFLSVILLTLSILTILLLFYFTFTTF 93
G W + + W+ +F+ +W S+I++T + +++ T +
Sbjct: 331 GWLWFISQPLFKLLKWIHSFVGNW--GFSIIIIT--FIVRGIMYPLTKAQY 377


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
NTHI1278  NUCEPIMERASE  98     1e-25    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 97.9 bits (244), Expect = 1e-25
Identities = 78/348 (22%), Positives = 130/348 (37%), Gaps = 68/348 (19%)

Query: 2 IIVTGGAGFIGSNIVKALNDLGRKDILVVDNLKD--------------GTKFANLVDLDI 47
+VTG AGFIG ++ K L + G ++ +DNL D +D+
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAG-HQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 48 ADYCDKEDFIASIIAGDEFGDIDAVFHEGACSATTEWDGKYIMHNNYEYSK-------EL 100
AD + + + A F + VF A +Y + N + Y+ +
Sbjct: 62 ADR----EGMTDLFASGHF---ERVFISPHRLAV-----RYSLENPHAYADSNLTGFLNI 109

Query: 101 LHYCLDREIP-FFYASSAATYGDTKV--FREEREFEGPLNVYGYSKFLFDQYVRNILPEA 157
L C +I YASS++ YG + F + + P+++Y +K +
Sbjct: 110 LEGCRHNKIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLY 169

Query: 158 KSPVCGFRYFNVYGPRENHKGSMASVAFHLNNQILKGENPKLFAGSEGFRRDFVYVGDVA 217
P G R+F VYGP + MA F +L+G++ ++ +RDF Y+ D+A
Sbjct: 170 GLPATGLRFFTVYGPWG--RPDMA--LFKFTKAMLEGKSIDVY-NYGKMKRDFTYIDDIA 224

Query: 218 AVNI------------WCWQNGISG-------IYNLGTGNAESFRAVADAVVKFHG-KGE 257
I W + G +YN+G + A+ G + +
Sbjct: 225 EAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAK 284

Query: 258 IETIPFPEHLKSRYQEYTQADLTKLRS-TGYDKPFKTVAEGVTEYMAW 304
+P T AD L G+ P TV +GV ++ W
Sbjct: 285 KNMLPLQPG----DVLETSADTKALYEVIGF-TPETTVKDGVKNFVNW 327


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
NTHI1279  SECA          28     0.025    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 27.9 bits (62), Expect = 0.025
Identities = 14/37 (37%), Positives = 20/37 (54%), Gaps = 11/37 (29%)

Query: 75 TSPA-INSLANEGYQVVSVALRSGNEADVNDYLSKHD 110
T PA +N+L +G VV+V NDYL++ D
Sbjct: 113 TLPAYLNALTGKGVHVVTV----------NDYLAQRD 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
NTHI1282  HTHFIS        37     2e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 37.1 bits (86), Expect = 2e-04
Identities = 39/178 (21%), Positives = 58/178 (32%), Gaps = 53/178 (29%)

Query: 194 DLTDIIGQ----QHAKRALTIAAAGQHNLLFLGPPGTGKTMLASRLTGILPEMTDLEAIE 249
D ++G+ Q R L L+ G GTGK ++A A+
Sbjct: 135 DGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVA-------------RALH 181

Query: 250 TASVTSLVQNELNFHNWKQRPFRAPHHSASMP------ALVG-------GGTIPKPGEIS 296
+ PF A + A++P L G G G
Sbjct: 182 DYG------------KRRNGPFVAI-NMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFE 228

Query: 297 LATNGALFLDEL----PEFERKVLDALRQPLESGEIIISRANAKIQFPARFQLVAAMN 350
A G LFLDE+ + + ++L L+ GE I+ R +VAA N
Sbjct: 229 QAEGGTLFLDEIGDMPMDAQTRLLRV----LQQGEYTTVGGRTPIRSDVR--IVAATN 280


17  NTHI1456  NTHI1542  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
NTHI1456  2       14       -2.623610  leucine-responsive transcriptional regulator
NTHI1458  2       15       -2.655103  DNA translocase FtsK
NTHI1459  1       14       -3.291049  outer-membrane lipoprotein carrier protein
NTHI1460  -2      12       -2.587971  recombination factor protein RarA
NTHI1462  -2      11       -1.052403  hypothetical protein
NTHI1463  -1      13       -1.487297  modification methylase BepI-like
NTHI1464  -3      13       -1.447634  3-phosphoshikimate 1-carboxyvinyltransferase
NTHI1465  -2      12       -1.971286  formyltetrahydrofolate deformylase
NTHI1466  -2      13       -2.613015  DNA-binding protein H-NS-like protein
NTHI1468  -3      14       -1.497818  Na+/H+ antiporter
NTHI1469  0       19       -4.625023  hypothetical protein
NTHI1470  -1      15       -4.012472  acetolactate synthase 3 catalytic subunit
NTHI1471  0       15       -3.299162  acetolactate synthase 3 regulatory subunit
NTHI1472  -2      12       -1.798872  arginyl-tRNA synthetase
NTHI1473  -1      12       -1.074759  hypothetical protein
NTHI1474  -1      10       -1.527295  lipoprotein
NTHI1475  -2      11       0.019155   outer membrane lipoprotein PCP
NTHI1476  -1      12       0.489359   UDP-GlcNAc--lipooligosaccharide
NTHI1477  0       15       0.732998   glucose-6-phosphate isomerase
NTHI1479  2       18       -0.026350  alanine racemase
NTHI1480  5       22       -0.710806  replicative DNA helicase
NTHI1482  1       26       2.831400   pyruvate kinase
NTHI1483  0       26       2.521657   *prophage CP4-57-like integrase
NTHI1484  3       27       4.391581   hypothetical protein
NTHI1485  2       26       4.285235   hypothetical protein
NTHI1486  2       27       4.803838   hypothetical protein
NTHI1487  0       28       5.026978   hypothetical protein
NTHI1488  2       30       4.260246   hypothetical protein
NTHI1489  0       29       5.053604   modification methylase Bsp6I-like
NTHI1490  -1      33       3.159014   recombination associated protein
NTHI1491  0       28       1.890361   hypothetical protein
NTHI1492  2       28       1.607543   single-strand binding protein
NTHI1493  3       29       1.910746   hypothetical protein
NTHI1494  3       29       1.350732   recombinational DNA repair protein, RecE
NTHI1495  3       26       0.949464   hypothetical protein
NTHI1497  3       22       -0.893314  hypothetical protein
NTHI1498  4       23       -0.145250  modification methylase DpnIIB-like
NTHI1499  3       28       1.188008   hypothetical protein
NTHI1501  3       28       1.169150   hypothetical protein
NTHI1502  3       29       1.613463   hypothetical protein
NTHI1503  1       29       1.483225   hypothetical protein
NTHI1504  -1      30       3.271041   hypothetical protein
NTHI1505  -1      29       3.149892   hypothetical protein
NTHI1507  -2      25       2.678421   hypothetical protein
NTHI1508  -1      24       2.335841   hypothetical protein
NTHI1509  0       26       2.932351   hypothetical protein
NTHI1510  0       27       3.566488   hypothetical protein
NTHI1511  2       28       3.197710   hypothetical protein
NTHI1512  1       27       3.491641   hypothetical protein
NTHI1513  1       26       3.292041   hypothetical protein
NTHI1515  0       28       3.014097   hypothetical protein
NTHI1516  -1      32       3.362164   hypothetical protein
NTHI1517  0       31       1.735187   hypothetical protein
NTHI1518  2       30       1.276978   hypothetical protein
NTHI1519  3       31       0.100005   hypothetical protein
NTHI1520  5       31       -0.478699  hypothetical protein
NTHI1521  4       26       1.250900   hypothetical protein
NTHI1522  3       23       1.319694   hypothetical protein
NTHI1523  2       25       2.530093   hypothetical protein
NTHI1525  3       27       3.992806   hypothetical protein
NTHI1526  2       27       4.676276   DNA modification methylase
NTHI1527  2       27       5.111503   hypothetical protein
NTHI1528  2       26       5.260960   hypothetical protein
NTHI1529  3       29       5.846086   phage terminase large subunit
NTHI1530  2       28       6.103820   hypothetical protein
NTHI1531  3       31       5.835352   phage Mu protein gp30-like protein
NTHI1532  1       30       4.958939   hypothetical protein
NTHI1533  2       28       4.975982   hypothetical protein
NTHI1534  1       26       5.204827   hypothetical protein
NTHI1535  0       26       4.538411   hypothetical protein
NTHI1536  1       25       4.457956   hypothetical protein
NTHI1537  2       23       5.286945   hypothetical protein
NTHI1538  2       22       5.057420   hypothetical protein
NTHI1539  1       21       4.965161   hypothetical protein
NTHI1540  1       23       4.358773   hypothetical protein
NTHI1541  2       23       4.808555   hypothetical protein
NTHI1542  1       22       4.821774   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
NTHI1464  IGASERPTASE   30     0.003    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.0 bits (67), Expect = 0.003
Identities = 15/78 (19%), Positives = 29/78 (37%), Gaps = 3/78 (3%)

Query: 20 ELTLEQAENALEKLQTAIEEKRANEAELIKAETERKERL-AKYKELMEKEGITPEELHEI 78
E T + E A E + NE +ET+ + K +EKE E +
Sbjct: 1060 ETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKT 1119

Query: 79 FGTKTVSIRAKRAPRPAK 96
+ + ++ +P+ +
Sbjct: 1120 --QEVPKVTSQVSPKQEQ 1135


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
NTHI1476  ALARACEMASE   473    e-170    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 473 bits (1218), Expect = e-170
Identities = 148/358 (41%), Positives = 217/358 (60%), Gaps = 6/358 (1%)

Query: 4 KPATAKISSHALKQNLEIIKQKAPNSKIIAVVKANAYGHGVVFVASTLEQNVDCFGVARL 63
+P A + ALKQNL I++Q A ++++ +VVKANAYGHG+ + S + D F + L
Sbjct: 3 RPIQASLDLQALKQNLSIVRQAATHARVWSVVKANAYGHGIERIWSAIGA-TDGFALLNL 61

Query: 64 EEALALRSNGITKPILLLEGFFNEQDLPILAVNNIETVVHNHEQLEALKRSNLPSPIKVW 123
EEA+ LR G PIL+LEGFF+ QDL I + + T VH++ QL+AL+ + L +P+ ++
Sbjct: 62 EEAITLRERGWKGPILMLEGFFHAQDLEIYDQHRLTTCVHSNWQLKALQNARLKAPLDIY 121

Query: 124 LKIDTGMHRLGVALDEVDYFYQELKKLPQIQPHLGFVSHFSRADELESDYTQLQINRFLS 183
LK+++GM+RLG D V +Q+L+ + + + +SHF+ A+ D + R
Sbjct: 122 LKVNSGMNRLGFQPDRVLTVWQQLRAMANVG-EMTLMSHFAEAEH--PDGISGAMARIEQ 178

Query: 184 ATKDKQGERTIAASGGILFWPKSHLECIRPGIIMYGISPTDTIGK--EFGLTPVMNLTSS 241
A + + R+++ S L+ P++H + +RPGII+YG SP+ GL PVM L+S
Sbjct: 179 AAEGLECRRSLSNSAATLWHPEAHFDWVRPGIILYGASPSGQWRDIANTGLRPVMTLSSE 238

Query: 242 LIAVRHHKQGDPVGYGGIWTSPRDTKIGVVAMGYGDGYPRDVPEGTPVYLNGRLVPIVGR 301
+I V+ K G+ VGYGG +T+ + +IG+VA GY DGYPR P GTPV ++G VG
Sbjct: 239 IIGVQTLKAGERVGYGGRYTARDEQRIGIVAAGYADGYPRHAPTGTPVLVDGVRTMTVGT 298

Query: 302 VSMDMLTVDLGADSQDLVGDEVILWGKELPIETVAKFTGILSYELITKLTPRVITEYV 359
VSMDML VDL Q +G V LWGKE+ I+ VA G + YEL+ L RV V
Sbjct: 299 VSMDMLAVDLTPCPQAGIGTPVELWGKEIKIDDVAAAAGTVGYELMCALALRVPVVTV 356


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
NTHI1495  60KDINNERMP   30     0.007    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 30.3 bits (68), Expect = 0.007
Identities = 19/85 (22%), Positives = 30/85 (35%), Gaps = 17/85 (20%)

Query: 166 WFEWEKDNRKEIPKLHPT-------------QKPIAVLKRLIEIFTDEGDVVIDPVAG-- 210
W WE+D + T P + +LI + TD D+ I+ G
Sbjct: 20 WQAWEQDKNPQPQAQQTTQTTTTAAGSAADQGVPASGQGKLISVKTDVLDLTINTRGGDV 79

Query: 211 -SASTLRAARELNRPSYGFEIKKDS 234
A +ELN F++ + S
Sbjct: 80 EQALLPAYPKELNSTQ-PFQLLETS 103


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
NTHI1512  FbpA_PF05833  28     0.036    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 28.3 bits (63), Expect = 0.036
Identities = 11/71 (15%), Positives = 27/71 (38%), Gaps = 1/71 (1%)

Query: 117 LPPLVNELNPEPSIEPSDNHHIKKTTQKSESEMLLEQF-GITGQLAKDFIAHRKAKKGVI 175
PP +LNP + K+ + + + + F G++ L+ + K +
Sbjct: 164 YPPKSPKLNPFDFSYDMIENFTKENSLQLNDNIFSKIFTGVSKTLSSEICFRLKNNSIDL 223

Query: 176 NQTQLNRLQKQ 186
+ + L + +
Sbjct: 224 SLSNLKEIVEV 234


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
NTHI1525  BACINVASINB   25     0.023    Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 25.5 bits (55), Expect = 0.023
Identities = 13/45 (28%), Positives = 24/45 (53%)

Query: 29 RLGNAKDAIKSAQMAYTNPNFVRQIKKDPDKLIYQGIQALIAKYG 73
+LGNA + + PN ++Q+ ++ KL QG+Q + + G
Sbjct: 436 KLGNALSKMMGETIKKLVPNVLKQLAQNGSKLFTQGMQRITSGLG 480


18  NTHI1561  NTHI1567  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
NTHI1561  2       24       0.832013   hypothetical protein
NTHI1562  3       24       2.052982   hypothetical protein
NTHI1563  5       22       2.702957   hypothetical protein
NTHI1564  5       24       3.396780   hypothetical protein
NTHI1565  7       29       3.208380   hypothetical protein
NTHI1566  7       30       3.421734   hypothetical protein
NTHI1567  1       20       3.306532   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
NTHI1567  SYCDCHAPRONE  27     0.046    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 26.8 bits (59), Expect = 0.046
Identities = 13/49 (26%), Positives = 22/49 (44%), Gaps = 2/49 (4%)

Query: 27 DTPEQQFQQGLTAYEQSNYQTAFKL--WLPLAEQGDAQAQGGLGMMYER 73
DT EQ + Y+ Y+ A K+ L + + D++ GLG +
Sbjct: 34 DTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQA 82


19  NTHI1708  NTHI1721  Y  N  Y  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
NTHI1708  2       13       -3.141085  phosphoribosylglycinamide formyltransferase
NTHI1709  2       18       -3.578050  ABC transporter periplasmic protein
NTHI1711  3       23       -3.671192  universal stress protein UspE
NTHI1712  3       24       -3.854910  fumarate/nitrate reduction transcriptional
NTHI1713  7       22       -5.169252  *integrase/recombinase
NTHI1714  9       25       -5.241680  hypothetical protein
NTHI1715  7       27       -3.196291  phage anti-repressor protein
NTHI1716  7       27       -2.289815  hypothetical protein
NTHI1717  5       27       -1.411061  hypothetical protein
NTHI1718  2       23       -0.777188  hypothetical protein
NTHI1719  -1      22       0.321186   hypothetical protein
NTHI1720  1       21       2.033436   hypothetical protein
NTHI1721  2       22       2.425719   hypothetical protein
20  NTHI1772  NTHI1787  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
NTHI1772  3       18       -3.577312  anthranilate synthase component II
NTHI1773  2       18       -3.281638  anthranilate synthase component I
NTHI1774  2       20       -4.700019  ferritin
NTHI1775  1       19       -5.002607  ferritin
NTHI1776  0       17       -5.234800  phosphate-binding periplasmic protein PstS
NTHI1777  1       18       -6.615409  phosphate transport system permease protein
NTHI1778  2       9        -3.180484  phosphate transport system permease protein
NTHI1779  2       9        -2.676918  phosphate transporter ATP-binding protein
NTHI1780  2       8        -2.350873  phosphate regulon transcriptional regulatory
NTHI1781  3       10       -2.787120  phosphate regulon sensor protein PhoR
NTHI1782  2       9        -2.275008  exonuclease I
NTHI1784  2       8        -1.412005  hypothetical protein
NTHI1785  0       13       -3.470986  hypothetical protein
NTHI1786  0       13       -4.153944  cell division protein MukB
NTHI1787  -3      13       -4.024612  condensin subunit E
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
NTHI1778  HTHFIS        88     2e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 88.3 bits (219), Expect = 2e-22
Identities = 36/130 (27%), Positives = 62/130 (47%), Gaps = 4/130 (3%)

Query: 1 MTR-KILIVEDECAIREMIALFLSQKYYDVIEASDFKTAINKI-KENPKLILLDWMLPGR 58
MT IL+ +D+ AIR ++ LS+ YDV S+ T I + L++ D ++P
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 59 SGIQFIQYIKKQESYAAIPIIMLTAKSTEEDCIACLNAGADDYITKPFSPQILLARIEAV 118
+ + IKK +P+++++A++T I GA DY+ KPF L+ I
Sbjct: 61 NAFDLLPRIKKA--RPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA 118

Query: 119 WRRIYEQQSQ 128
+ S+
Sbjct: 119 LAEPKRRPSK 128


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
NTHI1784  GPOSANCHOR    42     2e-05    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 41.6 bits (97), Expect = 2e-05
Identities = 26/267 (9%), Positives = 78/267 (29%), Gaps = 18/267 (6%)

Query: 881 EINRERNEIDRELNQFNNGEQQLRIQLDNAKEKLQLLNKLIPQLNVLTDEDLIDRIEECR 940
+++ + ++ + +L + L I +L + +
Sbjct: 75 DLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKAD----LEKALE 130

Query: 941 EQLDIAEQDEYFIRQYGVTLSQLEPIANSLQSDPENYDGLKNELTQAIERQKQVQQRVFA 1000
++ + D I+ + L L+ E + I+ + + + A
Sbjct: 131 GAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEA 190

Query: 1001 LADVVQRKHHFGYEDAGQAKTSELNEKLRQRLEQMQAQRDMQREQVRQKQSQFAEYNRVL 1060
+++ + +++ ++A++ + +
Sbjct: 191 RQAELEKALE---------GAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFS 241

Query: 1061 IQLQSSYDSKYQLLNELISEISDLGVRADDGAEERARIRRDELHQQLSTSRQRRSYVEKQ 1120
+ + L + ++L + A E A ++ T ++ +E +
Sbjct: 242 TADSAKIKTLEAEKAALEARQAEL-----EKALEGAMNFSTADSAKIKTLEAEKAALEAE 296

Query: 1121 LTLIESEADNLNRLIRKTERDYKTQRE 1147
+E ++ LN + RD RE
Sbjct: 297 KADLEHQSQVLNANRQSLRRDLDASRE 323



Score = 34.3 bits (78), Expect = 0.005
Identities = 50/358 (13%), Positives = 111/358 (31%), Gaps = 29/358 (8%)

Query: 414 ANDALEESQAQFEQTEIEIDAVRAQLADYQQALDAQQTRALQYQQAITALEKAKTLCGLA 473
+ E S ++ V+ + ++ + + + AL+
Sbjct: 34 VVNTNEVSAVATRSQTDTLEKVQERADKFEIENNTLKLKNSDLSFNNKALKD-------- 85

Query: 474 DLSVKNVEDYHAEFEAHAESLTETVLELEHKMSISEAAKSQFDKAYQLVCKIAGEIPRSS 533
+ + + + +++ E K+ EA K+ +KA + +
Sbjct: 86 --HNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKI 143

Query: 534 AWESAKELLREYPSQKLQAQQTPQLRTKLHELEQRYAQQQSAVKLLNDFNQRANLSLQTA 593
E + + + ++ L +L+ A
Sbjct: 144 K-TLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGA 202

Query: 594 EELEDYHAEQEALIEDISARLSEQVENRSTLRQKRENLTALYDENARKAPAWLTAQAALE 653
A I+ + A + ++ L + E ++ K T +A
Sbjct: 203 MNFST---ADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIK---TLEAEKA 256

Query: 654 RLEQQSGETFEHSQDVMNFMQSQLVKERELTMQRDQLEQKRLQLDE--QISRLSQPDGSE 711
LE + E + + MNF + K + L ++ LE ++ L+ Q+ ++
Sbjct: 257 ALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRR 316

Query: 712 DPRLNMLAERFGGVLLSELYDDVTIEDAPYFSALYGPARHAIVVRDLNAVREQLAQLE 769
D + A++ +L + I +A + + RDL+A RE QLE
Sbjct: 317 DLDASREAKKQLEAEHQKLEEQNKISEA----SRQS------LRRDLDASREAKKQLE 364



Score = 32.0 bits (72), Expect = 0.019
Identities = 36/269 (13%), Positives = 76/269 (28%), Gaps = 22/269 (8%)

Query: 1020 KTSELNEKLRQRLEQMQAQRDMQREQVRQKQSQFAEYNRVLIQLQSSYDSKYQLLNELIS 1079
K E +K ++ + + + E L + + L+E S
Sbjct: 54 KVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKAS 113

Query: 1080 EISDLGVRADDG-----AEERARIRRDELHQQLSTSRQRRSYVEKQLTLIESEADNLNRL 1134
+I +L R D + L + + + L A N +
Sbjct: 114 KIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTA 173

Query: 1135 IRKTERDYKTQRELVVAAKVSWCVVLRLSRNSDMEKRLNRRELAYLSADELRSMSDKALG 1194
+ + ++ + A R +++EK L + +
Sbjct: 174 DSAKIKTLEAEKAALEA------------RQAELEKALEGAMNFSTADSAKIKTLEAEKA 221

Query: 1195 ALRTAVADNEYLRDSLR-VSEDSRKPENKVRFFIAVYQH----LRERIRQDIIKTDDPID 1249
AL AD E + S + A + L + + + +
Sbjct: 222 ALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSA 281

Query: 1250 AIEQMEIELSRLTAELTGREKKLAISSES 1278
I+ +E E + L AE E + + + +
Sbjct: 282 KIKTLEAEKAALEAEKADLEHQSQVLNAN 310


21  NTHI1825  NTHI1844  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
NTHI1825  2       19       -0.574833  spermidine/putrescine-binding periplasmic
NTHI1826  1       18       -1.109909  ABC transporter ATPase
NTHI1827  2       16       -1.164642  deoxyguanosinetriphosphate
NTHI1828  3       14       -0.193483  effector of murein hydrolase
NTHI1829  2       12       0.755276   hypothetical protein
NTHI1830  2       12       0.406983   micrococcal nuclease-like protein
NTHI1831  1       20       3.078061   selenocysteine lyase
NTHI1832  -1      18       2.434949   hypothetical protein
NTHI1833  -1      16       1.089564   hypothetical protein
NTHI1834  -1      21       2.381329   7-cyano-7-deazaguanine reductase
NTHI1835  0       16       2.138255   bifunctional chorismate mutase/prephenate
NTHI1838  0       12       1.855411   tRNA pseudouridine synthase B
NTHI1839  1       8        0.608706   ribosome-binding factor A
NTHI1841  2       9        1.310492   type I restriction enzyme HindVIIP M protein
NTHI1843  2       11       2.531027   type I restriction enzyme HindVIIP specificity
NTHI1844  2       10       1.749310   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
NTHI1844  TCRTETOQM     70     2e-14    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 69.9 bits (171), Expect = 2e-14
Identities = 59/278 (21%), Positives = 96/278 (34%), Gaps = 77/278 (27%)

Query: 350 IMGHVDHGKTSLLDYIRKAKVAAGEAG------------------GITQHIGAYHVEMDD 391
++ HVD GKT+L + + A E G GIT G + ++
Sbjct: 8 VLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQWEN 67

Query: 392 GKMITFLDTPGHAAFTSMRARGAKATDIVVLVVAADDGVMPQTIEAIQHAKAAGAPLVVA 451
K + +DTPGH F + R D +L+++A DGV QT + G P +
Sbjct: 68 TK-VNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIPTIFF 126

Query: 452 VNKIDKPEANP-----------------------------------DRVEQELLQHDVIS 476
+NKID+ + ++ + + +D +
Sbjct: 127 INKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEGNDDLL 186

Query: 477 EKFGGDVQ------------------FVPV---SAKKGTGVDDLLDAILLQSEVLELTAV 515
EK+ PV SAK G+D+L++ I ++ T
Sbjct: 187 EKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVIT--NKFYSSTHR 244

Query: 516 KDGMASGVVIESYLDKGRGPVATILVQSGTLRKGDIVL 553
G V + + R +A I + SG L D V
Sbjct: 245 GQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVR 282


22  NTHI1857  NTHI1882  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
NTHI1857  0       22       3.153845   hypothetical protein
NTHI1858  0       24       1.974882   hypothetical protein
NTHI1860  1       22       0.646453   tail fiber protein
NTHI1861  3       22       -0.688109  bacteriophage P2-related tail formation protein
NTHI1862  -1      20       -0.082274  phage-like baseplate assembly protein
NTHI1863  1       19       -1.294600  baseplate assembly protein W
NTHI1864  1       17       -1.440611  phage P2-like baseplate assembly protein
NTHI1865  2       17       -0.671718  hypothetical protein
NTHI1866  4       17       -0.434520  hypothetical protein
NTHI1867  4       19       0.276307   hypothetical protein
NTHI1869  2       25       -0.067533  hypothetical protein
NTHI1870  2       28       0.455779   phage-like tail protein
NTHI1871  2       34       1.682873   hypothetical protein
NTHI1872  4       31       2.087028   hypothetical protein
NTHI1873  2       33       3.044753   hypothetical protein
NTHI1874  1       31       1.997501   hypothetical protein
NTHI1875  2       32       1.172130   bacteriophage tail completion protein gpS-like
NTHI1876  3       34       1.003925   bacteriophage tail completion protein gpR-like
NTHI1878  3       32       2.100005   hypothetical protein
NTHI1879  3       30       3.040384   hypothetical protein
NTHI1880  1       27       3.439669   DnaK suppressor protein
NTHI1881  1       26       3.543292   hypothetical protein
NTHI1882  1       25       3.435996   phage-like lysozyme
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
NTHI1869  GPOSANCHOR    30     0.012    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 29.6 bits (66), Expect = 0.012
Identities = 27/201 (13%), Positives = 59/201 (29%), Gaps = 14/201 (6%)

Query: 5 ELTVLLNAIDKISAPVRSASKSVHELSAKLKESKSIHRQLNQQNKQHQAAMKQYASAINP 64
L+ + I ++ A K++ + + L + A A+
Sbjct: 107 SLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEG 166

Query: 65 LKSKLDSL---NQELEQAKQKAASYAQYMKN-----------AQHPTAGFQKEVEKAKSA 110
+ + + LE K + ++ + E +
Sbjct: 167 AMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAAR 226

Query: 111 VKKLKQEQIDAANKLQQARQELAKSGISAEKLAQKQRELQKNTKSATDQIKNQEAALKKL 170
L++ A N ++ L +Q EL+K + A + A +K L
Sbjct: 227 KADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTL 286

Query: 171 NAKQAAYNQYRGQVEKLKDIS 191
A++AA + +E +
Sbjct: 287 EAEKAALEAEKADLEHQSQVL 307


23  NTHI1942  NTHI1948  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
NTHI1942  0       18       -3.711872  uracil phosphoribosyltransferase
NTHI1943  3       21       -2.474891  uracil permease
NTHI1944  2       18       -2.215942  DNA replication initiation factor
NTHI1945  3       18       -1.330624  translation initiation factor 1-like protein
NTHI1946  4       20       -0.521414  orotidine 5'-phosphate decarboxylase
NTHI1947  4       18       0.088667   tetratricopeptide repeat protein
NTHI1948  4       18       0.292784   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NTHI1943  SECA  26     0.025    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 26.0 bits (57), Expect = 0.025
Identities = 13/31 (41%), Positives = 17/31 (54%)

Query: 52 DLSDEDLKKLAAELKKRCGCGGSVKNGIIEF 82
LSDE+LK AE + R G ++N I E
Sbjct: 37 KLSDEELKGKTAEFRARLEKGEVLENLIPEA 67


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
NTHI1947  DNABINDINGHU  107    2e-34    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 107 bits (270), Expect = 2e-34
Identities = 34/89 (38%), Positives = 52/89 (58%), Gaps = 1/89 (1%)

Query: 2 TKSELMEKLSAKQPTLSAKEIENMVKDILEFISQSLENGDRVEVRGFGSFSLHHRQPRLG 61
K +L+ K+ A+ L+ K+ V + +S L G++V++ GFG+F + R R G
Sbjct: 3 NKQDLIAKV-AEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKG 61

Query: 62 RNPKTGDSVNLSAKSVPYFKAGKELKARV 90
RNP+TG+ + + A VP FKAGK LK V
Sbjct: 62 RNPQTGEEIKIKASKVPAFKAGKALKDAV 90


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
NTHI1948  ACRIFLAVINRP  30     0.028    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 30.2 bits (68), Expect = 0.028
Identities = 25/162 (15%), Positives = 51/162 (31%), Gaps = 22/162 (13%)

Query: 318 PSKVVSLGDTVEVMVLEIDEERRRISLG-LKQCKANPWTQFADTHNKGDKVTGKIKSITD 376
+ T ++ ++ + +I+ G L A P Q N + K+ +
Sbjct: 190 ADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQL----NASIIAQTRFKNPEE 245

Query: 377 FGIFIGLEGGIDGLVHLSDISWSISGEEAVRQYKKGDEVSAVVLAV------------DA 424
FG +V L D++ G E + + A L + A
Sbjct: 246 FGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALDTAKA 305

Query: 425 VKERISLGIKQLEED-----PFNNFVAINKKGAVVSATVVEA 461
+K +++ + P++ + V T+ EA
Sbjct: 306 IKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEA 347


24  NTHI2000  NTHI2007  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
NTHI2000  0       13       -3.972751  molybdate transporter ATP-binding protein
NTHI2001  1       14       -5.666764  molybdate ABC transporter permease protein
NTHI2002  3       17       -7.368409  molybdate-binding periplasmic protein
NTHI2003  5       17       -7.496367  transcriptional regulator ModE
NTHI2004  4       17       -7.065104  UDP-galactose--lipooligosaccharide
NTHI2005  3       16       -5.913966  UDP-galactose--lipooligosaccharide
NTHI2006  2       12       -4.044940  UDP-GlcNAc--lipooligosaccharide
NTHI2007  2       11       -3.000516  UDP-galactose--lipooligosaccharide
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
NTHI2006  56KDTSANTIGN  31     0.008    Rickettsia 56kDa type-specific antigen protein sign...
		>56KDTSANTIGN#Rickettsia 56kDa type-specific antigen protein signature.

Length = 533

Score = 30.7 bits (69), Expect = 0.008
Identities = 18/74 (24%), Positives = 28/74 (37%), Gaps = 6/74 (8%)

Query: 153 KIKKHYTVYPNYKNIVSNIEPISLWDNQIDCEIDG--EVSFFIGQPLLNTKEENISLIKK 210
I+K + + P + PIS+ D +I + QP LN ++ + I
Sbjct: 127 PIRKPFKLTPPQPTMS----PISIADRDFGIDIPNIPQAQRQAAQPPLNDQKRAAARIAW 182

Query: 211 LKEQFSFDYYFPHP 224
LK DY P
Sbjct: 183 LKNCAGIDYMVKDP 196


25  NTHI0220  NTHI0226  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
NTHI0220  -2      10       -1.828525  uridine kinase
NTHI0221  -1      9        -0.778886  deoxycytidine triphosphate deaminase
NTHI0222  -2      10       -0.873294  hypothetical protein
NTHI0223  -1      16       -1.395226  sugar efflux transporter
NTHI0224  0       16       -1.243826  GTP-binding protein EngA
NTHI0225  -1      13       -0.638595  **DNA polymerase III subunit epsilon
NTHI0226  -2      13       0.282464   ribonuclease H
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
NTHI0220  PREPILNPTASE  30     0.016    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 29.8 bits (67), Expect = 0.016
Identities = 11/57 (19%), Positives = 22/57 (38%), Gaps = 8/57 (14%)

Query: 4 FALIVGIVALAIFSFLYIQLYRV--------QSAINEQLAQQNIAVQSINLSLFSPA 52
+ +V + +L I SFL + ++R+ Q+ + V +L P
Sbjct: 15 YFSLVFLFSLMIGSFLNVVIHRLPIMLEREWQAEYRSYFNPDDEGVDEPPYNLMVPR 71


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
NTHI0221  TCRTETA  54     3e-10    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 54.4 bits (131), Expect = 3e-10
Identities = 46/277 (16%), Positives = 96/277 (34%), Gaps = 12/277 (4%)

Query: 46 QTADTGLMMTVYAWTVLIMSLPAMLATGNMERKSLLIKLFIIFIVGHILLVIAWNFWILL 105
TA G+++ +YA + + R+ +L+ V + ++ A W+L
Sbjct: 41 VTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLY 100

Query: 106 LARMCIALAHSVFWSITASLVMRISPKHKKTQALGMLAIGTALATILGLPIGRIVGQLVG 165
+ R+ + + ++ + + I+ ++ + G ++ + G +G ++G
Sbjct: 101 IGRIVAGITGATG-AVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGG-FS 158

Query: 166 WRVTFGIIAVLALSIMFLIIRLLP--------NLPSKNAGSIASLPLLAKRPLLLWLYVT 217
F A L LLP L + +AS ++ L
Sbjct: 159 PHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAV 218

Query: 218 TAIVISAHFTAYTYIEPFMIDVGHLDPNFATAVLLVFGFSG-IAASLLFNRLY-RFAPTK 275
I+ F D H D L FG +A +++ + R +
Sbjct: 219 FFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERR 278

Query: 276 FIVVSMSLLMFSLLLLLFSTEAIIAMFSLVFIWGIGI 312
+++ M +LL F+T +A +V + GI
Sbjct: 279 ALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGI 315


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits      Score  E-value  Comments
NTHI0222  MYCMG045  32     0.006    Hypothetical mycoplasma lipoprotein (MG045) signature.
		>MYCMG045#Hypothetical mycoplasma lipoprotein (MG045) signature.

Length = 483

Score = 32.0 bits (72), Expect = 0.006
Identities = 15/49 (30%), Positives = 24/49 (48%)

Query: 166 EKMENADENDRTSEEEQDEWEQEFDFDSEEDTALIDDALDEELEEEQDK 214
K NA+ + +Q E+EFD+ +E AL++ EL E + K
Sbjct: 388 SKKNNAEMKSKQMSTDQMTSEKEFDYYTETLKALLEKEDSAELNENEKK 436


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
NTHI0225  ECOLNEIPORIN  45     3e-07    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 44.8 bits (106), Expect = 3e-07
Identities = 60/346 (17%), Positives = 121/346 (34%), Gaps = 38/346 (10%)

Query: 1 MKKTLAALIVGAFAASAANAAVVYNNEGTNVELGGRLSIIAEQSNSTIKDQKQQHGALRN 60
MKK+L AL + A +A +Y VE ++ Q+ S + +
Sbjct: 1 MKKSLIALTLAALPVAAMADVTLYGTIKAGVETSRSVAHNGAQAASVETGTG-----IVD 55

Query: 61 QSSRFHIKATHNFGDGFYAQGYLETRLVSAQSGTESDNFGHIITKYAYVTLGNKALGEVK 120
S+ K + G+G A +E + A GT+S + +++ L G+++
Sbjct: 56 LGSKIGFKGQEDLGNGLKAIWQVEQKASIA--GTDSGWG----NRQSFIGLK-GGFGKLR 108

Query: 121 LGRAKTIADGITS--AEDKEYGVLNNSKYIPTNGNTVGYTFKGID--GLVLGANYLLAQE 176
+GR ++ D + L +K + + + GL Y L
Sbjct: 109 VGRLNSVLKDTGDINPWDSKSDYLGVNKIAEPEARLISVRYDSPEFAGLSGSVQYALNDN 168

Query: 177 RHK---------YTTAAGTRAVTVAG---------EVYPQKISNGVQVGAKYDANNIIAG 218
+ + G V G E + ++ + YD + + A
Sbjct: 169 AGRHNSESYHAGFNYKNGGFFVQYGGAYKRHHQVQENVNIEKYQIHRLVSGYDNDALYAS 228

Query: 219 IAYGRTNYREDIVDPDLGKKQQVNGALSTLGYRFSDLGLLVSLDSGYAKTKNYKDKHEKS 278
+ + +V+ + Q +TL YRF ++ VS G+ + + + +
Sbjct: 229 V-AVQQ-QDAKLVEENYSHNSQ-TEVAATLAYRFGNVTPRVSYAHGFKGSFDATNYNNDY 285

Query: 279 YFVSPGFQYELMEDTNVYGNFKYERDSVDQGKKTREQAVLFGVDHK 324
V G +Y+ + T+ + + ++ + K A G+ HK
Sbjct: 286 DQVVVGAEYDFSKRTSALVSAGWLQEGKGESKFVST-AGGVGLRHK 330


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
NTHI0226  UREASE  34     0.001    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 33.9 bits (78), Expect = 0.001
Identities = 12/29 (41%), Positives = 19/29 (65%)

Query: 339 PARAIGIDDRLGSVEKGKIANLAVFTPNY 367
PA A G+ +GS+E GK A+L ++ P +
Sbjct: 413 PAIAHGLSHEIGSLEVGKRADLVLWNPAF 441


26  NTHI0846  NTHI0853  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
NTHI0846  -2      13       -0.371208  trigger factor
NTHI0847  -1      14       0.457926   ATP-dependent Clp protease proteolytic subunit
NTHI0848  -1      11       0.924315   ATP-dependent protease ATP-binding subunit ClpX
NTHI0849  -2      13       0.258074   preprotein translocase subunit SecE
NTHI0851  -3      13       -0.342769  transcription antitermination protein NusG
NTHI0852  -3      12       -1.012471  VacJ lipoprotein
NTHI0853  -1      13       -2.385375  YjgF family translation initiation inhibitor
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
NTHI0846  HTHFIS  36     2e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 36.3 bits (84), Expect = 2e-04
Identities = 33/192 (17%), Positives = 67/192 (34%), Gaps = 53/192 (27%)

Query: 51 LEESAVENEEKLPTPHEIRAHLDDYVIGQDYAKKVLSVAVYNHYKRLRTNYESNDVELGK 110
+ A+ ++ P+ E + ++G+ A +Y R +
Sbjct: 114 IIGRALAEPKRRPSKLEDDSQDGMPLVGRSAA----MQEIYRVLAR---------LMQTD 160

Query: 111 SNILLIGPTGSGKTLLAQTL---ARRLNVPFAMADATTLTEAGYVGEDVENVLQKLLQNC 167
+++ G +G+GK L+A+ L +R N PF + + ++++ L
Sbjct: 161 LTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIP---------RDLIESELFGH 211

Query: 168 EYDT------------EKAEKGIIYIDEIDKISRKSEGASITRDVSGEGVQQALLKLIEG 215
E E+AE G +++DEI + Q LL++++
Sbjct: 212 EKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMP--------------MDAQTRLLRVLQQ 257

Query: 216 TIASIPPQGGRK 227
GGR
Sbjct: 258 G--EYTTVGGRT 267


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
NTHI0847  SECETRNLCASE  140    4e-46    Bacterial translocase SecE signature.
		>SECETRNLCASE#Bacterial translocase SecE signature.

Length = 127

Score = 140 bits (353), Expect = 4e-46
Identities = 50/121 (41%), Positives = 73/121 (60%), Gaps = 1/121 (0%)

Query: 18 EGKSKGLNTFLWVLVVIFFAAAAIGNIYFQQIYSLPIRVIGMAIALVIAFILAAITNQGT 77
+G +GL WV+VV A +GN ++ I LP+R + + I + A +A +T +G
Sbjct: 8 QGSGRGLEAMKWVVVVALLLVAIVGNYLYRDIM-LPLRALAVVILIAAAGGVALLTTKGK 66

Query: 78 KARAFFNDSRTEARKVVWPTRAEARQTTLIVIGVTMIASLFFWAVDSIIVTVINFLTDLR 137
AF ++RTE RKV+WPTR E TTLIV VT + SL W +D I+V +++F+T LR
Sbjct: 67 ATVAFAREARTEVRKVIWPTRQETLHTTLIVAAVTAVMSLILWGLDGILVRLVSFITGLR 126

Query: 138 F 138
F
Sbjct: 127 F 127


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
NTHI0849  VACJLIPOPROT  318    e-113    VacJ lipoprotein signature.
		>VACJLIPOPROT#VacJ lipoprotein signature.

Length = 251

Score = 318 bits (817), Expect = e-113
Identities = 106/253 (41%), Positives = 156/253 (61%), Gaps = 11/253 (4%)

Query: 4 KVILTAL-LSAIALTGCANHNATKQASERNDSLEDFNRTMWKFNYNVIDRYVLEPAAKGW 62
K+ L+AL L L GCA+ +Q R+D LE FNRTM+ FN+NV+D Y++ P A W
Sbjct: 2 KLRLSALALGTTLLVGCASSGTDQQ--GRSDPLEGFNRTMYNFNFNVLDPYIVRPVAVAW 59

Query: 63 NNYVPKPISSGLAGIANNLDEPVSFINRLIEGEPKKAFVHFNRFWINTVFGLGGFIDFAS 122
+YVP+P +GL+ NL+EP +N ++G+P + VHF RF++NT+ G+GGFID A
Sbjct: 60 RDYVPQPARNGLSNFTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAG 119

Query: 123 -ASKELRIDNQRGFGETLGSYGVDAGTYIVLPIYNATTPRQLTGAVVDAAYMYPFWQWVG 181
A+ +L+ FG TLG YGV G Y+ LP Y + T R G + DA +YP W+
Sbjct: 120 MANPKLQRTEPHRFGSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMADA--LYPVLSWLT 177

Query: 182 GPWALVKYGVQAVDARAKNLNNAELLRQAQDPYITFREAYYQNLQFKVNDGKLVESK--- 238
P ++ K+ ++ ++ RA+ L++ LLRQ+ DPYI REAY+Q F N G+L +
Sbjct: 178 WPMSVGKWTLEGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGGELKPQENPN 237

Query: 239 -ESLPDDILKEID 250
+++ DD LK+ID
Sbjct: 238 AQAIQDD-LKDID 249


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
NTHI0853  PF01206  90     5e-28    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 90.2 bits (224), Expect = 5e-28
Identities = 24/71 (33%), Positives = 42/71 (59%)

Query: 8 QTLNTLGLRCPEPVMLVRKNIRHLNDGEILLIIADDPATTRDIPSFCQFMDHTLLQSEVE 67
Q+L+ GL CP P++ +K + +N GE+L ++A DP + +D SF + H LL+ + E
Sbjct: 6 QSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKEE 65

Query: 68 KPPFKYWVKRG 78
+ + +KR
Sbjct: 66 DGTYHFRLKRA 76


27  NTHI1004  NTHI1009  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
NTHI1004  -2      12       -1.921982  fumarate reductase flavoprotein subunit
NTHI1005  -3      13       -1.767437  lysyl-tRNA synthetase
NTHI1006  -2      12       -2.192436  transcriptional regulatory protein CpxR
NTHI1007  -2      14       -1.982389  small protein A
NTHI1008  -2      13       -1.544322  nucleoid-associated protein NdpA
NTHI1009  -3      11       -0.951046  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
NTHI1004  HTHFIS  94     1e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 94.1 bits (234), Expect = 1e-24
Identities = 37/118 (31%), Positives = 61/118 (51%), Gaps = 2/118 (1%)

Query: 2 SKLLLVDDDIELTELLSTLLELEGFDVETANNGLEALQKL-NESYKLVLLDVMMPKLNGI 60
+ +L+ DDD + +L+ L G+DV +N + + LV+ DV+MP N
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 61 ETLKEIRKV-SNVPVMMLTARGEDIDRVLGLELGADDCLPKPFNDRELIARIKAILRR 117
+ L I+K ++PV++++A+ + + E GA D LPKPF+ ELI I L
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
NTHI1006  DNABINDINGHU  28     0.014    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 28.1 bits (63), Expect = 0.014
Identities = 9/34 (26%), Positives = 15/34 (44%)

Query: 216 DQGELNKEQTQAVKKQVFEYCKGQLSNGNNIELR 249
+ EL K+ + A VF L+ G ++L
Sbjct: 13 EATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLI 46


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
NTHI1007  PF08280  26     0.011    M protein trans-acting positive regulator
		>PF08280#M protein trans-acting positive regulator

Length = 530

Score = 26.3 bits (58), Expect = 0.011
Identities = 8/30 (26%), Positives = 17/30 (56%)

Query: 22 VLEKHKAPVDLSLIALGNMASNLLTTSVPQ 51
+L + P+ + +A + ++LLT S P+
Sbjct: 418 ILRNIQPPLVVVFVASNFINAHLLTDSFPR 447


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
NTHI1009  ECOLNEIPORIN  29     0.027    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 29.0 bits (65), Expect = 0.027
Identities = 7/50 (14%), Positives = 18/50 (36%)

Query: 39 KIGIYHSSNLQQKWDLVDEVKRTSSVDDYYARFCQSIELMISQGVTAFGT 88
K G+ S ++ V+ + + D ++ + + G+ A
Sbjct: 28 KAGVETSRSVAHNGAQAASVETGTGIVDLGSKIGFKGQEDLGNGLKAIWQ 77


28  NTHI1058  NTHI1066  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
NTHI1058  -2      10       -0.466127  zinc-binding protein
NTHI1059  -1      11       -0.189085  ATP-dependent RNA helicase RhlB
NTHI1060  -2      10       -0.255957  transcriptional regulator
NTHI1061  -3      11       -0.323039  membrane-fusion protein
NTHI1062  -1      11       -0.017325  cation/multidrug efflux pump
NTHI1063  0       12       -0.007609  cell division protein
NTHI1065  0       14       0.455992   multidrug resistance protein
NTHI1066  0       14       0.834565   multidrug resistance protein A
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
NTHI1058  HTHTETR  58     5e-13    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 58.5 bits (141), Expect = 5e-13
Identities = 19/76 (25%), Positives = 34/76 (44%)

Query: 1 MRQAKTDLAEQIFLATDRLMAKEGLDRLSMHKIAKEANVAAGTIYLYFKNKDELLEQFAH 60
+Q + + I RL +++G+ S+ +IAK A V G IY +FK+K +L +
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 61 RVFSMFMATLEKDFDE 76
S + +
Sbjct: 65 LSESNIGELELEYQAK 80


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
NTHI1059  RTXTOXIND  53     1e-09    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 52.5 bits (126), Expect = 1e-09
Identities = 16/55 (29%), Positives = 30/55 (54%)

Query: 90 GTISQVLVQNGQNVKKGEVLVELDSSVERANLQAAQAQLSALRQTYQRYVGLLNS 144
+ +++V+ G++V+KG+VL++L + A+ Q+ L R RY L S
Sbjct: 105 SIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRS 159



Score = 41.0 bits (96), Expect = 6e-06
Identities = 43/223 (19%), Positives = 79/223 (35%), Gaps = 36/223 (16%)

Query: 113 DSSVERANLQAAQAQLSALRQTYQRYVGLLNSNAVS--RQEMDNAKAAYDAQLANIESLK 170
+ V ++ L+ ++++ + ++ YQ L + + RQ DN +LA +
Sbjct: 267 ELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNI-GLLTLELA---KNE 322

Query: 171 AVIERRKIVAPFDGKAGIVKIN-VGQYVNVGT---EIVRVEDTSSMKVDFALSQNDLDKL 226
+ I AP K +K++ G V IV +DT ++V + D+ +
Sbjct: 323 ERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDT--LEVTALVQNKDIGFI 380

Query: 227 HIGQRVTATTDARLGETFSARITAIEPAINSSTGLVDVQATFNPEDGHKLLSGMFSRLRI 286
++GQ +A F + G V ED G+ + I
Sbjct: 381 NVGQNAIIKVEA-----FPYTRY---GYL---VGKVKNINLDAIEDQR---LGLVFNVII 426

Query: 287 ALPTETNQVVVPQVAISYNMYGE----------IAYLLEPLSE 319
++ + +S M I+YLL PL E
Sbjct: 427 SIEENCLSTGNKNIPLSSGMAVTAEIKTGMRSVISYLLSPLEE 469


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
NTHI1060  ACRIFLAVINRP  902    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 902 bits (2333), Expect = 0.0
Identities = 326/1044 (31%), Positives = 537/1044 (51%), Gaps = 47/1044 (4%)

Query: 11 DIFIRRPVLAVSISLLMIILGLQAISKLAVREYPKMTTTVITVSTAYPGADANLIQAFVT 70
+ FIRRP+ A +++++++ G AI +L V +YP + ++VS YPGADA +Q VT
Sbjct: 3 NFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVT 62

Query: 71 SKLEESIAQADNIDYMSSTSAPS-SSTITIKMKLNTDPAGALADVLAKVNAVKSALPNGI 129
+E+++ DN+ YMSSTS + S TIT+ + TDP A V K+ LP +
Sbjct: 63 QVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEV 122

Query: 130 EDPSVSSS-SGGSGIMYISFRSKKLDSSQ--VTDYINRVVKPQFFTIEGVAEVQVFGAAE 186
+ +S S S +M F S ++Q ++DY+ VK + GV +VQ+FGA +
Sbjct: 123 QQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA-Q 181

Query: 187 YALRIWLDPQKMAAQNLSVPTVMSALSANNVQTAAGNDN------GYYVSYRNKVETTTK 240
YA+RIWLD + L+ V++ L N Q AAG G ++ +T K
Sbjct: 182 YAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 241 SVEQLSNLIISSNGD-DLVRLRDIATVELNKENDNSRATANGAESVVLAINPTSTANPLT 299
+ E+ + + N D +VRL+D+A VEL EN N A NG + L I + AN L
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 300 VAEKIRPLYESIKTQLPDSMESDILYDRTIAINSSIHEVIKTIGEATLIVLVVILMFIGS 359
A+ I+ ++ P M+ YD T + SIHEV+KT+ EA ++V +V+ +F+ +
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 360 FRAILIPILAIPISLIGVLMLLQSFNFSINLMTLLALILAIGLVVDDAIVVLENIDRHIK 419
RA LIP +A+P+ L+G +L +F +SIN +T+ ++LAIGL+VDDAIVV+EN++R +
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 420 AGETPFRAAII-GTREIAVPVISMTIALIAVYSPMALMGGITGTLFKEFALTLAGAVFIS 478
+ P + A +I ++ + + L AV+ PMA GG TG ++++F++T+ A+ +S
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 479 GVVALTLSPMMSSKLLKSNAKP---------TWMEERVEHTLGKVNRVYEYMLDLVMLNR 529
+VAL L+P + + LLK + W +H+ VN + ++
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHS---VNHYTNSVGKILGSTG 538

Query: 530 KSMLAFAVVIFSTLPFLFNSLSSELTPNEDKGAFIAIGNAPSSVNVDYIQNAMQP----Y 585
+ +L +A+++ + LF L S P ED+G F+ + P+ + Q + Y
Sbjct: 539 RYLLIYALIVAGMV-VLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYY 597

Query: 586 MKNVMETPEVSF---GMSIAGAPTSNSSLNIITLKDWKERSRK---QSAIMNEINEKAKS 639
+KN E F G S +G N+ + ++LK W+ER+ A+++ +
Sbjct: 598 LKNEKANVESVFTVNGFSFSG-QAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGK 656

Query: 640 IPEVSVSAFNIPEIDTG--EQGAPVSIVLKTAQDYKSLANTAEKFLS-AMKASGKFIYTN 696
I + V FN+P I G ++ + + +L + L A + +
Sbjct: 657 IRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVR 716

Query: 697 LDLTYDTAQMTISVDKEKAGTYGITMQQISNTLGSFLSGATVTRVDVDGRAYKVISQVKR 756
+ DTAQ + VD+EKA G+++ I+ T+ + L G V GR K+ Q
Sbjct: 717 PNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADA 776

Query: 757 DDRLSPESFQNYYLTASNGQSVPLSSVISMKLETQPTSLPRFSQLNSAEISAVPMPGISS 816
R+ PE Y+ ++NG+ VP S+ + L R++ L S EI PG SS
Sbjct: 777 KFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSS 836

Query: 817 GDAIAWLQQQANDNLPQGYTYDFKSEARQLVQEGNALTTTFILAVVIIFLVLAIQFESIR 876
GDA+A ++ A+ LP G YD+ + Q GN ++ V++FL LA +ES
Sbjct: 837 GDAMALMENLAS-KLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWS 895

Query: 877 DPMVIMISVPLAVSGALVSLNILSFFSIAGTTLNIYSQVGLITLVGLITKHGILMCEVAK 936
P+ +M+ VPL + G L++ + ++Y VGL+T +GL K+ IL+ E AK
Sbjct: 896 IPVSVMLVVPLGIVGVLLAAT------LFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAK 949

Query: 937 EEQLNYGKTRIEAITHAAKVRLRPILMTTAAMVAGLIPLLYATGAGAVSRFSIGIVIVAG 996
+ GK +EA A ++RLRPILMT+ A + G++PL + GAG+ ++ ++GI ++ G
Sbjct: 950 DLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGG 1009

Query: 997 LSIGTIFTLFVLPVVYSYIATEHK 1020
+ T+ +F +PV + I K
Sbjct: 1010 MVSATLLAIFFVPVFFVVIRRCFK 1033


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
NTHI1062  TCRTETB  143    1e-39    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 143 bits (361), Expect = 1e-39
Identities = 91/397 (22%), Positives = 171/397 (43%), Gaps = 17/397 (4%)

Query: 23 LSLATFMQVLDSTIANVAIPTIAGDLGASFSQGTWVITSFGVANAISIPITGWLAKRFGE 82
L + +F VL+ + NV++P IA D + WV T+F + +I + G L+ + G
Sbjct: 19 LCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGI 78

Query: 83 VRLFLVSTFLFVVSSWLCGIADSLEALIIF-RVIQGAVAGPVIPLSQSLLLNNYPPEKRG 141
RL L + S + + S +L+I R IQGA A L ++ P E RG
Sbjct: 79 KRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRG 138

Query: 142 MALAFWSMTIVVAPIFGPILGGWISDNIHWGWIFFINVPIGLLVVLISWKILGSRESEIV 201
A + + GP +GG I+ IHW + + +P+ + ++ + + + ++ +
Sbjct: 139 KAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPM-ITIITVPFLMKLLKKEVRI 195

Query: 202 HQPIDKVGLVLLVLGVGCLQLMLDQGREQDWFNSNEIIILAVVAVVCLIALVIWELTDDN 261
D G++L+ +G+ L F ++ I +V+V+ + V +
Sbjct: 196 KGHFDIKGIILMSVGIVFFML----------FTTSYSISFLIVSVLSFLIFVKHIRKVTD 245

Query: 262 PVVDISLFHSRNFSVGCLCTSLAFLIYLGSVVLIPLLLQQVFHY-TATWAGLAASPVGLF 320
P VD L + F +G LC + F G V ++P +++ V TA + P +
Sbjct: 246 PFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMS 305

Query: 321 PILLSPIIGRFGYKIDMRILVTISFIVYAITFYWRAVTFEPSMTFVDVALPQLVQGLAVS 380
I+ I G + ++ I +++F + E + F+ + + ++ GL+ +
Sbjct: 306 VIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFT 365

Query: 381 CFFMPLTTITLSGLPAHKMASASSLFNFLRTLAGSVG 417
++TI S L + + SL NF L+ G
Sbjct: 366 K--TVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTG 400


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
NTHI1063  RTXTOXIND  80     1e-18    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 80.3 bits (198), Expect = 1e-18
Identities = 43/287 (14%), Positives = 87/287 (30%), Gaps = 35/287 (12%)

Query: 91 ELDDTNAKLSFEQAKSNLANAVRQVE----QLGFTVQQLQSAVHANEISLAQAQGNLARR 146
E + ++ S N Q E + + + ++ E + L
Sbjct: 181 EEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDF 240

Query: 147 VQLEKMGAIDKESFQHAKEAVELAKANLNASKNQLAANQALLRNVPLR------------ 194
L AI K + + A L K+QL ++ + +
Sbjct: 241 SSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEI 300

Query: 195 ------EQPQIQNAINSLKQAWLNLQRTKIRSPIDGYVARRNVQ-VGQAVSVGGALMAVV 247
I L + Q + IR+P+ V + V G V+ LM +V
Sbjct: 301 LDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIV 360

Query: 248 -SNEQMWLEANFKETQLTNMRIGQPVKIHFDLYGKNK--EFDGVINGIEMGTGNAFSLLP 304
++ + + A + + + +GQ I + + + G + I +
Sbjct: 361 PEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDA-------I 413

Query: 305 SQNATGNWIKVVQRVPVRIKLDPQQFTETPLRIGLSATAKVRISDSS 351
G V+ + + PL G++ TA+++ S
Sbjct: 414 EDQRLGLVFNVIISIEENCLSTGNK--NIPLSSGMAVTAEIKTGMRS 458


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
NTHI1066  CARBMTKINASE  36     1e-04    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 36.3 bits (84), Expect = 1e-04
Identities = 33/133 (24%), Positives = 48/133 (36%), Gaps = 22/133 (16%)

Query: 129 IPVINENDAVATAEIKVGDNDNLSALVAILVQAEQLYLLTDQQGLFDSDPRKNPEAKLIP 188
+PVI E+ + E V D D +A V A+ +LTD G E L
Sbjct: 197 VPVILEDGEIKGVE-AVIDKDLAGEKLAEEVNADIFMILTDVNGAA-LYYGTEKEQWLRE 254

Query: 189 V-VEQITDHIRSIAGGSGTNLGTGGMMTKIIAADVATRSGIETIIAPGNRPNVIADL--- 244
V VE++ + + G M K++AA G +IA L
Sbjct: 255 VKVEELRKYYEE------GHFKAGSMGPKVLAAIRFIEW--------GGERAIIAHLEKA 300

Query: 245 --AYEQNIGTKFI 255
A E GT+ +
Sbjct: 301 VEALEGKTGTQVL 313



Score = 33.6 bits (77), Expect = 9e-04
Identities = 16/49 (32%), Positives = 25/49 (51%), Gaps = 4/49 (8%)

Query: 3 KKTIVVKFGTSTLTQGSPKLNSPHMMEIVR----QIAQLHNDGFRIVIV 47
K +V+ G + L Q K + MM+ VR QIA++ G+ +VI
Sbjct: 2 GKRVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVIT 50


29  NTHI1074  NTHI1079  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
NTHI1074  0       10       -0.240345  thymidylate synthase
NTHI1075  0       8        0.375568   hypothetical protein
NTHI1076  0       12       0.505620   hypothetical protein
NTHI1077  -2      14       -0.148486  hypothetical protein
NTHI1078  -1      16       -0.487354  preprotein translocase subunit SecA
NTHI1079  0       20       -0.725078  mutator protein MutT
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
NTHI1074  FbpA_PF05833  26     0.021    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 26.4 bits (58), Expect = 0.021
Identities = 12/61 (19%), Positives = 27/61 (44%), Gaps = 8/61 (13%)

Query: 19 EQSPFAKIIKK---GLAINELNQ-KFNRIFPQEFHGKFRIGNITDHLIFIEV----SNAI 70
+ F +++K I +++Q +RI +F +G + + + IE+ SN
Sbjct: 71 KAPMFCMVLRKYISNAKIVDIHQINQDRIVVIDFESTDELGFNSIYSLIIEIMGRHSNMT 130

Query: 71 V 71
+
Sbjct: 131 L 131


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
NTHI1075  PF06580  26     0.032    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 26.0 bits (57), Expect = 0.032
Identities = 11/49 (22%), Positives = 24/49 (48%), Gaps = 4/49 (8%)

Query: 56 KVAREVQRQAIPQPSISRQTEK---QPKIQPHFLADVLN-LSAPIRAGP 100
K ++ + S++++ + + +I PHF+ + LN + A I P
Sbjct: 142 KNYKQAEIDQWKMASMAQEAQLMALKAQINPHFMFNALNNIRALILEDP 190


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NTHI1076  SECA  1332   0.0      SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 1332 bits (3449), Expect = 0.0
Identities = 611/897 (68%), Positives = 731/897 (81%), Gaps = 1/897 (0%)

Query: 1 MSILTRIFGSRNERVLRKLKKQVVKINKMEPAFEALSDDELKAKTQEFRDRLSGGETLQQ 60
+ +LT++FGSRN+R LR+++K V IN MEP E LSD+ELK KT EFR RL GE L+
Sbjct: 3 IKLLTKVFGSRNDRTLRRMRKVVNIINAMEPEMEKLSDEELKGKTAEFRARLEKGEVLEN 62

Query: 61 ILPEAFATVREASKRVLGMRHFDVQLIGGMVLTNRCIAEMRTGEGKTLTATLPCYLIALE 120
++PEAFA VREASKRV GMRHFDVQL+GGMVL RCIAEMRTGEGKTLTATLP YL AL
Sbjct: 63 LIPEAFAVVREASKRVFGMRHFDVQLLGGMVLNERCIAEMRTGEGKTLTATLPAYLNALT 122

Query: 121 GKGVHVVTVNDYLARRDAETNRPLFEFLGMSVGVNIPGLSPEEKRAAYAADITYATNSEL 180
GKGVHVVTVNDYLA+RDAE NRPLFEFLG++VG+N+PG+ KR AYAADITY TN+E
Sbjct: 123 GKGVHVVTVNDYLAQRDAENNRPLFEFLGLTVGINLPGMPAPAKREAYAADITYGTNNEY 182

Query: 181 GFDYLRDNLAHSKEERFQRTLGYALVDEVDSILIDEARTPLIISGQAENSSELYIAVNKL 240
GFDYLRDN+A S EER QR L YALVDEVDSILIDEARTPLIISG AE+SSE+Y VNK+
Sbjct: 183 GFDYLRDNMAFSPEERVQRKLHYALVDEVDSILIDEARTPLIISGPAEDSSEMYKRVNKI 242

Query: 241 IPSLIKQEKEDTEEYQGEGDFTLDLKSKQAHLTERGQEKVEDWLITQGLMPEGDSLYSPS 300
IP LI+QEKED+E +QGEG F++D KS+Q +LTERG +E+ L+ +G+M EG+SLYSP+
Sbjct: 243 IPHLIRQEKEDSETFQGEGHFSVDEKSRQVNLTERGLVLIEELLVKEGIMDEGESLYSPA 302

Query: 301 RIVLLHHVMAALRAHTLFEKDVDYIVKDGEIVIVDEHTGRTMAGRRWSDGLHQAIEAKEG 360
I+L+HHV AALRAH LF +DVDYIVKDGE++IVDEHTGRTM GRRWSDGLHQA+EAKEG
Sbjct: 303 NIMLMHHVTAALRAHALFTRDVDYIVKDGEVIIVDEHTGRTMQGRRWSDGLHQAVEAKEG 362

Query: 361 VDVKSENQTVASISYQNYFRLYERLAGMTGTADTEAFEFQQIYGLETVVIPTNRPMIRDD 420
V +++ENQT+ASI++QNYFRLYE+LAGMTGTADTEAFEF IY L+TVV+PTNRPMIR D
Sbjct: 363 VQIQNENQTLASITFQNYFRLYEKLAGMTGTADTEAFEFSSIYKLDTVVVPTNRPMIRKD 422

Query: 421 RTDVMFENEQYKFNAIIEDIKDCVERQQPVLVGTISVEKSEELSKALDKAGIKHNVLNAK 480
D+++ E K AIIEDIK+ + QPVLVGTIS+EKSE +S L KAGIKHNVLNAK
Sbjct: 423 LPDLVYMTEAEKIQAIIEDIKERTAKGQPVLVGTISIEKSELVSNELTKAGIKHNVLNAK 482

Query: 481 FHQQEAEIVAEAGFPSAVTIATNMAGRGTDIILGGNWKAQAAKLENPTQEQIEALKAEWE 540
FH EA IVA+AG+P+AVTIATNMAGRGTDI+LGG+W+A+ A LENPT EQIE +KA+W+
Sbjct: 483 FHANEAAIVAQAGYPAAVTIATNMAGRGTDIVLGGSWQAEVAALENPTAEQIEKIKADWQ 542

Query: 541 KNHEIVMKAGGLHIIGTERHESRRIDNQLRGRSGRQGDPGSSRFYLSLEDGLMRIYLNEG 600
H+ V++AGGLHIIGTERHESRRIDNQLRGRSGRQGD GSSRFYLS+ED LMRI+ ++
Sbjct: 543 VRHDAVLEAGGLHIIGTERHESRRIDNQLRGRSGRQGDAGSSRFYLSMEDALMRIFASDR 602

Query: 601 KLNLMRKAFTVAGEAMESKMLAKVIASAQAKVEAFHFDGRKNLLEYDDVANDQRHAIYEQ 660
+MRK GEA+E + K IA+AQ KVE+ +FD RK LLEYDDVANDQR AIY Q
Sbjct: 603 VSGMMRKLGMKPGEAIEHPWVTKAIANAQRKVESRNFDIRKQLLEYDDVANDQRRAIYSQ 662

Query: 661 RNHLLDNDDISETINAIRHDVFNGVIDQYIPPQSLEEQWDIKGLEERLSQEFGMELPISN 720
RN LLD D+SETIN+IR DVF ID YIPPQSLEE WDI GL+ERL +F ++LPI+
Sbjct: 663 RNELLDVSDVSETINSIREDVFKATIDAYIPPQSLEEMWDIPGLQERLKNDFDLDLPIAE 722

Query: 721 WLEEDNNLHEESLCERIVEIAEKEYKEKEALVGEDAMRHFEKGVMLQTLDELWKEHLASM 780
WL+++ LHEE+L ERI+ + + Y+ KE +VG + MRHFEKGVMLQTLD LWKEHLA+M
Sbjct: 723 WLDKEPELHEETLRERILAQSIEVYQRKEEVVGAEMMRHFEKGVMLQTLDSLWKEHLAAM 782

Query: 781 DYLRQGIHLRGYAQKDPKQEYKKESFRMFTEMLDSLKHQVITTLTRVRVRTQEEMEEAER 840
DYLRQGIHLRGYAQKDPKQEYK+ESF MF ML+SLK++VI+TL++V+VR EE+EE E+
Sbjct: 783 DYLRQGIHLRGYAQKDPKQEYKRESFSMFAAMLESLKYEVISTLSKVQVRMPEEVEELEQ 842

Query: 841 ARQEMAARINQ-NNLPVDENSQTTQNSETEDYSDRRIGRNEPCPCGSGKKYKHCHGS 896
R+ A R+ Q L ++ + +R++GRN+PCPCGSGKKYK CHG
Sbjct: 843 QRRMEAERLAQMQQLSHQDDDSAAAAALAAQTGERKVGRNDPCPCGSGKKYKQCHGR 899


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
NTHI1079  ADHESNFAMILY  29     0.022    Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 28.7 bits (64), Expect = 0.022
Identities = 20/75 (26%), Positives = 30/75 (40%), Gaps = 3/75 (4%)

Query: 58 HLQLYLERG--AAKVIGTDLSEKMLEQAEKDLQKCGQFSGRF-SLYHLPIEKLAELPESH 114
H L LE G AK I LS K E + +++ + L +K ++P
Sbjct: 139 HAWLNLENGIIFAKNIAKQLSAKDPNNKEFYEKNLKEYTDKLDKLDKESKDKFNKIPAEK 198

Query: 115 FDVITSSFAFHYIEN 129
++TS AF Y
Sbjct: 199 KLIVTSEGAFKYFSK 213


30  NTHI1278  NTHI1286  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
NTHI1278  0       14       1.269596   xylose isomerase
NTHI1279  -1      14       1.503311   xylulose kinase
NTHI1280  -1      14       2.216407   ADP-L-glycero-D-manno-heptose-6-epimerase
NTHI1282  0       16       2.903061   thioredoxin-like protein
NTHI1283  2       18       2.394215   deoxyribose-phosphate aldolase
NTHI1285  -1      19       2.250118   competence protein ComM
NTHI1286  -3      19       2.472040   ribosome biogenesis GTP-binding protein YsxC
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
NTHI1278  NUCEPIMERASE  98     1e-25    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 97.9 bits (244), Expect = 1e-25
Identities = 78/348 (22%), Positives = 130/348 (37%), Gaps = 68/348 (19%)

Query: 2   IIVTGGAGFIGSNIVKALNDLGRKDILVVDNLKD--------------GTKFANLVDLDI 47
            +VTG AGFIG ++ K L + G   ++ +DNL D                       +D+
Sbjct: 3   YLVTGAAGFIGFHVSKRLLEAG-HQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 48  ADYCDKEDFIASIIAGDEFGDIDAVFHEGACSATTEWDGKYIMHNNYEYSK-------EL 100
           AD     + +  + A   F   + VF      A      +Y + N + Y+         +
Sbjct: 62  ADR----EGMTDLFASGHF---ERVFISPHRLAV-----RYSLENPHAYADSNLTGFLNI 109

Query: 101 LHYCLDREIP-FFYASSAATYGDTKV--FREEREFEGPLNVYGYSKFLFDQYVRNILPEA 157
           L  C   +I    YASS++ YG  +   F  +   + P+++Y  +K   +
Sbjct: 110 LEGCRHNKIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLY 169

Query: 158 KSPVCGFRYFNVYGPRENHKGSMASVAFHLNNQILKGENPKLFAGSEGFRRDFVYVGDVA 217
             P  G R+F VYGP    +  MA   F     +L+G++  ++      +RDF Y+ D+A
Sbjct: 170 GLPATGLRFFTVYGPWG--RPDMA--LFKFTKAMLEGKSIDVY-NYGKMKRDFTYIDDIA 224

Query: 218 AVNI------------WCWQNGISG-------IYNLGTGNAESFRAVADAVVKFHG-KGE 257
              I            W  + G          +YN+G  +         A+    G + +
Sbjct: 225 EAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAK 284

Query: 258 IETIPFPEHLKSRYQEYTQADLTKLRS-TGYDKPFKTVAEGVTEYMAW 304
              +P            T AD   L    G+  P  TV +GV  ++ W
Sbjct: 285 KNMLPLQPG----DVLETSADTKALYEVIGF-TPETTVKDGVKNFVNW 327


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
NTHI1279  SECA  28     0.025    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 27.9 bits (62), Expect = 0.025
Identities = 14/37 (37%), Positives = 20/37 (54%), Gaps = 11/37 (29%)

Query: 75  TSPA-INSLANEGYQVVSVALRSGNEADVNDYLSKHD 110
           T PA +N+L  +G  VV+V          NDYL++ D
Sbjct: 113 TLPAYLNALTGKGVHVVTV----------NDYLAQRD 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
NTHI1282  HTHFIS  37     2e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 37.1 bits (86), Expect = 2e-04
Identities = 39/178 (21%), Positives = 58/178 (32%), Gaps = 53/178 (29%)

Query: 194 DLTDIIGQ----QHAKRALTIAAAGQHNLLFLGPPGTGKTMLASRLTGILPEMTDLEAIE 249
           D   ++G+    Q   R L         L+  G  GTGK ++A              A+
Sbjct: 135 DGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVA-------------RALH 181

Query: 250 TASVTSLVQNELNFHNWKQRPFRAPHHSASMP------ALVG-------GGTIPKPGEIS 296
                            +  PF A  + A++P       L G       G      G
Sbjct: 182 DYG------------KRRNGPFVAI-NMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFE 228

Query: 297 LATNGALFLDEL----PEFERKVLDALRQPLESGEIIISRANAKIQFPARFQLVAAMN 350
            A  G LFLDE+     + + ++L      L+ GE         I+   R  +VAA N
Sbjct: 229 QAEGGTLFLDEIGDMPMDAQTRLLRV----LQQGEYTTVGGRTPIRSDVR--IVAATN 280


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
NTHI1286  HTHFIS  30     0.011    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 30.2 bits (68), Expect = 0.011
Identities = 9/16 (56%), Positives = 12/16 (75%)

Query: 54  VVGESGCGKSTLARAI 69
           + GESG GK  +ARA+
Sbjct: 165 ITGESGTGKELVARAL 180



 
Contact Sachin Pundhir for Bugs/Comments.