PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: sequence.gb
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic): (none given)
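
As a rough illustration of how the E-value (RPSBlast) setting of 0.05 might act as a cut-off on virulence-factor hits, the sketch below keeps only hits whose E-value is at or below the threshold. The helper name `passes_cutoff` and the inclusive comparison are assumptions for illustration, not documented PredictBias behaviour; the three example hits are copied from the hit tables further down this page.

```python
# Assumed cut-off, taken from the "E-value (RPSBlast)" input parameter.
E_VALUE_CUTOFF = 0.05

# (locus tag, profile name, E-value) triples copied from the hit tables on this page.
hits = [
    ("MARTH_orf057", "CHANLCOLICIN", 1e-04),
    ("MARTH_orf135", "CHANLCOLICIN", 0.033),
    ("MARTH_orf545", "SECA", 0.047),
]


def passes_cutoff(e_value, cutoff=E_VALUE_CUTOFF):
    """Hypothetical filter: a hit is kept when its E-value does not exceed the cut-off."""
    return e_value <= cutoff


# Every hit listed above stays, because each E-value is below 0.05.
kept = [hit for hit in hits if passes_cutoff(hit[2])]

for locus_tag, profile, e_value in kept:
    print(f"{locus_tag}\t{profile}\t{e_value:g}")
```

Under this assumption, a hit with an E-value of 0.06 would be dropped, which is consistent with the largest E-values reported on this page (0.047 and 0.044) sitting just under the cut-off.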
 
 
C) Potential GIs and PAIs in CP001047
S.No  Start         End           Bias  Virulence  Insertion elements  Prediction
1     MARTH_orf040  MARTH_orf066  Y     Y          N                   Pathogenicity Island (biased-composition)

LocusTag      DNBias  CDNBias  %GCBias    Product
MARTH_orf040  2       14       1.658060   hypothetical membrane protein
MARTH_orf041  2       12       1.558561   ATP synthase A chain
MARTH_orf042  0       10       2.933145   ATP synthase C chain
MARTH_orf043  0       10       2.140654   ATP synthase B chain
MARTH_orf044  1       10       3.125886   ATP synthase delta subunit
MARTH_orf046  1       10       3.392696   ATP synthase alpha subunit
MARTH_orf047  0       12       2.778151   ATP synthase gamma subunit
MARTH_orf048  3       9        0.572579   ATP synthase beta subunit
MARTH_orf049  6       9        -0.318885  ATP synthase epsilon subunit
MARTH_orf051  5       9        -0.054794  potassium uptake protein
MARTH_orf052  5       8        -0.046672  ribosome recycling factor
MARTH_orf053  5       7        0.338397   uridylate kinase
MARTH_orf057  6       6        0.110723   massive surface protein MspA
MARTH_orf058  -1      10       0.496539   conserved hypothetical protein
MARTH_orf059  -3      9        0.153313   cell division protein
MARTH_orf061  -2      10       -0.653858  ribosomal protein S2
MARTH_orf063  -2      11       -1.992086  elongation factor Ts
MARTH_orf064  -1      13       -3.041233  hypothetical lipoprotein
MARTH_orf065  0       15       -3.675561  hypothetical lipoprotein
MARTH_orf066  1       15       -3.006447  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
MARTH_orf057  CHANLCOLICIN  40     1e-04    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 39.7 bits (92), Expect = 1e-04
Identities = 67/387 (17%), Positives = 142/387 (36%), Gaps = 47/387 (12%)

Query: 1456 LRKEQEKNTQSISAEKTKNEKLIKKPLEESQKALDKANDAIQKSNDDSQKEKALTDAESE 1515
L+K Q + A K +Q+ D N+A++ + + L A +
Sbjct: 62 LKKTQAEQAARAKAAAEAQAKAKANRDALTQRLKDIVNEALRHNASRTPSATELAHANNA 121

Query: 1516 LKKQKKSLDDLIKGELKDDSENKTKLVDEINDINKKLQEIDKAKKELKQSQDQKAESLAE 1575
+ + L K E K E K QE ++ +KE+++ + + L
Sbjct: 122 AMQAEDERLRLAKAE--------EKARKEAEAAEKAFQEAEQRRKEIEREKAETERQLKL 173

Query: 1576 QAKELKKE--LDDLVALLDVANNKWSQSQTAIANIRAKITNIDEFLTSNNNLKNNTKLSQ 1633
E K+ L + +++A K S +Q+ + + +I ++ L+S+ + ++ +
Sbjct: 174 AEAEEKRLAALSEEAKAVEIAQKKLSAAQSEVVKMDGEIKTLNSRLSSSIHARDAEMKTL 233

Query: 1634 AYNNLEDAKKSANGAIASAEKTINDSREEFLRQIDTLKSKESQAKSELSSTNTIEKLDSL 1693
A E A+ SA ++ + +K +A L + E
Sbjct: 234 AGKRNELAQASA----------------KYKELDELVKKLSPRANDPLQNRPFFEATRRR 277

Query: 1694 IQKIGDNSNGLLKEVNDLLNQISKFEESSDELSASLSELRSNLENIIKEAREKKKTKEQR 1753
+ G K + Q++ E + ++A + + ++ I + + R
Sbjct: 278 V--------GAGKIREEKQKQVTASETRINRINADI----TQIQKAISQVSNNRNAGIAR 325

Query: 1754 ISNLNKEIIEAKTSLNNQNAKTSAIIASKVDELQTELDKLQEKITSANQEKQKIDDEIDD 1813
+ + + +A+ +L N K + + + V QT +K EK K+ E+ D
Sbjct: 326 VHEAEENLKKAQNNLLNSQIKDA--VDATVSFYQTLTEKYGEKY-------SKMAQELAD 376

Query: 1814 EIKAKTKQKYDELVQSIESAKTTIQNK 1840
+ K K +E + + E K + K
Sbjct: 377 KSKGKKIGNVNEALAAFEKYKDVLNKK 403



Score = 35.1 bits (80), Expect = 0.003
Identities = 57/301 (18%), Positives = 125/301 (41%), Gaps = 17/301 (5%)

Query: 488 KAYKEKEAVEALANAKKRFEAKKQAKVKIDVDLDAKQNEFNKIKNTIESLNSINDIPALQ 547
+A + K A EA A AK +A Q D+ +A ++ ++ + E ++ N A+Q
Sbjct: 69 QAARAKAAAEAQAKAKANRDALTQRLK--DIVNEALRHNASRTPSATELAHANN--AAMQ 124

Query: 548 SAIKQLEDLKPQVKAIEESAKNISYPEGEQKAKKLLEEISKLDLEAKAKLEKLQEEKETQ 607
+ ++L K + KA +E+ + + Q+A++ +EI + E + +L+ + E++
Sbjct: 125 AEDERLRLAKAEEKARKEAE---AAEKAFQEAEQRRKEIEREKAETERQLKLAEAEEKRL 181

Query: 608 ---EQEKKLIESLREKITKITNNLETLDSRGKSQLSEAT---PKQSQIAKTLKELEEQLK 661
+E K +E ++K++ + + +D K+ S + + KTL +L
Sbjct: 182 AALSEEAKAVEIAQKKLSAAQSEVVKMDGEIKTLNSRLSSSIHARDAEMKTLAGKRNELA 241

Query: 662 NAQEAQKEVQDAHKENALKTELEKLEEAKNKATTTKQKLETKISDSNKNIK---EQLDNA 718
A KE+ + K+ + + +AT + + K + +++
Sbjct: 242 QASAKYKELDELVKKLSPRANDPLQNRPFFEATRRRVGAGKIREEKQKQVTASETRINRI 301

Query: 719 KSELEDLKKKISDAYNAKNLEELTKELNKLEKLKEKIEDLETLATTIESKRKQEVTNLLS 778
+++ ++K IS N +N + + E LK+ +L L+
Sbjct: 302 NADITQIQKAISQVSNNRN-AGIARVHEAEENLKKAQNNLLNSQIKDAVDATVSFYQTLT 360

Query: 779 E 779
E
Sbjct: 361 E 361



Score = 32.7 bits (74), Expect = 0.019
Identities = 45/283 (15%), Positives = 99/283 (34%), Gaps = 20/283 (7%)

Query: 373 DIQEAEKLAKQAADATLLVDEQKALDAKKQAEAALDKLNDLFSQKKAQELAKTELEKAYS 432
++ A A QA D L + + + A+K+AEAA + ++K E K E E+
Sbjct: 114 ELAHANNAAMQAEDERLRLAKAEE-KARKEAEAAEKAFQEAEQRRKEIEREKAETERQLK 172

Query: 433 ALVKATEEANKADDENTLPLAISQLEDAIAKAELALGKHENLKEKALIKPIYDVLKAYKE 492
+ +E A+ + ++ A+ + K + + + + E
Sbjct: 173 LAEAEEKRLAALSEEAK---AVEIAQKKLSAAQSEVVKMDGEIKTLNSRLSSSIHARDAE 229

Query: 493 KEAVEALANAKKRFEAKKQAKVKIDVDLDAKQNEFNKIKNTIESLNSINDIPALQSAIKQ 552
+ + N QA K + + + + +++ A K
Sbjct: 230 MKTLAGKRN------ELAQASAKYKELDELVKKLSPRANDPLQNRPFFEATRRRVGAGKI 283

Query: 553 LEDLKPQVKAIEESAKNISYPEGEQKAKKLLEEISKLDLEAKAKLEKLQEEKETQEQEKK 612
E+ + QV A E I+ + Q K + + + + E + K+ Q
Sbjct: 284 REEKQKQVTASETRINRIN-ADITQIQKAISQVSNNRNAGIARVHEAEENLKKAQNNLLN 342

Query: 613 LIESLREKITKITNNLETLDSRGKSQLSEATPKQSQIAKTLKE 655
++I + ++ S ++ + K S++A+ L +
Sbjct: 343 ---------SQIKDAVDATVSFYQTLTEKYGEKYSKMAQELAD 376
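
The summary lines that precede each alignment (`Score = ...` and `Identities = ...`) can be read programmatically. The following is a minimal sketch, assuming the line layout shown in this output; the function names `parse_expect`, `parse_hsp`, and `percent` are hypothetical helpers, not part of BLAST or PredictBias. It also handles the mantissa-free form such as `Expect = e-129` that appears elsewhere on this page.

```python
import re

# Patterns for the two summary lines printed above each alignment block.
SCORE_RE = re.compile(r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)")
FRACTION_RE = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)")


def parse_expect(text):
    """Convert an Expect string to float; 'e-129' is read as '1e-129'."""
    return float("1" + text if text.startswith("e") else text)


def parse_hsp(score_line, identity_line):
    """Return bit score, raw score, E-value and the count fractions of one alignment."""
    match = SCORE_RE.search(score_line)
    if match is None:
        raise ValueError(f"unrecognised score line: {score_line!r}")
    stats = {
        "bits": float(match.group(1)),
        "raw_score": int(match.group(2)),
        "expect": parse_expect(match.group(3)),
    }
    # Each fraction is stored as (matching positions, alignment length).
    for name, numerator, denominator in FRACTION_RE.findall(identity_line):
        stats[name.lower()] = (int(numerator), int(denominator))
    return stats


def percent(fraction):
    """Percentage corresponding to a (numerator, denominator) pair."""
    numerator, denominator = fraction
    return 100.0 * numerator / denominator


# First alignment of MARTH_orf057 against CHANLCOLICIN, copied from this page.
hsp = parse_hsp(
    "Score = 39.7 bits (92), Expect = 1e-04",
    "Identities = 67/387 (17%), Positives = 142/387 (36%), Gaps = 47/387 (12%)",
)
print(hsp["bits"], hsp["expect"], round(percent(hsp["identities"]), 1))
```

Recomputing the percentages from the raw fractions is a simple consistency check: the values printed in parentheses agree with the fractions once the decimals are dropped.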


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
MARTH_orf059  TYPE4SSCAGX   31     0.009    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein

signature.
Length = 522

Score = 30.9 bits (69), Expect = 0.009
Identities = 27/84 (32%), Positives = 50/84 (59%), Gaps = 10/84 (11%)

Query: 7 LKEKIFGTKEERIAKKQAKLEEKEKRKLEKELDKKNK--LDNYIAGLSK-----SNNSFT 59
L+E+ ++E+ AK+QA+ +K+KR+ KE KN+ L+N +S +N + +
Sbjct: 141 LEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANLENLTNAMSNPQNLSNNKNLS 200

Query: 60 ESIKALQNKYNKINK-EFFEDLEE 82
E IK Q + N++++ E ED++E
Sbjct: 201 ELIK--QQRENELDQMERLEDMQE 222


2     MARTH_orf126  MARTH_orf158  Y     Y          N                   Pathogenicity Island (biased-composition)

LocusTag      DNBias  CDNBias  %GCBias    Product
MARTH_orf126  2       16       3.259806   HINT (histidine triad nucleotide-binding
MARTH_orf127  2       15       2.419331   recombinational DNA repair protein O
MARTH_orf128  0       11       0.961780   metallo-beta-lactamase superfamily protein
MARTH_orf129  0       11       0.635656   ornithine carbamoyltransferase
MARTH_orf130  -1      8        -0.114118  carbamate kinase
MARTH_orf131  -1      8        -1.323838  SpoU class tRNA/rRNA methyltransferase
MARTH_orf132  -2      9        -1.639252  cysteinyl-tRNA synthetase
MARTH_orf133  -2      12       -1.217951  hypothetical protein
MARTH_orf134  1       14       -1.340745  arginyl-tRNA synthetase
MARTH_orf135  2       15       -2.229905  acyl carrier protein phosphodiesterase
MARTH_orf136  0       13       -1.738755  conserved hypothetical protein
MARTH_orf137  -1      11       -0.770204  conserved hypothetical protein
MARTH_orf138  -1      9        0.036067   cytosine-specific methyltransferase, related to
MARTH_orf139  3       7        0.151577   conserved hypothetical protein
MARTH_orf140  4       7        0.486538   ABC transporter ATP-binding protein
MARTH_orf142  5       7        0.690725   ABC transporter permease protein
MARTH_orf143  3       8        1.560921   ribosomal protein L1
MARTH_orf144  3       7        1.001099   ribosomal protein L11
MARTH_orf150  3       6        0.814020   massive surface protein MspI
MARTH_orf151  0       10       1.251099   SsrA-binding protein
MARTH_orf153  -1      11       1.122738   hypothetical lipoprotein
MARTH_orf154  -1      10       1.018908   preprotein translocase subunit SecA
MARTH_orf156  1       14       -1.562693  hypothetical protein
MARTH_orf158  2       14       -0.772025  DNA polymerase III subunits gamma and tau
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
MARTH_orf130  CARBMTKINASE  366    e-129    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 366 bits (940), Expect = e-129
Identities = 142/315 (45%), Positives = 193/315 (61%), Gaps = 12/315 (3%)

Query: 3 RIVIALGGNALGN-----NPEEQKELVKVPAKKIAELVKAGHQVVVGHGNGPQVGMIFNG 57
R+VIALGGNAL + EE + V+ A++IAE++ G++VV+ HGNGPQVG +
Sbjct: 4 RVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSLLLH 63

Query: 58 FASAHEVNEKSPLVPLPEAGAMSQGYIGYHMVNAITNAFIDEGLAEQEVMYILTQTLVDG 117
A + P P+ AGAMSQG+IGY + A+ N G+ E++V+ I+TQT+VD
Sbjct: 64 M-DAGQATYGIPAQPMDVAGAMSQGWIGYMIQQALKNELRKRGM-EKKVVTIITQTIVDK 121

Query: 118 ADPAFQNPTKPIGPFYTTKEEAE--AMNPNSVIKEDAGRGYRKVVASPKPINFVGINQIK 175
DPAFQNPTKP+GPFY +E A+ A ++KED+GRG+R+VV SP P V IK
Sbjct: 122 NDPAFQNPTKPVGPFYD-EETAKRLAREKGWIVKEDSGRGWRRVVPSPDPKGHVEAETIK 180

Query: 176 KSIEAGATVIVGGGGGIPTVTCPKKGHISGVDGVIDKDFALAKLASLTNADYFVVLTAVD 235
K +E G VI GGGG+P + + G I GV+ VIDKD A KLA NAD F++LT V+
Sbjct: 181 KLVERGVIVIASGGGGVPVIL--EDGEIKGVEAVIDKDLAGEKLAEEVNADIFMILTDVN 238

Query: 236 NVMVNYKQPNQKALTHATKAELEEYIKQNQFAPGSMLPKVQAAIKFVEEGGKAAFIGDLK 295
+ Y ++ L EL +Y ++ F GSM PKV AAI+F+E GG+ A I L+
Sbjct: 239 GAALYYGTEKEQWLREVKVEELRKYYEEGHFKAGSMGPKVLAAIRFIEWGGERAIIAHLE 298

Query: 296 DLENIINGKTGTKVT 310
+ GKTGT+V
Sbjct: 299 KAVEALEGKTGTQVL 313


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
MARTH_orf135  CHANLCOLICIN  28     0.033    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 27.7 bits (61), Expect = 0.033
Identities = 19/78 (24%), Positives = 32/78 (41%), Gaps = 4/78 (5%)

Query: 17 SYSTYFATKFVEEYQKINQE--DEIKIIDLNSFDVSQKTLTS--GNFATFFNENDSDALI 72
S+ K+ E+Y K+ QE D+ K + + + + F++ D DA+
Sbjct: 354 SFYQTLTEKYGEKYSKMAQELADKSKGKKIGNVNEALAAFEKYKDVLNKKFSKADRDAIF 413

Query: 73 NELKSVDKLIVASPMINF 90
N L SV A + F
Sbjct: 414 NALASVKYDDWAKHLDQF 431


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
MARTH_orf150  cloacin       43     2e-05    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 42.8 bits (100), Expect = 2e-05
Identities = 35/169 (20%), Positives = 57/169 (33%), Gaps = 35/169 (20%)

Query: 792 AEAKYKKLKKSLEDEQSLLDSLNKAIEKQSKTLNDAKQGADKATTIDGKDSKYSLLDKAL 851
AE Y++ + L + + K + N K D A +K L
Sbjct: 319 AERNYERARAELNQANEDVARNQERQAKAVQVYNSRKSELDAA-------------NKTL 365

Query: 852 QDAKTELDKMKEEAKDLKDQANKDSINNKIENLEKQISDGEEDLKKKQEALAKDKEKNDK 911
DA E+ + A D ++ + + + D+ KQ A
Sbjct: 366 ADAIAEIKQFNRFAHDPMAGGHR-----MWQMAGLKAQRAQTDVNNKQAAF--------- 411

Query: 912 LIKDLTDEANDAISKANDAIQNPFDKNKTKEAEDALKDAHKKLNEEKDK 960
D A S A+ A+ + + K KE D + A LN+EK+K
Sbjct: 412 ------DAAAKEKSDADAALSSAMESRKKKE--DKKRSAENNLNDEKNK 452


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
MARTH_orf154  SECA          869    0.0      SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 869 bits (2248), Expect = 0.0
Identities = 346/818 (42%), Positives = 482/818 (58%), Gaps = 60/818 (7%)

Query: 9 EMRIAEATLRKINDFEEDIQILSDKELQNKTSEFRQRINLGESPESIRAEVFAVSREATK 68
+R + IN E +++ LSD+EL+ KT+EFR R+ GE E++ E FAV REA+K
Sbjct: 17 TLRRMRKVVNIINAMEPEMEKLSDEELKGKTAEFRARLEKGEVLENLIPEAFAVVREASK 76

Query: 69 RILGKRPFDVQMIGGIILDLGSVAEMKTGEGKTITSIAPVYLNALTGKSVIVSTVNEYLA 128
R+ G R FDVQ++GG++L+ +AEM+TGEGKT+T+ P YLNALTGK V V TVN+YLA
Sbjct: 77 RVFGMRHFDVQLLGGMVLNERCIAEMRTGEGKTLTATLPAYLNALTGKGVHVVTVNDYLA 136

Query: 129 ERDAEEMGQVFKFLGLTVGINKAQMPTNEKREAYACDITYSVHSELGFDYLRDNMVMSKE 188
+RDAE +F+FLGLTVGIN MP KREAYA DITY ++E GFDYLRDNM S E
Sbjct: 137 QRDAENNRPLFEFLGLTVGINLPGMPAPAKREAYAADITYGTNNEYGFDYLRDNMAFSPE 196

Query: 189 EKVQRGLDFILLDEVDSILIDEAKTPLIISGGDEANSPLYNVADLFVRTLSND------- 241
E+VQR L + L+DEVDSILIDEA+TPLIISG E +S +Y + + L
Sbjct: 197 ERVQRKLHYALVDEVDSILIDEARTPLIISGPAEDSSEMYKRVNKIIPHLIRQEKEDSET 256

Query: 242 -----DYFIDEETKSVYLTEKGIEKANKYFNFS-------NLYDIQNSELVHRIQNALRA 289
+ +DE+++ V LTE+G+ + +LY N L+H + ALRA
Sbjct: 257 FQGEGHFSVDEKSRQVNLTERGLVLIEELLVKEGIMDEGESLYSPANIMLMHHVTAALRA 316

Query: 290 HKVMKLDVEYIVRNDKIELVDSFTGRVMEGRAYSEGLQQAIQAKERVEIEGETKTLATIT 349
H + DV+YIV++ ++ +VD TGR M+GR +S+GL QA++AKE V+I+ E +TLA+IT
Sbjct: 317 HALFTRDVDYIVKDGEVIIVDEHTGRTMQGRRWSDGLHQAVEAKEGVQIQNENQTLASIT 376

Query: 350 YQNFFRLFKKISGMTGTAKTEEKEFIEIYNMRVNVVPTNRPLARLDDKDEIYVTMHAKWQ 409
+QN+FRL++K++GMTGTA TE EF IY + VVPTNRP+ R D D +Y+T K Q
Sbjct: 377 FQNYFRLYEKLAGMTGTADTEAFEFSSIYKLDTVVVPTNRPMIRKDLPDLVYMTEAEKIQ 436

Query: 410 AVVKEVKRVYEKRQPILIGTAQVEDSEILHEYLIEERIPHTVLNAKQDASEAEIIAKAGQ 469
A+++++K K QP+L+GT +E SE++ L + I H VLNAK A+EA I+A+AG
Sbjct: 437 AIIEDIKERTAKGQPVLVGTISIEKSELVSNELTKAGIKHNVLNAKFHANEAAIVAQAGY 496

Query: 470 VGAVTIATNMAGRGTDIK-----------------------------PSKEAIELGGLYV 500
AVTIATNMAGRGTDI +E GGL++
Sbjct: 497 PAAVTIATNMAGRGTDIVLGGSWQAEVAALENPTAEQIEKIKADWQVRHDAVLEAGGLHI 556

Query: 501 LGTEKAESRRIDNQLKGRSGRQGDVGYSKFYLSLDDQLILRFSVQDRWKEIFKAYG---D 557
+GTE+ ESRRIDNQL+GRSGRQGD G S+FYLS++D L+ F+ DR + + G
Sbjct: 557 IGTERHESRRIDNQLRGRSGRQGDAGSSRFYLSMEDALMRIFA-SDRVSGMMRKLGMKPG 615

Query: 558 DPIEGEAIRKAFLNAQKKIEGFNFDNRKSVLNYDDVIRQQRDLIYEQRDLILDRDDLGSI 617
+ IE + KA NAQ+K+E NFD RK +L YDDV QR IY QR+ +LD D+
Sbjct: 616 EAIEHPWVTKAIANAQRKVESRNFDIRKQLLEYDDVANDQRRAIYSQRNELLDVSDVSET 675

Query: 618 IRKMISVCVTQTVD---NPYFINESTLDVPRFCEYLNKNWMDLTEYKFTEAELQKYDRDE 674
I + T+D P + E D+P E L ++ + + +
Sbjct: 676 INSIREDVFKATIDAYIPPQSLEEMW-DIPGLQERLKNDFDLDLPIAEWLDKEPELHEET 734

Query: 675 LVDYLIGIFNREYDILRQNIVEKYGVSALTNSERTIILNVFDSAWQDHINTMDKLRRSSH 734
L + ++ Y E G + + E+ ++L DS W++H+ MD LR+ H
Sbjct: 735 LRERILAQSIEVYQ----RKEEVVGAEMMRHFEKGVMLQTLDSLWKEHLAAMDYLRQGIH 790

Query: 735 LVQYSQKNPYQVYTQLGSKRFKELTQRIALESVVNLMN 772
L Y+QK+P Q Y + F + + + E + L
Sbjct: 791 LRGYAQKDPKQEYKRESFSMFAAMLESLKYEVISTLSK 828


3     MARTH_orf204  MARTH_orf229  Y     Y          N                   Pathogenicity Island (biased-composition)

LocusTag      DNBias  CDNBias  %GCBias    Product
MARTH_orf204  2       15       3.221620   hypothetical protein
MARTH_orf205  1       18       3.567078   hypothetical protein
MARTH_orf206  0       16       2.929486   transcription elongation factor
MARTH_orf208  0       15       2.724797   ribosomal protein L10
MARTH_orf211  0       12       2.408745   ribosomal protein L7/L12
MARTH_orf212  0       11       1.913879   DNA-directed RNA polymerase beta subunit
MARTH_orf213  -2      8        0.304825   DNA-directed RNA polymerase beta' subunit
MARTH_orf214  3       13       -2.236971  hypothetical lipoprotein
MARTH_orf216  5       13       -1.083195  predicted O-methyltransferase
MARTH_orf217  11      15       -1.143364  methionyl-tRNA synthetase
MARTH_orf218  15      16       -2.120796  hypothetical lipoprotein
MARTH_orf220  15      17       -2.192573  hypothetical lipoprotein
MARTH_orf221  11      17       -2.946449  hypothetical lipoprotein
MARTH_orf223  8       19       -3.488789  hypothetical lipoprotein
MARTH_orf224  8       19       -4.728642  hypothetical lipoprotein
MARTH_orf226  4       22       -5.047276  hypothetical lipoprotein
MARTH_orf227  3       23       -4.484912  hypothetical lipoprotein
MARTH_orf229  3       22       -3.059729  hypothetical lipoprotein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
MARTH_orf220  TONBPROTEIN   35     3e-04    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 34.6 bits (79), Expect = 3e-04
Identities = 17/88 (19%), Positives = 32/88 (36%), Gaps = 1/88 (1%)

Query: 10 LLLASVGSLSLFTTSVVAAACDNTNKKPEEPKKDEPKKEDPKKE-EPKKDELKKEDPKKD 68
LL SV + + EP + +P E EP+ + + + +
Sbjct: 27 LLYTSVHQVIELPAPAQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAP 86

Query: 69 EPKKEDPKKEDPKQNPKDEPKEEPKKSP 96
++ K PK P + +E+PK+
Sbjct: 87 VVIEKPKPKPKPKPKPVKKVQEQPKRDV 114



Score = 30.3 bits (68), Expect = 0.006
Identities = 14/59 (23%), Positives = 26/59 (44%), Gaps = 1/59 (1%)

Query: 37 PEEPKKDEPKKEDPKKEEPKKDELKKEDPK-KDEPKKEDPKKEDPKQNPKDEPKEEPKK 94
P+ + +P+ E E KE P ++PK + K P + +++PK + K
Sbjct: 58 PQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKP 116



Score = 30.3 bits (68), Expect = 0.007
Identities = 8/64 (12%), Positives = 17/64 (26%)

Query: 35 KKPEEPKKDEPKKEDPKKEEPKKDELKKEDPKKDEPKKEDPKKEDPKQNPKDEPKEEPKK 94
+ +PK K K +E K ++K + + P + +
Sbjct: 89 IEKPKPKPKPKPKPVKKVQEQPKRDVKPVESRPASPFENTAPARLTSSTATAATSKPVTS 148

Query: 95 SPEE 98

Sbjct: 149 VASG 152



Score = 30.0 bits (67), Expect = 0.009
Identities = 10/47 (21%), Positives = 18/47 (38%)

Query: 35 KKPEEPKKDEPKKEDPKKEEPKKDELKKEDPKKDEPKKEDPKKEDPK 81
+PE + P+ +K + K + K K ++ K D K
Sbjct: 69 VEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVK 115


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
MARTH_orf221  TONBPROTEIN   34     4e-04    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 33.8 bits (77), Expect = 4e-04
Identities = 15/53 (28%), Positives = 23/53 (43%), Gaps = 1/53 (1%)

Query: 37 LEEPKKEDPKKEEPKKDEPKKEDPKKDEPKKEDPKKEDPKQNPKDEPKEEPKK 89
LE P+ P E + EP+ E + + ++ PK PK +PK K
Sbjct: 55 LEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEK-PKPKPKPKPKPVKKV 106



Score = 31.9 bits (72), Expect = 0.002
Identities = 11/54 (20%), Positives = 23/54 (42%)

Query: 38 EEPKKEDPKKEEPKKDEPKKEDPKKDEPKKEDPKKEDPKQNPKDEPKEEPKKSP 91
+P E + EP+ + + + ++ K PK P + +E+PK+
Sbjct: 61 VQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDV 114



Score = 31.5 bits (71), Expect = 0.003
Identities = 16/78 (20%), Positives = 29/78 (37%), Gaps = 13/78 (16%)

Query: 45 PKKEEPKKDEPKKEDPKKDEPKKEDPKKEDPKQNPKDEPKEEPKKSPEERLLELNILKKE 104
P EP + +P + + +P E PK+ P K +PK P+
Sbjct: 52 PADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPK------------ 99

Query: 105 IAKKQVELEKTIKSDISK 122
K ++++ K D+
Sbjct: 100 -PKPVKKVQEQPKRDVKP 116



Score = 30.0 bits (67), Expect = 0.007
Identities = 14/53 (26%), Positives = 24/53 (45%), Gaps = 1/53 (1%)

Query: 38 EEPKKEDPKKEEPKKDEPKKEDPK-KDEPKKEDPKKEDPKQNPKDEPKEEPKK 89
+P+ E EP KE P ++PK + K P + +++PK + K
Sbjct: 64 PPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKP 116


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
MARTH_orf223  IGASERPTASE   32     0.001    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 32.3 bits (73), Expect = 0.001
Identities = 28/168 (16%), Positives = 52/168 (30%), Gaps = 12/168 (7%)

Query: 33 TDKKLEEPKKDGQKKEDSKKPNTKDSNKPDEGTTPKEEKQNPHVPKGQPEWEDELHKFGQ 92
T + E + ++K + T++ K +PK+E+ P+ +P E++
Sbjct: 1097 TTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPAREND------ 1150

Query: 93 PDENATPKEEKQNPHVPKGQPEWEDELHKFGQPDENATPKEEKQNPHVPKG------QPE 146
P N + + N QP E + E+ T P+ QP
Sbjct: 1151 PTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPT 1210

Query: 147 WEDELHKFGQPDENATPKEEKQNPHVPKGQPEWEDELHSSGETSKNSN 194
E + + + N + TS N+N
Sbjct: 1211 VNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTN 1258


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
MARTH_orf224  TONBPROTEIN   38     2e-05    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 37.7 bits (87), Expect = 2e-05
Identities = 14/61 (22%), Positives = 28/61 (45%), Gaps = 1/61 (1%)

Query: 37 PEEPKKPEEPKKEEPKKEDPKQDPKDKPKEEPKKDEPKKEKPKKEDPKQDPKDKPKEEPK 96
P + +P EP+ E P+ P+ + ++PK + K P + +++PK + K
Sbjct: 57 PPQAVQPPPEPVVEPEPE-PEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVK 115

Query: 97 K 97

Sbjct: 116 P 116



Score = 37.3 bits (86), Expect = 3e-05
Identities = 18/50 (36%), Positives = 21/50 (42%)

Query: 37 PEEPKKPEEPKKEEPKKEDPKQDPKDKPKEEPKKDEPKKEKPKKEDPKQD 86
PE PE PK+ E PK PK KPK K E K K + +
Sbjct: 73 PEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESRPA 122



Score = 36.5 bits (84), Expect = 6e-05
Identities = 21/81 (25%), Positives = 33/81 (40%), Gaps = 1/81 (1%)

Query: 22 ATSVVAAACDNTNKKPEEPKKPEEPKKEEPKKEDPKQDPKDKPKEEPKKDEPKKEKPKKE 81
+ ++V A + + P +P + EP+ E+PK K KP K+
Sbjct: 46 SVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKK 105

Query: 82 DPKQDPKD-KPKEEPKKSPEE 101
+Q +D KP E SP E
Sbjct: 106 VQEQPKRDVKPVESRPASPFE 126



Score = 35.7 bits (82), Expect = 1e-04
Identities = 14/61 (22%), Positives = 29/61 (47%)

Query: 43 PEEPKKEEPKKEDPKQDPKDKPKEEPKKDEPKKEKPKKEDPKQDPKDKPKEEPKKSPEER 102
EP + +P +P+ +P+ P+ + +K PK PK KP ++ ++ P+
Sbjct: 54 DLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRD 113

Query: 103 L 103
+
Sbjct: 114 V 114



Score = 29.6 bits (66), Expect = 0.012
Identities = 18/105 (17%), Positives = 36/105 (34%), Gaps = 1/105 (0%)

Query: 11 LLASVGSLSLFATSVVAAACDNTNKKPEEPKKPEEPKKEEPKKEDPKQDPKDKPKEEPKK 70
L SV ++ + + P P +P P +P Q + P+ +
Sbjct: 13 TLLSVCIHGAVVAGLLYTSVHQVIELPA-PAQPISVTMVTPADLEPPQAVQPPPEPVVEP 71

Query: 71 DEPKKEKPKKEDPKQDPKDKPKEEPKKSPEERLLELNSLKKEITK 115
+ + P+ +KPK +PK P+ K+++
Sbjct: 72 EPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKP 116


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
MARTH_orf227  RTXTOXINA     29     0.022    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family

signature.
Length = 1024

Score = 29.2 bits (65), Expect = 0.022
Identities = 11/50 (22%), Positives = 28/50 (56%), Gaps = 1/50 (2%)

Query: 123 VNVWKKLINNYQEAINELNSFEEQIAKIEKLLNVVRDLDGL-NQIKKLSK 171
+ + +L++ N +NSF +Q+ + +L+ + L+G+ N+++ L
Sbjct: 185 IELINQLVDTVASLNNNVNSFSQQLNTLGSVLSNTKHLNGVGNKLQNLPN 234


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
MARTH_orf229  PF05272       30     0.021    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.7 bits (66), Expect = 0.021
Identities = 9/25 (36%), Positives = 13/25 (52%)

Query: 53 EKPKTSEDTKKGQDHSPKQDDKQTP 77
P +ED + Q H+P D+Q P
Sbjct: 851 WPPVIAEDKEADQAHAPGDQDQQQP 875


4     MARTH_orf335  MARTH_orf366  Y     Y          N                   Pathogenicity Island (biased-composition)

LocusTag      DNBias  CDNBias  %GCBias    Product
MARTH_orf335  -2      13       -3.586879  CDP-diacylglycerol-glycerol-3-phosphate
MARTH_orf336  -2      12       -3.372874  ribosomal protein L33
MARTH_orf338  -2      12       -3.344405  Xaa-Pro aminopeptidase
MARTH_orf339  -2      14       -4.945720  predicted O-methyltransferase
MARTH_orf340  -1      11       -3.899959  hypothetical protein
MARTH_orf341  -2      10       -1.874890  pyridoxal-dependent decarboxylase
MARTH_orf342  -1      11       -0.206694  endonuclease IV
MARTH_orf343  -1      11       0.251399   nucleoside 2-deoxyribosyltransferase
MARTH_orf345  -1      10       0.233254   hypothetical lipoprotein
MARTH_orf346  -1      8        0.180522   RNA polymerase sigma-70 factor
MARTH_orf347  0       8        -0.351799  DNA primase
MARTH_orf348  1       8        -0.310320  glycyl-tRNA synthetase
MARTH_orf350  2       8        -1.021039  signal recognition particle
MARTH_orf352  3       7        -1.323834  hypothetical membrane protein
MARTH_orf353  4       8        -1.168251  cysteine protease
MARTH_orf358  5       9        -1.618444  massive surface protein MspB
MARTH_orf359  1       12       -2.598976  hypothetical lipoprotein
MARTH_orf362  0       11       -2.680475  hypothetical protein
MARTH_orf364  1       12       -2.294224  SpoU class tRNA/rRNA methyltransferase
MARTH_orf365  0       12       -3.452956  GTPase protein YlqF
MARTH_orf366  -1      10       -3.198442  putative esterase or lipase, membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
MARTH_orf353  PF05616       33     0.006    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 32.8 bits (74), Expect = 0.006
Identities = 16/49 (32%), Positives = 23/49 (46%), Gaps = 6/49 (12%)

Query: 97 QNDKNNSNSNQKKNDPKNPETKPEDEDNTPKNPDQKPKNDENSKTKPEN 145
+N NN N+ NP T+P E + NPD P D T+P++
Sbjct: 335 ENPANNPAPNE------NPGTRPNPEPDPDLNPDANPDTDGQPGTRPDS 377


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
MARTH_orf358  FbpA_PF05833  35     0.003    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 35.2 bits (81), Expect = 0.003
Identities = 47/229 (20%), Positives = 90/229 (39%), Gaps = 12/229 (5%)

Query: 1161 QDFLEKAKKKLKSIDKEIDELNKSLLEEKTQLEEKSKFKDIANDDVDGLKSAIDVLEKEI 1220
+ ++ + LK I + +L K + K + +K L S D + +
Sbjct: 218 NNSIDLSLSNLKEIVEVCKDLFKEIQSNKFEFNCYTKNNSFVGFYCLNLMSKEDYKKIQY 277

Query: 1221 DKANKLKNDANSKRNSDQKIKNQTDTNFGILENAIKQAKKELKTKQQTLEALKTSNQSKA 1280
D ++KL + ++ ++K+++ I+ N I + K+ K TL+ + + K
Sbjct: 278 DSSSKLLENFYYAKDKSDRLKSKSSDLQKIVMNNINRCTKKDKILNNTLKKCEDKDIFKL 337

Query: 1281 NETIKKANEIIKKLKDASDVTSITNAINDSNQTKSILEKAIDELKKDSEQKQ-------K 1333
+ AN I LK + N +++ T I +DE K S+ Q K
Sbjct: 338 YGELLTAN--IYALKKGLSHIELANYYSENYDTVKI---TLDENKTPSQNVQSYYKKYNK 392

Query: 1334 IENKLIELNNEIAAAKQKLETLKQDADSQKLAALKDEINKIKKSLKNAN 1382
++ N ++ +++L L + A DEI +IKK L
Sbjct: 393 LKKSEEAANEQLLQNEEELNYLYSVLTNINNADNYDEIEEIKKELIETG 441


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
MARTH_orf359  FRAGILYSIN    27     0.008    Fragilysin metallopeptidase (M10C) enterotoxin signat...
		>FRAGILYSIN#Fragilysin metallopeptidase (M10C) enterotoxin

signature.
Length = 405

Score = 27.0 bits (59), Expect = 0.008
Identities = 17/86 (19%), Positives = 38/86 (44%), Gaps = 7/86 (8%)

Query: 3 HFQKKKITSFLIFIGATAALSSGAMFTVSCASTVKSK-------EKKETTTLQKRINKIS 55
+F K K L+ +G A L++ + S +++ + + T L ++N +S
Sbjct: 5 NFNKMKNVKLLLMLGTAALLAACSNEADSLTTSIDAPVTASIDLQSVSYTDLATQLNDVS 64

Query: 56 DDLKTLEAQTQSLIKQIHFKPNELTQ 81
D K + + +Q+H ++ T+
Sbjct: 65 DFGKMIILKDNGFNRQVHVSMDKRTK 90


5     MARTH_orf531  MARTH_orf560  Y     Y          N                   Pathogenicity Island (biased-composition)

LocusTag      DNBias  CDNBias  %GCBias    Product
MARTH_orf531  0       12       -3.811872  hypothetical protein
MARTH_orf533  0       12       -4.195320  lipoprotein, putative acid phosphatase
MARTH_orf534  1       14       -4.719780  S-adenosylmethionine synthetase
MARTH_orf535  0       13       -5.589111  hypothetical membrane protein
MARTH_orf536  1       14       -5.236912  hypothetical membrane protein
MARTH_orf538  2       14       -5.456092  hypothetical membrane protein
MARTH_orf539  2       16       -4.219987  ABC transporter ATP-binding protein
MARTH_orf542  2       17       -2.875802  hypothetical lipoprotein
MARTH_orf544  1       14       -3.356331  hypothetical protein
MARTH_orf545  0       13       -3.083288  hypothetical protein
MARTH_orf546  -2      12       -2.291533  hypothetical lipoprotein
MARTH_orf548  -3      10       -1.967537  hypothetical protein
MARTH_orf549  -2      10       -2.081630  hypothetical protein, lipoprotein?
MARTH_orf551  -2      11       -2.428350  hypothetical membrane protein
MARTH_orf554  -2      14       -2.110782  hypothetical protein
MARTH_orf555  -3      13       -2.328110  hypothetical membrane protein
MARTH_orf556  -2      14       -3.948340  hypothetical membrane protein
MARTH_orf557  0       16       -4.193564  hypothetical membrane protein
MARTH_orf558  0       14       -3.648693  hypothetical membrane protein
MARTH_orf560  0       17       -3.546234  ABC transporter ATP-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
MARTH_orf545  SECA          28     0.047    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 27.5 bits (61), Expect = 0.047
Identities = 16/68 (23%), Positives = 25/68 (36%), Gaps = 8/68 (11%)

Query: 125 RIILESYRGLAKMEVKNEFLKLRYLKSPDYGWSHHIDT-DYLKRYGNYSTLKFRSYRIK- 182
+ + + E+ F K L++ D W H+ DYL+ + R Y K
Sbjct: 744 IEVYQRKEEVVGAEMMRHFEKGVMLQTLDSLWKEHLAAMDYLR-----QGIHLRGYAQKD 798

Query: 183 -ITEYDSE 189
EY E
Sbjct: 799 PKQEYKRE 806


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
MARTH_orf546  ACRIFLAVINRP  29     0.008    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 28.7 bits (64), Expect = 0.008
Identities = 9/42 (21%), Positives = 19/42 (45%), Gaps = 1/42 (2%)

Query: 6 LLISILSALTSLPSLVAVKCKPEETQEQKDEK-LAKKYEKEY 46
+ +S+L AL P+L A KP + +++ + +
Sbjct: 478 MALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTF 519


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
MARTH_orf555  CHANLCOLICIN  29     0.044    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 28.9 bits (64), Expect = 0.044
Identities = 21/108 (19%), Positives = 44/108 (40%), Gaps = 8/108 (7%)

Query: 17 KDSSKQVYEKLEKLYDKYSLRSKKESNYIFNNNKGEYEVNKSVDDKHLASNALQHYNYQS 76
KD+ + L +KY + K + + + +KG + +V++ A A + Y
Sbjct: 346 KDAVDATVSFYQTLTEKYGEKYSKMAQELADKSKG--KKIGNVNE---ALAAFEKYKDVL 400

Query: 77 AQALNNYLREGFLNSLANVDYSKYVKNTTIMKNYEFGFNYYKKVAIAY 124
+ + R+ N+LA+V Y + K+ + + V+ Y
Sbjct: 401 NKKFSKADRDAIFNALASVKYDDWAKH---LDQFAKYLKITGHVSFGY 445


6     MARTH_orf576  MARTH_orf590  Y     Y          N                   Pathogenicity Island (biased-composition)

LocusTag      DNBias  CDNBias  %GCBias    Product
MARTH_orf576  3       12       -0.017697  hypothetical lipoprotein
MARTH_orf577  3       11       0.730674   ABC transporter ATP-binding protein
MARTH_orf579  2       10       0.416424   hypothetical lipoprotein, potential protease
MARTH_orf580  2       13       -0.185581  cytosine deaminase
MARTH_orf582  2       14       -0.363799  hypothetical lipoprotein, cysteine protease
MARTH_orf584  4       15       -0.271627  virulence-associated lipoprotein MIA
MARTH_orf588  1       13       -2.027425  lipoprotein signal peptidase, signal peptidase
MARTH_orf589  1       13       -1.966205  isoleucyl-tRNA synthetase
MARTH_orf590  2       14       -1.397543  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
MARTH_orf584  IGASERPTASE   53     5e-10    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 52.8 bits (126), Expect = 5e-10
Identities = 46/203 (22%), Positives = 72/203 (35%), Gaps = 8/203 (3%)

Query: 26 VAAACDNTENKPEEPKKPESETPKKPESETPKKPESETPKKPESETPKKPEGETPKKPEG 85
+ + E E E K+ +S ++ + SET + ET +
Sbjct: 1047 ESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATV 1106

Query: 86 ETPKKPEGETPKKPEGETPKKPEGETPKKPEGETPKKPEGETPKKPEGETPKKPEGETPK 145
E +K + ET K E PK +PK+ + ET + + K+P+ +T
Sbjct: 1107 EKEEKAKVETEKTQ--EVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNT 1164

Query: 146 KPEGETPKKPESETPKKPEGE-TPKKPEGETPKKPEGETPKKPEGETPKKPEGETPKKPE 204
+ E P K S ++P E T + PE TP T E+ KP+
Sbjct: 1165 TADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPA----TTQPTVNSESSNKPK 1220

Query: 205 GETPKKPEGETPKKPEGETTDSN 227
+ P E TT SN
Sbjct: 1221 NRHRRSVRSV-PHNVEPATTSSN 1242



Score = 52.0 bits (124), Expect = 1e-09
Identities = 48/225 (21%), Positives = 79/225 (35%), Gaps = 6/225 (2%)

Query: 33 TENKPEEPKKPESETPKKPESE--TPKKPESETPKKPESETPKKPEGETPKKPEGETPKK 90
T E K ES+T +K E + E K+ +S + + ET +
Sbjct: 1036 TTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKET 1095

Query: 91 PEGETPKKPEGETPKKPEGETPKKPEGETPKKPEGETPKKPEGETPKKPEGETPKKPEGE 150
ET + E +K + ET K E PK +PK+ + ET + +
Sbjct: 1096 QTTETKETATVEKEEKAKVETEKTQE--VPKVTSQVSPKQEQSETVQPQAEPARENDPTV 1153

Query: 151 TPKKPESETPKKPEGETPKKPEGETPKKPEGE-TPKKPEGETPKKPEGETPKKPEGETPK 209
K+P+S+T + E P K ++P E T + PE TP + T
Sbjct: 1154 NIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQ-PTVN 1212

Query: 210 KPEGETPKKPEGETTDSNPNDSSSLSVIDEEEPSENWTEGWITTT 254
PK + S P++ + + + + T T
Sbjct: 1213 SESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNT 1257



Score = 51.2 bits (122), Expect = 2e-09
Identities = 40/207 (19%), Positives = 72/207 (34%), Gaps = 4/207 (1%)

Query: 28 AACDNTENKPEEPKKPESETPKKPESETPKKPESETPKKPESETPKKPEGETPKKPEGET 87
A D P P P SET + + ++ ++ + ++ E K+ +
Sbjct: 1018 ARVDEAPVPPPAPATP-SETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNV 1076

Query: 88 PKKPEGETPKKPEGETPKKPEGETPKKPEGETPKKPEGETPKKPEGETPKKPEGETPKKP 147
+ + ET + ET + E +K + ET K E PK +PK+
Sbjct: 1077 KANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQE--VPKVTSQVSPKQE 1134

Query: 148 EGETPKKPESETPKKPEGETPKKPEGETPKKPEGETPKKPEGETPKKPEGE-TPKKPEGE 206
+ ET + + K+P+ +T + E P K ++P E T
Sbjct: 1135 QSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNS 1194

Query: 207 TPKKPEGETPKKPEGETTDSNPNDSSS 233
+ PE TP + + N +
Sbjct: 1195 VVENPENTTPATTQPTVNSESSNKPKN 1221



Score = 50.8 bits (121), Expect = 2e-09
Identities = 38/210 (18%), Positives = 72/210 (34%), Gaps = 7/210 (3%)

Query: 37 PEEPKKPESETPKKPESETPKKPESETPKKP-----ESETPKKPEGETPKKPEGETPKKP 91
P +++ P P + E P P SET + + ++ + +
Sbjct: 997 ITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQ 1056

Query: 92 EGETPKKPEGETPKKPEGETPKKPEGETPKKPEGETPKKPEGETPKKPEGETPKKPEGET 151
+ E K+ + + + ET + ET + E +K + ET
Sbjct: 1057 DATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVET 1116

Query: 152 PKKPESETPKKPEGETPKKPEGETPKKPEGETPKKPEGETPKKPEGETPKKPEGETPKKP 211
K E PK +PK+ + ET + + K+P+ +T + E P K
Sbjct: 1117 EKT--QEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKE 1174

Query: 212 EGETPKKPEGETTDSNPNDSSSLSVIDEEE 241
++P E+T N +S + +
Sbjct: 1175 TSSNVEQPVTESTTVNTGNSVVENPENTTP 1204


7     MARTH_orf637  MARTH_orf653  Y     Y          N                   Pathogenicity Island (biased-composition)

LocusTag      DNBias  CDNBias  %GCBias    Product
MARTH_orf637  5       8        -0.883892  bacteriocin exporter and processing endoprotease
MARTH_orf639  6       7        0.254028   hypothetical membrane protein
MARTH_orf640  6       7        0.322397   hypothetical lipoprotein
MARTH_orf641  6       8        0.310253   glutamyl-tRNA synthetase
MARTH_orf642  6       8        0.227749   hypothetical protein
MARTH_orf647  6       8        0.345651   massive surface protein MspG
MARTH_orf653  3       8        0.735116   massive surface protein MspH
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
MARTH_orf637  PF05272  34     0.002    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 33.9 bits (77), Expect = 0.002
Identities = 12/33 (36%), Positives = 15/33 (45%), Gaps = 7/33 (21%)

Query: 502 GKNGVGKSTLAKLLFGLYDDYSGAIYFNDKELS 534
G G+GKSTL L GL +F+D
Sbjct: 603 GTGGIGKSTLINTLVGLD-------FFSDTHFD 628


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
MARTH_orf641  CHLAMIDIAOMP  32     0.007    Chlamydia major outer membrane protein signature.
		>CHLAMIDIAOMP#Chlamydia major outer membrane protein signature.

Length = 393

Score = 31.5 bits (71), Expect = 0.007
Identities = 16/62 (25%), Positives = 28/62 (45%), Gaps = 3/62 (4%)

Query: 37 GDFIF-RLEDTDVERNVEGGEASQLNNLAWLGIVPDESPLKPNPKYGKYRQSEKLAIYQA 95
GDF+F R+ TDV + + G+ + P + NP YG++ Q ++ A
Sbjct: 66 GDFVFDRVLKTDVNKEFQMGDK--PTSTTGNATAPTTLTARENPAYGRHMQDAEMFTNAA 123

Query: 96 YI 97
+
Sbjct: 124 CM 125


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits         Score  E-value  Comments
MARTH_orf647  TYPE4SSCAGA  38     7e-04    Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 37.8 bits (87), Expect = 7e-04
Identities = 92/411 (22%), Positives = 185/411 (45%), Gaps = 24/411 (5%)

Query: 1762 FENIKQELEKIKKSLNDANNDFKNKTDLKAKEEANNKLKDAIDKAETDLASQKNEISKLQ 1821
F + +EL + N A D KN + ++A L+ ++ K E + ++
Sbjct: 577 FLSSNKELVGKTLNFNKAVADAKNTGNYDEVKKAQKDLEKSLRKREHLEKEVEKKLESKS 636

Query: 1822 DQNNKTSANTKVEEVEKLIKRLKDEQKANAQSISAKKTKNEGLIKKPLEDSQKALDKANE 1881
NK A + + I L +++ AN + + +N IK+ L D + ++K +
Sbjct: 637 GNKNKMEAKAQANSQKDEIFALINKE-ANRDARAIAYAQNLKGIKRELSDKLENVNKNLK 695

Query: 1882 AIKKSNDD--SQKENALTDAENELKKKKKTLDDL-IKGELKNDSENKNKLEYKIKD-IDK 1937
KS D+ + K + AE LK K ++ DL I E + EN N + K+ +K
Sbjct: 696 DFDKSFDEFKNGKNKDFSKAEETLKALKGSVKDLGINPEWISKVENLNAALNEFKNGKNK 755

Query: 1938 KLQEVDQAKQDLKQSQDQKAENLATEARKLKEKLTNLVSQLNPENE--KWNQTETKIANI 1995
+V QAK DL+ S N +K+ +K+ NL ++ +++ E +A++
Sbjct: 756 DFSKVTQAKSDLENSVKDVIIN-----QKVTDKVDNLNQAVSVAKATGDFSRVEQALADL 810

Query: 1996 RKMIKDEIDEFLKSGGAADKLKGHSKLKDAIKELEQAKSSATNAIQRAQEVINNK----- 2050
+ K+++ + + + + + S++ ++K + N + +A+ +K
Sbjct: 811 KNFSKEQLAQQAQKNESLN-ARKKSEIYQSVKN-GVNGTLVGNGLSQAEATTLSKNFSDI 868

Query: 2051 KREFNNKIDSFEAKTNS-VQTELNEAETNAKLSDLIAKIGDET-SGLLKEVN---DLINE 2105
K+E N K+ +F N+ ++ E A+ N K + A + + + + K+VN D +N+
Sbjct: 869 KKELNAKLGNFNNNNNNGLKNEPIYAKVNKKKAGQAASLEEPIYAQVAKKVNAKIDRLNQ 928

Query: 2106 ISKFEGNGGELTTKVNDLKERLKEIQRQANERKQNKDKKIEELNKELSEAK 2156
I+ G G+ +++ ++ + R Q +KI+ LN+ +SEAK
Sbjct: 929 IASGLGVVGQAAGFPLKRHDKVDDLSKVGLSRNQELAQKIDNLNQAVSEAK 979



Score = 32.8 bits (74), Expect = 0.026
Identities = 62/275 (22%), Positives = 115/275 (41%), Gaps = 27/275 (9%)

Query: 1151 KEELSKKIKTLETNLLDIKEEANSKLQSIQDKITKLNKELETTNTSLNAQNSTTNGVA-- 1208
++E+ KK+++ N K EA ++ S +D+I L + + A G+
Sbjct: 625 EKEVEKKLESKSGN--KNKMEAKAQANSQKDEIFALINKEANRDARAIAYAQNLKGIKRE 682

Query: 1209 -SDDVDSLKAEIKKLQDQLNEANKLKEKAQKEADQKIKD--------GIKEKLETLEQSI 1259
SD ++++ +K +E K K +A++ +K GI + + +++
Sbjct: 683 LSDKLENVNKNLKDFDKSFDEFKNGKNKDFSKAEETLKALKGSVKDLGINPEWISKVENL 742

Query: 1260 NVATTTITNKQQQLNSQKTTNDTDRENAIK------KSTDAKQKLADANNLADNDVNKIG 1313
N A N + + S+ T +D EN++K K TD L A ++A
Sbjct: 743 NAALNEFKNGKNKDFSKVTQAKSDLENSVKDVIINQKVTDKVDNLNQAVSVA-KATGDFS 801

Query: 1314 KLEEALED-ANKAKETLEQTIQKLAKDNENQKLVETELKDLEKQIKSSQDKLDFLKEGDK 1372
++E+AL D N +KE L Q QK N+ L + ++ + +K+ + +
Sbjct: 802 RVEQALADLKNFSKEQLAQQAQK------NESLNARKKSEIYQSVKNGVNGTLVGNGLSQ 855

Query: 1373 KKLENIEKELEKIKKSLTDADNDFNNKTDLKAKEE 1407
+ + K IKK L +FNN + K E
Sbjct: 856 AEATTLSKNFSDIKKELNAKLGNFNNNNNNGLKNE 890


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
MARTH_orf653  cloacin  39     3e-04    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 38.9 bits (90), Expect = 3e-04
Identities = 34/169 (20%), Positives = 56/169 (33%), Gaps = 35/169 (20%)

Query: 792 AKAKYEKLKKSLEDEQSLLDSLNKAIEKQSKTLNDAKQEADKATTINDKDNKYGLLDKAL 851
A+ YE+ + L + + K + N K E D A +K L
Sbjct: 319 AERNYERARAELNQANEDVARNQERQAKAVQVYNSRKSELDAA-------------NKTL 365

Query: 852 QNAKTELDKIKENAKDLKDQANKDSINNKINDLEKQISDNEEDLKKKQEELTKDKEKNDK 911
+A E+ + A D ++ + + D+ KQ
Sbjct: 366 ADAIAEIKQFNRFAHDPMAGGHR-----MWQMAGLKAQRAQTDVNNKQAAF--------- 411

Query: 912 LIKDLTDEADDAISKANDAIKNPSDKNKTKEAEDALKDAHKKLNEEKDK 960
D A S A+ A+ + + K KE D + A LN+EK+K
Sbjct: 412 ------DAAAKEKSDADAALSSAMESRKKKE--DKKRSAENNLNDEKNK 452



Score = 33.1 bits (75), Expect = 0.016
Identities = 29/145 (20%), Positives = 62/145 (42%), Gaps = 14/145 (9%)

Query: 1502 FDKSQKLNDAKTKLNETQKELNDLKDELTGDSENQAKVQE-------ELTKITKKLQDLE 1554
+D + + A+ + ELN +++ + E QAK + EL K L D
Sbjct: 310 WDATHPVEAAERNYERARAELNQANEDVARNQERQAKAVQVYNSRKSELDAANKTLADAI 369

Query: 1555 QAKKDLEQ------SENQKAEALAD-EAKRLKQELDNLVNALDAANSKWSKSNNDIATIE 1607
K + + + +A +A+R + +++N A DAA + S ++ +++
Sbjct: 370 AEIKQFNRFAHDPMAGGHRMWQMAGLKAQRAQTDVNNKQAAFDAAAKEKSDADAALSSAM 429

Query: 1608 NKISNEINKYLAADSEASKLKNNPK 1632
+ +K +A++ + KN P+
Sbjct: 430 ESRKKKEDKKRSAENNLNDEKNKPR 454


8  MARTH_orf758  MARTH_orf773  Y  N  N  Genomic Island
LocusTag      DNBias  CDNBias  %GCBias    Product
MARTH_orf758  3       17       -1.708608  hypothetical membrane protein
MARTH_orf761  4       17       -1.960281  bacteriophage MAV1 replication protein RepB
MARTH_orf763  4       15       -1.908284  bacteriophage MAV1 protein RepA
MARTH_orf764  2       14       -1.803378  bacteriophage MAV1 protein RepP
MARTH_orf765  2       14       -1.099764  bacteriophage MAV1 putative C5 methylase MarMP
MARTH_orf767  1       15       -1.235727  bacteriophage MAV1 protein MarRP
MARTH_orf768  0       14       -1.272557  bacteriophage MAV1 hypothetical protein
MARTH_orf770  0       15       -1.303292  bacteriophage MAV1 hypothetical protein
MARTH_orf771  0       17       -1.158228  bacteriophage MAV1 hypothetical protein
MARTH_orf772  1       17       -1.281532  bacteriophage MAV1 hypothetical protein
MARTH_orf773  0       19       -3.022844  bacteriophage MAV1 hypothetical protein
9  MARTH_orf220  MARTH_orf229  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
MARTH_orf220  15      17       -2.192573  hypothetical lipoprotein
MARTH_orf221  11      17       -2.946449  hypothetical lipoprotein
MARTH_orf223  8       19       -3.488789  hypothetical lipoprotein
MARTH_orf224  8       19       -4.728642  hypothetical lipoprotein
MARTH_orf226  4       22       -5.047276  hypothetical lipoprotein
MARTH_orf227  3       23       -4.484912  hypothetical lipoprotein
MARTH_orf229  3       22       -3.059729  hypothetical lipoprotein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits         Score  E-value  Comments
MARTH_orf220  TONBPROTEIN  35     3e-04    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 34.6 bits (79), Expect = 3e-04
Identities = 17/88 (19%), Positives = 32/88 (36%), Gaps = 1/88 (1%)

Query: 10 LLLASVGSLSLFTTSVVAAACDNTNKKPEEPKKDEPKKEDPKKE-EPKKDELKKEDPKKD 68
LL SV + + EP + +P E EP+ + + + +
Sbjct: 27 LLYTSVHQVIELPAPAQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAP 86

Query: 69 EPKKEDPKKEDPKQNPKDEPKEEPKKSP 96
++ K PK P + +E+PK+
Sbjct: 87 VVIEKPKPKPKPKPKPVKKVQEQPKRDV 114



Score = 30.3 bits (68), Expect = 0.006
Identities = 14/59 (23%), Positives = 26/59 (44%), Gaps = 1/59 (1%)

Query: 37 PEEPKKDEPKKEDPKKEEPKKDELKKEDPK-KDEPKKEDPKKEDPKQNPKDEPKEEPKK 94
P+ + +P+ E E KE P ++PK + K P + +++PK + K
Sbjct: 58 PQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKP 116



Score = 30.3 bits (68), Expect = 0.007
Identities = 8/64 (12%), Positives = 17/64 (26%)

Query: 35 KKPEEPKKDEPKKEDPKKEEPKKDELKKEDPKKDEPKKEDPKKEDPKQNPKDEPKEEPKK 94
+ +PK K K +E K ++K + + P + +
Sbjct: 89 IEKPKPKPKPKPKPVKKVQEQPKRDVKPVESRPASPFENTAPARLTSSTATAATSKPVTS 148

Query: 95 SPEE 98

Sbjct: 149 VASG 152



Score = 30.0 bits (67), Expect = 0.009
Identities = 10/47 (21%), Positives = 18/47 (38%)

Query: 35 KKPEEPKKDEPKKEDPKKEEPKKDELKKEDPKKDEPKKEDPKKEDPK 81
+PE + P+ +K + K + K K ++ K D K
Sbjct: 69 VEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVK 115


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits         Score  E-value  Comments
MARTH_orf221  TONBPROTEIN  34     4e-04    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 33.8 bits (77), Expect = 4e-04
Identities = 15/53 (28%), Positives = 23/53 (43%), Gaps = 1/53 (1%)

Query: 37 LEEPKKEDPKKEEPKKDEPKKEDPKKDEPKKEDPKKEDPKQNPKDEPKEEPKK 89
LE P+ P E + EP+ E + + ++ PK PK +PK K
Sbjct: 55 LEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEK-PKPKPKPKPKPVKKV 106



Score = 31.9 bits (72), Expect = 0.002
Identities = 11/54 (20%), Positives = 23/54 (42%)

Query: 38 EEPKKEDPKKEEPKKDEPKKEDPKKDEPKKEDPKKEDPKQNPKDEPKEEPKKSP 91
+P E + EP+ + + + ++ K PK P + +E+PK+
Sbjct: 61 VQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDV 114



Score = 31.5 bits (71), Expect = 0.003
Identities = 16/78 (20%), Positives = 29/78 (37%), Gaps = 13/78 (16%)

Query: 45 PKKEEPKKDEPKKEDPKKDEPKKEDPKKEDPKQNPKDEPKEEPKKSPEERLLELNILKKE 104
P EP + +P + + +P E PK+ P K +PK P+
Sbjct: 52 PADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPK------------ 99

Query: 105 IAKKQVELEKTIKSDISK 122
K ++++ K D+
Sbjct: 100 -PKPVKKVQEQPKRDVKP 116



Score = 30.0 bits (67), Expect = 0.007
Identities = 14/53 (26%), Positives = 24/53 (45%), Gaps = 1/53 (1%)

Query: 38 EEPKKEDPKKEEPKKDEPKKEDPK-KDEPKKEDPKKEDPKQNPKDEPKEEPKK 89
+P+ E EP KE P ++PK + K P + +++PK + K
Sbjct: 64 PPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKP 116


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits         Score  E-value  Comments
MARTH_orf223  IGASERPTASE  32     0.001    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 32.3 bits (73), Expect = 0.001
Identities = 28/168 (16%), Positives = 52/168 (30%), Gaps = 12/168 (7%)

Query: 33 TDKKLEEPKKDGQKKEDSKKPNTKDSNKPDEGTTPKEEKQNPHVPKGQPEWEDELHKFGQ 92
T + E + ++K + T++ K +PK+E+ P+ +P E++
Sbjct: 1097 TTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPAREND------ 1150

Query: 93 PDENATPKEEKQNPHVPKGQPEWEDELHKFGQPDENATPKEEKQNPHVPKG------QPE 146
P N + + N QP E + E+ T P+ QP
Sbjct: 1151 PTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPT 1210

Query: 147 WEDELHKFGQPDENATPKEEKQNPHVPKGQPEWEDELHSSGETSKNSN 194
E + + + N + TS N+N
Sbjct: 1211 VNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTN 1258


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits         Score  E-value  Comments
MARTH_orf224  TONBPROTEIN  38     2e-05    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 37.7 bits (87), Expect = 2e-05
Identities = 14/61 (22%), Positives = 28/61 (45%), Gaps = 1/61 (1%)

Query: 37 PEEPKKPEEPKKEEPKKEDPKQDPKDKPKEEPKKDEPKKEKPKKEDPKQDPKDKPKEEPK 96
P + +P EP+ E P+ P+ + ++PK + K P + +++PK + K
Sbjct: 57 PPQAVQPPPEPVVEPEPE-PEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVK 115

Query: 97 K 97

Sbjct: 116 P 116



Score = 37.3 bits (86), Expect = 3e-05
Identities = 18/50 (36%), Positives = 21/50 (42%)

Query: 37 PEEPKKPEEPKKEEPKKEDPKQDPKDKPKEEPKKDEPKKEKPKKEDPKQD 86
PE PE PK+ E PK PK KPK K E K K + +
Sbjct: 73 PEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESRPA 122



Score = 36.5 bits (84), Expect = 6e-05
Identities = 21/81 (25%), Positives = 33/81 (40%), Gaps = 1/81 (1%)

Query: 22 ATSVVAAACDNTNKKPEEPKKPEEPKKEEPKKEDPKQDPKDKPKEEPKKDEPKKEKPKKE 81
+ ++V A + + P +P + EP+ E+PK K KP K+
Sbjct: 46 SVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKK 105

Query: 82 DPKQDPKD-KPKEEPKKSPEE 101
+Q +D KP E SP E
Sbjct: 106 VQEQPKRDVKPVESRPASPFE 126



Score = 35.7 bits (82), Expect = 1e-04
Identities = 14/61 (22%), Positives = 29/61 (47%)

Query: 43 PEEPKKEEPKKEDPKQDPKDKPKEEPKKDEPKKEKPKKEDPKQDPKDKPKEEPKKSPEER 102
EP + +P +P+ +P+ P+ + +K PK PK KP ++ ++ P+
Sbjct: 54 DLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRD 113

Query: 103 L 103
+
Sbjct: 114 V 114



Score = 29.6 bits (66), Expect = 0.012
Identities = 18/105 (17%), Positives = 36/105 (34%), Gaps = 1/105 (0%)

Query: 11 LLASVGSLSLFATSVVAAACDNTNKKPEEPKKPEEPKKEEPKKEDPKQDPKDKPKEEPKK 70
L SV ++ + + P P +P P +P Q + P+ +
Sbjct: 13 TLLSVCIHGAVVAGLLYTSVHQVIELPA-PAQPISVTMVTPADLEPPQAVQPPPEPVVEP 71

Query: 71 DEPKKEKPKKEDPKQDPKDKPKEEPKKSPEERLLELNSLKKEITK 115
+ + P+ +KPK +PK P+ K+++
Sbjct: 72 EPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKP 116


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits       Score  E-value  Comments
MARTH_orf227  RTXTOXINA  29     0.022    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 29.2 bits (65), Expect = 0.022
Identities = 11/50 (22%), Positives = 28/50 (56%), Gaps = 1/50 (2%)

Query: 123 VNVWKKLINNYQEAINELNSFEEQIAKIEKLLNVVRDLDGL-NQIKKLSK 171
+ + +L++ N +NSF +Q+ + +L+ + L+G+ N+++ L
Sbjct: 185 IELINQLVDTVASLNNNVNSFSQQLNTLGSVLSNTKHLNGVGNKLQNLPN 234


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
MARTH_orf229  PF05272  30     0.021    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.7 bits (66), Expect = 0.021
Identities = 9/25 (36%), Positives = 13/25 (52%)

Query: 53 EKPKTSEDTKKGQDHSPKQDDKQTP 77
P +ED + Q H+P D+Q P
Sbjct: 851 WPPVIAEDKEADQAHAPGDQDQQQP 875


10  MARTH_orf353  MARTH_orf367  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
MARTH_orf353  4       8        -1.168251  cysteine protease
MARTH_orf358  5       9        -1.618444  massive surface protein MspB
MARTH_orf359  1       12       -2.598976  hypothetical lipoprotein
MARTH_orf362  0       11       -2.680475  hypothetical protein
MARTH_orf364  1       12       -2.294224  SpoU class tRNA/rRNA methyltransferase
MARTH_orf365  0       12       -3.452956  GTPase protein YlqF
MARTH_orf366  -1      10       -3.198442  putative esterase or lipase, membrane protein
MARTH_orf367  -1      9        -2.606843  FAD synthase
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
MARTH_orf353  PF05616  33     0.006    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 32.8 bits (74), Expect = 0.006
Identities = 16/49 (32%), Positives = 23/49 (46%), Gaps = 6/49 (12%)

Query: 97 QNDKNNSNSNQKKNDPKNPETKPEDEDNTPKNPDQKPKNDENSKTKPEN 145
+N NN N+ NP T+P E + NPD P D T+P++
Sbjct: 335 ENPANNPAPNE------NPGTRPNPEPDPDLNPDANPDTDGQPGTRPDS 377


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
MARTH_orf358  FbpA_PF05833  35     0.003    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 35.2 bits (81), Expect = 0.003
Identities = 47/229 (20%), Positives = 90/229 (39%), Gaps = 12/229 (5%)

Query: 1161 QDFLEKAKKKLKSIDKEIDELNKSLLEEKTQLEEKSKFKDIANDDVDGLKSAIDVLEKEI 1220
+ ++ + LK I + +L K + K + +K L S D + +
Sbjct: 218 NNSIDLSLSNLKEIVEVCKDLFKEIQSNKFEFNCYTKNNSFVGFYCLNLMSKEDYKKIQY 277

Query: 1221 DKANKLKNDANSKRNSDQKIKNQTDTNFGILENAIKQAKKELKTKQQTLEALKTSNQSKA 1280
D ++KL + ++ ++K+++ I+ N I + K+ K TL+ + + K
Sbjct: 278 DSSSKLLENFYYAKDKSDRLKSKSSDLQKIVMNNINRCTKKDKILNNTLKKCEDKDIFKL 337

Query: 1281 NETIKKANEIIKKLKDASDVTSITNAINDSNQTKSILEKAIDELKKDSEQKQ-------K 1333
+ AN I LK + N +++ T I +DE K S+ Q K
Sbjct: 338 YGELLTAN--IYALKKGLSHIELANYYSENYDTVKI---TLDENKTPSQNVQSYYKKYNK 392

Query: 1334 IENKLIELNNEIAAAKQKLETLKQDADSQKLAALKDEINKIKKSLKNAN 1382
++ N ++ +++L L + A DEI +IKK L
Sbjct: 393 LKKSEEAANEQLLQNEEELNYLYSVLTNINNADNYDEIEEIKKELIETG 441


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits        Score  E-value  Comments
MARTH_orf359  FRAGILYSIN  27     0.008    Fragilysin metallopeptidase (M10C) enterotoxin signat...
		>FRAGILYSIN#Fragilysin metallopeptidase (M10C) enterotoxin signature.

Length = 405

Score = 27.0 bits (59), Expect = 0.008
Identities = 17/86 (19%), Positives = 38/86 (44%), Gaps = 7/86 (8%)

Query: 3 HFQKKKITSFLIFIGATAALSSGAMFTVSCASTVKSK-------EKKETTTLQKRINKIS 55
+F K K L+ +G A L++ + S +++ + + T L ++N +S
Sbjct: 5 NFNKMKNVKLLLMLGTAALLAACSNEADSLTTSIDAPVTASIDLQSVSYTDLATQLNDVS 64

Query: 56 DDLKTLEAQTQSLIKQIHFKPNELTQ 81
D K + + +Q+H ++ T+
Sbjct: 65 DFGKMIILKDNGFNRQVHVSMDKRTK 90


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
MARTH_orf367  LPSBIOSNTHSS  30     0.008    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 29.8 bits (67), Expect = 0.008
Identities = 19/114 (16%), Positives = 43/114 (37%), Gaps = 29/114 (25%)

Query: 23 ICLGGFESLHLGHYELIKKAREINATNSENKLAILLLKNSINKGRIQEKKLFQLKTRLYT 82
I G F+ + GH ++I++ + ++ + +L+N ++ +F ++ RL
Sbjct: 4 IYPGSFDPITFGHLDIIERGCRLFD-----QVYVAVLRNP------NKQPMFSVQERLEQ 52

Query: 83 ISALNFDYAFYIEVSEDLINLSAEKF----IEKLKEINVKQIVCG----PDFCF 128
I+ + L N + F + ++ I+ G DF
Sbjct: 53 IA----------KAIAHLPNAQVDSFEGLTVNYARQRQAGAILRGLRVLSDFEL 96


11  MARTH_orf456  MARTH_orf500  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
MARTH_orf456  6       10       0.676904   conserved hypothetical lipoprotein
MARTH_orf457  9       11       0.321880   hypothetical protein
MARTH_orf459  9       9        -0.885763  hypothetical membrane protein
MARTH_orf462  9       9        -1.329272  hypothetical lipoprotein
MARTH_orf469  9       9        -1.983472  massive surface protein MspC
MARTH_orf471  8       9        -2.560015  massive surface protein MspJ
MARTH_orf472  9       8        -1.354713  massive surface protein MspJ'
MARTH_orf481  10      9        -1.159517  massive surface protein MspD
MARTH_orf484  8       9        -0.384461  hypothetical protein
MARTH_orf486  8       8        -0.064534  hypothetical lipoprotein
MARTH_orf488  6       8        0.147677   hypothetical lipoprotein
MARTH_orf492  6       8        0.278156   massive surface protein MspE
MARTH_orf497  2       8        -0.062295  massive surface protein MspF
MARTH_orf498  -1      12       0.242010   ATP synthase subunit B
MARTH_orf499  0       10       -1.527434  ATP synthase subunit A
MARTH_orf500  0       9        -0.888392  conserved hypothetical membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
MARTH_orf456  PF01540  32     0.007    Adhesin lipoprotein
		>PF01540#Adhesin lipoprotein

Length = 475

Score = 32.0 bits (72), Expect = 0.007
Identities = 19/57 (33%), Positives = 29/57 (50%), Gaps = 1/57 (1%)

Query: 1 MRKISKLLLTVVGFGVAVSPAVVSVSCKNMTLADAANNSKVDVINKQ-NKLAREVKK 56
M+K K+ +T+ G + ++SC + LA+ K D KQ N LA E+KK
Sbjct: 1 MKKSKKIFITLCGIAATAVLPIATISCNDDKLAEKNGKEKADAALKQANALAEELKK 57


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits         Score  E-value  Comments
MARTH_orf459  IGASERPTASE  30     0.003    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.0 bits (67), Expect = 0.003
Identities = 22/113 (19%), Positives = 43/113 (38%), Gaps = 1/113 (0%)

Query: 24 TEKKPKESKKPESEEPKKEDQKQKPEEPKPADPKSDESKKLEEPGKNNVEESTKSNPSNT 83
TEK + K PK+E + + +PA ++D + ++EP + P+
Sbjct: 1116 TEKTQEVPKVTSQVSPKQEQSETVQPQAEPA-RENDPTVNIKEPQSQTNTTADTEQPAKE 1174

Query: 84 TEGETANSTEGTTDSEDVNSTEGTTDSEDANSTEGNTDSEDVNSTETTTQGSL 136
T +T NS ++ +T+ +SE N + + S+
Sbjct: 1175 TSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSV 1227



Score = 27.0 bits (59), Expect = 0.033
Identities = 23/118 (19%), Positives = 40/118 (33%), Gaps = 11/118 (9%)

Query: 24 TEKKPKESKKPESEEPKKEDQK--------QKPEEPKPADPKSDESKKLEEPGKNNVEES 75
++P + E+P E + PE PA + + + KN S
Sbjct: 1167 DTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRS 1226

Query: 76 TKSNPSNTTEGETANSTEGTTDSEDVNSTEGTTDSEDANSTEGNTDSEDVNSTETTTQ 133
+S P N T+++ T D+ ST DA + +N + +Q
Sbjct: 1227 VRSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAK---AQFVALNVGKAVSQ 1281


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits         Score  E-value  Comments
MARTH_orf462  IGASERPTASE  38     2e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 38.1 bits (88), Expect = 2e-05
Identities = 23/139 (16%), Positives = 41/139 (29%), Gaps = 4/139 (2%)

Query: 31 EDVAKKE-DDPKEKPADPKPADPKPADPKLADPKPADPKPADPKPADPKPADP---KPAD 86
E+ AK E + +E P PK + P+ + DP +P
Sbjct: 1109 EEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADT 1168

Query: 87 PKPADPKPADPKPADPKPADPKPADPKPADPKPADPKPADPKPADPKPADPKPADPKPAD 146
+PA ++ + + + +P+ P P PK +
Sbjct: 1169 EQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVR 1228

Query: 147 PKPADPKPADPKPADPKPA 165
P + +PA D
Sbjct: 1229 SVPHNVEPATTSSNDRSTV 1247


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
MARTH_orf469  CHANLCOLICIN  36     0.002    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 36.2 bits (83), Expect = 0.002
Identities = 38/239 (15%), Positives = 88/239 (36%), Gaps = 16/239 (6%)

Query: 1873 KALDKANEAIQKSNDDNQKENALTDAESELKKQKKNLDDLIKGELKNDSENKKKLEDKIK 1932
+A +KA + ++ + ++E A T+ + +L + ++ + E K +KKL
Sbjct: 144 EAAEKAFQEAEQRRKEIEREKAETERQLKLAEAEEKRLAALSEEAKAVEIAQKKLSAAQS 203

Query: 1933 DIDKKLQEVDQVKQELKQSQDEKSKELAIRARELK---------QELDAWINKLDAANKK 1983
++ K E+ + L S + E+ A + +ELD + KL
Sbjct: 204 EVVKMDGEIKTLNSRLSSSIHARDAEMKTLAGKRNELAQASAKYKELDELVKKLSPRAND 263

Query: 1984 WSQSSKDIANVQKAIDKIEEFLQANKNLKNNLNLKTDYEALMKSKDVGNKALQNAKELIE 2043
Q+ ++ + + + K + + +T + KA+
Sbjct: 264 PLQNRPFFEATRRRVGAGKIREEKQKQVTAS---ETRINRINADITQIQKAISQVSNNRN 320

Query: 2044 NKKVEFEEKIKTLEEKTRDLETNLLSNNTTDGLNSIIREIGNESIGLLKKINQIQSEIA 2102
E EE + + NLL++ D +++ + + +K +++ E+A
Sbjct: 321 AGIARVHEA----EENLKKAQNNLLNSQIKDAVDATVSFYQTLTEKYGEKYSKMAQELA 375


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
MARTH_orf481  CHANLCOLICIN  35     0.003    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 35.4 bits (81), Expect = 0.003
Identities = 35/223 (15%), Positives = 82/223 (36%), Gaps = 15/223 (6%)

Query: 1850 EETIRTKKEELKTLQEKNNKI-ATEAINKSRVAKTELD-NVNKLDDNSPEKINNLKNAIS 1907
+E + +KE + E ++ EA K A +E V + +
Sbjct: 151 QEAEQRRKEIEREKAETERQLKLAEAEEKRLAALSEEAKAVEIAQKKLSAAQSEVVKMDG 210

Query: 1908 EAKKAKTELEEALTLLEKDVENKAKVQGELQALNAEITNAQNKLNLLMTSETEKL----- 1962
E K + L ++ + +++ A + EL +A+ + L + L
Sbjct: 211 EIKTLNSRLSSSIHARDAEMKTLAGKRNELAQASAKYKELDELVKKLSPRANDPLQNRPF 270

Query: 1963 -ELIERKLEEIRNSLNTANQNLENQTELNAKETANEALKVAITDAKSNIEAPKNNLNNIH 2021
E R++ + Q ++T +N ++ AI+ +N A ++
Sbjct: 271 FEATRRRVGAGKIREEKQKQVTASETRINRINADITQIQKAISQVSNNRNAGIARVHEAE 330

Query: 2022 EESARKRADEKIKELEN-------FIKSLEQKYQDTKEQIAQD 2057
E + + + ++++ F ++L +KY + ++AQ+
Sbjct: 331 ENLKKAQNNLLNSQIKDAVDATVSFYQTLTEKYGEKYSKMAQE 373



Score = 31.6 bits (71), Expect = 0.044
Identities = 30/197 (15%), Positives = 69/197 (35%), Gaps = 1/197 (0%)

Query: 152 LEEKETKLKKETENLLKEIEETLAKVPNVGDLTLEIQKLINKLKDLKNTSEALQEQLLNS 211
L + E K +KE E K +E + + E ++ + + + AL E+
Sbjct: 132 LAKAEEKARKEAEAAEKAFQEAEQRRKEIEREKAETERQLKLAEAEEKRLAALSEEAKAV 191

Query: 212 RLVEDAKKLNDANLKIKDAINKLQKRLLEKSPEELEKLAKEKIQDLEKSKKAVDDAKNIS 271
+ + + + D K L S + K + +A K +
Sbjct: 192 EIAQKKLSAAQSEVVKMDGEIKTLNSRLSSSIHARDAEMKTLAGKRNELAQASAKYKELD 251

Query: 272 TLPNALESLKKDIETAKEIKEQANSK-GLDELSKKLEDAIKEAESTLEKGKQKQEQIEKE 330
L L D + E + G ++ ++ + + +E+ + + QI+K
Sbjct: 252 ELVKKLSPRANDPLQNRPFFEATRRRVGAGKIREEKQKQVTASETRINRINADITQIQKA 311

Query: 331 NEELKQKIDALVTKVNQ 347
++ +A + +V++
Sbjct: 312 ISQVSNNRNAGIARVHE 328


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits         Score  E-value  Comments
MARTH_orf486  IGASERPTASE  30     0.013    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 29.6 bits (66), Expect = 0.013
Identities = 33/149 (22%), Positives = 57/149 (38%), Gaps = 13/149 (8%)

Query: 40 PKFQEKSESKPETPKQDSQSKQT-RPDNNSNNEENTVPQDQNPQNPPAPDLNDKNHEKSP 98
P+ ++++++ T + Q P SNNEE + P PPAP + E
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEA-PVPPPAPATPSETTETVA 1041

Query: 99 ENPQVPIPNEDLDKNEDLDNNKEDTKPKINRKSITEEEKEKNRAFLKTTQEKANEFSKRI 158
EN + +++ ++ N++D + +E + N T E A S+
Sbjct: 1042 ENSK--------QESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETK 1093

Query: 159 EEYR---KNSNKSEKEESIKSLLFQIVEE 184
E K + EKEE K + E
Sbjct: 1094 ETQTTETKETATVEKEEKAKVETEKTQEV 1122


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits         Score  E-value  Comments
MARTH_orf488  IGASERPTASE  31     0.010    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.8 bits (69), Expect = 0.010
Identities = 17/129 (13%), Positives = 47/129 (36%), Gaps = 3/129 (2%)

Query: 32 KNPEAPDKKQDTPKQDIPQNPDKKENQNLNPNNKQQNTPQNPNSKEDQKPQAPENTEIEK 91
E ++ T +++ + + N N + E + + E
Sbjct: 1039 TVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGS---ETKET 1095

Query: 92 QNPKKQDKDQNQDNKKDQQQKDQQQEDPKTQNDKENQQEQQDSNKDKDDEEKNQYIDENE 151
Q + ++ + +K + + ++ QE PK + +QEQ ++ + + + + N
Sbjct: 1096 QTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNI 1155

Query: 152 REANNLISK 160
+E + +
Sbjct: 1156 KEPQSQTNT 1164


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
MARTH_orf492  cloacin  42     4e-05    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 41.6 bits (97), Expect = 4e-05
Identities = 34/171 (19%), Positives = 79/171 (46%), Gaps = 19/171 (11%)

Query: 810 IKEKQDLLDKAKQDADTANTIDDKKDQYEKLGKSIENVEKELEKLKE-EAKKLQEQNNKD 868
+K++QD ++ +Q+ D + ++ + YE+ + +++ + +E +AK +Q N++
Sbjct: 296 VKQRQDEENRRQQEWDATHPVEAAERNYERARAELNQANEDVARNQERQAKAVQVYNSR- 354

Query: 869 DIENKLNELNKNLEDAKSDLEEKNKALEQEKQNNVDQTKPLLDEADEAIKKAEDAIKNPS 928
+++L+ NK L DA +++++ N+ + A KA+ A
Sbjct: 355 --KSELDAANKTLADAIAEIKQFNRFAHDPMAGGHRMWQ-------MAGLKAQRA----- 400

Query: 929 NKDKVKEAQNALEKAKKDLEDQKAKPEINGDSETKKSIDDKLVDIKDKQDK 979
+ V Q A + A K+ D A ++ E++K +DK ++ +
Sbjct: 401 -QTDVNNKQAAFDAAAKEKSDADAA--LSSAMESRKKKEDKKRSAENNLND 448



Score = 36.2 bits (83), Expect = 0.002
Identities = 31/139 (22%), Positives = 63/139 (45%), Gaps = 30/139 (21%)

Query: 1413 EALKSAIDKAKAEIQTQRDEIAKLQEQ---------AKKAEAEAAAKTIEKLIQDLEK-- 1461
EA + ++A+AE+ +++A+ QE+ ++K+E +AA KT+ I ++++
Sbjct: 317 EAAERNYERARAELNQANEDVARNQERQAKAVQVYNSRKSELDAANKTLADAIAEIKQFN 376

Query: 1462 --------------KYIDTKEAIDKDKAANKTKTGNA----LTSADAAINAAKDAIK-KQ 1502
+ K + NK +A + ADAA+++A ++ K K+
Sbjct: 377 RFAHDPMAGGHRMWQMAGLKAQRAQTDVNNKQAAFDAAAKEKSDADAALSSAMESRKKKE 436

Query: 1503 NTDPDKEKALDDAKKELAK 1521
+ E L+D K + K
Sbjct: 437 DKKRSAENNLNDEKNKPRK 455



Score = 33.5 bits (76), Expect = 0.013
Identities = 33/174 (18%), Positives = 68/174 (39%), Gaps = 15/174 (8%)

Query: 1274 DALKEQQTKNAQIKDEALKEAYDATKAIDEANKLEDSNNTKIAKLTKALEDANNAKTKLE 1333
D L Q K Q ++ ++ +DAT ++ A + + ++ + + + + K
Sbjct: 289 DVLSPDQVKQRQDEENRRQQEWDATHPVEAAERNYERARAELNQANEDVARNQERQAKAV 348

Query: 1334 GTISSL--AKDTVNKKLVEDALENLKTKIKDAQDKLDA---LKEGETKLLERIKKALDAI 1388
+S D NK L DA+ +K + A D + + + +R + ++
Sbjct: 349 QVYNSRKSELDAANKTL-ADAIAEIKQFNRFAHDPMAGGHRMWQMAGLKAQRAQTDVNNK 407

Query: 1389 KEALDDADDNLENATDLKDREATNEALKSAIDKAKAEIQTQRDEIAKLQEQAKK 1442
+ A D A +A + AL SA++ K + +R L ++ K
Sbjct: 408 QAAFDAAAKEKSDA---------DAALSSAMESRKKKEDKKRSAENNLNDEKNK 452


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
MARTH_orf497  CHANLCOLICIN  39     2e-04    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 39.3 bits (91), Expect = 2e-04
Identities = 71/368 (19%), Positives = 149/368 (40%), Gaps = 18/368 (4%)

Query: 1172 ADKIAKLNKNLKAATNALGAQDTVTKNVNKTDIDSLTAQIKKLAEQIEV--ANKAKTAAE 1229
A++ A+ +A A +D +T+ + ++L + E+ AN A AE
Sbjct: 67 AEQAARAKAAAEAQAKAKANRDALTQRLKDIVNEALRHNASRTPSATELAHANNAAMQAE 126

Query: 1230 KERVKDSKIQSETKAEFDKLVKAL---EKANKTLANKKVELQEQTNINNAL--TEDAIQK 1284
ER++ +K + + + E + KA E+ K + +K E + Q + A A+ +
Sbjct: 127 DERLRLAKAEEKARKEAEAAEKAFQEAEQRRKEIEREKAETERQLKLAEAEEKRLAALSE 186

Query: 1285 SAAAI----AKINAA----NKLEDTDTTKISSLSDALNDANAAKEALQKTKEKLEKDAAN 1336
A A+ K++AA K++ T S LS +++ +A + L + +L + A+
Sbjct: 187 EAKAVEIAQKKLSAAQSEVVKMDGEIKTLNSRLSSSIHARDAEMKTLAGKRNELAQ-ASA 245

Query: 1337 KQRVEGELQKLESEIANAQSKLEVLKEGESKKLEQIKKDLENIKNDLDNANKNFDSKTDL 1396
K + EL K S AN + E +++ K E K + + D+
Sbjct: 246 KYKELDELVKKLSPRANDPLQNRPFFEATRRRVGAGKIREEKQKQVTASETRINRINADI 305

Query: 1397 KDKEEANEALKAAINKAKKDLQVQKNKVNTLQEADKKAEANTKIKEIEDLIKTLTQKQQD 1456
++A + N + + + Q ++ + +TLT+K +
Sbjct: 306 TQIQKAISQVSNNRNAGIARVHEAEENLKKAQNNLLNSQIKDAVDATVSFYQTLTEKYGE 365

Query: 1457 KAAEISAKRTQNGE-KTTQAIQKADEALKKAKDAIEKDKNDSTKETDLNDAKS-ELEKQK 1514
K ++++ + + K + +A A +K KD + K + + ++ N S + +
Sbjct: 366 KYSKMAQELADKSKGKKIGNVNEALAAFEKYKDVLNKKFSKADRDAIFNALASVKYDDWA 425

Query: 1515 NILENLSK 1522
L+ +K
Sbjct: 426 KHLDQFAK 433



Score = 37.0 bits (85), Expect = 0.001
Identities = 70/393 (17%), Positives = 148/393 (37%), Gaps = 55/393 (13%)

Query: 1277 LTEDAIQKSAAAIAKINAANKLEDTDTTKISSLSDALNDANAAKEALQKTKEKLEKDAAN 1336
L + +++A A A A K + L D +N+A + +T E AN
Sbjct: 62 LKKTQAEQAARAKAAAEAQAKAKANRDALTQRLKDIVNEALRHNAS--RTPSATELAHAN 119

Query: 1337 KQRVEGELQKLESEIANAQSKLEVLKEGESKKLEQIKKDLENIKNDLDNANKNFDSKTDL 1396
++ E ++L +K E E++ E+ ++ E + +++ + + L
Sbjct: 120 NAAMQAEDERL------RLAKAEEKARKEAEAAEKAFQEAEQRRKEIEREKAETERQLKL 173

Query: 1397 KDKEEANEALKAAINKAKKDLQVQKNKVNTLQEADKKAEANTKIKEIEDLIKTLTQKQQD 1456
+ EE A A+++ K +++ + K++ Q +E EI+ L L+
Sbjct: 174 AEAEEKRLA---ALSEEAKAVEIAQKKLSAAQ-----SEVVKMDGEIKTLNSRLSSSIHA 225

Query: 1457 KAAEISAKRTQNGEKTTQAIQKADEALKKAKDAIEKDKNDSTKETDLNDAKSELEKQKNI 1516
+ AE+ + E QA K E + K + + + + K
Sbjct: 226 RDAEMKTLAGKRNE-LAQASAKYKELDELVKKLSPRANDPLQNRPFFEATRRRVGAGKIR 284

Query: 1517 LENLSKNDLKDDSENRAKIDSKTTEIDKKLQEIDNAQKALKQSQDQRATELA-------- 1568
E + ++ + T I++ +I QKA+ Q + R +A
Sbjct: 285 ------------EEKQKQVTASETRINRINADITQIQKAISQVSNNRNAGIARVHEAEEN 332

Query: 1569 -------IEAKKLQDALEKLANELKPETEQWSQTRGKISQIEQKVKEIEEG-----FLKA 1616
+ +++DA++ + + TE++ + K S++ Q++ + +G +A
Sbjct: 333 LKKAQNNLLNSQIKDAVDATVSFYQTLTEKYGE---KYSKMAQELADKSKGKKIGNVNEA 389

Query: 1617 NGEANKLKNNSKLK---PAYDALVKAIETVKAK 1646
K K+ K DA+ A+ +VK
Sbjct: 390 LAAFEKYKDVLNKKFSKADRDAIFNALASVKYD 422



Score = 35.1 bits (80), Expect = 0.006
Identities = 38/244 (15%), Positives = 78/244 (31%), Gaps = 6/244 (2%)

Query: 1827 ASALNNATATLAAKKAALTRKQAQNDALVENLIQKSSEAKKALADANKLANNHPDKIKNL 1886
L A A A+K A ++A +A E ++ K KLA ++ L
Sbjct: 127 DERLRLAKAEEKARKEAEAAEKAFQEA--EQRRKEIEREKAETERQLKLAEAEEKRLAAL 184

Query: 1887 DDAITKATEAKKALENAINSLAKDSANKKKVEVEFQDLNTKIADAQNKLKILREGESKKL 1946
+ A+K L A + + K K + L R ++
Sbjct: 185 SEEAKAVEIAQKKLSAAQSEVVKMDGEIKTLNSRLSSSIHARDAEMKTLAGKRNELAQAS 244

Query: 1947 EQIKKDLDKIKNDLDNANKNFDAKIDLNDKEEANKALKDAIDKATSDLQKQKNEANKIQE 2006
+ K+ + +K AN + A K +K Q +E +
Sbjct: 245 AKYKELDELVKKLSPRANDPLQNRPFFEATRRRVGAGKIREEKQ---KQVTASETRINRI 301

Query: 2007 DAKKLEANQEIKTIQDLIASLTKKYQDKQKEIKQTKDA-NKKKANDAIARADEALNKANE 2065
+A + + I + + + + + ++ +K+ ++ + DA+ E
Sbjct: 302 NADITQIQKAISQVSNNRNAGIARVHEAEENLKKAQNNLLNSQIKDAVDATVSFYQTLTE 361

Query: 2066 ALGK 2069
G+
Sbjct: 362 KYGE 365


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits         Score  E-value  Comments
MARTH_orf500  TONBPROTEIN  31     0.011    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 31.1 bits (70), Expect = 0.011
Identities = 17/68 (25%), Positives = 21/68 (30%), Gaps = 1/68 (1%)

Query: 265 TRVIEYDLEHNQK-NNEQTPKTPPKPEQPPEIIPEDKPKDFDPSIENVAPLVPRIRTKYL 323
T V DLE Q P P+PE P P + + P+ K
Sbjct: 48 TMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQ 107

Query: 324 TNPSRIVS 331
P R V
Sbjct: 108 EQPKRDVK 115



Score = 29.2 bits (65), Expect = 0.049
Identities = 13/58 (22%), Positives = 26/58 (44%), Gaps = 1/58 (1%)

Query: 280 EQTPKTPPKPEQPPEIIPEDKPKDFDPSIENVAPLVPRIRTKYL-TNPSRIVSGYTSN 336
++ P KP+ P+ P+ K + +V P+ R + + T P+R+ S +
Sbjct: 83 KEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESRPASPFENTAPARLTSSTATA 140


12  MARTH_orf627  MARTH_orf653  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
MARTH_orf627  -2      8        -0.347188  heat shock ATP-dependent protease
MARTH_orf628  -1      8        -1.038973  spermidine/putrescine ABC transporter substrate
MARTH_orf629  0       8        -0.554645  spermidine/putrescine ABC transporter permease
MARTH_orf630  -1      6        -1.983723  spermidine/putrescine ABC transporter permease
MARTH_orf631  -1      6        -1.814325  spermidine/putrescine transport ATP-binding
MARTH_orf634  -1      9        -1.616969  spermidine/putrescine ABC transporter substrate
MARTH_orf635  0       9        -0.834949  GTP-binding protein Era
MARTH_orf636  -1      10       -1.246278  conserved hypothetical protein
MARTH_orf637  5       8        -0.883892  bacteriocin exporter and processing endoprotease
MARTH_orf639  6       7        0.254028   hypothetical membrane protein
MARTH_orf640  6       7        0.322397   hypothetical lipoprotein
MARTH_orf641  6       8        0.310253   glutamyl-tRNA synthetase
MARTH_orf642  6       8        0.227749   hypothetical protein
MARTH_orf647  6       8        0.345651   massive surface protein MspG
MARTH_orf653  3       8        0.735116   massive surface protein MspH
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits    Score  E-value  Comments
MARTH_orf627  HTHFIS  49     4e-08    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 49.4 bits (118), Expect = 4e-08
Identities = 28/105 (26%), Positives = 49/105 (46%), Gaps = 15/105 (14%)

Query: 404 NIPILTLIGPPGTGKTTLAKSIAEALGRQ---FVKISLGGVKD---ESEIRGHRRTYVGA 457
++ ++ + G GTGK +A+++ + R+ FV I++ + ESE+ GH + GA
Sbjct: 160 DLTLM-ITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEK---GA 215

Query: 458 LPGKIISGIKKAGVSNP-VILLDEIDKMSSDFRGDPLSALLEVLD 501
G + + + LDEI M D + + LL VL
Sbjct: 216 FTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQ----TRLLRVLQ 256


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits      Score  E-value  Comments
MARTH_orf628  MYCMG045  35     8e-04    Hypothetical mycoplasma lipoprotein (MG045) signature.
		>MYCMG045#Hypothetical mycoplasma lipoprotein (MG045) signature.

Length = 483

Score = 35.1 bits (80), Expect = 8e-04
Identities = 63/347 (18%), Positives = 139/347 (40%), Gaps = 57/347 (16%)

Query: 19 FIVFISFLTIKLSGGYRPSIY-NYESYLAPEIKNKL--KQNYNYKEFKEVNEFTQALNAE 75
F +F+S +I S G + N+ESY++P + ++ K + + +
Sbjct: 10 FSLFVSLSSILSSCGSTTFVLANFESYISPLLLERVQEKHPLTFLTYPSNEKLINGFANN 69

Query: 76 KAIAGVGSDFQAAQLIIDNKIRKIDFTKIFNENANSWEYRKKLFRPAIVKHMEAFDRYIY 135
V S + ++LI + + ID+++ + ++S
Sbjct: 70 TYSVAVASTYAVSELIERDLLSPIDWSQFNLKKSSSSS---------------------- 107

Query: 136 DAIANKKHPKARILDQEKKTYDVDGDGKADHFYDYIIPYYSQDKGIAYNTNKSTRPHLDT 195
D + N K +D K+ D K + + +PY+ Q+ Y K
Sbjct: 108 DKVNNASDAKDLFIDSIKEISQQTKDSKNNELLHWAVPYFLQNLVFVYRGEK-------- 159

Query: 196 EAATKELANQNGKLNWLEIVEIL----KKYNYKRFGWTNAYYDNLMLGAINKNVSNYNQI 251
EL +N ++W ++++ + ++N R + + D + ++ V+ N
Sbjct: 160 ---ISELEQEN--VSWTDVIKAIVKHKDRFNDNRLVFID---DARTIFSLANIVNTNNNS 211

Query: 252 TEENYKE-YIDSFVDFVKKATGANIKDTDLN--FMSNDGLELLNHLIEPKAKRSDAAVLY 308
+ N KE I F + + + ++L+ F+++D ++N L + R ++Y
Sbjct: 212 ADVNPKEDGIGYFTNVYESFQRLGLTKSNLDSIFVNSDSNIVINEL---ASGRRQGGIVY 268

Query: 309 NGDALDAYYSRDNFDSVQDGNV------RFIRPKNNYILMDVWIISQ 349
NGDA+ A D D + + + ++PK + + +D+ +I++
Sbjct: 269 NGDAVYAALGGDLRDELSEEQIPDGNNFHIVQPKISPVALDLLVINK 315


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
MARTH_orf631  PF05272  35     7e-04    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 35.0 bits (80), Expect = 7e-04
Identities = 11/36 (30%), Positives = 18/36 (50%), Gaps = 1/36 (2%)

Query: 45 VTLLGPSGSGKTTILRIIGGFEWVTRGEVKFY-GKD 79
V L G G GK+T++ + G ++ + GKD
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKD 634


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits      Score  E-value  Comments
MARTH_orf634  MYCMG045  35     0.001    Hypothetical mycoplasma lipoprotein (MG045) signature.
		>MYCMG045#Hypothetical mycoplasma lipoprotein (MG045) signature.

Length = 483

Score = 34.7 bits (79), Expect = 0.001
Identities = 41/188 (21%), Positives = 73/188 (38%), Gaps = 22/188 (11%)

Query: 184 IPYFMQDKVIAYNINKKYRDHLKDIDSIVFKDTNWLTIIKTLVEKHNYKHISWTSSFLDN 243
+PYF+Q+ V Y K I + ++ +W +IK +V KH + F+D+
Sbjct: 144 VPYFLQNLVFVYRGEK--------ISELEQENVSWTDVIKAIV-KHKDRFNDNRLVFIDD 194

Query: 244 AMIGQFYATESKIKNYLISGKITQLDNKNYKEVFDHFFNFVKEATGRDIRDVRTNKLITS 303
A A N + + V++ F + D V ++ I
Sbjct: 195 ARTIFSLANIVNTNNNSADVNPKEDGIGYFTNVYESFQRLGLTKSNLDSIFVNSDSNIV- 253

Query: 304 GLELVNDIIEPSNQRADIAVMYNGDAVDSFYAKD-----NFEVLGDQQQIKYVRPKNNYM 358
I E ++ R ++YNGDAV + D + E + D V+PK + +
Sbjct: 254 -------INELASGRRQGGIVYNGDAVYAALGGDLRDELSEEQIPDGNNFHIVQPKISPV 306

Query: 359 LLDAWIVS 366
LD +++
Sbjct: 307 ALDLLVIN 314


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
MARTH_orf637  PF05272  34     0.002    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 33.9 bits (77), Expect = 0.002
Identities = 12/33 (36%), Positives = 15/33 (45%), Gaps = 7/33 (21%)

Query: 502 GKNGVGKSTLAKLLFGLYDDYSGAIYFNDKELS 534
G G+GKSTL L GL +F+D
Sbjct: 603 GTGGIGKSTLINTLVGLD-------FFSDTHFD 628


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
MARTH_orf641  CHLAMIDIAOMP  32     0.007    Chlamydia major outer membrane protein signature.
		>CHLAMIDIAOMP#Chlamydia major outer membrane protein signature.

Length = 393

Score = 31.5 bits (71), Expect = 0.007
Identities = 16/62 (25%), Positives = 28/62 (45%), Gaps = 3/62 (4%)

Query: 37 GDFIF-RLEDTDVERNVEGGEASQLNNLAWLGIVPDESPLKPNPKYGKYRQSEKLAIYQA 95
GDF+F R+ TDV + + G+ + P + NP YG++ Q ++ A
Sbjct: 66 GDFVFDRVLKTDVNKEFQMGDK--PTSTTGNATAPTTLTARENPAYGRHMQDAEMFTNAA 123

Query: 96 YI 97
+
Sbjct: 124 CM 125


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits         Score  E-value  Comments
MARTH_orf647  TYPE4SSCAGA  38     7e-04    Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 37.8 bits (87), Expect = 7e-04
Identities = 92/411 (22%), Positives = 185/411 (45%), Gaps = 24/411 (5%)

Query: 1762 FENIKQELEKIKKSLNDANNDFKNKTDLKAKEEANNKLKDAIDKAETDLASQKNEISKLQ 1821
F + +EL + N A D KN + ++A L+ ++ K E + ++
Sbjct: 577 FLSSNKELVGKTLNFNKAVADAKNTGNYDEVKKAQKDLEKSLRKREHLEKEVEKKLESKS 636

Query: 1822 DQNNKTSANTKVEEVEKLIKRLKDEQKANAQSISAKKTKNEGLIKKPLEDSQKALDKANE 1881
NK A + + I L +++ AN + + +N IK+ L D + ++K +
Sbjct: 637 GNKNKMEAKAQANSQKDEIFALINKE-ANRDARAIAYAQNLKGIKRELSDKLENVNKNLK 695

Query: 1882 AIKKSNDD--SQKENALTDAENELKKKKKTLDDL-IKGELKNDSENKNKLEYKIKD-IDK 1937
KS D+ + K + AE LK K ++ DL I E + EN N + K+ +K
Sbjct: 696 DFDKSFDEFKNGKNKDFSKAEETLKALKGSVKDLGINPEWISKVENLNAALNEFKNGKNK 755

Query: 1938 KLQEVDQAKQDLKQSQDQKAENLATEARKLKEKLTNLVSQLNPENE--KWNQTETKIANI 1995
+V QAK DL+ S N +K+ +K+ NL ++ +++ E +A++
Sbjct: 756 DFSKVTQAKSDLENSVKDVIIN-----QKVTDKVDNLNQAVSVAKATGDFSRVEQALADL 810

Query: 1996 RKMIKDEIDEFLKSGGAADKLKGHSKLKDAIKELEQAKSSATNAIQRAQEVINNK----- 2050
+ K+++ + + + + + S++ ++K + N + +A+ +K
Sbjct: 811 KNFSKEQLAQQAQKNESLN-ARKKSEIYQSVKN-GVNGTLVGNGLSQAEATTLSKNFSDI 868

Query: 2051 KREFNNKIDSFEAKTNS-VQTELNEAETNAKLSDLIAKIGDET-SGLLKEVN---DLINE 2105
K+E N K+ +F N+ ++ E A+ N K + A + + + + K+VN D +N+
Sbjct: 869 KKELNAKLGNFNNNNNNGLKNEPIYAKVNKKKAGQAASLEEPIYAQVAKKVNAKIDRLNQ 928

Query: 2106 ISKFEGNGGELTTKVNDLKERLKEIQRQANERKQNKDKKIEELNKELSEAK 2156
I+ G G+ +++ ++ + R Q +KI+ LN+ +SEAK
Sbjct: 929 IASGLGVVGQAAGFPLKRHDKVDDLSKVGLSRNQELAQKIDNLNQAVSEAK 979



Score = 32.8 bits (74), Expect = 0.026
Identities = 62/275 (22%), Positives = 115/275 (41%), Gaps = 27/275 (9%)

Query: 1151 KEELSKKIKTLETNLLDIKEEANSKLQSIQDKITKLNKELETTNTSLNAQNSTTNGVA-- 1208
++E+ KK+++ N K EA ++ S +D+I L + + A G+
Sbjct: 625 EKEVEKKLESKSGN--KNKMEAKAQANSQKDEIFALINKEANRDARAIAYAQNLKGIKRE 682

Query: 1209 -SDDVDSLKAEIKKLQDQLNEANKLKEKAQKEADQKIKD--------GIKEKLETLEQSI 1259
SD ++++ +K +E K K +A++ +K GI + + +++
Sbjct: 683 LSDKLENVNKNLKDFDKSFDEFKNGKNKDFSKAEETLKALKGSVKDLGINPEWISKVENL 742

Query: 1260 NVATTTITNKQQQLNSQKTTNDTDRENAIK------KSTDAKQKLADANNLADNDVNKIG 1313
N A N + + S+ T +D EN++K K TD L A ++A
Sbjct: 743 NAALNEFKNGKNKDFSKVTQAKSDLENSVKDVIINQKVTDKVDNLNQAVSVA-KATGDFS 801

Query: 1314 KLEEALED-ANKAKETLEQTIQKLAKDNENQKLVETELKDLEKQIKSSQDKLDFLKEGDK 1372
++E+AL D N +KE L Q QK N+ L + ++ + +K+ + +
Sbjct: 802 RVEQALADLKNFSKEQLAQQAQK------NESLNARKKSEIYQSVKNGVNGTLVGNGLSQ 855

Query: 1373 KKLENIEKELEKIKKSLTDADNDFNNKTDLKAKEE 1407
+ + K IKK L +FNN + K E
Sbjct: 856 AEATTLSKNFSDIKKELNAKLGNFNNNNNNGLKNE 890


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
MARTH_orf653  cloacin  39     3e-04    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 38.9 bits (90), Expect = 3e-04
Identities = 34/169 (20%), Positives = 56/169 (33%), Gaps = 35/169 (20%)

Query: 792 AKAKYEKLKKSLEDEQSLLDSLNKAIEKQSKTLNDAKQEADKATTINDKDNKYGLLDKAL 851
A+ YE+ + L + + K + N K E D A +K L
Sbjct: 319 AERNYERARAELNQANEDVARNQERQAKAVQVYNSRKSELDAA-------------NKTL 365

Query: 852 QNAKTELDKIKENAKDLKDQANKDSINNKINDLEKQISDNEEDLKKKQEELTKDKEKNDK 911
+A E+ + A D ++ + + D+ KQ
Sbjct: 366 ADAIAEIKQFNRFAHDPMAGGHR-----MWQMAGLKAQRAQTDVNNKQAAF--------- 411

Query: 912 LIKDLTDEADDAISKANDAIKNPSDKNKTKEAEDALKDAHKKLNEEKDK 960
D A S A+ A+ + + K KE D + A LN+EK+K
Sbjct: 412 ------DAAAKEKSDADAALSSAMESRKKKE--DKKRSAENNLNDEKNK 452



Score = 33.1 bits (75), Expect = 0.016
Identities = 29/145 (20%), Positives = 62/145 (42%), Gaps = 14/145 (9%)

Query: 1502 FDKSQKLNDAKTKLNETQKELNDLKDELTGDSENQAKVQE-------ELTKITKKLQDLE 1554
+D + + A+ + ELN +++ + E QAK + EL K L D
Sbjct: 310 WDATHPVEAAERNYERARAELNQANEDVARNQERQAKAVQVYNSRKSELDAANKTLADAI 369

Query: 1555 QAKKDLEQ------SENQKAEALAD-EAKRLKQELDNLVNALDAANSKWSKSNNDIATIE 1607
K + + + +A +A+R + +++N A DAA + S ++ +++
Sbjct: 370 AEIKQFNRFAHDPMAGGHRMWQMAGLKAQRAQTDVNNKQAAFDAAAKEKSDADAALSSAM 429

Query: 1608 NKISNEINKYLAADSEASKLKNNPK 1632
+ +K +A++ + KN P+
Sbjct: 430 ESRKKKEDKKRSAENNLNDEKNKPR 454
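
The "Score = ..." and "Identities = ..." lines in the alignments above can be read programmatically and their percentages re-derived from the raw counts. The following is a minimal sketch, not part of PredictBias: the function names (parse_hsp, floor_percent, passes_cutoff) and the regular expressions are assumptions written for the plain-text layout shown on this page. The percentages in this report are consistent with rounding down (for example 28/105 is shown as 26%), so the sketch uses integer division. The 0.05 cutoff is the "E-value (RPSBlast)" value from the input parameters at the top of the page.

```python
import re

# Hypothetical helpers for the plain-text alignment layout on this page.
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*([\d.eE+-]+)"
)
COUNT_RE = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)")

# E-value threshold as listed under "Input parameters" (assumed to apply per hit).
EVALUE_CUTOFF = 0.05


def parse_hsp(score_line, stats_line):
    """Parse one 'Score = ...' line and its following 'Identities = ...' line."""
    match = SCORE_RE.search(score_line)
    if match is None:
        raise ValueError("not a Score/Expect line: %r" % score_line)
    hsp = {
        "bits": float(match.group(1)),
        "raw_score": int(match.group(2)),
        "evalue": float(match.group(3)),
    }
    # Each entry becomes (count, alignment_length, reported_percent).
    for name, num, den, pct in COUNT_RE.findall(stats_line):
        hsp[name.lower()] = (int(num), int(den), int(pct))
    return hsp


def floor_percent(count, length):
    """Percentage rounded down, matching the figures shown in this report."""
    return (100 * count) // length


def passes_cutoff(hsp, cutoff=EVALUE_CUTOFF):
    """True when the hit's E-value does not exceed the cutoff."""
    return hsp["evalue"] <= cutoff


example = parse_hsp(
    "Score = 49.4 bits (118), Expect = 4e-08",
    "Identities = 28/105 (26%), Positives = 49/105 (46%), Gaps = 15/105 (14%)",
)
```

For the MARTH_orf627 / HTHFIS hit used as the example, floor_percent(28, 105) gives 26, which matches the reported identity, and passes_cutoff(example) is true because 4e-08 is below 0.05.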



 
Contact Sachin Pundhir for Bugs/Comments.