PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: genomic.gbff
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic): (none shown)
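
The page lists the thresholds but does not say how they are applied. As a minimal sketch only, assuming (hypothetically) that a bias measure counts as significant when its absolute value reaches the corresponding threshold, the check could be written in Python as follows. The names Thresholds and exceeded, the comparison rule, and the sample values are illustrative assumptions and are not taken from PredictBias.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Defaults mirror the input parameters listed on this page.
    dinucleotide: float = 2.0
    codon: float = 4.0
    gc: float = 3.0


def exceeded(dn_bias, cdn_bias, gc_bias, limits=Thresholds()):
    """Return the names of the measures whose absolute value reaches its threshold.

    The 'absolute value >= threshold' rule is an assumption made for
    illustration; the page does not document the actual rule.
    """
    checks = {
        "dinucleotide": (dn_bias, limits.dinucleotide),
        "codon": (cdn_bias, limits.codon),
        "gc": (gc_bias, limits.gc),
    }
    return [name for name, (value, limit) in checks.items() if abs(value) >= limit]


# Illustrative values only (not a statement about any particular ORF).
print(exceeded(-1, 12, 3.5))
```

Keeping the limits in one small immutable object makes it easy to rerun the same check with different parameter choices.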
 
C) Potential GIs and PAIs in CP038216
S.No  Start        End          Bias  Virulence  Insertion elements  Prediction
1     E4F39_00005  E4F39_00155  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_00005  -1      12       3.655617   ABC transporter permease
E4F39_00015  -1      16       4.363494   syringomycin synthesis regulator SyrP
E4F39_00020  -2      15       4.379853   class I SAM-dependent methyltransferase
E4F39_00025  -1      15       4.097333   citrate/2-methylcitrate synthase
E4F39_00030  0       14       2.996228   acyl-CoA dehydrogenase
E4F39_00035  1       15       3.578924   3-oxoacyl-ACP synthase
E4F39_00040  1       15       3.416955   acyl-CoA ligase (AMP-forming), exosortase A
E4F39_00045  0       15       2.302739   PLP-dependent decarboxylase
E4F39_00050  2       17       2.795900   3-oxoacyl-ACP synthase
E4F39_00055  2       16       2.863961   ferritin-like domain-containing protein
E4F39_00060  1       14       3.314729   hypothetical protein
E4F39_00065  1       13       3.135712   hypothetical protein
E4F39_00070  1       12       2.960896   acyl carrier protein
E4F39_00075  2       12       3.055067   hypothetical protein
E4F39_00080  3       12       3.191975   fatty acyl-AMP ligase
E4F39_00085  4       13       3.992061   LysR family transcriptional regulator
E4F39_00090  3       14       3.744302   epstein-Barr virus
E4F39_00095  2       14       3.467707   GntR family transcriptional regulator
E4F39_00100  1       13       2.918982   N-acetylglucosamine-6-phosphate deacetylase
E4F39_00105  -1      12       3.283753   SIS domain-containing protein
E4F39_00110  -2      9        3.130316   phosphoenolpyruvate--protein phosphotransferase
E4F39_00115  -2      9        2.118359   PTS N-acetyl-D-glucosamine transporter
E4F39_00120  0       15       2.169928   hypothetical protein
E4F39_00125  -2      9        1.472418   hypothetical protein
E4F39_00135  -1      12       -1.687589  hypothetical protein
E4F39_00140  -1      12       -2.546738  hypothetical protein
E4F39_00145  1       11       -3.658288  cytochrome bd-I oxidase subunit CydX
E4F39_00150  0       9        -3.927522  cytochrome d ubiquinol oxidase subunit II
E4F39_00155  0       9        -4.168259  cytochrome d ubiquinol oxidase subunit I
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_00005  ABC2TRNSPORT  32     0.003    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 31.8 bits (72), Expect = 0.003
Identities = 33/155 (21%), Positives = 60/155 (38%), Gaps = 7/155 (4%)

Query: 163 YGEFFATGILIMAFMSIGVVSTA-TTIATLRERNTFKMYVCFPVSRF-VFLASLIVSRVI 220
Y F A G++ + M+ T + + T++ + + + L + +
Sbjct: 65 YTAFLAAGMVATSAMTAATFETIYAAFGRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATK 124

Query: 221 LMLAASVTLMLAARYLFQVPLPLWSLRALRAIPVVLLGAAMLLSLGTLLASRARSLAAAE 280
LA + ++AA + SL L A+PV+ L SLG ++ + A S
Sbjct: 125 AALAGAGIGVVAAALGY---TQWLSL--LYALPVIALTGLAFASLGMVVTALAPSYDYFI 179

Query: 281 AWCNLIYFPLLFFSDLTIPLRAAPHWLRVVLLVLP 315
+ L+ P+LF S P+ P + LP
Sbjct: 180 FYQTLVITPILFLSGAVFPVDQLPIVFQTAARFLP 214
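
Each alignment in this report is introduced by a statistics line (Identities, Positives, Gaps). The following is a small Python sketch, not part of PredictBias or BLAST, for reading the counts from such a line and recomputing the fractions; the helper name parse_stats is hypothetical.

```python
import re

# Matches e.g. "Identities = 33/155" and captures the name and both counts.
STAT = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+)")


def parse_stats(line):
    """Map each statistic name to (count, alignment_length, fraction)."""
    return {
        name: (int(count), int(total), int(count) / int(total))
        for name, count, total in STAT.findall(line)
    }


line = "Identities = 33/155 (21%), Positives = 60/155 (38%), Gaps = 7/155 (4%)"
stats = parse_stats(line)
for name, (count, total, fraction) in stats.items():
    print(f"{name}: {count}/{total} = {fraction:.1%}")
```

Recomputing the fractions is useful because the percentages printed in the report are whole numbers that appear to be truncated rather than rounded: 60/155 is about 38.7% but is shown as 38%.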


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_00110  PHPHTRNFRASE  513    e-175    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 513 bits (1323), Expect = e-175
Identities = 194/567 (34%), Positives = 312/567 (55%), Gaps = 7/567 (1%)

Query: 309 PNTLAGVCAAPGIAVGTLVRWDDAQIVPPELASGTPAAESRLLDRALAEVDAQLETTVRE 368
+ + G+ A+ G+A+ + + + + + E L AL + +L +
Sbjct: 2 HHKITGIAASSGVAIAKAFIHLEPNVDIEKTSITDVSTEIEKLTAALEKSKEELRAIKDQ 61

Query: 369 ASRRGAIGEAGIFAVHRVLLEDPALVDAARDLI-SLGKSAGYAWRETIRAQTAVLADVDD 427
+A IFA H ++L+DP LVD + I + +A YA +E ++ +D+
Sbjct: 62 TEASMGADKAEIFAAHLLVLDDPELVDGIKGKIENEQMNAEYALKEVSDMFVSMFESMDN 121

Query: 428 TLLAERAADLRDIDKRVLRAL-GYASASARELPAEAVLAAEEFTPSDLASLDRERVAALV 486
+ ERAAD+RD+ KRVL L G + S + E V+ AE+ TPSD A L+++ V
Sbjct: 122 EYMKERAADIRDVSKRVLGHLIGVETGSLATIAEETVIIAEDLTPSDTAQLNKQFVKGFA 181

Query: 487 MARGGATSHAAIIARQLGIPALVAVGDALYAIAQRTQVVVDASAGRLEYAPSALDVERAH 546
GG TSH+AI++R L IPA+V + I V+VD G + P+ +V+
Sbjct: 182 TDIGGRTSHSAIMSRSLEIPAVVGTKEVTEKIQHGDMVIVDGIEGIVIVNPTEEEVKAYE 241

Query: 547 HERQRLAGVREANRRMSGEAALTRDGHRIEVAANIATLDDARVALDNGADAVGLLRTELM 606
+R ++ ++ GE + T+DG +E+AANI T D L NG + +GL RTE +
Sbjct: 242 EKRAAFEKQKQEWAKLVGEPSTTKDGAHVELAANIGTPKDVDGVLANGGEGIGLYRTEFL 301

Query: 607 FIHRQAAPTASEHQQSYQSIVDALQGRTAIIRTLDVGADKEVDYLTLPPEPNPALGLRGI 666
++ R PT E ++Y+ +V + G+ +IRTLD+G DKE+ YL LP E NP LG R I
Sbjct: 302 YMDRDQLPTEEEQFEAYKEVVQRMDGKPVVIRTLDIGGDKELSYLQLPKELNPFLGFRAI 361

Query: 667 RLAQVRPDLLDDQLRGLLAVKPYGSVRILLPMVTDVGELVRIRKRIDD-----FARAMGR 721
RL + D+ QLR LL YG+++++ PM+ + EL + + + + + +
Sbjct: 362 RLCLEKQDIFRTQLRALLRASTYGNLKVMFPMIATLEELRQAKAIMQEEKDKLLSEGVDV 421

Query: 722 AQAVEVGVMIEVPSAALLADQLAQHADFLSIGTNDLTQYTLAMDRCQADLAAQADGLHPA 781
+ ++EVG+M+E+PS A+ A+ A+ DF SIGTNDL QYT+A DR ++ HPA
Sbjct: 422 SDSIEVGIMVEIPSTAVAANLFAKEVDFFSIGTNDLIQYTMAADRMNERVSYLYQPYHPA 481

Query: 782 VLRLVDATVRGAEKHGKWVGVCGALGGDPVAVPVLVGLGVTELSVDPVSVPGIKAQVRRL 841
+LRLVD ++ A GKWVG+CG + GD VA+P+L+GLG+ E S+ S+ ++Q+ +L
Sbjct: 482 ILRLVDMVIKAAHSEGKWVGMCGEMAGDEVAIPLLLGLGLDEFSMSATSILPARSQLLKL 541

Query: 842 DYQLCRQRAQDLLALESAQAVRAASRE 868
+ + AQ L L++A+ V ++
Sbjct: 542 SKEELKPFAQKALMLDTAEEVEQLVKK 568


2     E4F39_00305  E4F39_00365  Y     N          N                   Genomic Island
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_00305  2       11       -2.538271  ATP-dependent protease
E4F39_00310  1       12       -3.043905  RNase adapter RapZ
E4F39_00315  1       13       -3.267589  HPr kinase/phosphorylase
E4F39_00320  -1      14       -3.038941  PTS IIA-like nitrogen-regulatory protein PtsN
E4F39_00325  -2      14       -2.432721  ribosome-associated translation inhibitor RaiA
E4F39_00330  -2      15       -0.924387  hypothetical protein
E4F39_00335  -1      13       -0.425086  RNA polymerase factor sigma-54
E4F39_00340  0       13       0.894660   LPS export ABC transporter ATP-binding protein
E4F39_00345  1       12       1.107467   lipopolysaccharide transport periplasmic protein
E4F39_00350  1       11       1.058524   LPS export ABC transporter periplasmic protein
E4F39_00355  3       12       1.429582   HAD family hydrolase
E4F39_00360  4       11       1.413466   KpsF/GutQ family sugar-phosphate isomerase
E4F39_00365  3       11       0.922792   sodium/hydrogen exchanger family protein
3     E4F39_00415  E4F39_00515  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_00415  4       14       -0.872050  MFS transporter
E4F39_00420  5       18       -0.848232  single-stranded DNA-binding protein
E4F39_00425  4       32       -6.693239  dienelactone hydrolase family protein
E4F39_00430  4       36       -7.858434  hypothetical protein
E4F39_00435  4       38       -7.861004  hydrolase
E4F39_00440  1       10       -1.161897  transcriptional regulator
E4F39_00445  -1      8        -0.246252  hypothetical protein
E4F39_00450  0       8        -0.108073  DUF262 domain-containing protein
E4F39_00455  0       10       2.399875   carboxymuconolactone decarboxylase
E4F39_00460  1       11       2.461844   hypothetical protein
E4F39_00465  1       11       2.500554   hypothetical protein
E4F39_00470  1       9        -0.234880  hypothetical protein
E4F39_00480  8       26       -5.383258  hypothetical protein
E4F39_00485  2       16       -1.304141  hypothetical protein
E4F39_00490  1       10       3.594913   hypothetical protein
E4F39_00495  1       12       3.500437   hypothetical protein
E4F39_00500  0       11       3.503947   hypothetical protein
E4F39_00505  2       9        4.067892   hypothetical protein
E4F39_00510  2       11       3.994206   FHA domain-containing protein
E4F39_00515  1       11       3.354823   TOMM system kinase/cyclase fusion protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_00415  TCRTETA       86     1e-20    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 86.0 bits (213), Expect = 1e-20
Identities = 77/368 (20%), Positives = 143/368 (38%), Gaps = 31/368 (8%)

Query: 17 RATTSLAAIFALRMLGLFMIMPVFSVYAKTIPGGENVVL-VGIALGAYGVTQSLLYIFYG 75
R + + AL +G+ +IMPV + + +V GI L Y + Q G
Sbjct: 5 RPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLG 64

Query: 76 WASDKFGRKPVIAAGLLIFALGSFVAAFAHDITWIIVGRVIQGM-GAVSSAVLAFIADLT 134
SD+FGR+PV+ L A+ + A A + + +GR++ G+ GA + A+IAD+T
Sbjct: 65 ALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADIT 124

Query: 135 SEHNRTKAMAMVGGSIGMSFAVAIVGAPI--VFHWVGMSGLFAIVGALSVAAIGVVLWVV 192
R + + G + G + + F AL+ +++
Sbjct: 125 DGDERARHFGFMSACFGFGM---VAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLL 181

Query: 193 PDAPRPVHVPAPFAEVLHNVELLRLNFGVLVLHATQTALFLVVPRLLVDGGLPVA----- 247
P++ + P E L+ + R G+ V+ A F+ + + G +P A
Sbjct: 182 PESHKGERRPLR-REALNPLASFRWARGMTVVAALMAVFFI----MQLVGQVPAALWVIF 236

Query: 248 ----SHWQ-----VYLPVMGL--AFVMMVPAIIVAEKQGRMKPVLLGGIAAILIGQLLLG 296
HW + L G+ + + VA + G + ++L G+ A G +LL
Sbjct: 237 GEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALML-GMIADGTGYILLA 295

Query: 297 VATHTILIVAAILFVYFLGFNILEASQPSLVSKLAPGSRKGAATGVYNTTQSIGLALGGV 356
AT + + V I + +++S+ R+G G S+ +G +
Sbjct: 296 FATRGWMAF--PIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPL 353

Query: 357 VGGVLLKH 364
+ +
Sbjct: 354 LFTAIYAA 361


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_00420  cloacin       42     4e-07    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 42.4 bits (99), Expect = 4e-07
Identities = 28/74 (37%), Positives = 29/74 (39%), Gaps = 11/74 (14%)

Query: 112 GGSGGGGGGGGDDGGYG--------GGGGGYGGGRDMERGGGGGRASGGGGAGARSGGGG 163
GG G G GGG G G GGG G G GG G GG G G G
Sbjct: 22 GGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIH---WGGGSGHGNGGGNGNSGGGSGTG 78

Query: 164 GGASRPSAPAGGGF 177
G S +AP GF
Sbjct: 79 GNLSAVAAPVAFGF 92



Score = 35.5 bits (81), Expect = 9e-05
Identities = 26/76 (34%), Positives = 31/76 (40%), Gaps = 9/76 (11%)

Query: 110 GRGGSGGGGGGGGDDGGYGGGGGGYGGGRDMERGG------GGGRASG---GGGAGARSG 160
GRG + G G+ G G G GG D GGG SG GGG+G +G
Sbjct: 6 GRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNG 65

Query: 161 GGGGGASRPSAPAGGG 176
GG G + S G
Sbjct: 66 GGNGNSGGGSGTGGNL 81



Score = 32.0 bits (72), Expect = 0.001
Identities = 18/54 (33%), Positives = 19/54 (35%)

Query: 109 GGRGGSGGGGGGGGDDGGYGGGGGGYGGGRDMERGGGGGRASGGGGAGARSGGG 162
GG G G GGG G GG G GG + G G GG G
Sbjct: 57 GGGSGHGNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAG 110


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_00450  BINARYTOXINA  30     0.033    Clostridial binary toxin A signature.
		>BINARYTOXINA#Clostridial binary toxin A signature.

Length = 454

Score = 29.6 bits (66), Expect = 0.033
Identities = 12/28 (42%), Positives = 17/28 (60%)

Query: 37 LEEKQRLIESIVNRYPIPAILIAERKSG 64
L+ K IE+ + PIP+ LI R+SG
Sbjct: 312 LDSKVNNIENALKLTPIPSNLIVYRRSG 339


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_00465  SALSPVBPROT   60     7e-11    Salmonella virulence plasmid 65kDa B protein signature.
		>SALSPVBPROT#Salmonella virulence plasmid 65kDa B protein signature.

Length = 591

Score = 60.1 bits (145), Expect = 7e-11
Identities = 55/208 (26%), Positives = 81/208 (38%), Gaps = 41/208 (19%)

Query: 12 LNLPSGGGSVSGDGGDFSVDLNTGTATLKFDLTVPAGPNGITPPHTLQYSAGAGDGAFGI 71
LP GG ++S G D G A++ L + A G P L YS+G G+G FG+
Sbjct: 18 PFLPKGGKALSQSGPD-------GLASITLPLPISAE-RGFAPALALHYSSGGGNGPFGV 69

Query: 72 GWSLGLMTIRRR-----------------------ITPATGAAEPAPPGACSLVGVGELV 108
GWS M+I R T +TG A P P + V
Sbjct: 70 GWSCATMSIARSTSHGVPQYNDSDEFLGPDGEVLVQTLSTGDA-PNPVTCFAYGDVSFPQ 128

Query: 109 DMGARRFRPIVDATGLLIEFTGAS------WTATDKTDTQYTLGTSANARIG---GGALP 159
R++P +++ +E+ + W D + LG +A AR+ +
Sbjct: 129 SYTVTRYQPRTESSFYRLEYWVGNSNGDDFWLLHDSNGILHLLGKTAAARLSDPQAASHT 188

Query: 160 AAWLVDRCADSAGNAIAYTWLDVGGARV 187
A WLV+ AG I Y++L G V
Sbjct: 189 AQWLVEESVTPAGEHIYYSYLAENGDNV 216


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_00470  CHANLCOLICIN  35     0.002    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 34.7 bits (79), Expect = 0.002
Identities = 37/196 (18%), Positives = 77/196 (39%), Gaps = 13/196 (6%)

Query: 518 VSQASGQINAAQQQLAVAQAQAQAYQAGVALAQTRATNAAKNAQ-EYGSLNSQVIVIQAT 576
+S+ + + AQ++L+ AQ++ + +R +++ E +L + +
Sbjct: 184 LSEEAKAVEIAQKKLSAAQSEVVKMDGEIKTLNSRLSSSIHARDAEMKTLAGKRNELAQA 243

Query: 577 GQQVSGGDDGDYNGVSAMANQYLSGQ-RISGDSATVAAATNLAANRL---SQQFQIDSMN 632
+ D+ +S AN L + V A + + + +I+ +N
Sbjct: 244 SAKYKELDE-LVKKLSPRANDPLQNRPFFEATRRRVGAGKIREEKQKQVTASETRINRIN 302

Query: 633 RTTAEMQQALAQAQAQLAAANAQVSAAGANLAVAQLNAQAAAQTLGVFDADTFTPQVWKA 692
++Q+A++Q A A+V A NL AQ N + DA T ++
Sbjct: 303 ADITQIQKAISQVSNNRNAGIARVHEAEENLKKAQNNLLNSQIK----DAVDATVSFYQT 358

Query: 693 MGNFVDQIYERYMNMA 708
+ ++ E+Y MA
Sbjct: 359 LT---EKYGEKYSKMA 371


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_00515  YERSSTKINASE  35     0.003    Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 34.7 bits (79), Expect = 0.003
Identities = 17/42 (40%), Positives = 25/42 (59%), Gaps = 2/42 (4%)

Query: 149 QVLDGLAHAHENGVVHRDLKPQNVMVTTRDGEPCAKILDFGI 190
++LD H + GVVH D+KP NV+ GEP ++D G+
Sbjct: 253 RLLDVTNHLAKAGVVHNDIKPGNVVFDRASGEPV--VIDLGL 292


4     E4F39_01190  E4F39_01350  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_01190  2       12       2.478097   LysR family transcriptional regulator
E4F39_01195  -1      17       2.878920   tRNA 2-selenouridine(34) synthase MnmH
E4F39_01205  -1      9        2.906856   hypothetical protein
E4F39_01210  0       11       3.207412   nitrate/sulfonate/bicarbonate ABC transporter
E4F39_01215  -1      10       3.292059   ABC transporter permease subunit
E4F39_01220  0       11       2.903917   hypothetical protein
E4F39_01225  0       10       3.151241   hypothetical protein
E4F39_01230  -1      11       2.948334   erythromycin esterase family protein
E4F39_01235  0       15       3.095646   hydrolase
E4F39_01240  0       13       3.535794   nicotinate phosphoribosyltransferase
E4F39_01245  0       9        2.630988   histidine kinase
E4F39_01255  -1      11       3.790713   phosphoribosyltransferase
E4F39_01260  -1      10       4.234673   glycosyl transferase family 51
E4F39_01265  -2      9        3.235675   hypothetical protein
E4F39_01270  -2      9        3.544147   cytochrome c
E4F39_01275  -1      10       3.809774   cytochrome c oxidase subunit I
E4F39_01280  -1      10       3.882510   cytochrome c oxidase subunit II
E4F39_01285  -1      10       3.580094   thiamine pyrophosphate-requiring protein
E4F39_01295  0       12       4.152309   mandelate racemase
E4F39_01305  0       10       4.536266   hypothetical protein
E4F39_01310  0       9        4.732291   gluconate 2-dehydrogenase subunit 3 family
E4F39_01315  0       9        3.465641   GMC family oxidoreductase
E4F39_01320  0       9        2.350074   type VI secretion system baseplate subunit TssF
E4F39_01325  0       11       1.679183   hypothetical protein
E4F39_01330  -1      12       0.175671   penicillin acylase family protein
E4F39_01335  -1      11       0.039452   LysR family transcriptional regulator
E4F39_01340  -1      14       -2.608996  response regulator
E4F39_01345  0       17       -2.465047  oxidoreductase
E4F39_01350  -1      20       -3.016105  response regulator
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_01345  HTHFIS        58     8e-11    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 57.5 bits (139), Expect = 8e-11
Identities = 34/122 (27%), Positives = 52/122 (42%), Gaps = 15/122 (12%)

Query: 484 RALVVDDNENARETLGAMLATLGIRVDLRGTGKEGLRCFGECQHDIVVLDLELPDISGFE 543
LV DD+ R L L+ G V + R D+VV D+ +PD + F+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 544 VAEQIRWATSSDAARKTTILGVSAYES------ALLKGDHAIFDAFIPKPIHLDTLGGIV 597
+ +I+ A +L +SA + A KG +D ++PKP L L GI+
Sbjct: 65 LLPRIK-----KARPDLPVLVMSAQNTFMTAIKASEKG---AYD-YLPKPFDLTELIGII 115

Query: 598 SR 599
R
Sbjct: 116 GR 117


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_01355  HTHFIS        38     5e-05    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 37.9 bits (88), Expect = 5e-05
Identities = 22/125 (17%), Positives = 47/125 (37%), Gaps = 12/125 (9%)

Query: 188 ARIAVVDDSPDVAETICEYFAEKGVAAIAYYDSVSFRKALEVEDFDGYILDWLLGEETAA 247
A I V DD + + + + G ++ + + + D D + D ++ +E A
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 248 PLVRGIRASENADAPIFLLTGKISTGEASEDEIADIVSSFNARCEE---KPVRLPILFAE 304
L+ I+ D P+ +++ ++ + + + KP L L
Sbjct: 64 DLLPRIK-KARPDLPVLVMSA--------QNTFMTAIKASEKGAYDYLPKPFDLTELIGI 114

Query: 305 VARAL 309
+ RAL
Sbjct: 115 IGRAL 119


5     E4F39_01400  E4F39_01525  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_01400  2       16       -1.859803  lipocalin-like domain-containing protein
E4F39_01405  0       14       -2.811228  DMT family transporter
E4F39_01410  0       14       -4.060041  mandelate racemase
E4F39_01415  -1      19       -5.207213  hypothetical protein
E4F39_01420  -1      18       -5.442219  response regulator transcription factor
E4F39_01425  -2      15       -3.945018  3-phosphoshikimate 1-carboxyvinyltransferase
E4F39_01430  -2      13       -3.224126  hypothetical protein
E4F39_01435  -1      13       -3.181988  hypothetical protein
E4F39_01440  0       11       -1.133704  *MFS transporter
E4F39_01450  -1      9        -0.455382  HAMP domain-containing protein
E4F39_01455  1       11       -0.505270  response regulator transcription factor
E4F39_01460  2       14       -1.864030  recombinase RecA
E4F39_01470  2       16       -0.849556  recombination regulator RecX
E4F39_01475  -2      14       -1.086044  DUF2889 domain-containing protein
E4F39_01480  -2      14       -1.342775  ADP-forming succinate--CoA ligase subunit beta
E4F39_01485  -2      10       -0.418433  succinate--CoA ligase subunit alpha
E4F39_01490  -2      9        0.284476   TerC family protein
E4F39_01495  0       12       0.459820   pilin
E4F39_01500  -1      13       -0.009585  polymerase
E4F39_01505  3       19       -0.363401  hypothetical protein
E4F39_01510  2       11       2.354328   hypothetical protein
E4F39_01515  0       11       3.106653   DUF2501 domain-containing protein
E4F39_01520  -1      13       3.559757   TonB family protein
E4F39_01525  -2      12       3.175458   cyclic pyranopterin monophosphate synthase MoaC
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_01450  TCRTETA       35     8e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 34.8 bits (80), Expect = 8e-04
Identities = 47/264 (17%), Positives = 93/264 (35%), Gaps = 28/264 (10%)

Query: 77 AIVFGRLGDLVGRKHTFLITIVIMGISTFVVGFLPGYASIGIAAPVIFIAMRLLQGLALG 136
A V G L D GR+ L+++ + ++ P V++I R++ G+ G
Sbjct: 60 APVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAP-------FLWVLYIG-RIVAGIT-G 110

Query: 137 GEYGGAATYVAEHAPSHRRGFYTSWIQTTATLGLFLSLLVILGVRTAIGEEAFGSWGWRV 196
A Y+A+ R + ++ G+ + G +G G +
Sbjct: 111 ATGAVAGAYIADITDGDERARHFGFMSACFGFGM------VAG--PVLGGLM-GGFSPHA 161

Query: 197 PFVASILLLAVSVWIRLQLNESPVFLRIKAEGKTSKAPLTEAFGQWKNLKIVILALIGLT 256
PF A+ L ++ L + + + PL + + L
Sbjct: 162 PFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASF-----RWARGMTVVAALM 216

Query: 257 AGQAVVWYTGQFYA---LFFLTQTLKVDGASANILIALALLIGTPF-FVFFGSLSDRIGR 312
A ++ GQ A + F D + I +A ++ + + G ++ R+G
Sbjct: 217 AVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGE 276

Query: 313 KPIILAGCLIAALTYFPLFKALTH 336
+ ++ G +IA T + L T
Sbjct: 277 RRALMLG-MIADGTGYILLAFATR 299



Score = 34.8 bits (80), Expect = 8e-04
Identities = 17/42 (40%), Positives = 24/42 (57%)

Query: 287 ILIALALLIGTPFFVFFGSLSDRIGRKPIILAGCLIAALTYF 328
IL+AL L+ G+LSDR GR+P++L AA+ Y
Sbjct: 47 ILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYA 88


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_01455  PF06580       48     5e-08    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 47.9 bits (114), Expect = 5e-08
Identities = 49/229 (21%), Positives = 86/229 (37%), Gaps = 53/229 (23%)

Query: 300 LAGLRTQAEF-ALRHEVNA-------DVARSLEQIATSSEQAARLVTQLLALARAENRAT 351
+A + +A+ AL+ ++N + R+L I +A ++T L L R
Sbjct: 154 MASMAQEAQLMALKAQINPHFMFNALNNIRAL--ILEDPTKAREMLTSLSELMRY----- 206

Query: 352 GLTFEPVEIASLARQ--AVRDWV---QAALAKQMDLGYEGPDTDAPLRIDGQPVMLREML 406
L + SLA + V ++ ++ + +++ P ML +
Sbjct: 207 SLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQV---PPML---V 260

Query: 407 GNLIDNAIRY----TPAGGRITVRVRAERAAGAVHLEVEDTGPGIPPNERERVVERFYRI 462
L++N I++ P GG+I ++ + G V LEVE+TG N +E
Sbjct: 261 QTLVENGIKHGIAQLPQGGKILLKGTKDN--GTVTLEVENTGSLALKNTKE--------- 309

Query: 463 LGREGDGSGLGLAIVRE-IVAQHGGTLTIDDNVYQTSPRLAGTLVRVSI 510
+G GL VRE + +G I S + V I
Sbjct: 310 ------STGTGLQNVRERLQMLYGTEAQIK-----LSEKQGKVNAMVLI 347


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_01460  HTHFIS        99     6e-26    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 98.8 bits (246), Expect = 6e-26
Identities = 35/118 (29%), Positives = 64/118 (54%), Gaps = 1/118 (0%)

Query: 2 RILIAEDDSILADGLTRSLRQSGYAVDHVRNGVEADTALSMQTFDLLILDLGLPRMSGLE 61
IL+A+DD+ + L ++L ++GY V N ++ DL++ D+ +P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 VLRRLRARNSNLPVLILTAADSVDERVKGLDLGADDYMAKPFALNE-LEARVRALTRR 118
+L R++ +LPVL+++A ++ +K + GA DY+ KPF L E + RAL
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEP 122


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_01495  BCTERIALGSPG  41     6e-07    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 40.6 bits (95), Expect = 6e-07
Identities = 17/55 (30%), Positives = 29/55 (52%), Gaps = 5/55 (9%)

Query: 2 RARGFTLIELMIVLAIVGVVAAYAIPAYQDYLARSRVGEGLALAASARLAVAENA 56
+ RGFTL+E+M+V+ I+GV+A+ +P + A + + ENA
Sbjct: 6 KQRGFTLLEIMVVIVIIGVLASLVVPNLM-----GNKEKADKQKAVSDIVALENA 55


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_01500  PF06580       29     0.046    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 29.1 bits (65), Expect = 0.046
Identities = 17/107 (15%), Positives = 41/107 (38%), Gaps = 14/107 (13%)

Query: 205 AALSALLSVGLALTVSRGPWLQVG-----------VMVVAGFWMAFA-QARRDPA--ASR 250
+ + +L+ + R WL++ +V+ W R A ++
Sbjct: 49 SLMGLVLTHAYRSFIKRQGWLKLNMGQIILRVLPACVVIGMVWFVANTSIWRLLAFINTK 108

Query: 251 ARAWAIPVVLGVLFVAVNVAVRWANVHYHLGLAESAADRMRDAGQIA 297
A+ +P+ L ++F V V W+ +++ ++ D ++A
Sbjct: 109 PVAFTLPLALSIIFNVVVVTFMWSLLYFGWHFFKNYKQAEIDQWKMA 155


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_01525  PF03544       39     1e-06    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 39.2 bits (91), Expect = 1e-06
Identities = 18/97 (18%), Positives = 28/97 (28%), Gaps = 2/97 (2%)

Query: 18 AGCAAFAPRDAAKLECTMPVAAYPENAKPLERRATVLVRAMITASGNAENVTVTTSSRNA 77
+ L P YP A+ L V V+ +T G +NV + ++
Sbjct: 147 SKPVTSVASGPRALSRNQPQ--YPARAQALRIEGQVKVKFDVTPDGRVDNVQILSAKPAN 204

Query: 78 AADRAAVDAMSRIACSQTPARGGEPYPFTLTRPFVFE 114
+R +AM R G E
Sbjct: 205 MFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGTTE 241


6     E4F39_01785  E4F39_01855  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_01785  0       11       3.180584   sn-glycerol-3-phosphate ABC transporter
E4F39_01790  0       12       3.803566   hypothetical protein
E4F39_01795  0       12       4.510914   haloacid dehalogenase-like hydrolase
E4F39_01800  -1      14       4.457561   LysR family transcriptional regulator
E4F39_01805  0       14       5.341761   serine hydrolase
E4F39_01815  0       15       5.384680   MFS transporter
E4F39_01820  0       14       4.927766   sugar-binding transcriptional regulator
E4F39_01825  1       14       5.212913   xylulokinase
E4F39_01830  1       13       4.937496   mannitol dehydrogenase family protein
E4F39_01835  0       13       5.199284   LysR family transcriptional regulator
E4F39_01840  1       13       4.436662   benzoylformate decarboxylase
E4F39_01845  2       13       3.801917   aldehyde dehydrogenase
E4F39_01850  3       13       3.621989   2-dehydropantoate 2-reductase
E4F39_01855  2       12       2.579494   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_01790  PF05272       30     0.017    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.017
Identities = 14/35 (40%), Positives = 17/35 (48%)

Query: 32 VVFVGPSGCGKSTLMRMIAGLEEISGGELLIDGAK 66
VV G G GKSTL+ + GL+ S I K
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGK 633


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_01800  PF06776       30     0.020    Invasion associated locus B
		>PF06776#Invasion associated locus B

Length = 214

Score = 29.5 bits (66), Expect = 0.020
Identities = 11/49 (22%), Positives = 15/49 (30%), Gaps = 2/49 (4%)

Query: 1 MKTGRRHFVRSVASASAALAAAAWSPARAAIDAPASPATALSLTPGRWS 49
+ + RR R+ A A A A A A+ G W
Sbjct: 38 LASCRRLARRNGARLMLAGAMAI--ALSFGWSDRADAQGAVRSVHGDWQ 84



Score = 28.7 bits (64), Expect = 0.040
Identities = 7/37 (18%), Positives = 13/37 (35%)

Query: 10 RSVASASAALAAAAWSPARAAIDAPASPATALSLTPG 46
+++ A L+ S R A A A ++
Sbjct: 25 KAIQMGPAELSPMLASCRRLARRNGARLMLAGAMAIA 61


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_01810  BLACTAMASEA   30     0.019    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 29.8 bits (67), Expect = 0.019
Identities = 11/35 (31%), Positives = 15/35 (42%)

Query: 57 REDALFRFASVSKPIVSAAAMRAVAAGKLDLDASI 91
R D F S K ++ A + V AG L+ I
Sbjct: 57 RADERFPMMSTFKVVLCGAVLARVDAGDEQLERKI 91


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_01815  TCRTETB       35     4e-04    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 35.2 bits (81), Expect = 4e-04
Identities = 31/155 (20%), Positives = 59/155 (38%), Gaps = 5/155 (3%)

Query: 26 LLALATAGFITIVTEALPAGLLPLMGRDLRVSDALVGQLVTVYAAGSIVAAIPLVAATRG 85
L+ L F +++ E + LP + D A + T + + +
Sbjct: 16 LIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQ 75

Query: 86 MRRRPLLLAALAGFVVANTATAASPYYAPVLV-ARCVAGVSAGLLWALLAGYASRMVDAR 144
+ + LLL + + + +L+ AR + G A AL+ +R +
Sbjct: 76 LGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKE 135

Query: 145 QRGRAIAIAMLGAPVAMSVGI-PL-GTALGAALGW 177
RG+A ++G+ VAM G+ P G + + W
Sbjct: 136 NRGKAFG--LIGSIVAMGEGVGPAIGGMIAHYIHW 168


7     E4F39_02085  E4F39_02225  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_02085  0       13       4.391134   chromate efflux transporter
E4F39_02090  -1      15       4.069426   Crp/Fnr family transcriptional regulator
E4F39_02095  0       17       3.234062   DUF962 domain-containing protein
E4F39_02100  -1      16       3.861026   2-dehydropantoate 2-reductase
E4F39_02105  0       10       3.137674   hypothetical protein
E4F39_02110  -1      16       2.646212   3-oxoacyl-ACP reductase
E4F39_02115  0       12       1.243216   molecular chaperone GroEL
E4F39_02120  0       11       1.668477   class I SAM-dependent methyltransferase
E4F39_02125  0       10       3.030598   EAL domain-containing protein
E4F39_02130  -1      11       2.932978   DHA2 family efflux MFS transporter permease
E4F39_02135  -3      13       2.500627   DUF2917 domain-containing protein
E4F39_02140  1       9        1.934606   helix-turn-helix domain-containing protein
E4F39_02150  2       12       2.824114   GntR family transcriptional regulator
E4F39_02155  6       33       2.387351   aldo/keto reductase
E4F39_02160  6       34       2.105923   hypothetical protein
E4F39_02165  9       39       2.909920   hypothetical protein
E4F39_02170  13      47       3.871856   hypothetical protein
E4F39_02175  6       26       1.084925   molybdenum cofactor biosynthesis protein A
E4F39_02180  6       25       1.530236   2-dehydropantoate 2-reductase
E4F39_02185  1       12       -0.510870  elongation factor G
E4F39_02190  1       12       -1.165052  hypothetical protein
E4F39_02195  2       13       -0.359306  DUF192 domain-containing protein
E4F39_02200  2       16       -2.483582  hypothetical protein
E4F39_02205  2       20       -1.900660  pseudouridine synthase
E4F39_02210  -1      12       -0.884710  NADP-dependent isocitrate dehydrogenase
E4F39_02215  1       11       -1.514963  GNAT family acetyltransferase
E4F39_02220  0       7        -2.023055  multicopper oxidase family protein
E4F39_02225  -1      8        -3.226392  cold-shock protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_02140  TCRTETB       138    3e-38    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 138 bits (349), Expect = 3e-38
Identities = 92/408 (22%), Positives = 171/408 (41%), Gaps = 15/408 (3%)

Query: 17 VMLWLVATGFFMQTLDATIVNTALPSMAASLGESPLRMQSVVIAYSLTMAVMIPVSGWLA 76
+++WL FF L+ ++N +LP +A + P V A+ LT ++ V G L+
Sbjct: 15 ILIWLCILSFF-SVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLS 73

Query: 77 DTLGTRRVFFSAILIFTLGSLLCANAHT-LPLLVAFRVIQGVGGAMLLPVGRLAVLRTFP 135
D LG +R+ I+I GS++ H+ LL+ R IQG G A + + V R P
Sbjct: 74 DQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIP 133

Query: 136 AERYLPALSFVAIPGLIGPLIGPTLGGWLVKIASWHWIFLINVPVGIAGCIATFYSMPDS 195
E A + +G +GP +GG + HW +L+ +P+ + +
Sbjct: 134 KENRGKAFGLIGSIVAMGEGVGPAIGGMIAH--YIHWSYLLLIPMITIITVPFLMKLLKK 191

Query: 196 RNPAAGRFDLKGYLLLTIGMIAISLSLDGLADLGMQHAMVLVLLILSLACFVAYGLYAVR 255
G FD+KG +L+++G++ L L++ +LS FV + +
Sbjct: 192 EVRIKGHFDIKGIILMSVGIVFFMLFTT------SYSISFLIVSVLSFLIFVKHIR---K 242

Query: 256 APQPIFSLELFGIHTFSVGLLGNLFARIGSGAMPYLIPLLLQVSLGYGAFEAG-LMMLPV 314
P L F +G+L ++P +++ E G +++ P
Sbjct: 243 VTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPG 302

Query: 315 AAAGMFSKRIITVLITRHGYRKVLLANTIMVGLMMASFALVSDAMPTWLKIAQLALFGGF 374
+ + I +L+ R G VL + + + + + + ++ I + + GG
Sbjct: 303 TMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGL 362

Query: 375 NSMQFTAMNTLTLKDLGTGGASSGNSLFSLVQMLSMSLGVTVAGALLA 422
+ + T ++T+ L A +G SL + LS G+ + G LL+
Sbjct: 363 SFTK-TVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLS 409


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_02180  cloacin       46     3e-07    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 46.2 bits (109), Expect = 3e-07
Identities = 42/113 (37%), Positives = 48/113 (42%), Gaps = 2/113 (1%)

Query: 340 GASGGGASGGGTSGGGTSGGGASGGGASGGGASGSGASGSGASGGGASGGGASGGGASGG 399
G G G + G S G GG +G G GG + GSG S GG SG G GG SG
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 400 GASGGGASGGGTSGGGTSGGGTSGGGPSGGGPSGGGTSGGGTSGGGTSGGGTS 452
G GG + GG SG G G ++ P G T G G S G S
Sbjct: 63 GNGGGNGNSGGGSGTG--GNLSAVAAPVAFGFPALSTPGAGGLAVSISAGALS 113



Score = 45.5 bits (107), Expect = 5e-07
Identities = 34/82 (41%), Positives = 38/82 (46%)

Query: 405 GASGGGTSGGGTSGGGTSGGGPSGGGPSGGGTSGGGTSGGGTSGGGTSGGGTSGGGTSGG 464
G G G + G S G GGP+G G GG + G G S GG SG G GG SG
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 465 GTSSGAGHGGHGGGTGGGGGNS 486
G G G+ G G GTGG
Sbjct: 63 GNGGGNGNSGGGSGTGGNLSAV 84



Score = 45.5 bits (107), Expect = 5e-07
Identities = 39/109 (35%), Positives = 45/109 (41%)

Query: 374 SGASGSGASGGGASGGGASGGGASGGGASGGGASGGGTSGGGTSGGGTSGGGPSGGGPSG 433
SG G G + G S G GG +G G GG + G G S GG SG G GG SG
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 434 GGTSGGGTSGGGTSGGGTSGGGTSGGGTSGGGTSSGAGHGGHGGGTGGG 482
G GG + GG SG G + + G S G GG G
Sbjct: 62 HGNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAG 110



Score = 44.7 bits (105), Expect = 8e-07
Identities = 39/102 (38%), Positives = 46/102 (45%)

Query: 295 GASGGGTSGGGASGGGASGGGASGGGASGGGASGGGASGGGASGGGASGGGASGGGTSGG 354
G G G + G S G GG +G G GG + G G S GG SG G GG SG
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 355 GTSGGGASGGGASGGGASGSGASGSGASGGGASGGGASGGGA 396
G GG + GG SG G + S + A G A +GG A
Sbjct: 63 GNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLA 104



Score = 44.7 bits (105), Expect = 9e-07
Identities = 38/102 (37%), Positives = 45/102 (44%)

Query: 190 GTSGRGAAGGGASSGGASGGGASGGGASGGGASGGSASGGGTSGGGASGGGASGGGASGG 249
G GRG G S+ G GG +G G GG + G S GG SG G GG SG
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 250 GTSGGGASGGGTSGGGASGGGASGGGASGGSASGGGASGGGA 291
G GG + GG SG G + + A G A +GG A
Sbjct: 63 GNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLA 104



Score = 44.3 bits (104), Expect = 1e-06
Identities = 38/102 (37%), Positives = 46/102 (45%)

Query: 285 GASGGGASGGGASGGGTSGGGASGGGASGGGASGGGASGGGASGGGASGGGASGGGASGG 344
G G G + G S G GG +G G GG + G G S GG SG G GG SG
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 345 GASGGGTSGGGTSGGGASGGGASGGGASGSGASGSGASGGGA 386
G GG + GG SG G + + A G A + +GG A
Sbjct: 63 GNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLA 104



Score = 43.9 bits (103), Expect = 2e-06
Identities = 38/102 (37%), Positives = 46/102 (45%)

Query: 240 GASGGGASGGGTSGGGASGGGTSGGGASGGGASGGGASGGSASGGGASGGGASGGGASGG 299
G G G + G S G GG +G G GG + G G S + GG SG G GG SG
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 300 GTSGGGASGGGASGGGASGGGASGGGASGGGASGGGASGGGA 341
G GG + GG SG G + + A G A +GG A
Sbjct: 63 GNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLA 104



Score = 43.2 bits (101), Expect = 2e-06
Identities = 42/113 (37%), Positives = 50/113 (44%), Gaps = 1/113 (0%)

Query: 270 GASGGGASGGSASGGGASGGGASGGGASGGGTSGGGASGGGASGGGASGGGASGGGASGG 329
G G G + G+ S G GG +G G GG + G G S GG SG G GG SG
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 330 GASGGGASGGGASGGGASGGGTSGGGTSGGGA-SGGGASGGGASGSGASGSGA 381
G GG + GG SG G + + G A S GA G S S + S A
Sbjct: 63 GNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGALSAA 115



Score = 43.2 bits (101), Expect = 2e-06
Identities = 37/100 (37%), Positives = 44/100 (44%)

Query: 385 GASGGGASGGGASGGGASGGGASGGGTSGGGTSGGGTSGGGPSGGGPSGGGTSGGGTSGG 444
G G G + G S G GG +G G GG + G G S GG SG G GG SG
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 445 GTSGGGTSGGGTSGGGTSGGGTSSGAGHGGHGGGTGGGGG 484
G GG + GG SG G + ++ G T G GG
Sbjct: 63 GNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGG 102



Score = 43.2 bits (101), Expect = 3e-06
Identities = 39/113 (34%), Positives = 45/113 (39%), Gaps = 2/113 (1%)

Query: 315 GASGGGASGGGASGGGASGGGASGGGASGGGASGGGTSGGGTSGGGASGGGASGGGASGS 374
G G G + G S G GG +G G GG + G G S GG SG G GG SG
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 375 GASGSGASGGGASGGGASGGGASGGGASGGGASGGGTSGGGTSGGGTSGGGPS 427
G G + GG SG G G ++ G T G G S G S
Sbjct: 63 GNGGGNGNSGGGSGTG--GNLSAVAAPVAFGFPALSTPGAGGLAVSISAGALS 113



Score = 42.4 bits (99), Expect = 5e-06
Identities = 37/102 (36%), Positives = 44/102 (43%)

Query: 235 GASGGGASGGGASGGGTSGGGASGGGTSGGGASGGGASGGGASGGSASGGGASGGGASGG 294
G G G + G S G GG +G G GG + G G S G SG G GG SG
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 295 GASGGGTSGGGASGGGASGGGASGGGASGGGASGGGASGGGA 336
G GG + GG SG G + + A G A +GG A
Sbjct: 63 GNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLA 104



Score = 42.0 bits (98), Expect = 6e-06
Identities = 32/80 (40%), Positives = 37/80 (46%)

Query: 450 GTSGGGTSGGGTSGGGTSSGAGHGGHGGGTGGGGGNSGGHGNGNGGGASGGSGSGNGHGS 509
G G G + G S G +G G GG G N GGG+ G G G G
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 510 GNGGGNGNGGGGGNGGGSGN 529
GNGGGNGN GGG GG+ +
Sbjct: 63 GNGGGNGNSGGGSGTGGNLS 82



Score = 42.0 bits (98), Expect = 6e-06
Identities = 40/117 (34%), Positives = 46/117 (39%), Gaps = 2/117 (1%)

Query: 255 GASGGGTSGGGASGGGASGGGASGGSASGGGASGGGASGGGASGGGTSGGGASGGGASGG 314
G G G + G S G GG +G GG + G G S GG SG G GG SG
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 315 GASGGGASGGGASGGGASGGGASGGGASGGGASGGGTSGGGTSGGGASGGGASGGGA 371
G GG + GG SG G G ++ G T G G S G S A
Sbjct: 63 GNGGGNGNSGGGSGTG--GNLSAVAAPVAFGFPALSTPGAGGLAVSISAGALSAAIA 117



Score = 42.0 bits (98), Expect = 7e-06
Identities = 37/102 (36%), Positives = 45/102 (44%)

Query: 305 GASGGGASGGGASGGGASGGGASGGGASGGGASGGGASGGGASGGGTSGGGTSGGGASGG 364
G G G + G S G GG +G G GG + G G S GG SG G GG SG
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 365 GASGGGASGSGASGSGASGGGASGGGASGGGASGGGASGGGA 406
G GG + G SG+G + + A G A +GG A
Sbjct: 63 GNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLA 104



Score = 42.0 bits (98), Expect = 7e-06
Identities = 37/102 (36%), Positives = 44/102 (43%)

Query: 230 GTSGGGASGGGASGGGASGGGTSGGGASGGGTSGGGASGGGASGGGASGGSASGGGASGG 289
G G G + G S G GG +G G GG + G G S GG SG GG SG
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 290 GASGGGASGGGTSGGGASGGGASGGGASGGGASGGGASGGGA 331
G GG + GG SG G + + A G A +GG A
Sbjct: 63 GNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLA 104



Score = 41.6 bits (97), Expect = 7e-06
Identities = 36/102 (35%), Positives = 43/102 (42%)

Query: 220 GASGGSASGGGTSGGGASGGGASGGGASGGGTSGGGASGGGTSGGGASGGGASGGGASGG 279
G G + G S G GG +G G GG + G G S GG SG G GG SG
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 280 SASGGGASGGGASGGGASGGGTSGGGASGGGASGGGASGGGA 321
GG + GG SG G + + A G A +GG A
Sbjct: 63 GNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLA 104



Score = 41.6 bits (97), Expect = 8e-06
Identities = 37/102 (36%), Positives = 46/102 (45%)

Query: 215 GASGGGASGGSASGGGTSGGGASGGGASGGGASGGGTSGGGASGGGTSGGGASGGGASGG 274
G G G + G+ S G GG +G G GG + G G S GG SG G GG SG
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 275 GASGGSASGGGASGGGASGGGASGGGTSGGGASGGGASGGGA 316
G GG+ + GG SG G + + G A +GG A
Sbjct: 63 GNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLA 104



Score = 41.6 bits (97), Expect = 9e-06
Identities = 35/100 (35%), Positives = 43/100 (43%)

Query: 200 GASSGGASGGGASGGGASGGGASGGSASGGGTSGGGASGGGASGGGASGGGTSGGGASGG 259
G G + G S G GG +G GG + G G S GG SG G GG SG
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 260 GTSGGGASGGGASGGGASGGSASGGGASGGGASGGGASGG 299
G GG + GG SG G + + + A G A +GG
Sbjct: 63 GNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGG 102



Score = 41.2 bits (96), Expect = 1e-05
Identities = 32/89 (35%), Positives = 41/89 (46%)

Query: 519 GGGGNGGGSGNGTGNGGNNGGGHGNGSSGGGTSGGNGHGNGGGTSSGSGNGGGNGSGHGN 578
GG G G +G + +G NGG G G GG + G GSG+G G G G+
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 579 GGHGNGGGHGNGNGSGGAGNGGANGVGSG 607
G G G G G+G+GG + A V G
Sbjct: 63 GNGGGNGNSGGGSGTGGNLSAVAAPVAFG 91



Score = 40.5 bits (94), Expect = 2e-05
Identities = 38/117 (32%), Positives = 45/117 (38%), Gaps = 2/117 (1%)

Query: 290 GASGGGASGGGTSGGGASGGGASGGGASGGGASGGGASGGGASGGGASGGGASGGGASGG 349
G G G + G S G GG +G G GG + G G S GG SG G GG SG
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 350 GTSGGGTSGGGASGGGASGGGASGSGASGSGASGGGASGGGASGGGASGGGASGGGA 406
G GG +G G G G ++ + G G G S G S A
Sbjct: 63 G--NGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGALSAAIA 117



Score = 40.5 bits (94), Expect = 2e-05
Identities = 40/111 (36%), Positives = 47/111 (42%), Gaps = 2/111 (1%)

Query: 360 GASGGGASGGGASGSGASGSGASGGGASGGGASGGGASGGGASGGGASGGGTSGGGTSGG 419
G G G + G S SG G +G G GG + G G S GG SG G GG SG
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 420 GTSGGGPSGGGPSGGGTSGGGTSGGGTSGGGTSGGGTSGGGTSGGGTSSGA 470
G GG + GG SG G G ++ G T G G S+GA
Sbjct: 63 GNGGGNGNSGGGSGTG--GNLSAVAAPVAFGFPALSTPGAGGLAVSISAGA 111



Score = 38.9 bits (90), Expect = 5e-05
Identities = 40/114 (35%), Positives = 47/114 (41%), Gaps = 2/114 (1%)

Query: 168 GRGNSVGNASGGGTSGGGASGHGTSGRGAAGGGASSGGASGGGASGGGASGGGASGGSAS 227
GRG++ G S G GG +G G G + G G SS GG SG G GG SG
Sbjct: 6 GRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNG 65

Query: 228 GGGTSGGGASGGGASGGGASGGGTSGGGASGGGTSGGGASGGGASGGGASGGSA 281
GG + GG SG G G ++ G T G G S G S A
Sbjct: 66 GGNGNSGGGSGTG--GNLSAVAAPVAFGFPALSTPGAGGLAVSISAGALSAAIA 117



Score = 38.9 bits (90), Expect = 6e-05
Identities = 32/81 (39%), Positives = 40/81 (49%), Gaps = 1/81 (1%)

Query: 494 GGGASGGSGSGNGHGSGNGGGNGNGGGGGNGGGSGNGTGNGGNNGGGHGNGSSGGGTSGG 553
G G +G+ + G+ NGG G G GGG GSG + N GGG G+G GG SG
Sbjct: 4 GDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSEN-NPWGGGSGSGIHWGGGSGH 62

Query: 554 NGHGNGGGTSSGSGNGGGNGS 574
G G + GSG GG +
Sbjct: 63 GNGGGNGNSGGGSGTGGNLSA 83



Score = 38.5 bits (89), Expect = 8e-05
Identities = 41/117 (35%), Positives = 47/117 (40%), Gaps = 4/117 (3%)

Query: 260 GTSGGGASGGGASGGGASGGSASGGGASGGGASGGGASGGGTSGGGASGGGASGGGASGG 319
G G G + G S G G +G G GG + G G S GG SG G GG SG
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 320 GASGGGASGGGASGGGASGGGASGGGASGGGTSGGGTSGGGASGGGASGGGASGSGA 376
G GG G SGGG+ GG A+ S GA G S + S A
Sbjct: 63 GNGGGN----GNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGALSAA 115



Score = 38.2 bits (88), Expect = 8e-05
Identities = 31/85 (36%), Positives = 36/85 (42%), Gaps = 3/85 (3%)

Query: 425 GPSGGGPSGGGTSGGGTSGGGTSGGGTSGGGTSGGGTSGGGTSSGAGHGGHGGGTGGGGG 484
G G G + G S G GG +G G GG + G G S G GG G G GGG
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWG---GGSGSGIHWGGG 59

Query: 485 NSGGHGNGNGGGASGGSGSGNGHGS 509
+ G+G GNG G GN
Sbjct: 60 SGHGNGGGNGNSGGGSGTGGNLSAV 84



Score = 38.2 bits (88), Expect = 9e-05
Identities = 30/81 (37%), Positives = 39/81 (48%), Gaps = 1/81 (1%)

Query: 539 GGHGNGSSGGGTSGGNGHGNGGGTSSGSGNGGGNGSGHGNGGHGNGGGHGNGNGSGGAGN 598
GG G G + G S G+ NGG T G G G +GSG + + GGG G+G GG
Sbjct: 3 GGDGRGHNTGAHSTS-GNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 599 GGANGVGSGNGGHGNGGGHGN 619
G G +GG GG+ +
Sbjct: 62 HGNGGGNGNSGGGSGTGGNLS 82



Score = 38.2 bits (88), Expect = 1e-04
Identities = 33/100 (33%), Positives = 43/100 (43%), Gaps = 2/100 (2%)

Query: 473 GGHGGGTGGGGGNSGGHGNG--NGGGASGGSGSGNGHGSGNGGGNGNGGGGGNGGGSGNG 530
GG G G G ++ G+ NG G G GG+ G+G S N G G G + GG
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 531 TGNGGNNGGGHGNGSSGGGTSGGNGHGNGGGTSSGSGNGG 570
GGN G G+G+ G ++ G S G GG
Sbjct: 63 GNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGG 102



Score = 38.2 bits (88), Expect = 1e-04
Identities = 36/117 (30%), Positives = 43/117 (36%), Gaps = 2/117 (1%)

Query: 265 GASGGGASGGGASGGSASGGGASGGGASGGGASGGGTSGGGASGGGASGGGASGGGASGG 324
G G G + G S GG +G G GG + G G S GG SG G GG SG
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 325 GASGGGASGGGASGGGASGGGASGGGTSGGGTSGGGASGGGASGGGASGSGASGSGA 381
G GG +G G G G ++ G G G S S + A
Sbjct: 63 G--NGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGALSAAIA 117



Score = 37.8 bits (87), Expect = 1e-04
Identities = 38/106 (35%), Positives = 44/106 (41%), Gaps = 1/106 (0%)

Query: 283 GGGASGGGASGGGASGGGTSGGGASGGGASGGGASGGGASGGGASGGGASGGGASGGGAS 342
G G + G S G GG +G G GG + G G S GG SG G GG SG G
Sbjct: 6 GRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNG 65

Query: 343 G-GGASGGGTSGGGTSGGGASGGGASGGGASGSGASGSGASGGGAS 387
G G SGGG+ GG A+ S GA G S +
Sbjct: 66 GGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGA 111



Score = 37.0 bits (85), Expect = 2e-04
Identities = 37/114 (32%), Positives = 42/114 (36%), Gaps = 2/114 (1%)

Query: 178 GGGTSGGGASGHGTSGRGAAGGGASSGGASGGGASGGGASGGGASGGSASGGGTSGGGAS 237
G G + G S G G G G G + G G S GG SG GG SG G
Sbjct: 6 GRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNG 65

Query: 238 GGGASGGGASGGGTSGGGASGGGTSGGGASGGGASGGGASGGSASGGGASGGGA 291
GG + GG SG G G ++ G G G S S G S A
Sbjct: 66 GGNGNSGGGSGTG--GNLSAVAAPVAFGFPALSTPGAGGLAVSISAGALSAAIA 117



Score = 37.0 bits (85), Expect = 2e-04
Identities = 30/80 (37%), Positives = 37/80 (46%), Gaps = 1/80 (1%)

Query: 444 GGTSGGGTSGGGTSGGGTSGGGTSSGAGHGGHGGGTGGGGGNSGGHGNGNGGGASGGSGS 503
GG G +G ++ G +GG T G G G G N G G+G+G GGSG
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 504 GNGHGSGNGGGNGNGGGGGN 523
GNG G G G+G GG
Sbjct: 63 GNG-GGNGNSGGGSGTGGNL 81



Score = 37.0 bits (85), Expect = 2e-04
Identities = 38/117 (32%), Positives = 43/117 (36%), Gaps = 2/117 (1%)

Query: 155 GTTGSTAGGTAATGRGNSVGNASGGGTSGGGASGHGTSGRGAAGGGASSGGASGGGASGG 214
G G A + GN G +G G GG + G G S GG S G GG SG
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 215 GASGGGASGGSASGGGTSGGGASGGGASGGGASGGGTSGGGASGGGTSGGGASGGGA 271
G GG G S G GT G ++ G T G G S G S A
Sbjct: 63 GNGGGN--GNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGALSAAIA 117



Score = 37.0 bits (85), Expect = 2e-04
Identities = 37/109 (33%), Positives = 43/109 (39%), Gaps = 1/109 (0%)

Query: 210 GASGGGASGGGASGGSASGGGTSGGGASGGGASGGGASGGGTSGGGASGGGTSGGGASGG 269
G G G + G S GG +G G GG + G G S GG SG G GG SG
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 270 GASG-GGASGGSASGGGASGGGASGGGASGGGTSGGGASGGGASGGGAS 317
G G G SGG + GG A+ S GA G S +
Sbjct: 63 GNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGA 111



Score = 36.6 bits (84), Expect = 3e-04
Identities = 27/80 (33%), Positives = 32/80 (40%)

Query: 461 TSGGGTSSGAGHGGHGGGTGGGGGNSGGHGNGNGGGASGGSGSGNGHGSGNGGGNGNGGG 520
+ G G G G GG G G + G + G GSG+G G G G
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 521 GGNGGGSGNGTGNGGNNGGG 540
GNGGG+GN G G G
Sbjct: 62 HGNGGGNGNSGGGSGTGGNL 81



Score = 35.8 bits (82), Expect = 5e-04
Identities = 42/109 (38%), Positives = 51/109 (46%), Gaps = 6/109 (5%)

Query: 128 NAGSGRGSSGGANDGNGAIGAAGIGVGGTTGSTAGGTAATGRGNSVGNASGGGTSGGGAS 187
+ G GRG + GA+ +G I GG TG GG A+ G G S N GG SG G
Sbjct: 2 SGGDGRGHNTGAHSTSGNIN------GGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIH 55

Query: 188 GHGTSGRGAAGGGASSGGASGGGASGGGASGGGASGGSASGGGTSGGGA 236
G SG G GG +SGG SG G + + A G A +GG A
Sbjct: 56 WGGGSGHGNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLA 104



Score = 35.5 bits (81), Expect = 7e-04
Identities = 30/86 (34%), Positives = 37/86 (43%), Gaps = 7/86 (8%)

Query: 525 GGSGNGTGNGGNNGGGHGNGSSGGGTSGGNGHGNGGGTSSGSGNGGGNGSGHGNGGHGNG 584
GG G G G ++ G+ NG G G GGG S GSG N G G G
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGL-------GVGGGASDGSGWSSENNPWGGGSGSGIH 55

Query: 585 GGHGNGNGSGGAGNGGANGVGSGNGG 610
G G+G+G+GG G G+G
Sbjct: 56 WGGGSGHGNGGGNGNSGGGSGTGGNL 81



Score = 34.7 bits (79), Expect = 0.001
Identities = 37/115 (32%), Positives = 43/115 (37%), Gaps = 4/115 (3%)

Query: 228 GGGTSGGGASGGGASGGGASGGGTSGGGASGGGTSGGGASGGGASGGGASGGSASGGGAS 287
G G + G S G GG +G G GG + G G S GG SG G G GG
Sbjct: 6 GRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWG---GGSGH 62

Query: 288 GGGASGGGASGG-GTSGGGASGGGASGGGASGGGASGGGASGGGASGGGASGGGA 341
G G G + GG GT G ++ G G G S G S A
Sbjct: 63 GNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGALSAAIA 117



Score = 33.9 bits (77), Expect = 0.002
Identities = 28/70 (40%), Positives = 36/70 (51%)

Query: 551 SGGNGHGNGGGTSSGSGNGGGNGSGHGNGGHGNGGGHGNGNGSGGAGNGGANGVGSGNGG 610
SGG+G G+ G S SGN G +G G GG + G + + G G+ G G
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 611 HGNGGGHGNG 620
HGNGGG+GN
Sbjct: 62 HGNGGGNGNS 71



Score = 32.0 bits (72), Expect = 0.007
Identities = 28/114 (24%), Positives = 39/114 (34%)

Query: 123 GSGGDNAGSGRGSSGGANDGNGAIGAAGIGVGGTTGSTAGGTAATGRGNSVGNASGGGTS 182
G G + +SG N G +G G G+ S+ G G+ + G G
Sbjct: 4 GDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHG 63

Query: 183 GGGASGHGTSGRGAAGGGASSGGASGGGASGGGASGGGASGGSASGGGTSGGGA 236
GG +G+ G G G ++ G G G S S G S A
Sbjct: 64 NGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGALSAAIA 117


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
E4F39_02200  TCRTETOQM  624    0.0      Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.
Length = 639

Score = 624 bits (1610), Expect = 0.0
Identities = 172/683 (25%), Positives = 295/683 (43%), Gaps = 75/683 (10%)

Query: 9 RYRNIGISAHIDAGKTTTTERILFYTGVSHKIGEVHDGAATMDWMEQEQERGITITSAAT 68
+ NIG+ AH+DAGKTT TE +L+ +G ++G V G D E++RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 69 TAFWKGMAGNYPEHRINIIDTPGHVDFTIEVERSMRVLDGACMVYDSVGGVQPQSETVWR 128
+ W+ ++NIIDTPGH+DF EV RS+ VLDGA ++ + GVQ Q+ ++
Sbjct: 62 SFQWEN-------TKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFH 114

Query: 129 QANKYKVPRIAFVNKMDRVGADFFRVQRQIGERLKGVAVPIQIPVGAEEHFQGVVDLVKM 188
K +P I F+NK+D+ G D V + I E+L V Q V M
Sbjct: 115 ALRKMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQ----------KVELYPNM 164

Query: 189 KAIVWDDESQGVKFTYEDIPANLVELAHEWREKMVEAAAEASEELLEKYLTDHNSLTEDE 248
+ + Q + E +++LLEKY+ SL E
Sbjct: 165 CVTNFTESEQ------------------------WDTVIEGNDDLLEKYM-SGKSLEALE 199

Query: 249 IKAALRKRTIANEIVPMLCGSAFKNKGVQAMLDAVIDYLPSPADVPAILGHDLDDKEAER 308
++ R + P+ GSA N G+ +++ + + S
Sbjct: 200 LEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSSTH---------------- 243

Query: 309 HPSDDEPFSALAFKIMTDPFVGQLIFFRVYSGVVESGDTLLNATKDKKERLGRILQMHAN 368
FKI +L + R+YSGV+ D++ + K+K ++ +
Sbjct: 244 --RGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI-KITEMYTSING 300

Query: 369 ERKEIKEVRAGDIAAAVG--LK-EATTGDTLCDPGKPIILEKMEFPEPVISQAVEPKTKA 425
E +I + +G+I LK + GDT P + E++E P P++ VEP
Sbjct: 301 ELCKIDKAYSGEIVILQNEFLKLNSVLGDTKLLPQR----ERIENPLPLLQTTVEPSKPQ 356

Query: 426 DQEKMGLALNRLAQEDPSFRVQTDEESGQTIISGMGELHLEIIVDRMKREFGVEATVGKP 485
+E + AL ++ DP R D + + I+S +G++ +E+ ++ ++ VE + +P
Sbjct: 357 QREMLLDALLEISDSDPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEKYHVEIEIKEP 416

Query: 486 QVAYRETVRTVAEDVEGKFVKQSGGRGQYGHAVIKLEPNP-GKGYEFLDEIKGGVIPREF 544
V Y E + E + + + + P P G G ++ + G + + F
Sbjct: 417 TVIYME---RPLKKAEYTIHIEVPPNPFWASIGLSVSPLPLGSGMQYESSVSLGYLNQSF 473

Query: 545 IPAVNKGIEETLKSGVLAGYPVVDVKVHLTFGSYHDVDSNENAFRMAGSMAFKEAMRRAK 604
AV +GI + G L G+ V D K+ +G Y+ S FRM + ++ +++A
Sbjct: 474 QNAVMEGIRYGCEQG-LYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQVLKKAG 532

Query: 605 PVLLEPMMAVEVETPEDFMGNVMGDLSSRRGIVQGMEDIAGGGGKLVRAEVPLAEMFGYS 664
LLEP ++ ++ P++++ D + + ++ E+P + Y
Sbjct: 533 TELLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ--LKNNEVILSGEIPARCIQEYR 590

Query: 665 TSLRSATQGRATYTMEFKHYAET 687
+ L T GR+ E K Y T
Sbjct: 591 SDLTFFTNGRSVCLTELKGYHVT 613


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_02215  CHLAMIDIAOM6  25     0.024    Chlamydia cysteine-rich outer membrane protein 6 si...
		>CHLAMIDIAOM6#Chlamydia cysteine-rich outer membrane protein 6 signature.
Length = 547

Score = 25.4 bits (55), Expect = 0.024
Identities = 19/71 (26%), Positives = 31/71 (43%), Gaps = 11/71 (15%)

Query: 1 MNKLIAALV--------AGLFATAAFAQASAPAAAS---APAEKAEAASAPAKKAHAKKH 49
MNKLI V A LFA+ + A + ++ + A+ + K A+K+
Sbjct: 1 MNKLIRRAVTIFAVTSVASLFASGVLETSMAESLSTNVISLADTKAKDNTSHKSKKARKN 60

Query: 50 HAKKHAKKAKE 60
H+K+ KE
Sbjct: 61 HSKETPVDRKE 71


8  E4F39_02585  E4F39_02660  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_02585  1       24       3.526167   iron ABC transporter permease
E4F39_02590  3       22       4.320206   ABC transporter ATP-binding protein
E4F39_02595  6       16       6.236330   nicotinate-nucleotide--dimethylbenzimidazole
E4F39_02600  2       15       3.611697   adenosylcobinamide-GDP ribazoletransferase
E4F39_02605  2       15       3.053553   alpha-ribazole phosphatase
E4F39_02610  1       14       3.679022   hypothetical protein
E4F39_02615  1       14       4.515976   hypothetical protein
E4F39_02620  -1      15       4.307254   cobalamin-binding protein
E4F39_02625  -2      13       4.178113   threonine-phosphate decarboxylase
E4F39_02630  -1      13       5.585433   cobalamin biosynthesis protein
E4F39_02635  -1      13       5.019919   bifunctional adenosylcobinamide
E4F39_02640  -1      13       3.392080   cobyric acid synthase
E4F39_02645  -1      12       1.533144   DoxX family protein
E4F39_02650  -2      12       0.835484   ParA family protein
E4F39_02655  0       11       -1.263547  aspartate 1-decarboxylase
E4F39_02660  2       11       -2.190617  pantoate--beta-alanine ligase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_02630  FERRIBNDNGPP  40     8e-06    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 39.5 bits (92), Expect = 8e-06
Identities = 39/186 (20%), Positives = 68/186 (36%), Gaps = 9/186 (4%)

Query: 42 AITLAAPARRVVSLAPHVTELIYAAG----GGAKLVGAVSYSDYPPAAKAIARVGSNKAL 97
A A R+V+L EL+ A G G A + + PP ++ VG
Sbjct: 28 AHAAAIDPNRIVALEWLPVELLLALGIVPYGVADTINYRLWVSEPPLPDSVIDVGLRTEP 87

Query: 98 DLERIAALKPDLIVVWRHGNAEHETERLRALGIPLYFSEPRH-LDDVAASLDKLGLLLGT 156
+LE + +KP +V E A G FS+ + L SL ++ LL
Sbjct: 88 NLELLTEMKPSFMVWSAGYGPSPEMLARIAPGRGFNFSDGKQPLAMARKSLTEMADLLNL 147

Query: 157 HEIASAAADAYRRRIAQLRARYADK--PPVTVFFQAWDKPLITLNGDH-IVSDVIALCGG 213
A Y I ++ R+ + P+ + D + + G + + +++ G
Sbjct: 148 QSAAETHLAQYEDFIRSMKPRFVKRGARPL-LLTTLIDPRHMLVFGPNSLFQEILDEYGI 206

Query: 214 RNVFAR 219
N +
Sbjct: 207 PNAWQG 212


9  E4F39_02985  E4F39_03190  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_02985  2       15       2.056074   MFS transporter
E4F39_02990  0       13       1.580792   YbhB/YbcL family Raf kinase inhibitor-like
E4F39_02995  0       15       2.326278   hypothetical protein
E4F39_03000  0       15       1.935414   hypothetical protein
E4F39_03005  2       15       1.387960   hypothetical protein
E4F39_03010  1       17       1.019428   hypothetical protein
E4F39_03015  7       25       0.254219   hypothetical protein
E4F39_03020  7       24       0.414998   serine protease inhibitor ecotin
E4F39_03025  6       18       1.001503   D-alanyl-D-alanine carboxypeptidase
E4F39_03030  6       19       0.698832   hypothetical protein
E4F39_03040  5       18       0.476473   hypothetical protein
E4F39_03050  3       20       -1.858412  hypothetical protein
E4F39_03055  9       29       -6.527529  hypothetical protein
E4F39_03060  9       30       -7.010306  hypothetical protein
E4F39_03065  10      30       -7.283479  filamentous hemagglutinin N-terminal
E4F39_03070  9       29       -7.212695  hypothetical protein
E4F39_03075  10      30       -7.358293  hypothetical protein
E4F39_03080  10      29       -7.331005  ShlB/FhaC/HecB family hemolysin
E4F39_03085  7       34       -9.098462  hypothetical protein
E4F39_03090  8       43       -8.339490  hypothetical protein
E4F39_03095  7       42       -8.276806  integrase
E4F39_03105  5       52       -8.869772  hypothetical protein
E4F39_03110  5       51       -8.430287  DUF2778 domain-containing protein
E4F39_03115  5       53       -8.310678  *hypothetical protein
E4F39_03120  2       29       -6.444562  hypothetical protein
E4F39_03125  1       28       -6.081756  hypothetical protein
E4F39_03130  1       29       -5.812762  hypothetical protein
E4F39_03145  2       35       -4.603741  hypothetical protein
E4F39_03150  4       35       -3.293431  translation initiation factor IF-1
E4F39_03155  3       30       -3.057445  alpha/beta fold hydrolase
E4F39_03160  3       27       -2.587976  DUF2306 domain-containing protein
E4F39_03165  4       28       -2.765574  rubredoxin
E4F39_03170  2       27       -2.991229  DUF4399 domain-containing protein
E4F39_03175  0       15       -1.205181  ATP-binding cassette domain-containing protein
E4F39_03185  2       11       -0.618302  DNA topoisomerase IV subunit B
E4F39_03190  2       11       0.071148   DNA topoisomerase IV subunit A
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
E4F39_03010  TCRTETB  116    2e-30    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 116 bits (291), Expect = 2e-30
Identities = 76/397 (19%), Positives = 157/397 (39%), Gaps = 14/397 (3%)

Query: 25 LAVLDGAIANVALPTIARDLRASDAASIWIVNAYQLAVTISLLPLASLGDRIGYRRVYIA 84
+VL+ + NV+LP IA D A++ W+ A+ L +I L D++G +R+ +
Sbjct: 25 FSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLF 84

Query: 85 GLMLFTAASLGCALSGT-LPALATLRVIQGFGAAGIMSVNTALVRMIYPSSQLGRGVAIN 143
G+++ S+ + + L R IQG GAA ++ +V P G+ +
Sbjct: 85 GIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLI 144

Query: 144 AMVVALSSAVGPTVASAVLAVAPWPWLFAINVPIGVAAVCGSLRALPANPGRSAPYDFIG 203
+VA+ VGP + + W +L I + I + V ++ L +D G
Sbjct: 145 GSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPM-ITIITVPFLMKLLKKEVRIKGHFDIKG 203

Query: 204 AVMNACVFGLLIVSVDGLGHGGNRASVALTALVAAVIGYF-FVKRQLTQPAPLLPVDLMR 262
++ + V + S ++ L+ +V+ + FVK P + L +
Sbjct: 204 IIL-------MSVGIVFFMLFTTSYS--ISFLIVSVLSFLIFVKHIRKVTDPFVDPGLGK 254

Query: 263 VPIFALSIGTSVASFTSQMLAFVALPFWLQNTLGFSQVQTG-LYMTPWPLVIVVAAPLAG 321
F + + F + +P+ +++ S + G + + P + +++ + G
Sbjct: 255 NIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGG 314

Query: 322 VLSDRYSAGALGGIGLALFASGLFALATIGAHPTPVDIVWRMALCGAGFGLFQSPNNRAI 381
+L DR + IG+ + + + T + + G ++ + +
Sbjct: 315 ILVDRRGPLYVLNIGVTFLSVSFLTASFL-LETTSWFMTIIIVFVLGGLSFTKTVISTIV 373

Query: 382 LSSAPRERAGGASGMLGTARLTGQTLGAALVALIFGI 418
SS ++ AG +L + G A+V + I
Sbjct: 374 SSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSI 410


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
E4F39_03040  cloacin  29     0.006    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 28.5 bits (63), Expect = 0.006
Identities = 19/59 (32%), Positives = 22/59 (37%), Gaps = 1/59 (1%)

Query: 49 GTVNVWGGDGWRDRDHWRGGDDRWHGGWRGSGNWRDGNDWHGGRGNGWQGGRGPAGGRN 107
G + G G D W ++ W GG GSG G HG G G G G N
Sbjct: 23 GPTGLGVGGGASDGSGWSSENNPW-GGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGN 80



Score = 27.8 bits (61), Expect = 0.014
Identities = 21/51 (41%), Positives = 24/51 (47%), Gaps = 3/51 (5%)

Query: 74 GGWRGSGNWRDGNDWHGGRGNGWQGGRGPAGGRNVRGGNDWPDGGGNGRGG 124
G GSG + N W GG G+G G G G GN GGG+G GG
Sbjct: 32 GASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGN---SGGGSGTGG 79


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
E4F39_03080  PF05860  65     3e-14    haemagglutination activity domain.
		>PF05860#haemagglutination activity domain.

Length = 117

Score = 65.2 bits (159), Expect = 3e-14
Identities = 22/138 (15%), Positives = 49/138 (35%), Gaps = 23/138 (16%)

Query: 72 AQVVG-AGANAPSVIQTQNGLQQVNITKPSGAGVSLNTYSQFDVPKQGVIVNNSPTLTNT 130
AQ+ S I T+ + + +G+ + + + +F VP G N+PT
Sbjct: 1 AQITPDTTLPINSNITTEGNTRIIERGTQAGSNLFHS-FQEFSVPTSGTAFFNNPT---- 55

Query: 131 QQAGYINGNPNLSPNGAARIIINQVNSNNPSQLKGYVEIAGQRAEMIISNSSGLVVDGGG 190
+ II++V + S + G + A + + N +G++
Sbjct: 56 ----------------NIQNIISRVTGGSVSNIDGLIRANAT-ANLFLINPNGIIFGQNA 98

Query: 191 FINTSRAILTTGTPNLNA 208
++ + + + L
Sbjct: 99 RLDIGGSFVGSTANRLKF 116


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
E4F39_03095  IGASERPTASE  33     0.005    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 32.7 bits (74), Expect = 0.005
Identities = 14/76 (18%), Positives = 29/76 (38%), Gaps = 2/76 (2%)

Query: 26 PSPADQAAAARANAEQDRQAQQQRDAQQRDAAVRAPSVRSEVPKVEAYPALPAETPCFRI 85
P+PA + AE +Q + + ++DA + ++ EA + A T +
Sbjct: 1028 PAPATPSETTETVAENSKQESKTVEKNEQDAT--ETTAQNREVAKEAKSNVKANTQTNEV 1085

Query: 86 DRFTLDVPNSLPDTTK 101
+ + + TK
Sbjct: 1086 AQSGSETKETQTTETK 1101


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
E4F39_03195  PF06917  27     0.033    Periplasmic pectate lyase
		>PF06917#Periplasmic pectate lyase

Length = 555

Score = 26.8 bits (59), Expect = 0.033
Identities = 21/77 (27%), Positives = 29/77 (37%), Gaps = 8/77 (10%)

Query: 58 AGDMAPHTGHHHLLIDGKPVARGDVIPANDHS----LHFGKPQTETEVRLSPGRHTLTLQ 113
A A +TG GK + R V+ N + F PQ + P T
Sbjct: 229 AYKYAEYTGDAAAAAWGKHLYRQYVLARNPETGLPVYQFSSPQQRQPI---PADDNQTQS 285

Query: 114 -FGDGAHRSYGPEMSQT 129
+GD A R +GPE +
Sbjct: 286 WYGDRAKRQFGPEFGEI 302


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
E4F39_03210  GPOSANCHOR  31     0.024    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 30.8 bits (69), Expect = 0.024
Identities = 19/52 (36%), Positives = 28/52 (53%), Gaps = 7/52 (13%)

Query: 460 ARLEKIKIEKELEELRAEKAKLEELLANESAMKRLMIKE-------IEADAK 504
+R K ++EK LEE ++ A LE+L K+L KE +EA+AK
Sbjct: 391 SREAKKQVEKALEEANSKLAALEKLNKELEESKKLTEKEKAELQAKLEAEAK 442


10  E4F39_03535  E4F39_03690  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_03535  2       19       0.495499   hypothetical protein
E4F39_03540  -1      14       1.072478   FUSC family protein
E4F39_03545  -2      14       2.126862   hypothetical protein
E4F39_03550  -2      12       2.342320   MFS transporter
E4F39_03555  0       12       2.951959   hypothetical protein
E4F39_03560  -1      13       3.224037   carboxymuconolactone decarboxylase family
E4F39_03565  -2      15       2.571263   RNA polymerase sigma-70 factor
E4F39_03570  -1      19       2.517854   hypothetical protein
E4F39_03575  -2      17       0.578940   hypothetical protein
E4F39_03580  0       21       1.915810   hypothetical protein
E4F39_03585  -1      24       1.258398   glycosyltransferase
E4F39_03590  1       28       1.297409   chemotaxis protein
E4F39_03595  2       29       2.343599   hypothetical protein
E4F39_03600  2       32       2.112874   phospholipase C accessory protein PlcR
E4F39_03605  2       33       2.578560   pectinacetylesterase
E4F39_03610  0       23       0.411246   purine permease
E4F39_03615  0       14       0.199822   ribonuclease G
E4F39_03620  1       16       0.093988   septum formation inhibitor Maf
E4F39_03630  1       10       -0.414425  23S rRNA
E4F39_03635  1       10       -0.413162  ribosome silencing factor
E4F39_03640  3       13       0.548650   nicotinate-nucleotide adenylyltransferase
E4F39_03645  0       15       1.673265   oxygen-dependent coproporphyrinogen oxidase
E4F39_03650  -1      16       1.547951   phosphoribosylamine--glycine ligase
E4F39_03655  -3      16       2.537795   YebC/PmpR family DNA-binding transcriptional
E4F39_03660  -3      15       1.107998   uracil phosphoribosyltransferase
E4F39_03665  -1      16       1.593356   SDR family oxidoreductase
E4F39_03670  -2      15       1.198947   methylglyoxal synthase
E4F39_03680  1       14       1.012939   hypothetical protein
E4F39_03685  2       12       1.358684   hypothetical protein
E4F39_03690  3       16       2.419877   quinone oxidoreductase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
E4F39_03565  TCRTETA  61     3e-12    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 61.0 bits (148), Expect = 3e-12
Identities = 39/144 (27%), Positives = 58/144 (40%), Gaps = 6/144 (4%)

Query: 256 VIAACIIVPQAIVAMLSPWVGRSAQRWGRRPILLLGFAALPLRALLFAGVSSPYLLVPVQ 315
++ A + Q A P +G + R+GRRP+LL+ A + + A ++L +
Sbjct: 47 ILLALYALMQFACA---PVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGR 103

Query: 316 MLDGISAAVFGVMLPLIAADVAGGKGRYNLCIGLFGLAAGVGATLSTALAGFAADHFGNA 375
++ GI+ A V I AD+ G R G G G L G F
Sbjct: 104 IVAGITGATGAVAGAYI-ADITDGDERARH-FGFMSACFGFGMVAGPVLGGLMGG-FSPH 160

Query: 376 MSFFGLAAAGALATLLVWFAMPET 399
FF AA L L F +PE+
Sbjct: 161 APFFAAAALNGLNFLTGCFLLPES 184


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
E4F39_03570  PF03544  31     0.004    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 31.5 bits (71), Expect = 0.004
Identities = 15/65 (23%), Positives = 23/65 (35%)

Query: 102 SAPVVEAVAVPVEPASEPIAAPEPASAPEPAETAPPKKPHREAAPPRKPARVAPPAPAPA 161
P E + P + A I P+P P+P ++P R+ P APA
Sbjct: 76 PEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPA 135

Query: 162 PAPAP 166
+
Sbjct: 136 RPTSS 140


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_03670  LPSBIOSNTHSS  39     4e-06    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.
Length = 166

Score = 39.4 bits (92), Expect = 4e-06
Identities = 20/69 (28%), Positives = 29/69 (42%), Gaps = 5/69 (7%)

Query: 31 IGILGGTFDPIHDGHLALARRFAHVLRLTELVLMPAGQPYQKQDVSAAEHRLAMTRAAAA 90
I G+FDPI GHL + R + ++ + P KQ + + + RL A A
Sbjct: 2 NAIYPGSFDPITFGHLDIIERGCRL--FDQVYVAVLRNP-NKQPMFSVQERLEQIAKAIA 58

Query: 91 SLVLPGVAV 99
LP V
Sbjct: 59 H--LPNAQV 65


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_03700  DHBDHDRGNASE  78     2e-19    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.
Length = 261

Score = 78.2 bits (192), Expect = 2e-19
Identities = 71/258 (27%), Positives = 110/258 (42%), Gaps = 6/258 (2%)

Query: 4 GIAGRTALVCAASKGLGRGCAQALAAEGARLVIVARTRETLEAAAADIRAATGADVTAVA 63
GI G+ A + A++G+G A+ LA++GA + V E LE + ++A A
Sbjct: 5 GIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKA-EARHAEAFP 63

Query: 64 CDITTPEGRA---AALAACPQP-DILVTNAGGPPPGDFRDFTHDDWIRALEANMLTPIEL 119
D+ A + P DILV AG PG + ++W N
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 120 IRASVDGMIARRFGRVVNITSSAVKAPIDVLALSNGARSGLTGFVAGLARKVAEHNVTIN 179
R+ M+ RR G +V + S+ P +A +++ F L ++AE+N+ N
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCN 183

Query: 180 NLLPGLFDTDRIATTLAAAAQAQGATVDELRARRTKEIPAKRLGTPDEFGAACAFLCSVH 239
+ PG +TD + +L A + IP K+L P + A FL S
Sbjct: 184 IVSPGSTETD-MQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQ 242

Query: 240 AGYITGQNWLLDGGAYPG 257
AG+IT N +DGGA G
Sbjct: 243 AGHITMHNLCVDGGATLG 260


11  E4F39_03755  E4F39_03880  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_03755  0       14       3.840081   globin
E4F39_03760  -2      11       3.400188   globin
E4F39_03765  0       11       3.237413   alanyl-tRNA editing protein
E4F39_03770  -1      10       3.374291   DUF924 domain-containing protein
E4F39_03775  0       12       3.793728   amidase
E4F39_03780  -1      11       4.355490   MFS transporter
E4F39_03785  0       11       3.019566   glycosyl transferase
E4F39_03790  -3      8        3.243506   hypothetical protein
E4F39_03795  -3      7        3.172285   TIGR00730 family Rossman fold protein
E4F39_03800  -3      8        3.544279   TetR/AcrR family transcriptional regulator
E4F39_03805  -3      9        2.948217   diacylglycerol kinase
E4F39_03810  -4      9        0.919511   glycosyltransferase family 1 protein
E4F39_03815  -4      11       0.604631   UDP-2,3-diacylglucosamine diphosphatase
E4F39_03825  -1      11       -0.623169  RDD family protein
E4F39_03830  0       17       -0.894408  DUF3106 domain-containing protein
E4F39_03840  1       17       1.673169   DUF3619 family protein
E4F39_03845  1       17       0.561252   RNA polymerase sigma factor
E4F39_03850  3       18       0.215790   acetolactate synthase
E4F39_03855  -1      14       -1.112193  acetolactate synthase 3 catalytic subunit
E4F39_03860  -1      13       -2.757021  acetolactate synthase small subunit
E4F39_03865  0       12       -3.701095  ketol-acid reductoisomerase
E4F39_03870  -1      13       -4.284048  phosphatidylserine decarboxylase
E4F39_03880  0       14       -3.270129  CDP-diacylglycerol--serine
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_03810  TCRTETA  34  7e-04  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 34.4 bits (79), Expect = 7e-04
Identities = 30/101 (29%), Positives = 47/101 (46%), Gaps = 9/101 (8%)

Query: 83 ALVIGAYADRAGRKPAMTLTLAMMAVGTGAIAVLPGYETIGVAAPILLVVTRLIQGLAWG 142
A V+GA +DR GR+P + ++LA AV +A P +L + R++ G+ G
Sbjct: 60 APVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLW--------VLYIGRIVAGIT-G 110

Query: 143 GEAGPATTYILEAAPPERRAAYACWQVATQGFAAVAAGLAG 183
A YI + + RA + + A GF VA + G
Sbjct: 111 ATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLG 151


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_03835  HTHTETR  60  4e-13  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 60.0 bits (145), Expect = 4e-13
Identities = 20/73 (27%), Positives = 41/73 (56%)

Query: 7 RRTRERILELSLKLFNEIGEPNVTTTTIAEEMEISPGNLYYHFRNKDDIINSIFAQFEQQ 66
+ TR+ IL+++L+LF++ G + + IA+ ++ G +Y+HF++K D+ + I+ E
Sbjct: 10 QETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESN 69

Query: 67 IERRLRFPEDHRP 79
I + P
Sbjct: 70 IGELELEYQAKFP 82


12  E4F39_03955  E4F39_04030  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_03955  0  11  -3.469949  NADH-quinone oxidoreductase subunit B
E4F39_03960  -1  13  -5.437197  NADH-quinone oxidoreductase subunit C
E4F39_03970  -1  12  -4.881259  NADH-quinone oxidoreductase subunit D
E4F39_03975  -1  14  -2.615384  NADH-quinone oxidoreductase subunit NuoE
E4F39_03980  -1  16  -2.953344  NADH oxidoreductase (quinone) subunit F
E4F39_03985  -1  16  -3.032230  NADH-quinone oxidoreductase subunit G
E4F39_03990  0  16  -2.482883  NADH-quinone oxidoreductase subunit NuoH
E4F39_04000  1  17  -3.259545  NADH-quinone oxidoreductase subunit NuoI
E4F39_04005  1  19  -5.200924  NADH-quinone oxidoreductase subunit J
E4F39_04010  1  18  -4.667521  NADH-quinone oxidoreductase subunit NuoK
E4F39_04015  0  19  -4.595770  NADH-quinone oxidoreductase subunit L
E4F39_04020  0  18  -4.788068  NADH-quinone oxidoreductase subunit M
E4F39_04025  -1  18  -4.160983  NADH-quinone oxidoreductase subunit NuoN
E4F39_04030  -3  15  -3.432314  DUF2818 family protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_04005  OUTRMMBRANEA  30  0.013  Outer membrane protein A signature.
		>OUTRMMBRANEA#Outer membrane protein A signature.

Length = 346

Score = 29.9 bits (67), Expect = 0.013
Identities = 16/96 (16%), Positives = 28/96 (29%), Gaps = 10/96 (10%)

Query: 138 YAVILAGWASNSKYAFLGAMR-------AAAQMVSYEISMGFALVLVLMTAGSLNLSEIV 190
Y GW+ F+ A Y+++ + G +
Sbjct: 29 YTGAKLGWSQYHDTGFINNNGPTHENQLGAGAFGGYQVNPYVGFEMGYDWLGRMPY---K 85

Query: 191 GSQQHGFFAGHGVNFLSWNWLPLLPVFVIYFISGIA 226
GS ++G + GV + P+ IY G
Sbjct: 86 GSVENGAYKAQGVQLTAKLGYPITDDLDIYTRLGGM 121


13  E4F39_04135  E4F39_04220  Y  N  Y  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_04135  2  24  -3.026905  response regulator transcription factor
E4F39_04140  4  29  -4.930779  DUF2863 family protein
E4F39_04145  4  37  -6.914597  hypothetical protein
E4F39_04150  2  32  -6.646645  hypothetical protein
E4F39_04155  3  24  -6.124507  phenylacetate-CoA oxygenase
E4F39_04160  0  19  -5.417697  **M24 family metallopeptidase
E4F39_04180  1  14  -4.253115  fatty acid desaturase
E4F39_04190  1  14  -3.961415  AraC family transcriptional regulator
E4F39_04200  1  14  -3.033311  phosphoglycerate dehydrogenase
E4F39_04205  1  13  -3.015981  hypothetical protein
E4F39_04210  1  15  -1.883169  transposase
E4F39_04215  2  20  -0.829231  hypothetical protein
E4F39_04220  4  30  -0.095613  sulfonate ABC transporter substrate-binding
14  E4F39_04315  E4F39_04510  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_04315  2  15  -0.007630  hypothetical protein
E4F39_04320  2  15  0.453123  transport system membrane protein
E4F39_04325  2  14  0.589365  MdtB/MuxB family multidrug efflux RND
E4F39_04330  2  13  0.550299  MdtA/MuxA family multidrug efflux RND
E4F39_04335  1  16  0.135772  IclR family transcriptional regulator
E4F39_04340  0  11  -0.109662  DUF839 domain-containing protein
E4F39_04345  1  12  1.145070  hypothetical protein
E4F39_04350  1  12  1.242164  hypothetical protein
E4F39_04355  0  12  1.189147  NAD-dependent epimerase/dehydratase family
E4F39_04360  0  13  1.283625  hypothetical protein
E4F39_04365  2  19  3.333867  cysteine dioxygenase
E4F39_04370  -3  13  3.476182  Lrp/AsnC family transcriptional regulator
E4F39_04375  -1  10  2.932966  ABC transporter ATP-binding protein
E4F39_04380  0  12  4.320342  iron ABC transporter permease
E4F39_04385  -1  12  3.855384  Fe(3+) ABC transporter substrate-binding
E4F39_04390  -1  12  3.573147  penicillin-binding protein
E4F39_04395  -1  14  2.877626  hypothetical protein
E4F39_04400  0  16  2.748636  amino acid permease
E4F39_04405  -1  16  2.228486  rod shape-determining protein MreB
E4F39_04410  -1  14  -0.718441  hypothetical protein
E4F39_04415  -2  12  -0.397294  exodeoxyribonuclease V subunit gamma
E4F39_04420  0  12  2.460883  exodeoxyribonuclease V subunit beta
E4F39_04425  1  11  3.336562  exodeoxyribonuclease V subunit alpha
E4F39_04430  0  9  4.490559  aminoacyl-tRNA hydrolase
E4F39_04435  0  9  4.482042  EAL domain-containing protein
E4F39_04440  0  8  5.207088  hypothetical protein
E4F39_04445  0  9  4.936493  glycine zipper 2TM domain-containing protein
E4F39_04450  0  9  4.422525  phosphomethylpyrimidine synthase ThiC
E4F39_04455  1  8  3.737907  pentapeptide MXKDX repeat protein
E4F39_04460  -3  12  0.209038  DUF1223 domain-containing protein
E4F39_04465  -2  12  0.195469  cytochrome b/b6 domain-containing protein
E4F39_04475  -1  11  -0.979073  molybdopterin-binding protein
E4F39_04485  -1  11  -0.554627  hypothetical protein
E4F39_04490  0  11  0.235545  hypothetical protein
E4F39_04495  -1  10  -0.235699  EamA family transporter
E4F39_04500  3  14  0.903692  MFS transporter
E4F39_04505  2  13  0.726713  hypothetical protein
E4F39_04510  2  12  2.054816  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_04335  ACRIFLAVINRP  745  0.0  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 745 bits (1925), Expect = 0.0
Identities = 279/1104 (25%), Positives = 502/1104 (45%), Gaps = 100/1104 (9%)

Query: 3 LARPFITRPVATTLLALGIALAGLFAFVKLPVSPLPQVDFPTILVQASLPGASPETVATS 62
+A FI RP+ +LA+ + +AG A ++LPV+ P + P + V A+ PGA +TV +
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 63 VTSPLERHLGSIADVAEMTSMS-SVGNARIVLQFNLNRDIDGAARDVQAAINAARADLPA 121
VT +E+++ I ++ M+S S S G+ I L F D D A VQ + A LP
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 122 SLKSNPTYRKVNPADSPIMVVSLTS--KTASPAKLYDAASTVLQQSLSQIDGIGQVSLSG 179
++ + S +MV S + + D ++ ++ +LS+++G+G V L G
Sbjct: 121 EVQ-QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG 179

Query: 180 SANPAVRVELEPQALFHYGIGLEDVRAALASANANSPKGAIEAGP------HRYQLYTND 233
+ A+R+ L+ L Y + DV L N G + P +
Sbjct: 180 AQY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQT 238

Query: 234 QATKAAQYKDLVI-AYRNHAAVSLSDVSSVVDSVEDLRNLGLMNGERAVLVILYRSPGAN 292
+ ++ + + + + V L DV+ V E+ + +NG+ A + + + GAN
Sbjct: 239 RFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGAN 298

Query: 293 IIDTIERVKAALPQLTAALPADIQVTPVLDRSRTIRASLADTEHTLIIAVSLVVMVVFLF 352
+DT + +KA L +L P ++V D + ++ S+ + TL A+ LV +V++LF
Sbjct: 299 ALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLF 358

Query: 353 LRNWRATLIPSVAVPISIVGTFGAMYLLGFSLNNLSLMALIVATGFVVDDAIVVLENIAR 412
L+N RATLIP++AVP+ ++GTF + G+S+N L++ +++A G +VDDAIVV+EN+ R
Sbjct: 359 LQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVER 418

Query: 413 HI-ENGTPRLQAAFDGAREVGFTVLSISLSLVAVFLPILLMGGIVGRLFREFALTLSLAI 471
+ E+ P +A ++ ++ I++ L AVF+P+ GG G ++R+F++T+ A+
Sbjct: 419 VMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAM 478

Query: 472 AVSLVVSLTLTPMMCARLLPEAHAPRDE--GRVARWLERGFEWMQRGYERTLSWALRHPF 529
A+S++V+L LTP +CA LL A E G W F+ Y ++ L
Sbjct: 479 ALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTG 538

Query: 530 TILMTLVATIALNIALYIVVPKGFFPQQDTGLMIGGIQADQTTSFQAMKLRFTEMMRIIR 589
L+ +A + L++ +P F P++D G+ + IQ + + + ++
Sbjct: 539 RYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYL 598

Query: 590 ANP-----NVANVAGFT-GGAQTNSGFMFVALKDKPQR---KLSADQVIQQLRPQLAEVA 640
N +V V GF+ G N+G FV+LK +R + SA+ VI + + +L ++
Sbjct: 599 KNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIR 658

Query: 641 GARTFLQAAQDIRAGGRQSNAQYQFT-LLGDSTAELYKWGP-ILTEALQKRPELADVNSD 698
I G + ++ G L + +L A Q L V +
Sbjct: 659 DGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 699 QQQGGLEAMVTIDRATAARLGIKPAQIDNTLYDAFGQRQVSTIYNPLNQYHVVMEVAPQY 758
+ + + +D+ A LG+ + I+ T+ A G V+ + + ++ ++
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 759 WQSPEMLKQIYISTSGGSASGVQTTNAAAGTYVATTARASTAGAAAQSAAAIAADSARNQ 818
PE + ++Y+ ++ G V + +V + R R
Sbjct: 779 RMLPEDVDKLYVRSANGEM--VPFSAFTTSHWVYGSPRLE-----------------RYN 819

Query: 819 ALNSIASSG--KSSASSGAAVSTSKSTMVPLSAIASFGPSTTPLAVNHQGLFVATTISFN 876
L S+ G SSG A++ ++
Sbjct: 820 GLPSMEIQGEAAPGTSSGDAMALMENLAS------------------------------K 849

Query: 877 LPPGVSLSKATQVIYQTMAEVGVPPTIQGSFQGTAQAFQESLKDQPILILAALAAVYIVL 936
LP G+ G + P L+ + V++ L
Sbjct: 850 LPAGIGY------------------DWTGMSYQERLSG----NQAPALVAISFVVVFLCL 887

Query: 937 GILYESYIHPVTILSTLPSAGVGALLGLLLFKTEFSIIALIGVILLIGIVKKNAIMMVDF 996
LYES+ PV+++ +P VG LL LF + + ++G++ IG+ KNAI++V+F
Sbjct: 888 AALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEF 947

Query: 997 AIDA-SRQGKSSFDAIHEACLLRFRPIMMTTMAALLGALPLAFGRGDGAEMRAPLGIAIA 1055
A D ++GK +A A +R RPI+MT++A +LG LPLA G G+ + +GI +
Sbjct: 948 AKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVM 1007

Query: 1056 GGLIVSQMLTLYTTPVVYLYMDRL 1079
GG++ + +L ++ PV ++ + R
Sbjct: 1008 GGMVSATLLAIFFVPVFFVVIRRC 1031



Score = 96.1 bits (239), Expect = 4e-22
Identities = 83/503 (16%), Positives = 167/503 (33%), Gaps = 25/503 (4%)

Query: 2 NLARPFITRPVATTLLALGIALAGLFAFVKLPVSPLPQVDFPTILVQASLP-GASPETVA 60
N + L+ I + F++LP S LP+ D L LP GA+ E
Sbjct: 528 NSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQ 587

Query: 61 TSVTSPLERHLGSIAD----VAEMTSMSSVGNAR----IVLQFNLNRDIDGAARDVQAAI 112
+ + +L + V + S G A+ + + +G +A I
Sbjct: 588 KVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVI 647

Query: 113 NAARADLPASLKSNPTYRKVNPADSPIMVVSLTSKT-----ASPAKLYDAASTVLQQSLS 167
+ A+ +L + + L A + +L +
Sbjct: 648 HRAKMELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQ 707

Query: 168 QIDGIGQVSLSGSAN-PAVRVELEPQALFHYGIGLEDVRAALASANANSPKGAIEAGPHR 226
+ V +G + ++E++ + G+ L D+ +++A +
Sbjct: 708 HPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRV 767

Query: 227 YQLYT---NDQATKAAQYKDLVIAYRNHAAVSLSDVSSVVDSVEDLRNLGLMNGERAVLV 283
+LY L + N V S ++ V L NG ++ +
Sbjct: 768 KKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHW-VYGSPRLERYNGLPSMEI 826

Query: 284 ILYRSPGANIIDTIERVKAALPQLTAALPADIQVTPVLDRSRTIRASLADTEHTLIIAVS 343
+PG + D A + L + LPA I S R S + I+
Sbjct: 827 QGEAAPGTSSGD----AMALMENLASKLPAGIGYD-WTGMSYQERLSGNQAPALVAISFV 881

Query: 344 LVVMVVFLFLRNWRATLIPSVAVPISIVGTFGAMYLLGFSLNNLSLMALIVATGFVVDDA 403
+V + + +W + + VP+ IVG A L + ++ L+ G +A
Sbjct: 882 VVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNA 941

Query: 404 IVVLENI-ARHIENGTPRLQAAFDGAREVGFTVLSISLSLVAVFLPILLMGGIVGRLFRE 462
I+++E + G ++A R +L SL+ + LP+ + G
Sbjct: 942 ILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNA 1001

Query: 463 FALTLSLAIAVSLVVSLTLTPMM 485
+ + + + ++++ P+
Sbjct: 1002 VGIGVMGGMVSATLLAIFFVPVF 1024



Score = 59.9 bits (145), Expect = 5e-11
Identities = 37/225 (16%), Positives = 84/225 (37%), Gaps = 4/225 (1%)

Query: 870 ATTISFNLPPGVSLSKATQVIYQTMAEV--GVPPTIQGS-FQGTAQAFQESLKDQPILIL 926
A + L G + + I +AE+ P ++ T Q S+ + +
Sbjct: 286 AAGLGIKLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLF 345

Query: 927 AALAAVYIVLGILYESYIHPVTILSTLPSAGVGALLGLLLFKTEFSIIALIGVILLIGIV 986
A+ V++V+ + ++ + +P +G L F + + + G++L IG++
Sbjct: 346 EAIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLL 405

Query: 987 KKNAIMMVDFAIDASRQGKSSF-DAIHEACLLRFRPIMMTTMAALLGALPLAFGRGDGAE 1045
+AI++V+ + K +A ++ ++ M +P+AF G
Sbjct: 406 VDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGA 465

Query: 1046 MRAPLGIAIAGGLIVSQMLTLYTTPVVYLYMDRLRVWAEKRRDRR 1090
+ I I + +S ++ L TP + + +
Sbjct: 466 IYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENKGG 510


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_04340  ACRIFLAVINRP  802  0.0  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 802 bits (2073), Expect = 0.0
Identities = 284/1035 (27%), Positives = 498/1035 (48%), Gaps = 31/1035 (2%)

Query: 4 SRVFILRPVGTALLMAAIMLAGLVALRFLPLAALPEVDYPTIQVQTFYPGASPEVMTSSV 63
+ FI RP+ +L +M+AG +A+ LP+A P + P + V YPGA + + +V
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 64 TAPLERQFGQMPSLNQMSSQS-SAGASVITLQFSLDLPLDIAEQEVQAAINAAGNLLPSD 122
T +E+ + +L MSS S SAG+ ITL F DIA+ +VQ + A LLP +
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE 121

Query: 123 LPAPPIYAKVNPADAPVITLAVTSKTLPLTQ--VQDLADTRLAMKISQVSGVGLVSLSGG 180
+ I + + ++ S TQ + D + + +S+++GVG V L G
Sbjct: 122 VQQQGIS-VEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 NRPAVRIQANPLALASYGLNLDDLRTTISNLNVNTPKGNFDGP------TRAYTINANDQ 234
A+RI + L Y L D+ + N G G +I A +
Sbjct: 181 QY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 235 LTSADQYNDAVV-AYKNGRPVMLTDVAKIVAGSENTKLGAWVDAEPAIILNVQRQPGANV 293
+ +++ + +G V L DVA++ G EN + A ++ +PA L ++ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 294 IQTVDNVKAILPKLQESLPAALDVQIVTDRTTMIRAAVRDVQFELGLAVALVVLVMYLFL 353
+ T +KA L +LQ P + V D T ++ ++ +V L A+ LV LVMYLFL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 354 ANVYATIIPSLSVPLSLIGTLAVMYLSGFSLNNLSLMALTIATGFVVDDAIVMIENIARY 413
N+ AT+IP+++VP+ L+GT A++ G+S+N L++ + +A G +VDDAIV++EN+ R
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 414 -VEEGDSALEAALKGSKQIGFTIISLTVSLIAVLIPLLFMGDVVGRLFHEFAITLAVTIV 472
+E+ EA K QI ++ + + L AV IP+ F G G ++ +F+IT+ +
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 473 ISAVVSLTLVPMMCAKLLRHTPPPESHRFEAKVHGLIERV----IERYGVALQWVLDRQR 528
+S +V+L L P +CA LL+ E H + G + Y ++ +L
Sbjct: 480 LSVLVALILTPALCATLLK-PVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTG 538

Query: 529 ATLVVAVLTLALTALLYAVIPKGFFPTQDTGVIQAITQAPQSVSYGAMAERQQALAAEIL 588
L++ L +A +L+ +P F P +D GV + Q P + + + L
Sbjct: 539 RYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYL 598

Query: 589 KH--PDVVSLTSFIGVDGANITLNSGRMLINLKPRDERS---ESAGDVIRSLQRQVANVT 643
K+ +V S+ + G + N+G ++LKP +ER+ SA VI + ++ +
Sbjct: 599 KNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIR 658

Query: 644 GISLYMQPVQDLTIDSTVSPTQYQFMLTS---PNPDEFATWVPKLVDRLRKEPS-LADVA 699
+ P I + T + F L D +L+ + P+ L V
Sbjct: 659 DGFVI--PFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVR 716

Query: 700 TDLQNSGKSVYIEIDRTSAARFGITPATVDNALYDAYGQRIVSTIFTQSNQYRVILESEP 759
+ +E+D+ A G++ + ++ + A G V+ + ++ ++++
Sbjct: 717 PNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADA 776

Query: 760 QMQHYTDSLNGIYLPSAGGGQVPLSAIATFRERPAPLLVSHLSQFPAATISFNLAPGASL 819
+ + + ++ +Y+ SA G VP SA T + + P+ I APG S
Sbjct: 777 KFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSS 836

Query: 820 GEAVKAIDAAERELGLPASFQTRFQGAALAFQASLSNQLFLILAAIVTMYIVLGVLYESY 879
G+A+ ++ +L PA + G + + S + L+ + V +++ L LYES+
Sbjct: 837 GDAMALMENLASKL--PAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESW 894

Query: 880 IHPITILSTLPSAGVGALLALMITGHDLDIIGIIGIVLLIGIVKKNAIMMIDFALEAERV 939
P++++ +P VG LLA + D+ ++G++ IG+ KNAI++++FA +
Sbjct: 895 SIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEK 954

Query: 940 EGKPPREAIYQACLLRFRPILMTTLAALLGAVPLIVGSGAGSELRQPLGIAIAGGLIVSQ 999
EGK EA A +R RPILMT+LA +LG +PL + +GAGS + +GI + GG++ +
Sbjct: 955 EGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSAT 1014

Query: 1000 VLTLFTTPVIYLGFD 1014
+L +F PV ++
Sbjct: 1015 LLAIFFVPVFFVVIR 1029


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_04345  RTXTOXIND  48  4e-08  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 48.3 bits (115), Expect = 4e-08
Identities = 27/149 (18%), Positives = 57/149 (38%), Gaps = 16/149 (10%)

Query: 84 AARGEMPVVLNALGTVTPLANV-TVRTQLSGYLQAVSFQEGQIVKKGDVLAQIDPRP--- 139
+ G++ +V A G +T ++ + ++ + +EG+ V+KGDVL ++
Sbjct: 75 SVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA 134

Query: 140 ----YQISLANAQGALARDEALLATARLDLKRYQTLVAQ---DSIAKQTADTQASLVKQY 192
Q SL A+ R + L + L+ L + +++++ SL+K+
Sbjct: 135 DTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKE- 193

Query: 193 EGTVQIDRAAIDSAKLNLAYARITAPVSG 221
Q + L + A
Sbjct: 194 ----QFSTWQNQKYQKELNLDKKRAERLT 218



Score = 38.3 bits (89), Expect = 5e-05
Identities = 33/182 (18%), Positives = 61/182 (33%), Gaps = 26/182 (14%)

Query: 141 QISLANAQGALARDEALLAT--ARLDLKRYQTLVAQDSIAKQTADTQASLVKQY-EGTVQ 197
+ ++ + L ++L+ + L A++ T + ++ + + T
Sbjct: 251 KHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDN 310

Query: 198 ID--RAAIDSAKLNLAYARITAPVSGRV-GLRQVDPGNYVTPSDT--------NGIVVIT 246
I + + + I APVS +V L+ G VT ++T + + V
Sbjct: 311 IGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTA 370

Query: 247 QLQPMSVIFTTSEDNLPAILKQVGAGGKLSVTAYNRNNTTPLETGV-LDTLDNQIDTATG 305
+Q + F AI+K V A+ L V LD D G
Sbjct: 371 LVQNKDIGFINVG--QNAIIK---------VEAFPYTRYGYLVGKVKNINLDAIEDQRLG 419

Query: 306 TV 307
V
Sbjct: 420 LV 421


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_04380  NUCEPIMERASE  33  5e-04  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 33.2 bits (76), Expect = 5e-04
Identities = 24/123 (19%), Positives = 41/123 (33%), Gaps = 24/123 (19%)

Query: 1 MKIALFGATGMIGSRIAAEAARRGHQVTAL-------------SRNPAASGANVQAKAAD 47
MK + GA G IG ++ GHQV + +R + Q D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 48 LFD---PASIAAALEGQDVVASA------YGPKQEEASKVVAVAKAL--VDGARKAGVKR 96
L D + A+ + V S Y + A + L ++G R ++
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 97 VVV 99
++
Sbjct: 121 LLY 123


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_04495  CHLAMIDIAOM6  32  0.007  Chlamydia cysteine-rich outer membrane protein 6 si...
		>CHLAMIDIAOM6#Chlamydia cysteine-rich outer membrane protein 6 signature.

Length = 547

Score = 32.4 bits (73), Expect = 0.007
Identities = 15/61 (24%), Positives = 24/61 (39%), Gaps = 3/61 (4%)

Query: 562 FNLG-LDPDKAREFHDETLPKDSAKVAHFC--SMCGPHFCSMKITQDVREFAAQQGVSEN 618
F LG + P + R E P + + S CG H + +T + E Q ++
Sbjct: 265 FTLGDMQPGEHRTITVEFCPLKRGRATNIATVSYCGGHKNTASVTTVINEPCVQVSIAGA 324

Query: 619 D 619
D
Sbjct: 325 D 325


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_04530  ACETATEKNASE  27  0.003  Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 27.5 bits (61), Expect = 0.003
Identities = 9/38 (23%), Positives = 17/38 (44%)

Query: 19 GKTGMTGNGGVRNDRSERAAANMTEGDERTRRRTRMMT 56
K+G+ G G+ +D + A GD+R + +
Sbjct: 269 KKSGVYGISGISSDFRDLEDAAFKNGDKRAQLALNVFA 306


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_04545  TCRTETA  45  3e-07  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 44.8 bits (106), Expect = 3e-07
Identities = 60/266 (22%), Positives = 95/266 (35%), Gaps = 11/266 (4%)

Query: 66 YATGMLVLAPLG----DRFDRRTLILLQIAGLSAALVVAAAAPTLGVLAAASLAIGILAT 121
YA AP+ DRF RR ++L+ +AG + + A AP L VL + GI
Sbjct: 52 YALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGA 111

Query: 122 IAQQAVPFAAEIAPPAARGQAVGTVMSGLLLGILLARTAAGFVAEYFGWRAVFAASVAAL 181
A + A+I R + G + + G++ G + + FAA AAL
Sbjct: 112 TGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAA--AAL 169

Query: 182 AALAAVIVA-RLPRSSPTSTLPYGKLLASMWQLVRELRGLR--EASMTGGAIFAAFSAFW 238
L + LP S P + + R RG+ A M I
Sbjct: 170 NGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVP 229

Query: 239 PVLTLLLAGAPFHLGPQAAGL-FGIVGAAGALAAPY-AGRFADKRGPRAIISLAIALIAA 296
L ++ FH G+ G +LA G A + G R + L +
Sbjct: 230 AALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGT 289

Query: 297 SFAIFALSGASLIGLVIGVIVLDVGV 322
+ + A + + I V++ G+
Sbjct: 290 GYILLAFATRGWMAFPIMVLLASGGI 315


15  E4F39_04620  E4F39_04720  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_04620  3  20  2.372948  FMN-binding negative transcriptional regulator
E4F39_04625  3  24  1.174080  PLP-dependent aminotransferase family protein
E4F39_04630  3  23  1.078309  Lrp/AsnC family transcriptional regulator
E4F39_04635  0  14  -0.479711  flagellar hook-basal body protein FliE
E4F39_04640  -1  16  -0.279218  glutamine--fructose-6-phosphate transaminase
E4F39_04650  -2  12  0.331919  carotenoid oxygenase
E4F39_04655  -2  11  0.362692  DUF2239 family protein
E4F39_04660  0  9  1.713041  LysR family transcriptional regulator
E4F39_04665  0  11  2.909497  SDR family oxidoreductase
E4F39_04670  -2  10  2.689353  hypothetical protein
E4F39_04680  2  12  1.856853  DUF934 domain-containing protein
E4F39_04685  3  14  2.093932  nitrite/sulfite reductase
E4F39_04690  4  14  1.390395  hypothetical protein
E4F39_04695  4  10  2.303049  DUF2970 domain-containing protein
E4F39_04700  4  11  2.493712  PLP-dependent aminotransferase family protein
E4F39_04705  3  11  2.652836  hypothetical protein
E4F39_04710  1  14  3.696615  Hsp20/alpha crystallin family protein
E4F39_04715  1  13  3.998042  Hsp20/alpha crystallin family protein
E4F39_04720  1  17  3.829860  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_04730  DHBDHDRGNASE  68  1e-15  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 68.2 bits (166), Expect = 1e-15
Identities = 75/266 (28%), Positives = 119/266 (44%), Gaps = 19/266 (7%)

Query: 1 MADHSIKGKTVIIAGGAKNLGGLIARDLAAQGAQAVAIHYNSAASKGAAAETVAAIKAAG 60
M I+GK I G A+ +G +AR LA+QGA A+ YN + + V+++KA
Sbjct: 1 MNAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLE----KVVSSLKAEA 56

Query: 61 ARAVALQADLTAAGAVEKLFVDTVAAIGRPDIAINTVGKVLKKPFVEITEAEYDEMAAVN 120
A A AD+ + A++++ +G DI +N G + +++ E++ +VN
Sbjct: 57 RHAEAFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVN 116

Query: 121 SKTAFFFLKEAGRHVND--NGKIVTLVTSLLGAFTPFYAAYAGMKAPVEHFTRAAAKEFG 178
S F + +++ D +G IVT+ ++ G AAYA KA FT+ E
Sbjct: 117 STGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELA 176

Query: 179 ARGISVTAVGPGPMDTPFFYPAEGADAVAYHKTAAALSPFSKTGL--------TDIGDVV 230
I V PG +T + + A +L F KTG+ +DI D V
Sbjct: 177 EYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETF-KTGIPLKKLAKPSDIADAV 235

Query: 231 PFIRHLVSD-GWWITGQTILINGGYT 255
F LVS IT + ++GG T
Sbjct: 236 LF---LVSGQAGHITMHNLCVDGGAT 258


16  E4F39_04770  E4F39_04940  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_04770  3  18  0.872897  hypothetical protein
E4F39_04775  6  18  1.875054  malonate decarboxylase subunit alpha
E4F39_04785  -1  10  0.305130  malonate decarboxylase acyl carrier protein
E4F39_04795  -3  11  1.318674  biotin-independent malonate decarboxylase
E4F39_04805  -2  14  1.672993  biotin-independent malonate decarboxylase
E4F39_04810  -1  15  2.197293  malonate decarboxylase
E4F39_04815  -1  17  2.944016  hypothetical protein
E4F39_04820  0  14  3.072284  hypothetical protein
E4F39_04830  2  15  3.490194  alpha/beta fold hydrolase
E4F39_04835  3  15  3.956604  glutathione S-transferase
E4F39_04840  1  16  4.289817  hypothetical protein
E4F39_04845  1  15  3.924144  hypothetical protein
E4F39_04850  4  19  5.794690  glycine betaine/L-proline transporter ProP
E4F39_04855  3  19  6.415780  proline/betaine transporter
E4F39_04860  3  19  5.891801  YnfA family protein
E4F39_04865  5  20  5.782584  DNA polymerase III subunit epsilon
E4F39_04875  6  22  5.890858  ribonuclease HI
E4F39_04880  2  19  3.993669  class I SAM-dependent methyltransferase
E4F39_04885  1  25  1.830769  hydroxyacylglutathione hydrolase
E4F39_04890  -2  26  1.136713  LysM peptidoglycan-binding domain-containing
E4F39_04895  1  26  0.533721  MFS transporter
E4F39_04900  1  23  1.258982  hypothetical protein
E4F39_04905  -1  11  1.601556  flagellar hook capping protein
E4F39_04910  -1  15  0.367694  AraC family transcriptional regulator
E4F39_04915  -1  12  0.754176  hypothetical protein
E4F39_04920  4  21  0.100745  carbamoyl-phosphate synthase small subunit
E4F39_04925  1  16  -0.080947  leucine efflux protein LeuE
E4F39_04930  2  16  -0.684638  carbamoyl-phosphate synthase large subunit
E4F39_04935  5  19  -1.628129  transcription elongation factor GreA
E4F39_04940  3  17  0.331996  DUF4149 domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_04845  ADHESNFAMILY  30  0.019  Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 30.2 bits (68), Expect = 0.019
Identities = 28/128 (21%), Positives = 42/128 (32%), Gaps = 19/128 (14%)

Query: 390 LKAGEEADARTPAA---LRRGRKLVVQIGE----------TFGEKNAPMFVEQLDALRLA 436
L+ E P A L G I + F EKN + ++LD L
Sbjct: 127 LEGQNEKGKEDPHAWLNLENGIIFAKNIAKQLSAKDPNNKEFYEKNLKEYTDKLDKLDKE 186

Query: 437 DKLALDLAPVMVYGDDVTHVVTEEGIANLLMCRDADEREHAIRGVAGYTEIGRGRDRRLV 496
K + P + +VT EG + I + E + + LV
Sbjct: 187 SKDKFNKIP-----AEKKLIVTSEGAFKYF-SKAYGVPSAYIWEINTEEEGTPEQIKTLV 240

Query: 497 ERLRERGV 504
E+LR+ V
Sbjct: 241 EKLRQTKV 248


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_04930  TCRTETA  49  2e-08  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 49.1 bits (117), Expect = 2e-08
Identities = 84/376 (22%), Positives = 143/376 (38%), Gaps = 47/376 (12%)

Query: 72 PSAQLLATFGTFAAAF-LVRPLGGMVFGPLGDRIGRQRVLAMTMIMMAVGTFAIGLIPSY 130
S + A +G A + L++ V G L DR GR+ VL +++ AV + P
Sbjct: 37 HSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFL 96

Query: 131 DSIGLLAPVLLLVARLVQGFSTGGEYGGAATFIAEFSTDKRR----GFMGSFLEFGTLIG 186
+L + R+V G TG A +IA+ + R GFM + FG + G
Sbjct: 97 W--------VLYIGRIVAGI-TGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAG 147

Query: 187 YVMGAGVVALLTASLSHDALLSWGWRVPFLIAGPLGLIG-LYIRMRLEETPAFKRQAEAR 245
V+G L S PF A L + L L E+ +R+ R
Sbjct: 148 PVLGG-----LMGGFSP--------HAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRR 194

Query: 246 EAQDKAVPKAHFRRQLARHWRALLLCVGLVLIFNVTDYMALSYLPSYLSSTLHFDEAH-G 304
EA + P A FR A L+ V ++ AL + + H+D G
Sbjct: 195 EALN---PLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVI--FGEDRFHWDATTIG 249

Query: 305 LVLILIVMVLMMPMTLATGRLSDAVGRKPVMLAGCVGLFALAIPALLLIRTGETALVFGG 364
+ L ++ + + TG ++ +G + ++ +G+ A +LL + F
Sbjct: 250 ISLAAFGILHSLAQAMITGPVAARLGERRALM---LGMIADGTGYILLAFATRGWMAFPI 306

Query: 365 LLILGALLSCFTGVMPSALPALFPTEI---RYGALAIGFNVSVSLFGGTT-PLAAAWLVD 420
+++L G+ AL A+ ++ R G L G +++ PL +
Sbjct: 307 MVLLA-----SGGIGMPALQAMLSRQVDEERQGQLQ-GSLAALTSLTSIVGPLLFTAIYA 360

Query: 421 ATGNLMMPAYYLMGAA 436
A+ ++ GAA
Sbjct: 361 ASITTWNGWAWIAGAA 376



Score = 29.4 bits (66), Expect = 0.037
Identities = 25/99 (25%), Positives = 44/99 (44%), Gaps = 7/99 (7%)

Query: 289 LPSYLSSTLHFDEAHGLVLILIVMVLMMPMTLA--TGRLSDAVGRKPVMLAGCVGLFALA 346
LP L +H ++ IL+ + +M A G LSD GR+PV+ V L A
Sbjct: 28 LPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVL---LVSLAGAA 84

Query: 347 IPALLLIRTGETALVFGGLLILGALLSCFTGVMPSALPA 385
+ ++ +++ G ++ G ++ TG + A A
Sbjct: 85 VDYAIMATAPFLWVLYIGRIVAG--ITGATGAVAGAYIA 121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_04975  TCRTETA  37  2e-04  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 36.7 bits (85), Expect = 2e-04
Identities = 24/80 (30%), Positives = 36/80 (45%), Gaps = 3/80 (3%)

Query: 52 PLITRELGLS---AADLGLLTSLYFLGFACAQIPAGVLLDRYGPRRVSAALLLVAAAGAW 108
P + R+L S A G+L +LY L G L DR+G R V L AA
Sbjct: 29 PGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYA 88

Query: 109 VFGAAHDIGAMMLGRLLIGV 128
+ A + + +GR++ G+
Sbjct: 89 IMATAPFLWVLYIGRIVAGI 108


17  E4F39_05050  E4F39_05130  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_05050  -1  12  -3.484644  hypothetical protein
E4F39_05055  -1  13  -3.288581  phosphatase PAP2 family protein
E4F39_05060  2  11  -2.910729  hypothetical protein
E4F39_05065  1  11  -2.326234  hypothetical protein
E4F39_05075  0  9  -1.901835  DUF4102 domain-containing protein
E4F39_05080  -2  8  -1.080831  hypothetical protein
E4F39_05085  -3  7  -0.227686  DUF2778 domain-containing protein
E4F39_05090  -2  9  -0.156494  *phosphohistidine phosphatase SixA
E4F39_05095  -2  9  -0.626371  GNAT family N-acetyltransferase
E4F39_05105  3  22  -0.183839  hypothetical protein
E4F39_05110  2  26  -1.429546  lipoprotein
E4F39_05115  1  33  -4.981213  polyphosphate kinase
E4F39_05120  0  33  -4.225479  MATE family efflux transporter
E4F39_05125  0  23  -3.574794  DUF2288 domain-containing protein
E4F39_05130  1  21  -3.329231  AMP-dependent synthetase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_05155  AUTOINDCRSYN  34  3e-04  Autoinducer synthesis protein signature.
		>AUTOINDCRSYN#Autoinducer synthesis protein signature.

Length = 216

Score = 34.4 bits (79), Expect = 3e-04
Identities = 22/140 (15%), Positives = 44/140 (31%), Gaps = 12/140 (8%)

Query: 43 TEDELREAQRLRHSVFAEEMGAHVSGPAGLDVDPFD-PYCDHLLVRDLDTLKVVGTYRVL 101
+E + E LR F + + V G++ D +D +L +T V+ + R +
Sbjct: 13 SETKSGELFTLRKETFKDRLNWAVQCTDGMEFDQYDNNNTTYLFGIKDNT--VICSLRFI 70

Query: 102 PPHQAARVGRLYAEGEFDLSRLTHLRSKMVEVGRSCVHRDY------RSGAVIMALWGGL 155
+ + + +E R V + + L+ +
Sbjct: 71 ETKYPNMITGTFFP---YFKEINIPEGNYLESSRFFVDKSRAKDILGNEYPISSMLFLSM 127

Query: 156 GTYMLQNGYETMLGCASVSM 175
Y GY+ + S M
Sbjct: 128 INYSKDKGYDGIYTIVSHPM 147


18  E4F39_05205  E4F39_05410  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_05205  6  27  -6.412352  SAM-dependent methyltransferase
E4F39_05210  4  26  -5.553901  PLP-dependent aminotransferase family protein
E4F39_05220  1  29  -5.037564  hypothetical protein
E4F39_05225  0  27  -3.623349  flavodoxin family protein
E4F39_05230  1  29  -3.623349  hypothetical protein
E4F39_05235  1  24  -1.238455  malto-oligosyltrehalose synthase
E4F39_05240  1  22  -0.402115  malto-oligosyltrehalose trehalohydrolase
E4F39_05245  1  21  -0.976995  glycogen debranching enzyme GlgX
E4F39_05250  2  22  -1.114802  1,4-alpha-glucan branching protein GlgB
E4F39_05255  3  24  -1.447868  maltose alpha-D-glucosyltransferase
E4F39_05260  1  13  1.421561  alpha-1,4-glucan--maltose-1-phosphate
E4F39_05270  3  11  5.248865  hypothetical protein
E4F39_05280  0  8  5.509293  deoxyribodipyrimidine photolyase
E4F39_05285  0  8  4.807850  PHB depolymerase family esterase
E4F39_05295  -1  12  4.039289  hypothetical protein
E4F39_05305  -1  9  3.315262  hypothetical protein
E4F39_05315  2  11  2.920340  hypothetical protein
E4F39_05320  8  23  -1.239162  hypothetical protein
E4F39_05330  5  25  -3.505882  hypothetical protein
E4F39_05335  3  58  -1.569343  LysR family transcriptional regulator
E4F39_05340  -1  43  -1.268320  DUF3955 domain-containing protein
E4F39_05345  -1  37  -1.550708  hypothetical protein
E4F39_05350  -2  35  -0.827294  hypothetical protein
E4F39_05355  -2  34  -0.740981  IMP dehydrogenase
E4F39_05360  -1  34  -0.638096  response regulator transcription factor
E4F39_05370  0  19  -3.294551  hemagglutinin
E4F39_05375  5  28  -4.366672  OmpA family protein
E4F39_05380  3  15  -0.959753  hypothetical protein
E4F39_05385  2  14  -0.384129  H-NS histone family protein
E4F39_05390  1  12  0.024437  hypothetical protein
E4F39_05395  1  12  0.051721  SCO family protein
E4F39_05400  1  11  0.646578  hypothetical protein
E4F39_05405  2  10  1.080241  DUF1929 domain-containing protein
E4F39_05410  2  12  1.783938  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_05305  PRTACTNFAMLY  30  0.031  Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 30.4 bits (68), Expect = 0.031
Identities = 19/63 (30%), Positives = 26/63 (41%), Gaps = 2/63 (3%)

Query: 213 RTEAPPRTASIVADLDALERFGWHDDAWLRARASLDLAHAPVSIYEVHPESWLRVAAEGN 272
+ RT + A L+A RF D +L +A L + A Y + LRV EG
Sbjct: 754 AVKGKYRTHGVGASLEAGRRFTHADGWFLEPQAELAVFRAGGGAYRA--ANGLRVRDEGG 811

Query: 273 RSA 275
S
Sbjct: 812 SSV 814


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_05330  PF07675  31  0.011  Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 31.2 bits (70), Expect = 0.011
Identities = 18/46 (39%), Positives = 25/46 (54%), Gaps = 3/46 (6%)

Query: 376 SYNVYRNGNKVGSS-TSTAYTDAGLIAGTAYSYTVTEIDPSLGESA 420
+Y +YRN ++ S T T Y D L G Y+Y V ++ GESA
Sbjct: 1260 TYTIYRNNTQIASGVTETTYRDPDLATGF-YTYGV-KVVYPNGESA 1303


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_05395  HTHFIS  82  4e-20  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 81.8 bits (202), Expect = 4e-20
Identities = 35/143 (24%), Positives = 61/143 (42%), Gaps = 4/143 (2%)

Query: 1 MSRQKVVLIYLIEDDEVQARCYAAILQHAGYSVRVLPDGERALREIQRAAPDLIVLDRRL 60
M+ +++ +DD L AGY VR+ + R I DL+V D +
Sbjct: 1 MTGATILVA---DDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVM 57

Query: 61 PDIDGLEIIAWVRERCAPLPILVLTNAVLETDLVEALEAGADDYLIKPPREREFVARV-N 119
PD + +++ +++ LP+LV++ ++A E GA DYL KP E + +
Sbjct: 58 PDENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117

Query: 120 ALRRRASISKQFEGTIEIGGYRI 142
AL + E + G +
Sbjct: 118 ALAEPKRRPSKLEDDSQDGMPLV 140


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_05405  PF03895  39  4e-06  Serum resistance protein DsrA.
		>PF03895#Serum resistance protein DsrA.

Length = 79

Score = 39.4 bits (92), Expect = 4e-06
Identities = 21/77 (27%), Positives = 40/77 (51%)

Query: 1014 VARAAYGGIAAATALTMIPEVDKDKTIAVGIGGGTYRGYQAVALGATARITENIKVRAGV 1073
+++ G+A +AL+M+ + + +V G YR A+A+G +RIT+ +AGV
Sbjct: 1 LSKELQTGLANQSALSMLVQPNGVGKTSVSAAVGGYRDKTALAIGVGSRITDRFTAKAGV 60

Query: 1074 GMSSGGTTAGIGASMQW 1090
++ GAS+ +
Sbjct: 61 AFNTYNGGMSYGASVGY 77


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_05410  OUTRMMBRANEA  127  2e-37  Outer membrane protein A signature.
		>OUTRMMBRANEA#Outer membrane protein A signature.

Length = 346

Score = 127 bits (321), Expect = 2e-37
Identities = 68/151 (45%), Positives = 95/151 (62%), Gaps = 10/151 (6%)

Query: 87 FQCGEPAQPVAQQPQPAPAAAPAAEPIRLNADAMFAFDRADAASMTEQGRQQLSQLAQRL 146
F GE A VA P PAPA + L +D +F F++A + +G+ L QL +L
Sbjct: 191 FGQGEAAPVVA--PAPAPAPEVQTKHFTLKSDVLFNFNKAT---LKPEGQAALDQLYSQL 245

Query: 147 TDRHAQTVSIV--GYTDRLGSDAYNRQLSQARAKTVGDYLIAAGVPADSVHAEGRGASDP 204
++ + S+V GYTDR+GSDAYN+ LS+ RA++V DYLI+ G+PAD + A G G S+P
Sbjct: 246 SNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQSVVDYLISKGIPADKISARGMGESNP 305

Query: 205 LV--QCDQ-RERAALIACLAPNRRVEVVAAG 232
+ CD ++RAALI CLAP+RRVE+ G
Sbjct: 306 VTGNTCDNVKQRAALIDCLAPDRRVEIEVKG 336


19  E4F39_05480  E4F39_05665  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_05480  2  17  -1.019721  lipoprotein
E4F39_05485  6  25  -4.179049  paar motif family protein
E4F39_05490  2  19  -3.114315  hypothetical protein
E4F39_05495  5  29  -5.979724  type VI secretion system tip protein VgrG
E4F39_05500  5  31  -6.944393  hypothetical protein
E4F39_05505  5  32  -6.983254  hypothetical protein
E4F39_05510  4  29  -5.655405  hypothetical protein
E4F39_05515  1  20  -3.174192  divalent metal cation transporter
E4F39_05520  1  23  -0.783679  hypothetical protein
E4F39_05525  0  21  -1.226834  alkaline phosphatase
E4F39_05530  5  36  -6.304908  hypothetical protein
E4F39_05535  5  35  -6.302564  lipoprotein
E4F39_05540  4  31  -5.697652  H-NS histone family protein
E4F39_05545  4  32  -5.731087  hypothetical protein
E4F39_05550  6  44  -9.837335  hydrolase
E4F39_05560  1  26  -2.212879  MFS transporter
E4F39_05565  1  24  -1.457057  hypothetical protein
E4F39_05570  7  38  -2.677057  LysR family transcriptional regulator
E4F39_05575  3  25  -1.390085  hypothetical protein
E4F39_05580  0  15  -1.277721  oxidoreductase
E4F39_05585  -1  15  -1.938553  SCPU domain-containing protein
E4F39_05590  -1  13  -1.688565  SCPU domain-containing protein
E4F39_05595  0  16  -2.454957  SCPU domain-containing protein
E4F39_05600  0  16  -2.120662  molecular chaperone
E4F39_05605  -1  18  -2.487496  fimbrial biogenesis outer membrane usher
E4F39_05615  0  18  -0.640069  SCPU domain-containing protein
E4F39_05620  3  14  0.767991  hypothetical protein
E4F39_05625  4  10  2.391760  response regulator
E4F39_05630  4  11  2.173620  efflux transporter outer membrane subunit
E4F39_05635  2  11  1.751632  DHA2 family efflux MFS transporter permease
E4F39_05640  1  13  2.175991  thiol:disulfide interchange protein
E4F39_05645  1  13  1.919622  HlyD family secretion protein
E4F39_05650  2  12  1.699719  hypothetical protein
E4F39_05655  0  13  2.686974  enoyl-[acyl-carrier-protein] reductase FabV
E4F39_05660  1  12  1.950489  acid phosphatase
E4F39_05665  2  12  2.476294  cytochrome-c peroxidase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_05605  TCRTETB  40  1e-05  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 40.2 bits (94), Expect = 1e-05
Identities = 59/353 (16%), Positives = 120/353 (33%), Gaps = 55/353 (15%)

Query: 27 VDTQMFSLVIPALLTAWGIGKGQAGLIGGATLAAGAIGGLLAGMIADRFGRVRALQITVC 86
++ + ++ +P + + + A + +IG + G ++D+ G R L +
Sbjct: 28 LNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGII 87

Query: 87 WFSLFTFLSAFAQNFEQLLVL-KTLQGLGFGGEWTAGAVLLSETIRARHRGKAMGIVQSA 145
+ + +F LL++ + +QG G V+++ I +RGKA G++ S
Sbjct: 88 INCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSI 147

Query: 146 WGFGWGGAVLLYTLVFSWLPPEWAWRVLFAIGVLPALLVLYIRRAIPEPPRDDAR----- 200
G G + ++ ++ W L I ++ + V ++ + + + R
Sbjct: 148 VAMGEGVGPAIGGMIAHYI----HWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKG 203

Query: 201 ----------------------VAVSTSAAAAQTAPARASAKSIFDPSV------LRMTI 232
+ VS + R DP + + +
Sbjct: 204 IILMSVGIVFFMLFTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPFMIGVL 263

Query: 233 VGGLIGVGAHGGYHAITTWLPTYLKTERHLSVLGTG------AYLAVIIVAFIIGCMTSA 286
GG+I G + +P +K LS G ++VII +I G
Sbjct: 264 CGGIIFGTVAG----FVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGG----- 314

Query: 287 YLQDRIGRRRNLMLFSACCVVTVNLYVMLPLDNVAMLLLGFPLGFFAAGIPAT 339
L DR G +L ++V+ L + + F G+ T
Sbjct: 315 ILVDRRGPL--YVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFT 365


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_05650  PF00577  455  e-150  Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 455 bits (1171), Expect = e-150
Identities = 166/808 (20%), Positives = 267/808 (33%), Gaps = 89/808 (11%)

Query: 37 GTLYLELVVN-ALSTGRIVPVRYRDGVYYARA----GDLAQASVRTGAQP-------DAL 84
GT +++ +N R V D LA + T + DA
Sbjct: 76 GTYRVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDAC 135

Query: 85 VDL-SRLDGVQVEYESAEQRLKLTVPPDWLPRQTLG--SPRLYDRTPAAVSFGLLFNYDV 141
V L S + + + +QRL LT+P ++ + G P L+D A L NY+
Sbjct: 136 VPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINA----GLLNYNF 191

Query: 142 YANSPT--LGTSYTSAWTEQRLFDRWGTVTNTGVYRRDYGGGAGGVGSNRYLRYDTFWRY 199
NS +G + A+ + G Y GS ++ W
Sbjct: 192 SGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLE 251

Query: 200 SDQDRLR-TYTAGDVITGALSWSSAVRLGGVSVERDFKVRPDIVTYPLPQFSGQAAVPTA 258
D LR T GD T + + G + D + PD P G A
Sbjct: 252 RDIIPLRSRLTLGDGYTQGDIFDG-INFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQ 310

Query: 259 VDLFINGSKTTTGQVNPGPFTMNNVPFINGAGEATVVTTDALGRQVATTIPFYVANTLLQ 318
V + NG V PGPFT+N++ +G+ V +A G T+P+ L +
Sbjct: 311 VTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQR 370

Query: 319 KGLSDYSLSAGAMRRDYGIRSFSYGKFAASGTARHGLTDYLTLEGHVEGGERFALGGLGF 378
+G + YS++AG R + T HGL T+ G + +R+ G
Sbjct: 371 EGHTRYSITAGEYRSGNAQQE---KPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGI 427

Query: 379 DLGIGMFGVLGVAATQSRLAGASGRQY---------------------AFGYSYASQRF- 416
+G G L V TQ+ Q+ GY Y++ +
Sbjct: 428 GKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYF 487

Query: 417 SVSLQRIQRTNGFRDLS--------VYDLPANVAYRLVRSSTQATGALNLGALG----GT 464
+ + R NG+ + R Q T LG
Sbjct: 488 NFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSG 547

Query: 465 LGAGYFDVRGADGARTRIANLSYTRPLWRRATLYASVNKTVGEHGVAAQLQLIV--PLG- 521
Y+ D N ++ W TL S+ K + G L L V P
Sbjct: 548 SHQTYWGTSNVDEQFQAGLNTAFEDINW---TLSYSLTKNAWQKGRDQMLALNVNIPFSH 604

Query: 522 ----------EPGVVTGALARDANNSFSERVQYSRSVPSDGGLGWNL--AYAGGGSHYQ- 568
+ +++ D N + ++ D L +++ YAGGG
Sbjct: 605 WLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSG 664

Query: 569 ---QADATWRNRYFQAQGGVYGYGAGRGYARWGEVQGSVVVMDGAVLPANRVDDAFVLID 625
A +R Y A G + + V G V+ V ++D VL+
Sbjct: 665 STGYATLNYRGGYGNANIGYSHSDDIKQL--YYGVSGGVLAHANGVTLGQPLNDTVVLVK 722

Query: 626 TQGRGGVPVRYENQLVGKTDGGGHLLVPWAPSYYAGKYEIDPLDLPSNVRVPIVERRVAV 685
G V ENQ +TD G+ ++P+A Y + +D L NV + V
Sbjct: 723 APGAKDAKV--ENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVP 780

Query: 686 RDHGGALVTFPIRRIVCAQIALVDAAGRPVAIGSRVLHEESGETALVGWQGETYLEGLSA 745
F R + + +P+ G+ V E S + +V G+ YL G+
Sbjct: 781 TRGAIVRAEFKARVG-IKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPL 839

Query: 746 LNHLRVR--TPDGRTCRATFAADVDAAQ 771
++V+ + C A + ++ Q
Sbjct: 840 AGKVQVKWGEEENAHCVANYQLPPESQQ 867


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_05665  HTHFIS  63  1e-12  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 63.3 bits (154), Expect = 1e-12
Identities = 30/122 (24%), Positives = 50/122 (40%), Gaps = 10/122 (8%)

Query: 440 RVLVVDDQEMNRIVLRYQLDALGHHARLCASGDEALRALGTAAYDVVLTDCRMPGMDGIA 499
+LV DD R VL L G+ R+ ++ R + D+V+TD MP +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 500 LTAAIRAH-PDARVRATPIVGVTALVSDAEHARCVDAGMTLCIGKP----TTLDALERAL 554
L I+ PD P++ ++A + + + G + KP + + RAL
Sbjct: 65 LLPRIKKARPD-----LPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119

Query: 555 VE 556
E
Sbjct: 120 AE 121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_05685  TCRTETB  102  2e-25  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 102 bits (256), Expect = 2e-25
Identities = 71/331 (21%), Positives = 140/331 (42%), Gaps = 20/331 (6%)

Query: 41 AFMEVLDTTIVNVALPHIAGTMSASYDEATWTLTSYLVANGIVLPISGFLGRLLGRKRYF 100
+F VL+ ++NV+LP IA + W T++++ I + G L LG KR
Sbjct: 23 SFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLL 82

Query: 101 VLCIVAFTICSFLCGIATDLGQLIVF-RVLQGLFGGGLQPNQQSIILDTF-PPEQRNRAF 158
+ I+ S + + L++ R +QG G P +++ + P E R +AF
Sbjct: 83 LFGIIINCFGSVIGFVGHSFFSLLIMARFIQGA-GAAAFPALVMVVVARYIPKENRGKAF 141

Query: 159 SISAVAIVVAPVLGPTLGGWITDNFSWRWVFLLNVPIGVLTSLAVIQLVEDPPWKRGRAR 218
+ + + +GP +GG I W +LL +P+ +T + V L++ K R +
Sbjct: 142 GLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPM--ITIITVPFLMKLLK-KEVRIK 196

Query: 219 GLSIDYIGITLIAIGLGCLQVMLDRGEDEDWFASTFIRTFAVLTVAGLVGATFWLLYAKK 278
G D GI L+++G+ + F +++ +F +++V + +
Sbjct: 197 G-HFDIKGIILMSVGIVFFML----------FTTSYSISFLIVSVLSFLIFVKHIRKVTD 245

Query: 279 PVVDLSCLKDRNFALGCVTIATFAVVLYGSAVLVPQLAQQRLGYTAMLAG-LVLSPGALL 337
P VD K+ F +G + + G +VP + + + G +++ PG +
Sbjct: 246 PFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMS 305

Query: 338 ITLEIPIVSKLMPYVQTRFLVCFGFLLLAAS 368
+ + I L+ +++ G L+ S
Sbjct: 306 VIIFGYIGGILVDRRGPLYVLNIGVTFLSVS 336


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_05695  RTXTOXIND  100  6e-25  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 99.5 bits (248), Expect = 6e-25
Identities = 63/414 (15%), Positives = 134/414 (32%), Gaps = 91/414 (21%)

Query: 51 KRPGKKPLVVLAIIVVLLLVGAFVW-WFATRNQVSTDDA--YTDGNAITIAPKVSGYVVA 107
+ P + ++A ++ LV AF+ V+T + G + I P + V
Sbjct: 50 ETPVSRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKE 109

Query: 108 LAIDDNVYVHRGDLLLVIDQRDYQAQVDAARAQLGLAQAQLDAAQVQLDIA------HVQ 161
+ + + V +GD+LL + +A ++ L A+ + Q+ ++
Sbjct: 110 IIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELK 169

Query: 162 FPAQYRQAQA---QIEAAQASFRQALAAYERQHAVDARATSQQAIDVADAQRLTADANVA 218
P + ++ + ++ + ++ Q + + +D A+RLT A +
Sbjct: 170 LPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQ-----KYQKELNLDKKRAERLTVLARIN 224

Query: 219 TARAQA----------------------------RTASLVPQQIRQAQTAVEQRRQQVLQ 250
+ ++R ++ +EQ ++L
Sbjct: 225 RYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILS 284

Query: 251 AQA-----------------------------QLEAAQLALSYCEVRAPSDGWITRRNVQ 281
A+ +L + +RAP + + V
Sbjct: 285 AKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVH 344

Query: 282 -LGSFLQAGAALFAIVTPQ---LWVTANFKESQLERMRAGDRVSVSVDAYP---NLELHG 334
G + L IV P+ L VTA + + + G + V+A+P L G
Sbjct: 345 TEGGVVTTAETLMVIV-PEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVG 403

Query: 335 HVDSIQLGSGSRFSAFPPENATGNFVKIVQRVPVKIAIDGGLPRDPPLGIGLSV 388
V +I L + + G ++ + G ++ PL G++V
Sbjct: 404 KVKNINLDA-------IEDQRLGLVFNVIISIEENCLSTGN--KNIPLSSGMAV 448


20  E4F39_05765  E4F39_05855  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_05765  -1  15  3.327010  iron-containing alcohol dehydrogenase
E4F39_05770  0  13  3.443695  hypothetical protein
E4F39_05775  -1  8  3.359301  acyl-CoA thioesterase
E4F39_05780  -2  8  3.401970  branched-chain amino acid ABC transporter
E4F39_05785  0  10  3.934811  ABC transporter ATP-binding protein
E4F39_05790  1  12  3.183251  5-deoxy-glucuronate isomerase
E4F39_05800  1  14  1.868874  myo-inosose-2 dehydratase
E4F39_05805  2  14  1.599248  3D-(3,5/4)-trihydroxycyclohexane-1,2-dione
E4F39_05815  1  14  2.829799  5-dehydro-2-deoxygluconokinase
E4F39_05820  2  12  3.657643  sugar ABC transporter ATP-binding protein
E4F39_05825  0  19  4.693314  ABC transporter permease
E4F39_05830  1  17  4.822587  sugar ABC transporter substrate-binding protein
E4F39_05835  1  17  4.627896  MurR/RpiR family transcriptional regulator
E4F39_05840  1  15  4.259433  inositol 2-dehydrogenase
E4F39_05850  1  9  4.282234  oxidoreductase
E4F39_05855  1  15  3.075590  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_05865  PHPHTRNFRASE  29  0.040  Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 28.6 bits (64), Expect = 0.040
Identities = 15/90 (16%), Positives = 34/90 (37%), Gaps = 5/90 (5%)

Query: 99 NGATVMVYGEVAGTIQGSPAPLYQRPRFVDDAQW----DAYAERVDAFARYTRAQGV-RL 153
T + + G + P + + A+ ++ +A+ +
Sbjct: 207 KEVTEKIQHGDMVIVDGIEGIVIVNPTEEEVKAYEEKRAAFEKQKQEWAKLVGEPSTTKD 266

Query: 154 GYHHHMGAYVESPADVDRLMASTSDAVGLL 183
G H + A + +P DVD ++A+ + +GL
Sbjct: 267 GAHVELAANIGTPKDVDGVLANGGEGIGLY 296


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_05880  PF05272  30  0.015  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.7 bits (66), Expect = 0.015
Identities = 15/42 (35%), Positives = 19/42 (45%), Gaps = 7/42 (16%)

Query: 41 LLGDNGAGKSTLIKTLAGVHPPSDGQYLVDGKPVLFDSPKDA 82
L G G GKSTLI TL G+ + D + KD+
Sbjct: 601 LEGTGGIGKSTLINTLVGL------DFFSDT-HFDIGTGKDS 635


21  E4F39_06410  E4F39_06495  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_06410  -2  14  -3.624916  Flp family type IVb pilin
E4F39_06415  -1  13  -3.153113  prepilin peptidase
E4F39_06420  1  14  -0.479794  pilus assembly protein
E4F39_06425  4  15  2.021518  Flp pilus assembly protein CpaB
E4F39_06430  4  15  1.543523  type II and III secretion system protein family
E4F39_06435  4  15  1.827112  fimbrial protein
E4F39_06440  3  13  1.829382  CpaF family protein
E4F39_06445  4  18  3.024688  fimbriae-related outer membrane protein
E4F39_06450  1  15  2.768584  type II secretion system F family protein
E4F39_06455  -2  12  0.631559  pilus assembly protein
E4F39_06460  -2  11  2.202799  DUF3613 domain-containing protein
E4F39_06465  -1  11  1.816148  hypothetical protein
E4F39_06470  1  13  2.537913  sigma-54-dependent Fis family transcriptional
E4F39_06475  1  13  3.141185  DUF2968 domain-containing protein
E4F39_06480  1  13  3.130559  RNA chaperone Hfq
E4F39_06485  2  12  3.760649  hypothetical protein
E4F39_06490  1  13  3.376733  hypothetical protein
E4F39_06495  0  12  3.202052  DUF1571 domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_06465  PREPILNPTASE  54  3e-11  Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 53.7 bits (129), Expect = 3e-11
Identities = 31/124 (25%), Positives = 52/124 (41%), Gaps = 10/124 (8%)

Query: 4 LFSIGFFFAWAAAVAIADCRDRRIPNELVLVGLAAVIIFTVCRQNPFGTTLSGALIGGAV 63
+ A+ D +P++L L L ++F + F +L A+IG
Sbjct: 134 TLAALLLTWVLVALTFIDLDKMLLPDQLTLPLLWGGLLFNL--LGGF-VSLGDAVIGAMA 190

Query: 64 GLVSLFPFFAL-------RVMGAADVKVFAVLGAWCGLPALPRLWVVASVAAGVHALALM 116
G + L+ + MG D K+ A LGAW G ALP + +++S+ + L+
Sbjct: 191 GYLVLWSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALPIVLLLSSLVGAFMGIGLI 250

Query: 117 LLTR 120
LL
Sbjct: 251 LLRN 254


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_06480  BCTERIALGSPD  138  2e-37  Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 138 bits (349), Expect = 2e-37
Identities = 58/249 (23%), Positives = 111/249 (44%), Gaps = 11/249 (4%)

Query: 160 VQVDVRVVEFSRSVLKQAGLNFFKQNNGFTFGSFAPAGLASVTGGG----TSSMSVSANI 215
V V+ + E + G+ + +N G T + + +++ G S+
Sbjct: 347 VLVEAIIAEVQDADGLNLGIQWANKNAGMTQFTNSGLPISTAIAGANQYNKDGTVSSSLA 406

Query: 216 PIASAFN-LVVGSATRGLFADLSILEANNLARVLAQPTLVALSGQSASFLAGGEIPVPVP 274
S+FN + G L+ L ++ +LA P++V L A+F G E+PV
Sbjct: 407 SALSSFNGIAAGFYQGNWAMLLTALSSSTKNDILATPSIVTLDNMEATFNVGQEVPVLTG 466

Query: 275 QSLGT-----ISIDWKPYGVGLTLTPTVLSPRRIALKVAPESSQLDFVHSITINGVTVPA 329
+ +++ K G+ L + P + + L++ E S + S + +
Sbjct: 467 SQTTSGDNIFNTVERKTVGIKLKVKPQINEGDSVLLEIEQEVSSVADAAS-STSSDLGAT 525

Query: 330 LTTRRADTTVELGDGESFAIGGLIDRETTSNVDKVPFLGDLPIIGTFFKHLSYQQNDKEL 389
TR + V +G GE+ +GGL+D+ + DKVP LGD+P+IG F+ S + + + L
Sbjct: 526 FNTRTVNNAVLVGSGETVVVGGLLDKSVSDTADKVPLLGDIPVIGALFRSTSKKVSKRNL 585

Query: 390 VIIVTPHLV 398
++ + P ++
Sbjct: 586 MLFIRPTVI 594


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_06485  HTHFIS  38  5e-05  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 38.3 bits (89), Expect = 5e-05
Identities = 11/63 (17%), Positives = 26/63 (41%)

Query: 79 AALRVSHPGLPIVALGSLGEPESALAALRAGVRDFIDFSAPAEDALRITRGLLDHVGDQP 138
++ + P LP++ + + +A+ A G D++ + + I L +P
Sbjct: 67 PRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRRP 126

Query: 139 SRH 141
S+
Sbjct: 127 SKL 129


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_06505  PYOCINKILLER  32  0.004  Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 31.7 bits (71), Expect = 0.004
Identities = 29/86 (33%), Positives = 37/86 (43%), Gaps = 3/86 (3%)

Query: 214 LMNQLKLAPAVRAEIRNDATRIAAAARARQRA-LARPGAPGAAASAGATLAASAAGSNGG 272
MN L A A + R AAA A+++A AA A T A A GS
Sbjct: 203 RMNTLTAAKASIEAAAANKAREQAAAEAKRKAEEQARQQ--AAIRAANTYAMPANGSVVA 260

Query: 273 AAAGKGAVAGAGASAPGAAATATAAA 298
AAG+G + A +A A A + A A
Sbjct: 261 TAAGRGLIQVAQGAASLAQAISDAIA 286


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_06520  HTHFIS  297  3e-98  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 297 bits (763), Expect = 3e-98
Identities = 130/475 (27%), Positives = 204/475 (42%), Gaps = 53/475 (11%)

Query: 19 ADIVDRVARCMSSFDVEVIRADN-EELSAERTAMRPSLAIISVSMIE-SGAAFLRTWQAE 76
A I + + +S +V N L A L + V M + + L +
Sbjct: 13 AAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDLLPRIKKA 72

Query: 77 -IGMPVVWVGA--------------ARDHDPSLYPPEYSHILPLDFTCAELRGMISKLAV 121
+PV+ + A A D+ P P + + ++ + +++
Sbjct: 73 RPDLPVLVMSAQNTFMTAIKASEKGAYDYLPK--PFDLTELIGIIGRA------LAEPKR 124

Query: 122 QLRAHAAKALEPSTLVAHSDCMQALLQEVDTFADCDTNVLLHGETGVGKERIAQLLHEKH 181
+ + + LV S MQ + + + D +++ GE+G GKE +A+ LH+ +
Sbjct: 125 RPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHD-Y 183

Query: 182 SRYGMGEFVPVNCGAIPDGLFESLFFGHAKGSFTGAVGTHKGYFEQAAGGTLFLDEVGDL 241
+ G FV +N AIP L ES FGH KG+FTGA G FEQA GGTLFLDE+GD+
Sbjct: 184 GKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDM 243

Query: 242 PLYQQVKLLRVLEDGAVLRIGATAPVKVDFRLVAASNKKLPQLVKDGLFRADLYYRLAVI 301
P+ Q +LLRVL+ G +G P++ D R+VAA+NK L Q + GLFR DLYYRL V+
Sbjct: 244 PMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVV 303

Query: 302 ELSIPSLEERGPVDKIALFKSFVASIVGEDRLAALPELPYWLAEAVADSYFPGNVRELRN 361
L +P L +R D L + FV E + E + +PGNVREL N
Sbjct: 304 PLRLPPLRDR-AEDIPDLVRHFVQQAEKEGL--DVKRFDQEALELMKAHPWPGNVRELEN 360

Query: 362 LAERVGV------------------------TVRQTGGWDTARLQRLIAHARSAAQPAPA 397
L R+ + + + + + +
Sbjct: 361 LVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFG 420

Query: 398 ESAPDVFVDRSKWDMTERNRVIAALDANGWRRQDTAQHLGISRKVLWEKMRKYQI 452
++ P + E ++AAL A + A LG++R L +K+R+ +
Sbjct: 421 DALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGV 475


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_06525  IGASERPTASE  28  0.042  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 28.1 bits (62), Expect = 0.042
Identities = 19/108 (17%), Positives = 32/108 (29%), Gaps = 9/108 (8%)

Query: 119 LFQQKAFWRVIRTASEARAEAVYRDFAKQSETLAVNELQAAKLESQKALTDRQIAVA--- 175
++A V + + T K E K T++ V
Sbjct: 1067 EVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVT 1126

Query: 176 ------QERASRLQADLSIAREQRAAVATRQKDKLDETVALREQKSER 217
QE++ +Q ARE V ++ T A EQ ++
Sbjct: 1127 SQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKE 1174


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_06530  PF00577  29  0.014  Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 29.0 bits (65), Expect = 0.014
Identities = 19/123 (15%), Positives = 28/123 (22%), Gaps = 9/123 (7%)

Query: 83 GAHGGGGRPGGREGGGHGPYGSHGGSREPRGDGGGYGAREPRGDGGYGSRESRGDGGYGS 142
SH + G YG + Y + GG G+
Sbjct: 603 SHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGN 662

Query: 143 REPRGDGGYGAREPREPREPRESYGAPQEGASTPSGTQERNGNGNGPVIVTRRRRSLGPT 202
G R YG G S ++ +G V+ +LG
Sbjct: 663 SGSTGYATLNYRGG---------YGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQP 713

Query: 203 DGQ 205

Sbjct: 714 LND 716


22  E4F39_06860  E4F39_07420  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_06860  2  15  1.867325  PrkA family serine protein kinase
E4F39_06865  3  14  2.602619  YeaH/YhbH family protein
E4F39_06870  1  13  3.317970  SpoVR family protein
E4F39_06875  1  13  3.750035  hypothetical protein
E4F39_06880  -2  9  0.380356  MFS transporter
E4F39_06885  -1  8  -0.666999  ABC transporter substrate-binding protein
E4F39_06890  -1  9  -2.014513  hypothetical protein
E4F39_06895  0  9  -3.280554  ABC transporter ATP-binding protein
E4F39_06900  0  10  -4.710525  ABC transporter permease subunit
E4F39_06905  0  10  -3.940163  hypothetical protein
E4F39_06910  -1  10  -2.823767  Flp family type IVb pilin
E4F39_06915  -1  11  -1.478402  fimbrial protein
E4F39_06925  -1  11  0.698810  Flp pilus assembly protein CpaB
E4F39_06930  0  12  1.174901  type II and III secretion system protein family
E4F39_06935  2  11  2.399331  hypothetical protein
E4F39_06940  2  12  4.665561  fimbrial protein
E4F39_06945  -2  13  4.124473  CpaF family protein
E4F39_06950  -2  14  4.504368  pilus assembly protein
E4F39_06955  -1  15  5.477275  type II secretion system F family protein
E4F39_06960  3  14  5.170851  tetratricopeptide repeat protein
E4F39_06965  2  15  4.904011  pilus assembly protein
E4F39_06970  4  17  4.180791  pilus assembly protein TadE
E4F39_06975  4  14  5.275815  transporter substrate-binding domain-containing
E4F39_06985  1  13  4.394982  esterase
E4F39_06990  -2  10  3.680800  hypothetical protein
E4F39_06995  -2  11  3.367570  amino acid ABC transporter permease
E4F39_07000  -1  10  3.051801  M23 family peptidase
E4F39_07005  0  9  1.625520  hypothetical protein
E4F39_07010  -1  16  2.075839  TetR family transcriptional regulator
E4F39_07020  0  18  3.818324  MexX/AxyX family multidrug efflux RND
E4F39_07030  -2  12  4.003969  multidrug efflux RND transporter permease
E4F39_07035  -2  11  2.636770  efflux transporter outer membrane subunit
E4F39_07040  -1  12  3.294201  hypothetical protein
E4F39_07045  -2  11  2.480964  hypothetical protein
E4F39_07050  -1  11  2.462988  type-1 fimbrial protein
E4F39_07055  0  11  2.583943  fimbrial biogenesis outer membrane usher
E4F39_07060  -1  11  1.719159  molecular chaperone
E4F39_07065  2  9  2.196097  fimbrial protein
E4F39_07070  3  11  0.842210  thiopurine S-methyltransferase
E4F39_07075  2  12  0.617546  lasso peptide isopeptide bond-forming cyclase
E4F39_07080  1  11  1.284110  lasso peptide biosynthesis B2 protein
E4F39_07085  1  12  1.266075  capistruin family lasso peptide
E4F39_07090  0  10  1.722819  hypothetical protein
E4F39_07095  -2  13  1.502441  helix-turn-helix domain-containing protein
E4F39_07100  -3  15  1.124787  LacI family transcriptional regulator
E4F39_07105  1  23  0.936259  sugar ABC transporter ATP-binding protein
E4F39_07110  2  13  -0.078744  ABC transporter permease
E4F39_07115  1  8  -0.239800  3-hydroxyanthranilate 3,4-dioxygenase
E4F39_07120  2  10  -0.747281  erythritol/L-threitol dehydrogenase
E4F39_07125  0  12  0.334640  hypothetical protein
E4F39_07130  0  11  0.919734  SDR family oxidoreductase
E4F39_07135  3  14  1.504622  hypothetical protein
E4F39_07140  3  16  2.413265  RNA polymerase factor sigma-70
E4F39_07145  4  16  3.083033  MbtH family NRPS accessory protein
E4F39_07155  6  14  4.491032  TauD/TfdA family dioxygenase
E4F39_07165  4  13  5.109895  Fe(3+)-hydroxamate ABC transporter permease
E4F39_07170  3  15  5.580396  siderophore-iron reductase FhuF
E4F39_07175  1  14  4.637021  iron-siderophore ABC transporter
E4F39_07180  1  13  4.207822  DUF1993 domain-containing protein
E4F39_07185  1  13  6.228208  hypothetical protein
E4F39_07190  2  14  6.148931  cyclic peptide export ABC transporter
E4F39_07195  3  14  6.335034  hypothetical protein
E4F39_07200  4  13  5.968020  amino acid adenylation domain-containing
E4F39_07205  6  13  6.786799  amino acid adenylation domain-containing
E4F39_07210  1  12  5.706998  TonB-dependent siderophore receptor
E4F39_07215  -2  11  3.641954  N(5)-hydroxyornithine transformylase PvdF
E4F39_07220  2  14  6.375832  cobyrinate a,c-diamide synthase
E4F39_07225  2  15  6.731125  cob(I)yrinic acid a,c-diamide
E4F39_07230  2  15  6.580913  cobalamin biosynthesis protein CbiG
E4F39_07235  2  16  5.788700  HoxN/HupN/NixA family nickel/cobalt transporter
E4F39_07240  2  16  5.872144  cobalamin biosynthesis protein CobW
E4F39_07245  2  16  5.977515  cobaltochelatase subunit CobN
E4F39_07250  1  14  4.757658  hypothetical protein
E4F39_07255  0  13  2.220282  magnesium chelatase
E4F39_07260  -1  12  2.715099  VWA domain-containing protein
E4F39_07265  1  11  3.087573  hypothetical protein
E4F39_07270  2  11  3.360356  integrase
E4F39_07275  1  12  3.567101  type IV secretion protein Rhs
E4F39_07280  1  11  3.286375  transposase
E4F39_07285  1  10  4.171374  Tn3 family transposase
E4F39_07290  1  11  4.272048  alpha/beta hydrolase
E4F39_07295  0  10  5.332186  hypothetical protein
E4F39_07300  -2  8  4.062812  hypothetical protein
E4F39_07305  3  30  -3.056900  hypothetical protein
E4F39_07310  3  31  -3.878817  chitinase
E4F39_07315  4  30  -5.707781  precorrin-2 C(20)-methyltransferase
E4F39_07320  4  30  -6.175099  precorrin-8X methylmutase
E4F39_07325  4  30  -6.318265  precorrin-3B synthase
E4F39_07335  0  19  -3.429930  bifunctional cobalt-precorrin-7
E4F39_07340  1  15  -2.309483  cobalt-precorrin-5B (C(1))-methyltransferase
E4F39_07345  2  15  4.133927  cobalt-precorrin-6A reductase
E4F39_07350  2  15  4.075938  precorrin-4 C(11)-methyltransferase
E4F39_07355  1  15  4.273772  glycosyl hydrolase
E4F39_07360  0  13  5.530807  MFS transporter
E4F39_07365  1  12  6.042766  MarR family transcriptional regulator
E4F39_07370  3  11  6.967071  branched-chain amino acid ABC transporter
E4F39_07375  1  16  6.913736  MarR family transcriptional regulator
E4F39_07380  2  16  7.804598  glutathione S-transferase
E4F39_07385  3  16  7.378015  LysR family transcriptional regulator
E4F39_07390  3  14  5.483693  heme-binding protein
E4F39_07395  1  12  5.195949  SDR family oxidoreductase
E4F39_07400  1  12  5.053803  short-chain dehydrogenase
E4F39_07405  1  11  3.695470  carbamate kinase
E4F39_07410  1  12  3.503911  ornithine carbamoyltransferase
E4F39_07415  0  12  3.585663  arginine deiminase
E4F39_07420  0  10  3.130946  arginine-ornithine antiporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_06925  TCRTETB  31  0.010  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 31.0 bits (70), Expect = 0.010
Identities = 24/111 (21%), Positives = 44/111 (39%), Gaps = 11/111 (9%)

Query: 268 LVNTAGMHAKTASNVMTAALFVYMLMQPVFGALSDKIGRR----MSMILFGTGAVIGTVP 323
+ N + + V TA + + + V+G LSD++G + +I+ G+VIG V
Sbjct: 40 IANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVG 99

Query: 324 LMHALGGVTSPLVAFGLIVVALAIVSFYTSISGLIKAEMFPPEVRAMGVGL 374
L+ + +F ++ ++ A P E R GL
Sbjct: 100 HSFF------SLLIMARFIQGAGAAAF-PALVMVVVARYIPKENRGKAFGL 143


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_06960  PREPILNPTASE  32  9e-04  Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 32.1 bits (73), Expect = 9e-04
Identities = 31/148 (20%), Positives = 49/148 (33%), Gaps = 18/148 (12%)

Query: 20 LVASWTLASLALADLRTRRLATFAVALVGALYAALALAGAPGDGGFASHAALGAAA---- 75
L+ +W L +L DL L + L+ L G A +GA A
Sbjct: 138 LLLTWVLVALTFIDLDKMLLP--DQLTLPLLWGGLLFNLLGGFVSLGD-AVIGAMAGYLV 194

Query: 76 ----FALGAAMFRAGWIAGGDVKLAAVVFLWAGPAHAWPVAFAIGVGGLAVGAVCIAAGR 131
+ + + GD KL A + W G V + G +G I
Sbjct: 195 LWSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALPIVLLLSSLVGAFMGIGLILLRN 254

Query: 132 VPRVLAWFAPARGVPYGVALAAGGLLAV 159
++ +P+G LA G +A+
Sbjct: 255 H-------HQSKPIPFGPYLAIAGWIAL 275


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_06970  BCTERIALGSPD  143  4e-39  Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 143 bits (361), Expect = 4e-39
Identities = 68/283 (24%), Positives = 116/283 (40%), Gaps = 16/283 (5%)

Query: 127 VVQTLKPYLRQQEALVNRLTLARPIQVHLRVRITEVDRNITQQLGINWSALGA------- 179
+V + E ++ +L + RP QV + I EV LGI W+ A
Sbjct: 322 IVTAAPDVMNDLERVIAQLDIRRP-QVLVEAIIAEVQDADGLNLGIQWANKNAGMTQFTN 380

Query: 180 SGNFVGGLFNGRTLFDTASKAFDLSPSGAFSVVGGFHTSRYSIDG--VLDALDQEGLITM 237
SG + G ++ S A S G Y + +L AL +
Sbjct: 381 SGLPISTAIAGANQYNKDGTVSSSLAS-ALSSFNGIAAGFYQGNWAMLLTALSSSTKNDI 439

Query: 238 LAEPNLTAISGQTASFLAGGEFPIPVAQDTTGA----ITIQFKPYGVSLDFTPTVLADNR 293
LA P++ + A+F G E P+ TT T++ K G+ L P + +
Sbjct: 440 LATPSIVTLDNMEATFNVGQEVPVLTGSQTTSGDNIFNTVERKTVGIKLKVKPQINEGDS 499

Query: 294 ISLKVRPEVSEIDPTNSVTTGSIKVPALTVRRVDTTVELSSGQSFAIGGLLQSKSSDVLA 353
+ L++ EVS + S T+ + R V+ V + SG++ +GGLL SD
Sbjct: 500 VLLEIEQEVSSVADAASSTSSDLGA-TFNTRTVNNAVLVGSGETVVVGGLLDKSVSDTAD 558

Query: 354 ELPGLARLPVLGKLFSSRNYLNDKTEVVVIVTPYIVQPANPGE 396
++P L +PV+G LF S + K +++ + P +++ +
Sbjct: 559 KVPLLGDIPVIGALFRSTSKKVSKRNLMLFIRPTVIRDRDEYR 601


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_06980  HTHFIS  34  0.001  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 33.7 bits (77), Expect = 0.001
Identities = 29/165 (17%), Positives = 52/165 (31%), Gaps = 20/165 (12%)

Query: 22 GARLVAIVADAASDEVIRNLIADQAMTGAQVARGGIDDAIALMRDLSHGPQHLLVDVSGA 81
GA ++ DAA V+ ++ + + ++ DV
Sbjct: 3 GATILVADDDAAIRTVLNQALSRAGYD--VRITSNAATLWRWIA--AGDGDLVVTDVV-- 56

Query: 82 AMP----LSDLARLADVCDPSVNVIVIGERNDVGLFRSMLRIGVRDYLVKPL----TVEL 133
MP L R+ P + V+V+ +N G DYL KP + +
Sbjct: 57 -MPDENAFDLLPRIKKA-RPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGI 114

Query: 134 VHRALSAADPNAAARAGKAIGFVGARGGVGVTSIAVALARHLADR 178
+ RAL+ + + + G S A+ + R
Sbjct: 115 IGRALAEPKRRPSKLEDDSQDGMPLVG----RSAAMQEIYRVLAR 155


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_06985  PF05272       30     0.034    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.7 bits (66), Expect = 0.034
Identities = 18/50 (36%), Positives = 26/50 (52%), Gaps = 4/50 (8%)

Query: 303 IVISGGTGSGKTTLLNAL---SHFIDSHERIVTIEDAAELQLQQPHVVSL 349
+V+ G G GK+TL+N L F D+H I T +D+ E Q+ L
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYE-QIAGIVAYEL 647


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_07000  SYCDCHAPRONE  31     0.004    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 31.1 bits (70), Expect = 0.004
Identities = 20/83 (24%), Positives = 32/83 (38%)

Query: 54 SVAESALAAGDAELAATLFERALKADPRSLPAQVGLGDAMYQTGELARAGVLYAQAAAAA 113
S+A + +G E A +F+ D +GLG G+ A Y+ A
Sbjct: 41 SLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMD 100

Query: 114 PDDPRAQLGLARVALRERHLDDA 136
+PR A L++ L +A
Sbjct: 101 IKEPRFPFHAAECLLQKGELAEA 123


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_07010  PYOCINKILLER  32     0.004    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 32.1 bits (72), Expect = 0.004
Identities = 30/132 (22%), Positives = 49/132 (37%), Gaps = 4/132 (3%)

Query: 23 RVAAARNELQNAADAAALAGAASLEAGAGAPAWAAAASAAAAALSLNASDGAALSSGDVQ 82
A A+ + + A A AA+ A PA + + AA + + GAA + +
Sbjct: 226 AAAEAKRKAEEQARQQAAIRAANTYA---MPANGSVVATAAGRGLIQVAQGAASLAQAIS 282

Query: 83 TGYWNVTGVPAGLEPTTLAPGEYDVPAVQATVTRAPNQNGGPLSLLMGGLLGLVGTPAAA 142
V G P+ +A G + T + +Q + +G +G P +
Sbjct: 283 DAI-AVLGRVLASAPSVMAVGFASLTYSSRTAEQWQDQTPDSVRYALGMDAAKLGLPPSV 341

Query: 143 TAVAVAGAPATV 154
AVA A TV
Sbjct: 342 NLNAVAKASGTV 353


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_07050  HTHTETR       117    5e-35    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 117 bits (295), Expect = 5e-35
Identities = 53/210 (25%), Positives = 100/210 (47%), Gaps = 4/210 (1%)

Query: 1 MARKTREESLNTKNRILDAAELVLLEKGVGQTAMADIAEAAGMSRGAVYGHFNGKIEVCV 60
MARKT++E+ T+ ILD A + ++GV T++ +IA+AAG++RGA+Y HF K ++
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 AVCDRAFSRAVEGFDLSDERPA---LATLRLAASHYLHQCGEPGSMQRVLEILYMKCEQS 117
+ + + S E + L+ LR H L + ++EI++ KCE
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 118 EENAPLMRRRALYELQTLRIAKALLRRAVAAGELDASLDVHLAGVYLLSLLEGIFGSMIW 177
E A + + + L++ + L+ + A L A L A + + + G+ + ++
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 178 TTRLRGDRWRDAEAMLDAGVDTLRASPALR 207
+ D ++A + ++ P LR
Sbjct: 181 APQSF-DLKKEARDYVAILLEMYLLCPTLR 209


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_07055  RTXTOXIND     40     1e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 39.8 bits (93), Expect = 1e-05
Identities = 19/133 (14%), Positives = 41/133 (30%), Gaps = 5/133 (3%)

Query: 67 EVRARVAGIVTARTYEEGQEVKRGAVLFRIDPAPFKAARDAAAGALEKAQAAHLAALDKR 126
E++ IV +EG+ V++G VL ++ +A +L +A+
Sbjct: 98 EIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILS 157

Query: 127 RRYDELVRDRAVSERDHTEALADERQAKAAVASARAELA-----RAQLQLDYATVTAPID 181
R + + E + + + + + + Q +L+ A
Sbjct: 158 RSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERL 217

Query: 182 GRARRALVTEGAL 194
R E
Sbjct: 218 TVLARINRYENLS 230



Score = 34.8 bits (80), Expect = 5e-04
Identities = 18/100 (18%), Positives = 39/100 (39%), Gaps = 10/100 (10%)

Query: 102 KAARDAAAGALEKAQAAHLAALDKRRRYDELVRDRAVSERDHTEALADERQAKAAVASAR 161
LE+ ++ L+A ++ + +L + E L RQ +
Sbjct: 265 VNELRVYKSQLEQIESEILSAKEEYQLVTQLFK---------NEILDKLRQTTDNIGLLT 315

Query: 162 AELARAQLQLDYATVTAPIDGR-ARRALVTEGALVGQDQA 200
ELA+ + + + + AP+ + + + TEG +V +
Sbjct: 316 LELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAET 355


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_07060  ACRIFLAVINRP  1079   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1079 bits (2791), Expect = 0.0
Identities = 516/1032 (50%), Positives = 701/1032 (67%), Gaps = 6/1032 (0%)

Query: 1 MARFFIDRPVFAWVISLFIMLGGIFAIRALPVAQYPDIAPPVVSLYATYPGASAQVVEES 60
MA FFI RP+FAWV+++ +M+ G AI LPVAQYP IAPP VS+ A YPGA AQ V+++
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTAVIEREMNGVPGLLYTSATS-SAGQASLSLTFKQGVSADLAAVDVQNRLKIVEARLPE 119
VT VIE+ MNG+ L+Y S+TS SAG +++LTF+ G D+A V VQN+L++ LP+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 120 PVRRDGISIEKAADNAQIIVSLTSEDGRLSGVELGEYASANVLQALRRVEGVGKVQFWGA 179
V++ GIS+EK++ + ++ S++ + ++ +Y ++NV L R+ GVG VQ +GA
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 180 EYAMRIWPDPVKMAALGLTASDIASAVRAHNARVTIGDVGRSAVPDSAPIAATVLADAPL 239
+YAMRIW D + LT D+ + ++ N ++ G +G + + A+++A
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 240 TTPDAFGAIALRARADGSTLYLRDVARIEFGGNDYNYPSFVNGKTATGMGIKLAPGSNAV 299
P+ FG + LR +DGS + L+DVAR+E GG +YN + +NGK A G+GIKLA G+NA+
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 300 ATEKRVRATMEELAKFFPPGVKYQIPYETASFVRVSMSKVVTTLVEAGVLVFAVMFLFMQ 359
T K ++A + EL FFP G+K PY+T FV++S+ +VV TL EA +LVF VM+LF+Q
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 360 NFRATLIPTLVVPVALLGTFGAMLAAGFSINVLTMFGMVLAIGILVDDAIVVVENVERLM 419
N RATLIPT+ VPV LLGTF + A G+SIN LTMFGMVLAIG+LVDDAIVVVENVER+M
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 420 VEEKLPPYEATVKAMKQISGAIVGITVVLTSVFVPMAFFGGAVGNIYRQFAFALAVSIGF 479
+E+KLPP EAT K+M QI GA+VGI +VL++VF+PMAFFGG+ G IYRQF+ + ++
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 480 SAFLALSLTPALCATLLKPVADDHHE-KDGFFGWFNRFVARSTHRYTRRVGRVLERPLRW 538
S +AL LTPALCATLLKPV+ +HHE K GFFGWFN S + YT VG++L R+
Sbjct: 481 SVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY 540

Query: 539 LVVYGALTAAAALLITKLPAAFLPDEDQGNFMVMVIRPQGTPLAETMQSVRRVEEYVRTH 598
L++Y + A +L +LP++FLP+EDQG F+ M+ P G T + + +V +Y +
Sbjct: 541 LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN 600

Query: 599 SPSAY--TFALGGYNLYGEGPNGGMIFVTMKDWKERKRARDQVQAIIAEINAHFAGTPNT 656
+ F + G++ G+ N GM FV++K W+ER + +A+I +
Sbjct: 601 EKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDG 660

Query: 657 MVFAINMPALPDLGLTGGFDFRLQDRGGLGYGAFVAAREKLLAEGRKDPV-LTDLMFAGT 715
V NMPA+ +LG GFDF L D+ GLG+ A AR +LL + P L + G
Sbjct: 661 FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGL 720

Query: 716 QDAPQLKLDIDRAKASALGVSMEEINATLAVMFGSDYIGDFMHGSQVRRVIVQADGRHRL 775
+D Q KL++D+ KA ALGVS+ +IN T++ G Y+ DF+ +V+++ VQAD + R+
Sbjct: 721 EDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRM 780

Query: 776 DAADVTKLRVRNAKGEMVPLAAFATLHWTMGPPQLTRYNGFPSFTINGAASAGHSSGEAM 835
DV KL VR+A GEMVP +AF T HW G P+L RYNG PS I G A+ G SSG+AM
Sbjct: 781 LPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAM 840

Query: 836 AAIERIASTLPAGTGYAWSGQSYEERLSGAQAPMLFALSVLVVFLALAALYESWSIPFAV 895
A +E +AS LPAG GY W+G SY+ERLSG QAP L A+S +VVFL LAALYESWSIP +V
Sbjct: 841 ALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV 900

Query: 896 MLVVPLGVIGAVAGVTLRGMPNDIYFKVGLIATIGLSAKNAILIVEVAKDLVAQR-MSLA 954
MLVVPLG++G + TL ND+YF VGL+ TIGLSAKNAILIVE AKDL+ + +
Sbjct: 901 MLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVV 960

Query: 955 DAALEAARLRLRPIVMTSLAFGVGVLPLAFATGAASGAQIAIGTGVLGGVISATLFAIFL 1014
+A L A R+RLRPI+MTSLAF +GVLPLA + GA SGAQ A+G GV+GG++SATL AIF
Sbjct: 961 EATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFF 1020

Query: 1015 VPLFFVCVGRVF 1026
VP+FFV + R F
Sbjct: 1021 VPVFFVVIRRCF 1032


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_07065  RTXTOXIND     33     0.002    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 33.3 bits (76), Expect = 0.002
Identities = 18/104 (17%), Positives = 34/104 (32%), Gaps = 2/104 (1%)

Query: 379 APRLTLPIFAGGRNRANLDVADARKHIAVAEYEKTIQTAFREV--ADALAARDQIDAQLA 436
P L LP +N + +V I Q +E+ A R + A++
Sbjct: 165 LPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARIN 224

Query: 437 AQQAVYGADAERLRLAQRRYDSGVASYLELLDAQRSTFESGQEL 480
+ + + RL + +L+ + E+ EL
Sbjct: 225 RYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNEL 268


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_07090  PF00577       678    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 678 bits (1751), Expect = 0.0
Identities = 227/851 (26%), Positives = 353/851 (41%), Gaps = 60/851 (7%)

Query: 2 RIRHSFLCVFMLAAGSHARATEFNASFLSIDGRNDVDLSQFAQADYTLPGTYLLDVQVND 61
+R C F A + FN FL+ D + DLS+F PGTY +D+ +N+
Sbjct: 27 FVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNN 86

Query: 62 VFFGLQPIEFVAHDDGQGARACVAPELVAQFGLKKSLVENLPRTMGGRCADLASL-DGVT 120
+ + + F D QG C+ +A GL + V + C L S+ T
Sbjct: 87 GYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIHDAT 146

Query: 121 IRYQKGEGRLKITIAQAALEFADASYLPPERWSDGVDGAMLDYRVFANANHAFGRGAQQN 180
+ G+ RL +TI QA + Y+PPE W G++ +L+Y + N R +
Sbjct: 147 AQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYN--FSGNSVQNRIGGNS 204

Query: 181 NAVQAYGTIGANWGAWRFRGDYQAQ-TRAGGAVYAERAFRFNQLYAYRALPSIRSTLSFG 239
+ G N GAWR R + + + ++ ++ + R + +RS L+ G
Sbjct: 205 HYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLG 264

Query: 240 EIYVDSDIFSTFSMSGVAMKSDDRMLPPSMRGYAPLVTGVARTNAIVKVMQDSRVLYMTK 299
+ Y DIF + G + SDD MLP S RG+AP++ G+AR A V + Q+ +Y +
Sbjct: 265 DGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNST 324

Query: 300 VSPGAFALSNLN-TSVQGTLDVVVEEEDGTVQRFQVATASVPFLAREGQLRYKTAIGQPR 358
V PG F ++++ G L V ++E DG+ Q F V +SVP L REG RY G+ R
Sbjct: 325 VPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYR 384

Query: 359 TFGGAGITPWFGFAEAAYGLPFDVTVYGGLIAASGYTSVAFGVGRDFGRFGALSADVTHA 418
+ P F + +GLP T+YGG A Y + FG+G++ G GALS D+T A
Sbjct: 385 SGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQA 444

Query: 419 RATLWWNGRTKRGNSYRINYSKHVDALDADVRFFGYRFSERDYTNFQQFSGDPTASGL-- 476
+TL + G S R Y+K ++ +++ GYR+S Y NF +
Sbjct: 445 NSTL-PDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIE 503

Query: 477 -------------------ANGKQRYSAMLSKRFGDTST-YFSYDQTTYW-ARPSDRRIG 515
N + + ++++ G TST Y S TYW D +
Sbjct: 504 TQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYWGTSNVDEQFQ 563

Query: 516 VTLTRAFSLGALKSVNLGFSAFRTQGAGGGGNQVSLTATLPLGER-----------QTLT 564
L AF + L +S + G ++L +P + +
Sbjct: 564 AGLNTAFEDI---NWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHASAS 620

Query: 565 SSVSAGEGGTSVNAGYLYDGA---NGRTYQLYGGTTDGRASANASLRQRTPSYQ-----L 616
S+S G N +Y N +Y + G G + S T +Y+
Sbjct: 621 YSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNA 680

Query: 617 TAQASTVANAYASASLEVDGSFVATRYGVTAHANGNAGDTRLLVSTDGVPGVPLS-GSYA 675
S ++ V G +A GVT DT +LV G + +
Sbjct: 681 NIGYS-HSDDIKQLYYGVSGGVLAHANGVTLGQPL--NDTVVLVKAPGAKDAKVENQTGV 737

Query: 676 RTNARGYAVIDGVSPYNVYDATVSVEKLGLDTDVTNPIQRTVLTDGAIGYIRFNAARGRN 735
RT+ RGYAV+ + Y + L + D+ N + V T GAI F A G
Sbjct: 738 RTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIK 797

Query: 736 VFVTLTGDGGAPVPFGASVQDAATGKELGIVGEAGAAYLTQVQPRAKLVVRAGAKTICT- 794
+ +TLT + P+PFGA V + + GIV + G YL+ + K+ V+ G +
Sbjct: 798 LLMTLTHNNK-PLPFGAMVTS-ESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHC 855

Query: 795 --PAALPDTLQ 803
LP Q
Sbjct: 856 VANYQLPPESQ 866


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_07170  DHBDHDRGNASE  123    2e-36    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 123 bits (310), Expect = 2e-36
Identities = 81/252 (32%), Positives = 118/252 (46%), Gaps = 15/252 (5%)

Query: 9 GRSFLVTGASSGIGRAAAVALRGCGARVVAAARNARELERLAHETGC-----EPLELDVG 63
G+ +TGA+ GIG A A L GA + A N +LE++ E DV
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVR 67

Query: 64 CDASVRAALSS-ERMRDAFDGLINCAGVTSLAAAIDTTADEFDRVMAVNARGAMLVARHV 122
A++ + ER D L+N AGV + +E++ +VN+ G +R V
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 123 ARAMIRAGRGGSIVNVSSQAALVALPSHLAYCASKAALDAMTRVLCVELGPHGIRVNSVN 182
++ M R GSIV V S A V S AY +SKAA T+ L +EL + IR N V+
Sbjct: 128 SKYM-MDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVS 186

Query: 183 PTVTLTPMAERAWSDPHASGPMLA--------AIPLGRFARVADVVAPILFLSSDAAAMV 234
P T T M W+D + + ++ IPL + A+ +D+ +LFL S A +
Sbjct: 187 PGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHI 246

Query: 235 SGVALPVDGGYT 246
+ L VDGG T
Sbjct: 247 TMHNLCVDGGAT 258


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_07215  2FE2SRDCTASE  57     6e-12    Ferric iron reductase signature.
		>2FE2SRDCTASE#Ferric iron reductase signature.

Length = 262

Score = 57.4 bits (138), Expect = 6e-12
Identities = 51/186 (27%), Positives = 73/186 (39%), Gaps = 24/186 (12%)

Query: 78 RALVSQWSKYYFNLAASAGFAAALLLGRPLDMAPQRMRVALRGGMPVALLFEADALRPAQ 137
+ L+S W+++Y L A L + LD++P+ VA F D
Sbjct: 89 KPLISLWAQWYIGLMVPPLMLALLTQEKALDVSPEHFHAEFHETGRVA-CFWVDVCEDKN 147

Query: 138 AEPAS---RYAALVDH-LRATIDTLAALAKLSPRVLWANAGNLLD-YLFEQCAHAPRAGA 192
A P S R L+ L + L A +++ +++W+N G L++ YL E G
Sbjct: 148 ATPHSPQHRMETLISQALVPVVQALEATGEINGKLIWSNTGYLINWYLTEM---KQLLGE 204

Query: 193 DA------AWLFGPVDSRGEANPLRLPVRRVKPCSARLPDPFRARRVCCLRNEIPGEDQL 246
A F + GE NPL V L D RR CC R +P Q
Sbjct: 205 ATVESLRHALFFEKTLTNGEDNPLWRTV--------VLRDGLLVRRTCCQRYRLPDVQQ- 255

Query: 247 CGSCPL 252
CG C L
Sbjct: 256 CGDCTL 261


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_07220  FERRIBNDNGPP  113    2e-31    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 113 bits (285), Expect = 2e-31
Identities = 77/264 (29%), Positives = 112/264 (42%), Gaps = 15/264 (5%)

Query: 59 PARIVVLEFMFAEDLAALDITPVGMADPAYYPIWIGYDDARFARVSDVGTRQEPSLEAIA 118
P RIV LE++ E L AL I P G+AD Y +W+ + V DVG R EP+LE +
Sbjct: 35 PNRIVALEWLPVELLLALGIVPYGVADTINYRLWVS-EPPLPDSVIDVGLRTEPNLELLT 93

Query: 119 AAKPDLILGVGLRHAPIFDALSWIAPTVLFKYSPNYIEDGRQVTQYDWARAILRTIGCLT 178
KP ++ + P + L+ IAP F +S DG+Q AR L + L
Sbjct: 94 EMKPSFMVW-SAGYGPSPEMLARIAPGRGFNFS-----DGKQ--PLAMARKSLTEMADLL 145

Query: 179 GRARDARAVQARVDAGLARDARRIAAAGRAGERVAWLQELGLPDRYWAFTGNSASAGIAR 238
A A+ + + R + G R L L P F NS I
Sbjct: 146 NLQSAAETHLAQYEDFIRSMKPRFV---KRGARPLLLTTLIDPRHMLVFGPNSLFQEILD 202

Query: 239 ALGLE-PWPGEPTREGTAYVTSEDLLKQPDLAVLFVSATEPGVPLDAKLDSSIWRFVPAR 297
G+ W GE G+ V+ + L D+ VL +DA + + +W+ +P
Sbjct: 203 EYGIPNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNSKD-MDALMATPLWQAMPFV 261

Query: 298 RAGRVALVERNIWGFGGPMSALRL 321
RAGR V +W +G +SA+
Sbjct: 262 RAGRFQRVP-AVWFYGATLSAMHF 284


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_07310  HTHFIS        43     1e-06    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 42.9 bits (101), Expect = 1e-06
Identities = 39/171 (22%), Positives = 63/171 (36%), Gaps = 14/171 (8%)

Query: 3 AAYPFSALIGQ-AALQQALLLVA-VDPGLGGVLVSGPRGTAKSTAARALAELLP--EGRF 58
+ L+G+ AA+Q+ ++A + ++++G GT K ARAL + G F
Sbjct: 132 DSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPF 191

Query: 59 VTLPLSASDEQVTGSLDLASALADNT--VRFSPGLVARAHLGVLYVDEINLLPDALVDAL 116
V + ++A + S T S G +A G L++DEI +P L
Sbjct: 192 VAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRL 251

Query: 117 LDAAASGVNTVERDGVSHSHAARFALVGTMNP------EEGELRPQLLDRF 161
L G G + +V N +G R L R
Sbjct: 252 LRVLQQG--EYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRL 300


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_07390  OMADHESIN     29     0.027    Yersinia outer membrane adhesin signature.
		>OMADHESIN#Yersinia outer membrane adhesin signature.

Length = 455

Score = 29.5 bits (65), Expect = 0.027
Identities = 25/63 (39%), Positives = 28/63 (44%)

Query: 147 ADGATPAAIAGALVARGFGPSAMSVFEHLGGPLERRLDARADAWRDARAAALNVVAIECR 206
A GAT A GA VA G G A V GPL + L A + A A + VAI R
Sbjct: 74 AIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAASTAQKDGVAIGAR 133

Query: 207 ACA 209
A
Sbjct: 134 AST 136


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_07410  LCRVANTIGEN   30     0.008    Low calcium response V antigen signature.
		>LCRVANTIGEN#Low calcium response V antigen signature.

Length = 326

Score = 30.0 bits (67), Expect = 0.008
Identities = 16/58 (27%), Positives = 24/58 (41%), Gaps = 5/58 (8%)

Query: 46 RAELVVNTAELDLDEIVALLARAHGKGQDVARVHSG-----DPSLYGAIGEQIRRLAA 98
R EL TAEL + ++ H +H D +LYG E+I + +A
Sbjct: 154 REELAELTAELKIYSVIQAEINKHLSSSGTINIHDKSINLMDKNLYGYTDEEIFKASA 211


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_07460  DHBDHDRGNASE  87     2e-22    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 87.0 bits (215), Expect = 2e-22
Identities = 51/187 (27%), Positives = 79/187 (42%), Gaps = 10/187 (5%)

Query: 1 MTGKRILVTGAGSGFGREVALRLAAKGHCVIAGVQITE----LSAEAARRGLALDAVKLD 56
+ GK +TGA G G VA LA++G + A E + + +A D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 57 VT-CARERAQAARWD-----VDVLLNNAGAGEAGALVDLPVDIVRELFETNVFGPLELTQ 110
V A AR + +D+L+N AG G + L + F N G ++
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 111 QVARGMIARGRGRIVFVSSIAGLITGAYTGAYCASKHALEAIAEAMHLELAAHGVQIAVV 170
V++ M+ R G IV V S + AY +SK A + + LELA + ++ +V
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 171 NPGPYRT 177
+PG T
Sbjct: 186 SPGSTET 192


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_07470  CARBMTKINASE  403    e-144    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 403 bits (1037), Expect = e-144
Identities = 145/310 (46%), Positives = 192/310 (61%), Gaps = 13/310 (4%)

Query: 2 RIVIALGGNALLQRNQPMTEVQQRENVKIAVAQIAQ-IAPGNELVIAHGNGPQVGLLALQ 60
R+VIALGGNAL QR Q + + +NV+ QIA+ IA G E+VI HGNGPQVG L L
Sbjct: 4 RVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSLLLH 63

Query: 61 ---GAAYPAVAPYPLDVLGAQTEGMIGYLIEQEMGNLLPP---DAPFATLLTQVEVDPAD 114
G A + P+DV GA ++G IGY+I+Q + N L + T++TQ VD D
Sbjct: 64 MDAGQATYGIPAQPMDVAGAMSQGWIGYMIQQALKNELRKRGMEKKVVTIITQTIVDKND 123

Query: 115 PAFEHPTKPIGPVYSRDEAERLALEKGWHIAPD-GDKFRRVVPSPRPRRIFEIRPVKWLL 173
PAF++PTKP+GP Y + A+RLA EKGW + D G +RRVVPSP P+ E +K L+
Sbjct: 124 PAFQNPTKPVGPFYDEETAKRLAREKGWIVKEDSGRGWRRVVPSPDPKGHVEAETIKKLV 183

Query: 174 EKGTIVICAGGGGIPTRYDANGKLSGVEAVIDKDLCASLLARELSADLLVIATDVDGAYL 233
E+G IVI +GGGG+P + +G++ GVEAVIDKDL LA E++AD+ +I TDV+GA L
Sbjct: 184 ERGVIVIASGGGGVPVILE-DGEIKGVEAVIDKDLAGEKLAEEVNADIFMILTDVNGAAL 242

Query: 234 DWGKPTQALIEAAHPDELERL----GFAAGSMGPKVQAAIEFARQTGHDAVIGSLADIVA 289
+G + + +EL + F AGSMGPKV AAI F G A+I L V
Sbjct: 243 YYGTEKEQWLREVKVEELRKYYEEGHFKAGSMGPKVLAAIRFIEWGGERAIIAHLEKAVE 302

Query: 290 IAEGRAGTRV 299
EG+ GT+V
Sbjct: 303 ALEGKTGTQV 312


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_07480  ARGDEIMINASE  515    0.0      Bacterial arginine deiminase signature.
		>ARGDEIMINASE#Bacterial arginine deiminase signature.

Length = 409

Score = 515 bits (1327), Expect = 0.0
Identities = 130/423 (30%), Positives = 227/423 (53%), Gaps = 21/423 (4%)

Query: 1 MSQAIPQVGVHSEVGKLRKVLVCSPGLAHQRLTPSNCDELLFDDVMWVNQAKRDHFDFVS 60
M + + + + SE+G+L+KVL+ PG + LTP LFDD+ ++ A+++H F S
Sbjct: 1 MEEYLNPINIFSEIGRLKKVLLHRPGEELENLTPFIMKNFLFDDIPYLEVARQEHEVFAS 60

Query: 61 KMRERGVEVLEMHNLLTETVQNPAALK------WILDRKITPDNVGIGLVDEVRAWLEGL 114
++ VE+ + +L++E + + AL+ +IL+ +I D ++ ++ + L
Sbjct: 61 ILKNNLVEIEYIEDLISEVLVSSVALENKFISQFILEAEIKTDFT----INLLKDYFSSL 116

Query: 115 EPRALAEFLIGGVAASDIAGAERSKVLTLFRDYLGKSSFVLPPLPNMMFTRDTSCWIYGG 174
+ +I GV ++ S + G + F++ P+PN++FTRD I G
Sbjct: 117 TIDNMISKMISGVVTEELKNYTSSLDDLV----NGANLFIIDPMPNVLFTRDPFASIGNG 172

Query: 175 VTLNPMHWPARRQETLLVAAVYKFHPAFTDAKFDVWYGDPDRDHGMATLEGGDVMPIGRG 234
VT+N M R++ET+ ++K+HP + +W + A+LEGGD + + +G
Sbjct: 173 VTINKMFTKVRQRETIFAEYIFKYHPVYK-ENVPIWLNRWE----EASLEGGDELVLNKG 227

Query: 235 VVLVGMGERTSRQAVGQLAQALFA-KGAAERVIVAGLPNSRASMHLDTVFSFCDRDLVTV 293
++++G+ ERT ++V +LA +LF K + + ++ +P +R+ MHLDTVF+ D + T
Sbjct: 228 LLVIGISERTEAKSVEKLAISLFKNKTSFDTILAFQIPKNRSYMHLDTVFTQIDYSVFTS 287

Query: 294 FPEVVNRIVPFTLRPGGDARYGIDIEREDKPFVDVVAQALGLKSLRVVETGGNDFAAERE 353
F + L + I I++E DV++ LG K + GG+ RE
Sbjct: 288 FTSDDMYFSIYVLTYNPSSSK-IHIKKEKARIKDVLSFYLGRKIDIIKCAGGDLIHGARE 346

Query: 354 QWDDGNNMVCIEPGVVVGYDRNTYTNTLLRKAGVEVITIGSSELGRGRGGGHCMTCPVLR 413
QW+DG N++ I PG ++ Y RN TN L + G++V I SSEL RGRGG CM+ P++R
Sbjct: 347 QWNDGANVLAIAPGEIIAYSRNHVTNKLFEENGIKVHRIPSSELSRGRGGPRCMSMPLIR 406

Query: 414 DPV 416
+ +
Sbjct: 407 EDI 409
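
The per-hit statistics lines in the alignments above ("Score = ... bits (...), Expect = ..." and "Identities = ..., Positives = ..., Gaps = ...") follow a regular layout, so they can be pulled into numbers with a small parser. The following is a minimal Python sketch, not part of PredictBias itself; the function names `parse_expect` and `parse_hit_stats` are hypothetical, and the sketch assumes the line layout seen in this report, including E-values printed without a mantissa such as `e-144`.

```python
import re

# Patterns for the two statistics lines printed under each hit in this report.
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)
FRACTION_RE = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)")


def parse_expect(text):
    """Convert an E-value string to float.

    This report prints some very small E-values without a mantissa
    (for example "e-144"), which float() rejects, so "1" is prepended.
    """
    return float("1" + text if text.startswith("e") else text)


def parse_hit_stats(lines):
    """Collect bit score, raw score, E-value and match fractions from hit lines."""
    stats = {}
    for line in lines:
        match = SCORE_RE.search(line)
        if match:
            stats["bits"] = float(match.group(1))
            stats["raw_score"] = int(match.group(2))
            stats["expect"] = parse_expect(match.group(3))
        # "Gaps" is absent for ungapped hits, so only present fields are stored.
        for name, num, den in FRACTION_RE.findall(line):
            stats[name.lower()] = (int(num), int(den))
    return stats


# Statistics lines copied from the CARBMTKINASE hit of E4F39_07470 above.
example = [
    "Score = 403 bits (1037), Expect = e-144",
    "Identities = 145/310 (46%), Positives = 192/310 (61%), Gaps = 13/310 (4%)",
]
stats = parse_hit_stats(example)
```

Storing the fractions as `(matches, alignment_length)` pairs rather than the printed percentages keeps the exact counts, so percentages can be recomputed at whatever precision is needed.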


23  E4F39_07490  E4F39_08130  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
E4F39_07490  3       15       4.865853    chemotaxis protein CheW
E4F39_07495  1       14       4.631762    AraC family transcriptional regulator
E4F39_07500  1       14       4.446142    oxidoreductase
E4F39_07505  0       14       3.580645    porin
E4F39_07510  -2      13       2.965057    hypothetical protein
E4F39_07515  -3      15       3.489542    hypothetical protein
E4F39_07520  -3      15       3.202002    alpha/beta fold hydrolase
E4F39_07525  -2      16       3.490311    MbtH family NRPS accessory protein
E4F39_07530  -2      15       3.476755    cupin-like domain-containing protein
E4F39_07535  -1      18       3.230402    histidinol-phosphate aminotransferase family
E4F39_07540  -2      12       2.886564    hypothetical protein
E4F39_07545  -2      11       1.490589    formyl transferase
E4F39_07550  -3      10       1.186514    argininosuccinate synthase
E4F39_07555  -2      9        1.498355    argininosuccinate lyase
E4F39_07560  -1      9        2.061302    kinase
E4F39_07565  -1      10       2.681846    MFS transporter
E4F39_07570  -1      11       2.516407    hypothetical protein
E4F39_07575  0       14       3.140510    pyridoxal-phosphate dependent enzyme
E4F39_07580  2       19       2.293811    ATP-grasp domain-containing protein
E4F39_07585  1       17       2.512299    aminotransferase class I/II-fold pyridoxal
E4F39_07590  2       18       1.513230    N-acetyltransferase
E4F39_07595  2       17       -0.510933   amino acid adenylation domain-containing
E4F39_07600  2       21       -1.506175   carbamoyltransferase
E4F39_07605  0       16       -3.110205   hypothetical protein
E4F39_07610  -2      14       -3.981405   hypothetical protein
E4F39_07620  -1      7        -4.352241   DUF2282 domain-containing protein
E4F39_07625  -1      8        -3.317630   DUF692 domain-containing protein
E4F39_07630  0       11       -3.412228   DUF2063 domain-containing protein
E4F39_07635  0       13       -1.967638   DoxX family protein
E4F39_07640  0       14       -0.635744   DUF1109 domain-containing protein
E4F39_07645  -1      14       1.426867    sigma-70 family RNA polymerase sigma factor
E4F39_07650  -1      12       1.922220    rhodanese-like domain-containing protein
E4F39_07655  -3      12       2.143104    Zn-dependent hydrolase
E4F39_07665  -1      13       3.738280    allantoinase
E4F39_07670  -2      10       3.920641    LysR family transcriptional regulator
E4F39_07675  -2      10       3.604316    DUF917 domain-containing protein
E4F39_07680  -2      9        3.467348    urocanate hydratase
E4F39_07685  -2      9        3.393253    MFS transporter
E4F39_07690  -1      24       -0.510738   sugar transporter
E4F39_07700  2       36       -6.130278   hypothetical protein
E4F39_07705  2       31       -4.698868   porin
E4F39_07710  3       28       -3.418441   hypothetical protein
E4F39_07715  3       25       -2.085708   DUF3443 family protein
E4F39_07720  3       24       -2.527517   DUF2844 domain-containing protein
E4F39_07725  2       23       -1.846352   IS3 family transposase
E4F39_07730  4       26       -2.740243   response regulator transcription factor
E4F39_07735  4       28       -2.806133   adenylyl-sulfate kinase
E4F39_07740  2       24       -1.823824   hypothetical protein
E4F39_07745  1       18       -1.459074   tetratricopeptide repeat protein
E4F39_07750  2       17       -1.214555   HlyD family type I secretion periplasmic adaptor
E4F39_07755  1       14       -0.729820   ATP-binding cassette domain-containing protein
E4F39_07760  0       10       -0.196200   sulfotransferase
E4F39_07765  0       9        0.132815    hypothetical protein
E4F39_07775  -1      11       -0.258754   hemolysin
E4F39_07780  -2      11       -0.508344   TolC family protein
E4F39_07790  1       14       -1.401750   outer membrane protein assembly factor BamE
E4F39_07795  2       22       -4.067788   H-NS histone family protein
E4F39_07800  6       33       -7.180582   peptide synthase
E4F39_07805  7       37       -6.673865   outer membrane transporter TsaT
E4F39_07815  7       49       -9.591258   hypothetical protein
E4F39_07820  7       46       -9.440598   transcriptional regulator
E4F39_07825  7       48       -8.647918   hypothetical protein
E4F39_07830  5       43       -6.750382   MEKHLA domain-containing protein
E4F39_07835  4       34       -5.652297   hypothetical protein
E4F39_07840  5       31       -5.212860   hypothetical protein
E4F39_07845  4       29       -4.173532   PHB depolymerase family esterase
E4F39_07850  4       28       -4.345257   hypothetical protein
E4F39_07855  4       28       -4.257052   hypothetical protein
E4F39_07860  9       20       -3.750465   DUF4102 domain-containing protein
E4F39_07870  11      20       -4.007529   hypothetical protein
E4F39_07875  11      20       -4.211459   hypothetical protein
E4F39_07880  12      21       -4.202671   MFS transporter
E4F39_07890  6       37       -6.317512   NAD(P)/FAD-dependent oxidoreductase
E4F39_07895  6       47       -8.070750   cation:proton antiporter
E4F39_07900  4       47       -7.717089   Rieske (2Fe-2S) protein
E4F39_07905  3       45       -7.331377   oxygenase
E4F39_07910  3       47       -7.180406   hypothetical protein
E4F39_07915  6       50       -11.697570  dihydroxyacetone kinase subunit DhaK
E4F39_07920  7       57       -13.516397  hypothetical protein
E4F39_07925  6       49       -11.733654  ABC transporter permease
E4F39_07930  7       52       -11.935646  ABC transporter ATP-binding protein
E4F39_07935  7       54       -11.777507  FAD:protein FMN transferase
E4F39_07940  5       43       -8.019613   nitrous-oxide reductase
E4F39_07945  4       36       -4.999565   nitrous oxide reductase family maturation
E4F39_07950  4       34       -3.536728   ABC transporter ATP-binding protein
E4F39_07955  5       35       -2.869312   ABC transporter permease
E4F39_07960  5       41       -2.367745   nitrous oxide reductase accessory protein NosL
E4F39_07965  5       40       -2.497244   hypothetical protein
E4F39_07970  8       50       -4.397933   hypothetical protein
E4F39_07975  7       43       -4.453866   cytochrome c5 family protein
E4F39_07980  6       47       -6.029675   4Fe-4S binding protein
E4F39_07985  1       31       -4.823141   mechanosensitive ion channel family protein
E4F39_07995  -1      14       -1.338199   hypothetical protein
E4F39_08000  -1      9        -0.100878   hypothetical protein
E4F39_08005  1       8        2.138442    lipoprotein
E4F39_08010  1       9        3.628902    hypothetical protein
E4F39_08015  2       9        4.500324    hypothetical protein
E4F39_08020  1       21       4.334624    hypothetical protein
E4F39_08025  0       22       3.930580    magnesium/cobalt transporter CorA
E4F39_08035  0       17       4.652323    cytochrome bd-I oxidase subunit CydX
E4F39_08040  0       28       3.347283    ATP-binding protein
E4F39_08045  -4      17       1.612629    NAD-dependent epimerase/dehydratase family
E4F39_08050  -3      12       0.992077    UDP-N-acetylenolpyruvoylglucosamine reductase
E4F39_08055  -3      13       1.119353    hypothetical protein
E4F39_08060  -3      14       0.557964    hypothetical protein
E4F39_08065  -2      15       -0.109735   Lrp/AsnC family transcriptional regulator
E4F39_08075  -1      19       2.186331    hypothetical protein
E4F39_08080  -1      19       3.202770    LtxA
E4F39_08085  -2      18       1.505922    SDR family oxidoreductase
E4F39_08090  -2      19       1.277108    hypothetical protein
E4F39_08095  1       14       1.228504    depolymerase
E4F39_08100  0       12       1.254384    hypothetical protein
E4F39_08105  0       11       1.013582    multidrug transporter
E4F39_08110  2       12       0.709654    DUF3022 domain-containing protein
E4F39_08115  2       15       1.850019    YhfC family intramembrane metalloprotease
E4F39_08120  2       13       1.856183    LacI family DNA-binding transcriptional
E4F39_08125  3       26       2.568773    hypothetical protein
E4F39_08130  2       28       3.115593    D-glycerate dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_07595  ECOLNEIPORIN  93     3e-23    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 92.9 bits (231), Expect = 3e-23
Identities = 90/380 (23%), Positives = 146/380 (38%), Gaps = 64/380 (16%)

Query: 32 ASTAHAQSSVVLYGLIDTSITYANNQRTHGAGSPGSPGWAVTSGALNASRWGLRGREDLG 91
A A + V LYG I + + + +GA + T S+ G +G+EDLG
Sbjct: 12 ALPVAAMADVTLYGTIKAGVETSRSVAHNGAQAASVE--TGTGIVDLGSKIGFKGQEDLG 69

Query: 92 DGVSAIFALENGFSGASGALSQKGVDMFGRQAWIGLKSKEGGALTLGRQYDLILDF--VT 149
+G+ AI+ +E AS A + G RQ++IGLK G L +GR ++ D +
Sbjct: 70 NGLKAIWQVE---QKASIAGTDSG--WGNRQSFIGLK-GGFGKLRVGRLNSVLKDTGDIN 123

Query: 150 PLGASGPGWGGNLAVHPYDNDDSNRNIRINNAVKYTSPTYRGWTLGAMYGFSNTAGPFGN 209
P + G N P R I +V+Y SP + G + Y ++ AG N
Sbjct: 124 PWDSKSDYLGVNKIAEP-----EARLI----SVRYDSPEFAGLSGSVQYALNDNAG-RHN 173

Query: 210 NAAWSAGLSYANGPLKLGAGYLRINRNPNAANANGALSTTDGSATITGGSQQIWAVAGRY 269
+ ++ AG +Y NG + G + QI + Y
Sbjct: 174 SESYHAGFNYKNGGFFVQYGGAYKRH-------------HQVQENVNIEKYQIHRLVSGY 220

Query: 270 -AFGPHSIGAAWSHSATDRVSGVLQGGSIAKLDGNSLVFDNFTLDGRY-VVTPRLSLAAA 327
++ A A L + + + TL R+ VTPR+S A
Sbjct: 221 DNDALYASVAVQQQDAK------LVEENYSHNSQTEVA---ATLAYRFGNVTPRVSYAHG 271

Query: 328 YTYTMGRFDARSGETRPKWNHMVAQADYAFSIRTDAYLAAVYQRVSGGNGIPAFNATIWT 387
+ + + + ++ +V A+Y FS RT A ++A + + G G F +T
Sbjct: 272 FKGSFDATNYNN-----DYDQVVVGAEYDFSKRTSALVSAGWLQ--EGKGESKFVSTA-- 322

Query: 388 LTPSANGNQVVVALGLRHRF 407
+GLRH+F
Sbjct: 323 -----------GGVGLRHKF 331


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_07665  ARGDEIMINASE  29     0.039    Bacterial arginine deiminase signature.
		>ARGDEIMINASE#Bacterial arginine deiminase signature.

Length = 409

Score = 28.6 bits (64), Expect = 0.039
Identities = 11/48 (22%), Positives = 18/48 (37%), Gaps = 2/48 (4%)

Query: 263 PTSGAAFMVAEWLRAQRDDGRTIVFIAPDEGHRYADTVYDDAWLRGQG 310
+G + R Q +DG ++ IAP E Y+ + G
Sbjct: 334 KCAGGDLIHGA--REQWNDGANVLAIAPGEIIAYSRNHVTNKLFEENG 379


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_07745  PF08280       28     0.020    M protein trans-acting positive regulator
		>PF08280#M protein trans-acting positive regulator

Length = 530

Score = 28.3 bits (63), Expect = 0.020
Identities = 23/95 (24%), Positives = 39/95 (41%), Gaps = 17/95 (17%)

Query: 21 RSFLSELTRHLRG--FLRKRIPQFDADIEDLVQEILLAVHNARHTYRADEPLTAWVHAIA 78
SFLS + HL+ +L + +D ILLA+ RH + P T +
Sbjct: 209 HSFLSHSSTHLKTSPWLSESFSFYD---------ILLALSWKRHQFSVTIPQTRIFQQLK 259

Query: 79 RYKLMDFFRTRARREALHDPLDDHTDI-FSEPDDD 112
+ + D + +R D ++ + + FS D D
Sbjct: 260 KLFVYDSLKKSSR-----DIIETYCQLNFSAGDLD 289


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_07805  ECOLNEIPORIN  92     4e-23    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 92.2 bits (229), Expect = 4e-23
Identities = 92/377 (24%), Positives = 136/377 (36%), Gaps = 57/377 (15%)

Query: 1 MKKLLIALPLAAAATTHAQSSVTLYGVLEDGVDYVSNVQGKHL----VQLASGV-TAGSR 55
MKK LIAL LAA A + VTLYG ++ GV+ +V V+ +G+ GS+
Sbjct: 1 MKKSLIALTLAALPVA-AMADVTLYGTIKAGVETSRSVAHNGAQAASVETGTGIVDLGSK 59

Query: 56 WGVRGTEDLGGGLSAIFRLESGFDINSGRLGSGLAFSRNAYVGVGDAKLGTLTLGRQWDS 115
G +G EDLG GL AI+++E I G G +R +++G+ G L +GR
Sbjct: 60 IGFKGQEDLGNGLKAIWQVEQKASIAGTDSGWG---NRQSFIGLK-GGFGKLRVGRLNSV 115

Query: 116 IVDY--VEPFTLNGNI-GGYYFAHPNDMDNTDNGFPISNAVKYRSPTIAGFTFGGLYAFG 172
+ D + P+ + G A P IS V+Y SP AG + YA
Sbjct: 116 LKDTGDINPWDSKSDYLGVNKIAEPEA-------RLIS--VRYDSPEFAGLSGSVQYALN 166

Query: 173 GQPGRFSDNATFSVGANYAAGPVGFGIGYLRINNPGVSTQGYQNYPGFTNAVYGNYLDAA 232
GR ++ ++ G NY G G Y+ + V
Sbjct: 167 DNAGR-HNSESYHAGFNYKNGGFFVQYGGA-----------YKRHHQVQENVNIEKYQIH 214

Query: 233 RAQKVFGVGASYQVV---QWLKLLADFTNTNFQQGSAGHDATFQNYELSALVKPTPAVTI 289
R + A Y V Q L + ++ Q AT + + + A
Sbjct: 215 RLVSGYDNDALYASVAVQQQDAKLVEENYSHNSQTEVA--ATLAYRFGNVTPRVSYAHGF 272

Query: 290 GAGYTYTTGRDHATNAEPKYHQFNLSVEYALSKRTSVYAMGAFQKAAGDAPVAQIAGFNP 349
+ T + Y Q + EY SKRTS + +
Sbjct: 273 KGSFDATNYNND-------YDQVVVGAEYDFSKRTSALVSAGWLQEG-----------KG 314

Query: 350 SGNQKQAVGRAGIRHVF 366
G G+RH F
Sbjct: 315 ESKFVSTAGGVGLRHKF 331


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_07840  HTHFIS  75  8e-18  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 75.3 bits (185), Expect = 8e-18
Identities = 37/163 (22%), Positives = 63/163 (38%), Gaps = 13/163 (7%)

Query: 3 IYLIEDDEIQAQYYQSMLVEHGWQVKLLLDGERAFREIQRMPPDLIILDRRLPDLDGLEV 62
I + +DD L G+ V++ + +R I DL++ D +PD + ++
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 63 LMWVRKNYSNIPVLILTNAILESEVVAALEAGADDYVIKPPRKQEFVARVKALYRRATET 122
L ++K ++PVL+++ + A E GA DY+ KP E + + RA
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELI----GIIGRALAE 121

Query: 123 RTLSELIEIGPYRIQTSEKVVYFHHEAITLSPKEYEIIELLAR 165
R E + S EI +LAR
Sbjct: 122 PKR---------RPSKLEDDSQDGMPLVGRSAAMQEIYRVLAR 155


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_07860  SYCDCHAPRONE  33  0.005  Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 33.0 bits (75), Expect = 0.005
Identities = 21/126 (16%), Positives = 45/126 (35%), Gaps = 3/126 (2%)

Query: 898 LAPDDADAVLLRAELALDTGDFDEALSQFERLREQRPDAPESYANLIPALAALERRDDAI 957
++ D + + A +G +++A F+ L + L A+ + D AI
Sbjct: 31 ISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAI 90

Query: 958 AALQRALELNSKHPGALNNGVQFYLRTQQYDKA---MELAQRYVGAHGELASAHTMCGLV 1014
+ ++ K P + + L+ + +A + LAQ + E T +
Sbjct: 91 HSYSYGAIMDIKEPRFPFHAAECLLQKGELAEAESGLFLAQELIADKTEFKELSTRVSSM 150

Query: 1015 YHNLKA 1020
+K
Sbjct: 151 LEAIKL 156


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_07865  RTXTOXIND  274  5e-89  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 274 bits (701), Expect = 5e-89
Identities = 94/439 (21%), Positives = 204/439 (46%), Gaps = 14/439 (3%)

Query: 43 SALGLEEASIAPARRAAALIPTVMLALLIVLVLWATFFKIDIIAAGQGKVIPSTTVQQLS 102
+ L L E ++ R A ++ L++ + + +++I+A GK+ S +++
Sbjct: 44 AHLELIETPVSRRPRLVAYF---IMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIK 100

Query: 103 TLEGGIVRELLVREGQIVKKGQPLVRLDPVVAQGAVTEQAATREGLMASIARLQAEADGK 162
+E IV+E++V+EG+ V+KG L++L + A+ + ++ R Q +
Sbjct: 101 PIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSI 160

Query: 163 ----------ATPLYPAGLKPEIVSEEEHVRAQRAEALNSTIEVLQQQRAAKQAEAADYR 212
Y + E V + ++ + + K+AE
Sbjct: 161 ELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVL 220

Query: 213 GRIPQYVNNQHLLDDQIQRMLPLVGVGSVAPNEITNLQRERGNLAAQIITTREGAAQASA 272
RI +Y N + ++ L+ ++A + + + + ++ + Q +
Sbjct: 221 ARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIES 280

Query: 273 QIAEASHKIEEKISTFRSEAREELARKQVQLQALEGTLSGKQDILDRTLIRSPVNGIVKT 332
+I A + + F++E ++L + + L L+ ++ ++IR+PV+ V+
Sbjct: 281 EILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQ 340

Query: 333 LYITTIGGVASPGKSVIDIVPTNDSLLIEARIQPQDIAYIRVGDDAKVRITAFDSGALGS 392
L + T GGV + ++++ IVP +D+L + A +Q +DI +I VG +A +++ AF G
Sbjct: 341 LKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGY 400

Query: 393 LDAKVELISPDSQADERSGSLYYKVQVRTHSSVVATQVGDLNILPGMVADVDVITGRRTI 452
L KV+ I+ D+ D+R G L + V + + ++T ++ + GM ++ TG R++
Sbjct: 401 LVGKVKNINLDAIEDQRLG-LVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGMRSV 459

Query: 453 MSYILRPIVRGMSRAMSER 471
+SY+L P+ ++ ++ ER
Sbjct: 460 ISYLLSPLEESVTESLRER 478


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_07885  INTIMIN  47  1e-06  Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 47.4 bits (112), Expect = 1e-06
Identities = 55/247 (22%), Positives = 86/247 (34%), Gaps = 25/247 (10%)

Query: 902 ADGTHSLTASAVDLAGNTSPASSTLPVTVDTINPPPALTLSPLSDTFGSGTSGTNH--DN 959
+ +TA A D GN+S + L +TV ++ + ++D TS +
Sbjct: 521 GSNVYKVTARAYDRNGNSS-NNVLLTITV--LSNGQVVDQVGVTDFTADKTSAKADGTEA 577

Query: 960 ITSATLPTFNGTAAAGSYVQLYDVTGGTTVSMGSAVADSSGGWTTTLTSPLSGSASGVSH 1019
IT NG A A V V+G +S SA + SG T TL S G
Sbjct: 578 ITYTATVKKNGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQ------ 631

Query: 1020 TLVAVGVDPAGNTSTVSGPDVVVIDNAAAALPGPTMASASDTGASSSDGIT-SNTAPVFT 1078
V V A TS ++ V+ +D A++ T A T A ++ + T V
Sbjct: 632 --VVVSAKTAEMTSALNANAVIFVDQTKASI---TEIKADKTTAVANGQDAITYTVKVMK 686

Query: 1079 GTGAEAGALVTIYANGTSVG--HATADASGNYTIQ--SNALGADGRYQITAQQVDIAGNT 1134
G + VT + D +G + S G + + +V
Sbjct: 687 GDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGK----SLVSARVSDVAVD 742

Query: 1135 SPSSSVT 1141
+ V
Sbjct: 743 VKAPEVE 749



Score = 45.4 bits (107), Expect = 4e-06
Identities = 63/362 (17%), Positives = 110/362 (30%), Gaps = 39/362 (10%)

Query: 359 GHTVSTIADSNGNYSVQAPGTLAEGNNVFTVQ--AVDKAGNTSGTAQQNVTLDTVAATLP 416
G + + S +Y P + G+NV+ V A D+ GN+S +T+ L
Sbjct: 497 GQIQHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDRNGNSSNNVLLTITV------LS 550

Query: 417 APQL-------DHGSDTGASNSDGITRATQPVLTGGGAEPNALVTVYADGVSIGQ----- 464
Q+ D +D ++ +DG T A V V + VS
Sbjct: 551 NGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQANVPVSFNIVSGTAVLSAN 610

Query: 465 -ATADSLGHYTIHSGVLADGTHQITARQIDIAGNTSALSGAALVTIDTSEPAPANLKLVD 523
A + G T+ L A TSAL+ A++ +D ++ + +K
Sbjct: 611 SANTNGSGKATV---TLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASITEIKADK 667

Query: 524 DTFGLHTAGTPSDGLTKDSRVTISGTASAGDVVTLMD--GATSVGQVTADASGNWTIQTA 581
T D +T +V + VT G S D +G +
Sbjct: 668 TTA----VANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLT 723

Query: 582 SLADGTHSLTASAVDLAGNTSPASSTLPVTVDTINPPPALTLSPLSDTFGSGTSGTNHDN 641
S T ++ S + + + + + G+G G
Sbjct: 724 -------STTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNI-EIVGTGVKGKLPTV 775

Query: 642 ITSATLPTFNGTAAAGSYVQLYDVTGGTTVSVGSAVADSSGGWTTTLTSPLSGSASGVSH 701
+ G Y +V S TTT++ +S ++
Sbjct: 776 WLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTTISV-ISSDNQTATY 834

Query: 702 TL 703
T+
Sbjct: 835 TI 836



Score = 38.5 bits (89), Expect = 5e-04
Identities = 53/259 (20%), Positives = 91/259 (35%), Gaps = 17/259 (6%)

Query: 798 GADGRYQITAQQVDIAGNTSPSSSVTAMTLDTSEPAPVNLHLVDDTFGQGTAGTS--SDN 855
G Y++TA+ D GN+S + +T L S V+ V D T+ + ++
Sbjct: 520 GGSNVYKVTARAYDRNGNSSNNVLLTITVL--SNGQVVDQVGVTDFTADKTSAKADGTEA 577

Query: 856 LTKDSRVTISGTASAGD--VVTLMDGATSVGQVTADASGNWTIQTASLADGTHSLTASAV 913
+T + V +G A A ++ G + +A+ +G+ T +L +
Sbjct: 578 ITYTATVKKNGVAQANVPVSFNIVSGTAVLSANSANTNGSGKA-TVTLKSDKPGQVVVSA 636

Query: 914 DLAGNTSPASSTLPVTVDTINPPPALTLSPLSDTFGSGTSGTNHDNITSATLPTFNGTAA 973
A TS ++ + VD + T + D IT
Sbjct: 637 KTAEMTSALNANAVIFVDQTKASIT-EIKADKTTAVAN----GQDAITYTVKVMKGDKPV 691

Query: 974 AGSYVQLYDVTGGTTVSMGSAVADSSGGWTTTLTSPLSGSASGVSHTLVAVGVDPAGNTS 1033
+ V T +S + D++G TLTS G + VS + V VD
Sbjct: 692 SNQEVTF--TTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSL-VSARVSDVAVDVK--AP 746

Query: 1034 TVSGPDVVVIDNAAAALPG 1052
V + ID+ + G
Sbjct: 747 EVEFFTTLTIDDGNIEIVG 765



Score = 37.4 bits (86), Expect = 0.001
Identities = 28/146 (19%), Positives = 51/146 (34%), Gaps = 8/146 (5%)

Query: 1853 ADGTYTFSAVAVDVAGNTSNPGVPVQVVVDTHAAAPSITLGTPYDTFGTGTSGTNSDELT 1912
Y +A A D GN+SN V + + V ++ T + T ++ +T
Sbjct: 521 GSNVYKVTARAYDRNGNSSNN-VLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAIT 579

Query: 1913 RNTIPYMYGVAEPGARV--TVVENGNTIGTVNA-DSSTGSYSIQIPPATVDGTYTFQAMQ 1969
GVA+ V +V + +A + +G ++ +
Sbjct: 580 YTATVKKNGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVV----S 635

Query: 1970 VDVAGNTSAYSAPNYVTIDTVAATPT 1995
A TSA +A + +D A+ T
Sbjct: 636 AKTAEMTSALNANAVIFVDQTKASIT 661


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_07895  OMPADOMAIN  113  4e-31  OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 113 bits (283), Expect = 4e-31
Identities = 59/180 (32%), Positives = 89/180 (49%), Gaps = 12/180 (6%)

Query: 123 QYQVRF--LGGLAYRGYWADSACRDIAARYADAAGLGVIAVAPCNPSDVAAPLPERVELP 180
Q+ + R ++ R+ V+A AP +V L
Sbjct: 163 QWTNNIGDAHTIGTRP-DNGMLSLGVSYRFGQGEAAPVVAPAPAPAPEVQTK---HFTLK 218

Query: 181 TDTLFAFDKGGFEDISADGRRQLGDLVASIKAKILSINHLIVTGYTDRLGSDEHNARLSS 240
+D LF F+K + +G+ L L + + ++V GYTDR+GSD +N LS
Sbjct: 219 SDVLFNFNK---ATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSE 275

Query: 241 ERARTVADYMIAEGIPAAKITAVGRGAADPVV--VCNNGEQ-PELIRCLQKNRRVEIRIK 297
RA++V DY+I++GIPA KI+A G G ++PV C+N +Q LI CL +RRVEI +K
Sbjct: 276 RRAQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEVK 335


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_08020  PF07675  31  0.008  Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 30.8 bits (69), Expect = 0.008
Identities = 21/59 (35%), Positives = 28/59 (47%), Gaps = 3/59 (5%)

Query: 181 DYVIADPEPRGGRLAME-RGVTWAARRHDHRF--GAHYPWTLRLTPPQDGAPASVEIDT 236
DY I +PEP G++ + G AR D F G Y +T+R DG VE D+
Sbjct: 472 DYCITNPEPASGKMWIAGDGGNQPARYDDFAFEAGKKYTFTMRRAGMGDGTDMEVEDDS 530


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_08050  ABC2TRNSPORT  53  1e-10  ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 53.4 bits (128), Expect = 1e-10
Identities = 62/203 (30%), Positives = 101/203 (49%), Gaps = 12/203 (5%)

Query: 17 ASPLRILFGLTQPLLYLFVLGAALRSGTYAEIGG--YQAYIFPGVVGLSLM----FTAIS 70
A+ +L L +PL+YLF LGA L +GG Y A++ G+V S M F I
Sbjct: 30 AALASLLGHLAEPLIYLFGLGAGL-GVMVGRVGGVSYTAFLAAGMVATSAMTAATFETIY 88

Query: 71 AAVGIVHDRQTGLLNALLVSPVRRVDIALGKIGAGALLAWLQALLLLPFSPAIGIGLTAP 130
AA G + ++T A+L + +R DI LG++ A A L + + A+G
Sbjct: 89 AAFGRMEGQRT--WEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGY-TQWL 145

Query: 131 RLALLVAAMAFAALAFSALGLALALALPFRSVIVFPVVSNTLLLPMFFLSGGLYPLDLAP 190
L + +A LAF++LG+ + P +F ++ P+ FLSG ++P+D P
Sbjct: 146 SLLYALPVIALTGLAFASLGMVVTALAPSYDYFIF--YQTLVITPILFLSGAVFPVDQLP 203

Query: 191 DWIRAAAAFDPAAYGVDLMRGVL 213
+ AA F P ++ +DL+R ++
Sbjct: 204 IVFQTAARFLPLSHSIDLIRPIM 226


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_08105  PF05272  30  0.006  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.7 bits (66), Expect = 0.006
Identities = 11/36 (30%), Positives = 15/36 (41%), Gaps = 2/36 (5%)

Query: 53 ASPSAASAPAASSE--AAPAATTAAAADTPNPGGEK 86
+SP+AA+ A E + A D PGG
Sbjct: 388 SSPTAAAGGAGGGEPPKKRDPSAGAGTDPGGPGGGD 423


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_08120  RTXTOXIND  37  3e-04  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 36.7 bits (85), Expect = 3e-04
Identities = 21/139 (15%), Positives = 43/139 (30%), Gaps = 12/139 (8%)

Query: 38 PPAPAISRDEALAELKRVQAALDRIKQQASAATTYKQLDALDESTQALSADVDKLTAALV 97
P +S +E L ++ + Q LD + A +++
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKEL--NLDKKRAERLTVLARINRYENLS- 230

Query: 98 PTRAQLQAQLDVLGPPPAPGAAPETPAVARQR------ADLNARKTQLDAALKQAADEKE 151
+++LD A + + ++ +L K+QL+ + KE
Sbjct: 231 ---RVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKE 287

Query: 152 SLANLTQQYSKLRRSLLRD 170
+TQ + LR
Sbjct: 288 EYQLVTQLFKNEILDKLRQ 306



Score = 30.2 bits (68), Expect = 0.037
Identities = 31/188 (16%), Positives = 56/188 (29%), Gaps = 28/188 (14%)

Query: 7 FARRVALIGLLHLLCAALPAAAADLASGASVPPAPAISRDEALAELKRVQAA----LDRI 62
R VA + L+ A + + + A+ S K ++ + I
Sbjct: 56 RPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHS-----GRSKEIKPIENSIVKEI 110

Query: 63 K----QQASAATTYKQLDALDESTQALSADVDKLTAALVPTRAQL--------QAQLDVL 110
+ +L AL L L A L TR Q+ + L
Sbjct: 111 IVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKL 170

Query: 111 GPPPAPGAAPETPAVAR------QRADLNARKTQLDAALKQAADEKESLANLTQQYSKLR 164
P E + Q + +K Q + L + E+ ++ +Y L
Sbjct: 171 PDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLS 230

Query: 165 RSLLRDQL 172
R + + +L
Sbjct: 231 R-VEKSRL 237


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_08180  NUCEPIMERASE  33  0.001  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 33.2 bits (76), Expect = 0.001
Identities = 21/120 (17%), Positives = 38/120 (31%), Gaps = 33/120 (27%)

Query: 1 MKIAIVG-AGLIGHTIAHLLRETGDYEVVAFD---------------------------- 31
MK + G AG IG ++ L E G +VV D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGH-QVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKI 59

Query: 32 --RDADALAKLANEGIATQRVDSADAAAIREAVKGFDALVNALPYYLAVNVAAAAKAAGV 89
D + + L G + S A+R +++ A ++ +N+ + +
Sbjct: 60 DLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADS-NLTGFLNILEGCRHNKI 118


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_08195  RTXTOXIND  40  1e-05  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 40.2 bits (94), Expect = 1e-05
Identities = 21/135 (15%), Positives = 45/135 (33%), Gaps = 13/135 (9%)

Query: 50 TLARYADTQADAALRATQRELQREQGALTDKADAFSIARNRAQQEYDQAKESLRNWIATR 109
L + A+A TQ L + + + + R Q + + +
Sbjct: 123 VLLKLTALGAEADTLKTQSSLLQAR-----------LEQTRYQILSRSIELNKLPELKLP 171

Query: 110 SATGDASRNPDVLARTRRLDELQQAVAGWQRQQDQIADQLDALRKRQDDVGARLAQSRAQ 169
+ + + + R L +++ + WQ Q+ Q LD R + V AR+ +
Sbjct: 172 DEPYFQNVSEEEVLRLTSL--IKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENL 229

Query: 170 AEWRYEQASRRYELV 184
+ + L+
Sbjct: 230 SRVEKSRLDDFSSLL 244


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_08255  HTHTETR  35  3e-04  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 34.6 bits (79), Expect = 3e-04
Identities = 15/107 (14%), Positives = 40/107 (37%), Gaps = 1/107 (0%)

Query: 12 ATISDVAREAGTGKTSVSRYLNGETNVLSADLRQRIETAIERLNYRPNQMARGL-KRGRN 70
++ ++A+ AG + ++ + ++++ S E + R
Sbjct: 32 TSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGELELEYQAKFPGDPLSVLRE 91

Query: 71 RLLGLLAADLTNPYTVEVLRGVEAACHALGYMPLICHAANELEMERR 117
L+ +L + +T ++ + C +G M ++ A L +E
Sbjct: 92 ILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQRNLCLESY 138


24  E4F39_08205  E4F39_08320  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_08205  2  17  1.908158  hypothetical protein
E4F39_08210  1  15  1.293892  hypothetical protein
E4F39_08215  2  16  1.616290  hypothetical protein
E4F39_08225  0  10  1.504717  transcriptional regulator
E4F39_08230  0  10  1.432674  DUF2975 domain-containing protein
E4F39_08240  0  8  2.665102  PAS domain-containing protein
E4F39_08250  1  10  2.331780  MBL fold metallo-hydrolase
E4F39_08255  0  11  3.378259  TIGR01244 family phosphatase
E4F39_08260  0  11  3.816888  sulfite exporter TauE/SafE family protein
E4F39_08265  1  11  3.816291  hypothetical protein
E4F39_08270  0  13  3.631158  ABC transporter permease subunit
E4F39_08275  0  13  3.927743  ABC transporter permease subunit
E4F39_08280  -2  13  3.381768  polyamine ABC transporter ATP-binding protein
E4F39_08285  -1  12  2.930279  polyamine ABC transporter substrate-binding
E4F39_08290  0  11  1.945106  Mn(2+) uptake NRAMP transporter MntH
E4F39_08295  0  13  2.505398  DUF1289 domain-containing protein
E4F39_08300  -1  11  2.543147  OmpW family protein
E4F39_08305  -1  10  2.337919  nitronate monooxygenase
E4F39_08310  -2  10  2.724402  aldehyde dehydrogenase family protein
E4F39_08315  -3  8  2.846929  hypothetical protein
E4F39_08320  0  11  4.288277  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_08340  PF03544  28  0.027  Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 28.4 bits (63), Expect = 0.027
Identities = 19/123 (15%), Positives = 31/123 (25%), Gaps = 2/123 (1%)

Query: 82 VSAPPSASTAVSTARSRLPSPLLLSAPASAPMTARARPSARRGPAPRASMGSVHVAARER 141
+ AP + A + L P + P P P P V + +
Sbjct: 43 LPAPAQPISVTMVAPADLEPPQAVQPPPEPV--VEPEPEPEPIPEPPKEAPVVIEKPKPK 100

Query: 142 EPSSRRAPGIPAVSEPMREPRSDAQASAEAGDAQRRLPRAPGVAADWRADLDSLGAARPL 201
+ + +P AS A R + AA + R L
Sbjct: 101 PKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASGPRAL 160

Query: 202 RQA 204
+
Sbjct: 161 SRN 163


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_08355  HTHFIS  333  e-112  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 333 bits (855), Expect = e-112
Identities = 124/357 (34%), Positives = 178/357 (49%), Gaps = 42/357 (11%)

Query: 145 ERLTTVRSASAKPSGEGLVGGSDAFNAALSALQRVAPSMLPVLLLGESGTGKELFARALH 204
+ + G LVG S A L R+ + L +++ GESGTGKEL ARALH
Sbjct: 122 PKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALH 181

Query: 205 EASARAMGPFVVVDCSGIAETLFESELFGYEKGAFTGASARKPGLVETAQGGTLFLDEIG 264
+ R GPFV ++ + I L ESELFG+EKGAFTGA R G E A+GGTLFLDEIG
Sbjct: 182 DYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIG 241

Query: 265 DVPLSMQVKLLRLIESGTFRRVGGVEALCADFRLVAATHKPLKAMIGDGRFRPDLYYRIS 324
D+P+ Q +LLR+++ G + VGG + +D R+VAAT+K LK I G FR DLYYR++
Sbjct: 242 DMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLN 301

Query: 325 AYPISLPAVRERPGDMPLLVDSILRRIAALGPVAGQHFVVAPDALARLEAYAWPGNIREL 384
P+ LP +R+R D+P LV +++ G +AL ++A+ WPGN+REL
Sbjct: 302 VVPLRLPPLRDRAEDIPDLVRHFVQQAEKEG---LDVKRFDQEALELMKAHPWPGNVREL 358

Query: 385 RNVLDRACLLTDDGVIRVEHLPDEVAGGARIEPGAPAKLSDDELARIARAFDGTRRAL-- 442
N++ R L VI E + +E+ P A L+ + R+
Sbjct: 359 ENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFAS 418

Query: 443 -------------------------------------AERVGMSERTLYRRLRALGI 462
A+ +G++ TL +++R LG+
Sbjct: 419 FGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGV 475


25  E4F39_08950  E4F39_08975  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_08950  -1  13  -3.922451  LysR family transcriptional regulator
E4F39_08955  -1  13  -4.188989  glyoxylate carboligase
E4F39_08960  1  15  -4.661302  hydroxypyruvate isomerase
E4F39_08965  2  17  -4.611810  2-hydroxy-3-oxopropionate reductase
E4F39_08970  3  15  -1.197448  hypothetical protein
E4F39_08975  2  12  0.015182  peptidoglycan-binding protein LysM
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_09040  SALSPVBPROT  28  0.023  Salmonella virulence plasmid 65kDa B protein signature.
		>SALSPVBPROT#Salmonella virulence plasmid 65kDa B protein signature.

Length = 591

Score = 27.8 bits (61), Expect = 0.023
Identities = 26/97 (26%), Positives = 39/97 (40%), Gaps = 5/97 (5%)

Query: 14 LLGKSDEQAATDPAAANQTAADAIKNYISAQGLDTSNLTVAFDGASRTVTLTGSVADLDT 73
LLGK+ +DP AA+ TA ++ ++ G +A +G + V L G+ A D
Sbjct: 170 LLGKTAAARLSDPQAASHTAQWLVEESVTPAGEHIYYSYLAENGDN--VDLNGNEAGRDR 227

Query: 74 KAK---VKVAAGNVQGVAGVNDDDLQPDDPQVQYHDV 107
A KV GN A + Q + V
Sbjct: 228 SAMRYLSKVQYGNATPAADLYLWTSATPAVQWLFTLV 264


26  E4F39_09240  E4F39_09455  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_09240  -1  14  3.465641  hypothetical protein
E4F39_09245  -1  14  3.026593  ATPase
E4F39_09250  -1  13  3.677402  hypothetical protein
E4F39_09255  -2  13  3.685146  ribosome-associated translation inhibitor RaiA
E4F39_09265  -3  11  0.076134  LuxR family transcriptional regulator
E4F39_09275  -3  11  -0.241673  *hypothetical protein
E4F39_09280  -3  16  -0.972756  endopeptidase La
E4F39_09295  7  34  -0.318104  ATP-dependent Clp protease ATP-binding subunit
E4F39_09300  6  38  0.022570  ATP-dependent Clp protease proteolytic subunit
E4F39_09305  2  31  -0.334456  trigger factor
E4F39_09315  1  31  -2.194479  glycerate kinase
E4F39_09320  1  12  -3.487659  MarR family transcriptional regulator
E4F39_09325  3  11  -4.620898  hypothetical protein
E4F39_09330  3  10  -4.967244  helix-turn-helix transcriptional regulator
E4F39_09335  4  10  -4.748096  LuxR family transcriptional regulator
E4F39_09345  4  10  -4.556385  porin
E4F39_09350  3  9  -2.425341  MFS transporter
E4F39_09355  1  9  -1.450250  class II histone deacetylase
E4F39_09360  1  9  0.847524  phospholipase D family protein
E4F39_09365  2  15  2.645013  exported avidin family protein
E4F39_09370  1  17  4.231745  hypothetical protein
E4F39_09375  2  18  4.006265  hypothetical protein
E4F39_09380  -1  18  2.357973  hypothetical protein
E4F39_09385  -1  15  3.326692  DUF4123 domain-containing protein
E4F39_09390  -1  13  2.717301  hypothetical protein
E4F39_09395  -1  9  1.598305  hypothetical protein
E4F39_09400  -2  12  1.397159  hypothetical protein
E4F39_09405  -2  13  -0.227026  *presqualene diphosphate synthase HpnD
E4F39_09410  -3  17  -1.198104  DUF1501 domain-containing protein
E4F39_09415  -2  27  -4.326611  DUF1800 domain-containing protein
E4F39_09420  3  43  -8.395183  hypothetical protein
E4F39_09425  5  53  -10.675155  hypothetical protein
E4F39_09430  4  38  -8.566698  two-component system response regulator OmpR
E4F39_09435  1  30  -6.110004  HAMP domain-containing protein
E4F39_09440  5  37  -6.903412  peroxiredoxin
E4F39_09445  4  36  -6.925722  carboxymuconolactone decarboxylase family
E4F39_09450  2  32  -5.785061  2-C-methyl-D-erythritol 2,4-cyclodiphosphate
E4F39_09455  0  24  -3.426900  2-C-methyl-D-erythritol 4-phosphate
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_09350  GPOSANCHOR  40  3e-05  Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 40.4 bits (94), Expect = 3e-05
Identities = 35/192 (18%), Positives = 73/192 (38%), Gaps = 15/192 (7%)

Query: 92 KVLVEGLQRAQALSIEEQETQFSCEVMPLEPDHADSAETEALRRAIVSQFDQYVKLNKKI 151
L + L+ A S + + E A A+ E ++ K +
Sbjct: 193 AELEKALEGAMNFSTADSAKIKTLEAE-KAALAARKADLEKALEGAMNFSTADSAKIKTL 251

Query: 152 PPEILTSLSGIDEAGRLADTIAAHLPLKLDQKQHILEMFPVIERLEHLLAQLEAEIDILQ 211
E + E + + + + + LE A LE + +L
Sbjct: 252 EAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAE---KAALEAEKADLEHQSQVLN 308

Query: 212 VEKRIRGRVKRQMEKSQREYYLNEQVKAIQKELGEGEEGAD--LEELEKRINAARMPKEA 269
R ++R ++ S+ +Q++A ++L E + ++ + L + ++A+R EA
Sbjct: 309 AN---RQSLRRDLDASREAK---KQLEAEHQKLEEQNKISEASRQSLRRDLDASR---EA 359

Query: 270 KKKADAELKKLK 281
KK+ +AE +KL+
Sbjct: 360 KKQLEAEHQKLE 371


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_09355  HTHFIS  31  0.009  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.0 bits (70), Expect = 0.009
Identities = 19/112 (16%), Positives = 40/112 (35%), Gaps = 12/112 (10%)

Query: 51 EAAAAGVEASLSKSDLPSPQEIRDILDQYVIGQERAKKILAVAVYNHYKRL-------KH 103
+A+ G L K E+ I+ + + +R L + + +
Sbjct: 92 KASEKGAYDYLPKP--FDLTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEI 149

Query: 104 LDKKDDVELSKSNILLIGPTGSGKTLLAQTLARL---LNVPFVIADATTLTE 152
+ + +++ G +G+GK L+A+ L N PFV + +
Sbjct: 150 YRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPR 201


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_09405  ECOLNEIPORIN  67  1e-14  E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 67.1 bits (164), Expect = 1e-14
Identities = 72/323 (22%), Positives = 119/323 (36%), Gaps = 37/323 (11%)

Query: 20 AATLAALSGPAHAQSTLTLYGVADAGVQYLSRADGRHAAWRLQN-----YGILPSQLGIK 74
A TLAAL P A + +TLYG AGV SR+ + A L S++G K
Sbjct: 7 ALTLAAL--PVAAMADVTLYGTIKAGV-ETSRSVAHNGAQAASVETGTGIVDLGSKIGFK 63

Query: 75 GEEDLGGGWRARFQLEQGINLNDSTATVPGYAFFRGAYVGMGGPAGTVTLGRQFSTLFDK 134
G+EDLG G +A +Q+EQ ++ + + R +++G+ G G + +GR S L D
Sbjct: 64 GQEDLGNGLKAIWQVEQKASIAGTDSGW----GNRQSFIGLKGGFGKLRVGRLNSVLKDT 119

Query: 135 TLFYDPLWYASYSGQGVLVPLSANFVDHSIKFQSATFAGFDVEALAAMAGIAGNTRAGRV 194
+P S + + S+++ S FAG ++ A N AGR
Sbjct: 120 GDI-NPWDSKSDYLGVNKIAEPEARLI-SVRYDSPEFAGL-SGSVQ----YALNDNAGRH 172

Query: 195 ------LELGGQFTSRGLSASAVLHRSH-GTAQGGADRSAQRRDIGTFAARYAFASLPLT 247
+ + R H ++ R + + +AS+ +
Sbjct: 173 NSESYHAGFNYKNGGFFVQYGGAYKRHHQVQENVNIEKYQIHRLVSGYDNDALYASVAVQ 232

Query: 248 VHAGVQRLTGELDPARTIV-------WGGARYQASGRFGFAGGIYHTDSPTPQVGHPTLF 300
++T V +G + S GF G T+
Sbjct: 233 QQDAKLVEENYSHNSQTEVAATLAYRFGNVTPRVSYAHGFKGSFDATNYNN----DYDQV 288

Query: 301 IASTTCSLSKRTVAYLNLGYAKN 323
+ SKRT A ++ G+ +
Sbjct: 289 VVGAEYDFSKRTSALVSAGWLQE 311


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_09410  TCRTETA  31  0.009  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 30.9 bits (70), Expect = 0.009
Identities = 38/135 (28%), Positives = 57/135 (42%), Gaps = 4/135 (2%)

Query: 254 AQTSGNVLAIASLMGIAGAALASYLGGRAARRAMLLAGYGILAASLVALAAAPNANGYTL 313
G +LA+ +LM A A + L R RR +LL A +A AP +
Sbjct: 42 TAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYI 101

Query: 314 A--IFGFKFAWTFVLPFMLASVAAVDATGRLIATLNLVIGSGLAAGPLAAGLMLDGGGTL 371
+ G A V +A + D R ++ G G+ AGP+ GLM GG +
Sbjct: 102 GRIVAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLM--GGFSP 159

Query: 372 RALFSIAAAVSLVSL 386
A F AAA++ ++
Sbjct: 160 HAPFFAAAALNGLNF 174


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_09510  HTHFIS  100  3e-26  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 99.5 bits (248), Expect = 3e-26
Identities = 39/136 (28%), Positives = 72/136 (52%), Gaps = 2/136 (1%)

Query: 7 SKILVVDDDPRLRDLLRRYLGEQGFNVYVAENATAMNKLWVRERFDLLVLDLMLPGEDGL 66
+ ILV DDD +R +L + L G++V + NA + + DL+V D+++P E+
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 67 SICRRLRGSNDRTPIIMLTAKGEDVDRIVGLEMGADDYLPKPFNPRELVARIHAVL--RR 124
+ R++ + P+++++A+ + I E GA DYLPKPF+ EL+ I L +
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 125 QAPAELPGAPSETTEV 140
+ P++L + +
Sbjct: 124 RRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_09515  PF06580  38  5e-05  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 38.3 bits (89), Expect = 5e-05
Identities = 21/104 (20%), Positives = 37/104 (35%), Gaps = 18/104 (17%)

Query: 343 LVENARKYGQSKQDGIARILLETRVTHARVELVVADEGPGIPEDQLPLVMRPFYRVDTAR 402
LVEN K+G ++ +ILL+ + V L V + G ++
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNT--------------- 307

Query: 403 SKADGTGLGMAIV-LRLVNRYRGALRLRNRTPDAGLEVTLEFPS 445
+ TG G+ V RL Y +++ + + P
Sbjct: 308 --KESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLIPG 349


27  E4F39_09715  E4F39_09855  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_09715  4  27  -0.614097  hypothetical protein
E4F39_09720  5  19  0.919970  DUF899 domain-containing protein
E4F39_09725  5  22  0.954516  helix-turn-helix domain-containing protein
E4F39_09730  5  23  0.346658  glutamine cyclotransferase
E4F39_09735  6  25  -1.177332  DMT family transporter
E4F39_09740  7  38  -2.528254  RnfH family protein
E4F39_09745  7  41  -5.516010  type II toxin-antitoxin system RatA family
E4F39_09750  8  49  -7.622126  SsrA-binding protein SmpB
E4F39_09755  10  51  -8.248405  SPFH/Band 7/PHB domain protein
E4F39_09760  11  51  -8.475325  NfeD family protein
E4F39_09765  11  49  -8.405455  phosphoenolpyruvate synthase
E4F39_09770  10  46  -9.426587  hypothetical protein
E4F39_09775  9  46  -9.254589  glutathione gamma-glutamylcysteinyltransferase
E4F39_09780  9  49  -9.366435  hypothetical protein
E4F39_09785  4  25  -7.097820  peptidase S53
E4F39_09790  2  21  -4.690688  kinase/pyrophosphorylase
E4F39_09795  0  11  -3.055733  hypothetical protein
E4F39_09800  -1  8  -0.601364  RNA methyltransferase
E4F39_09805  -1  7  -0.044567  ribonuclease HII
E4F39_09810  -3  9  2.174722  lipid-A-disaccharide synthase
E4F39_09815  0  11  3.555504  acyl-ACP--UDP-N-acetylglucosamine
E4F39_09820  -2  12  2.758643  3-hydroxyacyl-ACP dehydratase FabZ
E4F39_09825  0  13  3.947079  UDP-3-O-(3-hydroxymyristoyl)glucosamine
E4F39_09830  -1  13  2.768247  OmpH family outer membrane protein
E4F39_09835  -1  12  1.839762  outer membrane protein assembly factor BamA
E4F39_09840  -1  11  -1.636318  RIP metalloprotease RseP
E4F39_09845  -1  12  -1.887220  1-deoxy-D-xylulose-5-phosphate reductoisomerase
E4F39_09850  0  15  -3.164968  phosphatidate cytidylyltransferase membrane
E4F39_09855  -2  12  -3.465938  di-trans,poly-cis-decaprenylcistransferase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_09865  RTXTOXINA  30  0.021  Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 29.5 bits (66), Expect = 0.021
Identities = 18/92 (19%), Positives = 35/92 (38%), Gaps = 8/92 (8%)

Query: 221 INQAQGEAAAILAVAEANSQAIQKIAQAIQSQGGMDAVNLKVAEQYVGAFGNLAKAGNTL 280
INQ A++ + SQ + + + + ++ V K+ NL G L
Sbjct: 188 INQLVDTVASLNNNVNSFSQQLNTLGSVLSNTKHLNGVGNKLQN-----LPNLDNIGAGL 242

Query: 281 IVPSNLSDLSTAIASALTIVNRSAPGALAPGA 312
+S + +AI+++ + N A A
Sbjct: 243 DT---VSGILSAISASFILSNADADTRTKAAA 271


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_09875  PHPHTRNFRASE  265  4e-81  Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 265 bits (678), Expect = 4e-81
Identities = 99/441 (22%), Positives = 171/441 (38%), Gaps = 73/441 (16%)

Query: 383 IHDPSEMERVQPGDVLVADMTDPNWEPVMK-RASAIVTNRGGRTCHAAIIARELGVPAVV 441
+ S + ++ D+T + + K T+ GGRT H+AI++R L +PAVV
Sbjct: 145 VETGSLATIAEETVIIAEDLTPSDTAQLNKQFVKGFATDIGGRTSHSAIMSRSLEIPAVV 204

Query: 442 GCGDATDVLKDGALVTVSCAEGDEGKIYDGLLETEVSEVQRGE------------LPSVP 489
G + T+ ++ G +V V G EG + E EV + L P
Sbjct: 205 GTKEVTEKIQHGDMVIVD---GIEGIVIVNPTEEEVKAYEEKRAAFEKQKQEWAKLVGEP 261

Query: 490 --------VKIMMNVGNPQLAFDFSQLPNAGVGLARLEFIINNNIGVHPKAILEYPNVDA 541
V++ N+G P+ G+GL R EF+ + P
Sbjct: 262 STTKDGAHVELAANIGTPKDVDGVLANGGEGIGLYRTEFLYMDR-DQLPT---------- 310

Query: 542 DLKKAVESVARGHASPRAFYVDKLTEGIATIAAAFYPKPVIVRLSDFKSNEYKKLIGGSR 601
++ E + KPV++R D ++ +
Sbjct: 311 --------------------EEEQFEAYKEVVQRMDGKPVVIRTLDIGGDKELSYL---- 346

Query: 602 YEPDEENPMLGFRGASRYIAEDFAQAFEMECMALKRVRDEMGLTNVEIMVPFVRTVKQAE 661
P E NP LGFR + + F + AL R N+++M P + T+++
Sbjct: 347 QLPKELNPFLGFRAIRLCL--EKQDIFRTQLRALLRAS---TYGNLKVMFPMIATLEELR 401

Query: 662 RVVGLLGKFGLKRGDNG------LRLIMMCEVPSNAILAEEFLQHFDGFSIGSNDLTQLT 715
+ ++ + K G + + +M E+PS A+ A F + D FSIG+NDL Q T
Sbjct: 402 QAKAIMQEEKDKLLSEGVDVSDSIEVGIMVEIPSTAVAANLFAKEVDFFSIGTNDLIQYT 461

Query: 716 LGLDRDSGMELLAVDFDERDPAVKFMLKRAIDTCRKLDKYVGICGQGPSDHPDFAKWLAD 775
+ DR + E ++ + PA+ ++ I K+VG+CG+ D L
Sbjct: 462 MAADRMN--ERVSYLYQPYHPAILRLVDMVIKAAHSEGKWVGMCGEMAGD-EVAIPLLLG 518

Query: 776 EGIASISLNPDTVIETWQALA 796
G+ S++ +++ L
Sbjct: 519 LGLDEFSMSATSILPARSQLL 539


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_09890  INFPOTNTIATR  25  0.019  Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 25.3 bits (55), Expect = 0.019
Identities = 11/30 (36%), Positives = 17/30 (56%)

Query: 1 MPRVRPAAIAIAIAIAIATATATATATDTD 30
M V A + +A++ A+A AT+ TD D
Sbjct: 3 MKLVTAAIMGLAMSTAMAATDATSLTTDKD 32


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_09895  SUBTILISIN  41  7e-06  Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 41.4 bits (97), Expect = 7e-06
Identities = 21/122 (17%), Positives = 36/122 (29%), Gaps = 22/122 (18%)

Query: 342 VLYAAPSMLLSDITSAYNRAVVDNVAKVINVSLGVCEADARASGTQAADDRIFKSAVAQG 401
VL S I A+ + +I++SLG A + AVA
Sbjct: 117 VLNKQGSGQYDWIIQGIYYAI-EQKVDIISMSLG---GPEDVPELHEAVKK----AVASQ 168

Query: 402 QTFVVAAGDAGAYECSVSRVSGGQGVPARSNYSVSEPATSPYVVAVGGTTLSTDRTTLAY 461
+ AAG+ G + + P V++VG + +
Sbjct: 169 ILVMCAAGNEGDGDDRTD--------------ELGYPGCYNEVISVGAINFDRHASEFSN 214

Query: 462 AG 463
+
Sbjct: 215 SN 216


28  E4F39_10405  E4F39_10435  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_10405  1  11  3.021616  *aspartate kinase
E4F39_10410  1  10  3.342816  hypothetical protein
E4F39_10415  1  9  3.678613  tRNA lysidine(34) synthetase TilS
E4F39_10420  1  8  3.922637  acetyl-CoA carboxylase carboxyltransferase
E4F39_10425  1  12  3.672369  DNA-3-methyladenine glycosylase 2 family
E4F39_10430  1  15  3.256912  cysteine--tRNA ligase
E4F39_10435  -1  19  3.199468  tetratricopeptide repeat protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_10505  CARBMTKINASE  36  2e-04  Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 36.0 bits (83), Expect = 2e-04
Identities = 33/119 (27%), Positives = 56/119 (47%), Gaps = 15/119 (12%)

Query: 116 IDDERVRRDLDAGKVVIITGFQGV---DPDGHITTL-GRGGSDTSAVAVAAALEADECLI 171
++ E +++ ++ G +VI +G GV DG I + D + +A + AD +I
Sbjct: 174 VEAETIKKLVERGVIVIASGGGGVPVILEDGEIKGVEAVIDKDLAGEKLAEEVNADIFMI 233

Query: 172 YTDVDGVYTTDPRVVEEARRLDSVTFEEMLEMA--------SLGSKVLQ-IRSVEFAGK 221
TDV+G E+ + L V EE+ + S+G KVL IR +E+ G+
Sbjct: 234 LTDVNGAALYYGT--EKEQWLREVKVEELRKYYEEGHFKAGSMGPKVLAAIRFIEWGGE 290


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_10535  SYCDCHAPRONE  28  0.017  Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 28.4 bits (63), Expect = 0.017
Identities = 12/65 (18%), Positives = 20/65 (30%)

Query: 72 DDAIKQFVELTQAYPELPEPYNNLAALYAKHGRYDDARSALVTATHANPNYGLAYENLGD 131
+DA K F L + L A G+YD A + + + +
Sbjct: 53 EDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMDIKEPRFPFHAAE 112

Query: 132 LYLRL 136
L+
Sbjct: 113 CLLQK 117


29  E4F39_10570  E4F39_10670  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_10570326-6.107009PAAR domain-containing protein
E4F39_10575026-6.530920DUF4123 domain-containing protein
E4F39_10585640-7.985588hypothetical protein
E4F39_10590740-7.980191hypothetical protein
E4F39_105951054-11.940704DNA adenine methylase
E4F39_106001054-12.018551lysozyme
E4F39_106051053-12.340197glycosyl hydrolase
E4F39_10610957-12.776341hypothetical protein
E4F39_106151175-17.676948late control protein
E4F39_106201176-18.622773phage tail protein
E4F39_10625965-16.162109phage tail protein
E4F39_10635952-11.621497phage tail tape measure protein
E4F39_10640948-10.708869phage tail assembly protein
E4F39_10645329-5.259962phage major tail tube protein
E4F39_10650220-4.281381phage tail sheath family protein
E4F39_10655125-4.027955tail assembly chaperone
E4F39_10660228-4.743279hypothetical protein
E4F39_10670021-3.059944phage tail protein I
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_10695  PRPHPHLPASEC  29  0.034  Prokaryotic zinc-dependent phospholipase C signature.
		>PRPHPHLPASEC#Prokaryotic zinc-dependent phospholipase C signature.

Length = 398

Score = 28.8 bits (64), Expect = 0.034
Identities = 10/50 (20%), Positives = 20/50 (40%), Gaps = 1/50 (2%)

Query: 281 DVSAEKTVTLKGFKRDADGDFLVES-VTHEYAGRSWETEVVLNAGNKGKA 329
D +A +GF + + + ++H + + +V L KG A
Sbjct: 212 DFNAWSKEYARGFAKTGKSIYYSHASMSHSWDDWDYAAKVTLANSQKGTA 261


30  E4F39_10795  E4F39_10905  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_10795219-3.513248hypothetical protein
E4F39_10800224-5.749833hypothetical protein
E4F39_10805330-6.461991hypothetical protein
E4F39_10810130-6.284503hypothetical protein
E4F39_10815028-5.840110hypothetical protein
E4F39_10820129-6.044513hypothetical protein
E4F39_10825028-5.524605XRE family transcriptional regulator
E4F39_10830026-5.944277hypothetical protein
E4F39_10835022-5.305312hypothetical protein
E4F39_10840-136-7.140959hypothetical protein
E4F39_10845040-7.920713hypothetical protein
E4F39_10850144-9.112203ACP synthase
E4F39_10855243-9.612190pyridoxal phosphate biosynthetic protein PdxJ
E4F39_10860242-7.634044hypothetical protein
E4F39_10865141-7.626071hypothetical protein
E4F39_10870140-6.909876AlpA family phage regulatory protein
E4F39_10875039-6.787805DUF4102 domain-containing protein
E4F39_10880029-5.042929DNA mismatch repair protein MutS
E4F39_10885025-4.118059hypothetical protein
E4F39_10890025-4.423737peptidylprolyl isomerase
E4F39_10895028-4.411102cupin domain-containing protein
E4F39_10900233-5.695606hypothetical protein
E4F39_10905021-3.145448lipoprotein
31  E4F39_11410  E4F39_11530  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_11410012-4.037458hypothetical protein
E4F39_11420-110-3.065150N-formylglutamate deformylase
E4F39_11425-410-1.707885hypothetical protein
E4F39_11430-49-0.699041formimidoylglutamate deiminase
E4F39_11440015-0.029621imidazolonepropionase
E4F39_114458233.488413HutD family protein
E4F39_114506201.550287urocanate hydratase
E4F39_114555191.710792histidine utilization repressor
E4F39_114603142.120667histidine ammonia-lyase
E4F39_114651132.561007hypothetical protein
E4F39_114701113.4346864'-phosphopantetheinyl transferase superfamily
E4F39_114750122.350870alpha/beta fold hydrolase
E4F39_114800112.117152LuxR family transcriptional regulator
E4F39_114850132.030301hypothetical protein
E4F39_114900122.200097hypothetical protein
E4F39_114950112.638511adenylyl-sulfate kinase
E4F39_11500-1102.475922deoxyribodipyrimidine photo-lyase
E4F39_11505-173.741081alkane 1-monooxygenase
E4F39_11510093.767856hypothetical protein
E4F39_115154153.510147nitric-oxide reductase large subunit
E4F39_115204153.009080metal-sulfur cluster assembly factor
E4F39_115253171.564869hypothetical protein
E4F39_115303142.605299NnrS family protein
32  E4F39_11690  E4F39_11735  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_116901113.026664ubiquinol oxidase subunit II
E4F39_116952112.904326cytochrome o ubiquinol oxidase subunit I
E4F39_117001141.384358cytochrome o ubiquinol oxidase subunit III
E4F39_11705114-0.973189cytochrome o ubiquinol oxidase subunit IV
E4F39_11710-117-2.065249fatty acid desaturase
E4F39_11715-115-3.602805ABC transporter substrate-binding protein
E4F39_1172009-4.474636glutathione S-transferase
E4F39_11725-19-3.970257succinylglutamate desuccinylase
E4F39_1173019-4.049832N-succinylarginine dihydrolase
E4F39_1173519-3.013155succinylglutamate-semialdehyde dehydrogenase
33  E4F39_12420  E4F39_12445  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_12420324-2.642640acyl-CoA dehydrogenase
E4F39_12425628-3.557813electron transfer flavoprotein subunit alpha
E4F39_12430426-4.317749electron transfer flavoprotein subunit beta/FixA
E4F39_12435321-3.543062MetQ/NlpA family ABC transporter
E4F39_12440320-3.606027ABC transporter permease
E4F39_12445318-3.299595methionine ABC transporter ATP-binding protein
34  E4F39_12820  E4F39_12920  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_12820439-1.604187bifunctional (p)ppGpp
E4F39_12825438-1.662564DNA-directed RNA polymerase subunit omega
E4F39_12830232-1.116116guanylate kinase
E4F39_128350200.602121YicC family protein
E4F39_12840-1100.433488ribonuclease PH
E4F39_12845-190.994019RdgB/HAM1 family non-canonical purine NTP
E4F39_128500101.557869oxygen-independent coproporphyrinogen III
E4F39_12855091.583587*hypothetical protein
E4F39_12860081.533329hypothetical protein
E4F39_128653110.409453TonB family protein
E4F39_128703180.097270hypothetical protein
E4F39_12875615-1.634482MFS transporter
E4F39_12885514-3.254076hypothetical protein
E4F39_12895112-4.248918lipopolysaccharide heptosyltransferase I
E4F39_12900012-4.275595chloride channel protein
E4F39_12905-211-3.900953hypothetical protein
E4F39_12920-112-3.148319cell division topological specificity factor
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_12985  PF03544  29  0.007  Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 29.2 bits (65), Expect = 0.007
Identities = 10/45 (22%), Positives = 18/45 (40%)

Query: 82 VVVAFTVDRNGQLVKSAVYRSNGDDEAEGIALASLRRAAPLPPPP 126
V V F V +G++ + + + E ++RR P P
Sbjct: 180 VKVKFDVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKP 224


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_12995  TCRTETA  34  7e-04  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 34.4 bits (79), Expect = 7e-04
Identities = 33/147 (22%), Positives = 52/147 (35%), Gaps = 15/147 (10%)

Query: 290 LLCFAVLFMALATPLAAAAADRFGRKPVLIAGAIAALLSGFTMAPLLGSGSMPLVALFLT 349
LL L P+ A +DRFGR+PVL+ A + MA + V
Sbjct: 48 LLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMA----TAPFLWVLYIGR 103

Query: 350 IELFLMGMTFAPMGALLPELFP--TNVRYTG-AGVAYNLGGILGASIAPYIAQLLAARGG 406
I + G T A GA + ++ R+ G + G + G + +
Sbjct: 104 IVAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPH--- 160

Query: 407 LAWVGAYVSAAAAIS--LVGVLLMRET 431
+ +AA L G L+ E+
Sbjct: 161 ---APFFAAAALNGLNFLTGCFLLPES 184


35  E4F39_13470  E4F39_13530  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_13470116-4.012243dTDP-glucose 4,6-dehydratase
E4F39_13475227-5.469108symmetrical bis(5'-nucleosyl)-tetraphosphatase
E4F39_13480134-6.6088791-acyl-sn-glycerol-3-phosphate acyltransferase
E4F39_13485240-7.786691dihydroorotase
E4F39_13495240-9.123691aspartate carbamoyltransferase
E4F39_13500240-9.036228bifunctional pyr operon transcriptional
E4F39_13505339-9.802099Holliday junction resolvase RuvX
E4F39_13510439-9.370507YqgE/AlgH family protein
E4F39_13515441-9.713693rubredoxin
E4F39_13520436-9.335067hydroxymethylpyrimidine/phosphomethylpyrimidine
E4F39_13525228-7.610520chaperonin GroEL
E4F39_13530122-5.475457co-chaperone GroES
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_13550  NUCEPIMERASE  174  4e-54  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 174 bits (444), Expect = 4e-54
Identities = 90/350 (25%), Positives = 136/350 (38%), Gaps = 45/350 (12%)

Query: 2 ILVTGGAGFIGANFVLDWLAQSDEAVLNVDKLT--YAGNLGTLK-SLQGNPKHVFARVDI 58
LVTG AGFIG + L + V+ +D L Y +L + L P F ++D+
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAGHQ-VVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 59 CDRAAIDALLAQHKPRAIVHFAAESHVDRSIHGPADFVQTNVVGTFTLLEAARQYWSALG 118
DR + L A + V S+ P + +N+ G +LE R
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHN----K 117

Query: 119 PDAKAAFRFLHVSTDEVFGSLSPADPQFSETTPYA-PNSPYSATKAGSDHLVRAYHHTYG 177
L+ S+ V+G L+ P FS P S Y+ATK ++ + Y H YG
Sbjct: 118 IQ-----HLLYASSSSVYG-LNRKMP-FSTDDSVDHPVSLYAATKKANELMAHTYSHLYG 170

Query: 178 LPVLTTNCSNNYGPYQFPEKLIPLMIANALGGKPLPVYGDGQNVRDWLYVGDHCSAIREV 237
LP YGP+ P+ + L GK + VY G+ RD+ Y+ D AI +
Sbjct: 171 LPATGLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRL 230

Query: 238 L------------------ARGVPGETYNVGGWNEKKNLDVVHTLCDLLD-EARPKAAGS 278
A P YN+G + + +D + L D L EA+
Sbjct: 231 QDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKN---- 286

Query: 279 YRDQITYVTDRPGHDRRYAIDARKLERELGWKPAETFETGLAKTVRWYLD 328
+ +PG + D + L +G+ P T + G+ V WY D
Sbjct: 287 ------MLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRD 330


36  E4F39_13595  E4F39_13655  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_13595527-1.802034transcriptional regulator
E4F39_13600627-2.328850MBL fold metallo-hydrolase
E4F39_13605830-3.088479XRE family transcriptional regulator
E4F39_13610626-2.613957GNAT family N-acetyltransferase
E4F39_13615624-0.977499LysE family translocator
E4F39_13620524-0.367886hypothetical protein
E4F39_13625223-1.337695glycerophosphodiester phosphodiesterase
E4F39_13630117-0.744142DHA2 family efflux MFS transporter permease
E4F39_13635013-0.871076hypothetical protein
E4F39_13640112-1.354192sulfite exporter TauE/SafE family protein
E4F39_13645213-2.302728peptidase S1
E4F39_13650412-2.916312hypothetical protein
E4F39_13655211-3.148272D-(-)-3-hydroxybutyrate oligomer hydrolase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_13690  AUTOINDCRSYN  28  0.018  Autoinducer synthesis protein signature.
		>AUTOINDCRSYN#Autoinducer synthesis protein signature.

Length = 216

Score = 27.9 bits (62), Expect = 0.018
Identities = 6/48 (12%), Positives = 17/48 (35%), Gaps = 2/48 (4%)

Query: 102 RDARRRGVAQRLLAALDDAARAAGKTVLVLDTVTGGDAERLYARAGWQ 149
++ L ++ + ++ G + T+ + R+GW
Sbjct: 112 ILGNEYPISSMLFLSMINYSKDKGYDGIY--TIVSHPMLTILKRSGWG 157


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_13710  TCRTETB  132  4e-36  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 132 bits (334), Expect = 4e-36
Identities = 76/413 (18%), Positives = 165/413 (39%), Gaps = 24/413 (5%)

Query: 14 LVVLCLGVLMIVLDSTIVNVALPSIGADLHFTETALVWIVNAYMLTFGGCLLLGGRLGDL 73
L+ LC+ VL+ ++NV+LP I D + + W+ A+MLTF + G+L D
Sbjct: 16 LIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQ 75

Query: 74 YGQRRMFLAGLTLFTLASLACGLAPTQF-VLIAARAVQGFGGAVVSAVALSLIMNLFTEP 132
G +R+ L G+ + S+ + + F +LI AR +QG G A A+ + +++ +
Sbjct: 76 LGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPAL-VMVVVARYIPK 134

Query: 133 GERAKAMGVYSFVCAGGGSLGVLLGGVLTSVLSWHWIFLVNLPIGVAAYALSAALLPKVR 192
R KA G+ + A G +G +GG++ + HW +L+ +P+ ++ L K+
Sbjct: 135 ENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYI--HWSYLLLIPM---ITIITVPFLMKLL 189

Query: 193 PQAA--DARLDVAGAIAVTASLMLAVYGIVNGNETGWLSTQTVATLAGAAALLAAFIAIE 250
+ D+ G I ++ ++ + +L ++ F+
Sbjct: 190 KKEVRIKGHFDIKGIILMSVGIVFFMLF-TTSYSISFLIVSVLS--------FLIFVKHI 240

Query: 251 TRAAHPLMPLALATQRNVAVANVIGVLWAAAMFAWFFLSALYMQRVLGYGPLQVGLAFLP 310
+ P + L + + G + + + + M+ V ++G +
Sbjct: 241 RKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIF 300

Query: 311 ANLIMAAFSLGLSARIVMRFGIRRAIGAGLVIAAAGLALFARAPADGGFVAHVLPGMILV 370
+ + +V R G + G+ + + +I+V
Sbjct: 301 PGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLL----ETTSWFMTIIIV 356

Query: 371 GVGAGVAFNPVLLA--AMSDVAPSDSGLASGIVNTSFMMGGALGLAVLASVAS 421
V G++F +++ S + ++G ++N + + G+A++ + S
Sbjct: 357 FVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLS 409



Score = 31.0 bits (70), Expect = 0.012
Identities = 21/99 (21%), Positives = 38/99 (38%), Gaps = 1/99 (1%)

Query: 66 LGGRLGDLYGQRRMFLAGLTLFTLASLACGLAP-TQFVLIAARAVQGFGGAVVSAVALSL 124
+GG L D G + G+T +++ L T + V GG + +S
Sbjct: 312 IGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVIST 371

Query: 125 IMNLFTEPGERAKAMGVYSFVCAGGGSLGVLLGGVLTSV 163
I++ + E M + +F G+ + G L S+
Sbjct: 372 IVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSI 410


37  E4F39_14185  E4F39_14305  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_14185537-7.137425ABC transporter permease
E4F39_14195535-9.267647capsule biosynthesis protein
E4F39_14200436-9.485908polysaccharide export protein
E4F39_14205438-10.717990glycosyltransferase family 1 protein
E4F39_14210438-11.151967capsule polysaccharide export protein
E4F39_14215440-11.410131mannose-1-phosphate
E4F39_14220442-11.752970class II glutamine amidotransferase
E4F39_14225447-12.520328mechanosensitive ion channel family protein
E4F39_14230549-13.219234hypothetical protein
E4F39_14235649-13.134241DNA mismatch repair endonuclease MutL
E4F39_14240651-13.245861tRNA (adenosine(37)-N6)-dimethylallyltransferase
E4F39_14245650-12.606985SgcJ/EcaC family oxidoreductase
E4F39_14255643-9.012291phosphoribosylformylglycinamidine cyclo-ligase
E4F39_14260437-7.706317hypothetical protein
E4F39_14265333-6.450778DnaA regulatory inactivator Hda
E4F39_14270126-4.982511HAD-IB family hydrolase
E4F39_14275022-4.553648polynucleotide adenylyltransferase PcnB
E4F39_14280-214-1.9413982-amino-4-hydroxy-6-
E4F39_14285-210-0.582115deoxynucleoside kinase
E4F39_14290-180.8275613-methyl-2-oxobutanoate
E4F39_14295-181.261759chloride transporter
E4F39_14300-191.674583molecular chaperone DnaJ
E4F39_14305-193.003262molecular chaperone DnaK
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_14260  ABC2TRNSPORT  38  2e-05  ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 38.0 bits (88), Expect = 2e-05
Identities = 32/139 (23%), Positives = 58/139 (41%), Gaps = 7/139 (5%)

Query: 88 MAVTPNLALMYHRNVKVIDIFIARILLEVVGNTASFFVLMITFHALGLVDYPEDILEVMF 147
M M + +++ DI + + + + + ALG + +++
Sbjct: 94 MEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGYTQWLS----LLY 149

Query: 148 AWVMIIWFG---ASLGFIIGALSEKTELVEKLWHPVTYLMFPLSGAIFMVDWLSPAFQKI 204
A +I G ASLG ++ AL+ + V + LSGA+F VD L FQ
Sbjct: 150 ALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQTA 209

Query: 205 VLWLPMVHGVEMLREGYFG 223
+LP+ H ++++R G
Sbjct: 210 ARFLPLSHSIDLIRPIMLG 228


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_14365  OMADHESIN  30  0.031  Yersinia outer membrane adhesin signature.
		>OMADHESIN#Yersinia outer membrane adhesin signature.

Length = 455

Score = 30.3 bits (67), Expect = 0.031
Identities = 36/121 (29%), Positives = 50/121 (41%), Gaps = 6/121 (4%)

Query: 314 LIDALETTPRGLYTGAIGW--LDAEARADGAEPAAGAFAPERAASPAAGTASDGACAGET 371
LI AL ++P G L A + A+PA G P R P AG + A +
Sbjct: 13 LISALFSSPYAFADDYDGIPNLTAVQISPNADPALGLEYPVRPPVPGAGGLNASAKGIHS 72

Query: 372 ASKQASKQASKRAGATSPAGACGDFCLSVAIRTLTLDAPSAGGERRGTMGVGAGIVLDSV 431
+ A+ +A+K A AG+ SVAI L+ A G+ T G + D V
Sbjct: 73 IAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLS----KALGDSAVTYGAASTAQKDGV 128

Query: 432 A 432
A
Sbjct: 129 A 129


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_14375  SHAPEPROTEIN  135  3e-37  Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 135 bits (342), Expect = 3e-37
Identities = 81/382 (21%), Positives = 138/382 (36%), Gaps = 71/382 (18%)

Query: 5 IGIDLGTTNSCVAIMEGNQVKVIENSEGARTTPSIIAYMDDNEVL-VGAPAKRQSVTNPK 63
+ IDLGT N+ + + V + R V VG AK+ P
Sbjct: 13 LSIDLGTANTLIYVKGQGIVLNEPSVVAIRQD----RAGSPKSVAAVGHDAKQMLGRTPG 68

Query: 64 NTLFAVKRLIGRRFEEKEVQKDIGLMPYAIIKADNGDAWVEAHGEKLAPPQVSAEVLRK- 122
N + A++ + + V D V+ ++L+
Sbjct: 69 N-IAAIRPM------KDGVIADF---------------------------FVTEKMLQHF 94

Query: 123 MKKTAEDYLGEPVTEAVITVPAYFNDSQRQATKDAGRIAGLEVKRIINEPTAAALAFGLD 182
+K+ + P ++ VP +R+A +++ + AG +I EP AAA+ GL
Sbjct: 95 IKQVHSNSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGAREVFLIEEPMAAAIGAGLP 154

Query: 183 KAEKGDRKIAVYDLGGGTFDVSIIEIADVDGEMQFEVLSTNGDTFLGGEDFDQRIIDYII 242
+E V D+GGGT +V++I + V + +GG+ FD+ II+Y+
Sbjct: 155 VSE--ATGSMVVDIGGGTTEVAVISLNGV---------VYSSSVRIGGDRFDEAIINYVR 203

Query: 243 GEFKKEQGVDLSKDVLALQRLKEAAEKAKIELSSS----QQTEINLPYITADASGPKHLN 298
+ G + AE+ K E+ S+ + EI + P+
Sbjct: 204 RNYGSLIG-------------EATAERIKHEIGSAYPGDEVREIEVRGRNLAEGVPRGFT 250

Query: 299 LKVTRAKLEALVEDLVERTIEPCRTAIKDAGVKVSDIDD--VILVGGQTRMPKVQEKVKE 356
L + LEAL E L + SDI + ++L GG + + + E
Sbjct: 251 LN-SNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNLDRLLME 309

Query: 357 FFGKEPRRDVNPDEAVAVGAAI 378
G +P VA G
Sbjct: 310 ETGIPVVVAEDPLTCVARGGGK 331


38  E4F39_14495  E4F39_14540  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_144952151.7038824-hydroxybenzoate octaprenyltransferase
E4F39_145002151.465228transcriptional regulator
E4F39_145052152.538538transcriptional regulator
E4F39_145102172.744400non-specific DNA-binding protein DpsA
E4F39_145151162.708030hypothetical protein
E4F39_145201161.694079hypothetical protein
E4F39_145250132.012709catalase/peroxidase HPI
E4F39_145353154.003240hydrogen peroxide-inducible genes activator
E4F39_145402122.102575ATP-dependent DNA helicase RecG
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_14580  HELNAPAPROT  157  3e-52  Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 157 bits (399), Expect = 3e-52
Identities = 50/150 (33%), Positives = 77/150 (51%), Gaps = 1/150 (0%)

Query: 14 ISDKDRKK-IAAGLSRLLADTYTLYLKTHNFHWNVTGPMFNTLHLMFEEQYNELWLAVDL 72
+ K + + L+ L++ + LY K H FHW V GP F TLH FEE Y+ VD
Sbjct: 4 ENAKTNQTLVENSLNTQLSNWFLLYSKLHRFHWYVKGPHFFTLHEKFEELYDHAAETVDT 63

Query: 73 VAERIRTLGVVAPGTYREFAKLSSIPEADGVPAAEEMVRQLVEGHEAVVRTARAIFPDAD 132
+AER+ +G T +E+ + +SI + +A EMV+ LV ++ + ++ + A+
Sbjct: 64 IAERLLAIGGQPVATVKEYTEHASITDGGNETSASEMVQALVNDYKQISSESKFVIGLAE 123

Query: 133 AASDEPTADLLTQRLQTHEKTAWMLRSLLA 162
D TADL ++ EK WML S L
Sbjct: 124 ENQDNATADLFVGLIEEVEKQVWMLSSYLG 153


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_14605  IGASERPTASE  36  8e-04  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 36.2 bits (83), Expect = 8e-04
Identities = 26/87 (29%), Positives = 34/87 (39%), Gaps = 6/87 (6%)

Query: 140 RSAPRATPRGRTADGRAVPA-AAQAARAPRAPRAAPPDALPADTPPASAAKPAKRAAKAA 198
+ TP AD +VP+ + AR AP P A P++T A +K
Sbjct: 993 DTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVA-----ENSKQE 1047

Query: 199 SKTVSKAAADAAGAQAAPKAVKTADKL 225
SKTV K DA A + V K
Sbjct: 1048 SKTVEKNEQDATETTAQNREVAKEAKS 1074


39  E4F39_14690  E4F39_14720  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_146900133.910891tRNA 2-thiouridine(34) synthase MnmA
E4F39_146951143.589471FMN-binding glutamate synthase family protein
E4F39_147001142.638511glutathione S-transferase
E4F39_147053183.204965hypothetical protein
E4F39_147106252.807133M24 family metallopeptidase
E4F39_147154222.012317UbiH/UbiF/VisC/COQ6 family ubiquinone
E4F39_147202241.963656tRNA dihydrouridine synthase DusB
40  E4F39_14875  E4F39_14985  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_14875012-3.506827hypothetical protein
E4F39_14880014-4.723789hypothetical protein
E4F39_14885014-3.905410Hsp20/alpha crystallin family protein
E4F39_14890-111-1.815189Hsp20/alpha crystallin family protein
E4F39_14895114-1.235214hypothetical protein
E4F39_14900-112-0.670501co-chaperone GroES
E4F39_14905-2130.664321hypothetical protein
E4F39_14915-2121.019428hypothetical protein
E4F39_14920-3110.468812amino acid ABC transporter ATP-binding protein
E4F39_14930-116-0.983922glutamate/aspartate ABC transporter permease
E4F39_14935330-0.675722amino acid ABC transporter permease
E4F39_14940229-1.510589glutamate/aspartate ABC transporter
E4F39_14945330-1.090541Glu/Leu/Phe/Val dehydrogenase
E4F39_14950131-3.002329hypothetical protein
E4F39_14955320-5.299765LysR family transcriptional regulator
E4F39_14960217-4.941933MFS transporter
E4F39_14965211-5.126213adenylosuccinate lyase
E4F39_14970311-6.118434gluconokinase
E4F39_14975214-5.264995gluconate permease
E4F39_14980114-4.936570bifunctional 4-hydroxy-2-oxoglutarate
E4F39_14985-113-3.170102phosphogluconate dehydratase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_14975  ACRIFLAVINRP  25  0.039  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 25.2 bits (55), Expect = 0.039
Identities = 9/38 (23%), Positives = 21/38 (55%), Gaps = 1/38 (2%)

Query: 14 IEIDDVIVGLLAI-RLNLPENADPRDAISRHLSEAGGP 50
+ +DD IV + + R+ + + P++A + +S+ G
Sbjct: 404 LLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGA 441


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_14980  PF05272  32  0.002  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.4 bits (73), Expect = 0.002
Identities = 17/53 (32%), Positives = 24/53 (45%), Gaps = 5/53 (9%)

Query: 29 VVVVCGPSGSGKSTLIKTVNGLEPFQQGEILVNGQSVGDKKTNLSKLRSKVGM 81
VV+ G G GKSTLI T+ GL+ F +G K + ++ V
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHF-----DIGTGKDSYEQIAGIVAY 645


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_15015  TCRTETB  35  5e-04  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 35.2 bits (81), Expect = 5e-04
Identities = 60/314 (19%), Positives = 116/314 (36%), Gaps = 9/314 (2%)

Query: 37 APFISKDIAMTAAQQGFLVAVPVLAAAILRVTLGNLYQSADGRRIALMGVLLSALPAAVL 96
P I+ D A ++ +L +I G L +R+ L G++++ +V+
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCF-GSVI 95

Query: 97 PFVPGTPSYTLLLVLGVFLGIGGASFAVALPMAGSSY-PPKVQGLVLGL-AAAGNIGAVL 154
FV G ++LL++ G G A+F + + + Y P + +G GL + +G +
Sbjct: 96 GFV-GHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGV 154

Query: 155 DGFMFPHIAAALGWKFSTTAALPLLAIAGFALFAWADDRAEKSGSAARAFGAFVITLGGL 214
+ IA + W + +P++ I F + E ++ G+
Sbjct: 155 GPAIGGMIAHYIHWSY--LLLIPMITIIT-VPFLMKLLKKEVRIKGHFDIKGIILMSVGI 211

Query: 215 VALVLAVHAGLFGAGKAGVLLLPVMGALLAIAVLPGRYRAVLAERDTWVIMLIYSITFGG 274
V +L + VL + + P + + +L I FG
Sbjct: 212 VFFMLFTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGT 271

Query: 275 FVGMSSYVTLLLTTLYQMPKIEAGLFMSLLAFLGAIVR-PFGGHLADRITGVRALMAILA 333
G S V ++ ++Q+ E G + + I+ GG L DR G ++ I
Sbjct: 272 VAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILVDRR-GPLYVLNIGV 330

Query: 334 AIAAADFAFAAVMP 347
+ F A+ +
Sbjct: 331 TFLSVSFLTASFLL 344


41  E4F39_15180  E4F39_15250  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_151804150.799557CinA family protein
E4F39_1518509-0.890449orotidine-5'-phosphate decarboxylase
E4F39_15190011-0.928350aldose 1-epimerase
E4F39_1519509-1.583388SDR family oxidoreductase
E4F39_15200-210-1.207502L-arabinose ABC transporter permease AraH
E4F39_15205-113-0.615255L-arabinose ABC transporter ATP-binding protein
E4F39_152100130.457177arabinose ABC transporter substrate-binding
E4F39_152152132.292963SDR family oxidoreductase
E4F39_152204140.7947012-dehydro-3-deoxy-6-phosphogalactonate aldolase
E4F39_152253141.2709092-dehydro-3-deoxygalactonokinase
E4F39_152302130.844668IclR family transcriptional regulator
E4F39_152352141.351620EF-hand domain-containing protein
E4F39_152403151.219380lipoprotein
E4F39_152452152.057898hypothetical protein
E4F39_152502162.674170monofunctional biosynthetic peptidoglycan
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_15240  DHBDHDRGNASE  123  3e-36  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 123 bits (310), Expect = 3e-36
Identities = 76/249 (30%), Positives = 113/249 (45%), Gaps = 8/249 (3%)

Query: 26 GRAVLITGGATGIGASFVEHFARQGARVAFVDLDEKAGRALVARLADAAHEPVFVVCDLT 85
G+ ITG A GIG + A QGA +A VD + + +V+ L A D+
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVR 67

Query: 86 DIGALRGAIDAIRVRIGPIAVLVNNAANDVRHAVADVTPESFDASIAVNLRHQFFAAQAV 145
D A+ I +GPI +LVN A + ++ E ++A+ +VN F A+++V
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 146 IDDMKRLGGGAIVNLGSIGWMLKNAGYPVYATAKAAVQGLTRALARELGPFGIRVNTLVP 205
M G+IV +GS + YA++KAA T+ L EL + IR N + P
Sbjct: 128 SKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSP 187

Query: 206 GWVMTDKQRRLWLDDAGRAAIKAGQCIDAEL--------LPGDLARMALFLAADDSRLIT 257
G TD Q LW D+ G + G + P D+A LFL + + IT
Sbjct: 188 GSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHIT 247

Query: 258 AQDVVVDGG 266
++ VDGG
Sbjct: 248 MHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_15250  PF05272  29  0.040  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.040
Identities = 14/38 (36%), Positives = 17/38 (44%), Gaps = 5/38 (13%)

Query: 18 RALD-GISFDVHAGQVHGLMGENGAGKSTLLKILGGEY 54
R ++ G FD L G G GKSTL+ L G
Sbjct: 587 RVMEPGCKFDY----SVVLEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_15260  DHBDHDRGNASE  135  7e-41  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 135 bits (342), Expect = 7e-41
Identities = 81/260 (31%), Positives = 130/260 (50%), Gaps = 14/260 (5%)

Query: 14 LAGKVAIVTGAGRGIGAAIARAFVREGAAVAIAELDAA---LAEESADAIARDTAGARVL 70
+ GK+A +TGA +GIG A+AR +GA +A + + S A AR
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAE----- 60

Query: 71 AVPTDVARAESVAAALARTERAFGPLDVLVNNAGVNVFGDPLALTDEDWRRCFAIDLDGV 130
A P DV + ++ AR ER GP+D+LVN AGV G +L+DE+W F+++ GV
Sbjct: 61 AFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGV 120

Query: 131 WNGCRAALPGMVERGRGSIVNIASTHAFKIIPGCFPYPVAKHGVLGLTRALGIEYAPRNV 190
+N R+ M++R GSIV + S A Y +K + T+ LG+E A N+
Sbjct: 121 FNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNI 180

Query: 191 RVNAIAPGYIETQLTHDWWSAQPDPQAARRETLALQ-----PMKRIGRPDEVAMTAVFLA 245
R N ++PG ET + W+ + + + P+K++ +P ++A +FL
Sbjct: 181 RCNIVSPGSTETDMQWSLWADE-NGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLV 239

Query: 246 SDEAPFINASCITIDGGRSV 265
S +A I + +DGG ++
Sbjct: 240 SGQAGHITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_15295  FLGFLIJ  28  0.025  Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 27.9 bits (61), Expect = 0.025
Identities = 17/54 (31%), Positives = 27/54 (50%), Gaps = 8/54 (14%)

Query: 84 HRWVPYDQIARTLKRAVIA--------SEDADFANNSGYEVDAILQAWEKNRAR 129
+RW+ Y Q +TL++A+ ++ D A NS E LQAW+ + R
Sbjct: 64 NRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQER 117


42  E4F39_15320  E4F39_15370  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_15320-116-3.021260glycine zipper 2TM domain-containing protein
E4F39_15325016-0.974815histone H1
E4F39_15330-113-0.557736hypothetical protein
E4F39_15335-111-0.724601ribonucleotide-diphosphate reductase subunit
E4F39_15340-211-0.065651response regulator SirA
E4F39_153452150.732687ribonucleoside-diphosphate reductase subunit
E4F39_153505190.304674hypothetical protein
E4F39_15355315-3.0246251,6-anhydro-N-acetylmuramyl-L-alanine amidase
E4F39_15360316-3.586684inner membrane protein YpjD
E4F39_15365115-2.603495signal recognition particle protein
E4F39_15370116-3.267035hypoxanthine-guanine phosphoribosyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_15370  IGASERPTASE  35  2e-04  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 34.7 bits (79), Expect = 2e-04
Identities = 26/165 (15%), Positives = 49/165 (29%), Gaps = 6/165 (3%)

Query: 19 VAKKAAPAKKAAVKKVAAKKVAVKKVAAKKAAPAKKVAAKKVAAKKAPAKKAAAKKVAVK 78
V K+ + + V V + +A A PA ++
Sbjct: 985 VEKRNQTVDTTNITTPNNIQADVPSVPSNNEE----IARVDEAPVPPPAPATPSETTETV 1040

Query: 79 KVAAKKVAAKKAPAKKAAAKKVAVKKVAAKKVAAKKAAPAKKAAAKKAAPAKKAAAKKAA 138
+K+ + ++ A + A + AK AK A + A + +
Sbjct: 1041 AENSKQESKTVEKNEQDATETTAQNREVAK--EAKSNVKANTQTNEVAQSGSETKETQTT 1098

Query: 139 PAKKAAAKKAAPKKAVVKKAAPATTASTASVAPASGVKTALNPAA 183
K+ A + K V + T+ V+P + P A
Sbjct: 1099 ETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQA 1143


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_15395  FLAGELLIN  26  0.021  Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 26.2 bits (57), Expect = 0.021
Identities = 12/18 (66%), Positives = 13/18 (72%), Gaps = 1/18 (5%)

Query: 68 LRISRAGDDA-GTAIGNR 84
LRI+ A DDA G AI NR
Sbjct: 35 LRINSAKDDAAGQAIANR 52


43  E4F39_15420  E4F39_15460  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_15420-111-3.901715octaprenyl diphosphate synthase
E4F39_15425-211-4.075016*HlyC/CorC family transporter
E4F39_15430114-4.363165GspE family protein
E4F39_15435217-4.160327type II secretion system F family protein
E4F39_15440012-3.284366prepilin peptidase
E4F39_15445011-3.095869dephospho-CoA kinase
E4F39_15450112-3.278680cell division protein ZapD
E4F39_15455-112-3.395645DNA gyrase inhibitor YacG
E4F39_15460-310-3.543811NUDIX domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_15500  BCTERIALGSPF  257  3e-84  Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 257 bits (657), Expect = 3e-84
Identities = 115/404 (28%), Positives = 194/404 (48%), Gaps = 13/404 (3%)

Query: 14 YRWAGVTIDGARRRGMLIAVDPSAARAALKRTGVTVLRL-EARGRAPQPAAR-------- 64
Y + + G + RG A AR L+ G+ L + E RG + +
Sbjct: 4 YHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLSLRRKI 63

Query: 65 ---ASEVTRFARQLAGLLHAGLALAPSLELLAQATQRSEMPRVAAGLAREIVAGRQFSAA 121
S++ RQLA L+ A + L +L+ +A+ +++ + ++ A + +++ G + A
Sbjct: 64 RLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSLADA 123

Query: 122 LRRYPRQFDPFFCQLVAVGETAGALAAVLTRLADDRERAAAQYARVKGALAYPATVLLFA 181
++ +P F+ +C +VA GET+G L AVL RLAD E+ +R++ A+ YP + + A
Sbjct: 124 MKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLTVVA 183

Query: 182 LAITAALLVWVVPTFEQIFAGFGAPLPAPTRFVLALSAGVTRWSAPAAAAALFACVAIRH 241
+A+ + LL VVP + F LP TR ++ +S V + A L +A R
Sbjct: 184 IAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMAFRV 243

Query: 242 AARRSAAARVALARTLLRAPFVGPLVQTLAAARWSRALGTLLAAGTPLTDAFAALSNATG 301
R+ RV+ R LL P +G + + L AR++R L L A+ PL A +
Sbjct: 244 MLRQE-KRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGDVMS 302

Query: 302 NPRFDLATAGIAARVLHGERLARAMRAEGCFPDDLVQPIAVAEESGTLDTMLIDAATLCD 361
N + V G L +A+ FP + IA E SG LD+ML AA D
Sbjct: 303 NDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAADNQD 362

Query: 362 RRVDERISALANLCEPVVIVVLGALIGALVVAMYLPIVQLGNVV 405
R +++ L EP+++V + A++ +V+A+ PI+QL ++
Sbjct: 363 REFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLM 406


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_15505  PREPILNPTASE  280  2e-96  Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 280 bits (718), Expect = 2e-96
Identities = 136/276 (49%), Positives = 172/276 (62%), Gaps = 1/276 (0%)

Query: 20 LAAFAALPTGMQLAFAIVLGLVVGSFLNVVVHRLPIMMKRAWLAEIAEATGAPCADDSLP 79
L A + + + L++GSFLNVV+HRLPIM++R W AE P
Sbjct: 4 LLELAHGLPWLYFSLVFLFSLMIGSFLNVVIHRLPIMLEREWQAEYRSYFNPDDEGVDEP 63

Query: 80 ARYNLCVPRSACPHCGHALRAWENVPVLSYIALRGRCRHCRTPIGARYPLIELASGALAA 139
YNL VPRS CPHC H + A EN+P+LS++ LRGRCR C+ PI ARYPL+EL + L+
Sbjct: 64 -PYNLMVPRSCCPHCNHPITALENIPLLSWLWLRGRCRGCQAPISARYPLVELLTALLSV 122

Query: 140 GALALFGPSGAALAAFGLCAALLAMSAIDMQTGFLPDSLTLPLLWAGLCVNLWGTFASLR 199
P LAA L L+A++ ID+ LPD LTLPLLW GL NL G F SL
Sbjct: 123 AVAMTLAPGWGTLAALLLTWVLVALTFIDLDKMLLPDQLTLPLLWGGLLFNLLGGFVSLG 182

Query: 200 AAVIGAIAGYLFLWCILWLFKLLRGIEGIGYGDLKLLAALGAWLGWEALPQVVLIAAVAG 259
AVIGA+AGYL LW + W FKLL G EG+GYGD KLLAALGAWLGW+ALP V+L++++ G
Sbjct: 183 DAVIGAMAGYLVLWSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALPIVLLLSSLVG 242

Query: 260 AAVGLVATWRGRMRFEEPLPFGPFLAAGGAATLFFG 295
A +G+ +P+PFGP+LA G L +G
Sbjct: 243 AFMGIGLILLRNHHQSKPIPFGPYLAIAGWIALLWG 278


44  E4F39_15625  E4F39_15670  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_15625-110-3.1436293-oxoadipyl-CoA thiolase
E4F39_15630013-3.5606862-(1,2-epoxy-1,2-dihydrophenyl)acetyl-CoA
E4F39_15635016-4.137597hydroxyphenylacetyl-CoA thioesterase PaaI
E4F39_15640016-4.657356phenylacetate--CoA ligase
E4F39_15645017-5.059411murein transglycosylase
E4F39_15650119-5.866489hypothetical protein
E4F39_15655125-6.373453Co2+/Mg2+ efflux protein ApaG
E4F39_15660234-7.297712hypothetical protein
E4F39_15670123-5.368641ribulose-phosphate 3-epimerase
45  E4F39_15855  E4F39_15920  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_158552130.219156hypothetical protein
E4F39_158601120.799400hypothetical protein
E4F39_158651151.560272glutamyl-tRNA reductase
E4F39_158703171.856749peptide chain release factor 1
E4F39_158756213.615213peptide chain release factor N(5)-glutamine
E4F39_158807214.640466Grx4 family monothiol glutaredoxin
E4F39_15885-1101.372376UbiX family flavin prenyltransferase
E4F39_158900121.507798dioxygenase
E4F39_158950101.007262APC family permease
E4F39_159000120.386148cold-shock protein
E4F39_15905-2101.093214branched-chain amino acid ABC transporter
E4F39_15915-2123.231127Hsp70 family protein
E4F39_15920-2133.163385hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_15995  SHAPEPROTEIN  37  7e-05  Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 37.4 bits (87), Expect = 7e-05
Identities = 52/224 (23%), Positives = 83/224 (37%), Gaps = 45/224 (20%)

Query: 5 AIDFGTSNSAVALPRGGGTTGMRLAPVEGEHLTLPTAIFFNTDESTREYGRAALASYIDG 64
+ID GT+N+ + + +G G L P+ + D + AA+ G
Sbjct: 14 SIDLGTANTLIYV-KGQGIV-----------LNEPSVVAIRQDRAGSPKSVAAV-----G 56

Query: 65 FDGRLM--RSMKSILGSPLAETTTDLGDGSAIAYTDVIALFLMH-LKQKAEACAGGAIGC 121
D + M R+ +I + DG IA V L H +KQ
Sbjct: 57 HDAKQMLGRTPGNI------AAIRPMKDG-VIADFFVTEKMLQHFIKQVHSNSFMRPSPR 109

Query: 122 AVLGRPVFFVDDDPRADRLAQQQLEAAAHAVGLADVQFQYEPIAAAFDYESRQDAERLVL 181
++ PV + RA R E+A A G +V EP+AAA +
Sbjct: 110 VLVCVPVGATQVERRAIR------ESAQGA-GAREVFLIEEPMAAAIGAGLPVSEATGSM 162

Query: 182 VADIGGGTSDFSLVRVGPERMRRLERKDDVLAHHGVHVAGTDYD 225
V DIGGGT++ +++ + V+ V + G +D
Sbjct: 163 VVDIGGGTTEVAVISLN-----------GVVYSSSVRIGGDRFD 195


46  E4F39_16045  E4F39_16085  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_160452200.450673hypothetical protein
E4F39_16050320-0.615663HlyD family efflux transporter periplasmic
E4F39_16055-113-2.475022peptidase domain-containing ABC transporter
E4F39_16060215-2.362256TolC family protein
E4F39_16065118-3.667146flagellar protein FliO
E4F39_16070120-4.437828M15 family peptidase
E4F39_16075121-4.642407hypothetical protein
E4F39_16080221-4.141197type VI secretion system-associated protein
E4F39_16085439-3.896544OmpA family protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_16115  RTXTOXIND  132  2e-36  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 132 bits (334), Expect = 2e-36
Identities = 81/429 (18%), Positives = 159/429 (37%), Gaps = 53/429 (12%)

Query: 28 RPVSFAVLASAAASMALGVI--LLFTFGTYTRRTTVDGVLTPDTGLVKVYAQQTGVVLKK 85
PVS A M VI +L G T +G LT ++ + +V +
Sbjct: 51 TPVSRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEI 110

Query: 86 NVVEGQHVTRGQVLYTVSTDLQSAAAGQTQAAL----IEQAQQRKTSLQQELDKTRRLQ- 140
V EG+ V +G VL ++ A +TQ++L +EQ + + S EL+K L+
Sbjct: 111 IVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKL 170

Query: 141 ----------------------------QDERDTLQSKIASLRTELAGIDDQIAAQRTRA 172
Q+++ + + R E + +I +
Sbjct: 171 PDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLS 230

Query: 173 SIAADAASRYAGLLAQDYISKDQAQQRQADLLDQRSKLNSLMRDRASTAQSLKEALNDLS 232
+ ++ LL + I+K +++ ++ ++L + A +
Sbjct: 231 RVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQ 290

Query: 233 GLSLKQQNQLSQIDRSVIDVDRTLIESEAKREF-----VVTAPETGT-ATAVIAEPGQTA 286
++ +N++ R D L AK E V+ AP + + G
Sbjct: 291 LVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVV 350

Query: 287 DTSHPLASIVPTGAHWQAYLFVPSAAVGFVHVGDRVLVRYQAYPYQKFGQYEASVVSIAR 346
T+ L IVP + V + +GF++VG +++ +A+PY ++G V +I
Sbjct: 351 TTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINL 410

Query: 347 TALSAAELATSGGPAAQTASGTYYRITVALNSQNVMAYGRAQPLQAGMALQADVLQERRR 406
A+ G + + +++ + + PL +GMA+ A++ R
Sbjct: 411 DAIE------------DQRLGLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGMRS 458

Query: 407 LYEWVLEPL 415
+ ++L PL
Sbjct: 459 VISYLLSPL 467


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_16160  OMPADOMAIN  92  3e-23  OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 91.5 bits (227), Expect = 3e-23
Identities = 34/103 (33%), Positives = 58/103 (56%), Gaps = 2/103 (1%)

Query: 203 FETGSATLTPQGRAILDQMAGALAKM--SNRTVEIIGHTDNSGNRTSNIALSQARADAVK 260
F ATL P+G+A LDQ+ L+ + + +V ++G+TD G+ N LS+ RA +V
Sbjct: 223 FNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQSVV 282

Query: 261 GYLVTKGIASQQLTTTGVGPDQPIAPNDSADGRARNRRIEFRA 303
YL++KGI + +++ G+G P+ N + + R I+ A
Sbjct: 283 DYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLA 325


47  E4F39_16175  E4F39_16295  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_16175210-2.905218hypothetical protein
E4F39_16180310-3.092170S8 family peptidase
E4F39_16185314-3.957418AAA family ATPase
E4F39_16190213-3.449666*ClpXP protease specificity-enhancing factor
E4F39_16195312-3.283190glutathione S-transferase
E4F39_16200312-2.578179cytochrome c1
E4F39_16205212-1.894046cytochrome bc complex cytochrome b subunit
E4F39_16210318-5.077327ubiquinol-cytochrome c reductase iron-sulfur
E4F39_16215316-4.449361Nif3-like dinuclear metal center hexameric
E4F39_16220623-4.702738Do family serine endopeptidase
E4F39_16225730-5.694500hypothetical protein
E4F39_16230634-7.006860twin-arginine translocase subunit TatC
E4F39_16235851-10.712245Sec-independent protein translocase subunit
E4F39_16240751-10.475480Sec-independent protein translocase subunit
E4F39_16245856-11.667211histidine triad nucleotide-binding protein
E4F39_16250956-11.854061hypothetical protein
E4F39_16255752-11.881636phosphoribosyl-ATP diphosphatase
E4F39_16260748-11.280531phosphoribosyl-AMP cyclohydrolase
E4F39_16265323-8.421600imidazole glycerol phosphate synthase subunit
E4F39_16270315-6.3787151-(5-phosphoribosyl)-5-((5-
E4F39_16280112-5.877618imidazole glycerol phosphate synthase subunit
E4F39_1628509-5.132833YchE family NAAT transporter
E4F39_1629019-4.150428imidazoleglycerol-phosphate dehydratase HisB
E4F39_16295110-3.554674histidinol-phosphate transaminase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_16260  SUBTILISIN  74  3e-16  Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 73.7 bits (181), Expect = 3e-16
Identities = 66/303 (21%), Positives = 90/303 (29%), Gaps = 94/303 (31%)

Query: 250 VCVLDTGIAAAHPLLAPAIGETGNFIEPGE----PAADEDGHGTRVAGLA--------LY 297
V VLDTG A HP L I NF + E D +GHGT VAG +
Sbjct: 45 VAVLDTGCDADHPDLKARIIGGRNFTDDDEGDPEIFKDYNGHGTHVAGTIAATENENGVV 104

Query: 298 GSMTDLLPDRFAPQLRLFAGKVFSNDGQDQTHFVEKAVEDAVRYFREEYGCKIFCLAYGD 357
G AP+ L KV + G Q ++ + + A+ E I ++ G
Sbjct: 105 G---------VAPEADLLIIKVLNKQGSGQYDWIIQGIYYAI-----EQKVDIISMSLG- 149

Query: 358 ANKVYDGRHVRGLAYTLDRLSRELNILFVVPTGNLLSDEIPDDALQTYPQYLSGSEFRLL 417
V L + + IL + GN E D Y
Sbjct: 150 -----GPEDVPELHEAVKKAVAS-QILVMCAAGN----EGDGDDRTDELGY--------- 190

Query: 418 DPATSINSLTVGGLAEFELDHQAQRFPERIETTPIARRNQPSPFTRSGPSVAGAVKPDFV 477
P ++VG I S F+ S V D V
Sbjct: 191 -PGCYNEVISVGA---------------------INFDRHASEFSNSNNEV------DLV 222

Query: 478 AFGGNVARHHLRNGFAGQRLGVVSTSRNFADGRLFDDAPGTSFAAPQVAHLAATVLRFLP 537
A G ++ ST G + GTS A P VA A + +
Sbjct: 223 APGEDIL----------------STV----PGGKYATFSGTSMATPHVAGALALIKQLAN 262

Query: 538 DAS 540
+
Sbjct: 263 ASF 265


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_16310  V8PROTEASE  68  9e-15  V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 68.1 bits (166), Expect = 9e-15
Identities = 34/183 (18%), Positives = 64/183 (34%), Gaps = 38/183 (20%)

Query: 116 NLGSGVIVSSEGYILTNQHVVDGADQIEVALA------------DGRTATAKVIGSDPET 163
+ SGV+V +LTN+HVVD AL +G ++ E
Sbjct: 102 FIASGVVVGK-DTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEG 160

Query: 164 DLAVLKIN--------MTNLPTITLGRSDQSRVGDVVLAIGNPFGVGQTVTMGIISALGR 215
DLA++K + + T+ + +++V + G P ++ +
Sbjct: 161 DLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGD-------KPVATMWE 213

Query: 216 NHLGINTFEN-FIQTDAPINPGNSGGALVDVNGNLLGINTAIYSRSGGSLGIGFAIPVST 274
+ I + +Q D GNSG + + ++GI+ G+ +
Sbjct: 214 SKGKITYLKGEAMQYDLSTTGGNSGSPVFNEKNEVIGIHWG---------GVPNEFNGAV 264

Query: 275 ARN 277
N
Sbjct: 265 FIN 267


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_16325  TATBPROTEIN  77  2e-20  Bacterial sec-independent translocation TatB protein...
		>TATBPROTEIN#Bacterial sec-independent translocation TatB protein signature.

Length = 171

Score = 77.4 bits (190), Expect = 2e-20
Identities = 23/84 (27%), Positives = 45/84 (53%), Gaps = 1/84 (1%)

Query: 1 MLDLGLSKMALIGVVALVVLGPERLPRVARTAGALFGRAQRYINDVKAEVSREIELDALR 60
M D+G S++ L+ ++ LVVLGP+RLP +T + V+ E+++E++L +
Sbjct: 1 MFDIGFSELLLVFIIGLVVLGPQRLPVAVKTVAGWIRALRSLATTVQNELTQELKLQEFQ 60

Query: 61 TMKTDFEQAA-RNVENTIHDNLRE 83
E+A+ N+ + ++ E
Sbjct: 61 DSLKKVEKASLTNLTPELKASMDE 84


48  E4F39_17025  E4F39_17170  Y  N  Y  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_17025212-1.264686amino acid permease
E4F39_17035211-0.618648*c-type cytochrome
E4F39_170400101.013079BON domain-containing protein
E4F39_17050-3111.955405SIS domain-containing protein
E4F39_17055-3112.597287YraN family protein
E4F39_17060-2113.07769516S rRNA
E4F39_17065-1102.404972septal ring lytic transglycosylase RlpA family
E4F39_17070317-1.545090hypothetical protein
E4F39_17075318-1.597076MBL fold metallo-hydrolase
E4F39_17080529-3.221317exonuclease
E4F39_17085528-2.972088Lrp/AsnC family transcriptional regulator
E4F39_17090738-3.599364cation transporter
E4F39_17095323-3.540947H-NS histone family protein
E4F39_17100415-1.662564hypothetical protein
E4F39_17105216-1.349281pyridoxamine 5'-phosphate oxidase
E4F39_17110314-0.793755MFS transporter
E4F39_17120-3120.228846branched-chain amino acid ABC transporter
E4F39_17130-3121.372622phosphoheptose isomerase
E4F39_17135-1140.619769nitronate monooxygenase
E4F39_171400130.459452hypothetical protein
E4F39_171452140.094854dienelactone hydrolase family protein
E4F39_17150013-1.343446amidase
E4F39_17155-213-2.435898methylenetetrahydrofolate reductase [NAD(P)H]
E4F39_17160014-2.924301phage holin family protein
E4F39_17165017-4.380733adenosylhomocysteinase
E4F39_17170018-4.098135RNA polymerase sigma factor FliA
49  E4F39_17315  E4F39_17395  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_173152131.221518aquaporin Z
E4F39_173202140.976032Cof-type HAD-IIB family hydrolase
E4F39_17325313-0.563858DNA-3-methyladenine glycosylase I
E4F39_17330112-0.984067hypothetical protein
E4F39_17335013-3.70450730S ribosomal protein S21
E4F39_17340013-3.612425flagellin
E4F39_17345217-4.370549flagellar hook protein FliD
E4F39_17350626-5.795078flagellar protein FliT
E4F39_17355524-4.583694glycosyltransferase
E4F39_17360220-3.680913DegT/DnrJ/EryC1/StrS family aminotransferase
E4F39_17365120-0.901377ketoacyl-ACP synthase III
E4F39_17370017-0.661336acyl carrier protein
E4F39_17375118-0.044255ketoacyl-ACP synthase III
E4F39_173800150.698587SDR family oxidoreductase
E4F39_17385-116-0.312011acetyltransferase
E4F39_173902150.001234Rieske (2Fe-2S) protein
E4F39_17395315-1.858932methyltransferase domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_17410  FLAGELLIN  132  3e-36  Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 132 bits (332), Expect = 3e-36
Identities = 104/336 (30%), Positives = 149/336 (44%), Gaps = 9/336 (2%)

Query: 4 INSNINSLVAQQNLNGSQGALSQAITRLSSGKRINSAADDAAGLAIATRMQTQINGLNQG 63
IN+N SL+ Q NLN SQ +LS AI RLSSG RINSA DDAAG AIA R + I GL Q
Sbjct: 4 INTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQA 63

Query: 64 VSNANDGVSILQTASSGLTSLTNSLQRIRQLAVQASNGPLSASDASALQQEVAQQISEVN 123
NANDG+SI QT L + N+LQR+R+L+VQA+NG S SD ++Q E+ Q++ E++
Sbjct: 64 SRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEID 123

Query: 124 RIASQTNYNGKNILDGSAGTLSFQVGANVGQTVSVDLTQSMSAAKIGGGMVQTGQTLGTI 183
R+++QT +NG +L + QVGAN G+T+++DL + + G G T+
Sbjct: 124 RVSNQTQFNGVKVLSQD-NQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEATV 182

Query: 184 KVAIDSSGAAWSSGSTGQETTQINVVSDGKGGFTFTDQNNQALSSTAVTAVFGSSTAGTG 243
S + + V + T T A +T
Sbjct: 183 GDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDAE 242

Query: 244 TAASPSFQTLALSTSATSALSATDQANATAMVAQINAVNKPQTVSNLDISTQTGAYQAMV 303
+ ST+ T A A A+ I + T ++
Sbjct: 243 NNTAVDLFKTTKSTAGT--------AEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGND 294

Query: 304 SIDNALATVNNLQATLGAAQNRFTAIATTQQAGSNN 339
T+N + TL A A ++
Sbjct: 295 GNGKVSTTINGEKVTLTVADITAGAANVDAATLQSS 330



Score = 84.3 bits (208), Expect = 7e-20
Identities = 61/263 (23%), Positives = 100/263 (38%), Gaps = 3/263 (1%)

Query: 129 TNYNGKNILDGSAGTLSFQVGANVGQTVSVDLTQSMSAAKIGGGMVQTGQTLGTIKVAID 188
T + +AGT + A + T G + I+
Sbjct: 245 TAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVSTTIN 304

Query: 189 SSGAAWSSGSTGQETTQIN---VVSDGKGGFTFTDQNNQALSSTAVTAVFGSSTAGTGTA 245
+ ++ + S + + T + S
Sbjct: 305 GEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLEANNAV 364

Query: 246 ASPSFQTLALSTSATSALSATDQANATAMVAQINAVNKPQTVSNLDISTQTGAYQAMVSI 305
S T+ + +A M A ++ + + + SI
Sbjct: 365 KGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAAAAKKSTANPLASI 424

Query: 306 DNALATVNNLQATLGAAQNRFTAIATTQQAGSNNLAQAQSQIQSADFAQETANLSRAQVL 365
D+AL+ V+ ++++LGA QNRF + T NL A+S+I+ AD+A E +N+S+AQ+L
Sbjct: 425 DSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYATEVSNMSKAQIL 484

Query: 366 QQAGISVLAQANSLPQQVLKLLQ 388
QQAG SVLAQAN +PQ VL LL+
Sbjct: 485 QQAGTSVLAQANQVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_17425  SYCDCHAPRONE  37  1e-04  Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 36.8 bits (85), Expect = 1e-04
Identities = 22/125 (17%), Positives = 39/125 (31%), Gaps = 15/125 (12%)

Query: 262 AQVRRVLDGTPD--YAEAHRVLNMTLSARGRYQEAIEAGRRSVELAPNSVNAHGSLAVTL 319
A + + T + Y+ A G+Y++A + + L L
Sbjct: 26 AMLNEISSDTLEQLYSLA-----FNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACR 80

Query: 320 SDYGQTDEAEIHFRRAHELNPKDPMAYSNLLFCQSHKIDV--------SIRELFDAHRAF 371
GQ D A + ++ K+P + C K ++ +EL F
Sbjct: 81 QAMGQYDLAIHSYSYGAIMDIKEPRFPFHAAECLLQKGELAEAESGLFLAQELIADKTEF 140

Query: 372 GELYE 376
EL
Sbjct: 141 KELST 145



Score = 33.7 bits (77), Expect = 0.001
Identities = 20/80 (25%), Positives = 29/80 (36%)

Query: 182 TLYNKRRMAEAIKFARALAQRYPGSGVAWKSLGFALHRDGQYGPACEALTKGAAMLPDDA 241
Y + +A K +AL + LG GQY A + + GA M +
Sbjct: 45 NQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMDIKEP 104

Query: 242 ECNTLYADTLRVLNRLADAE 261
A+ L LA+AE
Sbjct: 105 RFPFHAAECLLQKGELAEAE 124


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_17450  DHBDHDRGNASE  97  3e-26  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 96.7 bits (240), Expect = 3e-26
Identities = 66/251 (26%), Positives = 111/251 (44%), Gaps = 9/251 (3%)

Query: 9 LAGGTYLVTGASSGIGRAAAIAIAQLGGRLVLGGRDPARLADTLAALPGDGHASHAAALD 68
+ G +TGA+ GIG A A +A G + +P +L +++L + + A D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 69 DADAAA--DWVGALAETHGPLAGVFHAAGVELIRPARMTAQAQLEQVFGASLYAAFGIAR 126
D+AA + + GP+ + + AGV + + E F + F +R
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 127 AAAKKNVIADGGSVVYMSSVAGSTGQVGMTAYSAAKAGIEGLVRSLACELAPRRIRANAI 186
+ +K + GS+V + S + M AY+++KA + L ELA IR N +
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 187 AAGAVKTEMHARL--TRGTPEDALAAYEASHLLG-----FGEPGDVAAAAIFLLSGASRW 239
+ G+ +T+M L E + + G +P D+A A +FL+SG +
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGH 245

Query: 240 ITGTSLVVDGG 250
IT +L VDGG
Sbjct: 246 ITMHNLCVDGG 256


50  E4F39_17450  E4F39_17625  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_17450-116-3.744311hypothetical protein
E4F39_17455-116-4.716803alpha/beta fold hydrolase
E4F39_17460-118-4.875061methyl-accepting chemotaxis protein
E4F39_17465-113-2.844248histidinol dehydrogenase
E4F39_174701160.141421FAD-binding oxidoreductase
E4F39_17475-1111.163142hypothetical protein
E4F39_17480-1100.721794chitin-binding protein
E4F39_17485080.032351hypothetical protein
E4F39_17490070.005311DUF3396 domain-containing protein
E4F39_174952130.026129VRR-NUC domain-containing protein
E4F39_17500-210-0.156343PAAR domain-containing protein
E4F39_175102120.085058hypothetical protein
E4F39_17515310-0.370257XRE family transcriptional regulator
E4F39_17520211-0.773136site-specific integrase
E4F39_17530112-2.230388hypothetical protein
E4F39_17535120-4.124501*cytochrome c5 family protein
E4F39_17540023-4.985977ATP-dependent DNA helicase Rep
E4F39_17545336-6.869791hypothetical protein
E4F39_17550641-7.485668SGNH/GDSL hydrolase family protein
E4F39_17555652-11.594778oxidoreductase
E4F39_17560655-12.465568hypothetical protein
E4F39_17565758-12.595099glycine cleavage system aminomethyltransferase
E4F39_17570862-12.976907glycine cleavage system protein GcvH
E4F39_17575863-13.561106glycine dehydrogenase
E4F39_175801060-13.257264ubiquinol-cytochrome C reductase
E4F39_175851045-6.260265alginate lyase
E4F39_17590421-4.410601L-serine ammonia-lyase
E4F39_17595319-2.167027thiamine pyrophosphate-binding protein
E4F39_176001160.025199ADP-heptose--LPS heptosyltransferase
E4F39_17615-1141.206197helix-turn-helix domain-containing protein
E4F39_17620-2101.291953CarR
E4F39_17625-1123.163273aldehyde dehydrogenase family protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_17625  OMADHESIN  28  0.045  Yersinia outer membrane adhesin signature.
		>OMADHESIN#Yersinia outer membrane adhesin signature.

Length = 455

Score = 27.9 bits (61), Expect = 0.045
Identities = 32/107 (29%), Positives = 45/107 (42%), Gaps = 2/107 (1%)

Query: 76 LSLSAIAASEAFSFAYAWTCRRHRWPLALAAGLAAWAAAASALARLPATPPAATAVAFAA 135
+S+SA S FS YA+ P A ++ A A L P PP A A
Sbjct: 7 ISVSAALISALFSSPYAFADDYDGIPNLTAVQISPNADPALGLE-YPVRPPVPGAGGLNA 65

Query: 136 TCFGQSCLPRGATLAPRAPLSHADLAGRLAAGAALALAVTSLAGALG 182
+ G + GAT + A AG +A G ++A+ L+ ALG
Sbjct: 66 SAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVN-SVAIGPLSKALG 111


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_17645  PF05272  27  0.030  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 26.6 bits (58), Expect = 0.030
Identities = 13/42 (30%), Positives = 21/42 (50%)

Query: 5 IRAAACCVLVSSLAACVVTPPRPAPAPRPSPQVVGYERMQQI 46
+ + A V+ + A PPRP P PRP + +E +Q +
Sbjct: 99 LESVAGIVMGAPAGAPAPKPPRPEPPPRPVVEKECWETIQPV 140


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_17675  PHPHTRNFRASE  31  0.006  Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 31.3 bits (71), Expect = 0.006
Identities = 23/111 (20%), Positives = 38/111 (34%), Gaps = 8/111 (7%)

Query: 68 DAHLNDEPKALPRVHTEGTLPHEGIYDQSAEALNDMELMRNAALAWRVTNQSRYLALVDR 127
D L D K + E + + S ++ E M N + R + + R
Sbjct: 83 DPELVDGIKGK--IENEQMNAEYALKEVSDMFVSMFESMDNEYMKERAAD---IRDVSKR 137

Query: 128 FLSTWVNTYRPSFNPIDETRFESLILAYDMTASALPVKTRNAAAAFIAALG 178
L + S I E E++I+A D+T S + F +G
Sbjct: 138 VLGHLIGVETGSLATIAE---ETVIIAEDLTPSDTAQLNKQFVKGFATDIG 185


51  E4F39_17860  E4F39_17990  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_17860118-3.020861ParA family protein
E4F39_17865-115-3.71613616S rRNA (guanine(527)-N(7))-methyltransferase
E4F39_17870-117-3.929231tRNA uridine-5-carboxymethylaminomethyl(34)
E4F39_17875015-5.020407ABC transporter ATP-binding protein
E4F39_17880-114-4.690982branched-chain amino acid ABC transporter
E4F39_17885116-4.128096branched-chain amino acid ABC transporter
E4F39_17890117-3.782380ABC transporter substrate-binding protein
E4F39_17895223-4.710183hypothetical protein
E4F39_17900225-4.940931branched-chain amino acid ABC transporter
E4F39_17905120-5.002490branched-chain amino acid ABC transporter
E4F39_17910221-4.563182ABC transporter substrate-binding protein
E4F39_17915219-3.654179ABC transporter ATP-binding protein
E4F39_17920016-3.607809ABC transporter ATP-binding protein
E4F39_17925017-3.992313methyltransferase
E4F39_17930-118-3.808487hypothetical protein
E4F39_17935123-4.438957choline dehydrogenase
E4F39_17940023-4.835641hypothetical protein
E4F39_17945225-5.227238CoA-acylating methylmalonate-semialdehyde
E4F39_17950125-5.113779hypothetical protein
E4F39_17960122-3.567326CYTH domain-containing protein
E4F39_17965018-2.374877Lrp/AsnC family transcriptional regulator
E4F39_17970-114-2.147724phenylalanine 4-monooxygenase
E4F39_17975-312-2.0831254a-hydroxytetrahydrobiopterin dehydratase
E4F39_17980-312-1.874983DUF3717 domain-containing protein
E4F39_17985-111-2.115368response regulator transcription factor
E4F39_17990-111-3.425789sensor histidine kinase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_18075  HTHFIS  81  7e-20  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 81.0 bits (200), Expect = 7e-20
Identities = 31/135 (22%), Positives = 62/135 (45%), Gaps = 1/135 (0%)

Query: 2 RLLLIEDDRPIARGIQSSLEQAGFTVDMVHDGIFAEQALAQNRHELVILDLGLPGIDGMT 61
+L+ +DD I + +L +AG+ V + + + +A +LV+ D+ +P +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 LLSRFRQTNRHTPVIVLTARDELNDRVQGLNSGADDYMLKPFEPAE-LEARIRAVMRRSG 120
LL R ++ PV+V++A++ ++ GA DY+ KPF+ E + RA+
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 121 PHSDMPRPEVSLGGV 135
S + +
Sbjct: 125 RPSKLEDDSQDGMPL 139


52  E4F39_18195  E4F39_18280  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
E4F39_181950113.591327flagellar biosynthetic protein FliO
E4F39_182000123.275957flagellar type III secretion system pore protein
E4F39_18205-3100.864230flagellar biosynthesis protein FliQ
E4F39_18210-3111.619598flagellar biosynthetic protein FliR
E4F39_18215-2131.752736MerR family transcriptional regulator
E4F39_18220-1110.878732class I SAM-dependent methyltransferase
E4F39_18225-190.068138transporter substrate-binding domain-containing
E4F39_18230-190.114085ABC transporter ATP-binding protein
E4F39_182350122.473273ABC transporter permease
E4F39_182402141.120917sensor histidine kinase
E4F39_182452161.507726response regulator
E4F39_182501131.023802porin
E4F39_182550120.438384hypothetical protein
E4F39_182650100.072672site-specific DNA-methyltransferase
E4F39_18275212-1.466486restriction endonuclease subunit R
E4F39_18280212-1.086497hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_18295  FLGBIOSNFLIP  291  e-102  Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 291 bits (747), Expect = e-102
Identities = 155/242 (64%), Positives = 192/242 (79%), Gaps = 1/242 (0%)

Query: 11 RWLPAILIGLAPALACAQAAGLPAFNSAPGPNGGTTYSLSVQTMLLLTMLSFLPAMLLMM 70
R L + L + A LP S P P GG ++SL VQT++ +T L+F+PA+LLMM
Sbjct: 3 RLLSVAPVLLW-LITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLMM 61

Query: 71 TSFTRIIIVLSLLRQAIGTASTPPNQVLVGLALFLTLFVMSPVIDRAYNDAYKPFSEGTL 130
TSFTRIIIV LLR A+GT S PPNQVL+GLALFLT F+MSPVID+ Y DAY+PFSE +
Sbjct: 62 TSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEKI 121

Query: 131 QMDQAVQRGTAPFKAFMLKQTRETDLALFAKISKAAPMQGPEDVPLSLLVPAFVTSELKT 190
M +A+++G P + FML+QTRE DL LFA+++ P+QGPE VP+ +L+PA+VTSELKT
Sbjct: 122 SMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELKT 181

Query: 191 GFQIGFTIFIPFLIIDMVVASVLMSMGMMMVSPATVSLPFKLMLFVLVDGWQLLIGSLAQ 250
FQIGFTIFIPFLIID+V+ASVLM++GMMMV PAT++LPFKLMLFVLVDGWQLL+GSLAQ
Sbjct: 182 AFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLAQ 241

Query: 251 SF 252
SF
Sbjct: 242 SF 243
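
The "Score = ..., Expect = ..." lines that head each alignment can be read mechanically, which is useful when re-checking hits against an E-value cutoff. The following is a minimal Python sketch, not part of the PredictBias output: the function names `parse_score_line` and `passes_threshold` are hypothetical, the default cutoff of 0.05 mirrors the E-value (RPSBlast) input parameter listed for this run, and treating the cutoff as inclusive is an assumption. It also handles the shorthand seen above, where an E-value is printed as `e-102` with no leading digit.

```python
import re

# Matches lines such as "Score = 291 bits (747), Expect = e-102".
SCORE_LINE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)


def parse_score_line(line):
    """Return (bit_score, raw_score, evalue) or None if the line does not match."""
    match = SCORE_LINE.search(line)
    if match is None:
        return None
    bits, raw, expect = match.groups()
    # The report abbreviates "1e-102" as "e-102"; float() needs the leading digit.
    if expect.startswith("e"):
        expect = "1" + expect
    return float(bits), int(raw), float(expect)


def passes_threshold(evalue, threshold=0.05):
    """Assumed rule: keep a hit when its E-value is at or below the cutoff."""
    return evalue <= threshold


if __name__ == "__main__":
    for text in (
        "Score = 291 bits (747), Expect = e-102",
        "Score = 26.2 bits (57), Expect = 0.021",
    ):
        bits, raw, evalue = parse_score_line(text)
        print(bits, raw, evalue, passes_threshold(evalue))
```

Lines that are not score lines (alignment rows, headers) return `None`, so the parser can be run over the whole report without pre-filtering.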


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_18300  TYPE3IMQPROT  69  4e-19  Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 68.6 bits (168), Expect = 4e-19
Identities = 26/85 (30%), Positives = 46/85 (54%)

Query: 4 ENVMTLAHQAMYIGLLLAAPLLLVALAVGLVVSLFQAATQINEATLSFIPKLLAVAATMV 63
++++ ++A+Y+ L+L+ +VA +GL+V LFQ TQ+ E TL F KLL V +
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 IAGPWMLSTMIDYLRETLLRVATLG 88
+ W ++ Y R+ + G
Sbjct: 62 LLSGWYGEVLLSYGRQVIFLALAKG 86


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_18305  TYPE3IMRPROT  161  7e-51  Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 161 bits (409), Expect = 7e-51
Identities = 116/253 (45%), Positives = 158/253 (62%), Gaps = 4/253 (1%)

Query: 1 MFSVTYAQLNGWLTAFLWPFVRMLALVAIAPVTGHRSTPVRVKIGLAGFMALVVAPTLPP 60
M VT Q WL + WP +R+LAL++ AP+ RS P RVK+GLA + +AP+LP
Sbjct: 1 MLQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60

Query: 61 MPPMPVATVFSAQGVWIIVNQFLIGAALGFTMQIVFAAIEAAGDIIGLSMGLGFATFFDP 120
VFS +W+ V Q LIG ALGFTMQ FAA+ AG+IIGL MGL FATF DP
Sbjct: 61 NDVP----VFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDP 116

Query: 121 HSSGATPVMGRFLNAVAILAFLAFDGHLQVFAALVDSFRLVPVSADLLRAAGWQTLVAFG 180
S PV+ R ++ +A+L FL F+GHL + + LVD+F +P+ + L + + L G
Sbjct: 117 ASHLNMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAG 176

Query: 181 AAIFEMGLLLALPVVAALLIANLALGILNRAAPQIGIFQVGFPVTMLVGLLLVQLMAPNL 240
+ IF GL+LALP++ LL NLALG+LNR APQ+ IF +GFP+T+ VG+ L+ + P +
Sbjct: 177 SLIFLNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLI 236

Query: 241 IPFVGRLFDTGVD 253
PF LF +
Sbjct: 237 APFCEHLFSEIFN 249


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_18340  PF06580  54  3e-10  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 54.5 bits (131), Expect = 3e-10
Identities = 24/128 (18%), Positives = 45/128 (35%), Gaps = 22/128 (17%)

Query: 334 RIDLGAELDDDLQVAGSESLLSALLMNLVDNAVRYAHE----GGRVTVSARRDGDAVVLE 389
R+ +++ + + L+ LV+N +++ GG++ + +D V LE
Sbjct: 239 RLQFENQINPAIM---DVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLE 295

Query: 390 VVDDGPGIPAEARPHVFKRFYRVARDEEGTGLGLAIVEE-IAQSHGGAVSLATGPGNRGV 448
V + G +E TG GL V E + +G + V
Sbjct: 296 VENTGSLALKN--------------TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKV 341

Query: 449 RMTVRLPA 456
V +P
Sbjct: 342 NAMVLIPG 349


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_18345  HTHFIS  96  3e-25  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 96.1 bits (239), Expect = 3e-25
Identities = 30/119 (25%), Positives = 60/119 (50%), Gaps = 1/119 (0%)

Query: 2 KLLLVEDNAELAHWIVDLLRGEGFGVDSAPDGESADTVLKAQRYDALLLDMRLPGMSGKE 61
+L+ +D+A + + L G+ V + + + A D ++ D+ +P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 LLARLRRRGDNVPVLMLTAHGSVDDKVDCFSAGADDYVVKPFESRELVARI-RALIRRQ 119
LL R+++ ++PVL+++A + + GA DY+ KPF+ EL+ I RAL +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_18350  ECOLNEIPORIN  64  1e-13  E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 64.1 bits (156), Expect = 1e-13
Identities = 56/228 (24%), Positives = 94/228 (41%), Gaps = 28/228 (12%)

Query: 1 MKRQYLALSIATAACAAPQAHAQSSVQLYGLIDLSVPTYRSHANAKGDHVIGMGLGGEPW 60
MK+ +AL++A AA + V LYG I V T RS A+ G + G
Sbjct: 1 MKKSLIALTLAALPVAA-----MADVTLYGTIKAGVETSRSVAH-NGAQAASVETGTGIV 54

Query: 61 FSGSRWGLKGAEDIGGGTKVIFRLESEYTVADGNMEDPGQIFDRDAWVGVENDTFGKLTA 120
GS+ G KG ED+G G K I+++E + ++A + +R +++G++ FGKL
Sbjct: 55 DLGSKIGFKGQEDLGNGLKAIWQVEQKASIAGTD----SGWGNRQSFIGLKGG-FGKLRV 109

Query: 121 GFQNTIARDAAAIYGDPYGSAKLTTEEGGWTNANNFKQMIFYAAGATGTRYNNGLAWKKL 180
G N++ +D I +P+ S RY++ +
Sbjct: 110 GRLNSVLKDTGDI--NPWDSKSDYLGVNKIAEPEARL---------ISVRYDS----PEF 154

Query: 181 FGNGIFASAGYAFSNSTSFGQNSTYQVALGYNGGPFNVSGFFSHVNHA 228
G+ S YA +++ + +Y Y G F V ++ H
Sbjct: 155 A--GLSGSVQYALNDNAGRHNSESYHAGFNYKNGGFFVQYGGAYKRHH 200


53  E4F39_18515  E4F39_18705  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
E4F39_18515  2       17       0.100877    50S ribosomal protein L34
E4F39_18520  2       17       2.270690    ribonuclease P protein component
E4F39_18525  -1      15       2.387280    membrane protein insertion efficiency factor
E4F39_18530  0       16       2.943723    membrane protein insertase YidC
E4F39_18535  2       23       2.192714    transcriptional regulator
E4F39_18540  2       23       1.946652    tRNA uridine-5-carboxymethylaminomethyl(34)
E4F39_18545  -2      14       -0.931281   DUF4102 domain-containing protein
E4F39_18550  -1      16       -2.226524   DUF927 domain-containing protein
E4F39_18555  -1      13       -3.004926   hypothetical protein
E4F39_18560  0       11       -3.705763   hypothetical protein
E4F39_18565  0       13       -3.689285   hypothetical protein
E4F39_18570  0       12       -3.319388   hypothetical protein
E4F39_18575  2       15       -2.894281   AlpA family phage regulatory protein
E4F39_18580  0       13       -3.306809   hypothetical protein
E4F39_18585  1       13       -3.361078   hypothetical protein
E4F39_18590  -1      15       -2.534324   recombinase family protein
E4F39_18595  -1      18       -3.881120   hypothetical protein
E4F39_18600  -1      20       -4.483940   DUF4382 domain-containing protein
E4F39_18605  0       17       -3.461306   lipoprotein
E4F39_18610  0       21       -3.340569   hypothetical protein
E4F39_18615  0       20       -3.159991   spermidine N1-acetyltransferase
E4F39_18620  1       22       -4.054540   hypothetical protein
E4F39_18625  1       18       -2.384260   YaeQ family protein
E4F39_18630  1       17       -1.988449   VOC family protein
E4F39_18635  3       29       -5.560000   hypothetical protein
E4F39_18640  3       34       -6.254805   bifunctional DNA-binding transcriptional
E4F39_18645  3       41       -7.595286   DNA-3-methyladenine glycosylase 2 family
E4F39_18650  4       43       -8.203856   glutamate--cysteine ligase
E4F39_18655  6       61       -13.456536  RNA polymerase sigma factor RpoD/SigA
E4F39_18660  7       59       -13.007682  hypothetical protein
E4F39_18665  5       48       -10.922446  histone deacetylase
E4F39_18670  5       49       -11.491789  periplasmic heavy metal sensor
E4F39_18675  5       49       -11.354553  LysR family transcriptional regulator
E4F39_18680  4       45       -11.632106  polyamine ABC transporter substrate-binding
E4F39_18685  1       24       -6.886230   class II histone deacetylase
E4F39_18690  1       19       -5.194096   N-carbamoylputrescine amidase
E4F39_18695  0       24       -5.437887   hypothetical protein
E4F39_18700  -1      24       -4.580772   phage portal protein
E4F39_18705  -1      24       -5.098990   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_18605  60KDINNERMP  490  e-171  60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 490 bits (1263), Expect = e-171
Identities = 204/576 (35%), Positives = 320/576 (55%), Gaps = 46/576 (7%)

Query: 1 MDIKRTVLWVIFFMSAVMLFDNWQRSHGRPSMFFPNVTQTNTASNATNGNGASGASAAAA 60
MD +R +L + + M++ W++ Q T + T
Sbjct: 1 MDSQRNLLVIALLFVSFMIWQAWEQDKNPQ-----PQAQQTTQTTTT------------- 42

Query: 61 ANALPAAATGAAPATTAPAAQAQLVRFSTDVYNGEIDTRGGTLAKLTLTK---AGDGKQP 117
AA AA + Q +L+ TDV + I+TRGG + + L + QP
Sbjct: 43 ------AAGSAADQGVPASGQGKLISVKTDVLDLTINTRGGDVEQALLPAYPKELNSTQP 96

Query: 118 DLSVTLFDHTANHTYLARTGLLGGDFPN-----HNDVYAQVAGPTSLAADQNTLKLSFES 172
L + + Y A++GL G D P+ +Y LA QN L++
Sbjct: 97 ---FQLLETSPQFIYQAQSGLTGRDGPDNPANGPRPLYNVEKDAYVLAEGQNELQVPMTY 153

Query: 173 PVKGGVKVVKTYTFTRGSYVIGVDTKIENVGAAPVTPSVYMELVRD-----NSSVETPMF 227
G KT+ RG Y + V+ ++N G P+ S + +L + + + F
Sbjct: 154 TDAAGNTFTKTFVLKRGDYAVNVNYNVQNAGEKPLEISSFGQLKQSITLPPHLDTGSSNF 213

Query: 228 S-HTFLGPAVYTDQKHFQKITFGDIDKNKADYVTSADNGWIAMVQHYFASAWIPQSGAKR 286
+ HTF G A T + ++K F I N+ ++S GW+AM+Q YFA+AWIP +
Sbjct: 214 ALHTFRGAAYSTPDEKYEKYKFDTIADNENLNISS-KGGWVAMLQQYFATAWIPHNDGTN 272

Query: 287 DIYVEKIDPTLYRVGVKQPVAAIAPGQSADVSARLFAGPEEERMLEGIAPGLELVKDYGW 346
+ Y + + +G K + PGQ+ +++ L+ GPE + + +AP L+L DYGW
Sbjct: 273 NFYTANLGNGIAAIGYKSQPVLVQPGQTGAMNSTLWVGPEIQDKMAAVAPHLDLTVDYGW 332

Query: 347 VTIIAKPLFWLLEKIHGFVGNWGWAIVLLTLLIKAVFFPLSAASYKSMARMKEITPRMQA 406
+ I++PLF LL+ IH FVGNWG++I+++T +++ + +PL+ A Y SMA+M+ + P++QA
Sbjct: 333 LWFISQPLFKLLKWIHSFVGNWGFSIIIITFIVRGIMYPLTKAQYTSMAKMRMLQPKIQA 392

Query: 407 LRERFKSDPQKMNAALMELYKTEKVNPFGGCLPVVIQIPVFISLYWVLLASVEMRGAPWV 466
+RER D Q+++ +M LYK EKVNP GGC P++IQ+P+F++LY++L+ SVE+R AP+
Sbjct: 393 MRERLGDDKQRISQEMMALYKAEKVNPLGGCFPLLIQMPIFLALYYMLMGSVELRQAPFA 452

Query: 467 LWIHDLSQRDPYFILPVLMAVSMFVQTKLNPTP-PDPVQAKMMMFMPIAFSVMFFFFPAG 525
LWIHDLS +DPY+ILP+LM V+MF K++PT DP+Q K+M FMP+ F+V F +FP+G
Sbjct: 453 LWIHDLSAQDPYYILPILMGVTMFFIQKMSPTTVTDPMQQKIMTFMPVIFTVFFLWFPSG 512

Query: 526 LVLYYVVNNVLSIAQQYYITRTL---GGAAAKKKAS 558
LVLYY+V+N+++I QQ I R L G + +KK S
Sbjct: 513 LVLYYIVSNLVTIIQQQLIYRGLEKRGLHSREKKKS 548


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_18615  PF05272  37  2e-04  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 36.6 bits (84), Expect = 2e-04
Identities = 25/123 (20%), Positives = 40/123 (32%), Gaps = 9/123 (7%)

Query: 191 IDFLEAADARGKLAHIR--ERLAHVLGDARQGALLREGLSV----VLAGQPNVGKSSLLN 244
+ L K +R + + + ++ G VL G +GKS+L+N
Sbjct: 555 VHVLGKTPDDYKPRRLRYLQLVGKYILMGHVARVMEPGCKFDYSVVLEGTGGIGKSTLIN 614

Query: 245 ALAGAELAIVTPI-AGTTRDKVAQTIQIEGIPLHIIDTAGLRETEDEVEKIGIARTWGEI 303
L G + T GT +D Q I L + R + E K +
Sbjct: 615 TLVGLDFFSDTHFDIGTGKDSYEQIAGIVAYELSEMT--AFRRADAEAVKAFFSSRKDRY 672

Query: 304 ERA 306
A
Sbjct: 673 RGA 675


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_18705  SACTRNSFRASE  41  8e-07  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 40.7 bits (95), Expect = 8e-07
Identities = 23/113 (20%), Positives = 45/113 (39%), Gaps = 11/113 (9%)

Query: 37 FEEPYETFTELSQLYDQHVHDQRERRFVAFDSDGELVGLVELI----ELDYIHRRGEFQI 92
F +PY E + +V ++ + F+ + + +G +++ I I
Sbjct: 42 FSKPYFKQYEDDDMDVSYVEEEGKAAFLYY-LENNCIGRIKIRSNWNGYALIE-----DI 95

Query: 93 IIAPNRQGRGFATRATRLAVEYAFKVLNLRKLYLIVDKSNVAAIRVYEKCGFK 145
+A + + +G T A+E+A K + L L N++A Y K F
Sbjct: 96 AVAKDYRKKGVGTALLHKAIEWA-KENHFCGLMLETQDINISACHFYAKHHFI 147


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_18785  MALTOSEBP  30  0.018  Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 29.7 bits (66), Expect = 0.018
Identities = 17/49 (34%), Positives = 27/49 (55%)

Query: 44 FEKETGIKVRLDVYDSNEALQTKLTTGNSGYDLVFPSNDFLARQIQAGL 92
FEK+TGIKV ++ D E ++ G D++F ++D Q+GL
Sbjct: 53 FEKDTGIKVTVEHPDKLEEKFPQVAATGDGPDIIFWAHDRFGGYAQSGL 101


54  E4F39_18800  E4F39_18895  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
E4F39_18800  3       18       -4.355343   DUF4390 domain-containing protein
E4F39_18805  3       18       -5.710799   PAS domain-containing sensor histidine kinase
E4F39_18810  3       24       -6.524486   response regulator
E4F39_18815  7       42       -9.529433   *hypothetical protein
E4F39_18820  6       43       -9.318294   7-cyano-7-deazaguanine synthase QueC
E4F39_18825  2       28       -6.120396   7-carboxy-7-deazaguanine synthase
E4F39_18830  2       25       -4.461547   6-carboxytetrahydropterin synthase QueD
E4F39_18835  1       17       -3.215076   hypothetical protein
E4F39_18840  0       12       -1.867850   sel1 repeat family protein
E4F39_18850  -1      11       -0.377689   rod shape-determining protein RodA
E4F39_18855  1       11       -1.411294   penicillin-binding protein 2
E4F39_18860  2       14       -1.968489   rod shape-determining protein MreD
E4F39_18865  1       13       -0.746899   rod shape-determining protein
E4F39_18870  0       13       -1.108328   Asp-tRNA(Asn)/Glu-tRNA(Gln) amidotransferase
E4F39_18875  1       14       -0.173958   Asp-tRNA(Asn)/Glu-tRNA(Gln) amidotransferase
E4F39_18880  0       13       -0.471521   Asp-tRNA(Asn)/Glu-tRNA(Gln) amidotransferase
E4F39_18885  0       14       0.779680    polyphosphate kinase 2 family protein
E4F39_18890  2       14       1.301032    exodeoxyribonuclease III
E4F39_18895  2       12       0.882692    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_18935  HTHFIS  108  1e-29  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 108 bits (272), Expect = 1e-29
Identities = 46/144 (31%), Positives = 69/144 (47%), Gaps = 1/144 (0%)

Query: 2 ATILVVDDEMGIRELLSEILSDEGHVVDVAENAQAARDYRLRQAPDLVLLDIWMPDTDGV 61
ATILV DD+ IR +L++ LS G+ V + NA + DLV+ D+ MPD +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 TLLKEWASQGQLTMPVIMMSGHATIDTAVEATKIGALDFLEKPITLQKLLKSVEHGLARG 121
LL +PV++MS T TA++A++ GA D+L KP L +L+ + LA
Sbjct: 64 DLLPRIKKARP-DLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEP 122

Query: 122 AAPLPASAAAKPAAGAAVASAAAL 145
V +AA+
Sbjct: 123 KRRPSKLEDDSQDGMPLVGRSAAM 146



Score = 47.1 bits (112), Expect = 2e-08
Identities = 19/106 (17%), Positives = 37/106 (34%), Gaps = 3/106 (2%)

Query: 113 SVEHGLARGAAPLPASAAAKPAAGAAVASAAALPTLGDDPAVALAGQTTAAIPFDIPLRE 172
+ E + +P S AA + + ++ ++ A+P
Sbjct: 375 TREIIENELRSEIPDSP---IEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDR 431

Query: 173 ARDAFERAYFEYHLARENGSMTRVAEKTGLERTHLYRKLKQLGVEL 218
E L G+ + A+ GL R L +K+++LGV +
Sbjct: 432 VLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVSV 477


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_18990  cloacin  31  0.024  Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 30.8 bits (69), Expect = 0.024
Identities = 21/65 (32%), Positives = 28/65 (43%), Gaps = 1/65 (1%)

Query: 681 SGADGASGASGAGGEPTEHANAGGNPAGGGIAGGAAGTANNGSGAAAPGGM-PGANGAAM 739
+G GAS G +E+ GG G GG +G N G + GG G N +A+
Sbjct: 25 TGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGNLSAV 84

Query: 740 GTPPA 744
P A
Sbjct: 85 AAPVA 89



Score = 30.8 bits (69), Expect = 0.026
Identities = 23/83 (27%), Positives = 27/83 (32%), Gaps = 14/83 (16%)

Query: 677 PASASGADGASGASGAGGE-----------PTEHANAGGNPAG-GGIAGGAAGTANNGSG 724
P GAS SG E +G G G +GG +GT N S
Sbjct: 24 PTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGNLSA 83

Query: 725 AAAPG--GMPGANGAAMGTPPAS 745
AAP G P + G S
Sbjct: 84 VAAPVAFGFPALSTPGAGGLAVS 106


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_19005  SHAPEPROTEIN  504  0.0  Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 504 bits (1300), Expect = 0.0
Identities = 247/348 (70%), Positives = 294/348 (84%), Gaps = 2/348 (0%)

Query: 1 MFGFLRSYFSNDLAIDLGTANTLIYMRGKGIVLDEPSVVSIRQEGGPNGKKTIQAVGKEA 60
M R FSNDL+IDLGTANTLIY++G+GIVL+EPSVV+IRQ+ K++ AVG +A
Sbjct: 1 MLKKFRGMFSNDLSIDLGTANTLIYVKGQGIVLNEPSVVAIRQDRA-GSPKSVAAVGHDA 59

Query: 61 KQMLGKVPGNIEAIRPMKDGVIADFTVTEQMIKQFIKTAHESRMFSPSPRIIICVPCGST 120
KQMLG+ PGNI AIRPMKDGVIADF VTE+M++ FIK H + PSPR+++CVP G+T
Sbjct: 60 KQMLGRTPGNIAAIRPMKDGVIADFFVTEKMLQHFIKQVHSNSFMRPSPRVLVCVPVGAT 119

Query: 121 QVERRAIKEAAHGAGASQVYLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVGVISLG 180
QVERRAI+E+A GAGA +V+LIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEV VISL
Sbjct: 120 QVERRAIRESAQGAGAREVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLN 179

Query: 181 GIVYKGSVRVGGDKFDEAIVNYIRRNYGMLIGEQTAEAIKKEIGSAFPGSEVKEMEVKGR 240
G+VY SVR+GGD+FDEAI+NY+RRNYG LIGE TAE IK EIGSA+PG EV+E+EV+GR
Sbjct: 180 GVVYSSSVRIGGDRFDEAIINYVRRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGR 239

Query: 241 NLSEGIPRSFTISSNEILEALTDPLNQIVSSVKIALEQTPPELGADIAERGMMLTGGGAL 300
NL+EG+PR FT++SNEILEAL +PL IVS+V +ALEQ PPEL +DI+ERGM+LTGGGAL
Sbjct: 240 NLAEGVPRGFTLNSNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGAL 299

Query: 301 LRDLDRLLAEETGLPVLVAEDPLTCVVRGSGMALERMDKL-GSIFSYE 347
LR+LDRLL EETG+PV+VAEDPLTCV RG G ALE +D G +FS E
Sbjct: 300 LRNLDRLLMEETGIPVVVAEDPLTCVARGGGKALEMIDMHGGDLFSEE 347


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_19020  TYPE4SSCAGA  31  0.013  Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 30.8 bits (69), Expect = 0.013
Identities = 29/89 (32%), Positives = 43/89 (48%), Gaps = 5/89 (5%)

Query: 395 SNKIAKEIFVTIWDEKAADEGAADRIIEAKGLK-QISDTGALEAIIDEVLAANAKSVEEF 453
+N EIF I E D A KG+K ++SD LE + ++ L KS +EF
Sbjct: 648 ANSQKDEIFALINKEANRDARAIAYAQNLKGIKRELSDK--LENV-NKNLKDFDKSFDEF 704

Query: 454 RAGKDKAFNALVGQAMKATKGKANPQQVN 482
+ GK+K F+ + +KA KG +N
Sbjct: 705 KNGKNKDFSK-AEETLKALKGSVKDLGIN 732


55  E4F39_19230  E4F39_19315  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
E4F39_19230  -1      11       3.545348    hypothetical protein
E4F39_19240  1       13       3.178015    hypothetical protein
E4F39_19245  2       15       1.658785    NAD(P)/FAD-dependent oxidoreductase
E4F39_19250  3       13       1.291240    iron-sulfur protein
E4F39_19255  1       13       0.281516    MFS transporter
E4F39_19260  1       10       2.296920    hypothetical protein
E4F39_19265  -1      10       3.682554    ABC transporter substrate-binding protein
E4F39_19270  2       10       4.334194    ABC transporter permease subunit
E4F39_19275  0       10       4.675785    ABC transporter permease subunit
E4F39_19280  -2      9        3.827632    ABC transporter ATP-binding protein
E4F39_19285  -1      10       3.157137    TraB/GumN family protein
E4F39_19290  -1      12       2.123050    DUF979 domain-containing protein
E4F39_19295  1       13       1.005564    DUF969 domain-containing protein
E4F39_19300  0       14       1.330557    5-oxoprolinase subunit PxpA
E4F39_19305  0       15       1.318465    biotin-dependent carboxyltransferase family
E4F39_19310  -2      14       2.244047    5-oxoprolinase subunit PxpB
E4F39_19315  -2      13       3.206641    MarR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_19360  TCRTETA  33  0.002  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 32.9 bits (75), Expect = 0.002
Identities = 22/107 (20%), Positives = 43/107 (40%), Gaps = 9/107 (8%)

Query: 70 FMRPIGGIVLGLYADRAGRKAALSLVILLMTFGIFLIAVAPPYAAIGIGGPLLIVLGRLL 129
M+ VLG +DR GR+ L + + ++A AP ++ +GR++
Sbjct: 54 LMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLW--------VLYIGRIV 105

Query: 130 QGFSAGGEFGSATALLIEAAPLSRRGYYGSWQMASQAAALLFGSLVG 176
G + G A A + + R + + A ++ G ++G
Sbjct: 106 AGIT-GATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLG 151


56  E4F39_19375  E4F39_19565  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
E4F39_19375  3       13       0.587859    flagellar basal body rod protein FlgB
E4F39_19380  2       12       1.292142    flagellar basal body rod protein FlgC
E4F39_19385  1       11       1.813598    flagellar hook assembly protein FlgD
E4F39_19390  -1      10       2.467410    flagellar hook protein FlgE
E4F39_19395  -1      10       3.390180    flagellar basal-body rod protein FlgF
E4F39_19400  0       11       3.451113    flagellar basal-body rod protein FlgG
E4F39_19405  -2      11       3.250986    flagellar basal body L-ring protein FlgH
E4F39_19410  -1      10       3.568488    flagellar basal body P-ring protein FlgI
E4F39_19415  -1      11       1.965309    flagellar assembly peptidoglycan hydrolase FlgJ
E4F39_19420  0       11       2.519285    flagellar brake protein
E4F39_19425  0       10       1.932941    flagellar hook-associated protein FlgK
E4F39_19430  1       11       2.134904    flagellar hook-associated protein 3
E4F39_19435  2       12       2.816379    hypothetical protein
E4F39_19440  0       10       2.392845    pyrimidine utilization transport protein G
E4F39_19445  1       12       3.099340    hypothetical protein
E4F39_19450  1       14       2.142501    transcriptional regulator GcvA
E4F39_19455  1       16       1.955814    hypothetical protein
E4F39_19460  4       29       0.540961    chromate transporter
E4F39_19465  5       21       4.401076    chromate transporter
E4F39_19470  2       16       4.223492    hypothetical protein
E4F39_19480  -1      12       2.881774    hypothetical protein
E4F39_19485  2       11       1.334152    alkylphosphonate utilization protein
E4F39_19490  2       11       1.078975    DUF1275 domain-containing protein
E4F39_19495  5       17       -2.013904   hypothetical protein
E4F39_19500  4       19       -1.955145   glyoxalase
E4F39_19505  4       21       -1.004309   porin
E4F39_19510  0       21       -1.015315   hypothetical protein
E4F39_19515  -1      19       -0.225274   hypothetical protein
E4F39_19520  1       18       -0.207098   hypothetical protein
E4F39_19525  2       18       0.022580    HpnL family protein
E4F39_19530  2       16       0.214240    glycosyltransferase
E4F39_19535  1       13       -0.014554   carbohydrate porin
E4F39_19540  2       12       0.395483    hypothetical protein
E4F39_19545  2       12       0.816775    hypothetical protein
E4F39_19550  0       13       1.128014    divalent metal cation transporter
E4F39_19555  1       14       2.248221    4-hydroxy-3-methylbut-2-en-1-yl diphosphate
E4F39_19560  2       14       1.908864    hypothetical protein
E4F39_19565  3       15       2.944807    glutathione-disulfide reductase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_19500  FLGHOOKAP1  27  0.029  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 26.8 bits (59), Expect = 0.029
Identities = 10/38 (26%), Positives = 17/38 (44%)

Query: 102 NVDPVQEMVNMISASRSYQANVETLNTAKQLMLKTLTI 139
V+ +E N+ + Y AN + L TA + + I
Sbjct: 508 GVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_19510  FLGHOOKAP1  34  0.001  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 34.2 bits (78), Expect = 0.001
Identities = 17/58 (29%), Positives = 24/58 (41%)

Query: 356 ISAPGSTNHGTLQGSALENSNVDLTSQLVKLITAQRNYQANAQTIKTQQTVDQTLINL 413
SA L S V+L + L Q+ Y ANAQ ++T + LIN+
Sbjct: 488 SSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545



Score = 30.3 bits (68), Expect = 0.017
Identities = 11/31 (35%), Positives = 17/31 (54%)

Query: 6 GLSGLAGASSDLDVIGNNIANANTVGFKGST 36
+SGL A + L+ NNI++ N G+ T
Sbjct: 7 AMSGLNAAQAALNTASNNISSYNVAGYTRQT 37


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_19515  FLGHOOKAP1  29  0.018  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 29.2 bits (65), Expect = 0.018
Identities = 9/34 (26%), Positives = 18/34 (52%)

Query: 4 LIYTAMTGATQSLEQQSVVANNLANASTTGFRAQ 37
LI AM+G + + +NN+++ + G+ Q
Sbjct: 3 LINNAMSGLNAAQAALNTASNNISSYNVAGYTRQ 36


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_19520  FLGHOOKAP1  42  1e-06  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 42.3 bits (99), Expect = 1e-06
Identities = 10/48 (20%), Positives = 23/48 (47%)

Query: 213 TLKQGYVESSNVNVVQELVNMIQTQRAYEINSKAVTTSDQMLQTVTQM 260
L S VN+ +E N+ + Q+ Y N++ + T++ + + +
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545



Score = 40.3 bits (94), Expect = 5e-06
Identities = 19/80 (23%), Positives = 34/80 (42%), Gaps = 14/80 (17%)

Query: 4 SLYIAATGMNAQQAQMDVISNNLANVSTNGFKGSRAVFEDLLYQTVRQPGANSTQQTELP 63
+ A +G+NA QA ++ SNN+++ + G+ RQ + + L
Sbjct: 3 LINNAMSGLNAAQAALNTASNNISSYNVAGYT--------------RQTTIMAQANSTLG 48

Query: 64 SGLQLGTGVQQVATERLYTQ 83
+G +G GV +R Y
Sbjct: 49 AGGWVGNGVYVSGVQREYDA 68


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_19525  FLGLRINGFLGH  206  3e-69  Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 206 bits (526), Expect = 3e-69
Identities = 128/222 (57%), Positives = 156/222 (70%), Gaps = 7/222 (3%)

Query: 25 AALAAAALALAGCAQIPREPITQQPMSAMPPMPPAMQAPGSIY---NPGYAG-RPLFEDQ 80
A + L+L GCA IP P+ Q SA P P A GSI+ P G +PLFED+
Sbjct: 10 AISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINYGYQPLFEDR 69

Query: 81 RPRNVGDILTIVIAENINATKSSGANTNRQGNTSFDVPTAG-FLGGLF--NKANLSAQGA 137
RPRN+GD LTIV+ EN++A+KSS AN +R G T+F T +L GLF +A++ A G
Sbjct: 70 RPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNARADVEASGG 129

Query: 138 NKFAATGGASAANTFNGTITVTVTNVLPNGNLVVSGEKQMLINQGNEFVRFSGIVNPNTI 197
N F GGA+A+NTF+GT+TVTV VL NGNL V GEKQ+ INQG EF+RFSG+VNP TI
Sbjct: 130 NTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRFSGVVNPRTI 189

Query: 198 SGQNSVYSTQVADARIEYSAKGYINEAETMGWLQRFFLNIAP 239
SG N+V STQVADARIEY GYINEA+ MGWLQRFFLN++P
Sbjct: 190 SGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSP 231


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_19530  FLGPRINGFLGI  363  e-127  Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 363 bits (934), Expect = e-127
Identities = 158/367 (43%), Positives = 216/367 (58%), Gaps = 19/367 (5%)

Query: 4 LAFAPAAARAERLKDLAQIQGVRDNPLIGYGLVVGLDGTGDQTMQTPFTTQTLANMLANL 63
L+ PA A R+KD+A +Q RDN LIGYGLVVGL GTGD +PFT Q++ ML NL
Sbjct: 19 LSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMRAMLQNL 78

Query: 64 GISINNGSANGGGSSAMTNMQLKNVAAVMVTATLPPFARPGEAIDVTVSSLGNAKSLRGG 123
GI+ G +N KN+AAVMVTA LPPFA PG +DVTVSSLG+A SLRGG
Sbjct: 79 GITTQGGQSN-----------AKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGG 127

Query: 124 TLLLTPLKGADGQVYALAQGNMAVGGAGASANGSRVQVNQLAAGRIAGGAIVERSVPNAV 183
L++T L GADGQ+YA+AQG + V G A + + + + R+ GAI+ER +P+
Sbjct: 128 NLIMTSLSGADGQIYAVAQGALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKF 187

Query: 184 AQMNGVLQLQLNDMDYGTAQRIVSAVNS----SFGAGTATALDGRTIQLTAPADSAQQVA 239
L LQL + D+ TA R+ VN+ +G A D + I + P +
Sbjct: 188 KDSV-NLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKPRVA-DLTR 245

Query: 240 FMARLQNLEVSPERAAAKVILNARTGSIVMNQMVTLQNCAVAHGNLSVVVNTQPVVSQPG 299
MA ++NL V + AKV++N RTG+IV+ V + AV++G L+V V P V QP
Sbjct: 246 LMAEIENLTVETD-TPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQPA 304

Query: 300 PFSNGQTVVAQQSQIQLKQDNGSLRMVTAGANLAEVVKALNSLGATPADLMSILQAMKAA 359
PFS GQT V Q+ I Q+ + + G +L +V LNS+G +++ILQ +K+A
Sbjct: 305 PFSRGQTAVQPQTDIMAMQEGSKV-AIVEGPDLRTLVAGLNSIGLKADGIIAILQGIKSA 363

Query: 360 GALRADL 366
GAL+A+L
Sbjct: 364 GALQAEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_19535  FLGFLGJ  227  4e-75  Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 227 bits (579), Expect = 4e-75
Identities = 124/297 (41%), Positives = 172/297 (57%), Gaps = 15/297 (5%)

Query: 15 ALDVQGFDALRSKATAAAPREGVKMVAGQFDAMFTQMMLKSMRDATPSDGLLDSSSSKMY 74
A D Q + L++KA P ++ VA Q + MF QMMLKSMRDA P DGL S +++Y
Sbjct: 12 AWDAQSLNELKAKA-GEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDGLFSSEHTRLY 70

Query: 75 TSMLDQQLAQQMSS-KGIGVADALTKQLLRNANVAPDAQGEGGLAAMNALAKAYANSNGA 133
TSM DQQ+AQQM++ KG+G+A+ + KQ+ + ++ + Y N +
Sbjct: 71 TSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLETVVRYQNQALS 130

Query: 134 PGNGALAGTRGYSAASALTPPLKGNGNSAQADAFVEKMALAAQAASATTGIPARFIVGQA 193
P + + AF+ +++L AQ AS +G+P I+ QA
Sbjct: 131 ------------QLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQA 178

Query: 194 ALESGWGKREIRGANGESSYNVFGIKATKGWTGRTVSAVTTEYVNGRPHRVVAQFRAYDS 253
ALESGWG+R+IR NGE SYN+FG+KA+ W G TTEY NG +V A+FR Y S
Sbjct: 179 ALESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSS 238

Query: 254 YEHAMTDYANLLKNNPRYASVLNAGHNAEGFAHGMQKAGYATDPHYAKKLISIMQQI 310
Y A++DY LL NPRYA+V A +AE A +Q AGYATDPHYA+KL +++QQ+
Sbjct: 239 YLEALSDYVGLLTRNPRYAAVTTAA-SAEQGAQALQDAGYATDPHYARKLTNMIQQM 294


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_19545  FLGHOOKAP1  231  4e-70  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 231 bits (591), Expect = 4e-70
Identities = 162/444 (36%), Positives = 253/444 (56%), Gaps = 12/444 (2%)

Query: 3 NTLMNLGVSGLNAALWGLTTTGQNISNAATPGYSVERPVYAEASGQYTSSGYLPQGVSTV 62
++L+N +SGLNAA L T NIS+ GY+ + + A+A+ + G++ GV
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 60

Query: 63 TVERQYNQYLSNQLNAAQTQGSSLSTYYTLVAQLNNYVGSPTAGIATAITNYFTGLQTVA 122
V+R+Y+ +++NQL AAQTQ S L+ Y +++++N + + T+ +AT + ++FT LQT+
Sbjct: 61 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 120

Query: 123 NNAADPSARQTAMSNAQTLASQLVAAGQQYSQLRQSVNSQLTDTVTQINSYTSQIAQLNE 182
+NA DP+ARQ + ++ L +Q Q + VN + +V QIN+Y QIA LN+
Sbjct: 121 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 180

Query: 183 QIA--SASSQGQPPNQLLDQRDLAVSKLSQLAGVQV-VQSNGNYSVFLSGGQPLVVGNAS 239
QI+ + G PN LLDQRD VS+L+Q+ GV+V VQ G Y++ ++ G LV G+ +
Sbjct: 181 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 240

Query: 240 YQLATVASPSDPSELTI-VSKGVAGSAQPGPTQYLPDVSLTGGALGGLLAFRSQTLDPAQ 298
QLA V S +DPS T+ G AG+ + +P+ L G+LGG+L FRSQ LD +
Sbjct: 241 RQLAAVPSSADPSRTTVAYVDGTAGNIE------IPEKLLNTGSLGGILTFRSQDLDQTR 294

Query: 299 AQLGALAVSFASQVNAQNALGVDMSGNPGGSLFAVGAPAVYANQNNTGSATLSVSFVDGT 358
LG LA++FA N Q+ G D +G+ G FA+G PAV N N G + + D +
Sbjct: 295 NTLGQLALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDAS 354

Query: 359 QPTTSDYALSYDGAKYTLTDRATGSVVGTATPSSTPPTMTIGGLKLSLSSTPNAGDSFTV 418
+DY +S+D ++ +T R + T TP + + GL+L+ + TP DSFT+
Sbjct: 355 AVLATDYKISFDNNQWQVT-RLASNTTFTVTPDAN-GKVAFDGLELTFTGTPAVNDSFTL 412

Query: 419 LPTRGALDGFSLATANGSAIAAAS 442
P A+ + + + IA AS
Sbjct: 413 KPVSDAIVNMDVLITDEAKIAMAS 436



Score = 83.1 bits (205), Expect = 9e-19
Identities = 46/105 (43%), Positives = 66/105 (62%)

Query: 561 GTNDGRNALALSQLVNSKTMNNGTTTLTGAYAGYVNAIGNAASQLKASSAAQTALVGQIT 620
G +D RN AL L ++ G + AYA V+ IGN + LK SSA Q +V Q++
Sbjct: 441 GDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGNKTATLKTSSATQGNVVTQLS 500

Query: 621 QAQQSVSGVNQNEEAANLMQYQQLYQANAKVIQTANSVFQTVLGL 665
QQS+SGVN +EE NL ++QQ Y ANA+V+QTAN++F ++ +
Sbjct: 501 NQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_19550  FLAGELLIN  41  6e-06  Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 41.2 bits (96), Expect = 6e-06
Identities = 55/369 (14%), Positives = 113/369 (30%), Gaps = 10/369 (2%)

Query: 16 MNDQQAQIAQLYQQVSSGISLTTPADNPLAAAQAVQLSATSATLAQYTQNQTIVQTALQT 75
+N Q+ ++ +++SSG+ + + D+ A A + ++ L Q ++N + QT
Sbjct: 17 LNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQASRNANDGISIAQT 76

Query: 76 EDTTLTSVNDVLNAAYQALMHAGDGGLSDSDRAALAAQIQGSRDHLLTLANTADGAGNYL 135
+ L +N+ L + + A +G SDSD ++ +IQ + + ++N G +
Sbjct: 77 TEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEIDRVSNQTQFNGVKV 136

Query: 136 FAGFQPTTQPFSNKPGGGVTY------AGDYGARAVQIADTRTVSQGDNGANVFMSVPFL 189
+ G +T G + + + GD ++ +
Sbjct: 137 LSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEATVGDLKSSFKNVTGYD 196

Query: 190 GSLPVPAAGASNTGTGTIGAVSITNPSDPTNTHQFTITFGGTAAAPTYTVTDNSVTPPTT 249
+ +G + + T A T D T +T
Sbjct: 197 TYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDAENNTAVDLFKTTKST 256

Query: 250 TAAQAYSSGQGINLGGQTVAVSGKPAVGDTFTVTPAPQAGTDVFATLD----TVIAALKS 305
+ G GG+ V T V T++ T+ A +
Sbjct: 257 AGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVSTTINGEKVTLTVADIT 316

Query: 306 PVGNSQTASTALTNTMATASTKLMNTMTNVLTVQASVGGRLQEVKAMQAVTTTNTLQTTN 365
+ A+T ++ S + T S E + T+
Sbjct: 317 AGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLEANNAVKGESKITVNGAE 376

Query: 366 SLSNLTDTN 374
+N
Sbjct: 377 YTANAAGDK 385


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_19585  ACRIFLAVINRP  28  0.021  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 28.3 bits (63), Expect = 0.021
Identities = 17/63 (26%), Positives = 30/63 (47%), Gaps = 2/63 (3%)

Query: 110 YVQQGMMPVTAGLVVASAVLISEASNRSALQWGITAAVAAL-AYRTRVHPLWLLAGGALA 168
Y G++ T GL +A+LI E + + G A L A R R+ P+ + + +
Sbjct: 925 YFMVGLL-TTIGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFIL 983

Query: 169 GLV 171
G++
Sbjct: 984 GVL 986


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_19620  ECOLNEIPORIN  68  6e-15  E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 68.3 bits (167), Expect = 6e-15
Identities = 55/252 (21%), Positives = 85/252 (33%), Gaps = 55/252 (21%)

Query: 16 LAIAGQQAMAQSSVTLWGVADVSLRYLSNSNAQNDGRFFM----TNGAITNSRIGLHGSE 71
L +A A + VTL+G + S S A N + T S+IG G E
Sbjct: 8 LTLAALPVAAMADVTLYGTIK-AGVETSRSVAHNGAQAASVETGTGIVDLGSKIGFKGQE 66

Query: 72 DLGGGLKAIFNLESGVNLQDGAFSDSKRIFNRAAYVGLTSQYGTVTLGRQKTVLFDLLSD 131
DLG GLKAI+ +E ++ NR +++GL +G + +GR +VL D
Sbjct: 67 DLGNGLKAIWQVEQKASIAGT----DSGWGNRQSFIGLKGGFGKLRVGRLNSVLKD---- 118

Query: 132 TYDPLTVGNYLENAW---LPVALGAGL-YADN---SVKYRGT-FGGLTIGAMYSFGTDST 183
N W + + SV+Y F GL+ Y+ ++
Sbjct: 119 --------TGDINPWDSKSDYLGVNKIAEPEARLISVRYDSPEFAGLSGSVQYALNDNAG 170

Query: 184 ATGAGGFSGQLPGHMGAG-NMYGFSLSYVAGPVSVAAGVQQSSDNSNRKQTI-------- 234
+ + H G GF + Y G + I
Sbjct: 171 RHNSESY------HAGFNYKNGGFFVQY--------GGAYKRHHQVQENVNIEKYQIHRL 216

Query: 235 ---YHANVVYAF 243
Y + +YA
Sbjct: 217 VSGYDNDALYAS 228


57  E4F39_19745  E4F39_19830  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
E4F39_19745  2       8        2.058013    bile acid:sodium symporter
E4F39_19750  2       8        1.812757    hypothetical protein
E4F39_19755  1       8        2.113422    porin
E4F39_19760  1       9        1.842503    LysR family transcriptional regulator
E4F39_19765  0       7        0.695000    hypothetical protein
E4F39_19775  -1      11       1.244670    nitronate monooxygenase
E4F39_19780  -1      10       1.830586    alpha/beta hydrolase
E4F39_19785  -2      9        0.740358    Lrp/AsnC family transcriptional regulator
E4F39_19790  -1      10       0.922630    DMT family transporter
E4F39_19795  -2      12       3.896951    YbaK/EbsC family protein
E4F39_19800  -1      10       4.305737    hydroxymethylglutaryl-CoA lyase
E4F39_19805  -2      10       3.287931    glyoxylate/hydroxypyruvate reductase A
E4F39_19810  -2      13       2.364791    glyoxalase/bleomycin resistance
E4F39_19815  0       15       4.061341    alpha/beta hydrolase
E4F39_19820  0       16       4.058677    NUDIX domain-containing protein
E4F39_19825  0       16       2.526544    hypothetical protein
E4F39_19830  0       17       3.186934    Lrp/AsnC family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_19880  ECOLNEIPORIN  71  5e-16  E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 71.4 bits (175), Expect = 5e-16
Identities = 80/358 (22%), Positives = 130/358 (36%), Gaps = 48/358 (13%)

Query: 18 VAALAAGASGACAQSSVQLYGQVDEWIGAQKFPGGQRAWGVQGGGMST-----SYWGLRG 72
+ AL A A + V LYG + + + A + S G +G
Sbjct: 5 LIALTLAALPVAAMADVTLYGTIKAGVETSRSVAHNGAQAASVETGTGIVDLGSKIGFKG 64

Query: 73 TEDLGGGYQAIFTLEDFFRAQNGHYGRFDGDTFFGRNAYVGLATPYGTVRAGRLTTQLFI 132
EDLG G +AI+ +E Q D R +++GL +G +R GRL + L
Sbjct: 65 QEDLGNGLKAIWQVE-----QKASIAGTDSGWG-NRQSFIGLKGGFGKLRVGRLNSVLK- 117

Query: 133 STILFNPFIDSYVFSPMVYHVFLGLGTFPTYTTDQGVVGDSGWNNAIDYTSPSFGGLNAA 192
T NP+ + LG + ++ + Y SP F GL+ +
Sbjct: 118 DTGDINPWDSKSDY----------LGVNKIAEPEARLIS-------VRYDSPEFAGLSGS 160

Query: 193 AMYALGNTAGDNRSKKWSGQLNYSSGSFAATAVYQYVNFNGGPGDLGALVSGMKSQGVAQ 252
YAL + AG + S+ + NY +G F Y + V+ K Q + +
Sbjct: 161 VQYALNDNAGRHNSESYHAGFNYKNGGFFVQYGGAYKRH----HQVQENVNIEKYQ-IHR 215

Query: 253 LGLSYDFKLAKIYA-QYMYTKNELNTGSWHVNTAQGGVSVPL----GPGSALASYAYS-- 305
L YD +YA + ++ + + +Q V+ L G + SYA+
Sbjct: 216 LVSGYD--NDALYASVAVQQQDAKLVEENYSHNSQTEVAATLAYRFGNVTPRVSYAHGFK 273

Query: 306 --RDSGGLDQTRRTWALGYDYPLSKRTDVYAAYM---NDRYSGMSGGDTFGAGIRAKF 358
D+ + +G +Y SKRT + + G G+R KF
Sbjct: 274 GSFDATNYNNDYDQVVVGAEYDFSKRTSALVSAGWLQEGKGESKFVSTAGGVGLRHKF 331


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
E4F39_19905  HTHFIS  27  0.042  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 26.7 bits (59), Expect = 0.042
Identities = 8/41 (19%), Positives = 17/41 (41%)

Query: 15 LDAIDRELLRTLADDARQPVSELARRVGLSAPSTADRLRRL 55
L ++ L+ R + A +GL+ + ++R L
Sbjct: 433 LAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIREL 473


58  E4F39_19955  E4F39_20020  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
E4F39_19955  2       14       1.722051    EAL domain-containing protein
E4F39_19960  -1      15       -0.545246   hypothetical protein
E4F39_19965  0       18       -2.463584   hypothetical protein
E4F39_19970  1       20       -3.207818   hypothetical protein
E4F39_19975  2       22       -3.124196   alkaline phosphatase
E4F39_19980  2       22       -3.224126   alkaline phosphatase
E4F39_19985  2       23       -3.831943   hypothetical protein
E4F39_19990  2       23       -3.269486   biotin synthase BioB
E4F39_19995  4       34       -3.387892   dethiobiotin synthase
E4F39_20000  3       31       -4.504702   8-amino-7-oxononanoate synthase
E4F39_20005  4       31       -4.765571   adenosylmethionine--8-amino-7-oxononanoate
E4F39_20010  5       36       -6.385437   hypothetical protein
E4F39_20015  5       34       -5.675942   SDR family oxidoreductase
E4F39_20020  3       33       -5.351273   acetyl-CoA C-acyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_20105  adhesinb      29     0.036    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 29.0 bits (65), Expect = 0.036
Identities = 10/32 (31%), Positives = 13/32 (40%)

Query: 1 MKRVRLAALLGASVFAAAGCGSDEPKTPGASD 32
MK+ R LL + A C S + T S
Sbjct: 1 MKKCRFLVLLLLAFVGLAACSSQKSSTETGSS 32


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_20160  DHBDHDRGNASE  39     5e-06    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 39.3 bits (91), Expect = 5e-06
Identities = 45/191 (23%), Positives = 65/191 (34%), Gaps = 16/191 (8%)

Query: 2 KTVLIVGASRGIGREFVGQYLHDGWRVIATARDAAAL------AALDALGARALALDVAQ 55
K I GA++GIG G + A + L +A A A DV
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 56 PDDIAAFDARLGG--AALDAAVVVSGVYGPRTAGVEPIGVEDFDAVMHTNVLGPMLLMPI 113
I AR+ +D V V+GV R + + E+++A N G
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVL--RPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 114 LLPRVEASRGVLAVLSSKMGSIGEATGTTGW-LYRASKAAVNDALRIASLQARHAA--CI 170
+ + R V +GS T Y +SKAA + L+ C
Sbjct: 127 VSKYMMDRRSGSIVT---VGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCN 183

Query: 171 ALHPGWVRTDM 181
+ PG TDM
Sbjct: 184 IVSPGSTETDM 194
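
The run parameters list an RPSBlast E-value of 0.05, and every virulence-factor hit reported here falls below that value. The sketch below shows how such a cutoff could be applied to summary rows. It is an illustration only: the function name passing, the tuple layout, and the choice of "less than or equal" are assumptions, since the report does not state how PredictBias applies the threshold internally.

```python
E_VALUE_CUTOFF = 0.05  # "E-value (RPSBlast)" input parameter of this run

# (locus tag, hit, E-value) copied from summary rows in this report.
hits = [
    ("E4F39_19905", "HTHFIS", 0.042),
    ("E4F39_20105", "adhesinb", 0.036),
    ("E4F39_20160", "DHBDHDRGNASE", 5e-06),
]


def passing(hits, cutoff=E_VALUE_CUTOFF):
    """Keep hits at or below the cutoff, most significant first."""
    kept = [h for h in hits if h[2] <= cutoff]
    return sorted(kept, key=lambda h: h[2])


for locus, name, evalue in passing(hits):
    print(f"{locus}\t{name}\t{evalue:g}")
```

Lowering the cutoff argument (for example to 0.01) keeps only the stronger matches, which may be useful when the short, low-scoring alignments are of less interest.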


59  E4F39_20220  E4F39_20360  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_20220  2       11       -1.217664  ATP-binding cassette domain-containing protein
E4F39_20225  4       12       -1.934991  MCE family protein
E4F39_20230  -1      13       -1.842670  ABC transporter
E4F39_20235  -1      13       -2.025175  teicoplanin resistance protein VanZ
E4F39_20245  -1      12       -3.149425  (2Fe-2S) ferredoxin domain-containing protein
E4F39_20250  0       12       -1.219302  alpha/beta hydrolase
E4F39_20255  -1      12       -0.464960  hypothetical protein
E4F39_20265  -1      12       1.592644   D-alanyl-D-alanine carboxypeptidase
E4F39_20270  -1      11       2.283014   D-amino acid aminotransferase
E4F39_20275  1       10       3.344418   DUF493 family protein
E4F39_20280  1       11       3.157201   transcriptional regulator GcvA
E4F39_20285  1       12       3.259136   DUF2917 domain-containing protein
E4F39_20290  1       12       2.173870   lipoyl(octanoyl) transferase LipB
E4F39_20295  1       10       1.566669   lipoyl synthase
E4F39_20305  -1      14       1.973799   hypothetical protein
E4F39_20310  -1      13       3.174266   gamma-glutamyltransferase
E4F39_20315  -1      14       4.132238   hypothetical protein
E4F39_20320  -2      14       3.602161   LysR family transcriptional regulator
E4F39_20325  -2      13       3.624160   enoyl-CoA hydratase
E4F39_20335  0       14       2.525710   pimeloyl-CoA dehydrogenase large subunit
E4F39_20340  1       16       2.070520   pimeloyl-CoA dehydrogenase small subunit
E4F39_20345  1       16       1.960301   CoA transferase
E4F39_20350  2       16       2.753902   DUF485 domain-containing protein
E4F39_20355  0       14       2.659636   cation acetate symporter
E4F39_20360  -3      11       3.206605   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_20380  PF06057       29     0.013    Type IV secretory pathway VirJ component
		>PF06057#Type IV secretory pathway VirJ component

Length = 243

Score = 29.0 bits (65), Expect = 0.013
Identities = 20/129 (15%), Positives = 42/129 (32%), Gaps = 28/129 (21%)

Query: 93 DDLLAVLAHMRALPGHADLPLVLAGFSFGTFVLSHVGKRLRDAGQAIERMVF---VGTAA 149
D LA++ +A G ++L G+SFG V+ V + + ++
Sbjct: 101 QDTLAIIDKYQAEFGT--QKVILIGYSFGAEVIPFVLNEMPARYRKNVLGAVLLSPSQSS 158

Query: 150 ------SRW--------------QVAAVPEDTIV-IHGENDDTVPIASVYDWARPQELPV 188
S +V ++ ++G+ DD + + + V
Sbjct: 159 DFEIHVSEMVTSDNQSARYLTLPEVNKQTTVPMLCLYGKEDDAP--LHLCPEVKQPNVTV 216

Query: 189 IVIPGAEHF 197
+ + G F
Sbjct: 217 MELSGGHSF 225


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_20410  FbpA_PF05833  28     0.007    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 27.9 bits (62), Expect = 0.007
Identities = 14/67 (20%), Positives = 25/67 (37%), Gaps = 5/67 (7%)

Query: 32 GAPVWATRSNDVHDYFLSPGATLKLRRGERLWLSADGATSACVSFSAIAPPQQAALRGVA 91
G ++ ++N +DY TLK +W + V I ++ L A
Sbjct: 467 GIDIYVGKNNIQNDYL-----TLKFANKHDIWFHTKNIPGSHVIVKNIMDIPESTLLEAA 521

Query: 92 RFASWLS 98
A++ S
Sbjct: 522 NLAAYYS 528


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_20440  HTHFIS        31     0.006    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.0 bits (70), Expect = 0.006
Identities = 13/47 (27%), Positives = 24/47 (51%), Gaps = 6/47 (12%)

Query: 7 TLLVDILDA--GNLSKAAQRLKMSRANVSYRLNQLEKSIGLQLVRRT 51
L++ L A GN KAA L ++R + ++ +L G+ + R +
Sbjct: 439 PLILAALTATRGNQIKAADLLGLNRNTLRKKIREL----GVSVYRSS 481


60  E4F39_01655  E4F39_01690  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_01655  -3      10       1.082243   DegQ family serine endoprotease
E4F39_01660  -3      10       0.396026   carboxypeptidase regulatory-like
E4F39_01665  -2      13       0.946863   DUF427 domain-containing protein
E4F39_01670  1       13       -1.094920  carbon monoxide dehydrogenase
E4F39_01675  1       15       -0.106920  TetR family transcriptional regulator
E4F39_01680  0       12       -0.567995  chemotaxis protein
E4F39_01685  0       11       -1.016489  efflux RND transporter periplasmic adaptor
E4F39_01690  -2      11       0.511739   efflux RND transporter permease subunit
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_01660  V8PROTEASE    79     4e-18    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 78.5 bits (193), Expect = 4e-18
Identities = 38/207 (18%), Positives = 71/207 (34%), Gaps = 40/207 (19%)

Query: 81 QRRAAPQLPIDPDDP-----FYQFFRHFYGQIPGMGGGRQPQPDDQPSTSLGSGFIISAD 135
++R + + +D I Q + T + SG ++
Sbjct: 62 EQREHANVILPNNDRHQITDTTNGHYAPVTYI---------QVEAPTGTFIASGVVV-GK 111

Query: 136 GYILTNAHVIDGANVVTVKLTDKR-----------EYKA-KVVGADKQSDVAVLKIDA-- 181
+LTN HV+D + L + A ++ + D+A++K
Sbjct: 112 DTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVKFSPNE 171

Query: 182 ------SGLPIVKIGDPAQSKVGQWVVAIGSPYGFDNTVTSGIISAKSRALPDENYTPFI 235
+ + + A+++V Q + G P +K + + +
Sbjct: 172 QNKHIGEVVKPATMSNNAETQVNQNITVTGYPGDKPVATMW---ESKGKITYLKGE--AM 226

Query: 236 QTDVPVNPGNSGGPLFNLNGEVIGINS 262
Q D+ GNSG P+FN EVIGI+
Sbjct: 227 QYDLSTTGGNSGSPVFNEKNEVIGIHW 253


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_01680  HTHTETR       126    2e-38    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 126 bits (317), Expect = 2e-38
Identities = 81/209 (38%), Positives = 115/209 (55%), Gaps = 1/209 (0%)

Query: 1 MARRTKEEALATRDRILDAAEHVFFEKGVSHTSLADIAQHAGVTRGAIYWHFASKSELFD 60
MAR+TK+EA TR ILD A +F ++GVS TSL +IA+ AGVTRGAIYWHF KS+LF
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 AMFDRVLLPIDELKAGT-GEPHADPLGRIREILIWCLLGAARDPQLRRVFSILFMKCEYV 119
+++ I EL+ + DPL +REILI L + + R + I+F KCE+V
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 120 ADMGPLLQRNREGMRDALRNIEADLAQGVANGQLPADLDTWRATLMLHTLVSGFVRDMLM 179
+M + Q R ++ IE L + LPADL T RA +++ +SG + + L
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 180 LPGEIDAERHAEKLVDGCFDMLRTSPAMR 208
P D ++ A V +M P +R
Sbjct: 181 APQSFDLKKEARDYVAILLEMYLLCPTLR 209


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_01690  RTXTOXIND     42     4e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 41.7 bits (98), Expect = 4e-06
Identities = 42/266 (15%), Positives = 80/266 (30%), Gaps = 75/266 (28%)

Query: 92 KIDPAPYIAQLNSAKATLAKAQANLATQNALVARYKVLVAANAVSKQQYDDAVAAQGQAA 151
+++ A+ + A + + + + + + + L+ A++K + +A
Sbjct: 206 ELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAV 265

Query: 152 ADVGAGKAAV-------------------------------------------ETAQINL 168
++ K+ + +
Sbjct: 266 NELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQ 325

Query: 169 GYTDVVSPITGRV-GISQVTPGAYVQASQATLMSTVQQLDPVYVDLTQSSLDGLKLRQDI 227
+ + +P++ +V + T G V ++ TLM V + D + V + D +
Sbjct: 326 QASVIRAPVSVKVQQLKVHTEGGVVTTAE-TLMVIVPEDDTLEVTALVQNKDIGFINVG- 383

Query: 228 QSGRIK-------TEGPGAAKVTLILEDGKPYPERGKLQFSDVTVDQTTGSVT--IRAI- 277
Q+ IK G KV I D DQ G V I +I
Sbjct: 384 QNAIIKVEAFPYTRYGYLVGKVKNI--------------NLDAIEDQRLGLVFNVIISIE 429

Query: 278 -----FPNKQRVLLPGMFVRARIEEG 298
NK L GM V A I+ G
Sbjct: 430 ENCLSTGNKNIPLSSGMAVTAEIKTG 455



Score = 31.7 bits (72), Expect = 0.006
Identities = 21/122 (17%), Positives = 36/122 (29%), Gaps = 20/122 (16%)

Query: 1 MRVERVPYRLITVATAAVFLAACGKKESAPPPQTPEVGVVTVQPQSVPVVSELPGRTSAY 60
R V Y ++ A L+ G+ E G +T +S +
Sbjct: 55 RRPRLVAYFIMGFLVIAFILSVLGQVEIVATAN----GKLTHSGRSKEIKPIENSIVKEI 110

Query: 61 LVAQVRARVDGIVLRREFTEGSDVKAGQRLYKIDPAPYIAQLNSAKATLAKAQANLATQN 120
+V EG V+ G L K+ A +++L +A+
Sbjct: 111 IVK----------------EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQ 154

Query: 121 AL 122
L
Sbjct: 155 IL 156
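
When one hit yields several alignments (HSPs), as with the two RTXTOXIND alignments for E4F39_01690, the summary row shows a single score and E-value. In this report those values (42 and 4e-06) are consistent with taking the highest-scoring HSP and rounding its bit score. The sketch below illustrates that reading under this assumption; the helper name best_hsp is hypothetical, and the selection rule is inferred from the numbers rather than documented by the tool.

```python
# (bit score, E-value) for the two HSPs reported for E4F39_01690 vs RTXTOXIND.
hsps = [(41.7, 4e-06), (31.7, 0.006)]


def best_hsp(hsps):
    """Return the HSP with the highest bit score."""
    return max(hsps, key=lambda h: h[0])


bits, evalue = best_hsp(hsps)
# Rounded bit score and E-value, for comparison with the summary row.
print(round(bits), evalue)
```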


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_01695  ACRIFLAVINRP  1270   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1270 bits (3288), Expect = 0.0
Identities = 673/1035 (65%), Positives = 822/1035 (79%), Gaps = 2/1035 (0%)

Query: 1 MAKFFIDRPIFAWVIAIILMLAGVAAIFTLPIAQYPTIAPPSIQITANYPGASAKTVEDT 60
MA FFI RPIFAWV+AIILM+AG AI LP+AQYPTIAPP++ ++ANYPGA A+TV+DT
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTQVIEQQMSGLDNFLYMSSTSDDSGNATITITFAPGTNPDIAQVQVQNKLSLATPILPQ 120
VTQVIEQ M+G+DN +YMSSTSD +G+ TIT+TF GT+PDIAQVQVQNKL LATP+LPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 VVQQLGLSVTKSSSSFLLVLAFNSEDGSMNKYDLANYVASHVKDPISRINGVGTVTLFGS 180
VQQ G+SV KSSSS+L+V F S++ + D+++YVAS+VKD +SR+NGVG V LFG+
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 QYAMRIWLDPTKLTNYGLTPVDVTSAISAQNVQIAGGQLGGTPAVPGTVLQATITEATLL 240
QYAMRIWLD L Y LTPVDV + + QN QIA GQLGGTPA+PG L A+I T
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 QTPEQFGNILLKVNQDGSQVRLKDVAQIGLGGETYNFDTKYNGQPTAALGIQLATNANAL 300
+ PE+FG + L+VN DGS VRLKDVA++ LGGE YN + NG+P A LGI+LAT ANAL
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 ATAKAVRAKIDEMSAYFPHGLVVKYPYDTTPFVRLSIEEVVKTLLEGIVLVFLVMYLFLQ 360
TAKA++AK+ E+ +FP G+ V YPYDTTPFV+LSI EVVKTL E I+LVFLVMYLFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NLRATIIPTIAVPVVLLGTFAIMSMVGFSINVLSMFGLVLAIGLLVDDAIVVVENVERVM 420
N+RAT+IPTIAVPVVLLGTFAI++ G+SIN L+MFG+VLAIGLLVDDAIVVVENVERVM
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 AEEGLPPKEATRKAMGQITGALVGVALVLSAVFVPVAFSGGSVGAIYRQFSLTIVSAMVL 480
E+ LPPKEAT K+M QI GALVG+A+VLSAVF+P+AF GGS GAIYRQFS+TIVSAM L
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SVLVALILTPALCATILKPIPQGHHEEKKGFFGWFNRTFNSSRDKYHVGVHHVIKRSGRW 540
SVLVALILTPALCAT+LKP+ HHE K GFFGWFN TF+ S + Y V ++ +GR+
Sbjct: 481 SVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY 540

Query: 541 LIIYLAVIVAVGLLFVRLPKSFLPDEDQGLMFVIVQTPSGSTQETTARTLANISDYLLTQ 600
L+IY ++ + +LF+RLP SFLP+EDQG+ ++Q P+G+TQE T + L ++DY L
Sbjct: 541 LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN 600

Query: 601 EKDIVESAFTVNGFSFAGRGQNSGLVFVKLKDYSQRQSSDQKVQALIGRMFGRYAGYKDA 660
EK VES FTVNGFSF+G+ QN+G+ FV LK + +R + +A+I R +D
Sbjct: 601 EKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDG 660

Query: 661 LVIPFNPPSIPELGTAAGFDFELTDNAGLGHDALMAARNQLLGMAAKD-STLRGVRPNGL 719
VIPFN P+I ELGTA GFDFEL D AGLGHDAL ARNQLLGMAA+ ++L VRPNGL
Sbjct: 661 FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGL 720

Query: 720 NDTPQYKVDIDREKANALGVTADAIDQTFSIAWASKYVNNFLDTDGRIKKVYVQSDAPFR 779
DT Q+K+++D+EKA ALGV+ I+QT S A YVN+F+D GR+KK+YVQ+DA FR
Sbjct: 721 EDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFID-RGRVKKLYVQADAKFR 779

Query: 780 MTPEDMNIWYVRNGSGGMVPFSAFATGHWTYGSPKLERYNGISAMEIQGQAAPGKSTGQA 839
M PED++ YVR+ +G MVPFSAF T HW YGSP+LERYNG+ +MEIQG+AAPG S+G A
Sbjct: 780 MLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDA 839

Query: 840 MTAMETLAKKLPTGIGYSWTGLSFQEIQSGSQAPILYAISILVVFLCLAALYESWSIPFS 899
M ME LA KLP GIGY WTG+S+QE SG+QAP L AIS +VVFLCLAALYESWSIP S
Sbjct: 840 MALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVS 899

Query: 900 VIMVVPLGVIGALLAATLRGLENDVFFQVGLLTTVGLSAKNAILIVEFARELQQTEKMGP 959
V++VVPLG++G LLAATL +NDV+F VGLLTT+GLSAKNAILIVEFA++L + E G
Sbjct: 900 VMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGV 959

Query: 960 IEAALEAARLRLRPILMTSLAFILGVMPLAISNGAGSASQHAIGTGVIGGMITATFLAIF 1019
+EA L A R+RLRPILMTSLAFILGV+PLAISNGAGS +Q+A+G GV+GGM++AT LAIF
Sbjct: 960 VEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIF 1019

Query: 1020 MIPMFFVKVRAVFSG 1034
+P+FFV +R F G
Sbjct: 1020 FVPVFFVVIRRCFKG 1034


61  E4F39_01785  E4F39_01815  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_01785  0       11       3.180584   sn-glycerol-3-phosphate ABC transporter
E4F39_01790  0       12       3.803566   hypothetical protein
E4F39_01795  0       12       4.510914   haloacid dehalogenase-like hydrolase
E4F39_01800  -1      14       4.457561   LysR family transcriptional regulator
E4F39_01805  0       14       5.341761   serine hydrolase
E4F39_01815  0       15       5.384680   MFS transporter
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_01790  PF05272       30     0.017    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.017
Identities = 14/35 (40%), Positives = 17/35 (48%)

Query: 32 VVFVGPSGCGKSTLMRMIAGLEEISGGELLIDGAK 66
VV G G GKSTL+ + GL+ S I K
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGK 633


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_01800  PF06776       30     0.020    Invasion associated locus B
		>PF06776#Invasion associated locus B

Length = 214

Score = 29.5 bits (66), Expect = 0.020
Identities = 11/49 (22%), Positives = 15/49 (30%), Gaps = 2/49 (4%)

Query: 1 MKTGRRHFVRSVASASAALAAAAWSPARAAIDAPASPATALSLTPGRWS 49
+ + RR R+ A A A A A A+ G W
Sbjct: 38 LASCRRLARRNGARLMLAGAMAI--ALSFGWSDRADAQGAVRSVHGDWQ 84



Score = 28.7 bits (64), Expect = 0.040
Identities = 7/37 (18%), Positives = 13/37 (35%)

Query: 10 RSVASASAALAAAAWSPARAAIDAPASPATALSLTPG 46
+++ A L+ S R A A A ++
Sbjct: 25 KAIQMGPAELSPMLASCRRLARRNGARLMLAGAMAIA 61


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_01810  BLACTAMASEA   30     0.019    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 29.8 bits (67), Expect = 0.019
Identities = 11/35 (31%), Positives = 15/35 (42%)

Query: 57 REDALFRFASVSKPIVSAAAMRAVAAGKLDLDASI 91
R D F S K ++ A + V AG L+ I
Sbjct: 57 RADERFPMMSTFKVVLCGAVLARVDAGDEQLERKI 91


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_01815  TCRTETB       35     4e-04    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 35.2 bits (81), Expect = 4e-04
Identities = 31/155 (20%), Positives = 59/155 (38%), Gaps = 5/155 (3%)

Query: 26 LLALATAGFITIVTEALPAGLLPLMGRDLRVSDALVGQLVTVYAAGSIVAAIPLVAATRG 85
L+ L F +++ E + LP + D A + T + + +
Sbjct: 16 LIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQ 75

Query: 86 MRRRPLLLAALAGFVVANTATAASPYYAPVLV-ARCVAGVSAGLLWALLAGYASRMVDAR 144
+ + LLL + + + +L+ AR + G A AL+ +R +
Sbjct: 76 LGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKE 135

Query: 145 QRGRAIAIAMLGAPVAMSVGI-PL-GTALGAALGW 177
RG+A ++G+ VAM G+ P G + + W
Sbjct: 136 NRGKAFG--LIGSIVAMGEGVGPAIGGMIAHYIHW 168


62  E4F39_02810  E4F39_02840  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_02810  0       11       2.028698   fused uroporphyrinogen-III synthase
E4F39_02815  -1      9        2.067149   heme biosynthesis protein HemY
E4F39_02820  -2      9        2.523664   MFS transporter
E4F39_02825  -2      11       1.654116   SDR family oxidoreductase
E4F39_02830  -2      11       1.685396   aldehyde dehydrogenase family protein
E4F39_02835  -1      12       1.743762   inorganic diphosphatase
E4F39_02840  1       12       2.637081   GIY-YIG nuclease family protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_02825  RTXTOXIND     31     0.021    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.6 bits (69), Expect = 0.021
Identities = 16/112 (14%), Positives = 37/112 (33%), Gaps = 8/112 (7%)

Query: 359 ATERQKALDAQTAELRTKTEQALASVRQADSQLSQLEG--KLAD----AQTAQTALQQQY 412
+++ LD + AE T + + + S+L+ L A+ A + +Y
Sbjct: 202 KYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKY 261

Query: 413 QDLSRNRDAWM--IEEVGQMLSSASQQLLLTGNTQLALIALQNADARLASSQ 462
+ + +E++ + SA ++ L I +
Sbjct: 262 VEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGL 313


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_02835  TCRTETB       29     0.041    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 29.1 bits (65), Expect = 0.041
Identities = 30/140 (21%), Positives = 55/140 (39%), Gaps = 12/140 (8%)

Query: 73 VGYFLFEVPSNVILHKVGARVWIARIMVTWGIIS---ALTMFVSTPAMFYTM--RFLLGV 127
+ L + K+ ++ I R+++ II+ ++ FV + RF+ G
Sbjct: 56 TAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGA 115

Query: 128 AEAGFFPGVILYLTYWYPAHRRGRMTTLFMTAVALSGVVGGPISGYILKTFDGMNGWRGW 187
A F V++ + + P RG+ L + VA+ VG I G I W
Sbjct: 116 GAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYI-------HW 168

Query: 188 QWLFLLEGVPSVLVGILLLF 207
+L L+ + + V L+
Sbjct: 169 SYLLLIPMITIITVPFLMKL 188


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_02840  DHBDHDRGNASE  115    3e-33    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 115 bits (288), Expect = 3e-33
Identities = 73/252 (28%), Positives = 115/252 (45%), Gaps = 6/252 (2%)

Query: 3 LSGKTAVVTGAGSGFGEGIAKTFAREGACVVVNDLHAAAAERVASEIALAGGRALAIAGD 62
+ GK A +TGA G GE +A+T A +GA + D + E+V S + A A D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 63 VSRGEDWQALRDAALAAFGSVQVVVNNAGTTHRNKPVLDITEAEYDRVYAINMKSLFWSV 122
V + G + ++VN AG + +++ E++ +++N +F +
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPG-LIHSLSDEEWEATFSVNSTGVFNAS 124

Query: 123 QTFVPYFRGAGGGAFVNIASTAAVRPRPGLVWYNSTKGAMLTASKTLAAELGADRIRVNC 182
++ Y G+ V + S A PR + Y S+K A + +K L EL IR N
Sbjct: 125 RSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNI 184

Query: 183 VNPVLGETGLTSEFMGVPDTSENR-----ARFVATIPLGRLSTPQDIANAALYLASDEAA 237
V+P ET + + +E F IPL +L+ P DIA+A L+L S +A
Sbjct: 185 VSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAG 244

Query: 238 FVTGACLEVDGG 249
+T L VDGG
Sbjct: 245 HITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_02855  IGASERPTASE   32     0.002    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 32.3 bits (73), Expect = 0.002
Identities = 29/211 (13%), Positives = 57/211 (27%), Gaps = 18/211 (8%)

Query: 106 AQVAAAAAAQREGAPVTGRVKRVDAERASGTPTPRSPRRATRATAEALDAGTPADEGEAT 165
A P E P P +P T AE + + T
Sbjct: 997 ITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAEN------SKQESKT 1050

Query: 166 QAPKERKTTEAAKTSKTVRLSKAPKPSKPSKPSKPSKPSKPSKPSKPSKPSKPSKPSKPS 225
E+ TE ++ V ++ ++ ++ +K ++ ++ + + K
Sbjct: 1051 VEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEE 1110

Query: 226 K-----------PSKPSKAS-KASKPSKPSKPSTPPKPPKPSTQIAATAASRRTKTVRAP 273
K P S+ S K + + P + P+ I + T
Sbjct: 1111 KAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQ 1170

Query: 274 AANTTARANASPNAGTAVSAAPPRGARAKQN 304
A T+ P + +N
Sbjct: 1171 PAKETSSNVEQPVTESTTVNTGNSVVENPEN 1201


63  E4F39_04305  E4F39_04355  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_04305  -1      9        1.041800   SDR family oxidoreductase
E4F39_04310  0       12       0.581454   hypothetical protein
E4F39_04315  2       15       -0.007630  hypothetical protein
E4F39_04320  2       15       0.453123   transport system membrane protein
E4F39_04325  2       14       0.589365   MdtB/MuxB family multidrug efflux RND
E4F39_04330  2       13       0.550299   MdtA/MuxA family multidrug efflux RND
E4F39_04335  1       16       0.135772   IclR family transcriptional regulator
E4F39_04340  0       11       -0.109662  DUF839 domain-containing protein
E4F39_04345  1       12       1.145070   hypothetical protein
E4F39_04350  1       12       1.242164   hypothetical protein
E4F39_04355  0       12       1.189147   NAD-dependent epimerase/dehydratase family
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_04320  PF01540       29     0.015    Adhesin lipoprotein
		>PF01540#Adhesin lipoprotein

Length = 475

Score = 28.9 bits (64), Expect = 0.015
Identities = 26/84 (30%), Positives = 38/84 (45%), Gaps = 3/84 (3%)

Query: 11 RTGRALADLLLKQQDFEVTALVRRPDFA--LPGAKVVVADLTGDFSSAFN-GITHAIYAA 67
+ G+ AD LKQ + L + PD++ L +A+ T F A + G AI +
Sbjct: 35 KNGKEKADAALKQANALAEELKKNPDYSKILETLNKEIAEATKSFKEAGSYGDYPAIISK 94

Query: 68 GSAESEGATEEEQIDRDAVARAAD 91
SA E A E+Q A + AD
Sbjct: 95 LSAAVENAKSEQQKVDQANKKIAD 118


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_04335  ACRIFLAVINRP  745    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 745 bits (1925), Expect = 0.0
Identities = 279/1104 (25%), Positives = 502/1104 (45%), Gaps = 100/1104 (9%)

Query: 3 LARPFITRPVATTLLALGIALAGLFAFVKLPVSPLPQVDFPTILVQASLPGASPETVATS 62
+A FI RP+ +LA+ + +AG A ++LPV+ P + P + V A+ PGA +TV +
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 63 VTSPLERHLGSIADVAEMTSMS-SVGNARIVLQFNLNRDIDGAARDVQAAINAARADLPA 121
VT +E+++ I ++ M+S S S G+ I L F D D A VQ + A LP
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 122 SLKSNPTYRKVNPADSPIMVVSLTS--KTASPAKLYDAASTVLQQSLSQIDGIGQVSLSG 179
++ + S +MV S + + D ++ ++ +LS+++G+G V L G
Sbjct: 121 EVQ-QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG 179

Query: 180 SANPAVRVELEPQALFHYGIGLEDVRAALASANANSPKGAIEAGP------HRYQLYTND 233
+ A+R+ L+ L Y + DV L N G + P +
Sbjct: 180 AQY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQT 238

Query: 234 QATKAAQYKDLVI-AYRNHAAVSLSDVSSVVDSVEDLRNLGLMNGERAVLVILYRSPGAN 292
+ ++ + + + + V L DV+ V E+ + +NG+ A + + + GAN
Sbjct: 239 RFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGAN 298

Query: 293 IIDTIERVKAALPQLTAALPADIQVTPVLDRSRTIRASLADTEHTLIIAVSLVVMVVFLF 352
+DT + +KA L +L P ++V D + ++ S+ + TL A+ LV +V++LF
Sbjct: 299 ALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLF 358

Query: 353 LRNWRATLIPSVAVPISIVGTFGAMYLLGFSLNNLSLMALIVATGFVVDDAIVVLENIAR 412
L+N RATLIP++AVP+ ++GTF + G+S+N L++ +++A G +VDDAIVV+EN+ R
Sbjct: 359 LQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVER 418

Query: 413 HI-ENGTPRLQAAFDGAREVGFTVLSISLSLVAVFLPILLMGGIVGRLFREFALTLSLAI 471
+ E+ P +A ++ ++ I++ L AVF+P+ GG G ++R+F++T+ A+
Sbjct: 419 VMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAM 478

Query: 472 AVSLVVSLTLTPMMCARLLPEAHAPRDE--GRVARWLERGFEWMQRGYERTLSWALRHPF 529
A+S++V+L LTP +CA LL A E G W F+ Y ++ L
Sbjct: 479 ALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTG 538

Query: 530 TILMTLVATIALNIALYIVVPKGFFPQQDTGLMIGGIQADQTTSFQAMKLRFTEMMRIIR 589
L+ +A + L++ +P F P++D G+ + IQ + + + ++
Sbjct: 539 RYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYL 598

Query: 590 ANP-----NVANVAGFT-GGAQTNSGFMFVALKDKPQR---KLSADQVIQQLRPQLAEVA 640
N +V V GF+ G N+G FV+LK +R + SA+ VI + + +L ++
Sbjct: 599 KNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIR 658

Query: 641 GARTFLQAAQDIRAGGRQSNAQYQFT-LLGDSTAELYKWGP-ILTEALQKRPELADVNSD 698
I G + ++ G L + +L A Q L V +
Sbjct: 659 DGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 699 QQQGGLEAMVTIDRATAARLGIKPAQIDNTLYDAFGQRQVSTIYNPLNQYHVVMEVAPQY 758
+ + + +D+ A LG+ + I+ T+ A G V+ + + ++ ++
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 759 WQSPEMLKQIYISTSGGSASGVQTTNAAAGTYVATTARASTAGAAAQSAAAIAADSARNQ 818
PE + ++Y+ ++ G V + +V + R R
Sbjct: 779 RMLPEDVDKLYVRSANGEM--VPFSAFTTSHWVYGSPRLE-----------------RYN 819

Query: 819 ALNSIASSG--KSSASSGAAVSTSKSTMVPLSAIASFGPSTTPLAVNHQGLFVATTISFN 876
L S+ G SSG A++ ++
Sbjct: 820 GLPSMEIQGEAAPGTSSGDAMALMENLAS------------------------------K 849

Query: 877 LPPGVSLSKATQVIYQTMAEVGVPPTIQGSFQGTAQAFQESLKDQPILILAALAAVYIVL 936
LP G+ G + P L+ + V++ L
Sbjct: 850 LPAGIGY------------------DWTGMSYQERLSG----NQAPALVAISFVVVFLCL 887

Query: 937 GILYESYIHPVTILSTLPSAGVGALLGLLLFKTEFSIIALIGVILLIGIVKKNAIMMVDF 996
LYES+ PV+++ +P VG LL LF + + ++G++ IG+ KNAI++V+F
Sbjct: 888 AALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEF 947

Query: 997 AIDA-SRQGKSSFDAIHEACLLRFRPIMMTTMAALLGALPLAFGRGDGAEMRAPLGIAIA 1055
A D ++GK +A A +R RPI+MT++A +LG LPLA G G+ + +GI +
Sbjct: 948 AKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVM 1007

Query: 1056 GGLIVSQMLTLYTTPVVYLYMDRL 1079
GG++ + +L ++ PV ++ + R
Sbjct: 1008 GGMVSATLLAIFFVPVFFVVIRRC 1031



Score = 96.1 bits (239), Expect = 4e-22
Identities = 83/503 (16%), Positives = 167/503 (33%), Gaps = 25/503 (4%)

Query: 2 NLARPFITRPVATTLLALGIALAGLFAFVKLPVSPLPQVDFPTILVQASLP-GASPETVA 60
N + L+ I + F++LP S LP+ D L LP GA+ E
Sbjct: 528 NSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQ 587

Query: 61 TSVTSPLERHLGSIAD----VAEMTSMSSVGNAR----IVLQFNLNRDIDGAARDVQAAI 112
+ + +L + V + S G A+ + + +G +A I
Sbjct: 588 KVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVI 647

Query: 113 NAARADLPASLKSNPTYRKVNPADSPIMVVSLTSKT-----ASPAKLYDAASTVLQQSLS 167
+ A+ +L + + L A + +L +
Sbjct: 648 HRAKMELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQ 707

Query: 168 QIDGIGQVSLSGSAN-PAVRVELEPQALFHYGIGLEDVRAALASANANSPKGAIEAGPHR 226
+ V +G + ++E++ + G+ L D+ +++A +
Sbjct: 708 HPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRV 767

Query: 227 YQLYT---NDQATKAAQYKDLVIAYRNHAAVSLSDVSSVVDSVEDLRNLGLMNGERAVLV 283
+LY L + N V S ++ V L NG ++ +
Sbjct: 768 KKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHW-VYGSPRLERYNGLPSMEI 826

Query: 284 ILYRSPGANIIDTIERVKAALPQLTAALPADIQVTPVLDRSRTIRASLADTEHTLIIAVS 343
+PG + D A + L + LPA I S R S + I+
Sbjct: 827 QGEAAPGTSSGD----AMALMENLASKLPAGIGYD-WTGMSYQERLSGNQAPALVAISFV 881

Query: 344 LVVMVVFLFLRNWRATLIPSVAVPISIVGTFGAMYLLGFSLNNLSLMALIVATGFVVDDA 403
+V + + +W + + VP+ IVG A L + ++ L+ G +A
Sbjct: 882 VVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNA 941

Query: 404 IVVLENI-ARHIENGTPRLQAAFDGAREVGFTVLSISLSLVAVFLPILLMGGIVGRLFRE 462
I+++E + G ++A R +L SL+ + LP+ + G
Sbjct: 942 ILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNA 1001

Query: 463 FALTLSLAIAVSLVVSLTLTPMM 485
+ + + + ++++ P+
Sbjct: 1002 VGIGVMGGMVSATLLAIFFVPVF 1024



Score = 59.9 bits (145), Expect = 5e-11
Identities = 37/225 (16%), Positives = 84/225 (37%), Gaps = 4/225 (1%)

Query: 870 ATTISFNLPPGVSLSKATQVIYQTMAEV--GVPPTIQGS-FQGTAQAFQESLKDQPILIL 926
A + L G + + I +AE+ P ++ T Q S+ + +
Sbjct: 286 AAGLGIKLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLF 345

Query: 927 AALAAVYIVLGILYESYIHPVTILSTLPSAGVGALLGLLLFKTEFSIIALIGVILLIGIV 986
A+ V++V+ + ++ + +P +G L F + + + G++L IG++
Sbjct: 346 EAIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLL 405

Query: 987 KKNAIMMVDFAIDASRQGKSSF-DAIHEACLLRFRPIMMTTMAALLGALPLAFGRGDGAE 1045
+AI++V+ + K +A ++ ++ M +P+AF G
Sbjct: 406 VDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGA 465

Query: 1046 MRAPLGIAIAGGLIVSQMLTLYTTPVVYLYMDRLRVWAEKRRDRR 1090
+ I I + +S ++ L TP + + +
Sbjct: 466 IYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENKGG 510


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_04340  ACRIFLAVINRP  802    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 802 bits (2073), Expect = 0.0
Identities = 284/1035 (27%), Positives = 498/1035 (48%), Gaps = 31/1035 (2%)

Query: 4 SRVFILRPVGTALLMAAIMLAGLVALRFLPLAALPEVDYPTIQVQTFYPGASPEVMTSSV 63
+ FI RP+ +L +M+AG +A+ LP+A P + P + V YPGA + + +V
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 64 TAPLERQFGQMPSLNQMSSQS-SAGASVITLQFSLDLPLDIAEQEVQAAINAAGNLLPSD 122
T +E+ + +L MSS S SAG+ ITL F DIA+ +VQ + A LLP +
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE 121

Query: 123 LPAPPIYAKVNPADAPVITLAVTSKTLPLTQ--VQDLADTRLAMKISQVSGVGLVSLSGG 180
+ I + + ++ S TQ + D + + +S+++GVG V L G
Sbjct: 122 VQQQGIS-VEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 NRPAVRIQANPLALASYGLNLDDLRTTISNLNVNTPKGNFDGP------TRAYTINANDQ 234
A+RI + L Y L D+ + N G G +I A +
Sbjct: 181 QY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 235 LTSADQYNDAVV-AYKNGRPVMLTDVAKIVAGSENTKLGAWVDAEPAIILNVQRQPGANV 293
+ +++ + +G V L DVA++ G EN + A ++ +PA L ++ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 294 IQTVDNVKAILPKLQESLPAALDVQIVTDRTTMIRAAVRDVQFELGLAVALVVLVMYLFL 353
+ T +KA L +LQ P + V D T ++ ++ +V L A+ LV LVMYLFL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 354 ANVYATIIPSLSVPLSLIGTLAVMYLSGFSLNNLSLMALTIATGFVVDDAIVMIENIARY 413
N+ AT+IP+++VP+ L+GT A++ G+S+N L++ + +A G +VDDAIV++EN+ R
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 414 -VEEGDSALEAALKGSKQIGFTIISLTVSLIAVLIPLLFMGDVVGRLFHEFAITLAVTIV 472
+E+ EA K QI ++ + + L AV IP+ F G G ++ +F+IT+ +
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 473 ISAVVSLTLVPMMCAKLLRHTPPPESHRFEAKVHGLIERV----IERYGVALQWVLDRQR 528
+S +V+L L P +CA LL+ E H + G + Y ++ +L
Sbjct: 480 LSVLVALILTPALCATLLK-PVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTG 538

Query: 529 ATLVVAVLTLALTALLYAVIPKGFFPTQDTGVIQAITQAPQSVSYGAMAERQQALAAEIL 588
L++ L +A +L+ +P F P +D GV + Q P + + + L
Sbjct: 539 RYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYL 598

Query: 589 KH--PDVVSLTSFIGVDGANITLNSGRMLINLKPRDERS---ESAGDVIRSLQRQVANVT 643
K+ +V S+ + G + N+G ++LKP +ER+ SA VI + ++ +
Sbjct: 599 KNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIR 658

Query: 644 GISLYMQPVQDLTIDSTVSPTQYQFMLTS---PNPDEFATWVPKLVDRLRKEPS-LADVA 699
+ P I + T + F L D +L+ + P+ L V
Sbjct: 659 DGFVI--PFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVR 716

Query: 700 TDLQNSGKSVYIEIDRTSAARFGITPATVDNALYDAYGQRIVSTIFTQSNQYRVILESEP 759
+ +E+D+ A G++ + ++ + A G V+ + ++ ++++
Sbjct: 717 PNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADA 776

Query: 760 QMQHYTDSLNGIYLPSAGGGQVPLSAIATFRERPAPLLVSHLSQFPAATISFNLAPGASL 819
+ + + ++ +Y+ SA G VP SA T + + P+ I APG S
Sbjct: 777 KFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSS 836

Query: 820 GEAVKAIDAAERELGLPASFQTRFQGAALAFQASLSNQLFLILAAIVTMYIVLGVLYESY 879
G+A+ ++ +L PA + G + + S + L+ + V +++ L LYES+
Sbjct: 837 GDAMALMENLASKL--PAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESW 894

Query: 880 IHPITILSTLPSAGVGALLALMITGHDLDIIGIIGIVLLIGIVKKNAIMMIDFALEAERV 939
P++++ +P VG LLA + D+ ++G++ IG+ KNAI++++FA +
Sbjct: 895 SIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEK 954

Query: 940 EGKPPREAIYQACLLRFRPILMTTLAALLGAVPLIVGSGAGSELRQPLGIAIAGGLIVSQ 999
EGK EA A +R RPILMT+LA +LG +PL + +GAGS + +GI + GG++ +
Sbjct: 955 EGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSAT 1014

Query: 1000 VLTLFTTPVIYLGFD 1014
+L +F PV ++
Sbjct: 1015 LLAIFFVPVFFVVIR 1029


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_04345  RTXTOXIND     48     4e-08    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 48.3 bits (115), Expect = 4e-08
Identities = 27/149 (18%), Positives = 57/149 (38%), Gaps = 16/149 (10%)

Query: 84 AARGEMPVVLNALGTVTPLANV-TVRTQLSGYLQAVSFQEGQIVKKGDVLAQIDPRP--- 139
+ G++ +V A G +T ++ + ++ + +EG+ V+KGDVL ++
Sbjct: 75 SVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA 134

Query: 140 ----YQISLANAQGALARDEALLATARLDLKRYQTLVAQ---DSIAKQTADTQASLVKQY 192
Q SL A+ R + L + L+ L + +++++ SL+K+
Sbjct: 135 DTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKE- 193

Query: 193 EGTVQIDRAAIDSAKLNLAYARITAPVSG 221
Q + L + A
Sbjct: 194 ----QFSTWQNQKYQKELNLDKKRAERLT 218



Score = 38.3 bits (89), Expect = 5e-05
Identities = 33/182 (18%), Positives = 61/182 (33%), Gaps = 26/182 (14%)

Query: 141 QISLANAQGALARDEALLAT--ARLDLKRYQTLVAQDSIAKQTADTQASLVKQY-EGTVQ 197
+ ++ + L ++L+ + L A++ T + ++ + + T
Sbjct: 251 KHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDN 310

Query: 198 ID--RAAIDSAKLNLAYARITAPVSGRV-GLRQVDPGNYVTPSDT--------NGIVVIT 246
I + + + I APVS +V L+ G VT ++T + + V
Sbjct: 311 IGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTA 370

Query: 247 QLQPMSVIFTTSEDNLPAILKQVGAGGKLSVTAYNRNNTTPLETGV-LDTLDNQIDTATG 305
+Q + F AI+K V A+ L V LD D G
Sbjct: 371 LVQNKDIGFINVG--QNAIIK---------VEAFPYTRYGYLVGKVKNINLDAIEDQRLG 419

Query: 306 TV 307
V
Sbjct: 420 LV 421


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_04380  NUCEPIMERASE  33     5e-04    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 33.2 bits (76), Expect = 5e-04
Identities = 24/123 (19%), Positives = 41/123 (33%), Gaps = 24/123 (19%)

Query: 1 MKIALFGATGMIGSRIAAEAARRGHQVTAL-------------SRNPAASGANVQAKAAD 47
MK + GA G IG ++ GHQV + +R + Q D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 48 LFD---PASIAAALEGQDVVASA------YGPKQEEASKVVAVAKAL--VDGARKAGVKR 96
L D + A+ + V S Y + A + L ++G R ++
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 97 VVV 99
++
Sbjct: 121 LLY 123


64  E4F39_05605  E4F39_05645  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_05605  -1      18       -2.487496  fimbrial biogenesis outer membrane usher
E4F39_05615  0       18       -0.640069  SCPU domain-containing protein
E4F39_05620  3       14       0.767991   hypothetical protein
E4F39_05625  4       10       2.391760   response regulator
E4F39_05630  4       11       2.173620   efflux transporter outer membrane subunit
E4F39_05635  2       11       1.751632   DHA2 family efflux MFS transporter permease
E4F39_05640  1       13       2.175991   thiol:disulfide interchange protein
E4F39_05645  1       13       1.919622   HlyD family secretion protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_05650  PF00577       455    e-150    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 455 bits (1171), Expect = e-150
Identities = 166/808 (20%), Positives = 267/808 (33%), Gaps = 89/808 (11%)

Query: 37 GTLYLELVVN-ALSTGRIVPVRYRDGVYYARA----GDLAQASVRTGAQP-------DAL 84
GT +++ +N R V D LA + T + DA
Sbjct: 76 GTYRVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDAC 135

Query: 85 VDL-SRLDGVQVEYESAEQRLKLTVPPDWLPRQTLG--SPRLYDRTPAAVSFGLLFNYDV 141
V L S + + + +QRL LT+P ++ + G P L+D A L NY+
Sbjct: 136 VPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINA----GLLNYNF 191

Query: 142 YANSPT--LGTSYTSAWTEQRLFDRWGTVTNTGVYRRDYGGGAGGVGSNRYLRYDTFWRY 199
NS +G + A+ + G Y GS ++ W
Sbjct: 192 SGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLE 251

Query: 200 SDQDRLR-TYTAGDVITGALSWSSAVRLGGVSVERDFKVRPDIVTYPLPQFSGQAAVPTA 258
D LR T GD T + + G + D + PD P G A
Sbjct: 252 RDIIPLRSRLTLGDGYTQGDIFDG-INFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQ 310

Query: 259 VDLFINGSKTTTGQVNPGPFTMNNVPFINGAGEATVVTTDALGRQVATTIPFYVANTLLQ 318
V + NG V PGPFT+N++ +G+ V +A G T+P+ L +
Sbjct: 311 VTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQR 370

Query: 319 KGLSDYSLSAGAMRRDYGIRSFSYGKFAASGTARHGLTDYLTLEGHVEGGERFALGGLGF 378
+G + YS++AG R + T HGL T+ G + +R+ G
Sbjct: 371 EGHTRYSITAGEYRSGNAQQE---KPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGI 427

Query: 379 DLGIGMFGVLGVAATQSRLAGASGRQY---------------------AFGYSYASQRF- 416
+G G L V TQ+ Q+ GY Y++ +
Sbjct: 428 GKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYF 487

Query: 417 SVSLQRIQRTNGFRDLS--------VYDLPANVAYRLVRSSTQATGALNLGALG----GT 464
+ + R NG+ + R Q T LG
Sbjct: 488 NFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSG 547

Query: 465 LGAGYFDVRGADGARTRIANLSYTRPLWRRATLYASVNKTVGEHGVAAQLQLIV--PLG- 521
Y+ D N ++ W TL S+ K + G L L V P
Sbjct: 548 SHQTYWGTSNVDEQFQAGLNTAFEDINW---TLSYSLTKNAWQKGRDQMLALNVNIPFSH 604

Query: 522 ----------EPGVVTGALARDANNSFSERVQYSRSVPSDGGLGWNL--AYAGGGSHYQ- 568
+ +++ D N + ++ D L +++ YAGGG
Sbjct: 605 WLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSG 664

Query: 569 ---QADATWRNRYFQAQGGVYGYGAGRGYARWGEVQGSVVVMDGAVLPANRVDDAFVLID 625
A +R Y A G + + V G V+ V ++D VL+
Sbjct: 665 STGYATLNYRGGYGNANIGYSHSDDIKQL--YYGVSGGVLAHANGVTLGQPLNDTVVLVK 722

Query: 626 TQGRGGVPVRYENQLVGKTDGGGHLLVPWAPSYYAGKYEIDPLDLPSNVRVPIVERRVAV 685
G V ENQ +TD G+ ++P+A Y + +D L NV + V
Sbjct: 723 APGAKDAKV--ENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVP 780

Query: 686 RDHGGALVTFPIRRIVCAQIALVDAAGRPVAIGSRVLHEESGETALVGWQGETYLEGLSA 745
F R + + +P+ G+ V E S + +V G+ YL G+
Sbjct: 781 TRGAIVRAEFKARVG-IKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPL 839

Query: 746 LNHLRVR--TPDGRTCRATFAADVDAAQ 771
++V+ + C A + ++ Q
Sbjct: 840 AGKVQVKWGEEENAHCVANYQLPPESQQ 867


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_05665  HTHFIS        63     1e-12    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 63.3 bits (154), Expect = 1e-12
Identities = 30/122 (24%), Positives = 50/122 (40%), Gaps = 10/122 (8%)

Query: 440 RVLVVDDQEMNRIVLRYQLDALGHHARLCASGDEALRALGTAAYDVVLTDCRMPGMDGIA 499
+LV DD R VL L G+ R+ ++ R + D+V+TD MP +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 500 LTAAIRAH-PDARVRATPIVGVTALVSDAEHARCVDAGMTLCIGKP----TTLDALERAL 554
L I+ PD P++ ++A + + + G + KP + + RAL
Sbjct: 65 LLPRIKKARPD-----LPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119

Query: 555 VE 556
E
Sbjct: 120 AE 121


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_05685  TCRTETB       102    2e-25    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 102 bits (256), Expect = 2e-25
Identities = 71/331 (21%), Positives = 140/331 (42%), Gaps = 20/331 (6%)

Query: 41 AFMEVLDTTIVNVALPHIAGTMSASYDEATWTLTSYLVANGIVLPISGFLGRLLGRKRYF 100
+F VL+ ++NV+LP IA + W T++++ I + G L LG KR
Sbjct: 23 SFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLL 82

Query: 101 VLCIVAFTICSFLCGIATDLGQLIVF-RVLQGLFGGGLQPNQQSIILDTF-PPEQRNRAF 158
+ I+ S + + L++ R +QG G P +++ + P E R +AF
Sbjct: 83 LFGIIINCFGSVIGFVGHSFFSLLIMARFIQGA-GAAAFPALVMVVVARYIPKENRGKAF 141

Query: 159 SISAVAIVVAPVLGPTLGGWITDNFSWRWVFLLNVPIGVLTSLAVIQLVEDPPWKRGRAR 218
+ + + +GP +GG I W +LL +P+ +T + V L++ K R +
Sbjct: 142 GLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPM--ITIITVPFLMKLLK-KEVRIK 196

Query: 219 GLSIDYIGITLIAIGLGCLQVMLDRGEDEDWFASTFIRTFAVLTVAGLVGATFWLLYAKK 278
G D GI L+++G+ + F +++ +F +++V + +
Sbjct: 197 G-HFDIKGIILMSVGIVFFML----------FTTSYSISFLIVSVLSFLIFVKHIRKVTD 245

Query: 279 PVVDLSCLKDRNFALGCVTIATFAVVLYGSAVLVPQLAQQRLGYTAMLAG-LVLSPGALL 337
P VD K+ F +G + + G +VP + + + G +++ PG +
Sbjct: 246 PFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMS 305

Query: 338 ITLEIPIVSKLMPYVQTRFLVCFGFLLLAAS 368
+ + I L+ +++ G L+ S
Sbjct: 306 VIIFGYIGGILVDRRGPLYVLNIGVTFLSVS 336


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_05695  RTXTOXIND     100    6e-25    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 99.5 bits (248), Expect = 6e-25
Identities = 63/414 (15%), Positives = 134/414 (32%), Gaps = 91/414 (21%)

Query: 51 KRPGKKPLVVLAIIVVLLLVGAFVW-WFATRNQVSTDDA--YTDGNAITIAPKVSGYVVA 107
+ P + ++A ++ LV AF+ V+T + G + I P + V
Sbjct: 50 ETPVSRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKE 109

Query: 108 LAIDDNVYVHRGDLLLVIDQRDYQAQVDAARAQLGLAQAQLDAAQVQLDIA------HVQ 161
+ + + V +GD+LL + +A ++ L A+ + Q+ ++
Sbjct: 110 IIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELK 169

Query: 162 FPAQYRQAQA---QIEAAQASFRQALAAYERQHAVDARATSQQAIDVADAQRLTADANVA 218
P + ++ + ++ + ++ Q + + +D A+RLT A +
Sbjct: 170 LPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQ-----KYQKELNLDKKRAERLTVLARIN 224

Query: 219 TARAQA----------------------------RTASLVPQQIRQAQTAVEQRRQQVLQ 250
+ ++R ++ +EQ ++L
Sbjct: 225 RYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILS 284

Query: 251 AQA-----------------------------QLEAAQLALSYCEVRAPSDGWITRRNVQ 281
A+ +L + +RAP + + V
Sbjct: 285 AKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVH 344

Query: 282 -LGSFLQAGAALFAIVTPQ---LWVTANFKESQLERMRAGDRVSVSVDAYP---NLELHG 334
G + L IV P+ L VTA + + + G + V+A+P L G
Sbjct: 345 TEGGVVTTAETLMVIV-PEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVG 403

Query: 335 HVDSIQLGSGSRFSAFPPENATGNFVKIVQRVPVKIAIDGGLPRDPPLGIGLSV 388
V +I L + + G ++ + G ++ PL G++V
Sbjct: 404 KVKNINLDA-------IEDQRLGLVFNVIISIEENCLSTGN--KNIPLSSGMAV 448


65  E4F39_06325  E4F39_06370  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_06325  1       10       0.538479   DHA2 family efflux MFS transporter permease
E4F39_06335  1       11       -0.105648  EmrA/EmrK family multidrug efflux transporter
E4F39_06340  2       11       -1.545959  efflux transporter outer membrane subunit
E4F39_06345  0       12       -1.116560  MarR family transcriptional regulator
E4F39_06350  -1      13       -0.510566  hypothetical protein
E4F39_06355  -1      13       -0.099700  translational GTPase TypA
E4F39_06365  -2      12       -0.451640  2-oxoglutarate dehydrogenase E1 component
E4F39_06370  0       14       -0.688251  2-oxoglutarate dehydrogenase complex
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_06370  TCRTETB       135    6e-37    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 135 bits (342), Expect = 6e-37
Identities = 84/396 (21%), Positives = 159/396 (40%), Gaps = 16/396 (4%)

Query: 27 VFMNVLDTSIANVAIPTISGDLGVSSDQGTWVITSFAVANAISVPLTGWLTDRIGQVRLF 86
F +VL+ + NV++P I+ D WV T+F + +I + G L+D++G RL
Sbjct: 23 SFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLL 82

Query: 87 LASIILFVISSWMCGLAPT-LPFLLASRVLQGAVAGPMIPLSQALLLSSYPRAKAPMALA 145
L II+ S + + + L+ +R +QGA A L ++ P+ A
Sbjct: 83 LFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFG 142

Query: 146 LWSMTTLIAPVAGPILGGWISDNYSWPWIFYVNIPVGIAAAAVTWMIYRSRESAVRRAPI 205
L + GP +GG I+ W ++ IP+ I V +++ ++ +
Sbjct: 143 LIGSIVAMGEGVGPAIGGMIAHYIHWSYLL--LIPM-ITIITVPFLMKLLKKEVRIKGHF 199

Query: 206 DGVGLALLVIWVGSLQIMLDKGKDLDWFASTTIVVLALTALIAFAFFVVWELTAEHPVVD 265
D G+ L+ + G + ML F ++ + + ++++F FV P VD
Sbjct: 200 DIKGIILMSV--GIVFFML--------FTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVD 249

Query: 266 LSLFRMRNFSGGTIALSVGYGLYFGNLVLLPLWLQTQIGYTATDAG-LVMAPVGFFAILL 324
L + F G + + +G G + ++P ++ + + G +++ P I+
Sbjct: 250 PGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIF 309

Query: 325 SPLTGKFLSRTDPRYIATAAFLTFALCFWMRSRYTTGVDEWSLMAPTFVQGIAMAGFFIP 384
+ G + R P Y+ ++ F S + + FV G ++
Sbjct: 310 GYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLG-GLSFTKTV 368

Query: 385 LVSITLSGLPGHRIPAASGLSNFVRIMCGGIGTSIF 420
+ +I S L A L NF + G G +I
Sbjct: 369 ISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIV 404


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_06375  RTXTOXIND     71     1e-15    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 71.4 bits (175), Expect = 1e-15
Identities = 36/270 (13%), Positives = 85/270 (31%), Gaps = 28/270 (10%)

Query: 94 ADSQVALQQAEANLAQTVRQVRGLYVNDDQYRAQVALRQSDLS--------------KAQ 139
+ Q Q E NL + + + ++Y + +S L
Sbjct: 196 STWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVL 255

Query: 140 DDLRRRLAVAQTGAVSQEEISHARDAVKAAQASLDAAGQQLASNRALTANTTVADHPNVL 199
+ + + V + ++ + +A+ Q + L N+
Sbjct: 256 EQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKN-EILDKLRQT--TDNIG 312

Query: 200 AAAAKVRDAYLNNARNTLPAPVTGYVAKRSVQ-VGQRVSPGTPLMSVVPLNAV-WVDANF 257
++ + + APV+ V + V G V+ LM +VP + V A
Sbjct: 313 LLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALV 372

Query: 258 KEVQLKHMRIGQPVELTADIYGSSVKYHGKVIGFSAGTGAAFSLLPAQNATGNWIKVVQR 317
+ + + +GQ + + + + +G ++G + + +V
Sbjct: 373 QNKDIGFINVGQNAIIKVEAFPYTR--YGYLVG-------KVKNINLDAIEDQRLGLVFN 423

Query: 318 LPVRVELDPKELKEHPLRIGLSMQVDVDIK 347
+ + +E + + + M V +IK
Sbjct: 424 VIISIEENCLSTGNKNIPLSSGMAVTAEIK 453



Score = 47.1 bits (112), Expect = 8e-08
Identities = 32/207 (15%), Positives = 72/207 (34%), Gaps = 28/207 (13%)

Query: 29 VIAIAAIAYGLYYLLVARFHETTDDAYVNGNVV------QITPQVTGTVIAVKADDTQTV 82
++A + + + +++ + A NG + +I P V + + ++V
Sbjct: 59 LVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESV 118

Query: 83 KSGDPLVVLDPADSQVALQQAEANLAQT---------------VRQVRGLYVNDDQYRAQ 127
+ GD L+ L ++ + +++L Q + ++ L + D+ Y
Sbjct: 119 RKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQN 178

Query: 128 VA----LRQSDLSKAQ-DDLRRRLAVAQ-TGAVSQEEISHARDAVKAAQASLDAAGQQLA 181
V+ LR + L K Q + + + + E + + +L
Sbjct: 179 VSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLD 238

Query: 182 SNRALTANTTVADHPNVLAAAAKVRDA 208
+L +A H VL K +A
Sbjct: 239 DFSSLLHKQAIAKH-AVLEQENKYVEA 264


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_06395  TCRTETOQM     171    7e-48    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.
Length = 639

Score = 171 bits (434), Expect = 7e-48
Identities = 102/435 (23%), Positives = 172/435 (39%), Gaps = 62/435 (14%)

Query: 5 LRNIAIIAHVDHGKTTLVDQLLRQSGTFRENQQVAE--RVMDSNDIEKERGITILAKNCA 62
+ NI ++AHVD GKTTL + LL SG E V + D+ +E++RGITI +
Sbjct: 3 IINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITS 62

Query: 63 VEYEGTHINIVDTPGHADFGGEVERVLSMVDSVLLLVDAVEGPMPQTRFVTKKALALGLK 122
++E T +NI+DTPGH DF EV R LS++D +LL+ A +G QTR + +G+
Sbjct: 63 FQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIP 122

Query: 123 PIVVINKIDRPGARIDWV-------------INQTFDLFDKLGATE----EQLDFPIV-- 163
I INKID+ G + V I Q +L+ + T EQ D I
Sbjct: 123 TIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEGN 182

Query: 164 -----------------YASGLNGY---ASLDP-----AARDGDMRPLFEAILQHVPVRP 198
+ SL P A + + L E I
Sbjct: 183 DDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSST 242

Query: 199 ADPDAPLQLQITSLDYSTYVGRIGVGRITRGRIKPGQPVVMRFGLEGDVLNRKINQVLSF 258
+ L ++ ++YS R+ R+ G + V R + + KI ++ +
Sbjct: 243 HRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSV--RISEKEKI---KITEMYTS 297

Query: 259 QGLERVQVDSAEAGDIVLINGIEDVGIGATICAVEAPEALPMITVDEPTLTMNFLVNSSP 318
E ++D A +G+IV++ E + + + + + I P L +
Sbjct: 298 INGELCKIDKAYSGEIVILQN-EFLKLNSVLGDTKLLPQRERIENPLPLLQTTVEPSKPQ 356

Query: 319 LAGREGKFVTSRQIRDRLMKELNHNVALRVKDTGDETVFEVSGRGELHLTILVENMRRE- 377
+ D L++ V E + +S G++ + + ++ +
Sbjct: 357 QREMLLDALLEISDSDPLLR-------YYVDSATHEII--LSFLGKVQMEVTCALLQEKY 407

Query: 378 GYELAVSRPRVVMQE 392
E+ + P V+ E
Sbjct: 408 HVEIEIKEPTVIYME 422



Score = 33.7 bits (77), Expect = 0.002
Identities = 17/100 (17%), Positives = 32/100 (32%), Gaps = 1/100 (1%)

Query: 387 RVVMQEIDGVKHEPYELLTVDLEDEHQGGVMEELGRRKGEMLDMVSDGRGRTRLEYRIPA 446
V+++ EPY + E+ + + ++D L IPA
Sbjct: 525 EQVLKKAGTELLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQL-KNNEVILSGEIPA 583

Query: 447 RGLIGFQSEFLTLTRGTGLMSHIFDSYAPVKEGSVGERRN 486
R + ++S+ T G + Y V + R
Sbjct: 584 RCIQEYRSDLTFFTNGRSVCLTELKGYHVTTGEPVCQPRR 623


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_06405  RTXTOXIND     29     0.028    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 29.4 bits (66), Expect = 0.028
Identities = 8/83 (9%), Positives = 26/83 (31%), Gaps = 3/83 (3%)

Query: 48 EVPAPAAGVLAQVLQNDGDTVVADQVIATID---TEAKAGAAAAAAGAADVQPAAAPVAA 104
E+ ++ +++ +G++V V+ + EA ++ A ++ + +
Sbjct: 98 EIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILS 157

Query: 105 PAPAAQPAAAAASSTAAASPAAS 127
+ S
Sbjct: 158 RSIELNKLPELKLPDEPYFQNVS 180


66  E4F39_06455  E4F39_06515  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_06455  -2      12       0.631559   pilus assembly protein
E4F39_06460  -2      11       2.202799   DUF3613 domain-containing protein
E4F39_06465  -1      11       1.816148   hypothetical protein
E4F39_06470  1       13       2.537913   sigma-54-dependent Fis family transcriptional
E4F39_06475  1       13       3.141185   DUF2968 domain-containing protein
E4F39_06480  1       13       3.130559   RNA chaperone Hfq
E4F39_06485  2       12       3.760649   hypothetical protein
E4F39_06490  1       13       3.376733   hypothetical protein
E4F39_06495  0       12       3.202052   DUF1571 domain-containing protein
E4F39_06500  0       12       2.569986   long-chain-fatty-acid--CoA ligase
E4F39_06505  -1      10       2.120667   TetR/AcrR family transcriptional regulator
E4F39_06515  -2      11       0.898460   MFS transporter
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_06505  PYOCINKILLER  32     0.004    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 31.7 bits (71), Expect = 0.004
Identities = 29/86 (33%), Positives = 37/86 (43%), Gaps = 3/86 (3%)

Query: 214 LMNQLKLAPAVRAEIRNDATRIAAAARARQRA-LARPGAPGAAASAGATLAASAAGSNGG 272
MN L A A + R AAA A+++A AA A T A A GS
Sbjct: 203 RMNTLTAAKASIEAAAANKAREQAAAEAKRKAEEQARQQ--AAIRAANTYAMPANGSVVA 260

Query: 273 AAAGKGAVAGAGASAPGAAATATAAA 298
AAG+G + A +A A A + A A
Sbjct: 261 TAAGRGLIQVAQGAASLAQAISDAIA 286


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_06520  HTHFIS        297    3e-98    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 297 bits (763), Expect = 3e-98
Identities = 130/475 (27%), Positives = 204/475 (42%), Gaps = 53/475 (11%)

Query: 19 ADIVDRVARCMSSFDVEVIRADN-EELSAERTAMRPSLAIISVSMIE-SGAAFLRTWQAE 76
A I + + +S +V N L A L + V M + + L +
Sbjct: 13 AAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDLLPRIKKA 72

Query: 77 -IGMPVVWVGA--------------ARDHDPSLYPPEYSHILPLDFTCAELRGMISKLAV 121
+PV+ + A A D+ P P + + ++ + +++
Sbjct: 73 RPDLPVLVMSAQNTFMTAIKASEKGAYDYLPK--PFDLTELIGIIGRA------LAEPKR 124

Query: 122 QLRAHAAKALEPSTLVAHSDCMQALLQEVDTFADCDTNVLLHGETGVGKERIAQLLHEKH 181
+ + + LV S MQ + + + D +++ GE+G GKE +A+ LH+ +
Sbjct: 125 RPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHD-Y 183

Query: 182 SRYGMGEFVPVNCGAIPDGLFESLFFGHAKGSFTGAVGTHKGYFEQAAGGTLFLDEVGDL 241
+ G FV +N AIP L ES FGH KG+FTGA G FEQA GGTLFLDE+GD+
Sbjct: 184 GKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDM 243

Query: 242 PLYQQVKLLRVLEDGAVLRIGATAPVKVDFRLVAASNKKLPQLVKDGLFRADLYYRLAVI 301
P+ Q +LLRVL+ G +G P++ D R+VAA+NK L Q + GLFR DLYYRL V+
Sbjct: 244 PMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVV 303

Query: 302 ELSIPSLEERGPVDKIALFKSFVASIVGEDRLAALPELPYWLAEAVADSYFPGNVRELRN 361
L +P L +R D L + FV E + E + +PGNVREL N
Sbjct: 304 PLRLPPLRDR-AEDIPDLVRHFVQQAEKEGL--DVKRFDQEALELMKAHPWPGNVRELEN 360

Query: 362 LAERVGV------------------------TVRQTGGWDTARLQRLIAHARSAAQPAPA 397
L R+ + + + + + +
Sbjct: 361 LVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFG 420

Query: 398 ESAPDVFVDRSKWDMTERNRVIAALDANGWRRQDTAQHLGISRKVLWEKMRKYQI 452
++ P + E ++AAL A + A LG++R L +K+R+ +
Sbjct: 421 DALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGV 475


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_06525  IGASERPTASE   28     0.042    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 28.1 bits (62), Expect = 0.042
Identities = 19/108 (17%), Positives = 32/108 (29%), Gaps = 9/108 (8%)

Query: 119 LFQQKAFWRVIRTASEARAEAVYRDFAKQSETLAVNELQAAKLESQKALTDRQIAVA--- 175
++A V + + T K E K T++ V
Sbjct: 1067 EVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVT 1126

Query: 176 ------QERASRLQADLSIAREQRAAVATRQKDKLDETVALREQKSER 217
QE++ +Q ARE V ++ T A EQ ++
Sbjct: 1127 SQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKE 1174


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_06530  PF00577       29     0.014    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 29.0 bits (65), Expect = 0.014
Identities = 19/123 (15%), Positives = 28/123 (22%), Gaps = 9/123 (7%)

Query: 83 GAHGGGGRPGGREGGGHGPYGSHGGSREPRGDGGGYGAREPRGDGGYGSRESRGDGGYGS 142
SH + G YG + Y + GG G+
Sbjct: 603 SHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGN 662

Query: 143 REPRGDGGYGAREPREPREPRESYGAPQEGASTPSGTQERNGNGNGPVIVTRRRRSLGPT 202
G R YG G S ++ +G V+ +LG
Sbjct: 663 SGSTGYATLNYRGG---------YGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQP 713

Query: 203 DGQ 205

Sbjct: 714 LND 716


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_06555  HTHTETR       67     3e-15    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 67.0 bits (163), Expect = 3e-15
Identities = 21/83 (25%), Positives = 35/83 (42%)

Query: 4 RQASRQSGGTKARILDAAEDLFIEHGFEAMSMRQITSRAAVNLAAVNYHFGSKEALIHAM 63
R+ +++ T+ ILD A LF + G + S+ +I A V A+ +HF K L +
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 64 LSRRLDQLNEERLRILDRFDAQL 86
+ E L +F
Sbjct: 63 WELSESNIGELELEYQAKFPGDP 85


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_06560  TCRTETA       60     4e-12    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 60.2 bits (146), Expect = 4e-12
Identities = 58/261 (22%), Positives = 103/261 (39%), Gaps = 12/261 (4%)

Query: 59 IGALIFGRLADHFGRRPTLMINIACYSLLELASGFAPSLAALLVLRTLFGVAMGGEWGVG 118
A + G L+D FGRRP L++++A ++ AP L L + R + G+ G V
Sbjct: 58 ACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGIT-GATGAVA 116

Query: 119 SALTMETVPPRARGAVSGLLQAGYPSGYLLASVVFGLLYPYIGWRGMFMIGVLPALLVLY 178
A + R G + A + G + V+ GL+ + F L L L
Sbjct: 117 GAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLT 176

Query: 179 VRAKVPES-PAWKQMEKRARPGLVATLKQNWKLSIYAVVLMTAF--NFFSHGTQDLYPTF 235
+PES ++ +R +A+ + +++ A ++ F L+ F
Sbjct: 177 GCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIF 236

Query: 236 LREQHHFDPHTVSWITIVLNI-GAIVGGLTFGWLSERIGRRRAI---FIAAMIALPVLPL 291
++ H+D T+ I ++ + G ++ R+G RRA+ IA +L
Sbjct: 237 GEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAF 296

Query: 292 ----WAFSTGALALAAGAFLM 308
W + LA+G M
Sbjct: 297 ATRGWMAFPIMVLLASGGIGM 317


67  E4F39_06915  E4F39_06970  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_06915  -1      11       -1.478402  fimbrial protein
E4F39_06925  -1      11       0.698810   Flp pilus assembly protein CpaB
E4F39_06930  0       12       1.174901   type II and III secretion system protein family
E4F39_06935  2       11       2.399331   hypothetical protein
E4F39_06940  2       12       4.665561   fimbrial protein
E4F39_06945  -2      13       4.124473   CpaF family protein
E4F39_06950  -2      14       4.504368   pilus assembly protein
E4F39_06955  -1      15       5.477275   type II secretion system F family protein
E4F39_06960  3       14       5.170851   tetratricopeptide repeat protein
E4F39_06965  2       15       4.904011   pilus assembly protein
E4F39_06970  4       17       4.180791   pilus assembly protein TadE
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_06960  PREPILNPTASE  32     9e-04    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.
Length = 290

Score = 32.1 bits (73), Expect = 9e-04
Identities = 31/148 (20%), Positives = 49/148 (33%), Gaps = 18/148 (12%)

Query: 20 LVASWTLASLALADLRTRRLATFAVALVGALYAALALAGAPGDGGFASHAALGAAA---- 75
L+ +W L +L DL L + L+ L G A +GA A
Sbjct: 138 LLLTWVLVALTFIDLDKMLLP--DQLTLPLLWGGLLFNLLGGFVSLGD-AVIGAMAGYLV 194

Query: 76 ----FALGAAMFRAGWIAGGDVKLAAVVFLWAGPAHAWPVAFAIGVGGLAVGAVCIAAGR 131
+ + + GD KL A + W G V + G +G I
Sbjct: 195 LWSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALPIVLLLSSLVGAFMGIGLILLRN 254

Query: 132 VPRVLAWFAPARGVPYGVALAAGGLLAV 159
++ +P+G LA G +A+
Sbjct: 255 H-------HQSKPIPFGPYLAIAGWIAL 275


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_06970  BCTERIALGSPD  143    4e-39    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.
Length = 660

Score = 143 bits (361), Expect = 4e-39
Identities = 68/283 (24%), Positives = 116/283 (40%), Gaps = 16/283 (5%)

Query: 127 VVQTLKPYLRQQEALVNRLTLARPIQVHLRVRITEVDRNITQQLGINWSALGA------- 179
+V + E ++ +L + RP QV + I EV LGI W+ A
Sbjct: 322 IVTAAPDVMNDLERVIAQLDIRRP-QVLVEAIIAEVQDADGLNLGIQWANKNAGMTQFTN 380

Query: 180 SGNFVGGLFNGRTLFDTASKAFDLSPSGAFSVVGGFHTSRYSIDG--VLDALDQEGLITM 237
SG + G ++ S A S G Y + +L AL +
Sbjct: 381 SGLPISTAIAGANQYNKDGTVSSSLAS-ALSSFNGIAAGFYQGNWAMLLTALSSSTKNDI 439

Query: 238 LAEPNLTAISGQTASFLAGGEFPIPVAQDTTGA----ITIQFKPYGVSLDFTPTVLADNR 293
LA P++ + A+F G E P+ TT T++ K G+ L P + +
Sbjct: 440 LATPSIVTLDNMEATFNVGQEVPVLTGSQTTSGDNIFNTVERKTVGIKLKVKPQINEGDS 499

Query: 294 ISLKVRPEVSEIDPTNSVTTGSIKVPALTVRRVDTTVELSSGQSFAIGGLLQSKSSDVLA 353
+ L++ EVS + S T+ + R V+ V + SG++ +GGLL SD
Sbjct: 500 VLLEIEQEVSSVADAASSTSSDLGA-TFNTRTVNNAVLVGSGETVVVGGLLDKSVSDTAD 558

Query: 354 ELPGLARLPVLGKLFSSRNYLNDKTEVVVIVTPYIVQPANPGE 396
++P L +PV+G LF S + K +++ + P +++ +
Sbjct: 559 KVPLLGDIPVIGALFRSTSKKVSKRNLMLFIRPTVIRDRDEYR 601


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_06980  HTHFIS        34     0.001    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 33.7 bits (77), Expect = 0.001
Identities = 29/165 (17%), Positives = 52/165 (31%), Gaps = 20/165 (12%)

Query: 22 GARLVAIVADAASDEVIRNLIADQAMTGAQVARGGIDDAIALMRDLSHGPQHLLVDVSGA 81
GA ++ DAA V+ ++ + + ++ DV
Sbjct: 3 GATILVADDDAAIRTVLNQALSRAGYD--VRITSNAATLWRWIA--AGDGDLVVTDVV-- 56

Query: 82 AMP----LSDLARLADVCDPSVNVIVIGERNDVGLFRSMLRIGVRDYLVKPL----TVEL 133
MP L R+ P + V+V+ +N G DYL KP + +
Sbjct: 57 -MPDENAFDLLPRIKKA-RPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGI 114

Query: 134 VHRALSAADPNAAARAGKAIGFVGARGGVGVTSIAVALARHLADR 178
+ RAL+ + + + G S A+ + R
Sbjct: 115 IGRALAEPKRRPSKLEDDSQDGMPLVG----RSAAMQEIYRVLAR 155


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_06985  PF05272       30     0.034    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.7 bits (66), Expect = 0.034
Identities = 18/50 (36%), Positives = 26/50 (52%), Gaps = 4/50 (8%)

Query: 303 IVISGGTGSGKTTLLNAL---SHFIDSHERIVTIEDAAELQLQQPHVVSL 349
+V+ G G GK+TL+N L F D+H I T +D+ E Q+ L
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYE-QIAGIVAYEL 647


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_07000  SYCDCHAPRONE  31     0.004    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.
Length = 168

Score = 31.1 bits (70), Expect = 0.004
Identities = 20/83 (24%), Positives = 32/83 (38%)

Query: 54 SVAESALAAGDAELAATLFERALKADPRSLPAQVGLGDAMYQTGELARAGVLYAQAAAAA 113
S+A + +G E A +F+ D +GLG G+ A Y+ A
Sbjct: 41 SLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMD 100

Query: 114 PDDPRAQLGLARVALRERHLDDA 136
+PR A L++ L +A
Sbjct: 101 IKEPRFPFHAAECLLQKGELAEA 123


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_07010  PYOCINKILLER  32     0.004    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 32.1 bits (72), Expect = 0.004
Identities = 30/132 (22%), Positives = 49/132 (37%), Gaps = 4/132 (3%)

Query: 23 RVAAARNELQNAADAAALAGAASLEAGAGAPAWAAAASAAAAALSLNASDGAALSSGDVQ 82
A A+ + + A A AA+ A PA + + AA + + GAA + +
Sbjct: 226 AAAEAKRKAEEQARQQAAIRAANTYA---MPANGSVVATAAGRGLIQVAQGAASLAQAIS 282

Query: 83 TGYWNVTGVPAGLEPTTLAPGEYDVPAVQATVTRAPNQNGGPLSLLMGGLLGLVGTPAAA 142
V G P+ +A G + T + +Q + +G +G P +
Sbjct: 283 DAI-AVLGRVLASAPSVMAVGFASLTYSSRTAEQWQDQTPDSVRYALGMDAAKLGLPPSV 341

Query: 143 TAVAVAGAPATV 154
AVA A TV
Sbjct: 342 NLNAVAKASGTV 353


68  E4F39_07010  E4F39_07055  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_07010  -1      16       2.075839   TetR family transcriptional regulator
E4F39_07020  0       18       3.818324   MexX/AxyX family multidrug efflux RND
E4F39_07030  -2      12       4.003969   multidrug efflux RND transporter permease
E4F39_07035  -2      11       2.636770   efflux transporter outer membrane subunit
E4F39_07040  -1      12       3.294201   hypothetical protein
E4F39_07045  -2      11       2.480964   hypothetical protein
E4F39_07050  -1      11       2.462988   type-1 fimbrial protein
E4F39_07055  0       11       2.583943   fimbrial biogenesis outer membrane usher
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_07050  HTHTETR       117    5e-35    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 117 bits (295), Expect = 5e-35
Identities = 53/210 (25%), Positives = 100/210 (47%), Gaps = 4/210 (1%)

Query: 1 MARKTREESLNTKNRILDAAELVLLEKGVGQTAMADIAEAAGMSRGAVYGHFNGKIEVCV 60
MARKT++E+ T+ ILD A + ++GV T++ +IA+AAG++RGA+Y HF K ++
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 AVCDRAFSRAVEGFDLSDERPA---LATLRLAASHYLHQCGEPGSMQRVLEILYMKCEQS 117
+ + + S E + L+ LR H L + ++EI++ KCE
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 118 EENAPLMRRRALYELQTLRIAKALLRRAVAAGELDASLDVHLAGVYLLSLLEGIFGSMIW 177
E A + + + L++ + L+ + A L A L A + + + G+ + ++
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 178 TTRLRGDRWRDAEAMLDAGVDTLRASPALR 207
+ D ++A + ++ P LR
Sbjct: 181 APQSF-DLKKEARDYVAILLEMYLLCPTLR 209


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_07055  RTXTOXIND     40     1e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 39.8 bits (93), Expect = 1e-05
Identities = 19/133 (14%), Positives = 41/133 (30%), Gaps = 5/133 (3%)

Query: 67 EVRARVAGIVTARTYEEGQEVKRGAVLFRIDPAPFKAARDAAAGALEKAQAAHLAALDKR 126
E++ IV +EG+ V++G VL ++ +A +L +A+
Sbjct: 98 EIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILS 157

Query: 127 RRYDELVRDRAVSERDHTEALADERQAKAAVASARAELA-----RAQLQLDYATVTAPID 181
R + + E + + + + + + Q +L+ A
Sbjct: 158 RSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERL 217

Query: 182 GRARRALVTEGAL 194
R E
Sbjct: 218 TVLARINRYENLS 230



Score = 34.8 bits (80), Expect = 5e-04
Identities = 18/100 (18%), Positives = 39/100 (39%), Gaps = 10/100 (10%)

Query: 102 KAARDAAAGALEKAQAAHLAALDKRRRYDELVRDRAVSERDHTEALADERQAKAAVASAR 161
LE+ ++ L+A ++ + +L + E L RQ +
Sbjct: 265 VNELRVYKSQLEQIESEILSAKEEYQLVTQLFK---------NEILDKLRQTTDNIGLLT 315

Query: 162 AELARAQLQLDYATVTAPIDGR-ARRALVTEGALVGQDQA 200
ELA+ + + + + AP+ + + + TEG +V +
Sbjct: 316 LELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAET 355


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_07060  ACRIFLAVINRP  1079   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1079 bits (2791), Expect = 0.0
Identities = 516/1032 (50%), Positives = 701/1032 (67%), Gaps = 6/1032 (0%)

Query: 1 MARFFIDRPVFAWVISLFIMLGGIFAIRALPVAQYPDIAPPVVSLYATYPGASAQVVEES 60
MA FFI RP+FAWV+++ +M+ G AI LPVAQYP IAPP VS+ A YPGA AQ V+++
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTAVIEREMNGVPGLLYTSATS-SAGQASLSLTFKQGVSADLAAVDVQNRLKIVEARLPE 119
VT VIE+ MNG+ L+Y S+TS SAG +++LTF+ G D+A V VQN+L++ LP+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 120 PVRRDGISIEKAADNAQIIVSLTSEDGRLSGVELGEYASANVLQALRRVEGVGKVQFWGA 179
V++ GIS+EK++ + ++ S++ + ++ +Y ++NV L R+ GVG VQ +GA
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 180 EYAMRIWPDPVKMAALGLTASDIASAVRAHNARVTIGDVGRSAVPDSAPIAATVLADAPL 239
+YAMRIW D + LT D+ + ++ N ++ G +G + + A+++A
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 240 TTPDAFGAIALRARADGSTLYLRDVARIEFGGNDYNYPSFVNGKTATGMGIKLAPGSNAV 299
P+ FG + LR +DGS + L+DVAR+E GG +YN + +NGK A G+GIKLA G+NA+
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 300 ATEKRVRATMEELAKFFPPGVKYQIPYETASFVRVSMSKVVTTLVEAGVLVFAVMFLFMQ 359
T K ++A + EL FFP G+K PY+T FV++S+ +VV TL EA +LVF VM+LF+Q
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 360 NFRATLIPTLVVPVALLGTFGAMLAAGFSINVLTMFGMVLAIGILVDDAIVVVENVERLM 419
N RATLIPT+ VPV LLGTF + A G+SIN LTMFGMVLAIG+LVDDAIVVVENVER+M
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 420 VEEKLPPYEATVKAMKQISGAIVGITVVLTSVFVPMAFFGGAVGNIYRQFAFALAVSIGF 479
+E+KLPP EAT K+M QI GA+VGI +VL++VF+PMAFFGG+ G IYRQF+ + ++
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 480 SAFLALSLTPALCATLLKPVADDHHE-KDGFFGWFNRFVARSTHRYTRRVGRVLERPLRW 538
S +AL LTPALCATLLKPV+ +HHE K GFFGWFN S + YT VG++L R+
Sbjct: 481 SVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY 540

Query: 539 LVVYGALTAAAALLITKLPAAFLPDEDQGNFMVMVIRPQGTPLAETMQSVRRVEEYVRTH 598
L++Y + A +L +LP++FLP+EDQG F+ M+ P G T + + +V +Y +
Sbjct: 541 LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN 600

Query: 599 SPSAY--TFALGGYNLYGEGPNGGMIFVTMKDWKERKRARDQVQAIIAEINAHFAGTPNT 656
+ F + G++ G+ N GM FV++K W+ER + +A+I +
Sbjct: 601 EKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDG 660

Query: 657 MVFAINMPALPDLGLTGGFDFRLQDRGGLGYGAFVAAREKLLAEGRKDPV-LTDLMFAGT 715
V NMPA+ +LG GFDF L D+ GLG+ A AR +LL + P L + G
Sbjct: 661 FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGL 720

Query: 716 QDAPQLKLDIDRAKASALGVSMEEINATLAVMFGSDYIGDFMHGSQVRRVIVQADGRHRL 775
+D Q KL++D+ KA ALGVS+ +IN T++ G Y+ DF+ +V+++ VQAD + R+
Sbjct: 721 EDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRM 780

Query: 776 DAADVTKLRVRNAKGEMVPLAAFATLHWTMGPPQLTRYNGFPSFTINGAASAGHSSGEAM 835
DV KL VR+A GEMVP +AF T HW G P+L RYNG PS I G A+ G SSG+AM
Sbjct: 781 LPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAM 840

Query: 836 AAIERIASTLPAGTGYAWSGQSYEERLSGAQAPMLFALSVLVVFLALAALYESWSIPFAV 895
A +E +AS LPAG GY W+G SY+ERLSG QAP L A+S +VVFL LAALYESWSIP +V
Sbjct: 841 ALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV 900

Query: 896 MLVVPLGVIGAVAGVTLRGMPNDIYFKVGLIATIGLSAKNAILIVEVAKDLVAQR-MSLA 954
MLVVPLG++G + TL ND+YF VGL+ TIGLSAKNAILIVE AKDL+ + +
Sbjct: 901 MLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVV 960

Query: 955 DAALEAARLRLRPIVMTSLAFGVGVLPLAFATGAASGAQIAIGTGVLGGVISATLFAIFL 1014
+A L A R+RLRPI+MTSLAF +GVLPLA + GA SGAQ A+G GV+GG++SATL AIF
Sbjct: 961 EATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFF 1020

Query: 1015 VPLFFVCVGRVF 1026
VP+FFV + R F
Sbjct: 1021 VPVFFVVIRRCF 1032


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
E4F39_07065  RTXTOXIND  33     0.002    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 33.3 bits (76), Expect = 0.002
Identities = 18/104 (17%), Positives = 34/104 (32%), Gaps = 2/104 (1%)

Query: 379 APRLTLPIFAGGRNRANLDVADARKHIAVAEYEKTIQTAFREV--ADALAARDQIDAQLA 436
P L LP +N + +V I Q +E+ A R + A++
Sbjct: 165 LPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARIN 224

Query: 437 AQQAVYGADAERLRLAQRRYDSGVASYLELLDAQRSTFESGQEL 480
+ + + RL + +L+ + E+ EL
Sbjct: 225 RYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNEL 268


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
E4F39_07090  PF00577  678    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 678 bits (1751), Expect = 0.0
Identities = 227/851 (26%), Positives = 353/851 (41%), Gaps = 60/851 (7%)

Query: 2 RIRHSFLCVFMLAAGSHARATEFNASFLSIDGRNDVDLSQFAQADYTLPGTYLLDVQVND 61
+R C F A + FN FL+ D + DLS+F PGTY +D+ +N+
Sbjct: 27 FVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNN 86

Query: 62 VFFGLQPIEFVAHDDGQGARACVAPELVAQFGLKKSLVENLPRTMGGRCADLASL-DGVT 120
+ + + F D QG C+ +A GL + V + C L S+ T
Sbjct: 87 GYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIHDAT 146

Query: 121 IRYQKGEGRLKITIAQAALEFADASYLPPERWSDGVDGAMLDYRVFANANHAFGRGAQQN 180
+ G+ RL +TI QA + Y+PPE W G++ +L+Y + N R +
Sbjct: 147 AQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYN--FSGNSVQNRIGGNS 204

Query: 181 NAVQAYGTIGANWGAWRFRGDYQAQ-TRAGGAVYAERAFRFNQLYAYRALPSIRSTLSFG 239
+ G N GAWR R + + + ++ ++ + R + +RS L+ G
Sbjct: 205 HYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLG 264

Query: 240 EIYVDSDIFSTFSMSGVAMKSDDRMLPPSMRGYAPLVTGVARTNAIVKVMQDSRVLYMTK 299
+ Y DIF + G + SDD MLP S RG+AP++ G+AR A V + Q+ +Y +
Sbjct: 265 DGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNST 324

Query: 300 VSPGAFALSNLN-TSVQGTLDVVVEEEDGTVQRFQVATASVPFLAREGQLRYKTAIGQPR 358
V PG F ++++ G L V ++E DG+ Q F V +SVP L REG RY G+ R
Sbjct: 325 VPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYR 384

Query: 359 TFGGAGITPWFGFAEAAYGLPFDVTVYGGLIAASGYTSVAFGVGRDFGRFGALSADVTHA 418
+ P F + +GLP T+YGG A Y + FG+G++ G GALS D+T A
Sbjct: 385 SGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQA 444

Query: 419 RATLWWNGRTKRGNSYRINYSKHVDALDADVRFFGYRFSERDYTNFQQFSGDPTASGL-- 476
+TL + G S R Y+K ++ +++ GYR+S Y NF +
Sbjct: 445 NSTL-PDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIE 503

Query: 477 -------------------ANGKQRYSAMLSKRFGDTST-YFSYDQTTYW-ARPSDRRIG 515
N + + ++++ G TST Y S TYW D +
Sbjct: 504 TQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYWGTSNVDEQFQ 563

Query: 516 VTLTRAFSLGALKSVNLGFSAFRTQGAGGGGNQVSLTATLPLGER-----------QTLT 564
L AF + L +S + G ++L +P + +
Sbjct: 564 AGLNTAFEDI---NWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHASAS 620

Query: 565 SSVSAGEGGTSVNAGYLYDGA---NGRTYQLYGGTTDGRASANASLRQRTPSYQ-----L 616
S+S G N +Y N +Y + G G + S T +Y+
Sbjct: 621 YSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNA 680

Query: 617 TAQASTVANAYASASLEVDGSFVATRYGVTAHANGNAGDTRLLVSTDGVPGVPLS-GSYA 675
S ++ V G +A GVT DT +LV G + +
Sbjct: 681 NIGYS-HSDDIKQLYYGVSGGVLAHANGVTLGQPL--NDTVVLVKAPGAKDAKVENQTGV 737

Query: 676 RTNARGYAVIDGVSPYNVYDATVSVEKLGLDTDVTNPIQRTVLTDGAIGYIRFNAARGRN 735
RT+ RGYAV+ + Y + L + D+ N + V T GAI F A G
Sbjct: 738 RTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIK 797

Query: 736 VFVTLTGDGGAPVPFGASVQDAATGKELGIVGEAGAAYLTQVQPRAKLVVRAGAKTICT- 794
+ +TLT + P+PFGA V + + GIV + G YL+ + K+ V+ G +
Sbjct: 798 LLMTLTHNNK-PLPFGAMVTS-ESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHC 855

Query: 795 --PAALPDTLQ 803
LP Q
Sbjct: 856 VANYQLPPESQ 866


69  E4F39_07395  E4F39_07430  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias   Product
E4F39_07395   1      12       5.195949  SDR family oxidoreductase
E4F39_07400   1      12       5.053803  short-chain dehydrogenase
E4F39_07405   1      11       3.695470  carbamate kinase
E4F39_07410   1      12       3.503911  ornithine carbamoyltransferase
E4F39_07415   0      12       3.585663  arginine deiminase
E4F39_07420   0      10       3.130946  arginine-ornithine antiporter
E4F39_07425  -1      13       1.570874  RNA-binding protein
E4F39_07430   0      11       1.769199  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_07460  DHBDHDRGNASE  87     2e-22    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 87.0 bits (215), Expect = 2e-22
Identities = 51/187 (27%), Positives = 79/187 (42%), Gaps = 10/187 (5%)

Query: 1 MTGKRILVTGAGSGFGREVALRLAAKGHCVIAGVQITE----LSAEAARRGLALDAVKLD 56
+ GK +TGA G G VA LA++G + A E + + +A D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 57 VT-CARERAQAARWD-----VDVLLNNAGAGEAGALVDLPVDIVRELFETNVFGPLELTQ 110
V A AR + +D+L+N AG G + L + F N G ++
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 111 QVARGMIARGRGRIVFVSSIAGLITGAYTGAYCASKHALEAIAEAMHLELAAHGVQIAVV 170
V++ M+ R G IV V S + AY +SK A + + LELA + ++ +V
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 171 NPGPYRT 177
+PG T
Sbjct: 186 SPGSTET 192


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_07470  CARBMTKINASE  403    e-144    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 403 bits (1037), Expect = e-144
Identities = 145/310 (46%), Positives = 192/310 (61%), Gaps = 13/310 (4%)

Query: 2 RIVIALGGNALLQRNQPMTEVQQRENVKIAVAQIAQ-IAPGNELVIAHGNGPQVGLLALQ 60
R+VIALGGNAL QR Q + + +NV+ QIA+ IA G E+VI HGNGPQVG L L
Sbjct: 4 RVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSLLLH 63

Query: 61 ---GAAYPAVAPYPLDVLGAQTEGMIGYLIEQEMGNLLPP---DAPFATLLTQVEVDPAD 114
G A + P+DV GA ++G IGY+I+Q + N L + T++TQ VD D
Sbjct: 64 MDAGQATYGIPAQPMDVAGAMSQGWIGYMIQQALKNELRKRGMEKKVVTIITQTIVDKND 123

Query: 115 PAFEHPTKPIGPVYSRDEAERLALEKGWHIAPD-GDKFRRVVPSPRPRRIFEIRPVKWLL 173
PAF++PTKP+GP Y + A+RLA EKGW + D G +RRVVPSP P+ E +K L+
Sbjct: 124 PAFQNPTKPVGPFYDEETAKRLAREKGWIVKEDSGRGWRRVVPSPDPKGHVEAETIKKLV 183

Query: 174 EKGTIVICAGGGGIPTRYDANGKLSGVEAVIDKDLCASLLARELSADLLVIATDVDGAYL 233
E+G IVI +GGGG+P + +G++ GVEAVIDKDL LA E++AD+ +I TDV+GA L
Sbjct: 184 ERGVIVIASGGGGVPVILE-DGEIKGVEAVIDKDLAGEKLAEEVNADIFMILTDVNGAAL 242

Query: 234 DWGKPTQALIEAAHPDELERL----GFAAGSMGPKVQAAIEFARQTGHDAVIGSLADIVA 289
+G + + +EL + F AGSMGPKV AAI F G A+I L V
Sbjct: 243 YYGTEKEQWLREVKVEELRKYYEEGHFKAGSMGPKVLAAIRFIEWGGERAIIAHLEKAVE 302

Query: 290 IAEGRAGTRV 299
EG+ GT+V
Sbjct: 303 ALEGKTGTQV 312


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_07480  ARGDEIMINASE  515    0.0      Bacterial arginine deiminase signature.
		>ARGDEIMINASE#Bacterial arginine deiminase signature.

Length = 409

Score = 515 bits (1327), Expect = 0.0
Identities = 130/423 (30%), Positives = 227/423 (53%), Gaps = 21/423 (4%)

Query: 1 MSQAIPQVGVHSEVGKLRKVLVCSPGLAHQRLTPSNCDELLFDDVMWVNQAKRDHFDFVS 60
M + + + + SE+G+L+KVL+ PG + LTP LFDD+ ++ A+++H F S
Sbjct: 1 MEEYLNPINIFSEIGRLKKVLLHRPGEELENLTPFIMKNFLFDDIPYLEVARQEHEVFAS 60

Query: 61 KMRERGVEVLEMHNLLTETVQNPAALK------WILDRKITPDNVGIGLVDEVRAWLEGL 114
++ VE+ + +L++E + + AL+ +IL+ +I D ++ ++ + L
Sbjct: 61 ILKNNLVEIEYIEDLISEVLVSSVALENKFISQFILEAEIKTDFT----INLLKDYFSSL 116

Query: 115 EPRALAEFLIGGVAASDIAGAERSKVLTLFRDYLGKSSFVLPPLPNMMFTRDTSCWIYGG 174
+ +I GV ++ S + G + F++ P+PN++FTRD I G
Sbjct: 117 TIDNMISKMISGVVTEELKNYTSSLDDLV----NGANLFIIDPMPNVLFTRDPFASIGNG 172

Query: 175 VTLNPMHWPARRQETLLVAAVYKFHPAFTDAKFDVWYGDPDRDHGMATLEGGDVMPIGRG 234
VT+N M R++ET+ ++K+HP + +W + A+LEGGD + + +G
Sbjct: 173 VTINKMFTKVRQRETIFAEYIFKYHPVYK-ENVPIWLNRWE----EASLEGGDELVLNKG 227

Query: 235 VVLVGMGERTSRQAVGQLAQALFA-KGAAERVIVAGLPNSRASMHLDTVFSFCDRDLVTV 293
++++G+ ERT ++V +LA +LF K + + ++ +P +R+ MHLDTVF+ D + T
Sbjct: 228 LLVIGISERTEAKSVEKLAISLFKNKTSFDTILAFQIPKNRSYMHLDTVFTQIDYSVFTS 287

Query: 294 FPEVVNRIVPFTLRPGGDARYGIDIEREDKPFVDVVAQALGLKSLRVVETGGNDFAAERE 353
F + L + I I++E DV++ LG K + GG+ RE
Sbjct: 288 FTSDDMYFSIYVLTYNPSSSK-IHIKKEKARIKDVLSFYLGRKIDIIKCAGGDLIHGARE 346

Query: 354 QWDDGNNMVCIEPGVVVGYDRNTYTNTLLRKAGVEVITIGSSELGRGRGGGHCMTCPVLR 413
QW+DG N++ I PG ++ Y RN TN L + G++V I SSEL RGRGG CM+ P++R
Sbjct: 347 QWNDGANVLAIAPGEIIAYSRNHVTNKLFEENGIKVHRIPSSELSRGRGGPRCMSMPLIR 406

Query: 414 DPV 416
+ +
Sbjct: 407 EDI 409


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
E4F39_07490  PF06917  26     0.041    Periplasmic pectate lyase
		>PF06917#Periplasmic pectate lyase

Length = 555

Score = 26.0 bits (57), Expect = 0.041
Identities = 12/37 (32%), Positives = 17/37 (45%), Gaps = 3/37 (8%)

Query: 52 NWSALAEIRRRLHGMYWKRRRIGVWLFSFWDRSDAAE 88
+W L R HG Y K+R V+ +D + AE
Sbjct: 173 DWKTLDLGR---HGNYSKQRDPQVFTHPRYDVVNPAE 206


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_07495  FLGLRINGFLGH  26     0.042    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 26.5 bits (58), Expect = 0.042
Identities = 13/37 (35%), Positives = 18/37 (48%)

Query: 73 ALSARPIPLPCAARHGTPATLAAFGRMGSRPLVHSRR 109
A SA+P+P P +G+ A G +PL RR
Sbjct: 34 ATSAQPVPGPTPVANGSIFQSAQPINYGYQPLFEDRR 70


70  E4F39_07745  E4F39_07790  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_07745   1      18       -1.459074  tetratricopeptide repeat protein
E4F39_07750   2      17       -1.214555  HlyD family type I secretion periplasmic adaptor
E4F39_07755   1      14       -0.729820  ATP-binding cassette domain-containing protein
E4F39_07760   0      10       -0.196200  sulfotransferase
E4F39_07765   0       9        0.132815  hypothetical protein
E4F39_07775  -1      11       -0.258754  hemolysin
E4F39_07780  -2      11       -0.508344  TolC family protein
E4F39_07790   1      14       -1.401750  outer membrane protein assembly factor BamE
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_07860  SYCDCHAPRONE  33     0.005    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 33.0 bits (75), Expect = 0.005
Identities = 21/126 (16%), Positives = 45/126 (35%), Gaps = 3/126 (2%)

Query: 898 LAPDDADAVLLRAELALDTGDFDEALSQFERLREQRPDAPESYANLIPALAALERRDDAI 957
++ D + + A +G +++A F+ L + L A+ + D AI
Sbjct: 31 ISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAI 90

Query: 958 AALQRALELNSKHPGALNNGVQFYLRTQQYDKA---MELAQRYVGAHGELASAHTMCGLV 1014
+ ++ K P + + L+ + +A + LAQ + E T +
Sbjct: 91 HSYSYGAIMDIKEPRFPFHAAECLLQKGELAEAESGLFLAQELIADKTEFKELSTRVSSM 150

Query: 1015 YHNLKA 1020
+K
Sbjct: 151 LEAIKL 156


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
E4F39_07865  RTXTOXIND  274    5e-89    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 274 bits (701), Expect = 5e-89
Identities = 94/439 (21%), Positives = 204/439 (46%), Gaps = 14/439 (3%)

Query: 43 SALGLEEASIAPARRAAALIPTVMLALLIVLVLWATFFKIDIIAAGQGKVIPSTTVQQLS 102
+ L L E ++ R A ++ L++ + + +++I+A GK+ S +++
Sbjct: 44 AHLELIETPVSRRPRLVAYF---IMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIK 100

Query: 103 TLEGGIVRELLVREGQIVKKGQPLVRLDPVVAQGAVTEQAATREGLMASIARLQAEADGK 162
+E IV+E++V+EG+ V+KG L++L + A+ + ++ R Q +
Sbjct: 101 PIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSI 160

Query: 163 ----------ATPLYPAGLKPEIVSEEEHVRAQRAEALNSTIEVLQQQRAAKQAEAADYR 212
Y + E V + ++ + + K+AE
Sbjct: 161 ELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVL 220

Query: 213 GRIPQYVNNQHLLDDQIQRMLPLVGVGSVAPNEITNLQRERGNLAAQIITTREGAAQASA 272
RI +Y N + ++ L+ ++A + + + + ++ + Q +
Sbjct: 221 ARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIES 280

Query: 273 QIAEASHKIEEKISTFRSEAREELARKQVQLQALEGTLSGKQDILDRTLIRSPVNGIVKT 332
+I A + + F++E ++L + + L L+ ++ ++IR+PV+ V+
Sbjct: 281 EILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQ 340

Query: 333 LYITTIGGVASPGKSVIDIVPTNDSLLIEARIQPQDIAYIRVGDDAKVRITAFDSGALGS 392
L + T GGV + ++++ IVP +D+L + A +Q +DI +I VG +A +++ AF G
Sbjct: 341 LKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGY 400

Query: 393 LDAKVELISPDSQADERSGSLYYKVQVRTHSSVVATQVGDLNILPGMVADVDVITGRRTI 452
L KV+ I+ D+ D+R G L + V + + ++T ++ + GM ++ TG R++
Sbjct: 401 LVGKVKNINLDAIEDQRLG-LVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGMRSV 459

Query: 453 MSYILRPIVRGMSRAMSER 471
+SY+L P+ ++ ++ ER
Sbjct: 460 ISYLLSPLEESVTESLRER 478


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
E4F39_07885  INTIMIN  47     1e-06    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 47.4 bits (112), Expect = 1e-06
Identities = 55/247 (22%), Positives = 86/247 (34%), Gaps = 25/247 (10%)

Query: 902 ADGTHSLTASAVDLAGNTSPASSTLPVTVDTINPPPALTLSPLSDTFGSGTSGTNH--DN 959
+ +TA A D GN+S + L +TV ++ + ++D TS +
Sbjct: 521 GSNVYKVTARAYDRNGNSS-NNVLLTITV--LSNGQVVDQVGVTDFTADKTSAKADGTEA 577

Query: 960 ITSATLPTFNGTAAAGSYVQLYDVTGGTTVSMGSAVADSSGGWTTTLTSPLSGSASGVSH 1019
IT NG A A V V+G +S SA + SG T TL S G
Sbjct: 578 ITYTATVKKNGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQ------ 631

Query: 1020 TLVAVGVDPAGNTSTVSGPDVVVIDNAAAALPGPTMASASDTGASSSDGIT-SNTAPVFT 1078
V V A TS ++ V+ +D A++ T A T A ++ + T V
Sbjct: 632 --VVVSAKTAEMTSALNANAVIFVDQTKASI---TEIKADKTTAVANGQDAITYTVKVMK 686

Query: 1079 GTGAEAGALVTIYANGTSVG--HATADASGNYTIQ--SNALGADGRYQITAQQVDIAGNT 1134
G + VT + D +G + S G + + +V
Sbjct: 687 GDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGK----SLVSARVSDVAVD 742

Query: 1135 SPSSSVT 1141
+ V
Sbjct: 743 VKAPEVE 749



Score = 45.4 bits (107), Expect = 4e-06
Identities = 63/362 (17%), Positives = 110/362 (30%), Gaps = 39/362 (10%)

Query: 359 GHTVSTIADSNGNYSVQAPGTLAEGNNVFTVQ--AVDKAGNTSGTAQQNVTLDTVAATLP 416
G + + S +Y P + G+NV+ V A D+ GN+S +T+ L
Sbjct: 497 GQIQHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDRNGNSSNNVLLTITV------LS 550

Query: 417 APQL-------DHGSDTGASNSDGITRATQPVLTGGGAEPNALVTVYADGVSIGQ----- 464
Q+ D +D ++ +DG T A V V + VS
Sbjct: 551 NGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQANVPVSFNIVSGTAVLSAN 610

Query: 465 -ATADSLGHYTIHSGVLADGTHQITARQIDIAGNTSALSGAALVTIDTSEPAPANLKLVD 523
A + G T+ L A TSAL+ A++ +D ++ + +K
Sbjct: 611 SANTNGSGKATV---TLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASITEIKADK 667

Query: 524 DTFGLHTAGTPSDGLTKDSRVTISGTASAGDVVTLMD--GATSVGQVTADASGNWTIQTA 581
T D +T +V + VT G S D +G +
Sbjct: 668 TTA----VANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLT 723

Query: 582 SLADGTHSLTASAVDLAGNTSPASSTLPVTVDTINPPPALTLSPLSDTFGSGTSGTNHDN 641
S T ++ S + + + + + G+G G
Sbjct: 724 -------STTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNI-EIVGTGVKGKLPTV 775

Query: 642 ITSATLPTFNGTAAAGSYVQLYDVTGGTTVSVGSAVADSSGGWTTTLTSPLSGSASGVSH 701
+ G Y +V S TTT++ +S ++
Sbjct: 776 WLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTTISV-ISSDNQTATY 834

Query: 702 TL 703
T+
Sbjct: 835 TI 836



Score = 38.5 bits (89), Expect = 5e-04
Identities = 53/259 (20%), Positives = 91/259 (35%), Gaps = 17/259 (6%)

Query: 798 GADGRYQITAQQVDIAGNTSPSSSVTAMTLDTSEPAPVNLHLVDDTFGQGTAGTS--SDN 855
G Y++TA+ D GN+S + +T L S V+ V D T+ + ++
Sbjct: 520 GGSNVYKVTARAYDRNGNSSNNVLLTITVL--SNGQVVDQVGVTDFTADKTSAKADGTEA 577

Query: 856 LTKDSRVTISGTASAGD--VVTLMDGATSVGQVTADASGNWTIQTASLADGTHSLTASAV 913
+T + V +G A A ++ G + +A+ +G+ T +L +
Sbjct: 578 ITYTATVKKNGVAQANVPVSFNIVSGTAVLSANSANTNGSGKA-TVTLKSDKPGQVVVSA 636

Query: 914 DLAGNTSPASSTLPVTVDTINPPPALTLSPLSDTFGSGTSGTNHDNITSATLPTFNGTAA 973
A TS ++ + VD + T + D IT
Sbjct: 637 KTAEMTSALNANAVIFVDQTKASIT-EIKADKTTAVAN----GQDAITYTVKVMKGDKPV 691

Query: 974 AGSYVQLYDVTGGTTVSMGSAVADSSGGWTTTLTSPLSGSASGVSHTLVAVGVDPAGNTS 1033
+ V T +S + D++G TLTS G + VS + V VD
Sbjct: 692 SNQEVTF--TTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSL-VSARVSDVAVDVK--AP 746

Query: 1034 TVSGPDVVVIDNAAAALPG 1052
V + ID+ + G
Sbjct: 747 EVEFFTTLTIDDGNIEIVG 765



Score = 37.4 bits (86), Expect = 0.001
Identities = 28/146 (19%), Positives = 51/146 (34%), Gaps = 8/146 (5%)

Query: 1853 ADGTYTFSAVAVDVAGNTSNPGVPVQVVVDTHAAAPSITLGTPYDTFGTGTSGTNSDELT 1912
Y +A A D GN+SN V + + V ++ T + T ++ +T
Sbjct: 521 GSNVYKVTARAYDRNGNSSNN-VLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAIT 579

Query: 1913 RNTIPYMYGVAEPGARV--TVVENGNTIGTVNA-DSSTGSYSIQIPPATVDGTYTFQAMQ 1969
GVA+ V +V + +A + +G ++ +
Sbjct: 580 YTATVKKNGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVV----S 635

Query: 1970 VDVAGNTSAYSAPNYVTIDTVAATPT 1995
A TSA +A + +D A+ T
Sbjct: 636 AKTAEMTSALNANAVIFVDQTKASIT 661


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
E4F39_07895  OMPADOMAIN  113    4e-31    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 113 bits (283), Expect = 4e-31
Identities = 59/180 (32%), Positives = 89/180 (49%), Gaps = 12/180 (6%)

Query: 123 QYQVRF--LGGLAYRGYWADSACRDIAARYADAAGLGVIAVAPCNPSDVAAPLPERVELP 180
Q+ + R ++ R+ V+A AP +V L
Sbjct: 163 QWTNNIGDAHTIGTRP-DNGMLSLGVSYRFGQGEAAPVVAPAPAPAPEVQTK---HFTLK 218

Query: 181 TDTLFAFDKGGFEDISADGRRQLGDLVASIKAKILSINHLIVTGYTDRLGSDEHNARLSS 240
+D LF F+K + +G+ L L + + ++V GYTDR+GSD +N LS
Sbjct: 219 SDVLFNFNK---ATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSE 275

Query: 241 ERARTVADYMIAEGIPAAKITAVGRGAADPVV--VCNNGEQ-PELIRCLQKNRRVEIRIK 297
RA++V DY+I++GIPA KI+A G G ++PV C+N +Q LI CL +RRVEI +K
Sbjct: 276 RRAQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEVK 335


71  E4F39_08170  E4F39_08215  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_08170  -3      11       -0.936664  TetR/AcrR family transcriptional regulator
E4F39_08180  -2      13        0.043820  efflux RND transporter periplasmic adaptor
E4F39_08190  -1      16        0.977351  efflux RND transporter permease subunit
E4F39_08200   1      17        1.692106  efflux transporter outer membrane subunit
E4F39_08205   2      17        1.908158  hypothetical protein
E4F39_08210   1      15        1.293892  hypothetical protein
E4F39_08215   2      16        1.616290  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
E4F39_08305  HTHTETR  62     6e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 61.6 bits (149), Expect = 6e-14
Identities = 41/203 (20%), Positives = 76/203 (37%), Gaps = 10/203 (4%)

Query: 11 RLTREQSKDLTRERLLSAAHAIFTKKGYVAASVEDIASAAGYTRGAFYSNFRSKAELLIE 70
R T++++++ TR+ +L A +F+++G + S+ +IA AAG TRGA Y +F+ K++L E
Sbjct: 3 RKTKQEAQE-TRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE 61

Query: 71 LLKRDHEEAEADLQKIFE--SGGTREQMEA---HALEYYSQFFRNNPAFLLWGEAKLQAT 125
+ + + G + H LE R +
Sbjct: 62 IWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVG 121

Query: 126 RDAKFRARFNEFVKEKRDRFTHYILTFAERVGTPLLLPADVLALGLMSLCDGVQSYHAAD 185
A + E DR + E P L A+ + G+
Sbjct: 122 EMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFA 181

Query: 186 PRHVTGDAAQQVLAGFFARVVLA 208
P+ D ++ A + ++L
Sbjct: 182 PQSF--DLKKE--ARDYVAILLE 200


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
E4F39_08310  RTXTOXIND  37     1e-04    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 36.7 bits (85), Expect = 1e-04
Identities = 30/200 (15%), Positives = 59/200 (29%), Gaps = 32/200 (16%)

Query: 1 MNRSGSRAALLIGVALIAAACHRKEAAPSAPRPVVAVPAQADGAAAAVSLPGEIQPRYAT 60
+ SR L+ ++ + +VA A+G EI+P
Sbjct: 49 IETPVSRRPRLVAYFIMGFLVIAFILSVLGQVEIVAT---ANGKLTHSGRSKEIKP---- 101

Query: 61 PLSFRIAGKLVER-KVRLGDIVKKGQVVALLDTSDVARSAASAQAQLDAATHALTFAQQQ 119
I +V+ V+ G+ V+KG V+ L A+A +L A+ +
Sbjct: 102 -----IENSIVKEIIVKEGESVRKGDVLLKLTALG-------AEADTLKTQSSLLQARLE 149

Query: 120 RERDRA--QARENLIAPAQLEQTENAYASARAQRDQAAQQLA----------LAKNQLQY 167
+ R + ++ E P E + + + L + +L
Sbjct: 150 QTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNL 209

Query: 168 ATLVADHAGYITAEQADTGQ 187
A+ +
Sbjct: 210 DKKRAERLTVLARINRYENL 229



Score = 34.8 bits (80), Expect = 5e-04
Identities = 10/71 (14%), Positives = 27/71 (38%)

Query: 100 ASAQAQLDAATHALTFAQQQRERDRAQARENLIAPAQLEQTENAYASARAQRDQAAQQLA 159
+ A+++ + + + + + + IA + + EN Y A + QL
Sbjct: 217 LTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLE 276

Query: 160 LAKNQLQYATL 170
++++ A
Sbjct: 277 QIESEILSAKE 287


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_08315  ACRIFLAVINRP  432    e-136    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 432 bits (1113), Expect = e-136
Identities = 223/1062 (20%), Positives = 423/1062 (39%), Gaps = 75/1062 (7%)

Query: 13 LSAWALRHQALVVYLIALATIAGILAYSRLAQSEDPPFTFRVMVIRTFWPGATARQVQEQ 72
++ + +R L + +AG LA +L ++ P + + +PGA A+ VQ+
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 73 VTDRIGRKLQEMPAIDYLRSYS-RPGESMLFFAMKDSAPVKDVPQTWYQVRKKVGDISMT 131
VT I + + + + Y+ S S G + + QV+ K+ +
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQV---QVQNKLQLATPL 117

Query: 132 LPPGVQGP-FFNDEFGDVYTNIYTLEGDG--FSPAQLHDYAD-QLRVVLLRVPGVAKVDY 187
LP VQ ++ Y + D + + DY ++ L R+ GV V
Sbjct: 118 LPQEVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQL 177

Query: 188 FGDPDQRIFVEIDNTRLARLGISPQQIAQAINAQNDVASPGVLTAAHD------RVFIRP 241
FG + + +D L + ++P + + QND + G L I
Sbjct: 178 FG-AQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIA 236

Query: 242 SGQYESVAAIADTLIRVN--GRTFRLGELATIKRGYDDPPVTQMRTIGRDTNGRAVLGIG 299
++++ +RVN G RL ++A ++ G + + NG+ G+G
Sbjct: 237 QTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELG------GENYNVIARINGKPAAGLG 290

Query: 300 VTMQPGGDVIRLGKALDASAKALQAQLPAGLALTEVSSMPHAVARSVDDFLEAVAEAVAI 359
+ + G + + KA+ A LQ P G+ + V S+ + ++ + EA+ +
Sbjct: 291 IKLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIML 350

Query: 360 VLIVSLVSLG-LRTGMVVVISIPVVLAVTALFMYLFDIGLHKVSLGTLVLALGLLVDDAI 418
V +V + L +R ++ I++PVVL T + F ++ +++ +VLA+GLLVDDAI
Sbjct: 351 VFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAI 410

Query: 419 IAVEMMA-VKLEQGFSRARAAAFAYTSTAFPMLTGTLVTVSGFLPIALAKSSTGEYTRSI 477
+ VE + V +E A + + ++ +V + F+P+A STG R
Sbjct: 411 VVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQF 470

Query: 478 FEVSAIALIASWFAAVVLIPLLGYHMLPERKHPRQDAAGAPHAP-DAAHDHAHGHDIYDT 536
A+ S A++L P L +L + G + DH+
Sbjct: 471 SITIVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHS-------V 523

Query: 537 RFYTRLRVWIKWCIERRFIVLAITIALFVVALAGFSLVPQQFFPSSDRPELLVDLRLPEG 596
YT + + L I + + F +P F P D+ L ++LP G
Sbjct: 524 NHYTNS---VGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAG 580

Query: 597 ASFNATLKEAERLEKLIAK--RPEIDHAVNFVGSGAPRFYLPLDQQLQLPNFAQFVITAK 654
A+ T K +++ K + ++ G Q N ++ K
Sbjct: 581 ATQERTQKVLDQVTDYYLKNEKANVESVFTVNGFSFSG---------QAQNAGMAFVSLK 631

Query: 655 SVDAR---EKLSAWLAPVLREQFPAARTRISRLENGPPV-------GYPVQ-FRVSGDSI 703
+ R E + + + + R N P + G+ + +G
Sbjct: 632 PWEERNGDENSAEAVIHRAKMELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGH 691

Query: 704 ATVRAIAEKVAATMR---ADARATNVQFDWDEPAERSVRFELDQHKARELNVSSQDVASF 760
+ ++ A + D + E+DQ KA+ L VS D+
Sbjct: 692 DALTQARNQLLGMAAQHPASLVSVRPNGLEDTA---QFKLEVDQEKAQALGVSLSDINQT 748

Query: 761 LAMTLSGTTLTQYRERDKLIAVDLRAPRAQRVDPASLAGLAMPTPNG-PVPLGSLGRFHD 819
++ L GT + + +R ++ + ++A R+ P + L + + NG VP + H
Sbjct: 749 ISTALGGTYVNDFIDRGRVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHW 808

Query: 820 TLEYGVVWERDRQPTITVQSDVTAGAQGIDVTHAIDAKLDALRAQLPVGYRIEIGGSVEE 879
+ + P++ +Q + G + A ++ L ++LP G + G +
Sbjct: 809 VYGSPRLERYNGLPSMEIQGEAAPG----TSSGDAMALMENLASKLPAGIGYDWTGMSYQ 864

Query: 880 STKGQTSINAQMPLMVIAVLTLLMIQLQSFSRVLMVVLTAPLGMIGVVGTLLLFGKPFGF 939
A + + + V L +S+S + V+L PLG++GV+ LF +
Sbjct: 865 ERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDV 924

Query: 940 VAMLGVIAMFGIIMRNSVILVDQIEQDIAA-GHGRFDAIVGATVRRFRPITLTAAAAVLA 998
M+G++ G+ +N++++V+ + + G G +A + A R RPI +T+ A +L
Sbjct: 925 YFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILG 984

Query: 999 LIPLLRSNFFG-----PMATALMGGITSATVLTLFFLPALYA 1035
++PL SN G + +MGG+ SAT+L +FF+P +
Sbjct: 985 VLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFFVPVFFV 1026



Score = 83.7 bits (207), Expect = 2e-18
Identities = 91/535 (17%), Positives = 182/535 (34%), Gaps = 67/535 (12%)

Query: 550 IERRFIVLAITIALFVVALAGFSLVPQQFFPSSDRPELLVDLRLPEGASFNATLKEAER- 608
I R + I L + +P +P+ P + V P A + +
Sbjct: 6 IRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYP-----GADAQTVQDT 60

Query: 609 ----LEKLIAKRPEIDH---AVNFVGSG--APRFYLPLDQQLQLPNFAQFVITAKSVDAR 659
+E+ + + + + GS F D P+ AQ V +
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTD-----PDIAQ-------VQVQ 108

Query: 660 EKLSAWLAPVLREQFP-AARTRISRLENGPPVGYPVQFRVSGDSIATVRAIAEKVAATMR 718
KL P + + +E V VS + T I++ VA+ ++
Sbjct: 109 NKLQL-----ATPLLPQEVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVK 163

Query: 719 ADAR----ATNVQFDWDEPAERSVRFELDQHKARELNVSSQDVASFL--------AMTLS 766
+VQ A+ ++R LD + ++ DV + L A L
Sbjct: 164 DTLSRLNGVGDVQLF---GAQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLG 220

Query: 767 GTTLTQYRERDKLIAVDLRAPRAQRVDPASLAGLAMPTPNG-PVPLGSLGRFHDTLE-YG 824
GT ++ + I R + +L +G V L + R E Y
Sbjct: 221 GTPALPGQQLNASIIAQTRFKNPEEFGKVTLRV----NSDGSVVRLKDVARVELGGENYN 276

Query: 825 VVWERDRQPTITVQSDVTAGAQGIDVTHAIDAKLDALRAQLPVGYRIEI----GGSVEES 880
V+ + +P + + GA +D AI AKL L+ P G ++ V+ S
Sbjct: 277 VIARINGKPAAGLGIKLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLS 336

Query: 881 TKGQTSINAQMPLMVIAVLTLLMIQLQSFSRVLMVVLTAPLGMIGVVGTLLLFGKPFGFV 940
+ +++ L + + LQ+ L+ + P+ ++G L FG +
Sbjct: 337 I--HEVVKTLFEAIMLVFLVMYLF-LQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTL 393

Query: 941 AMLGVIAMFGIIMRNSVILVDQIEQDIAAGHGRF-DAIVGATVRRFRPITLTAAAAVLAL 999
M G++ G+++ +++++V+ +E+ + +A + + + A
Sbjct: 394 TMFGMVLAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVF 453

Query: 1000 IPLL-----RSNFFGPMATALMGGITSATVLTLFFLPALYAAWFRVKPDERDPEP 1049
IP+ + + ++ + + ++ L PAL A + E
Sbjct: 454 IPMAFFGGSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENK 508


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
E4F39_08340  PF03544  28     0.027    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 28.4 bits (63), Expect = 0.027
Identities = 19/123 (15%), Positives = 31/123 (25%), Gaps = 2/123 (1%)

Query: 82 VSAPPSASTAVSTARSRLPSPLLLSAPASAPMTARARPSARRGPAPRASMGSVHVAARER 141
+ AP + A + L P + P P P P V + +
Sbjct: 43 LPAPAQPISVTMVAPADLEPPQAVQPPPEPV--VEPEPEPEPIPEPPKEAPVVIEKPKPK 100

Query: 142 EPSSRRAPGIPAVSEPMREPRSDAQASAEAGDAQRRLPRAPGVAADWRADLDSLGAARPL 201
+ + +P AS A R + AA + R L
Sbjct: 101 PKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASGPRAL 160

Query: 202 RQA 204
+
Sbjct: 161 SRN 163


72  E4F39_09755  E4F39_09785  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_09755  10      51       -8.248405  SPFH/Band 7/PHB domain protein
E4F39_09760  11      51       -8.475325  NfeD family protein
E4F39_09765  11      49       -8.405455  phosphoenolpyruvate synthase
E4F39_09770  10      46       -9.426587  hypothetical protein
E4F39_09775   9      46       -9.254589  glutathione gamma-glutamylcysteinyltransferase
E4F39_09780   9      49       -9.366435  hypothetical protein
E4F39_09785   4      25       -7.097820  peptidase S53
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
E4F39_09865  RTXTOXINA  30     0.021    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 29.5 bits (66), Expect = 0.021
Identities = 18/92 (19%), Positives = 35/92 (38%), Gaps = 8/92 (8%)

Query: 221 INQAQGEAAAILAVAEANSQAIQKIAQAIQSQGGMDAVNLKVAEQYVGAFGNLAKAGNTL 280
INQ A++ + SQ + + + + ++ V K+ NL G L
Sbjct: 188 INQLVDTVASLNNNVNSFSQQLNTLGSVLSNTKHLNGVGNKLQN-----LPNLDNIGAGL 242

Query: 281 IVPSNLSDLSTAIASALTIVNRSAPGALAPGA 312
+S + +AI+++ + N A A
Sbjct: 243 DT---VSGILSAISASFILSNADADTRTKAAA 271


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_09875  PHPHTRNFRASE  265    4e-81    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 265 bits (678), Expect = 4e-81
Identities = 99/441 (22%), Positives = 171/441 (38%), Gaps = 73/441 (16%)

Query: 383 IHDPSEMERVQPGDVLVADMTDPNWEPVMK-RASAIVTNRGGRTCHAAIIARELGVPAVV 441
+ S + ++ D+T + + K T+ GGRT H+AI++R L +PAVV
Sbjct: 145 VETGSLATIAEETVIIAEDLTPSDTAQLNKQFVKGFATDIGGRTSHSAIMSRSLEIPAVV 204

Query: 442 GCGDATDVLKDGALVTVSCAEGDEGKIYDGLLETEVSEVQRGE------------LPSVP 489
G + T+ ++ G +V V G EG + E EV + L P
Sbjct: 205 GTKEVTEKIQHGDMVIVD---GIEGIVIVNPTEEEVKAYEEKRAAFEKQKQEWAKLVGEP 261

Query: 490 --------VKIMMNVGNPQLAFDFSQLPNAGVGLARLEFIINNNIGVHPKAILEYPNVDA 541
V++ N+G P+ G+GL R EF+ + P
Sbjct: 262 STTKDGAHVELAANIGTPKDVDGVLANGGEGIGLYRTEFLYMDR-DQLPT---------- 310

Query: 542 DLKKAVESVARGHASPRAFYVDKLTEGIATIAAAFYPKPVIVRLSDFKSNEYKKLIGGSR 601
++ E + KPV++R D ++ +
Sbjct: 311 --------------------EEEQFEAYKEVVQRMDGKPVVIRTLDIGGDKELSYL---- 346

Query: 602 YEPDEENPMLGFRGASRYIAEDFAQAFEMECMALKRVRDEMGLTNVEIMVPFVRTVKQAE 661
P E NP LGFR + + F + AL R N+++M P + T+++
Sbjct: 347 QLPKELNPFLGFRAIRLCL--EKQDIFRTQLRALLRAS---TYGNLKVMFPMIATLEELR 401

Query: 662 RVVGLLGKFGLKRGDNG------LRLIMMCEVPSNAILAEEFLQHFDGFSIGSNDLTQLT 715
+ ++ + K G + + +M E+PS A+ A F + D FSIG+NDL Q T
Sbjct: 402 QAKAIMQEEKDKLLSEGVDVSDSIEVGIMVEIPSTAVAANLFAKEVDFFSIGTNDLIQYT 461

Query: 716 LGLDRDSGMELLAVDFDERDPAVKFMLKRAIDTCRKLDKYVGICGQGPSDHPDFAKWLAD 775
+ DR + E ++ + PA+ ++ I K+VG+CG+ D L
Sbjct: 462 MAADRMN--ERVSYLYQPYHPAILRLVDMVIKAAHSEGKWVGMCGEMAGD-EVAIPLLLG 518

Query: 776 EGIASISLNPDTVIETWQALA 796
G+ S++ +++ L
Sbjct: 519 LGLDEFSMSATSILPARSQLL 539


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_09890  INFPOTNTIATR  25     0.019    Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 25.3 bits (55), Expect = 0.019
Identities = 11/30 (36%), Positives = 17/30 (56%)

Query: 1 MPRVRPAAIAIAIAIAIATATATATATDTD 30
M V A + +A++ A+A AT+ TD D
Sbjct: 3 MKLVTAAIMGLAMSTAMAATDATSLTTDKD 32


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
E4F39_09895  SUBTILISIN  41     7e-06    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 41.4 bits (97), Expect = 7e-06
Identities = 21/122 (17%), Positives = 36/122 (29%), Gaps = 22/122 (18%)

Query: 342 VLYAAPSMLLSDITSAYNRAVVDNVAKVINVSLGVCEADARASGTQAADDRIFKSAVAQG 401
VL S I A+ + +I++SLG A + AVA
Sbjct: 117 VLNKQGSGQYDWIIQGIYYAI-EQKVDIISMSLG---GPEDVPELHEAVKK----AVASQ 168

Query: 402 QTFVVAAGDAGAYECSVSRVSGGQGVPARSNYSVSEPATSPYVVAVGGTTLSTDRTTLAY 461
+ AAG+ G + + P V++VG + +
Sbjct: 169 ILVMCAAGNEGDGDDRTD--------------ELGYPGCYNEVISVGAINFDRHASEFSN 214

Query: 462 AG 463
+
Sbjct: 215 SN 216


73  E4F39_09985  E4F39_10020  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_09985  0       13       -1.177912  DNA repair protein RadA
E4F39_09990  -1      11       -0.229028  alanine racemase
E4F39_09995  -1      10       -0.221434  lysophospholipid transporter LplT
E4F39_10000  -1      9        0.821432   bifunctional hydroxymethylpyrimidine
E4F39_10005  2       12       1.382654   serine/arginine repetitive matrix 1
E4F39_10010  1       10       1.710638   DUF1853 family protein
E4F39_10015  1       10       2.094333   uracil-DNA glycosylase
E4F39_10020  2       11       2.618823   ribosomal-protein-alanine N-acetyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
E4F39_10095  TCRTETOQM  31     0.011    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 31.0 bits (70), Expect = 0.011
Identities = 16/79 (20%), Positives = 27/79 (34%), Gaps = 17/79 (21%)

Query: 104 LLQSLAQIASERPALYISGEESGAQIALRAQRLALLEGGASAADLKLLAEIQLEKIQATI 163
LL +L +I+ P L + + +I L L ++Q+E A +
Sbjct: 361 LLDALLEISDSDPLLRYYVDSATHEIILS-----------------FLGKVQMEVTCALL 403

Query: 164 DAERPDVAVIDSIQTIYSE 182
+ I IY E
Sbjct: 404 QEKYHVEIEIKEPTVIYME 422


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
E4F39_10100  ALARACEMASE  438    e-156    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 438 bits (1127), Expect = e-156
Identities = 207/353 (58%), Positives = 270/353 (76%)

Query: 1 MPRPISATIHTAALANNLSVVRRHAAQSKVWAIVKANAYGHGLARVFPGLRGTDGFGLLD 60
M RPI A++ AL NLS+VR+ A ++VW++VKANAYGHG+ R++ + TDGF LL+
Sbjct: 1 MTRPIQASLDLQALKQNLSIVRQAATHARVWSVVKANAYGHGIERIWSAIGATDGFALLN 60

Query: 61 LDEAVKLRELGWAGPILLLEGFFRSTDIDVIDRYSLTTAVHNDEQMRMLETARLSKPVNV 120
L+EA+ LRE GW GPIL+LEGFF + D+++ D++ LTT VH++ Q++ L+ ARL P+++
Sbjct: 61 LEEAITLRERGWKGPILMLEGFFHAQDLEIYDQHRLTTCVHSNWQLKALQNARLKAPLDI 120

Query: 121 QLKMNSGMNRLGYTPEKYRAAWERARACPGIGQITLMTHFSDADGERGVAEQMATFERGA 180
LK+NSGMNRLG+ P++ W++ RA +G++TLM+HF++A+ G++ MA E+ A
Sbjct: 121 YLKVNSGMNRLGFQPDRVLTVWQQLRAMANVGEMTLMSHFAEAEHPDGISGAMARIEQAA 180

Query: 181 QGIAGARSFANSAAVLWHPSAHFDWVRPGIMLYGASPSGRAADIADRGLKPTMTLASELI 240
+G+ RS +NSAA LWHP AHFDWVRPGI+LYGASPSG+ DIA+ GL+P MTL+SE+I
Sbjct: 181 EGLECRRSLSNSAATLWHPEAHFDWVRPGIILYGASPSGQWRDIANTGLRPVMTLSSEII 240

Query: 241 AVQTLAKGQAVGYGSMFVAEDTMRIGVVACGYADGYPRIAPEGTPVVVDGVRTRIVGRVS 300
VQTL G+ VGYG + A D RIG+VA GYADGYPR AP GTPV+VDGVRT VG VS
Sbjct: 241 GVQTLKAGERVGYGGRYTARDEQRIGIVAAGYADGYPRHAPTGTPVLVDGVRTMTVGTVS 300

Query: 301 MDMLTVDLTPVPQAGVGARVELWGETLPIDDVAARCMTVGYELMCAVAPRVPV 353
MDML VDLTP PQAG+G VELWG+ + IDDVAA TVGYELMCA+A RVPV
Sbjct: 301 MDMLAVDLTPCPQAGIGTPVELWGKEIKIDDVAAAAGTVGYELMCALALRVPV 353


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
E4F39_10105  TCRTETB  29     0.049    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 28.7 bits (64), Expect = 0.049
Identities = 31/139 (22%), Positives = 54/139 (38%), Gaps = 4/139 (2%)

Query: 13 FFSSLADSALLIAAIALLKDLHAPNWMIPLLKLFFVLSYVVLAAFVGAFADSRPKGHVMF 72
FFS L + L ++ + D + P + F+L++ + A G +D ++
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 73 ITNSIKVVGCLIMLFGAHP----LIAYGIVGFGAAAYSPAKYGILTELLPPERLVAANGW 128
I G +I G ++A I G GAAA+ ++ +P E A G
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 129 IEGTTVGSIILGTVLGGAL 147
I +G +GG +
Sbjct: 144 IGSIVAMGEGVGPAIGGMI 162


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_10115  ECOLNEIPORIN  39     2e-06    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 38.6 bits (90), Expect = 2e-06
Identities = 13/59 (22%), Positives = 18/59 (30%), Gaps = 7/59 (11%)

Query: 7 RPRHARYRAVTSFGFGFGFGF------GFGFGFGFGFGFGFGFGFGFGVQYAASRATRA 59
R RY + G + G + GF + G GF VQY +
Sbjct: 143 RLISVRYDSPEFAGLSGSVQYALNDNAGRHNSESYHAGFNYKNG-GFFVQYGGAYKRHH 200


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_10135  SACTRNSFRASE  44     3e-08    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 44.2 bits (104), Expect = 3e-08
Identities = 21/71 (29%), Positives = 32/71 (45%)

Query: 79 VAPVAQRSGVGLALLREAVRIARAERLDGVLLEVRPSNPRAIRLYERFGFVSVGRRRNYY 138
VA ++ GVG ALL +A+ A+ G++LE + N A Y + F+ Y
Sbjct: 97 VAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGAVDTMLY 156

Query: 139 PAKHRSREDAI 149
+ E AI
Sbjct: 157 SNFPTANEIAI 167


74  E4F39_10335  E4F39_10405  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_10335  -1      11       1.076526   MFS transporter
E4F39_10340  -2      12       0.888456   amino acid adenylation domain-containing
E4F39_10345  -1      12       0.638478   hypothetical protein
E4F39_10350  -2      14       -0.722475  hypothetical protein
E4F39_10360  -2      16       0.584811   hypothetical protein
E4F39_10365  -2      13       0.937749   efflux RND transporter periplasmic adaptor
E4F39_10375  0       13       2.110160   ACR family transporter
E4F39_10380  -2      14       2.624604   hypothetical protein
E4F39_10385  0       13       1.953341   autotransporter domain-containing protein
E4F39_10390  0       14       1.316049   acetyltransferase
E4F39_10395  -1      14       2.284389   hypothetical protein
E4F39_10400  0       11       2.589511   hypothetical protein
E4F39_10405  1       11       3.021616   *aspartate kinase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
E4F39_10435  TCRTETA  38     5e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 38.3 bits (89), Expect = 5e-05
Identities = 57/271 (21%), Positives = 97/271 (35%), Gaps = 13/271 (4%)

Query: 74 AFTLPIALFALLSGVAADAWDRRTVMLLSQALMFSVALCLVALAAAGAMTPARLLVCMFV 133
+ L A + G +D + RR V+L+S +V ++A A + L + V
Sbjct: 51 LYALMQFACAPVLGALSDRFGRRPVLLVS-LAGAAVDYAIMATAPFLWV----LYIGRIV 105

Query: 134 GGCAGAMFQPAWQSAVTEQVPARELSAAIALDSFSMNFARTAGPALGGFIVASVSPNAAF 193
G GA + + + E + S F AGP LGG + SP+A F
Sbjct: 106 AGITGATG-AVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLM-GGFSPHAPF 163

Query: 194 V---LSGLSYAGLIYALSRSIRGAAARPPVRERLATMLVQGVRYCGRARGIRGTLIRSSL 250
L RP R A + R+ + + +
Sbjct: 164 FAAAALNGLNFLTGCFLLPESHKGERRP--LRREALNPLASFRWARGMTVVAALMAVFFI 221

Query: 251 FGFLGSPVWALLPLFAKTQFGGEARTYGVLLASFGA-GAASGALGGAAGRARLGREALVR 309
+G AL +F + +F +A T G+ LA+FG + + A+ ARLG +
Sbjct: 222 MQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALM 281

Query: 310 LCTLTFAAGMLATAWSPCQAVAMLGLAVAGG 340
L + G + A++ +A + +
Sbjct: 282 LGMIADGTGYILLAFATRGWMAFPIMVLLAS 312



Score = 35.2 bits (81), Expect = 5e-04
Identities = 31/167 (18%), Positives = 58/167 (34%), Gaps = 8/167 (4%)

Query: 21 LAALRGPFAYRTFAAIWVAS-LVGNIGGSIQTVAASWLMTSMAPSPTMVSLVQTAFTLPI 79
LA+ R AA+ ++ +G + + T + + AF +
Sbjct: 200 LASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILH 259

Query: 80 ALF-ALLSGVAADAWDRRTVMLLSQALMFSVALCLVALAAAGAMTPARLLVCMFVGGCAG 138
+L A+++G A R ++L M + + LA A A ++ + G
Sbjct: 260 SLAQAMITGPVAARLGERRALMLG---MIADGTGYILLAFATRGWMAFPIMVLLASG--- 313

Query: 139 AMFQPAWQSAVTEQVPARELSAAIALDSFSMNFARTAGPALGGFIVA 185
+ PA Q+ ++ QV + + GP L I A
Sbjct: 314 GIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYA 360


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
E4F39_10460  FLGHOOKFLIK  26     0.024    Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 25.6 bits (55), Expect = 0.024
Identities = 13/45 (28%), Positives = 17/45 (37%)

Query: 36 PGDPVGPKAPPARGPGARRTRAAPRAPATPGARPLSAGATTPPRP 80
PG P P P ++ + +P T A PL T P P
Sbjct: 180 PGTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLP 224


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
E4F39_10465  RTXTOXIND  43     2e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 42.9 bits (101), Expect = 2e-06
Identities = 17/126 (13%), Positives = 42/126 (33%), Gaps = 21/126 (16%)

Query: 87 TVRSQVDGQITHVRFREGQQVRAGDVLVEIDRRALQAAADQATAKLEQDKATLANARLEL 146
++ + + + +EG+ VR GDVL+++ +A + + L Q + ++
Sbjct: 98 EIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILS 157

Query: 147 ----------------ARHQRLAEMNAAPVQML-----DTWKARVNELHAQIRGDQAAVQ 185
Q ++E + L TW+ + + + +A
Sbjct: 158 RSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERL 217

Query: 186 NARVAV 191
+
Sbjct: 218 TVLARI 223


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_10470  ACRIFLAVINRP  755    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 755 bits (1951), Expect = 0.0
Identities = 273/1033 (26%), Positives = 495/1033 (47%), Gaps = 26/1033 (2%)

Query: 9 FIRYPVATCLMTAGILFAGVAAYFHLPVAPLPQVEFPTIQVSAVLPGADPVSVASTLAQP 68
FIR P+ ++ ++ AG A LPVA P + P + VSA PGAD +V T+ Q
Sbjct: 5 FIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQV 64

Query: 69 LETQFSKIPYVTQMTSQSTLS-STSIVLQFSLERSIDAAANDVQSAIDAAAAQLPADLPS 127
+E + I + M+S S + S +I L F D A VQ+ + A LP ++
Sbjct: 65 IEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQQ 124

Query: 128 PPTFQKVNPADSPIMLLSAISSTLPLTTID--DYVETRLTKSLSQIDGVGSVSIGGQQKP 185
+ S +M+ +S T D DYV + + +LS+++GVG V + G Q
Sbjct: 125 QGIS-VEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQY- 182

Query: 186 SIRIQLDPVKLASRGLSSEDVRRALSGLSGVNPKGVFNGT------TRSYTIYTNGQLTE 239
++RI LD L L+ DV L + G GT + +I +
Sbjct: 183 AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKN 242

Query: 240 PAQWNDAIV-AYRDGTPVRIRDIGQAVLGPEDNTLAAWIDGRRAISVGIYKKPGANTVST 298
P ++ + DG+ VR++D+ + LG E+ + A I+G+ A +GI GAN + T
Sbjct: 243 PEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALDT 302

Query: 299 VDKILARLPELEASLPPSLKIAVLADRTQTIRASLLDIELTLLLNVVLVVVVIYAFLGSV 358
I A+L EL+ P +K+ D T ++ S+ ++ TL ++LV +V+Y FL ++
Sbjct: 303 AKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQNM 362

Query: 359 RTTIIPAVTVPVSLFGACALMWVCGYSLDNISLMAMTIAVGFVVDDAIVMVENIARH-VE 417
R T+IP + VPV L G A++ GYS++ +++ M +A+G +VDDAIV+VEN+ R +E
Sbjct: 363 RATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMME 422

Query: 418 AGERPLQAALKGLSETSFTIASISLSLVAVLLPLLLMSGIIGRMFREFAVTLSMTIIVSA 477
P +A K +S+ + I++ L AV +P+ G G ++R+F++T+ + +S
Sbjct: 423 DKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSV 482

Query: 478 FVSLTLTPMMASYLLRAHRHDAGRPPRP--GLFERAFARTAAAYERALDVALRHRFVTLC 535
V+L LTP + + LL+ + G F F + Y ++ L L
Sbjct: 483 LVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYLL 542

Query: 536 AFFASVAASVFLYVGIPKGFFPQQDTGVITGISEAAQTISVEDMARHSMALAAIIRADPA 595
+ VA V L++ +P F P++D GV + + + E + + +
Sbjct: 543 IYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNEK 602

Query: 596 --VEHCQMAVGGSAYAGTTVNNGRWYITLKPRDQRDA---TADEVIRRLRPQFAKVPGVR 650
VE V G +++G N G +++LKP ++R+ +A+ VI R + + K+
Sbjct: 603 ANVES-VFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGF 661

Query: 651 MYLQAAQDVIIGARLARTQYQLTLQSA-DVGALTTWAPRLLARLSGLP-QLRDVASDQQV 708
+ ++ ++L Q+ ALT +LL + P L V +
Sbjct: 662 VIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGLE 721

Query: 709 NGSALSVAIDRDQAARYGLTPEAIDGTLYDAFGSRQVAQYFTQLSTYKVIMETLPSLQRD 768
+ + + +D+++A G++ I+ T+ A G V + + K+ ++ +
Sbjct: 722 DTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRML 781

Query: 769 PGTLDRIYMKAPSGALVPLSSVARWTTDTVQPLSVNHQSHFPSVTISFNLAPGVSLGEAT 828
P +D++Y+++ +G +VP S+ + + + PS+ I APG S G+A
Sbjct: 782 PEDVDKLYVRSANGEMVPFSAFTT-SHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAM 840

Query: 829 AAIEAARASLRMPPAVVGSFQGTAQAFQSTLATMPMLILSALIVAYLVLGALYGSFIHPW 888
A +E + ++P + + G + + + P L+ + +V +L L ALY S+ P
Sbjct: 841 ALME--NLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPV 898

Query: 889 TILSTLPSAGVGAIATLWLFKYDFNLIALIGVILLIGIVKKNGIMMVDFAIAATRERNMT 948
+++ +P VG + LF ++ ++G++ IG+ KN I++V+FA +
Sbjct: 899 SVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKG 958

Query: 949 SLDAIRSACLLRLRPIMMTTMTALFGALPLMLTPGMGSELRQPLGYAMVGGLLVSQVLTL 1008
++A A +RLRPI+MT++ + G LPL ++ G GS + +G ++GG++ + +L +
Sbjct: 959 VVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAI 1018

Query: 1009 FTTPVIYLYLDTL 1021
F PV ++ +
Sbjct: 1019 FFVPVFFVVIRRC 1031



Score = 89.9 bits (223), Expect = 3e-20
Identities = 78/509 (15%), Positives = 164/509 (32%), Gaps = 37/509 (7%)

Query: 4 NLFAVFIRYPVATCLMTAGILFAGVAAYFHLPVAPLPQVEFPTIQVSAVLP-GADPVSVA 62
N + L+ A I+ V + LP + LP+ + LP GA
Sbjct: 528 NSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQ 587

Query: 63 STLAQPLETQF---SKIPYVTQMTSQSTLSSTS-------IVLQFSLERSIDAAANDVQS 112
L Q + + + S + + L+ ER + N ++
Sbjct: 588 KVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEER--NGDENSAEA 645

Query: 113 AIDAAAAQLPADLPSPPTFQKVNPADSPIMLLSAIS---------STLPLTTIDDYVETR 163
I A + L + I+ L + + L +
Sbjct: 646 VIHRAKME----LGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQL 701

Query: 164 LTKSLSQIDGVGSVSIGGQQ-KPSIRIQLDPVKLASRGLSSEDVRRALSGLSGVNPKGVF 222
L + + SV G + ++++D K + G+S D+ + +S G F
Sbjct: 702 LGMAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDF 761

Query: 223 NGTTRSYTIYTNGQ---LTEPAQWNDAIVAYRDGTPVRIRDIGQAVLGPEDNTLAAWIDG 279
R +Y P + V +G V + L +G
Sbjct: 762 IDRGRVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLER-YNG 820

Query: 280 RRAISVGIYKKPGANTVSTVDKILARLPELEASLPPSLKIAVLADRTQTIRASLLDIELT 339
++ + PG ++ +A + L + LP + + R S
Sbjct: 821 LPSMEIQGEAAPG----TSSGDAMALMENLASKLPAGIGYDW-TGMSYQERLSGNQAPAL 875

Query: 340 LLLNVVLVVVVIYAFLGSVRTTIIPAVTVPVSLFGACALMWVCGYSLDNISLMAMTIAVG 399
+ ++ V+V + + A S + + VP+ + G + D ++ + +G
Sbjct: 876 VAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIG 935

Query: 400 FVVDDAIVMVENI-ARHVEAGERPLQAALKGLSETSFTIASISLSLVAVLLPLLLMSGII 458
+AI++VE + G+ ++A L + I SL+ + +LPL + +G
Sbjct: 936 LSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAG 995

Query: 459 GRMFREFAVTLSMTIIVSAFVSLTLTPMM 487
+ + ++ + +++ P+
Sbjct: 996 SGAQNAVGIGVMGGMVSATLLAIFFVPVF 1024


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
E4F39_10480  IGASERPTASE  30     0.026    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.4 bits (68), Expect = 0.026
Identities = 34/191 (17%), Positives = 52/191 (27%), Gaps = 22/191 (11%)

Query: 335 SDRLSLFADVGYTRNFHG--AAGGMNAFDSDVEMFSIGADYKLSEASRAGALLSSGNANG 392
S+ + L Y RN + A N + + Y G L G
Sbjct: 1331 SNNVQLGGVFTYVRNSNNFDKATSKNTL----AQVNFYSKYYADNHWYLGIDLGYGKFQS 1386

Query: 393 SLAGGQGR-IGLHAYRLGVY--HAFERAGLFVRAYAGAGWSR-----YRL--DRAAVLPG 442
L H + G+ AF + G +S + L R V P
Sbjct: 1387 KLQTNHNAKFARHTAQFGLTAGKAFNLGNFGITPIVGVRYSYLSNADFALDQARIKVNPI 1446

Query: 443 AVRASTSGFDFGALVKAGYLFALGGVRLGPVADVGYTQLVARGYTEDGDPILAQNVGVQR 502
+V+ + + D Y + LG + P+ Y G A NV Q+
Sbjct: 1447 SVKTAFAQVDLS------YTYHLGEFSVTPILSARYDANQGSGKINVNGYDFAYNVENQQ 1500

Query: 503 LKGVSAGAGVR 513

Sbjct: 1501 QYNAGLKLKYH 1511


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_10505  CARBMTKINASE  36     2e-04    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 36.0 bits (83), Expect = 2e-04
Identities = 33/119 (27%), Positives = 56/119 (47%), Gaps = 15/119 (12%)

Query: 116 IDDERVRRDLDAGKVVIITGFQGV---DPDGHITTL-GRGGSDTSAVAVAAALEADECLI 171
++ E +++ ++ G +VI +G GV DG I + D + +A + AD +I
Sbjct: 174 VEAETIKKLVERGVIVIASGGGGVPVILEDGEIKGVEAVIDKDLAGEKLAEEVNADIFMI 233

Query: 172 YTDVDGVYTTDPRVVEEARRLDSVTFEEMLEMA--------SLGSKVLQ-IRSVEFAGK 221
TDV+G E+ + L V EE+ + S+G KVL IR +E+ G+
Sbjct: 234 LTDVNGAALYYGT--EKEQWLREVKVEELRKYYEEGHFKAGSMGPKVLAAIRFIEWGGE 290


75  E4F39_11130  E4F39_11170  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_11130  0       14       -1.151630  D-alanyl-D-alanine endopeptidase
E4F39_11135  -1      11       -2.490190  hypothetical protein
E4F39_11140  -2      12       -2.527563  phasin family protein
E4F39_11145  -1      24       -1.326481  dihydrolipoyl dehydrogenase
E4F39_11150  -2      24       -0.964409  dihydrolipoyllysine-residue acetyltransferase
E4F39_11155  -3      38       -0.717453  pyruvate dehydrogenase (acetyl-transferring),
E4F39_11160  -4      34       -0.326186  PAS domain S-box protein
E4F39_11170  -4      28       -1.448984  response regulator transcription factor
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_11190  SSBTLNINHBTR  28     0.026    Streptomyces subtilisin inhibitor signature.
		>SSBTLNINHBTR#Streptomyces subtilisin inhibitor signature.

Length = 144

Score = 28.3 bits (62), Expect = 0.026
Identities = 15/50 (30%), Positives = 23/50 (46%)

Query: 29 VATAAVAPADAFAATAKTAQSAKGKKSAAKKSLRAASSSAEPRAKGARKR 78
+A+ A APA +A +A G+ +A LRA + + P A G
Sbjct: 27 LASPATAPASLYAPSALVLTVGHGESAATAAPLRAVTLTCAPTASGTHPA 76


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
E4F39_11215  RTXTOXIND  36     5e-04    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 35.6 bits (82), Expect = 5e-04
Identities = 12/58 (20%), Positives = 22/58 (37%)

Query: 49 VPSPSAGTVKEVKVKVGDAVSQGSLIVLLDGAQAAAQPAQANGAATSAAQPAAAPAAA 106
+ VKE+ VK G++V +G +++ L A A + + A
Sbjct: 99 IKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQIL 156



Score = 31.0 bits (70), Expect = 0.012
Identities = 18/91 (19%), Positives = 30/91 (32%), Gaps = 5/91 (5%)

Query: 162 VPSPAAGVVKDIKVKVGDAVSEGSLIVVLEASGGAAA--SAPQAAAPAPAPAAPAPAPAP 219
+ +VK+I VK G++V +G +++ L A G A + A +
Sbjct: 99 IKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSR 158

Query: 220 QAAPAAAPA---PAQAPAPAASGEYRASHAS 247
P P + S E S
Sbjct: 159 SIELNKLPELKLPDEPYFQNVSEEEVLRLTS 189


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
E4F39_11225  PF06580  32     0.012    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.8 bits (72), Expect = 0.012
Identities = 18/85 (21%), Positives = 31/85 (36%), Gaps = 18/85 (21%)

Query: 744 PVLIEQVLV-NLMKNAAEAMQEARPQAENGVIRVVADLEAGFVDIRVIDQGPGVDEATAE 802
P ++ Q LV N +K+ + G I + + G V + V + G + T E
Sbjct: 256 PPMLVQTLVENGIKHGIA------QLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKE 309

Query: 803 RLFEPFYSTKSDGMGMGLNICRSII 827
S G G+ N+ +
Sbjct: 310 ----------STGTGL-QNVRERLQ 323


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
E4F39_11230  HTHFIS  113    2e-31    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 113 bits (283), Expect = 2e-31
Identities = 39/153 (25%), Positives = 67/153 (43%), Gaps = 4/153 (2%)

Query: 11 TVFVVDDDEAVRDSLRWLLEANGYRVQCFSSAEQFLDAYQPAQQAGQIACLILDVRMSGM 70
T+ V DDD A+R L L GY V+ S+A AG ++ DV M
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIA----AGDGDLVVTDVVMPDE 60

Query: 71 SGLELQERLIAENAALPIIFVTGHGDVPMAVSTMKKGAMDFIEKPFDEAELRKLVERMLE 130
+ +L R+ LP++ ++ A+ +KGA D++ KPFD EL ++ R L
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 131 KARNESKSVQEQRAASERLSKLTAREQQVLERI 163
+ + +++ L +A Q++ +
Sbjct: 121 EPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVL 153


76  E4F39_12240  E4F39_12290  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_12240  2       17       -1.638082  transcriptional regulator
E4F39_12260  0       11       1.093795   FecR protein
E4F39_12265  -1      13       1.474140   thymidylate synthase
E4F39_12270  -3      13       1.801377   sigma-54-dependent Fis family transcriptional
E4F39_12275  -2      11       1.451536   hypothetical protein
E4F39_12280  -2      11       0.714813   hypothetical protein
E4F39_12285  -2      11       0.546777   hypothetical protein
E4F39_12290  1       16       -1.142864  sigma-54-dependent Fis family transcriptional
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
E4F39_12325  adhesinmafb  32     0.002    Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 32.3 bits (73), Expect = 0.002
Identities = 17/67 (25%), Positives = 28/67 (41%), Gaps = 3/67 (4%)

Query: 36 SARPAGELTMIAGLSPSAASAHLARLTDGGLLAL---DVRGRHRYYRIATPDIAAAIEAL 92
R A + + ++P A A + G +A + R + P+ A +EA+
Sbjct: 254 GTRYAIDKAAMRNIAPLPAEGKFAVIGGLGSVAGFEKNTREAVDRWIQENPNAAETVEAV 313

Query: 93 ANVAQAA 99
NVA AA
Sbjct: 314 FNVAAAA 320


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
E4F39_12330  IGASERPTASE  43     5e-06    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 43.1 bits (101), Expect = 5e-06
Identities = 42/263 (15%), Positives = 63/263 (23%), Gaps = 12/263 (4%)

Query: 582 NPAARAGERPQPNMPQPNAAQPNAAQPNIARPGQPQPGVAQPTAPHAPGTPPNAMRPDAA 641
NP + + N PN Q P P AP PP P
Sbjct: 982 NPEVEKRNQT---VDTTNITTPNNIQ--ADVPSVPSNNEEIARVDEAPVPPPAPATPSET 1036

Query: 642 RPNEARPAPAPSARNGVPRPPAAVENPGMRDEARAPGEAPRPQPSWTQPHPPIQQQRANE 701
A + S A R+ A+ EA + TQ + Q +
Sbjct: 1037 TETVAENSKQESKTVEKNEQDATETTAQNREVAK---EAKSNVKANTQTNEVAQSGSETK 1093

Query: 702 GGPRASGEPNAPLNYRSPTQNALPPIRSTPTPTHSAPPAPPPAERAQPQPQPGPAPRNAM 761
+ A + + + P T P +E QPQ +P +
Sbjct: 1094 ETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTV 1153

Query: 762 RAPEAPRQEVAPPAPRNEYRAPAPAPRPQIE--APRMEAPRMP-APRAEAP-RMEPRPAP 817
E Q + + + + P P +P
Sbjct: 1154 NIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNS 1213

Query: 818 PPPAVPHNPPPAPRQEPPHQARP 840
P N + PH P
Sbjct: 1214 ESSNKPKNRHRRSVRSVPHNVEP 1236


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
E4F39_12340  HTHFIS  376    e-129    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 376 bits (968), Expect = e-129
Identities = 133/388 (34%), Positives = 202/388 (52%), Gaps = 40/388 (10%)

Query: 101 FDYVTVPYECDRIVESVGHAYGMVTLSEGLAPAAATVRNEGEMVGTCEAMLALFKMIRKV 160
+DY+ P++ ++ +G A + + ++ +VG AM +++++ ++
Sbjct: 99 YDYLPKPFDLTELIGIIGRA--LAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARL 156

Query: 161 ASTDAPVFISGESGTGKELTAVAIHERSSRAGAPFVAINCGAIPPTLLQAELFGYERGAF 220
TD + I+GESGTGKEL A A+H+ R PFVAIN AIP L+++ELFG+E+GAF
Sbjct: 157 MQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAF 216

Query: 221 TGANQRKIGRIEAANGGTLFLDEIGDLPFESQASLLRFLQEHKVERVGGHQSIPVDVRII 280
TGA R GR E A GGTLFLDEIGD+P ++Q LLR LQ+ + VGG I DVRI+
Sbjct: 217 TGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIV 276

Query: 281 SATHVDMQIALRNGRFREDLYHRLCVLKLEEPPLRERGKDIEILARHMLERFKGDAHRRL 340
+AT+ D++ ++ G FREDLY+RL V+ L PPLR+R +DI L RH +++ + + +
Sbjct: 277 AATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKEG-LDV 335

Query: 341 RGFTPDAIAALHNYAWPGNVRELINRVRRAIVMSEGRMISAADLELSGYAEVA------- 393
+ F +A+ + + WPGNVREL N VRR + +I+ +E +E+
Sbjct: 336 KRFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKA 395

Query: 394 ------------------------------PMSLEEARESAERHAIEVALLRHRGRLADA 423
+ E I AL RG A
Sbjct: 396 AARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKA 455

Query: 424 ARELGVSRVTLYRLLCAYGMRDDDGARA 451
A LG++R TL + + G+ +R+
Sbjct: 456 ADLLGLNRNTLRKKIRELGVSVYRSSRS 483


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
E4F39_12360  HTHFIS  357    e-121    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 357 bits (917), Expect = e-121
Identities = 145/453 (32%), Positives = 216/453 (47%), Gaps = 46/453 (10%)

Query: 53 VHVARSANEAARRVKPNQPQAGIADL---DGFAPRELPTLEAVLRQQQVGWIALAGDTRI 109
V + +A R + + D+ D A LP ++ V + ++
Sbjct: 30 VRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDLLPRIKKARPDLPV--LVMSAQNTF 87

Query: 110 NDPDVRRLIRQYCFDYMQGLPPHETIDYLVGHAYGMVALCDLDVTAGAAATGDEMVGACD 169
+ + +DY+ + ++G A + G +VG
Sbjct: 88 MTA--IKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR-RPSKLEDDSQDGMPLVGRSA 144

Query: 170 AMQQLFRTIRKVAATDATVFISGESGTGKELTALAIHERSERRKAPFVAINCGAIPNHLL 229
AMQ+++R + ++ TD T+ I+GESGTGKEL A A+H+ +RR PFVAIN AIP L+
Sbjct: 145 AMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLI 204

Query: 230 QSELFGYERGAFTGASQRKVGRVEAADGGTLFLDEIGDMPLESQASMLRFLQEGKIERLG 289
+SELFG+E+GAFTGA R GR E A+GGTLFLDEIGDMP+++Q +LR LQ+G+ +G
Sbjct: 205 ESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVG 264

Query: 290 GHESIPVDVRIISATHVDLDAAMREGRFRDDLYHRLCVLKLDEPPLRARGKDIEILAHHI 349
G I DVRI++AT+ DL ++ +G FR+DLY+RL V+ L PPLR R +DI L H
Sbjct: 265 GRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHF 324

Query: 350 LHQFRSDGARRIHGFTSCAIEAMYNYHWPGNVRELINRIRRAIVMSDSRQLSAADLDL-- 407
+ Q +G + F A+E M + WPGNVREL N +RR + ++ ++
Sbjct: 325 VQQAEKEG-LDVKRFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVITREIIENEL 383

Query: 408 -----------------------------------APFAARQATTLAEARERAERRTIEA 432
A + E I A
Sbjct: 384 RSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLILA 443

Query: 433 SLLRHRNRLTEAAAELGVSRATLYRLMVSHGLR 465
+L R +AA LG++R TL + + G+
Sbjct: 444 ALTATRGNQIKAADLLGLNRNTLRKKIRELGVS 476


77  E4F39_13425  E4F39_13470  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_13425  -2      11       1.500232   NAD(P)-dependent oxidoreductase
E4F39_13435  -1      13       0.658565   acyltransferase
E4F39_13445  -1      13       0.352649   ABC transporter ATP-binding protein
E4F39_13450  -1      12       -0.073835  ABC transporter permease
E4F39_13455  -1      8        -0.593925  dTDP-4-dehydrorhamnose reductase
E4F39_13460  -1      8        -1.988805  dTDP-4-dehydrorhamnose 3,5-epimerase
E4F39_13465  0       11       -2.359040  glucose-1-phosphate thymidylyltransferase
E4F39_13470  1       16       -4.012243  dTDP-glucose 4,6-dehydratase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_13515  NUCEPIMERASE  167    3e-51    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 167 bits (425), Expect = 3e-51
Identities = 83/363 (22%), Positives = 136/363 (37%), Gaps = 58/363 (15%)

Query: 13 KILVTGGAGFIGCAISERLAARASRYVVMDNLHPQIHASAVRPGALHEKAE----LVVAD 68
K LVTG AGFIG +S+RL + V +DNL+ + +++ L A+ D
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDY-YDVSLKQARLELLAQPGFQFHKID 60

Query: 69 VTDAGAWDALLSDFQPEIIIHLAAETGTGQSLTEASRHALVNVVGTTRLTDALVKHGIVV 128
+ D L + E + SL +A N+ G + + + +
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNK--I 118

Query: 129 EHILLTSSRAVYGEGAWQKDDGTIVYPGQRGRAQLEAAQWDFPGMTMLPSRADRTEPRPT 188
+H+L SS +VYG +P D + P
Sbjct: 119 QHLLYASSSSVYGLN------------------------------RKMPFSTDDSVDHPV 148

Query: 189 SVYGATKLAQEHVLRAWSLATKTPLSILRLQNVYGPGQSLTNSYTGIVALFSRLAREKKV 248
S+Y ATK A E + +S P + LR VYGP + F++ E K
Sbjct: 149 SLYAATKKANELMAHTYSHLYGLPATGLRFFTVYGPWGRPDMALF----KFTKAMLEGKS 204

Query: 249 IPLYEDGNVTRDFVSIDDVADAIVATLVRTPEA-----------------LSLFDIGSGQ 291
I +Y G + RDF IDD+A+AI+ P A +++IG+
Sbjct: 205 IDVYNYGKMKRDFTYIDDIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSS 264

Query: 292 ATSILDMARIIAAHYGAPEPQINGAFRDGDVRHAACDLSESLANLGWKPQWSLKRGIGEL 351
++D + + G + + GDV + D +G+ P+ ++K G+
Sbjct: 265 PVELMDYIQALEDALGIEAKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNF 324

Query: 352 QTW 354
W
Sbjct: 325 VNW 327


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_13530  ABC2TRNSPORT  30     0.007    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 30.3 bits (68), Expect = 0.007
Identities = 16/59 (27%), Positives = 24/59 (40%)

Query: 195 LFTMVLMFLSPVFYPASALPEKYRFWLELNPLTLFIEQSRGILLEGRVPDFHPLGLAFL 253
L ++FLS +P LP ++ PL+ I+ R I+L V D A
Sbjct: 184 LVITPILFLSGAVFPVDQLPIVFQTAARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALC 242


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_13535  NUCEPIMERASE  58     8e-12    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 57.9 bits (140), Expect = 8e-12
Identities = 32/160 (20%), Positives = 56/160 (35%), Gaps = 27/160 (16%)

Query: 1 MKILVTGANGQVGWELARSLAVLGQVV-----------PLTRE--------------QAD 35
MK LVTGA G +G+ +++ L G V ++ + D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 36 LGRPETLARIVEDAKPDVVVNAAAYTAVDAAETDGAAANVINGEA-VGVLAAATKRVGGL 94
L E + + + V + AV + + A N + +L
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 95 FVHYSTDYVFDGTKPSPYIETDPT-CPVNAYGASKLLGEL 133
++ S+ V+ + P+ D PV+ Y A+K EL
Sbjct: 121 LLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANEL 160


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_13550  NUCEPIMERASE  174    4e-54    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 174 bits (444), Expect = 4e-54
Identities = 90/350 (25%), Positives = 136/350 (38%), Gaps = 45/350 (12%)

Query: 2 ILVTGGAGFIGANFVLDWLAQSDEAVLNVDKLT--YAGNLGTLK-SLQGNPKHVFARVDI 58
LVTG AGFIG + L + V+ +D L Y +L + L P F ++D+
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAGHQ-VVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 59 CDRAAIDALLAQHKPRAIVHFAAESHVDRSIHGPADFVQTNVVGTFTLLEAARQYWSALG 118
DR + L A + V S+ P + +N+ G +LE R
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHN----K 117

Query: 119 PDAKAAFRFLHVSTDEVFGSLSPADPQFSETTPYA-PNSPYSATKAGSDHLVRAYHHTYG 177
L+ S+ V+G L+ P FS P S Y+ATK ++ + Y H YG
Sbjct: 118 IQ-----HLLYASSSSVYG-LNRKMP-FSTDDSVDHPVSLYAATKKANELMAHTYSHLYG 170

Query: 178 LPVLTTNCSNNYGPYQFPEKLIPLMIANALGGKPLPVYGDGQNVRDWLYVGDHCSAIREV 237
LP YGP+ P+ + L GK + VY G+ RD+ Y+ D AI +
Sbjct: 171 LPATGLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRL 230

Query: 238 L------------------ARGVPGETYNVGGWNEKKNLDVVHTLCDLLD-EARPKAAGS 278
A P YN+G + + +D + L D L EA+
Sbjct: 231 QDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKN---- 286

Query: 279 YRDQITYVTDRPGHDRRYAIDARKLERELGWKPAETFETGLAKTVRWYLD 328
+ +PG + D + L +G+ P T + G+ V WY D
Sbjct: 287 ------MLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRD 330


78  E4F39_13770  E4F39_13810  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_13770  1       11       3.975364   2-hydroxyacid dehydrogenase
E4F39_13775  -1      10       3.380877   5,6-dimethylbenzimidazole synthase
E4F39_13780  0       9        2.617467   MFS transporter
E4F39_13785  -2      10       2.043148   hypothetical protein
E4F39_13795  -2      10       2.110432   MFS transporter
E4F39_13800  -1      9        1.679338   fumarylacetoacetase
E4F39_13805  -2      10       1.823274   homogentisate 1,2-dioxygenase
E4F39_13810  -1      9        2.032358   MFS transporter
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_13855  SECA          34     0.001    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 33.7 bits (77), Expect = 0.001
Identities = 27/87 (31%), Positives = 42/87 (48%), Gaps = 9/87 (10%)

Query: 116 LNRRLPRAVARTREGDFSLNGLLGFDLFGKTVGVIGTGLI--GSVFARIMTGFGMRVLAH 173
L +P A A RE + G+ FD V ++G G++ A + TG G + L
Sbjct: 60 LENLIPEAFAVVREASKRVFGMRHFD-----VQLLG-GMVLNERCIAEMRTGEG-KTLTA 112

Query: 174 SLPPHDDALIALGVRYVPLDALLAEAD 200
+LP + +AL GV V ++ LA+ D
Sbjct: 113 TLPAYLNALTGKGVHVVTVNDYLAQRD 139


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_13870  TCRTETA       50     1e-08    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 49.8 bits (119), Expect = 1e-08
Identities = 95/399 (23%), Positives = 149/399 (37%), Gaps = 37/399 (9%)

Query: 5 LFALAVAAFGIGTTEFVIMGLLPNVARDLGVSIPAA---GMLVSGYALGVTIGAPILAVV 61
L +A+ A GIG +IM +LP + RDL S G+L++ YAL AP+L +
Sbjct: 11 LSTVALDAVGIG----LIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 62 TAKMPRKAALLALIGVFIVGNLFCAIAPGYATLMVARVVTAFCHGAFFGIGSVVASNLVA 121
+ + R+ LL + V A AP L + R+V G+ +A ++
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIA-DITD 125

Query: 122 PNKRAQAIALMFTGLTLANVLGVPLGTALGQAFGWRATFWAVTGIGALAAAALAFCVPKR 181
++RA+ M V G LG +G F A F+A + L F +P+
Sbjct: 126 GDERARHFGFMSACFGFGMVAGPVLGGLMGG-FSPHAPFFAAAALNGLNFLTGCFLLPES 184

Query: 182 LEMPAAGIAREFGVLRNPQVLMVLGISVLASASLFTVFTYIAPI-----------LEDVT 230
+ + RE NP + A+L VF + + ED
Sbjct: 185 HKGERRPLRRE---ALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRF 241

Query: 231 GFTPHDVTLVLLLFG-LGLTVGGTVGGKLADW---RRMPSLVATLASIGVVLAAFAGTMR 286
+ + + L FG L + G +A RR L G +L AFA
Sbjct: 242 HWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGW 301

Query: 287 TPLPALVTIFVWGVLAFAIVPPLQILIVDRAS-HAPNLASTLNQGAFNLGNALGAWLGGT 345
P +V + G+ +P LQ ++ + +L + +G L
Sbjct: 302 MAFPIMVLLASGGI----GMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTA 357

Query: 346 AIHAGVPLAK-LPW-AGAAL---AMAALALTLWSASLER 379
A + W AGAAL + AL LWS + +R
Sbjct: 358 IYAASITTWNGWAWIAGAALYLLCLPALRRGLWSGAGQR 396


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_13880  TCRTETA       47     9e-08    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 46.7 bits (111), Expect = 9e-08
Identities = 95/395 (24%), Positives = 152/395 (38%), Gaps = 25/395 (6%)

Query: 12 LILSVAVVGLGTGATLPLTALALTEAGHGTRIV---GILTAAQAGGGLAVVPFVTAITKR 68
++ +VA+ +G G +P+ L + H + GIL A A A P + A++ R
Sbjct: 10 ILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDR 69

Query: 69 LGARQVIVASVVVLAAATALMQFTSNLVVWGVLRVVCGAALMLLFTIGEAWVNQLADDAT 128
G R V++ S+ A A+M L V + R+V G G A++ + D
Sbjct: 70 FGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAG-AYIADITDGDE 128

Query: 129 RGRVVAIYATNFTLFQMAGPVLVSQIAGMTH-----VRFALSGTLFLLAL----PSLASI 179
R R + F +AGPVL + G + AL+G FL S
Sbjct: 129 RARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHKGE 188

Query: 180 RKTPIADEPHHDAHDRWTRVIPKMPALVVGTAFFALFDTLALSLLPIFAMAR--GVASEA 237
R+ + + A RW R + + AL+ L + +L IF R A+
Sbjct: 189 RRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTI 248

Query: 238 AVLFAAILLFGDTAMQFPIGWLADKLGRERVHLGAGCVVLALLPLLPAVVTTPWLCWPLL 297
+ AA + A G +A +LG ER L G + +L A T W+ +P++
Sbjct: 249 GISLAAFGILHSLAQAMITGPVAARLG-ERRALMLGMIADGTGYILLAFATRGWMAFPIM 307

Query: 298 FVLGAAAGSVYTL----SLVACGERFRGSALVTASSLVSASWSAASFGGPLVAGALMEQF 353
+L + + L S ER +G + ++L S S GPL+ A+
Sbjct: 308 VLLASGGIGMPALQAMLSRQVDEER-QGQLQGSLAALT----SLTSIVGPLLFTAIYAAS 362

Query: 354 GGDALIGVLIVSAIAFVGAALWERRALPMQAARRG 388
I A ++ RR L A +R
Sbjct: 363 ITTWNGWAWIAGAALYLLCLPALRRGLWSGAGQRA 397


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_13900  TCRTETA       44     9e-07    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 43.7 bits (103), Expect = 9e-07
Identities = 77/398 (19%), Positives = 124/398 (31%), Gaps = 59/398 (14%)

Query: 50 VAPSVIAEWGVKKQA---LGPVFSASLFGMLLGALGLSVLADRIGRRPVLIGATLFFALA 106
V P ++ + G + + A L L+DR GRRPVL+ + A+
Sbjct: 27 VLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVD 86

Query: 107 MLATPFATSIPILIALRFVTGLGLGCIMPNAMALVGECSPSAHRVKRM----MIVSCGFT 162
A + +L R V G+ G A A + + + R + G
Sbjct: 87 YAIMATAPFLWVLYIGRIVAGIT-GATGAVAGAYIADITDGDERARHFGFMSACFGFGMV 145

Query: 163 LGAALGGFVSAALIPAFGWRAVFFVGGAVPLALAAAMAASLPESPQLLVLRGRHDAARAW 222
G LGG + F A FF A+ LPES H R
Sbjct: 146 AGPVLGGLMGG-----FSPHAPFFAAAALNGLNFLTGCFLLPES---------HKGERRP 191

Query: 223 LAKFAPRLAVPPDTRLVVREAGPRGAPVAELFRSGRARVTLLLWAINF-MNLIDLYFLSN 281
L + A P+A + V L A+ F M L+ +
Sbjct: 192 LRREALN-------------------PLASFRWARGMTVVAALMAVFFIMQLVGQVPAAL 232

Query: 282 WLPTVMRDAGYASGTAVIVGTVLQTGGVIGTLS----LGWFIERHGFARVLFACFACATI 337
W+ + A +G L G++ +L+ G R G R L
Sbjct: 233 WVIFGEDRFHW---DATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGT 289

Query: 338 AIGLIGSVAHAFVWLLAAVFVGGFCVVGGQPAVNALAGHYYPTSLRSTGIGWSLGVGRVG 397
L+ ++ V + + G PA+ A+ + G + +
Sbjct: 290 GYILLAFATRGWMAFPIMVLLASGGI--GMPALQAMLSRQVDEERQGQLQGSLAALTSLT 347

Query: 398 SVLGPLVGGQLIA--------LGWSNDALFHAAAVPVL 427
S++GPL+ + A W A + +P L
Sbjct: 348 SIVGPLLFTAIYAASITTWNGWAWIAGAALYLLCLPAL 385


79  E4F39_14660  E4F39_14680  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_14660  -1      11       -1.175099  hypothetical protein
E4F39_14665  -2      11       0.750008   YafY family transcriptional regulator
E4F39_14670  -2      11       1.872894   NAD(P)(+) transhydrogenase (Re/Si-specific)
E4F39_14675  -3      10       1.906028   NAD(P) transhydrogenase subunit alpha
E4F39_14680  -3      11       2.359017   Re/Si-specific NAD(P)(+) transhydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_14730  PYOCINKILLER  28     0.017    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 27.8 bits (61), Expect = 0.017
Identities = 20/87 (22%), Positives = 32/87 (36%), Gaps = 5/87 (5%)

Query: 17 EAEAEAEAEAEAEAEAEAEAEAEAEAEAEADTDTDTDRKFDTERSAANDVSARAAAPPVP 76
+ A+A EA A +A +A AEA + + R A +A A P
Sbjct: 201 QIRMNTLTAAKASIEAAAANKAREQAAAEA-----KRKAEEQARQQAAIRAANTYAMPAN 255

Query: 77 STAGVVIAVHPLDPAAAAALAIAATVA 103
+ A L A A ++A ++
Sbjct: 256 GSVVATAAGRGLIQVAQGAASLAQAIS 282


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_14735  ARGREPRESSOR  36     3e-05    Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 36.0 bits (83), Expect = 3e-05
Identities = 20/75 (26%), Positives = 33/75 (44%), Gaps = 6/75 (8%)

Query: 3 RRADRLFQIAELLRGRRLTTAQQLADWL-----SVSPRTVYRDVRDLQLSGVPIEGEAGI 57
+ R +I E++ + T +L D L +V+ TV RD+++L L VP +
Sbjct: 2 NKGQRHIKIREIITANEIETQDELVDILKKDGYNVTQATVSRDIKELHLVKVPTNNGSYK 61

Query: 58 GYRLNRAASLPPLTF 72
Y L PL+
Sbjct: 62 -YSLPADQRFNPLSK 75


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_14750  TCRTETA       30     0.019    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 30.2 bits (68), Expect = 0.019
Identities = 44/244 (18%), Positives = 74/244 (30%), Gaps = 45/244 (18%)

Query: 33 GNLFGMVGMAIAILTTVALIAKQAAWLGANLPLGLALVFGALVIGGAVGAVIAARVEMTK 92
G + G IA +T A+ ++ A G+ V+GG +G
Sbjct: 110 GATGAVAGAYIADITDGDERARHFGFMSACFGFGMVA---GPVLGGLMGGFSPH------ 160

Query: 93 MPELVAAMHSLIGLAAVCIAIAVVAEPEAFGL---VPQDASAPNFIPYGNRIELFIGTFV 149
P AA + + C + + E L ++ + + + F
Sbjct: 161 APFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFF 220

Query: 150 GAITFSGSVIAFGKLSGKYRFR------------------LFQ----GAPVVYPGQ-HLI 186
A + G+ RF L Q G G+ +
Sbjct: 221 IMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRAL 280

Query: 187 NLMLALAMLGFGILFFITQSWLPFGIMTAIAFALGVLIIIPIGGADMPVVVSMLNSYSGW 246
L + G+ +L F T+ W+ F IM +A GG MP + +ML+
Sbjct: 281 MLGMIADGTGYILLAFATRGWMAFPIMVLLAS----------GGIGMPALQAMLSRQVDE 330

Query: 247 AAAG 250
G
Sbjct: 331 ERQG 334


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_14760  ACRIFLAVINRP  30     0.018    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 30.2 bits (68), Expect = 0.018
Identities = 13/47 (27%), Positives = 21/47 (44%), Gaps = 4/47 (8%)

Query: 139 KAVLVAAALYPRFFPMLMTAAGTVKAARVLVL--GAGVAGLQAIATA 183
+A L+A + R P+LMT+ + L + GAG A+
Sbjct: 961 EATLMAVRM--RLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIG 1005


80  E4F39_16320  E4F39_16355  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_16320  -2      11       -2.823613  ABC transporter permease
E4F39_16325  -3      13       -2.575211  ABC transporter ATP-binding protein
E4F39_16330  -1      12       -2.369278  hypothetical protein
E4F39_16335  0       10       -2.509746  STAS domain-containing protein
E4F39_16345  0       13       -2.109992  ABC transporter substrate-binding protein
E4F39_16355  -1      12       -1.348099  VacJ family lipoprotein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_16405  ABC2TRNSPORT  74     1e-17    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 73.8 bits (181), Expect = 1e-17
Identities = 60/243 (24%), Positives = 100/243 (41%), Gaps = 6/243 (2%)

Query: 7 LFYKEILRFWKVSFQTVLAPVVTALLYLTIFGHALTGRVNVYPGVEYVSFLVPGLVMMSV 66
++ + + + K + ++L + L+YL G L V GV Y +FL G+V S
Sbjct: 19 VWRRNYIAWKKAALASLLGHLAEPLIYLFGLGAGLGVMVGRVGGVSYTAFLAAGMVATSA 78

Query: 67 LQNA-FANSSSSLIQSKITGNLVFMLLPPLSSADIFGAYVLASVVRGLAVGAGVFVVTVW 125
+ A F ++ + + ML L DI + + + GAG+ VV
Sbjct: 79 MTAATFETIYAAFGRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAA 138

Query: 126 FIPMSFAAPLYIVAFALFGSAILGTLGLIAGIWAEKFDQLAAFQNFLIMPLTFLSGVFYS 185
+ + LY + +LG++ A +D +Q +I P+ FLSG +
Sbjct: 139 LGYTQWLSLLYALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFP 198

Query: 186 THSLPPVWREVSRLNPFFYMIDGFRYGFFG--IADVNPLASLS---VVAGFFVLLALIAM 240
LP V++ +R P + ID R G + DV +V FF+ AL+
Sbjct: 199 VDQLPIVFQTAARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIVIPFFLSTALLRR 258

Query: 241 RLL 243
RLL
Sbjct: 259 RLL 261


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_16410  PF05272       28     0.037    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.5 bits (63), Expect = 0.037
Identities = 11/19 (57%), Positives = 13/19 (68%)

Query: 34 LLGPNGAGKTTLISILAGL 52
L G G GK+TLI+ L GL
Sbjct: 601 LEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_16425  FLGMOTORFLIG  28     0.026    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 28.2 bits (63), Expect = 0.026
Identities = 12/73 (16%), Positives = 22/73 (30%)

Query: 74 RTTQLAMGRNWRTATPAQQQQVIEQFKQLLIRTYSGALAQLKPDQQIQYPPFRADADATD 133
R + A ++ +Q +T + L+ L P + T+
Sbjct: 107 NLGSALQSRPFEFVRRADPANILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTN 166

Query: 134 VVVRTVAMNNGQP 146
V R M+ P
Sbjct: 167 VARRIALMDRTSP 179


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_16430  VACJLIPOPROT  223    3e-74    VacJ lipoprotein signature.
		>VACJLIPOPROT#VacJ lipoprotein signature.

Length = 251

Score = 223 bits (569), Expect = 3e-74
Identities = 85/220 (38%), Positives = 114/220 (51%), Gaps = 8/220 (3%)

Query: 15 AAAALSGCATVQTPTKG--DPFEGFNRTMYTFNDKV-DQYALKPVARGYQWAVPQPMRDS 71
L GCA+ T +G DP EGFNRTMY FN V D Y ++PVA ++ VPQP R+
Sbjct: 11 GTTLLVGCASSGTDQQGRSDPLEGFNRTMYNFNFNVLDPYIVRPVAVAWRDYVPQPARNG 70

Query: 72 VTNFFSNIGDVYIAANNLVQLKIADGVGDIMRVVINTVFGVGGLFDVATLAKLPKHAND- 130
++NF N+ + + N +Q G+ R +NT+ G+GG DVA +A +
Sbjct: 71 LSNFTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAGMANPKLQRTEP 130

Query: 131 --FGVTLGHYGVPSGPYLVLPLLGPSTVRDTAGLAVDYAGNPLTYVRPDGVSWGLFGLNL 188
FG TLGHYGV GPY+ LP G T+RD G D A P+ +S G + L
Sbjct: 131 HRFGSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMAD-ALYPVLSWLTWPMSVGKWTLEG 189

Query: 189 VNTRANLLGAGDVLEAAAIDKYSFVRNAYLQRRQALIGGA 228
+ TRA LL + +L + D Y VR AY QR + G
Sbjct: 190 IETRAQLLDSDGLLR-QSSDPYIMVREAYFQRHDFIANGG 228


81  E4F39_17195  E4F39_17270  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_17195  -1      11       1.099993   flagellar type III secretion system protein
E4F39_17200  -2      12       1.024139   VOC family protein
E4F39_17205  -2      11       1.118779   DUF3443 domain-containing protein
E4F39_17210  -1      12       1.936702   DUF2844 domain-containing protein
E4F39_17215  -3      12       1.433141   protein phosphatase CheZ
E4F39_17220  -3      13       -0.888318  chemotaxis protein CheY
E4F39_17225  -1      13       -0.830136  hypothetical protein
E4F39_17230  0       13       -0.121650  chemotaxis response regulator protein-glutamate
E4F39_17235  -1      13       -0.189735  chemoreceptor glutamine deamidase CheD
E4F39_17240  1       14       -0.381049  chemotaxis protein methyltransferase
E4F39_17245  0       13       0.165112   methyl-accepting chemotaxis protein
E4F39_17255  0       11       1.210551   chemotaxis protein CheW
E4F39_17260  -1      10       0.900501   chemotaxis protein CheA
E4F39_17265  -1      10       -0.035021  response regulator
E4F39_17270  -2      9        0.003035   flagellar motor protein MotB
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_17270  TYPE3IMSPROT  358    e-124    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 358 bits (921), Expect = e-124
Identities = 108/344 (31%), Positives = 181/344 (52%), Gaps = 2/344 (0%)

Query: 12 DRTEAATPKRREKAREEGQVARSRELASFALLSAGFYGAWMLSGPIGEHLRTMLHAAFSF 71
++TE TPK+ AR++GQVA+S+E+ S AL+ A LS EH ++
Sbjct: 4 EKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLM--LIPA 61

Query: 72 DRAAAFDTNRMLSHAGTLSLEGLYALAPVLALTGVAALAAPMAMGGWLVSTKTFELKFER 131
+++ + + + LE Y P+L + + A+A+ + G+L+S + + ++
Sbjct: 62 EQSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIKK 121

Query: 132 LNPITGLGRIFSIQGPIQLGMSIAKTLVVGGIGGIAIWRSKDELLGLATQPLHAALADAL 191
+NPI G RIFSI+ ++ SI K +++ + I I + LL L T +
Sbjct: 122 INPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLLG 181

Query: 192 HLVAVCCGMTVAGMLVVAGLDVPYQLWQYNKKLRMTKEEVKREHRENEGDPHVKGRIRQQ 251
++ + G +V++ D ++ +QY K+L+M+K+E+KRE++E EG P +K + RQ
Sbjct: 182 QILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQF 241

Query: 252 QRAMARRRMMANVPTADVVVTNPTHFAVALKYTDGEMRAPKVVAKGVNLVAARIRELAAE 311
+ + R M NV + VVV NPTH A+ + Y GE P V K + +R++A E
Sbjct: 242 HQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIAEE 301

Query: 312 HHVPLLEAPPLARALYHNVELEREIPGTLYSAVAEVLAWVYQLK 355
VP+L+ PLARALY + ++ IP A AEVL W+ +
Sbjct: 302 EGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQN 345


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_17280  cloacin       32     0.004    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 32.0 bits (72), Expect = 0.004
Identities = 14/43 (32%), Positives = 20/43 (46%)

Query: 28 GGGGDGGSNASVNTGTGGGDTSAGGGSNGGTGGTGGSGSTPLA 70
GGG G + +G G G + G GTGG + + P+A
Sbjct: 47 GGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGNLSAVAAPVA 89



Score = 30.1 bits (67), Expect = 0.022
Identities = 17/56 (30%), Positives = 22/56 (39%)

Query: 17 AAATAALVAACGGGGDGGSNASVNTGTGGGDTSAGGGSNGGTGGTGGSGSTPLASN 72
A +T+ + G G AS +G + GGGS G GGSG N
Sbjct: 13 AHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGN 68



Score = 29.7 bits (66), Expect = 0.025
Identities = 21/61 (34%), Positives = 27/61 (44%), Gaps = 7/61 (11%)

Query: 28 GGGGDGGSNASVNTGTGGGDTS-------AGGGSNGGTGGTGGSGSTPLASNQAAITVST 80
GG DG +S N GGG S +G G+ GG G +GG T + A V+
Sbjct: 31 GGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGNLSAVAAPVAF 90

Query: 81 G 81
G
Sbjct: 91 G 91


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_17295  HTHFIS        86     5e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 86.4 bits (214), Expect = 5e-23
Identities = 32/110 (29%), Positives = 52/110 (47%), Gaps = 4/110 (3%)

Query: 1 MDKSMKILVVDDFPTMRRIVRNLLKELGYSNVDEAEDGLAGLARLRGGGYDFVISDWNMP 60
M + ILV DD +R ++ L GY V + + G D V++D MP
Sbjct: 1 MTGA-TILVADDDAAIRTVLNQALSRAGYD-VRITSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 61 NLDGLAMLKEIRADASLTHLPVLMVTAESKKENIIAAAQAGASGYVVKPF 110
+ + +L I+ LPVL+++A++ I A++ GA Y+ KPF
Sbjct: 59 DENAFDLLPRIKKARP--DLPVLVMSAQNTFMTAIKASEKGAYDYLPKPF 106


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_17305  HTHFIS        66     4e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 66.4 bits (162), Expect = 4e-14
Identities = 31/143 (21%), Positives = 61/143 (42%), Gaps = 13/143 (9%)

Query: 7 KIKVLCVDDSALIRSLMTEIINSQPDMEVCATAPDPLVARELIKQHNPDVLTLDVEMPRM 66
+L DD A IR+++ + ++ + I + D++ DV MP
Sbjct: 3 GATILVADDDAAIRTVLNQALSRAG--YDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 67 DGLDFLEKLMRLRP-MPVVMVSSLTERGSEITLRALELGAVDFVTKPRVGIRDGMLDYSE 125
+ D L ++ + RP +PV+++S+ ++A E GA D++ KP D +E
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNT--FMTAIKASEKGAYDYLPKP--------FDLTE 110

Query: 126 KLADKVRAASRARVRQNPQPHAA 148
+ RA + + R + +
Sbjct: 111 LIGIIGRALAEPKRRPSKLEDDS 133


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_17330  PF06580       46     3e-07    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 46.4 bits (110), Expect = 3e-07
Identities = 21/151 (13%), Positives = 50/151 (33%), Gaps = 52/151 (34%)

Query: 467 ELDKSLIERIIDPLT--HLVRNSLDHGIETVEARRAAGKDAVGQLVLSAAHHGGNIVIEV 524
+++ ++++ + P+ LV N + HGI G+++L G + +EV
Sbjct: 245 QINPAIMDVQVPPMLVQTLVENGIKHGIA--------QLPQGGKILLKGTKDNGTVTLEV 296

Query: 525 SDDGAGLNRERILAKAAKQGMQISENISDDEVWNLIFAPGFSTAEVVTDVSGRGVGMDVV 584
+ G+ + G G+ V
Sbjct: 297 ENTGSLALKNTK--------------------------------------ESTGTGLQNV 318

Query: 585 KRNIQSMGG---HVEISSQAGRGTTTRIVLP 612
+ +Q + G +++S + G +++P
Sbjct: 319 RERLQMLYGTEAQIKLSEKQG-KVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_17335  HTHFIS        71     9e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 71.4 bits (175), Expect = 9e-18
Identities = 38/114 (33%), Positives = 58/114 (50%), Gaps = 2/114 (1%)

Query: 4 TILAIDDSATMRTLLSATLGEAGYDVTVASDGEVGLDVALATRFDLVLTDHHMPRKNGLE 63
TIL DD A +RT+L+ L AGYDV + S+ A DLV+TD MP +N +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 64 LIVALRRQLGYEATPILVLTTENGDAFKDAARVAGATGWIEKPIDPDALIELVA 117
L+ +++ P+LV++ +N A GA ++ KP D LI ++
Sbjct: 65 LLPRIKKA--RPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_17340  OMPADOMAIN    40     1e-05    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 39.5 bits (92), Expect = 1e-05
Identities = 25/117 (21%), Positives = 51/117 (43%), Gaps = 9/117 (7%)

Query: 182 FAMSSDAVEPYMRDILREIGKTLNDV---PNRIIVQGHTDAVPYAGGEKGYSNWELSADR 238
F + ++P + L ++ L+++ ++V G+TD + G Y N LS R
Sbjct: 223 FNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRI----GSDAY-NQGLSERR 277

Query: 239 ANASRRELIAGGMDEAKVLRV-LGLASTQNLNKADPLDPENRRISIIVLNRKSELAL 294
A + LI+ G+ K+ +G ++ N D + I + +R+ E+ +
Sbjct: 278 AQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334


82  E4F39_18080  E4F39_18120  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_18080  0       14       -0.368751  type II secretion system protein GspD
E4F39_18085  0       21       -1.134395  type II secretion system protein GspE
E4F39_18090  0       22       -2.011604  type II secretion system protein GspF
E4F39_18095  -1      16       -3.107850  general secretion pathway protein GspC
E4F39_18100  -2      12       -2.413404  hypothetical protein
E4F39_18105  -1      12       -3.260240  type II secretion system protein GspG
E4F39_18110  0       15       -4.804204  type II secretion system protein GspH
E4F39_18115  -2      11       -2.015713  type II secretion system protein GspI
E4F39_18120  -2      10       -2.034239  prepilin-type N-terminal cleavage/methylation
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_18165  BCTERIALGSPD  403    e-133    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 403 bits (1038), Expect = e-133
Identities = 215/691 (31%), Positives = 324/691 (46%), Gaps = 88/691 (12%)

Query: 13 TALVVAGIVAAQAAHAQVTLNFVNADIDQVAKAIGAATGKTIIVDPRVKGQLNLVAERPV 72
T L+ A ++ AA + + +F DI + + KT+I+DP V+G + + + +
Sbjct: 13 TLLIFAALLFRPAAAEEFSASFKGTDIQEFINTVSKNLNKTVIIDPSVRGTITVRSYDML 72

Query: 73 PEDQALKTLQSALRMQGFALV-QDHGVLKVVPEADAKLQGVPTYIGNAPQARGDQVVTQV 131
E+Q + S L + GFA++ ++GVLKVV DAK VP AP GD+VVT+V
Sbjct: 73 NEEQYYQFFLSVLDVYGFAVINMNNGVLKVVRSKDAKTAAVPVASDAAPGI-GDEVVTRV 131

Query: 132 FELRNESANNLLPVLRPLI--SPNNTITAYPANNTIVVTDYADNVRRIAQIIAGVDSAAG 189
L N +A +L P+LR L + ++ Y +N +++T A ++R+ I+ VD+A
Sbjct: 132 VPLTNVAARDLAPLLRQLNDNAGVGSVVHYEPSNVLLMTGRAAVIKRLLTIVERVDNAGD 191

Query: 190 SQVAVVPLKNANAIDIAAQLTKLLDPGAIGNTDATLKVTVQADPRTNALLLRASNAQRLA 249
V VPL A+A D+ +T+L + ++ V AD RTNA+L+ R
Sbjct: 192 RSVVTVPLSWASAADVVKLVTELNKDTSKSALPGSMVANVVADERTNAVLVSGEPNSR-Q 250

Query: 250 AAKKIAQQLDAPSGVPGNMHVVPLRNAEAVKLAKTLRGMLGKGGGESGSSASSNDANAFN 309
+ +QLD GN V+ L+ A+A L + L G+
Sbjct: 251 RIIAMIKQLDRQQATQGNTKVIYLKYAKASDLVEVLTGIS-------------------- 290

Query: 310 QGGSQSGSNFSTGASGTPPLPSGLSSNSSGGAGGTTGGGGLGNAGLLGGDKDKGDDNQPG 369
S + S +
Sbjct: 291 ---------------------STMQSEKQAAKPVAALDKNI------------------- 310

Query: 370 GMIQADAASNSLIITASDPVYRNLRAVIDQLDARRAQVYIEALVVELQATTSANLGIQWQ 429
+I+A +N+LI+TA+ V +L VI QLD RR QV +EA++ E+Q NLGIQW
Sbjct: 311 -IIKAHGQTNALIVTAAPDVMNDLERVIAQLDIRRPQVLVEAIIAEVQDADGLNLGIQWA 369

Query: 430 VANNALYAGTNLVTGQTGLGNSIVNLTAGAVT--NPGGTLGSLG---SITNGLNIGWLHN 484
N +T T G I AGA G SL S NG+ G
Sbjct: 370 NKNAG-------MTQFTNSGLPISTAIAGANQYNKDGTVSSSLASALSSFNGIAAG---- 418

Query: 485 MFGVQGLGALLQFFAGSSDANVLSTPNLVTLDNEEAKIVVGQNVPIPTGSYSNLTSGTTA 544
F LL + S+ ++L+TP++VTLDN EA VGQ VP+ TGS + +
Sbjct: 419 -FYQGNWAMLLTALSSSTKNDILATPSIVTLDNMEATFNVGQEVPVLTGS----QTTSGD 473

Query: 545 NAFNTYDRRDVGLTLHVKPQITEGGILKLQLYTEDSAVVPGTNTTSANSPGPTFTKRSIQ 604
N FNT +R+ VG+ L VKPQI EG + L++ E S+V +++++ G TF R++
Sbjct: 474 NIFNTVERKTVGIKLKVKPQINEGDSVLLEIEQEVSSV-ADAASSTSSDLGATFNTRTVN 532

Query: 605 STVLADNGEIIVLGGLMQDNYQVSNTKVPLLGDIPWIGQLFRSEGKTRQKTNLMVFLRPV 664
+ VL +GE +V+GGL+ + + KVPLLGDIP IG LFRS K K NLM+F+RP
Sbjct: 533 NAVLVGSGETVVVGGLLDKSVSDTADKVPLLGDIPVIGALFRSTSKKVSKRNLMLFIRPT 592

Query: 665 IINDRETAQAVTSNRYDYIQGVTGAYKSDNN 695
+I DR+ + +S +Y + N
Sbjct: 593 VIRDRDEYRQASSGQYTAFNDAQSKQRGKEN 623


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_18175  BCTERIALGSPF  382    e-133    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 382 bits (982), Expect = e-133
Identities = 174/406 (42%), Positives = 266/406 (65%), Gaps = 2/406 (0%)

Query: 1 MPAFRFEAIDASGRAQKGVIEADSARNARGQLRTQGLTPLVVEPAASAQRGARSQRLALG 60
M + ++A+DA G+ +G EADSAR AR LR +GL PL V+ Q+ + S L+L
Sbjct: 1 MAQYHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLSLR 60

Query: 61 R--KLSQREQAILTRQLASLLVAGLPLDEALAVLTEQAERDYIRELMAAIRAEVLGGHSL 118
R +LS + A+LTRQLA+L+ A +PL+EAL + +Q+E+ ++ +LMAA+R++V+ GHSL
Sbjct: 61 RKIRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSL 120

Query: 119 ANALTQHPRDFPEIYRALVAAGEHTGKLGIVLSRLADYIEERNALKQKILLAFTYPAIVT 178
A+A+ P F +Y A+VAAGE +G L VL+RLADY E+R ++ +I A YP ++T
Sbjct: 121 ADAMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLT 180

Query: 179 VIAFGIVTFLLSYVVPQVVNVFASTKQQLPVLTIVMMALSDFVRHWWWAILIGIAAVVYL 238
V+A +V+ LLS VVP+VV F KQ LP+ T V+M +SD VR + +L+ + A
Sbjct: 181 VVAIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMA 240

Query: 239 VKATLSRDGPRLAFDRWLLTAPLAGKLVRGYNTVRFASTLGILTAAGVPILRALQAAGET 298
+ L ++ R++F R LL PL G++ RG NT R+A TL IL A+ VP+L+A++ +G+
Sbjct: 241 FRVMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGDV 300

Query: 299 LSNRAMRGNIDDAIVRVREGSALSRALNNVKTFPPVLVHLIRSGEATGDVTTMLDRAAEG 358
+SN R + A VREG +L +AL FPP++ H+I SGE +G++ +ML+RAA+
Sbjct: 301 MSNDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAADN 360

Query: 359 ESRELERRTMFLTSLLEPLLILAMGGIVLVIVLAVMLPIIELNNMV 404
+ RE + L EPLL+++M +VL IVLA++ PI++LN ++
Sbjct: 361 QDREFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLM 406


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_18190  BCTERIALGSPG  188    6e-65    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 188 bits (480), Expect = 6e-65
Identities = 67/140 (47%), Positives = 94/140 (67%), Gaps = 3/140 (2%)

Query: 10 QAARRQRGFTLIEIMVVVAILGILAALIVPKIMSRPDEARRIAAKQDIGTIMQALKLYRL 69
+A +QRGFTL+EIMVV+ I+G+LA+L+VP +M ++A + A DI + AL +Y+L
Sbjct: 2 RATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKL 61

Query: 70 DNGRYPTQDQGLNALIQKPTTDPIPNNWKDGGYLERLPNDPWGNSYKYLNPGVHGEIDVF 129
DN YPT +QGL +L++ PT P+ N+ GY++RLP DPWGN Y +NPG HG D+
Sbjct: 62 DNHHYPTTNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLVNPGEHGAYDLL 121

Query: 130 SYGADGKEGGESNDSDIGSW 149
S G DG+ G E DI +W
Sbjct: 122 SAGPDGEMGTE---DDITNW 138


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_18195  BCTERIALGSPH  52     1e-10    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 51.9 bits (124), Expect = 1e-10
Identities = 20/101 (19%), Positives = 33/101 (32%), Gaps = 15/101 (14%)

Query: 48 RARGFTLLEMLVVLVIAGILVSVASLTLRRNPRTDLREEAQRIALLFETAGDEAQVRARP 107
R RGFTLLEM+++L++ G+ + L + + R +
Sbjct: 2 RQRGFTLLEMMLILLLMGVSAGMVLLAFPASRDDSAAQTLARFEAQLRFVQQRGLQTGQF 61

Query: 108 IAWRATEHGFRF---------------DIRTGDGWRPLRDD 133
++F D +G W PLR
Sbjct: 62 FGVSVHPDRWQFLVLEARDGADPAPADDGWSGYRWLPLRAG 102


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_18200  BCTERIALGSPG  30     0.001    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 30.2 bits (68), Expect = 0.001
Identities = 10/26 (38%), Positives = 18/26 (69%)

Query: 8 RSPARSRGFTMIEVLVALAIIAVALA 33
R+ + RGFT++E++V + II V +
Sbjct: 2 RATDKQRGFTLLEIMVVIVIIGVLAS 27


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_18205  BCTERIALGSPG  33     3e-04    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 33.3 bits (76), Expect = 3e-04
Identities = 17/72 (23%), Positives = 34/72 (47%), Gaps = 3/72 (4%)

Query: 33 RGFTLIEMMIAITILAVIA-ILSWRGLDQIIRGREKVAAAMEDERVFAQMFDQMRIDARR 91
RGFTL+E+M+ I I+ V+A ++ + + ++ A+ D D ++D
Sbjct: 8 RGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQK--AVSDIVALENALDMYKLDNHH 65

Query: 92 AATDDEAGQPAV 103
T ++ + V
Sbjct: 66 YPTTNQGLESLV 77


83  E4F39_18185  E4F39_18210  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
E4F39_18185  1       14       1.653163   flagellar motor switch protein FliM
E4F39_18190  0       10       2.797078   flagellar motor switch protein FliN
E4F39_18195  0       11       3.591327   flagellar biosynthetic protein FliO
E4F39_18200  0       12       3.275957   flagellar type III secretion system pore protein
E4F39_18205  -3      10       0.864230   flagellar biosynthesis protein FliQ
E4F39_18210  -3      11       1.619598   flagellar biosynthetic protein FliR
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_18280  FLGMOTORFLIM  276    2e-93    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 276 bits (706), Expect = 2e-93
Identities = 82/324 (25%), Positives = 159/324 (49%), Gaps = 10/324 (3%)

Query: 5 EFMSQEEVDALLKGVTGEDDSADEPAEASG---IRPYNIATQERIVRGRMPGLEIINDRF 61
E +SQ+E+D LL ++ D S ++ S I Y+ ++ + +M L ++++ F
Sbjct: 3 EVLSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHETF 62

Query: 62 ARLLRIGIFNFMRRTAEISVSQVKVQKYSEFTRNLPIPTNLNLVHVKPLRGTSLFVFDPN 121
ARL + +R + V+ V Y EF R++P P+ L ++ + PL+G ++ DP+
Sbjct: 63 ARLTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEVDPS 122

Query: 122 LVFFVVDNLFGGDGRFHTRVEGRDFTATEQRIIGKLLNLVFEHYASAWKSVRPLQFEFVR 181
+ F ++D LFGG G+ RD T E ++ ++ + + +W V L+ +
Sbjct: 123 ITFSIIDRLFGGTGQAAKVQ--RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRLGQ 180

Query: 182 SEMHTQFANVATPNEIVIVTQFSIEFGPTGGTLHICMPYSMIEPIRDVLSSPIQGEAL-- 239
E + QFA + P+E+V++ + G G ++ C+PY IEPI LSS ++
Sbjct: 181 IETNPQFAQIVPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSVRR 240

Query: 240 EVDRRWVRVLSQQVQSAEVELVADLAEVPTTFEKILNLRTGDVLPLD---ITDSITAKVD 296
+++ VL ++ + ++++VA++ + + IL LR GD++ L + D +
Sbjct: 241 SSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPFVLSIG 300

Query: 297 GVPVMECGYGIFNGQYALRVQRMI 320
C G+ + A ++ I
Sbjct: 301 NRKKFLCQPGVVGKKIAAQILERI 324


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_18285  FLGMOTORFLIN  134    3e-43    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 134 bits (338), Expect = 3e-43
Identities = 78/126 (61%), Positives = 97/126 (76%), Gaps = 3/126 (2%)

Query: 41 AMDD-WAAALAEQNQQPIETGATGAGVFRPLSKATASSTHNDIDLILDIPVKMTVELGRT 99
A+DD WA AL EQ ++ A VF+ L S DIDLI+DIPVK+TVELGRT
Sbjct: 14 ALDDLWADALNEQKATTTKSAADA--VFQQLGGGDVSGAMQDIDLIMDIPVKLTVELGRT 71

Query: 100 KIAIRNLLQLAQGSVVELDGLAGEPMDVLVNGCLIAQGEVVVVNDKFGIRLTDIITPSER 159
++ I+ LL+L QGSVV LDGLAGEP+D+L+NG LIAQGEVVVV DK+G+R+TDIITPSER
Sbjct: 72 RMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGVRITDIITPSER 131

Query: 160 IRKLNR 165
+R+L+R
Sbjct: 132 MRRLSR 137


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
E4F39_18295  FLGBIOSNFLIP  291    e-102    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 291 bits (747), Expect = e-102
Identities = 155/242 (64%), Positives = 192/242 (79%), Gaps = 1/242 (0%)

Query: 11 RWLPAILIGLAPALACAQAAGLPAFNSAPGPNGGTTYSLSVQTMLLLTMLSFLPAMLLMM 70
R L + L + A LP S P P GG ++SL VQT++ +T L+F+PA+LLMM
Sbjct: 3 RLLSVAPVLLW-LITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLMM 61

Query: 71 TSFTRIIIVLSLLRQAIGTASTPPNQVLVGLALFLTLFVMSPVIDRAYNDAYKPFSEGTL 130
TSFTRIIIV LLR A+GT S PPNQVL+GLALFLT F+MSPVID+ Y DAY+PFSE +
Sbjct: 62 TSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEKI 121

Query: 131 QMDQAVQRGTAPFKAFMLKQTRETDLALFAKISKAAPMQGPEDVPLSLLVPAFVTSELKT 190
M +A+++G P + FML+QTRE DL LFA+++ P+QGPE VP+ +L+PA+VTSELKT
Sbjct: 122 SMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELKT 181

Query: 191 GFQIGFTIFIPFLIIDMVVASVLMSMGMMMVSPATVSLPFKLMLFVLVDGWQLLIGSLAQ 250
FQIGFTIFIPFLIID+V+ASVLM++GMMMV PAT++LPFKLMLFVLVDGWQLL+GSLAQ
Sbjct: 182 AFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLAQ 241

Query: 251 SF 252
SF
Sbjct: 242 SF 243


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
E4F39_18300	TYPE3IMQPROT	69	4e-19	Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein

family signature.
Length = 86

Score = 68.6 bits (168), Expect = 4e-19
Identities = 26/85 (30%), Positives = 46/85 (54%)

Query: 4 ENVMTLAHQAMYIGLLLAAPLLLVALAVGLVVSLFQAATQINEATLSFIPKLLAVAATMV 63
++++ ++A+Y+ L+L+ +VA +GL+V LFQ TQ+ E TL F KLL V +
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 IAGPWMLSTMIDYLRETLLRVATLG 88
+ W ++ Y R+ + G
Sbjct: 62 LLSGWYGEVLLSYGRQVIFLALAKG 86


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
E4F39_18305	TYPE3IMRPROT	161	7e-51	Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein

family signature.
Length = 261

Score = 161 bits (409), Expect = 7e-51
Identities = 116/253 (45%), Positives = 158/253 (62%), Gaps = 4/253 (1%)

Query: 1 MFSVTYAQLNGWLTAFLWPFVRMLALVAIAPVTGHRSTPVRVKIGLAGFMALVVAPTLPP 60
M VT Q WL + WP +R+LAL++ AP+ RS P RVK+GLA + +AP+LP
Sbjct: 1 MLQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60

Query: 61 MPPMPVATVFSAQGVWIIVNQFLIGAALGFTMQIVFAAIEAAGDIIGLSMGLGFATFFDP 120
VFS +W+ V Q LIG ALGFTMQ FAA+ AG+IIGL MGL FATF DP
Sbjct: 61 NDVP----VFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDP 116

Query: 121 HSSGATPVMGRFLNAVAILAFLAFDGHLQVFAALVDSFRLVPVSADLLRAAGWQTLVAFG 180
S PV+ R ++ +A+L FL F+GHL + + LVD+F +P+ + L + + L G
Sbjct: 117 ASHLNMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAG 176

Query: 181 AAIFEMGLLLALPVVAALLIANLALGILNRAAPQIGIFQVGFPVTMLVGLLLVQLMAPNL 240
+ IF GL+LALP++ LL NLALG+LNR APQ+ IF +GFP+T+ VG+ L+ + P +
Sbjct: 177 SLIFLNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLI 236

Query: 241 IPFVGRLFDTGVD 253
PF LF +
Sbjct: 237 APFCEHLFSEIFN 249


84	E4F39_18240	E4F39_18290	N	Y	N	Pathogenicity Island (unbiased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
E4F39_18240	2	14	1.120917	sensor histidine kinase
E4F39_18245	2	16	1.507726	response regulator
E4F39_18250	1	13	1.023802	porin
E4F39_18255	0	12	0.438384	hypothetical protein
E4F39_18265	0	10	0.072672	site-specific DNA-methyltransferase
E4F39_18275	2	12	-1.466486	restriction endonuclease subunit R
E4F39_18280	2	12	-1.086497	hypothetical protein
E4F39_18290	-1	12	1.783810	porin
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
E4F39_18340	PF06580	54	3e-10	Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 54.5 bits (131), Expect = 3e-10
Identities = 24/128 (18%), Positives = 45/128 (35%), Gaps = 22/128 (17%)

Query: 334 RIDLGAELDDDLQVAGSESLLSALLMNLVDNAVRYAHE----GGRVTVSARRDGDAVVLE 389
R+ +++ + + L+ LV+N +++ GG++ + +D V LE
Sbjct: 239 RLQFENQINPAIM---DVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLE 295

Query: 390 VVDDGPGIPAEARPHVFKRFYRVARDEEGTGLGLAIVEE-IAQSHGGAVSLATGPGNRGV 448
V + G +E TG GL V E + +G + V
Sbjct: 296 VENTGSLALKN--------------TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKV 341

Query: 449 RMTVRLPA 456
V +P
Sbjct: 342 NAMVLIPG 349


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
E4F39_18345	HTHFIS	96	3e-25	FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 96.1 bits (239), Expect = 3e-25
Identities = 30/119 (25%), Positives = 60/119 (50%), Gaps = 1/119 (0%)

Query: 2 KLLLVEDNAELAHWIVDLLRGEGFGVDSAPDGESADTVLKAQRYDALLLDMRLPGMSGKE 61
+L+ +D+A + + L G+ V + + + A D ++ D+ +P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 LLARLRRRGDNVPVLMLTAHGSVDDKVDCFSAGADDYVVKPFESRELVARI-RALIRRQ 119
LL R+++ ++PVL+++A + + GA DY+ KPF+ EL+ I RAL +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
E4F39_18350	ECOLNEIPORIN	64	1e-13	E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 64.1 bits (156), Expect = 1e-13
Identities = 56/228 (24%), Positives = 94/228 (41%), Gaps = 28/228 (12%)

Query: 1 MKRQYLALSIATAACAAPQAHAQSSVQLYGLIDLSVPTYRSHANAKGDHVIGMGLGGEPW 60
MK+ +AL++A AA + V LYG I V T RS A+ G + G
Sbjct: 1 MKKSLIALTLAALPVAA-----MADVTLYGTIKAGVETSRSVAH-NGAQAASVETGTGIV 54

Query: 61 FSGSRWGLKGAEDIGGGTKVIFRLESEYTVADGNMEDPGQIFDRDAWVGVENDTFGKLTA 120
GS+ G KG ED+G G K I+++E + ++A + +R +++G++ FGKL
Sbjct: 55 DLGSKIGFKGQEDLGNGLKAIWQVEQKASIAGTD----SGWGNRQSFIGLKGG-FGKLRV 109

Query: 121 GFQNTIARDAAAIYGDPYGSAKLTTEEGGWTNANNFKQMIFYAAGATGTRYNNGLAWKKL 180
G N++ +D I +P+ S RY++ +
Sbjct: 110 GRLNSVLKDTGDI--NPWDSKSDYLGVNKIAEPEARL---------ISVRYDS----PEF 154

Query: 181 FGNGIFASAGYAFSNSTSFGQNSTYQVALGYNGGPFNVSGFFSHVNHA 228
G+ S YA +++ + +Y Y G F V ++ H
Sbjct: 155 A--GLSGSVQYALNDNAGRHNSESYHAGFNYKNGGFFVQYGGAYKRHH 200


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
E4F39_18375	ECOLNEIPORIN	74	5e-17	E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 74.5 bits (183), Expect = 5e-17
Identities = 76/356 (21%), Positives = 117/356 (32%), Gaps = 75/356 (21%)

Query: 1 MKK--FAVAAAGLAVATGAHASDGSVTLFGLIDAGVSYVSNEGGKRNVYFDDGIAVPNLW 58
MKK A+ A L VA A VTL+G I AGV + +
Sbjct: 1 MKKSLIALTLAALPVAAMA-----DVTLYGTIKAGVETSRSVAHNGAQAASVETGTGIVD 55

Query: 59 -----GLRGTEDLGGGAKAIFELTSQYALGNGAALPTPGSMFSRTALVGLWSERLGSVTL 113
G +G EDLG G KAI+++ + T +R + +GL G + +
Sbjct: 56 LGSKIGFKGQEDLGNGLKAIWQVEQ-----KASIAGTDSGWGNRQSFIGLKGG-FGKLRV 109

Query: 114 GQQYDFMTDSLTFGSFDGAFRYGGLYNFRQGPFSKLGIPDNPTGSFDFDRLAGSSRVPNS 173
G+ + D+ +D Y G+ + P+ S
Sbjct: 110 GRLNSVLKDTGDINPWDSKSDYLGVNKIAE--------PEA---------------RLIS 146

Query: 174 VKYTSANLNGLVFGLMYGFGNQAGGGLAANSTVSAGLKYETGSFALGAAYVDVKYPQMNN 233
V+Y S GL + Y + AG + + AG Y+ G F + ++ Q+
Sbjct: 147 VRYDSPEFAGLSGSVQYALNDNAGRH--NSESYHAGFNYKNGGFFVQYGGAYKRHHQVQE 204

Query: 234 --GHDGLRNWGLGARYALSAFDLNL-LYTNTRNT--LTGAAIDVIQAGVRYVGAPWTIGA 288
+ + L + Y L + ++ + Q V A
Sbjct: 205 NVNIEKYQIHRLVSGY--DNDALYASVAVQQQDAKLVEENYSHNSQTEV---------AA 253

Query: 289 NYEYMKGNAQLDRNYAH----------------QVTAAAQYALSKRTSAYVETVYQ 328
Y GN +YAH QV A+Y SKRTSA V +
Sbjct: 254 TLAYRFGNVTPRVSYAHGFKGSFDATNYNNDYDQVVVGAEYDFSKRTSALVSAGWL 309


85	E4F39_18930	E4F39_18970	N	Y	N	Pathogenicity Island (unbiased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
E4F39_18930	-1	11	-1.873050	nucleoid occlusion factor SlmA
E4F39_18935	-3	13	-1.534728	pyrimidine 5'-nucleotidase
E4F39_18945	-3	11	-0.298223	acetylglutamate kinase
E4F39_18950	-3	10	1.112968	hypothetical protein
E4F39_18955	2	15	2.674375	HAMP domain-containing histidine kinase
E4F39_18965	0	11	1.789673	response regulator transcription factor
E4F39_18970	-2	11	1.618956	ATP-dependent protease ATPase subunit HslU
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
E4F39_19070	HTHTETR	57	5e-12	TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 56.6 bits (136), Expect = 5e-12
Identities = 33/184 (17%), Positives = 63/184 (34%), Gaps = 16/184 (8%)

Query: 24 ASRTRPKPGERRVHILQTLASMLEAPKSEKITTAALAARLDVSEAALYRHFSSKAQMFEG 83
A +T+ + E R HIL + + +A V+ A+Y HF K+ +F
Sbjct: 2 ARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE 61

Query: 84 LIEFIEETFFGLVNQIAANEPNGVLQA-RSIALMLLNFSAKNPGMTRVLTGEALVGEHER 142
+ E E L + A P L R I + +L + ++ + H+
Sbjct: 62 IWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLME----IIFHKC 117

Query: 143 LAERVNQMLERVEASIKQCLR---VALLEAQAHAAGGGAPPPVPLPDDYDPALRASLVIS 199
++++ + ++ L+ A LP D A ++
Sbjct: 118 EFVGEMAVVQQAQRNLCLESYDRIEQTLKH-CIEAKM-------LPADLMTRRAAIIMRG 169

Query: 200 YVLG 203
Y+ G
Sbjct: 170 YISG 173


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
E4F39_19080	CARBMTKINASE	44	5e-07	Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 43.7 bits (103), Expect = 5e-07
Identities = 27/99 (27%), Positives = 48/99 (48%), Gaps = 6/99 (6%)

Query: 180 IPVISPIGFGEDGLSYNINADLVAGKLATVLNAEKLVMMTNIPGVMDKEG----NLLTDL 235
+PVI G G+ I+ DL KLA +NA+ +++T++ G G L ++
Sbjct: 197 VPVILEDG-EIKGVEAVIDKDLAGEKLAEEVNADIFMILTDVNGAALYYGTEKEQWLREV 255

Query: 236 SAREIDALFEDGT-ISGGMLPKISSALDAAKSGVKSVHI 273
E+ +E+G +G M PK+ +A+ + G + I
Sbjct: 256 KVEELRKYYEEGHFKAGSMGPKVLAAIRFIEWGGERAII 294



Score = 36.7 bits (85), Expect = 8e-05
Identities = 21/60 (35%), Positives = 27/60 (45%), Gaps = 10/60 (16%)

Query: 31 GKTVVIKYGGNAMTEERLKQGF----------ARDVILLKLVGINPVIVHGGGPQIDQAL 80
GK VVI GGNA+ + K + AR + + G VI HG GPQ+ L
Sbjct: 2 GKRVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSLL 61


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
E4F39_19100	HTHFIS	88	9e-23	FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 87.6 bits (217), Expect = 9e-23
Identities = 30/127 (23%), Positives = 60/127 (47%)

Query: 1 MSDKNFLVIDDNEVFAGTLARGLERRGYAVRQAHNKDEALKLAGAEKFEFITVDLHLGND 60
M+ LV DD+ L + L R GY VR N + A + + D+ + ++
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 SGLSLIAPLCDLQPDARILVLTGYASIATAVQAVKDGADNYLAKPANVESILAALQTNAS 120
+ L+ + +PD +LV++ + TA++A + GA +YL KP ++ ++ + +
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 121 EVQAEEA 127
E + +
Sbjct: 121 EPKRRPS 127



Score = 45.2 bits (107), Expect = 4e-08
Identities = 16/101 (15%), Positives = 32/101 (31%), Gaps = 3/101 (2%)

Query: 75 DARILVLTGYASIATAVQAVKDGADNYLAKPANVESILAALQTNASEVQAEEALENPVVL 134
I+ + I + L+ VE + + + L
Sbjct: 375 TREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGL---YDR 431

Query: 135 SVDRLEWEHIQRVLAENNNNISATARALNMHRRTLQRKLAK 175
+ +E+ I L N A L ++R TL++K+ +
Sbjct: 432 VLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRE 472


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
E4F39_19105	HTHFIS	31	0.016	FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 30.6 bits (69), Expect = 0.016
Identities = 13/68 (19%), Positives = 29/68 (42%), Gaps = 15/68 (22%)

Query: 17 IIGQAKAKKAVAVALRNRWRRQQVAEPLRQEITPKNILMIGPTGVGKTEIAR---RLAKL 73
++G++ A + + ++ + T +++ G +G GK +AR K
Sbjct: 139 LVGRSAAMQEI------YRVLARLMQ------TDLTLMITGESGTGKELVARALHDYGKR 186

Query: 74 ADAPFIKI 81
+ PF+ I
Sbjct: 187 RNGPFVAI 194


86	E4F39_19130	E4F39_19185	N	Y	N	Pathogenicity Island (unbiased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
E4F39_19130	-1	14	-0.088442	flagellar hook-length control protein FliK
E4F39_19135	0	15	0.815372	flagellar export protein FliJ
E4F39_19140	-2	13	0.752895	flagellar protein export ATPase FliI
E4F39_19145	-2	10	0.122788	flagellar assembly protein FliH
E4F39_19150	-1	11	-0.313685	flagellar motor switch protein FliG
E4F39_19155	-2	11	-1.096679	flagellar basal body M-ring protein FliF
E4F39_19165	0	14	-1.104684	flagellar hook-basal body complex protein FliE
E4F39_19170	0	13	-0.685655	flagellar export chaperone FliS
E4F39_19175	1	14	0.039248	flagellar protein FliT
E4F39_19180	2	12	1.250821	flagellar hook-length control protein FliK
E4F39_19185	3	12	1.272446	export system protein
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
E4F39_19240	FLGHOOKFLIK	73	3e-16	Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 73.3 bits (179), Expect = 3e-16
Identities = 70/209 (33%), Positives = 95/209 (45%), Gaps = 7/209 (3%)

Query: 260 ANAAPPDASG-ALAALQDAADSARATLAASSAPAALQQAA-PAALAANASAAAASAAPSL 317
A P DA G L A++ S P+ + AA P AAP L
Sbjct: 172 TTAQPDDAPGTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVL 231

Query: 318 APPVGTPDWTDALSQKVVFLSNAHQQSAELTLNPPDLGPLQVVLRVADNHAHALFVSQHA 377
+ P+G+ +W +LSQ + + QQSAEL L+P DLG +Q+ L+V DN A VS H
Sbjct: 232 SAPLGSHEWQQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQ 291

Query: 378 QVRDAVEAALPKLREAMEAGGLGLGSASVSDGGFASAQQQQTPQRQSSDGSATRRAFGAS 437
VR A+EAALP LR + G+ LG +++S F+ QQ + Q+QS +A
Sbjct: 292 HVRAALEAALPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQR-TANHEPLAGE 350

Query: 438 TADAALDELAAASSGGAARRTVGMVDTFA 466
D L S VD FA
Sbjct: 351 DDDT----LPVPVSLQGRVTGNSGVDIFA 375


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
E4F39_19245	FLGFLIJ	60	2e-14	Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 59.8 bits (144), Expect = 2e-14
Identities = 43/140 (30%), Positives = 74/140 (52%)

Query: 1 MAQSFPLQLLLERAQDDLDTAAKQLGRAQRERTDAQAQLDALMRYRDEYRVRFAESAQSG 60
MA+ L L + A+ +++ AA+ LG +R A+ QL L+ Y++EYR +G
Sbjct: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60

Query: 61 MPAGNWRNFQAFLDTLDAAIEQQRRVLAAAQTRIDAARPEWQAKKRTLGSYEILQARGAR 120
+ + W N+Q F+ TL+ AI Q R+ L ++D A W+ KK+ L +++ LQ R +
Sbjct: 61 ITSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120

Query: 121 QDAQRAAKREQRDADEHAAK 140
+ +Q+ DE A +
Sbjct: 121 AALLAENRLDQKKMDEFAQR 140


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
E4F39_19255	FLGFLIH	108	3e-31	Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 108 bits (271), Expect = 3e-31
Identities = 64/184 (34%), Positives = 106/184 (57%), Gaps = 4/184 (2%)

Query: 37 AAAALAAELQRVRDAAHAEGLAAGHVEGQALGYQAGYEQGRAKGFDEGRAEAHTHAAQLA 96
A +L +L +++ AH +G AG EG+ G++ GY++G A+G ++G AEA + A +
Sbjct: 36 AEPSLEQQLAQLQMQAHEQGYQAGIAEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIH 95

Query: 97 A----LAASFRDALAGVERDLADDIATLALEIAQQVVRQHVQHDPAALIAAAREVLAAEP 152
A L + F+ L ++ +A + +ALE A+QV+ Q D +ALI +++L EP
Sbjct: 96 ARMQQLVSEFQTTLDALDSVIASRLMQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEP 155

Query: 153 ALAGAPHLIVNPADLPVVEAYLKDELDTLGWSVRTDTSIERGGCRAHASTGEIDATLTTR 212
+G P L V+P DL V+ L L GW +R D ++ GGC+ A G++DA++ TR
Sbjct: 156 LFSGKPQLRVHPDDLQRVDDMLGATLSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATR 215

Query: 213 WERV 216
W+ +
Sbjct: 216 WQEL 219


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
E4F39_19260	FLGMOTORFLIG	298	e-102	Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 298 bits (765), Expect = e-102
Identities = 114/324 (35%), Positives = 191/324 (58%)

Query: 5 GLNKSALLLMSIGEEEAAQVFKFLAPREVQKIGAAMAALKNVTREQVEDVLNDFVQEAEK 64
G K+A+LL+SIG E +++VFK+L+ E++ + +A L+ +T E ++VL +F +
Sbjct: 17 GKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFKELMMA 76

Query: 65 HTALSLDSSEYIRTVLTKALGEDKAGVLIDRILQGSDTSGIEGLKWMDSAAVAELIKNEH 124
+ +Y R +L K+LG KA +I+ + + E ++ D A + I+ EH
Sbjct: 77 QEFIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQSRPFEFVRRADPANILNFIQQEH 136

Query: 125 PQIIATILVHLDRDQASEIASCFTERLRNDVLLRIATLDGIQPTALRELDDVLTGLLSGS 184
PQ IA IL +LD +AS I S ++ +V RIA +D P +RE++ VL L+
Sbjct: 137 PQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLEKKLASL 196

Query: 185 DNLKRAPMGGIRTAAEILNFMTSVHEEAVIENVKQYDPDLAQKIIDQMFVFENLLDLEDR 244
+ GG+ EI+N E+ +IE++++ DP+LA++I +MFVFE+++ L+DR
Sbjct: 197 SSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDIVLLDDR 256

Query: 245 AIQLLLKEVESEALIIALKGAPPALRQKFLSNMSQRAAELLAEDLDARGPVRVSEVETQQ 304
+IQ +L+E++ + L ALK +++K NMS+RAA +L ED++ GP R +VE Q
Sbjct: 257 SIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRKDVEESQ 316

Query: 305 RKILQVVRNLAESGQIVIGGKAED 328
+KI+ ++R L E G+IVI E+
Sbjct: 317 QKIVSLIRKLEEQGEIVISRGGEE 340


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
E4F39_19265	FLGMRINGFLIF	468	e-162	Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 468 bits (1206), Expect = e-162
Identities = 254/562 (45%), Positives = 360/562 (64%), Gaps = 37/562 (6%)

Query: 53 LSRMKTNPRLPFLIGAALAIAAIVALVLWSRAPDYRVLYSNLSDRDGGAIIAALQQANVP 112
L+R++ NPR+P ++ + A+A +VA+VLW++ PDYR L+SNLSD+DGGAI+A L Q N+P
Sbjct: 16 LNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQDGGAIVAQLTQMNIP 75

Query: 113 YKFADAGGAILVPANQVHETRLKLAAMGLPKGGSVGFELMDNQKFGISQFAEQVNYQRAL 172
Y+FA+ GAI VPA++VHE RL+LA GLPKGG+VGFEL+D +KFGISQF+EQVNYQRAL
Sbjct: 76 YRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQFSEQVNYQRAL 135

Query: 173 EGELQRTVESINAVRAARVHLAIPKPSVFVRDREAPSASVLVDLYPGRVLDEGQVLAVTR 232
EGEL RT+E++ V++ARVHLA+PKPS+FVR++++PSASV V L PGR LDEGQ+ AV
Sbjct: 136 EGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRALDEGQISAVVH 195

Query: 233 MVSSSVPDMPAKNVTIVDQDGNLLTQT-ASATGLDASQLKYVQQIERNTQKRIDAILAPI 291
+VSS+V +P NVT+VDQ G+LLTQ+ S L+ +QLK+ +E Q+RI+AIL+PI
Sbjct: 196 LVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDLNDAQLKFANDVESRIQRRIEAILSPI 255

Query: 292 FGAGNARSQVSADVDFSKIEQTSESYGPNGTPQQSAIRSQQTSSSTELAQSGASGVPGAL 351
G GN +QV+A +DF+ EQT E Y PNG ++ +RS+Q + S ++ GVPGAL
Sbjct: 256 VGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVGAGYPGGVPGAL 315

Query: 352 SNTPPQPASAPIVA-------------SNGQPAGPAATPVSDRKDSTTNYELDKTVRHVE 398
SN P P API ++ +A P S +++ T+NYE+D+T+RH +
Sbjct: 316 SNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNETSNYEVDRTIRHTK 375

Query: 399 QSMGTIKRLSVAVVVNYQPSTDAKGRVTMQPLAADKLAQVQQLVKDAMGYDEKRGDSVNV 458
++G I+RLSVAVVVNY+ D K PL AD++ Q++ L ++AMG+ +KRGD++NV
Sbjct: 376 MNVGDIERLSVAVVVNYKTLADGKP----LPLTADQMKQIEDLTREAMGFSDKRGDTLNV 431

Query: 459 VNSAFSAAADPFANLPWWRQPDMIELGKDIAKWLGVAAAAAALYFMFVRPALRR---AFP 515
VNS FSA + LP+W+Q I+ +WL V A L+ VRP L R
Sbjct: 432 VNSPFSAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVRPQLTRRVEEAK 491

Query: 516 PPAEPAAAAVPALDGPDDMLALDGLPSPDKKQLAEEDEEHPALLAFENERNRYERNLDYA 575
E A + + L+ D + N+R E
Sbjct: 492 AAQEQAQVRQETEEAVEVRLSKDEQLQQRR----------------ANQRLGAEVMSQRI 535

Query: 576 RTIARQDPKIVATVVKNWVSDE 597
R ++ DP++VA V++ W+S++
Sbjct: 536 REMSDNDPRVVALVIRQWMSND 557


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
E4F39_19270	FLGHOOKFLIE	61	9e-16	Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE

signature.
Length = 103

Score = 61.2 bits (148), Expect = 9e-16
Identities = 47/111 (42%), Positives = 62/111 (55%), Gaps = 8/111 (7%)

Query: 3 APVNGIASALQQMQAMAAQAAGGASPATSLAGSGAASAGSFASAMKASLDKISGDQQKAL 62
+ + GI + Q+QA A A S SFA + A+LD+IS Q A
Sbjct: 1 SAIQGIEGVISQLQATAMSARAQES--------LPQPTISFAGQLHAALDRISDTQTAAR 52

Query: 63 GEAHAFEIGAQNVSLNDVMVDMQKANIGFQFGLQVRNKLVSAYNEIMQMSV 113
+A F +G V+LNDVM DMQKA++ Q G+QVRNKLV+AY E+M M V
Sbjct: 53 TQAEKFTLGEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
E4F39_19290	TYPE3IMSPROT	62	4e-15	Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein

family signature.
Length = 354

Score = 62.5 bits (152), Expect = 4e-15
Identities = 17/81 (20%), Positives = 32/81 (39%), Gaps = 1/81 (1%)

Query: 10 AVLAYDAKGGDTAPRVVAKGYGLVAERIIERARDAGLYVHTAPEMV-SLLMQVDLDARIP 68
A+ +G P V K + + + A + G+ + + +L +D IP
Sbjct: 268 AIGILYKRGETPLPLVTFKYTDAQVQTVRKIAEEEGVPILQRIPLARALYWDALVDHYIP 327

Query: 69 PQLYQAVAELLAWLYALERDA 89
+ +A AE+L WL +
Sbjct: 328 AEQIEATAEVLRWLERQNIEK 348


87	E4F39_19380	E4F39_19430	N	Y	N	Pathogenicity Island (unbiased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
E4F39_19380	2	12	1.292142	flagellar basal body rod protein FlgC
E4F39_19385	1	11	1.813598	flagellar hook assembly protein FlgD
E4F39_19390	-1	10	2.467410	flagellar hook protein FlgE
E4F39_19395	-1	10	3.390180	flagellar basal-body rod protein FlgF
E4F39_19400	0	11	3.451113	flagellar basal-body rod protein FlgG
E4F39_19405	-2	11	3.250986	flagellar basal body L-ring protein FlgH
E4F39_19410	-1	10	3.568488	flagellar basal body P-ring protein FlgI
E4F39_19415	-1	11	1.965309	flagellar assembly peptidoglycan hydrolase FlgJ
E4F39_19420	0	11	2.519285	flagellar brake protein
E4F39_19425	0	10	1.932941	flagellar hook-associated protein FlgK
E4F39_19430	1	11	2.134904	flagellar hook-associated protein 3
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
E4F39_19500	FLGHOOKAP1	27	0.029	Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 26.8 bits (59), Expect = 0.029
Identities = 10/38 (26%), Positives = 17/38 (44%)

Query: 102 NVDPVQEMVNMISASRSYQANVETLNTAKQLMLKTLTI 139
V+ +E N+ + Y AN + L TA + + I
Sbjct: 508 GVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
E4F39_19510	FLGHOOKAP1	34	0.001	Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 34.2 bits (78), Expect = 0.001
Identities = 17/58 (29%), Positives = 24/58 (41%)

Query: 356 ISAPGSTNHGTLQGSALENSNVDLTSQLVKLITAQRNYQANAQTIKTQQTVDQTLINL 413
SA L S V+L + L Q+ Y ANAQ ++T + LIN+
Sbjct: 488 SSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545



Score = 30.3 bits (68), Expect = 0.017
Identities = 11/31 (35%), Positives = 17/31 (54%)

Query: 6 GLSGLAGASSDLDVIGNNIANANTVGFKGST 36
+SGL A + L+ NNI++ N G+ T
Sbjct: 7 AMSGLNAAQAALNTASNNISSYNVAGYTRQT 37


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
E4F39_19515	FLGHOOKAP1	29	0.018	Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 29.2 bits (65), Expect = 0.018
Identities = 9/34 (26%), Positives = 18/34 (52%)

Query: 4 LIYTAMTGATQSLEQQSVVANNLANASTTGFRAQ 37
LI AM+G + + +NN+++ + G+ Q
Sbjct: 3 LINNAMSGLNAAQAALNTASNNISSYNVAGYTRQ 36


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
E4F39_19520	FLGHOOKAP1	42	1e-06	Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 42.3 bits (99), Expect = 1e-06
Identities = 10/48 (20%), Positives = 23/48 (47%)

Query: 213 TLKQGYVESSNVNVVQELVNMIQTQRAYEINSKAVTTSDQMLQTVTQM 260
L S VN+ +E N+ + Q+ Y N++ + T++ + + +
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545



Score = 40.3 bits (94), Expect = 5e-06
Identities = 19/80 (23%), Positives = 34/80 (42%), Gaps = 14/80 (17%)

Query: 4 SLYIAATGMNAQQAQMDVISNNLANVSTNGFKGSRAVFEDLLYQTVRQPGANSTQQTELP 63
+ A +G+NA QA ++ SNN+++ + G+ RQ + + L
Sbjct: 3 LINNAMSGLNAAQAALNTASNNISSYNVAGYT--------------RQTTIMAQANSTLG 48

Query: 64 SGLQLGTGVQQVATERLYTQ 83
+G +G GV +R Y
Sbjct: 49 AGGWVGNGVYVSGVQREYDA 68


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
E4F39_19525	FLGLRINGFLGH	206	3e-69	Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 206 bits (526), Expect = 3e-69
Identities = 128/222 (57%), Positives = 156/222 (70%), Gaps = 7/222 (3%)

Query: 25 AALAAAALALAGCAQIPREPITQQPMSAMPPMPPAMQAPGSIY---NPGYAG-RPLFEDQ 80
A + L+L GCA IP P+ Q SA P P A GSI+ P G +PLFED+
Sbjct: 10 AISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINYGYQPLFEDR 69

Query: 81 RPRNVGDILTIVIAENINATKSSGANTNRQGNTSFDVPTAG-FLGGLF--NKANLSAQGA 137
RPRN+GD LTIV+ EN++A+KSS AN +R G T+F T +L GLF +A++ A G
Sbjct: 70 RPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNARADVEASGG 129

Query: 138 NKFAATGGASAANTFNGTITVTVTNVLPNGNLVVSGEKQMLINQGNEFVRFSGIVNPNTI 197
N F GGA+A+NTF+GT+TVTV VL NGNL V GEKQ+ INQG EF+RFSG+VNP TI
Sbjct: 130 NTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRFSGVVNPRTI 189

Query: 198 SGQNSVYSTQVADARIEYSAKGYINEAETMGWLQRFFLNIAP 239
SG N+V STQVADARIEY GYINEA+ MGWLQRFFLN++P
Sbjct: 190 SGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSP 231


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
E4F39_19530	FLGPRINGFLGI	363	e-127	Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 363 bits (934), Expect = e-127
Identities = 158/367 (43%), Positives = 216/367 (58%), Gaps = 19/367 (5%)

Query: 4 LAFAPAAARAERLKDLAQIQGVRDNPLIGYGLVVGLDGTGDQTMQTPFTTQTLANMLANL 63
L+ PA A R+KD+A +Q RDN LIGYGLVVGL GTGD +PFT Q++ ML NL
Sbjct: 19 LSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMRAMLQNL 78

Query: 64 GISINNGSANGGGSSAMTNMQLKNVAAVMVTATLPPFARPGEAIDVTVSSLGNAKSLRGG 123
GI+ G +N KN+AAVMVTA LPPFA PG +DVTVSSLG+A SLRGG
Sbjct: 79 GITTQGGQSN-----------AKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGG 127

Query: 124 TLLLTPLKGADGQVYALAQGNMAVGGAGASANGSRVQVNQLAAGRIAGGAIVERSVPNAV 183
L++T L GADGQ+YA+AQG + V G A + + + + R+ GAI+ER +P+
Sbjct: 128 NLIMTSLSGADGQIYAVAQGALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKF 187

Query: 184 AQMNGVLQLQLNDMDYGTAQRIVSAVNS----SFGAGTATALDGRTIQLTAPADSAQQVA 239
L LQL + D+ TA R+ VN+ +G A D + I + P +
Sbjct: 188 KDSV-NLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKPRVA-DLTR 245

Query: 240 FMARLQNLEVSPERAAAKVILNARTGSIVMNQMVTLQNCAVAHGNLSVVVNTQPVVSQPG 299
MA ++NL V + AKV++N RTG+IV+ V + AV++G L+V V P V QP
Sbjct: 246 LMAEIENLTVETD-TPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQPA 304

Query: 300 PFSNGQTVVAQQSQIQLKQDNGSLRMVTAGANLAEVVKALNSLGATPADLMSILQAMKAA 359
PFS GQT V Q+ I Q+ + + G +L +V LNS+G +++ILQ +K+A
Sbjct: 305 PFSRGQTAVQPQTDIMAMQEGSKV-AIVEGPDLRTLVAGLNSIGLKADGIIAILQGIKSA 363

Query: 360 GALRADL 366
GAL+A+L
Sbjct: 364 GALQAEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
E4F39_19535	FLGFLGJ	227	4e-75	Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 227 bits (579), Expect = 4e-75
Identities = 124/297 (41%), Positives = 172/297 (57%), Gaps = 15/297 (5%)

Query: 15 ALDVQGFDALRSKATAAAPREGVKMVAGQFDAMFTQMMLKSMRDATPSDGLLDSSSSKMY 74
A D Q + L++KA P ++ VA Q + MF QMMLKSMRDA P DGL S +++Y
Sbjct: 12 AWDAQSLNELKAKA-GEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDGLFSSEHTRLY 70

Query: 75 TSMLDQQLAQQMSS-KGIGVADALTKQLLRNANVAPDAQGEGGLAAMNALAKAYANSNGA 133
TSM DQQ+AQQM++ KG+G+A+ + KQ+ + ++ + Y N +
Sbjct: 71 TSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLETVVRYQNQALS 130

Query: 134 PGNGALAGTRGYSAASALTPPLKGNGNSAQADAFVEKMALAAQAASATTGIPARFIVGQA 193
P + + AF+ +++L AQ AS +G+P I+ QA
Sbjct: 131 ------------QLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQA 178

Query: 194 ALESGWGKREIRGANGESSYNVFGIKATKGWTGRTVSAVTTEYVNGRPHRVVAQFRAYDS 253
ALESGWG+R+IR NGE SYN+FG+KA+ W G TTEY NG +V A+FR Y S
Sbjct: 179 ALESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSS 238

Query: 254 YEHAMTDYANLLKNNPRYASVLNAGHNAEGFAHGMQKAGYATDPHYAKKLISIMQQI 310
Y A++DY LL NPRYA+V A +AE A +Q AGYATDPHYA+KL +++QQ+
Sbjct: 239 YLEALSDYVGLLTRNPRYAAVTTAA-SAEQGAQALQDAGYATDPHYARKLTNMIQQM 294


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
E4F39_19545	FLGHOOKAP1	231	4e-70	Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 231 bits (591), Expect = 4e-70
Identities = 162/444 (36%), Positives = 253/444 (56%), Gaps = 12/444 (2%)

Query: 3 NTLMNLGVSGLNAALWGLTTTGQNISNAATPGYSVERPVYAEASGQYTSSGYLPQGVSTV 62
++L+N +SGLNAA L T NIS+ GY+ + + A+A+ + G++ GV
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 60

Query: 63 TVERQYNQYLSNQLNAAQTQGSSLSTYYTLVAQLNNYVGSPTAGIATAITNYFTGLQTVA 122
V+R+Y+ +++NQL AAQTQ S L+ Y +++++N + + T+ +AT + ++FT LQT+
Sbjct: 61 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 120

Query: 123 NNAADPSARQTAMSNAQTLASQLVAAGQQYSQLRQSVNSQLTDTVTQINSYTSQIAQLNE 182
+NA DP+ARQ + ++ L +Q Q + VN + +V QIN+Y QIA LN+
Sbjct: 121 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 180

Query: 183 QIA--SASSQGQPPNQLLDQRDLAVSKLSQLAGVQV-VQSNGNYSVFLSGGQPLVVGNAS 239
QI+ + G PN LLDQRD VS+L+Q+ GV+V VQ G Y++ ++ G LV G+ +
Sbjct: 181 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 240

Query: 240 YQLATVASPSDPSELTI-VSKGVAGSAQPGPTQYLPDVSLTGGALGGLLAFRSQTLDPAQ 298
QLA V S +DPS T+ G AG+ + +P+ L G+LGG+L FRSQ LD +
Sbjct: 241 RQLAAVPSSADPSRTTVAYVDGTAGNIE------IPEKLLNTGSLGGILTFRSQDLDQTR 294

Query: 299 AQLGALAVSFASQVNAQNALGVDMSGNPGGSLFAVGAPAVYANQNNTGSATLSVSFVDGT 358
LG LA++FA N Q+ G D +G+ G FA+G PAV N N G + + D +
Sbjct: 295 NTLGQLALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDAS 354

Query: 359 QPTTSDYALSYDGAKYTLTDRATGSVVGTATPSSTPPTMTIGGLKLSLSSTPNAGDSFTV 418
+DY +S+D ++ +T R + T TP + + GL+L+ + TP DSFT+
Sbjct: 355 AVLATDYKISFDNNQWQVT-RLASNTTFTVTPDAN-GKVAFDGLELTFTGTPAVNDSFTL 412

Query: 419 LPTRGALDGFSLATANGSAIAAAS 442
P A+ + + + IA AS
Sbjct: 413 KPVSDAIVNMDVLITDEAKIAMAS 436



Score = 83.1 bits (205), Expect = 9e-19
Identities = 46/105 (43%), Positives = 66/105 (62%)

Query: 561 GTNDGRNALALSQLVNSKTMNNGTTTLTGAYAGYVNAIGNAASQLKASSAAQTALVGQIT 620
G +D RN AL L ++ G + AYA V+ IGN + LK SSA Q +V Q++
Sbjct: 441 GDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGNKTATLKTSSATQGNVVTQLS 500

Query: 621 QAQQSVSGVNQNEEAANLMQYQQLYQANAKVIQTANSVFQTVLGL 665
QQS+SGVN +EE NL ++QQ Y ANA+V+QTAN++F ++ +
Sbjct: 501 NQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
E4F39_19550	FLAGELLIN	41	6e-06	Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 41.2 bits (96), Expect = 6e-06
Identities = 55/369 (14%), Positives = 113/369 (30%), Gaps = 10/369 (2%)

Query: 16 MNDQQAQIAQLYQQVSSGISLTTPADNPLAAAQAVQLSATSATLAQYTQNQTIVQTALQT 75
+N Q+ ++ +++SSG+ + + D+ A A + ++ L Q ++N + QT
Sbjct: 17 LNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQASRNANDGISIAQT 76

Query: 76 EDTTLTSVNDVLNAAYQALMHAGDGGLSDSDRAALAAQIQGSRDHLLTLANTADGAGNYL 135
+ L +N+ L + + A +G SDSD ++ +IQ + + ++N G +
Sbjct: 77 TEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEIDRVSNQTQFNGVKV 136

Query: 136 FAGFQPTTQPFSNKPGGGVTY------AGDYGARAVQIADTRTVSQGDNGANVFMSVPFL 189
+ G +T G + + + GD ++ +
Sbjct: 137 LSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEATVGDLKSSFKNVTGYD 196

Query: 190 GSLPVPAAGASNTGTGTIGAVSITNPSDPTNTHQFTITFGGTAAAPTYTVTDNSVTPPTT 249
+ +G + + T A T D T +T
Sbjct: 197 TYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDAENNTAVDLFKTTKST 256

Query: 250 TAAQAYSSGQGINLGGQTVAVSGKPAVGDTFTVTPAPQAGTDVFATLD----TVIAALKS 305
+ G GG+ V T V T++ T+ A +
Sbjct: 257 AGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVSTTINGEKVTLTVADIT 316

Query: 306 PVGNSQTASTALTNTMATASTKLMNTMTNVLTVQASVGGRLQEVKAMQAVTTTNTLQTTN 365
+ A+T ++ S + T S E + T+
Sbjct: 317 AGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLEANNAVKGESKITVNGAE 376

Query: 366 SLSNLTDTN 374
+N
Sbjct: 377 YTANAAGDK 385



 
Contact Sachin Pundhir for Bugs/Comments.