PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: Phong.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
 
 
C) Potential GIs and PAIs in Phong_contig000001
S.No  Start        End          Bias  Virulence  Insertion elements  Prediction
1     Phong_00016  Phong_00027  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias    %GCBias  Product
Phong_00016       2       18   2.894475  Glutamyl-tRNA(Gln) amidotransferase subunit C
Phong_00017       1       17   3.559076  Bacterial regulatory proteins, tetR family
Phong_00018       1       15   3.980443  Beta-lactamase HcpA precursor
Phong_00019       0       17   4.510222  Uracil-DNA glycosylase
Phong_00020      -1       15   4.218756  Putative Holliday junction resolvase
Phong_00021      -1       15   4.043953  Putative acyl-CoA dehydrogenase AidB
Phong_00022      -1       30   3.094229  Aspartate carbamoyltransferase
Phong_00023       0       29   3.260864  Dihydroorotase
Phong_00024       0       29   2.989383  putative glycerol-3-phosphate acyltransferase
Phong_00025       0       29   3.518302  hypothetical protein
Phong_00026       1       32   3.671230  hypothetical protein
Phong_00027       1       32   3.486434  DNA topoisomerase 1
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00017  HTHTETR       41     7e-07    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 40.8 bits (95), Expect = 7e-07
Identities = 19/99 (19%), Positives = 34/99 (34%), Gaps = 6/99 (6%)

Query: 1 MPRP---QSSDLSQRLYRSALHHLARHGYAACKLAAITLEAKTSKQAIYRRWPHGKQQLA 57
M R ++ + Q + AL ++ G ++ L I A ++ AIY + K L
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFK-DKSDLF 59

Query: 58 IEALRFGFSRVPYQAAGSASFDEELDTSYAALQHALTHT 96
E S + + + L+ L H
Sbjct: 60 SEIWELSESNI--GELELEYQAKFPGDPLSVLREILIHV 96
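
The statistics lines that precede each alignment on this page can be read programmatically. The following is a minimal sketch, assuming the two-line "Score = ..." / "Identities = ..." layout shown above; the function names `parse_expect` and `parse_hsp` are hypothetical helpers and not part of PredictBias or BLAST. It also compares the parsed E-value with the 0.05 RPSBlast cutoff listed under the input parameters.

```python
import re

# Patterns for the two statistics lines that precede each alignment.
SCORE_RE = re.compile(r"Score = ([\d.]+) bits \((\d+)\), Expect = (\S+)")
IDENT_RE = re.compile(
    r"Identities = (\d+)/(\d+) \((\d+)%\), Positives = (\d+)/(\d+) \((\d+)%\)"
    r"(?:, Gaps = (\d+)/(\d+) \((\d+)%\))?"
)


def parse_expect(text):
    """Convert an Expect string to float (hypothetical helper)."""
    # Very small values are printed without a mantissa, e.g. "e-109".
    return float("1" + text if text.startswith("e") else text)


def parse_hsp(score_line, ident_line):
    """Extract the numeric fields from one pair of statistics lines."""
    s = SCORE_RE.search(score_line)
    i = IDENT_RE.search(ident_line)
    if s is None or i is None:
        raise ValueError("unrecognized BLAST statistics line")
    return {
        "bits": float(s.group(1)),
        "raw_score": int(s.group(2)),
        "expect": parse_expect(s.group(3)),
        "identities": int(i.group(1)),
        "align_len": int(i.group(2)),
        "positives": int(i.group(4)),
        # The "Gaps" field is omitted when an alignment has no gaps.
        "gaps": int(i.group(7)) if i.group(7) else 0,
    }


E_VALUE_CUTOFF = 0.05  # "E-value (RPSBlast)" from the input parameters

hsp = parse_hsp(
    "Score = 40.8 bits (95), Expect = 7e-07",
    "Identities = 19/99 (19%), Positives = 34/99 (34%), Gaps = 6/99 (6%)",
)
print(hsp["expect"] <= E_VALUE_CUTOFF)  # True
```

The optional group for "Gaps" matters because ungapped alignments on this page (for example the FliN hit) print only identities and positives.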


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00024  PF04183       28     0.026    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 28.3 bits (63), Expect = 0.026
Identities = 27/164 (16%), Positives = 52/164 (31%), Gaps = 28/164 (17%)

Query: 49 NIGTTNVLRTGK-KSLTALTLLCDTLKATLS---------VLIIAEWAPLGASNPLAIYL 98
I T+ R + + A L L+ + +I+ E A S+ Y
Sbjct: 277 TIYNTSCYRGIPGRYIAAGPLASRWLQQVFATDATLVQSGAVILGEPAAGYVSHEG--YA 334

Query: 99 PLIAGGFAFLGHLYPIW----LKFNGGKGVATYLGVLFAVYWPMALGFIALWISTALITR 154
L + + L IW ++ + L +I + +
Sbjct: 335 ALARAPYRYQEMLGVIWRENPCRWLKPDESPVLMATLMECD-ENNQPLAGAYIDRSGLDA 393

Query: 155 FSSLSALLTSLLTPFILWVFNQYETAGLMALMTLILWMKHHENI 198
+ L+ L ++ P + + +Y A L+A H +NI
Sbjct: 394 ETWLTQLFRVVVVP-LYHLLCRYGVA-LIA---------HGQNI 426


2     Phong_00044  Phong_00061  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias    %GCBias  Product
Phong_00044      -1       22  -4.048643  ATP-dependent Clp protease adapter protein ClpS
Phong_00045      -1       24  -4.054474  Phasin protein
Phong_00046      -1       23  -3.735679  D-alanyl-D-alanine carboxypeptidase DacF
Phong_00047      -1       27  -4.545251  DnaJ-like protein MG200
Phong_00048      -3       20  -3.747086  hypothetical protein
Phong_00049      -3       17  -2.815669  Doubled CXXCH motif (Paired_CXXCH_1)
Phong_00050      -2       12  -1.749073  hypothetical protein
Phong_00051      -2       13  -1.618968  hypothetical protein
Phong_00052      -1       12  -1.597513  ATP-dependent Clp protease proteolytic subunit
Phong_00053      -2       16  -2.988902  Cyclic di-GMP phosphodiesterase Gmr
Phong_00054       1       20  -4.040858  ACT domain protein
Phong_00055       1       19  -4.605544  Magnesium transporter MgtE
Phong_00056      -1       24  -3.800093  hypothetical protein
Phong_00057      -1       22  -2.934602  hypothetical protein
Phong_00058      -1       17  -2.007399  hypothetical protein
Phong_00059       1       16   1.276685  hypothetical protein
Phong_00061       2       17   1.261734  *Flagellar motor switch protein FliN
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00046  BLACTAMASEA   42     4e-06    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 41.7 bits (98), Expect = 4e-06
Identities = 30/127 (23%), Positives = 55/127 (43%), Gaps = 17/127 (13%)

Query: 34 IVVDAKTGKVLYAENANARRYPASTTKIMTLFVLFEELEAGRLSLNTRMKVSR-----YA 88
I +D +G+ L A A+ R ST K++ + ++AG L ++ + Y+
Sbjct: 43 IEMDLASGRTLTAWRADERFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYS 102

Query: 89 AGRPPTKLYLKAGSTLRVKDAIYALITKSANDASTVIAEHISGSEAAFAKRMTSTARAIG 148
P ++ +L G T+ + A IT S N A+ ++ + G +T+ R IG
Sbjct: 103 ---PVSEKHLADGMTV--GELCAAAITMSDNSAANLLLATVGGPAG-----LTAFLRQIG 152

Query: 149 MRNTTFR 155
+ R
Sbjct: 153 --DNVTR 157


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00049  SYCDCHAPRONE  30     0.021    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD

chaperone signature.
Length = 168

Score = 29.9 bits (67), Expect = 0.021
Identities = 23/120 (19%), Positives = 49/120 (40%), Gaps = 3/120 (2%)

Query: 652 ERFDKEISTLEQGIASQPESSALRYAYALALIRRQRYEPAVEQLERATDISPRDAQGWFV 711
E F K T+ ++ Y+ A + +YE A + + + D++ +
Sbjct: 16 ESFLKGGGTIAMLNEISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLG 75

Query: 712 LGLAEQRTGR-GTGIHALRKAAELSA-DPQYLYAYCESLLKARH-SESRDCVDKLREVVG 768
LG Q G+ IH+ A + +P++ + E LL+ +E+ + +E++
Sbjct: 76 LGACRQAMGQYDLAIHSYSYGAIMDIKEPRFPFHAAECLLQKGELAEAESGLFLAQELIA 135


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00061  FLGMOTORFLIN  72     4e-20    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 72.2 bits (177), Expect = 4e-20
Identities = 30/75 (40%), Positives = 49/75 (65%)

Query: 6 DISVEISVVLGQAEMPIHQLLRMGRGGVIGLDAHESDDVLILANKTPIARGQVMVIGEKV 65
DI V+++V LG+ M I +LLR+ +G V+ LD + + IL N IA+G+V+V+ +K
Sbjct: 59 DIPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKY 118

Query: 66 GVRITEVLERGPEVR 80
GVRIT+++ +R
Sbjct: 119 GVRITDIITPSERMR 133


3     Phong_00275  Phong_00288  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias    %GCBias  Product
Phong_00275       2       24   2.732615  hypothetical protein
Phong_00276       3       24   2.977291  Enamine/imine deaminase
Phong_00277       2       25   3.138861  HTH-type transcriptional regulator RpiR
Phong_00278       2       25   3.412150  2,5-dichloro-2,5-cyclohexadiene-1,4-diol
Phong_00279       2       24   3.131579  Beta-hexosaminidase
Phong_00280       0       14   1.786764  hypothetical protein
Phong_00281       0       12   0.501276  Pseudouridine kinase
Phong_00282      -1       17  -0.295301  Pseudouridine-5'-phosphate glycosidase
Phong_00283       0       21  -0.917040  hypothetical protein
Phong_00284       0       22  -1.115777  hypothetical protein
Phong_00285       0       23  -1.007948  Blue-light-activated protein
Phong_00286      -1       28  -1.607845  Flagellar biosynthetic protein FlhB
Phong_00287       2       18  -0.980549  flagellar biosynthesis protein FliR
Phong_00288       2       15  -0.157810  Flagellar biosynthetic protein FliQ
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00278  DHBDHDRGNASE  97     2e-26    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 97.4 bits (242), Expect = 2e-26
Identities = 71/251 (28%), Positives = 115/251 (45%), Gaps = 12/251 (4%)

Query: 9 IAIITGAAGDIGRAIAAALTSD-YHLALVDIDADALEGAGKTLATQGAKLSLHRCDLTCP 67
IA ITGAA IG A+A L S H+A VD + + LE +L + D+
Sbjct: 10 IAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDS 69

Query: 68 ADLTALKEQL-APLGRVSLLVNNAGAAAALSLQELTAEQLRQDLALNLEAATNIFKTFED 126
A + + ++ +G + +LVN AG + L+ E+ ++N N ++
Sbjct: 70 AAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSK 129

Query: 127 DLK--QGGNLINIAS-VNGMAVFGHPAYSMAKAGLIHFTRLVATEYGKYGLRANTVAPGT 183
+ + G+++ + S G+ AY+ +KA + FT+ + E +Y +R N V+PG+
Sbjct: 130 YMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPGS 189

Query: 184 VRT----QAWNARAAANPKVFEEAAYW---YPLGRVAEPEDVANAVAFLASPAANAITGV 236
T W A + + PL ++A+P D+A+AV FL S A IT
Sbjct: 190 TETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHITMH 249

Query: 237 CLPVDCGLTSG 247
L VD G T G
Sbjct: 250 NLCVDGGATLG 260


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00282  PF06057       31     0.004    Type IV secretory pathway VirJ component
		>PF06057#Type IV secretory pathway VirJ component

Length = 243

Score = 31.3 bits (71), Expect = 0.004
Identities = 19/76 (25%), Positives = 33/76 (43%), Gaps = 10/76 (13%)

Query: 146 RTPVCVIAAGAKALLDLPKTM-EVLETRGVPVISYGVDELPAFWSRTSGIKAPTRLDSAA 204
+ P+ + +G L K + +L+ +G PV+ G L +W + K P D
Sbjct: 50 KPPLVIFLSGDGGWATLDKAVGGILQQQGWPVV--GWSSLKYYWKQ----KDPK--DVTQ 101

Query: 205 DIAKLL-KARGEFGGH 219
D ++ K + EFG
Sbjct: 102 DTLAIIDKYQAEFGTQ 117


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00285  HTHFIS        91     2e-21    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 91.4 bits (227), Expect = 2e-21
Identities = 41/125 (32%), Positives = 65/125 (52%), Gaps = 12/125 (9%)

Query: 726 ARILLVEDEEAVRAFAARALSSRGYEVFEAGSGTEALEVMEEEGGAMDLVVSDVVMPEMD 785
A IL+ +D+ A+R +ALS GY+V + + G DLVV+DVVMP+ +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDG--DLVVTDVVMPDEN 61

Query: 786 GPSLLVELRKMQPDLKIIFVSGYAE-----EAFEKNLPAEETFNFLPKPFTLKQLATTVK 840
LL ++K +PDL ++ +S +A E + +++LPKPF L +L +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASE-----KGAYDYLPKPFDLTELIGIIG 116

Query: 841 EVLAE 845
LAE
Sbjct: 117 RALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00286  TYPE3IMSPROT  318    e-109    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein

family signature.
Length = 354

Score = 318 bits (816), Expect = e-109
Identities = 103/346 (29%), Positives = 184/346 (53%), Gaps = 5/346 (1%)

Query: 8 SEKTEEPTQKRLDDAHEKGDVPKSQEVSTWFAVLGTAMAMTFLADRTAGSLSGLLENFFD 67
EKTE+PT K++ DA +KG V KS+EV + ++ + + L+D S L+ +
Sbjct: 3 GEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLIPAE 62

Query: 68 KAGQMRLDGPALVLLWRDLGPALLAIIAAP-MAVMLLFAMAGNLIQHKPIWSSERLKPKL 126
Q L + D + P + V L A+A +++Q+ + S E +KP +
Sbjct: 63 ---QSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDI 119

Query: 127 EKVSPLKGLKRLFSQESLANFLKGLVKILVVAGLMGLVLWPQRRVLDTLVFRDLRLLLEE 186
+K++P++G KR+FS +SL FLK ++K+++++ L+ +++ L L + +
Sbjct: 120 KKINPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPL 179

Query: 187 TKELSLQLIGAILAVMTVIAGVDYLWQRQKWFKKQRMTRREIKEEYKQAEGDPQIKAKIR 246
++ QL+ VI+ DY ++ ++ K+ +M++ EIK EYK+ EG P+IK+K R
Sbjct: 180 LGQILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRR 239

Query: 247 ELRVMRSRHRMMSAVPEATVVIANPTHYAVALHYEEG-MGAPKCVAKGVDALALRIRRVA 305
+ M V ++VV+ANPTH A+ + Y+ G P K DA +R++A
Sbjct: 240 QFHQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIA 299

Query: 306 EDAQVPVVENPPLARTLHAAMDLDELVPEEHFEAVAKVIGYVMQLK 351
E+ VP+++ PLAR L+ +D +P E EA A+V+ ++ +
Sbjct: 300 EEEGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQN 345


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00287  TYPE3IMRPROT  103    9e-29    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein

family signature.
Length = 261

Score = 103 bits (259), Expect = 9e-29
Identities = 60/242 (24%), Positives = 112/242 (46%), Gaps = 6/242 (2%)

Query: 14 VFFLVFARVGTMIMLVPVFGERAIPVRIRLIVALMLCYVLYPLV-QPLYPLGELQQLPVL 72
++F RV +I P+ ER++P R++L +A+M+ + + P + P+ L +
Sbjct: 15 LYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPANDVPVFSFFALWLA 74

Query: 73 LRYLFLELVLGFFVGLGMRLIAYALQTAGAIIANQSGLAFAMGGDMTNFGEQGAMVGSFL 132
++ ++++G +G M+ A++TAG II Q GL+FA D + ++ +
Sbjct: 75 VQ----QILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPAS-HLNMPVLARIM 129

Query: 133 AMLGTTLVLATNLHYVIIAGIHDSFTLFPPGRILPVGDMAHMAVDTVSHVFVIAAHIGAP 192
ML L L N H +I+ + D+F P G + S +F+ + P
Sbjct: 130 DMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIFLNGLMLALP 189

Query: 193 FILFGLVFYFGLGLLNKLMPQLQIFFVAMPVAILAGFTLLMLLLSTLMGWYLQQYEAVLS 252
I L LGLLN++ PQL IF + P+ + G +L+ L+ + + + + +
Sbjct: 190 LITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFCEHLFSEIFN 249

Query: 253 PF 254

Sbjct: 250 LL 251


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00288  TYPE3IMQPROT  58     4e-15    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein

family signature.
Length = 86

Score = 58.2 bits (141), Expect = 4e-15
Identities = 25/82 (30%), Positives = 49/82 (59%)

Query: 5 EVLDIASSGIWTLIKLSVPVLLVGLVVGVIVALFQALTQIQEQTLVFVPKIISIFLALLV 64
+++ + ++ ++ LS +V ++G++V LFQ +TQ+QEQTL F K++ + L L +
Sbjct: 3 DLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLFL 62

Query: 65 AFPFMGSVLGSFTEQISQLIVS 86
+ G VL S+ Q+ L ++
Sbjct: 63 LSGWYGEVLLSYGRQVIFLALA 84


4     Phong_00305  Phong_00310  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias    %GCBias  Product
Phong_00305       2       20  -1.952851  Flagellar P-ring protein precursor
Phong_00306       2       21  -3.279398  chemotactic signal-response protein CheL
Phong_00307       2       19  -3.160718  hypothetical protein
Phong_00308       2       16  -3.584813  hypothetical protein
Phong_00309       1       18  -3.233743  Flagellin
Phong_00310      -1       17  -3.120841  Flagellin
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00305  FLGPRINGFLGI  383    e-135    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 383 bits (986), Expect = e-135
Identities = 187/369 (50%), Positives = 252/369 (68%), Gaps = 9/369 (2%)

Query: 10 RTFKALLLLLLLPYLIHPANSA--SRIKDIADFEGIRENQLIGYGLVVGLNGSGDGLNNA 67
R A L+ LP+L P A SRIKDIA + R+NQLIGYGLVVGL G+GD L ++
Sbjct: 5 RIIAAALVFSALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSS 64

Query: 68 PFTRQSLQAMLERLGVNTNELDLNTKNAAAVMVTANLPPFSTQGSRIDVAVSALGDASSL 127
PFT QS++AML+ LG+ T N KN AAVMVTANLPPF++ GSR+DV VS+LGDA+SL
Sbjct: 65 PFTEQSMRAMLQNLGITTQGGQSNAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSL 124

Query: 128 QGGTLLVTPLIGANGETYAIAQGPVTVAGFEASGDAASITRGVPTSGRVANGGLVEKEVD 187
+GG L++T L GA+G+ YA+AQG + V GF A GDAA++T+GV TS RV NG ++E+E+
Sbjct: 125 RGGNLIMTSLSGADGQIYAVAQGALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELP 184

Query: 188 FKLASLDTLRIALRNPDLTTTRRVALAINEL----IGLPTAEPLDSATVRISLPRLYDGN 243
K L + LRNPD +T RVA +N G P AEP DS + + PR+ D
Sbjct: 185 SKFKDSVNLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKPRVAD-- 242

Query: 244 IVDLLTDIEQLVIEPDMPARIVIDESSGIIVMGKEVKVSTVAVAQGNLTVTIAEAPQVSQ 303
+ L+ +IE L +E D PA++VI+E +G IV+G +V++S VAV+ G LTV + E+PQV Q
Sbjct: 243 LTRLMAEIENLTVETDTPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQ 302

Query: 304 PDPFSLGETTQVPRTNVSVDEDNSHLAIVSEAVTLQQLVDGLNALGISPRDLIAILQAIK 363
P PFS G+T P+T++ ++ S +AI E L+ LV GLN++G+ +IAILQ IK
Sbjct: 303 PAPFSRGQTAVQPQTDIMAMQEGSKVAI-VEGPDLRTLVAGLNSIGLKADGIIAILQGIK 361

Query: 364 AAGALQAEI 372
+AGALQAE+
Sbjct: 362 SAGALQAEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00306  FLGFLGJ       41     2e-07    Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 40.9 bits (95), Expect = 2e-07
Identities = 21/79 (26%), Positives = 39/79 (49%), Gaps = 1/79 (1%)

Query: 22 PTNNLDKKAKEFEAVYLNQLMQSMFSGLQEGGTYGSGPGSDAWRSMLLGEYANQLAQSGG 81
P N+ A++ E +++ +++SM L + G + S + + SM + A Q+ G
Sbjct: 29 PAANIRPVARQVEGMFVQMMLKSMRDALPKDGLFSS-EHTRLYTSMYDQQIAQQMTAGKG 87

Query: 82 IGLAETIKAQMLEIQEANQ 100
+GLAE + QM Q +
Sbjct: 88 LGLAEMMVKQMTPEQPLPE 106


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00309  FLAGELLIN     56     9e-11    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 56.2 bits (135), Expect = 9e-11
Identities = 42/313 (13%), Positives = 83/313 (26%), Gaps = 5/313 (1%)

Query: 14 LSTLQSTATLISSTQERLSTGKKVNSALDNPNSYFTAASLNNRASDLSNLQDDMGQSVST 73
+ L + + +SS ERLS+G ++NSA D+ A + L+ + +S
Sbjct: 14 QNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQASRNANDGISI 73

Query: 74 LTAADKGIKAISKLIDAAKGKANQALQ-TEDVSQRNKYAKEFNDLRTQIQDLAKDSGYKG 132
+ + I+ + + + QA T S E +I ++ + + G
Sbjct: 74 AQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEIDRVSNQTQFNG 133

Query: 133 KNLLGGDGN-DLTVKFNEDGTSKLEIQSVDFTDLGKEDGDLKLGSLEIAKNGDFSATLEG 191
+L D + V N+ T +++Q +D LG + ++ + S
Sbjct: 134 VKVLSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEATVGDLKSSFKNVT 193

Query: 192 TITDGDTKLASLDKLAAGDSIKLTFGEGDDAKTAELEITADSTLNDLTAKMKEVSGSDFK 251
++ + D V
Sbjct: 194 GYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDAENNTAVDLFKTT 253

Query: 252 LDATAKTITGSVGAKLAISYTDNDSGDSADGSDLTAATTIGVKAGAWDTDSGIDSSLTEI 311
T A + T G + I+ +
Sbjct: 254 KSTAG---TAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVSTTINGEKVTL 310

Query: 312 KAAQDALRAQAST 324
A A
Sbjct: 311 TVADITAGAANVD 323



Score = 45.4 bits (107), Expect = 2e-07
Identities = 48/334 (14%), Positives = 93/334 (27%), Gaps = 9/334 (2%)

Query: 68 GQSVSTLTAADKGIKAISKLIDAAKGKANQALQTEDVSQRNKYAKEFNDLRTQIQDLAKD 127
G +T+ K ++ A G + S + ++ A +
Sbjct: 176 GPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVN--SGAVVTDTTAPTVPDKVYVNAAN 233

Query: 128 SGYKGKNLLGGDGNDLTVKFNEDGTSKLEIQSVDFTDLGKEDGDLKLGSLEIAKNGDFSA 187
+ DL + GKE + +
Sbjct: 234 GQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGN 293

Query: 188 TLEGTITDGDTKLASLDKLAAGDSIKLTFGEGDDAKTAELEITADSTLNDLTAKMKEVSG 247
G ++ +A + + + + + K K S
Sbjct: 294 DGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESA 353

Query: 248 SDFKLDATAK-------TITGSVGAKLAISYTDNDSGDSADGSDLTAATTIGVKAGAWDT 300
L+A T+ G+ A +G + + + + A
Sbjct: 354 KLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAAAA 413

Query: 301 DSGIDSSLTEIKAAQDALRAQASTFGTNLSVVKNRKNFTADMINTLEVGAGKLTLADTNR 360
+ L I +A + A S+ G + + + + L ++ AD
Sbjct: 414 KKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYAT 473

Query: 361 EGANLAALQTQQQLATSTLALAARANQSVLQLIR 394
E +N++ Q QQ TS LA A + Q+VL L+R
Sbjct: 474 EVSNMSKAQILQQAGTSVLAQANQVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00310  FLAGELLIN     62     2e-12    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 61.6 bits (149), Expect = 2e-12
Identities = 50/371 (13%), Positives = 104/371 (28%), Gaps = 2/371 (0%)

Query: 12 NNLSTLQSTASMISSTQERLSTGMKVNSALDNPNSYFTAASLNNRASDLSNLQDDMGQSV 71
+ L + S +SS ERLS+G+++NSA D+ A + L+ + +
Sbjct: 12 LTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQASRNANDGI 71

Query: 72 STLTAADKGIKAISKLVDAAKGKANQALQ-SEDKEQRAKYSKEFNDLRTQIQDLAKDSGY 130
S + + I+ + + + QA + E +I ++ + +
Sbjct: 72 SIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEIDRVSNQTQF 131

Query: 131 KGKNLLGGDGN-DLKVKFNEDGTSKLDIKSVDFTDLSAADSDLKLGAVQQLEKGDFTSGA 189
G +L D ++V N+ T +D++ +D L ++ + +
Sbjct: 132 NGVKVLSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEATVGDLKSSFKN 191

Query: 190 MTLGSGTFEAATKLDTLDNYAAGGKITIKYGSGDDAKTKELEITADNTVGDLAAAIKEVT 249
+T A K N A T D + A+
Sbjct: 192 VTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDAENNTAVDLFK 251

Query: 250 GEDFTIDGTAKTLSGKVADDLTFSYEASSGGETGGELSTGDTTTGLVKNGWSEDAGIEAS 309
T G T + + + +
Sbjct: 252 TTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVSTTINGEKVTLT 311

Query: 310 LETLKTAQDSLRAQASTFGTNLSVVQNRKNFTADMINTLEVGAGKLTLADTNKEGANLAA 369
+ + ++ A N+ FT D E A+ +G +
Sbjct: 312 VADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLEANNAVKGESKIT 371

Query: 370 LQTQQQLATST 380
+ + A +
Sbjct: 372 VNGAEYTANAA 382



Score = 45.8 bits (108), Expect = 2e-07
Identities = 44/336 (13%), Positives = 93/336 (27%), Gaps = 11/336 (3%)

Query: 68 GQSVSTLTAADKGIKAISKLVDAAKGKANQ-ALQSEDKEQRAKYSKEFNDLRTQIQDLAK 126
G +T+ K ++ A G + + D +
Sbjct: 176 GPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQ 235

Query: 127 DSGYKGKNLLGGDGNDLKVKFNEDGTSKLDIKSVDFTDLSAADSDLKLGAVQQLEKGDFT 186
+ +N D +K ++ + + G+
Sbjct: 236 LTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDG 295

Query: 187 SGAMTLGSGTFEAATKLDTLDNYAAGGKITIKYGSGDDAKTKELEITADNTVGDLAAAIK 246
+G ++ K+ G + + +K + D +
Sbjct: 296 NGKVSTTIN----GEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNE 351

Query: 247 EVTGEDFTIDGTAKTLSGKVADDLTFSYEASSGGETGGELSTGDTTTGLVKNGWSED--- 303
D + K S + ++ A+ T + T + +
Sbjct: 352 SAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAA 411

Query: 304 ---AGIEASLETLKTAQDSLRAQASTFGTNLSVVQNRKNFTADMINTLEVGAGKLTLADT 360
L ++ +A + A S+ G + + + + L ++ AD
Sbjct: 412 AAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADY 471

Query: 361 NKEGANLAALQTQQQLATSTLALAARANQSVLQLIR 396
E +N++ Q QQ TS LA A + Q+VL L+R
Sbjct: 472 ATEVSNMSKAQILQQAGTSVLAQANQVPQNVLSLLR 507


5     Phong_00339  Phong_00350  Y     N          N                   Genomic Island
LocusTag     DNBias  CDNBias    %GCBias  Product
Phong_00339       2       18   1.538423  Hydroxymethylpyrimidine/phosphomethylpyrimidine
Phong_00340       2       18   1.275291  Hydroxyethylthiazole kinase
Phong_00341       0       18   0.397008  DNA-binding transcriptional regulator AsnC
Phong_00342       0       22   1.005341  Riboflavin transporter
Phong_00343      -1       17   1.505645  Caffeine dehydrogenase subunit alpha
Phong_00344      -3       14   1.510647  Riboflavin transporter
Phong_00345      -3       14   1.789726  hypothetical protein
Phong_00346      -2       14   2.284680  Long-chain-fatty-acid--CoA ligase
Phong_00347      -2       18   3.730833  Adenine deaminase
Phong_00348      -2       23   3.919082  Glucose-6-phosphate 1-dehydrogenase
Phong_00349      -2       20   3.550082  KHG/KDPG aldolase
Phong_00350      -2       19   3.021057  Glucose-6-phosphate isomerase
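
The Prediction column on this page appears to follow from the Bias and Virulence flags of each island: every island flagged Y for both is labelled a pathogenicity island, while island 5 (Bias Y, Virulence N) is labelled a genomic island. The sketch below only reproduces that observed pattern; `label_island` is a hypothetical function, and the rule is inferred from the eight rows shown here, not taken from the PredictBias implementation.

```python
def label_island(bias, virulence, insertion_elements):
    """Return the Prediction label for the flag combinations seen on this page.

    Each argument is "Y" or "N". The insertion-elements flag is accepted for
    completeness but does not change the label in any row shown here
    (islands 1 and 2 differ only in that flag and share the same label).
    """
    if bias == "Y" and virulence == "Y":
        return "Pathogenicity Island (biased-composition)"
    if bias == "Y" and virulence == "N":
        return "Genomic Island"
    # Combinations without compositional bias do not occur on this page,
    # so no label is assumed for them.
    return None


# Flags of island 2 (Y, Y, Y) and island 5 (Y, N, N) as listed on this page.
print(label_island("Y", "Y", "Y"))
print(label_island("Y", "N", "N"))
```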
6     Phong_00403  Phong_00413  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias    %GCBias  Product
Phong_00403      -2       18   3.000558  hypothetical protein
Phong_00404      -2       21   3.345884  DNA repair protein RecN
Phong_00405      -1       20   2.904023  Outer membrane protein assembly factor BamD
Phong_00406       0       21   3.599355  UDP-3-O-[3-hydroxymyristoyl] N-acetylglucosamine
Phong_00407       1       23   4.198649  Cell division protein FtsZ
Phong_00408       3       29   4.756305  Cell division protein FtsA
Phong_00409       4       30   3.912791  Cell division protein FtsQ
Phong_00410       4       29   3.732636  D-alanine--D-alanine ligase B
Phong_00411       5       30   3.418676  UDP-N-acetylenolpyruvoylglucosamine reductase
Phong_00412       3       28   3.378735  UDP-N-acetylmuramate--L-alanine ligase
Phong_00413       2       27   2.896155  UDP-N-acetylglucosamine--N-acetylmuramyl-
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00404  FbpA_PF05833  34     0.002    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 33.7 bits (77), Expect = 0.002
Identities = 14/70 (20%), Positives = 29/70 (41%), Gaps = 14/70 (20%)

Query: 161 ALYRAYSSAKRELSQLEKRIDEARREADYLRSSVDELSGLDPQAGEEQELALKRQDMMQV 220
+ Y+ Y+ K+ +++ + E +YL S + ++ D ++
Sbjct: 385 SYYKKYNKLKKSEEAANEQLLQNEEELNYLYSVLTNINNADNYD--------------EI 430

Query: 221 EKIASELIEA 230
E+I ELIE
Sbjct: 431 EEIKKELIET 440


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00408  SHAPEPROTEIN  54     3e-10    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein

signature.
Length = 347

Score = 54.4 bits (131), Expect = 3e-10
Identities = 43/186 (23%), Positives = 74/186 (39%), Gaps = 14/186 (7%)

Query: 208 RLVATPYASGLATLMADETELGVACVDFGGGTTTLSIFVDGKMVHLDAMAVGGNHITMDI 267
L+ P A+ + + G VD GGGTT +++ +V+ ++ +GG+ I
Sbjct: 139 FLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNGVVYSSSVRIGGDRFDEAI 198

Query: 268 ARWF--TTNM----HEAERLKTLYGSALPSVADEHDYIEVAPLDAEGDLPQSAPKS--EL 319
+ AER+K GSA P DE IEV + +P+ + E+
Sbjct: 199 INYVRRNYGSLIGEATAERIKHEIGSAYP--GDEVREIEVRGRNLAEGVPRGFTLNSNEI 256

Query: 320 TKVIGPRVEETLELVRDKLNSSG---FAGRVGKRIVLTGGGSQLTGMGEMARRILGHNVR 376
+ + + + V L + + +VLTGGG+ L + + G V
Sbjct: 257 LEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNLDRLLMEETGIPVV 316

Query: 377 LG-RPL 381
+ PL
Sbjct: 317 VAEDPL 322


7     Phong_00423  Phong_00471  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias    %GCBias  Product
Phong_00423       7       33  -1.552525  hypothetical protein
Phong_00424       7       32  -0.577636  Tyrosine recombinase XerC
Phong_00425       6       28  -0.694818  helix-turn-helix protein
Phong_00426       6       27   0.359936  hypothetical protein
Phong_00427       6       25   0.413246  hypothetical protein
Phong_00428       6       25   0.642576  hypothetical protein
Phong_00429       5       27   1.695124  hypothetical protein
Phong_00430       4       25   1.859508  hypothetical protein
Phong_00431       4       26   1.828458  PD-(D/E)XK nuclease superfamily protein
Phong_00432       4       27   1.280730  hypothetical protein
Phong_00433       6       27   1.739132  hypothetical protein
Phong_00434       5       27   1.600930  C-5 cytosine-specific DNA methylase
Phong_00435       6       26   0.732075  DNA polymerase family A
Phong_00436       7       29   0.648113  hypothetical protein
Phong_00437       7       29   2.124078  hypothetical protein
Phong_00438       7       29   2.104757  Regulatory protein RepA
Phong_00439       7       29   2.574560  hypothetical protein
Phong_00440       7       29   2.695538  hypothetical protein
Phong_00441       7       28   2.639648  Transglycosylase SLT domain protein
Phong_00442       7       28   2.434811  hypothetical protein
Phong_00443       8       28   1.128493  hypothetical protein
Phong_00444       7       28   0.752484  hypothetical protein
Phong_00445       8       28  -0.085477  hypothetical protein
Phong_00446       8       28   0.397230  hypothetical protein
Phong_00447       8       25   0.813886  hypothetical protein
Phong_00448       7       25   0.727617  hypothetical protein
Phong_00449       6       25   0.513516  hypothetical protein
Phong_00450       6       25   0.549037  hypothetical protein
Phong_00451       6       26   0.814768  hypothetical protein
Phong_00452       6       26   0.634995  hypothetical protein
Phong_00453       5       28   0.503213  hypothetical protein
Phong_00454       6       32   0.245699  Terminase-like family protein
Phong_00455       6       31   0.397549  hypothetical protein
Phong_00456       6       25   1.087674  hypothetical protein
Phong_00457       3       20   1.435926  hypothetical protein
Phong_00458       3       20   1.670412  hypothetical protein
Phong_00459       3       20   1.282813  hypothetical protein
Phong_00460       3       20   1.481447  hypothetical protein
Phong_00461       2       22   1.376083  hypothetical protein
Phong_00462       2       25   0.741107  hypothetical protein
Phong_00463       6       31   0.802175  hypothetical protein
Phong_00464       4       31   1.024285  hypothetical protein
Phong_00465       4       32   1.603096  hypothetical protein
Phong_00466       2       19   0.506708  hypothetical protein
Phong_00467       3       19   0.367358  hypothetical protein
Phong_00468       3       19   0.729167  hypothetical protein
Phong_00469       3       22   1.576685  hypothetical protein
Phong_00470       2       19   1.423558  hypothetical protein
Phong_00471       3       17   1.117261  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00445  RTXTOXIND     31     0.005    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 30.6 bits (69), Expect = 0.005
Identities = 17/118 (14%), Positives = 37/118 (31%), Gaps = 5/118 (4%)

Query: 11 ARARREEEQRQARVRAGTSQINSVFDGQFTDDFFKTRAQSYLDYAN-PQLEDQYNNATKE 69
+ R E AR+ + + DDF + + + E++Y A E
Sbjct: 210 DKKRAERLTVLARINRYENLSRV--EKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNE 267

Query: 70 LSFALARSGTLDSSVRGGKLSELQKLYDIEKQNVAD--QAQAQSKQLRGEVERSRGDL 125
L ++ ++S + K + + + Q L E+ ++
Sbjct: 268 LRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQ 325


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00446  SACTRNSFRASE  30     0.002    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 30.3 bits (68), Expect = 0.002
Identities = 13/71 (18%), Positives = 26/71 (36%), Gaps = 5/71 (7%)

Query: 42 QYLQSANPTIFVAEDERCLVGFLVASMHGYMWAGGHYTTQDVLFVRPENRGSRAAVHLMN 101
Y++ F+ E +G + + W G Y + + V + R L++
Sbjct: 58 SYVEEEGKAAFLYYLENNCIGRIKIRSN---WNG--YALIEDIAVAKDYRKKGVGTALLH 112

Query: 102 NLIRWSKQIGA 112
I W+K+
Sbjct: 113 KAIEWAKENHF 123


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00451  FIMBRILLIN    31     0.008    Porphyromonas gingivalis: fimbrillin protein signature.
		>FIMBRILLIN#Porphyromonas gingivalis: fimbrillin protein signature.

Length = 348

Score = 30.8 bits (69), Expect = 0.008
Identities = 17/72 (23%), Positives = 29/72 (40%), Gaps = 7/72 (9%)

Query: 36 LKAFDKRAKSFVGGKEKVTAGVRAGAGQLSLAGYSHDDQVSYGNPVKNKRAGYKWREHHI 95
L A ++ A + E V + AG G +Q+S P++ KR H
Sbjct: 82 LTAENQEAAGLIMTAEPVEVTLVAGNNYYGYDGSQGGNQISQDTPLEIKRV-------HA 134

Query: 96 GMGITHTELKMA 107
M T +++M+
Sbjct: 135 RMAFTEIKVQMS 146


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00462  PF07132       34     0.002    Harpin protein (HrpN)
		>PF07132#Harpin protein (HrpN)

Length = 356

Score = 33.5 bits (76), Expect = 0.002
Identities = 29/85 (34%), Positives = 39/85 (45%), Gaps = 2/85 (2%)

Query: 534 GGGGVAGAGYTPHGGGGSFGSNGGPSNNILE--NRVFYPKTFHSGTGAGGNGSGGRGQLA 591
G GG+ + + +GG S + GG +NI E + + F GG G G G +
Sbjct: 19 GNGGLFPSQSSQNGGSPSQSAFGGQRSNIAEQLSDIMTTMMFMGSMMGGGLGGGLGGLGS 78

Query: 592 AGGGSVGSGGLGGGGGGLGGLGGPG 616
+ GG G GG GGGLG G G
Sbjct: 79 SLGGLGGGLLGGGLGGGLGSSLGSG 103


8     Phong_00498  Phong_00515  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias    %GCBias  Product
Phong_00498       2       11   0.526977  Glycosyl hydrolases family 18
Phong_00499       1       11   0.746733  putative ABC transporter ATP-binding protein
Phong_00500       3       14   1.246771  cobalt transport protein CbiM
Phong_00501       3       14   1.349391  Cyclic di-GMP phosphodiesterase Gmr
Phong_00502       4       15   2.739501  Nuclease SbcCD subunit C
Phong_00503       2       19   3.033285  Nuclease SbcCD subunit D
Phong_00504       1       27   2.745385  Quercetin 2,3-dioxygenase
Phong_00505       1       27   3.352171  Helix-turn-helix domain protein
Phong_00506       0       26   3.613611  Arsenate reductase
Phong_00507       0       18   4.067325  FMN-dependent NADPH-azoreductase
Phong_00508       0       15   3.549920  Glucosamine kinase GspK
Phong_00509       0       16   2.988467  putative HTH-type transcriptional regulator
Phong_00510       2       16   2.262964  Glutamine--fructose-6-phosphate aminotransferase
Phong_00511       2       17   1.610246  N-acetylglucosamine-6-phosphate deacetylase
Phong_00512       2       18   0.276982  hypothetical protein
Phong_00513       3       15  -0.365475  putative allantoin permease
Phong_00514       1       13  -1.148854  Omega-amino acid--pyruvate aminotransferase
Phong_00515       2       19  -0.884028  HTH-type transcriptional regulator RutR
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00498  CARBMTKINASE  29     0.022    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 29.0 bits (65), Expect = 0.022
Identities = 10/44 (22%), Positives = 22/44 (50%), Gaps = 4/44 (9%)

Query: 73 KRKVLVALGGATLASD----TWANCAQNVEEVAQCLADFVKQNN 112
++V++ALGG L ++ NV + A+ +A+ + +
Sbjct: 2 GKRVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGY 45


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00501  CHANLCOLICIN  34     0.002    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 33.9 bits (77), Expect = 0.002
Identities = 32/97 (32%), Positives = 45/97 (46%), Gaps = 5/97 (5%)

Query: 313 KTQAEVSLSAQRMLDSLAKPFSLKNDLNIRISACVGIALIPEVGKHPDAI-LAKAGLALK 371
KTQAE + A+ ++ AK + ++ L R+ V AL + P A LA A A
Sbjct: 64 KTQAEQAARAKAAAEAQAKAKANRDALTQRLKDIVNEALRHNASRTPSATELAHANNAAM 123

Query: 372 RAELSGKGKLAFFEPSMEASLKERRTIELDLQEALQQ 408
+AE +LA E E + KE E QEA Q+
Sbjct: 124 QAEDERL-RLAKAE---EKARKEAEAAEKAFQEAEQR 156


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00502  RTXTOXIND     43     7e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 42.9 bits (101), Expect = 7e-06
Identities = 28/204 (13%), Positives = 64/204 (31%), Gaps = 16/204 (7%)

Query: 816 ARLTGLQGEALMRGLSALLRFVGETRKSWNECKDELERCEVQLGAMAAEIASLQARLEEN 875
+LT L EA + L + + +E ++ + E EE
Sbjct: 125 LKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEV 184

Query: 876 EGQLLPQQQQLDALQQQLGELQ----RTRAELLGGEGTQAHRARHQDALKQAQAEAAQLL 931
++Q Q Q + + + RAE L + +R + +++++ + L
Sbjct: 185 LRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLA-RINRYENLSRVEKSRLDDFSSL 243

Query: 932 GQLGQV--------NEQLAAAHGQSTQKKARLEELRTAITQGEAGFATALQREGLSPEQF 983
+ + A + K++LE++ + I + L +
Sbjct: 244 LHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAK---EEYQLVTQLFKNEI 300

Query: 984 TLLRPTADEQLAQLQEELERLTQE 1007
+ + L EL + +
Sbjct: 301 LDKLRQTTDNIGLLTLELAKNEER 324



Score = 39.0 bits (91), Expect = 1e-04
Identities = 31/225 (13%), Positives = 68/225 (30%), Gaps = 25/225 (11%)

Query: 722 LTGLQEASKSRLGELEGQLAALEAQRVALAEAQQARDELHQSLTRASAQASAVKADLVGL 781
LT L + + + A LE R + ++L + V + V
Sbjct: 127 LTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEV-- 184

Query: 782 LVQRQEREAQLKEGKSELARLLAAIDDVAGAPQSARLTGLQGEALMRGLSALLRFVGETR 841
L + Q +++ + + + +L +
Sbjct: 185 LRLTSLIKEQFSTWQNQKYQKELNL-----------------DKKRAERLTVLARINRYE 227

Query: 842 KSWNECKDELERCEVQLGAMAAEIASLQARLEENEGQLLPQQQQLDALQQQLGELQR--T 899
K L+ ++ + A + + E E + + +L + QL +++
Sbjct: 228 NLSRVEKSRLDD----FSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEIL 283

Query: 900 RAELLGGEGTQAHRARHQDALKQAQAEAAQLLGQLGQVNEQLAAA 944
A+ TQ + D L+Q L +L + E+ A+
Sbjct: 284 SAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQAS 328



Score = 35.6 bits (82), Expect = 0.001
Identities = 29/174 (16%), Positives = 55/174 (31%), Gaps = 19/174 (10%)

Query: 340 DAQRQGEQEALRKAQQLSAQLQDQVRQFEPQWRLAEQKDLQLSHSQAECDKLSRRVGDLA 399
Q E+E LR + Q Q QK+L L +AE + R+
Sbjct: 175 YFQNVSEEEVLRLTSLIKEQFSTWQNQ-------KYQKELNLDKKRAERLTVLARINRYE 227

Query: 400 VRTADTREKLVNAQSRQQDLAAVKAKVKAQLERLQPLRPIVDAEQSLRGDLHRAGDTAEQ 459
+ + +L L +A K + + V+A LR + +
Sbjct: 228 NLSRVEKSRL----DDFSSLLHKQAIAKHAVLEQE--NKYVEAVNELRVYKSQLEQIESE 281

Query: 460 LQKQEAQLSQLKQAEHEKAQLQQELESKLQDMGAQVAQAEAQLAAEQQAIAAID 513
+ + + + Q + E+ KL+ + +LA ++ A
Sbjct: 282 ILSAKEEYQLVTQ------LFKNEILDKLRQTTDNIGLLTLELAKNEERQQASV 329



Score = 35.6 bits (82), Expect = 0.001
Identities = 30/253 (11%), Positives = 75/253 (29%), Gaps = 44/253 (17%)

Query: 251 EAAQLKGLQEDLQKNKALDAAQQQLVSTREAMGQAQQRLGNAADKQQLLATLKRIKIVEP 310
+ L Q L+ R + Q + + L +
Sbjct: 119 RKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQN 178

Query: 311 LAQQVEMRKGEEVSE-LDAF----TQLQTQMQALDAQRQGEQEALRKAQQLSAQLQDQVR 365
++++ +R + E + Q + + A+R + + + LS + ++
Sbjct: 179 VSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLD 238

Query: 366 QFEPQWRLAEQKDLQLSHSQAECDKLSRRVGDLAVRTADTREKLVNAQSRQQDLAAVKAK 425
F L ++ + + ++ +++ +
Sbjct: 239 DFSS---LLHKQAIA-------------------------KHAVLEQENKYVEAVNELRV 270

Query: 426 VKAQLERLQPLRPIVDAEQSLRGDLHRAGDTAEQLQKQEAQLSQLKQAEHEKAQLQQELE 485
K+QLE+++ I+ A++ + + L +L+Q L EL
Sbjct: 271 YKSQLEQIE--SEILSAKEEYQL---------VTQLFKNEILDKLRQTTDNIGLLTLELA 319

Query: 486 SKLQDMGAQVAQA 498
+ A V +A
Sbjct: 320 KNEERQQASVIRA 332



Score = 32.5 bits (74), Expect = 0.012
Identities = 27/208 (12%), Positives = 67/208 (32%), Gaps = 34/208 (16%)

Query: 396 GDLAVRTADTREKLVNAQSRQQDLAAVKAKVKAQLERLQPLRPIVDAEQSLRGDLH---R 452
GD+ ++ + +++ L A + + R Q L ++ + L
Sbjct: 121 GDVLLKLTALGAEADTLKTQSSLLQA-----RLEQTRYQILSRSIELNKLPELKLPDEPY 175

Query: 453 AGDTAEQLQKQEAQL--SQLKQAEHEKAQLQQELESK---LQDMGAQVAQAEAQLAAEQQ 507
+ +E+ + L Q +++K Q + L+ K + A++ + E E+
Sbjct: 176 FQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKS 235

Query: 508 AIA---------AIDGARLHEQQKLMLRLEQQVSS------------LQMSAQKARHKEQ 546
+ AI + EQ+ + ++ L + +
Sbjct: 236 RLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQL 295

Query: 547 DKEQISTEVTVLEAALTGLAHEVDGARQ 574
K +I ++ + L E+ +
Sbjct: 296 FKNEILDKLRQTTDNIGLLTLELAKNEE 323



Score = 31.7 bits (72), Expect = 0.018
Identities = 36/252 (14%), Positives = 72/252 (28%), Gaps = 57/252 (22%)

Query: 632 QLFAGVQKARTEAQVRAKQARAALERVLAKQSGDAARLQALLAQKTRAEKEIIALREEAG 691
+ + EA Q+ R+ + R Q L +
Sbjct: 122 DVLLKLTALGAEADTLKTQSSLLQARL------EQTRYQILSRSIELNK----------- 164

Query: 692 RAWHGLPLMLEQSDLPAPPPLEDVLLGRGSLTGLQEASKSRLGELEGQLAALEAQRVALA 751
L + LP P + + + L K + + Q E L
Sbjct: 165 ---------LPELKLPDEPYFQ--NVSEEEVLRLTSLIKEQFSTWQNQKYQKEL---NLD 210

Query: 752 EAQQARDELHQSLTRASAQASAVKADLVGL-------LVQRQ---EREAQLKEGKSELAR 801
+ + R + + R + K+ L + + E+E + E +EL
Sbjct: 211 KKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRV 270

Query: 802 LLAAIDDVAGAPQSARLTGLQGEALMRGLSALLRFVGETRKSWNECKDELERCEVQLGAM 861
+ ++ + SA+ L + NE D+L + +G +
Sbjct: 271 YKSQLEQIESEILSAKEEYQLVTQLFK----------------NEILDKLRQTTDNIGLL 314

Query: 862 AAEIASLQARLE 873
E+A + R +
Sbjct: 315 TLELAKNEERQQ 326


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
Phong_00515 | HTHTETR | 66 | 2e-15 | TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 65.8 bits (160), Expect = 2e-15
Identities = 27/116 (23%), Positives = 51/116 (43%), Gaps = 7/116 (6%)

Query: 1 MARASDAKTQTRIQREKKQIILEAALEVFSAHGYKGATLEQIANQSNLSKPNLLYYFDSK 60
MAR KT+ Q ++ I L+ AL +FS G +L +IA + +++ + ++F K
Sbjct: 1 MAR----KTKQEAQETRQHI-LDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDK 55

Query: 61 EGIFQRLIEELLVSWLDPLRKFSA--TGNPVEEICAYVQRKLEMARDYPRESRLFA 114
+F + E + + ++ A G+P+ + + LE R L
Sbjct: 56 SDLFSEIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLME 111


9 | Phong_00071 | Phong_00079 | N | Y | N | Pathogenicity Island (unbiased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
Phong_00071 | 1 | 20 | 2.195555 | putative lipoprotein YiaD precursor
Phong_00072 | 1 | 27 | 3.338407 | Bacterioferritin
Phong_00073 | 1 | 30 | 3.601809 | ATP12 chaperone protein
Phong_00074 | -1 | 25 | 1.978268 | 5'-nucleotidase
Phong_00075 | -2 | 23 | 0.540025 | Ribosomal large subunit pseudouridine synthase
Phong_00076 | -3 | 22 | 0.106359 | Putative fluoride ion transporter CrcB
Phong_00077 | -3 | 20 | -0.287968 | Replication-associated recombination protein A
Phong_00078 | -1 | 21 | -1.222715 | putative periplasmic serine endoprotease
Phong_00079 | 0 | 20 | -1.924834 | Methyl-accepting chemotaxis protein IV
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
Phong_00071 | OMPADOMAIN | 103 | 2e-28 | OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 103 bits (257), Expect = 2e-28
Identities = 40/125 (32%), Positives = 61/125 (48%), Gaps = 11/125 (8%)

Query: 107 KLNMPNDVTFATNQTALSPRAMQVLTSVAVVAKEY--TQTRLNVYGHTDNVGSAQYNMQL 164
+ +DV F N+ L P L + + V G+TD +GS YN L
Sbjct: 214 HFTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGL 273

Query: 165 SQQRAQIVGSYLINQGIPAQRIAAFGYGLSQPIASNASEAGR---------AQNRRVEVV 215
S++RAQ V YLI++GIPA +I+A G G S P+ N + + A +RRVE+
Sbjct: 274 SERRAQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIE 333

Query: 216 LSPLQ 220
+ ++
Sbjct: 334 VKGIK 338


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
Phong_00072 | HELNAPAPROT | 35 | 3e-05 | Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 35.2 bits (81), Expect = 3e-05
Identities = 23/109 (21%), Positives = 44/109 (40%), Gaps = 17/109 (15%)

Query: 43 KELEEAVEE-QEHADQLAERVLFLGGVPDMR-PEF-----------VPSMKDSVKAILEA 89
++ EE + E D +AER+L +GG P E+ S + V+A++
Sbjct: 48 EKFEELYDHAAETVDTIAERLLAIGGQPVATVKEYTEHASITDGGNETSASEMVQALVND 107

Query: 90 DLAGEKDAREAYRVSREICDAEGDYVSMQLFDQLLADEEGHIDFLETQL 138
+++ ++ E D + LF L+ + E + L + L
Sbjct: 108 YKQISSESKFVIGLAEE----NQDNATADLFVGLIEEVEKQVWMLSSYL 152


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
Phong_00077 | HTHFIS | 34 | 0.001 | FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 34.0 bits (78), Expect = 0.001
Identities = 51/233 (21%), Positives = 76/233 (32%), Gaps = 58/233 (24%)

Query: 17 RPLADRLRPDELAAVVGQGHLLDEDGVLARMLASGTLGSLIFWGPPGTGKTTVARLLADH 76
RP + +VG+ + E + L L +I G GTGK VAR L D+
Sbjct: 125 RPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMIT-GESGTGKELVARALHDY 183

Query: 77 T----------------------ELQFEQVSAIFSG-VADLKKVFERARMTRLTGRGTLL 113
EL F F+G FE+A GT L
Sbjct: 184 GKRRNGPFVAINMAAIPRDLIESEL-FGHEKGAFTGAQTRSTGRFEQA------EGGT-L 235

Query: 114 FVDEIHRFNRAQLDSFLPVMEDGTITLVG-------------ATTENPSFELN-----AA 155
F+DEI L V++ G T VG AT ++ +N
Sbjct: 236 FLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFRED 295

Query: 156 LLSRAHVLVF------ERLDEAAL--DALLCRAEEETGAALPLDGDARRALIQ 200
L R +V+ +R ++ + +AE+E D +A +
Sbjct: 296 LYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKEGLDVKRFDQEALELMKA 348


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
Phong_00078 | V8PROTEASE | 72 | 4e-16 | V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 72.3 bits (177), Expect = 4e-16
Identities = 33/191 (17%), Positives = 58/191 (30%), Gaps = 41/191 (21%)

Query: 90 SPFGAPRTRERVKSSLGSGVIVSKDGTIITNHHVIEGAQDVRVVLSDKRE---------- 139
+ SGV+V KD T++TN HV++ L
Sbjct: 95 VEAPTGTF-------IASGVVVGKD-TLLTNKHVVDATHGDPHALKAFPSAINQDNYPNG 146

Query: 140 --FEAKVLLKDERTDLAILKI----DGE--GADLPMVEFADSDGLEVGDLVLAIGNPFGV 191
++ DLAI+K + G + +++ +V + G P
Sbjct: 147 GFTAEQITKYSGEGDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGD- 205

Query: 192 GQTVTQGIVSAVARTQVGVSDYQFFIQTDAAINPGNSGGALVDVDGKLVGINTAIFSRSG 251
T G +Q D + GNSG + + +++GI+
Sbjct: 206 KPVATMWESKGKITYLKGE-----AMQYDLSTTGGNSGSPVFNEKNEVIGIHWG------ 254

Query: 252 GSNGIGFAIPG 262
G+ G
Sbjct: 255 ---GVPNEFNG 262


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
Phong_00079 | CHANLCOLICIN | 32 | 0.009 | Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 32.0 bits (72), Expect = 0.009
Identities = 37/227 (16%), Positives = 80/227 (35%), Gaps = 12/227 (5%)

Query: 424 LQDERELRKERAREQEERNQRAELMEELNAEFDQKVRGILQEVTEAGDTLNTTAKSMARM 483
Q+ + RKE RE+ E ++ +L E + + V A L+ + +M
Sbjct: 150 FQEAEQRRKEIEREKAETERQLKLAEAEEKRLA-ALSEEAKAVEIAQKKLSAAQSEVVKM 208

Query: 484 SEQTNSEAQTVSSASEQSLMNLQAVAGATEELSATSG---EIDREIRQSADI------SR 534
+ + +SS+ ++ +AG EL+ S E+D +++ + +R
Sbjct: 209 DGEIKTLNSRLSSSIHARDAEMKTLAGKRNELAQASAKYKELDELVKKLSPRANDPLQNR 268

Query: 535 EAAETASSASTTIHGLQQAAVRIGEVVSLINDIAEQTNLL--ALNATIEAARAGEAGRGF 592
E ++ ++ + IN I + A++ AG A
Sbjct: 269 PFFEATRRRVGAGKIREEKQKQVTASETRINRINADITQIQKAISQVSNNRNAGIARVHE 328

Query: 593 AVVASEVKELANQTGSATGEISAQIQAVQNETQQAVAAIDEIAKVIA 639
A + + + A + Q T++ ++A+ +A
Sbjct: 329 AEENLKKAQNNLLNSQIKDAVDATVSFYQTLTEKYGEKYSKMAQELA 375


10 | Phong_00108 | Phong_00114 | N | Y | N | Pathogenicity Island (unbiased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
Phong_00108 | 7 | 39 | -1.545390 | Elongation factor Tu
Phong_00109 | 6 | 35 | -1.639134 | Elongation factor G
Phong_00110 | 6 | 34 | -1.518604 | 30S ribosomal protein S7
Phong_00111 | 6 | 33 | -1.422355 | 30S ribosomal protein S12
Phong_00112 | 6 | 33 | -1.443274 | hypothetical protein
Phong_00113 | 5 | 34 | -1.439788 | DNA-directed RNA polymerase subunit beta'
Phong_00114 | 5 | 34 | -2.199925 | DNA-directed RNA polymerase subunit beta
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
Phong_00108 | TCRTETOQM | 86 | 2e-20 | Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 86.1 bits (213), Expect = 2e-20
Identities = 52/150 (34%), Positives = 78/150 (52%), Gaps = 10/150 (6%)

Query: 13 VNIGTIGHVDHGKTTLTAAI------TKSFGDFKAYDEI-DGAPEERARGITISTAHVEY 65
+NIG + HVD GKTTLT ++ G D ER RGITI T +
Sbjct: 4 INIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSF 63

Query: 66 ETEARHYAHVDCPGHADYVKNMITGAAQMDGAILVVNAADGPMPQTREHILLARQVGVPA 125
+ E +D PGH D++ + + +DGAIL+++A DG QTR R++G+P
Sbjct: 64 QWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIPT 123

Query: 126 LVVFMNKVDQVDDEELLELVEMEVRELLSS 155
+ F+NK+DQ + L V +++E LS+
Sbjct: 124 -IFFINKIDQNGID--LSTVYQDIKEKLSA 150


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
Phong_00109 | TCRTETOQM | 604 | 0.0 | Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 604 bits (1559), Expect = 0.0
Identities = 173/669 (25%), Positives = 298/669 (44%), Gaps = 66/669 (9%)

Query: 11 RNFGIMAHIDAGKTTTTERILYYTGKSHKIGEVHDGAATMDWMEQEQERGITITSAATTA 70
N G++AH+DAGKTT TE +LY +G ++G V G D E++RGITI + T+
Sbjct: 4 INIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSF 63

Query: 71 FWNDKRLNIIDTPGHVDFTIEVERSLRVLDGAVALLDANAGVEPQTETVWRQADKYNVPR 130
W + ++NIIDTPGH+DF EV RSL VLDGA+ L+ A GV+ QT ++ K +P
Sbjct: 64 QWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIPT 123

Query: 131 MIFVNKMDKLGADFYRCVEMIEKRLGANPLCLQLPIGAESEFQGVVDLLKMKALVWKSEN 190
+ F+NK+D+ G D + I+++L A + Q V+L
Sbjct: 124 IFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQ-----------KVELYP---------- 162

Query: 191 LGASWDELDIPADLADQAAEYRDKLIETAVEVDEEAMEAYLEGEEPSFEKLQALIRKGTI 250
+ ++Q +T +E +++ +E Y+ G+ +L+
Sbjct: 163 -----NMCVTNFTESEQ--------WDTVIEGNDDLLEKYMSGKSLEALELEQEESIRFH 209

Query: 251 AGDFVPVMCGTAFKNKGVQPLLDAVVDYLPSPVEVPAIKGIDAKTEEETERKSSDEEPLG 310
PV G+A N G+ L++ + + S + L
Sbjct: 210 NCSLFPVYHGSAKNNIGIDNLIEVITNKFYSSTH-------------------RGQSELC 250

Query: 311 MLAFKIMNDPFVGSLTFARIYSGVLTKGSSVMNTVKGKRERVGRMMQMHSNSREDIDEAY 370
FKI L + R+YSGVL SV + K K ++ M + ID+AY
Sbjct: 251 GKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEK-IKITEMYTSINGELCKIDKAY 309

Query: 371 AGDIVAIAG----LKDTTTGDTLCDPLKPVILERMEFPDPVIEIAVEPKTKGDQEKMGLA 426
+G+IV + L GDT P + ER+E P P+++ VEP +E + A
Sbjct: 310 SGEIVILQNEFLKLNSVL-GDTKLLPQR----ERIENPLPLLQTTVEPSKPQQREMLLDA 364

Query: 427 LNRLAAEDPSFRVKTDEESGQTIIAGMGELHLDILVDRMKREFKVEANIGAPQVAYRETI 486
L ++ DP R D + + I++ +G++ +++ ++ ++ VE I P V Y E
Sbjct: 365 LLEISDSDPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEKYHVEIEIKEPTVIYMERP 424

Query: 487 TAPADIDYTHKKQSGGTGQFARIKLSIEPNEIGAGFEFKDTIVGGNVPKEYIPGVSKGIE 546
A+ YT + +A I LS+ P +G+G +++ ++ G + + + V +GI
Sbjct: 425 LKKAE--YTIHIEVPPNPFWASIGLSVSPLPLGSGMQYESSVSLGYLNQSFQNAVMEGIR 482

Query: 547 SVMSSGPLAGFPMVDIKATLTDGAYHDVDSSVLAFEIAGRAGFREAIAQAKPKLLEPIMK 606
G L G+ + D K G Y+ S+ F + + + +A +LLEP +
Sbjct: 483 YGCEQG-LYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQVLKKAGTELLEPYLS 541

Query: 607 VEVVTPEDYMGDVIGDLNSRRGQISGTENRGVVTVITAMVPLANMFGYVNNLRSMSQGRA 666
++ P++Y+ D I T+ + +++ +P + Y ++L + GR+
Sbjct: 542 FKIYAPQEYLSRAYTDAPKYCANIVDTQLKNNEVILSGEIPARCIQEYRSDLTFFTNGRS 601

Query: 667 QYSMVFDHY 675
Y
Sbjct: 602 VCLTELKGY 610


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
Phong_00111 | CHLAMIDIAOM6 | 27 | 0.025 | Chlamydia cysteine-rich outer membrane protein 6 si...
		>CHLAMIDIAOM6#Chlamydia cysteine-rich outer membrane protein 6 signature.

Length = 547

Score = 27.0 bits (59), Expect = 0.025
Identities = 9/26 (34%), Positives = 14/26 (53%)

Query: 16 PKRNKVPAMEACPQKRGVCTRVYTTT 41
P ++ +E CP KRG T + T +
Sbjct: 272 PGEHRTITVEFCPLKRGRATNIATVS 297


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
Phong_00114 | PF05272 | 33 | 0.013 | Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.7 bits (74), Expect = 0.013
Identities = 24/133 (18%), Positives = 40/133 (30%), Gaps = 19/133 (14%)

Query: 298 EDLFTQYLAEDMVNLQTGEIYYEAGDEISEKMVDELVDRGFGEIPVLDIDHVNTGAYIRN 357
+ Q AE + GE Y+ + ++ E R +
Sbjct: 723 QKFRGQLFAEALHLYLAGERYFPSPEDEEIYFRPEQELRLVETG---------VQGRLWA 773

Query: 358 TLAADKNETREGALFDIYRVMRPGEPPTIETAEAMFQSLFFDTERYDLSAVGRVKMNMRL 417
L + EGA Y V T T + Q+L D + G+V+ +
Sbjct: 774 LLTREGAPAAEGAAQKGYSVNT-----TFVTIADLVQALGADPGKSSPMLEGQVRDWLN- 827

Query: 418 DLDCEDTVRVLRK 430
E+ LR+
Sbjct: 828 ----ENGWEYLRE 836


11 | Phong_00271 | Phong_00278 | N | Y | N | Pathogenicity Island (unbiased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
Phong_00271 | -1 | 11 | 0.734693 | putative ABC transporter-binding protein
Phong_00272 | 0 | 13 | 0.689635 | Maltose/maltodextrin import ATP-binding protein
Phong_00273 | 0 | 16 | 0.900099 | D-threonine aldolase
Phong_00274 | 1 | 16 | 1.710421 | hypothetical protein
Phong_00275 | 2 | 24 | 2.732615 | hypothetical protein
Phong_00276 | 3 | 24 | 2.977291 | Enamine/imine deaminase
Phong_00277 | 2 | 25 | 3.138861 | HTH-type transcriptional regulator RpiR
Phong_00278 | 2 | 25 | 3.412150 | 2,5-dichloro-2,5-cyclohexadiene-1,4-diol
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
Phong_00271 | MALTOSEBP | 34 | 7e-04 | Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 34.3 bits (78), Expect = 7e-04
Identities = 76/337 (22%), Positives = 125/337 (37%), Gaps = 58/337 (17%)

Query: 7 LGTAVLAGLSTAALAEDTRIHMINCGDNNGDGIKLQDQLIADWEKANPGFKVDVEFVPWG 66
L T + + + A + E + IN GD +G+ ++ +EK + G KV VE
Sbjct: 15 LTTMMFSASALAKIEEGKLVIWIN-GDKGYNGLA---EVGKKFEK-DTGIKVTVEHP--D 67

Query: 67 QCQEKSTTLASAGSPPAVAYMGSRTLKQLAANDLIIPVSMSDAEKATYAAPILATVTANG 126
+ +EK +A+ G P + + A + L+ ++ A + V NG
Sbjct: 68 KLEEKFPQVAATGDGPDIIFWAHDRFGGYAQSGLLAEITPDKAFQDKLYPFTWDAVRYNG 127

Query: 127 EQWGLPRAFSTKAYYWSKDLFKAAGLDPETPPKTWDEMYEAAKAIKEKTDADGVGLVAAS 186
+ P A + ++KDL PPKTW+E+ K +K K G A
Sbjct: 128 KLIAYPIAVEALSLIYNKDLLP-------NPPKTWEEIPALDKELKAK------GKSALM 174

Query: 187 FDNTMHQFMNYVYTNGGEVINADGEIVFNSPNNVEAMAFYGKLATVSQPG---------- 236
F N + + +I ADG F N + G ++ G
Sbjct: 175 F-NLQEPYFTW------PLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKN 227

Query: 237 ---PIAYDRGKLYPLFNEGKIGMFISGPWERKKLTG---DWGVAPIPVGPKGTPGTLLIT 290
D FN+G+ M I+GPW + ++GV +P KG P +
Sbjct: 228 KHMNADTDYSIAEAAFNKGETAMTINGPWAWSNIDTSKVNYGVTVLPTF-KGQPSKPFV- 285

Query: 291 DSLAVFKGSGVEAQALDLAKLLTNPENQMAFELAEGY 327
GV + ++ A +P ++A E E Y
Sbjct: 286 ---------GVLSAGINAA----SPNKELAKEFLENY 309


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
Phong_00272 | PF05272 | 35 | 5e-04 | Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 35.0 bits (80), Expect = 5e-04
Identities = 12/31 (38%), Positives = 17/31 (54%)

Query: 32 IVFVGPSGCGKSTLLRMIAGLEEITKGDIDI 62
+V G G GKSTL+ + GL+ + DI
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDI 629


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
Phong_00273 | ALARACEMASE | 38 | 5e-05 | Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 37.8 bits (88), Expect = 5e-05
Identities = 30/170 (17%), Positives = 59/170 (34%), Gaps = 28/170 (16%)

Query: 14 AVIDAEVMENNITRVQSYMDRIGRRFRPHIKT----------HKIPAVAAKQIAAGAVGI 63
A +D + ++ N+ I R+ H + H I + + A +
Sbjct: 7 ASLDLQALKQNL--------SIVRQAATHARVWSVVKANAYGHGIERIWSAIGATDGFAL 58

Query: 64 NCQKVSEAQVFADAGF-DDILLTYNVLGAEALADLVRLNAQISGLSVTCDNAVVLAGLQQ 122
+ EA + G+ IL+ A+ L + S A+ A
Sbjct: 59 LN--LEEAITLRERGWKGPILMLEGFFHAQDLEIYDQHRLTTCVHSNWQLKALQNA---- 112

Query: 123 TFEQEASLCVLVECDTGGARAGVQSPQEARELVQQIQNAPGLRFGGLMTY 172
+A L + ++ ++G R G Q P + QQ++ + LM++
Sbjct: 113 --RLKAPLDIYLKVNSGMNRLGFQ-PDRVLTVWQQLRAMANVGEMTLMSH 159


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
Phong_00278 | DHBDHDRGNASE | 97 | 2e-26 | 2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 97.4 bits (242), Expect = 2e-26
Identities = 71/251 (28%), Positives = 115/251 (45%), Gaps = 12/251 (4%)

Query: 9 IAIITGAAGDIGRAIAAALTSD-YHLALVDIDADALEGAGKTLATQGAKLSLHRCDLTCP 67
IA ITGAA IG A+A L S H+A VD + + LE +L + D+
Sbjct: 10 IAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDS 69

Query: 68 ADLTALKEQL-APLGRVSLLVNNAGAAAALSLQELTAEQLRQDLALNLEAATNIFKTFED 126
A + + ++ +G + +LVN AG + L+ E+ ++N N ++
Sbjct: 70 AAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSK 129

Query: 127 DLK--QGGNLINIAS-VNGMAVFGHPAYSMAKAGLIHFTRLVATEYGKYGLRANTVAPGT 183
+ + G+++ + S G+ AY+ +KA + FT+ + E +Y +R N V+PG+
Sbjct: 130 YMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPGS 189

Query: 184 VRT----QAWNARAAANPKVFEEAAYW---YPLGRVAEPEDVANAVAFLASPAANAITGV 236
T W A + + PL ++A+P D+A+AV FL S A IT
Sbjct: 190 TETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHITMH 249

Query: 237 CLPVDCGLTSG 247
L VD G T G
Sbjct: 250 NLCVDGGATLG 260


12 | Phong_00282 | Phong_00316 | N | Y | N | Pathogenicity Island (unbiased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
Phong_00282 | -1 | 17 | -0.295301 | Pseudouridine-5'-phosphate glycosidase
Phong_00283 | 0 | 21 | -0.917040 | hypothetical protein
Phong_00284 | 0 | 22 | -1.115777 | hypothetical protein
Phong_00285 | 0 | 23 | -1.007948 | Blue-light-activated protein
Phong_00286 | -1 | 28 | -1.607845 | Flagellar biosynthetic protein FlhB
Phong_00287 | 2 | 18 | -0.980549 | flagellar biosynthesis protein FliR
Phong_00288 | 2 | 15 | -0.157810 | Flagellar biosynthetic protein FliQ
Phong_00289 | 1 | 16 | 1.160734 | Flagellar hook-basal body complex protein FliE
Phong_00290 | 1 | 16 | 0.890919 | Flagellar basal-body rod protein FlgC
Phong_00291 | 1 | 17 | 1.231978 | Flagellar basal body rod protein FlgB
Phong_00292 | -1 | 16 | 1.283461 | hypothetical protein
Phong_00293 | -1 | 19 | 1.202851 | Flagellar biosynthetic protein FliP precursor
Phong_00294 | -2 | 20 | 1.168271 | hypothetical protein
Phong_00295 | -2 | 12 | 0.784537 | hypothetical protein
Phong_00296 | -2 | 12 | 1.595038 | hypothetical protein
Phong_00297 | -2 | 17 | 1.565264 | Flagellar motor switch protein FliM
Phong_00298 | -1 | 20 | 1.280267 | Flagellar FliL protein
Phong_00299 | 0 | 19 | 1.366785 | Flagellar basal-body rod protein FlgG
Phong_00300 | -2 | 19 | 1.720262 | Flagellar basal-body rod protein FlgG
Phong_00301 | -2 | 18 | 1.233311 | flagellar basal body P-ring biosynthesis protein
Phong_00302 | -1 | 18 | 0.965574 | Flagellar L-ring protein precursor
Phong_00303 | 0 | 18 | 0.258786 | RNA polymerase-binding transcription factor
Phong_00304 | 0 | 18 | -0.828976 | Flagellar assembly protein FliX
Phong_00305 | 2 | 20 | -1.952851 | Flagellar P-ring protein precursor
Phong_00306 | 2 | 21 | -3.279398 | chemotactic signal-response protein CheL
Phong_00307 | 2 | 19 | -3.160718 | hypothetical protein
Phong_00308 | 2 | 16 | -3.584813 | hypothetical protein
Phong_00309 | 1 | 18 | -3.233743 | Flagellin
Phong_00310 | -1 | 17 | -3.120841 | Flagellin
Phong_00311 | -2 | 14 | -2.453988 | flagellum biosynthesis repressor protein FlbT
Phong_00312 | -2 | 12 | -2.225507 | flagellar biosynthesis regulatory protein FlaF
Phong_00313 | -2 | 15 | -1.564890 | flagellar hook-associated protein FlgL
Phong_00314 | -2 | 13 | -0.917170 | Flagellar hook-associated protein 1
Phong_00315 | -1 | 11 | 0.656072 | Flagellar hook protein FlgE
Phong_00316 | -1 | 19 | 2.467536 | ATP-dependent zinc metalloprotease FtsH 2
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
Phong_00282 | PF06057 | 31 | 0.004 | Type IV secretory pathway VirJ component
		>PF06057#Type IV secretory pathway VirJ component

Length = 243

Score = 31.3 bits (71), Expect = 0.004
Identities = 19/76 (25%), Positives = 33/76 (43%), Gaps = 10/76 (13%)

Query: 146 RTPVCVIAAGAKALLDLPKTM-EVLETRGVPVISYGVDELPAFWSRTSGIKAPTRLDSAA 204
+ P+ + +G L K + +L+ +G PV+ G L +W + K P D
Sbjct: 50 KPPLVIFLSGDGGWATLDKAVGGILQQQGWPVV--GWSSLKYYWKQ----KDPK--DVTQ 101

Query: 205 DIAKLL-KARGEFGGH 219
D ++ K + EFG
Sbjct: 102 DTLAIIDKYQAEFGTQ 117


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
Phong_00285 | HTHFIS | 91 | 2e-21 | FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 91.4 bits (227), Expect = 2e-21
Identities = 41/125 (32%), Positives = 65/125 (52%), Gaps = 12/125 (9%)

Query: 726 ARILLVEDEEAVRAFAARALSSRGYEVFEAGSGTEALEVMEEEGGAMDLVVSDVVMPEMD 785
A IL+ +D+ A+R +ALS GY+V + + G DLVV+DVVMP+ +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDG--DLVVTDVVMPDEN 61

Query: 786 GPSLLVELRKMQPDLKIIFVSGYAE-----EAFEKNLPAEETFNFLPKPFTLKQLATTVK 840
LL ++K +PDL ++ +S +A E + +++LPKPF L +L +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASE-----KGAYDYLPKPFDLTELIGIIG 116

Query: 841 EVLAE 845
LAE
Sbjct: 117 RALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
Phong_00286 | TYPE3IMSPROT | 318 | e-109 | Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 318 bits (816), Expect = e-109
Identities = 103/346 (29%), Positives = 184/346 (53%), Gaps = 5/346 (1%)

Query: 8 SEKTEEPTQKRLDDAHEKGDVPKSQEVSTWFAVLGTAMAMTFLADRTAGSLSGLLENFFD 67
EKTE+PT K++ DA +KG V KS+EV + ++ + + L+D S L+ +
Sbjct: 3 GEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLIPAE 62

Query: 68 KAGQMRLDGPALVLLWRDLGPALLAIIAAP-MAVMLLFAMAGNLIQHKPIWSSERLKPKL 126
Q L + D + P + V L A+A +++Q+ + S E +KP +
Sbjct: 63 ---QSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDI 119

Query: 127 EKVSPLKGLKRLFSQESLANFLKGLVKILVVAGLMGLVLWPQRRVLDTLVFRDLRLLLEE 186
+K++P++G KR+FS +SL FLK ++K+++++ L+ +++ L L + +
Sbjct: 120 KKINPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPL 179

Query: 187 TKELSLQLIGAILAVMTVIAGVDYLWQRQKWFKKQRMTRREIKEEYKQAEGDPQIKAKIR 246
++ QL+ VI+ DY ++ ++ K+ +M++ EIK EYK+ EG P+IK+K R
Sbjct: 180 LGQILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRR 239

Query: 247 ELRVMRSRHRMMSAVPEATVVIANPTHYAVALHYEEG-MGAPKCVAKGVDALALRIRRVA 305
+ M V ++VV+ANPTH A+ + Y+ G P K DA +R++A
Sbjct: 240 QFHQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIA 299

Query: 306 EDAQVPVVENPPLARTLHAAMDLDELVPEEHFEAVAKVIGYVMQLK 351
E+ VP+++ PLAR L+ +D +P E EA A+V+ ++ +
Sbjct: 300 EEEGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQN 345


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
Phong_00287 | TYPE3IMRPROT | 103 | 9e-29 | Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 103 bits (259), Expect = 9e-29
Identities = 60/242 (24%), Positives = 112/242 (46%), Gaps = 6/242 (2%)

Query: 14 VFFLVFARVGTMIMLVPVFGERAIPVRIRLIVALMLCYVLYPLV-QPLYPLGELQQLPVL 72
++F RV +I P+ ER++P R++L +A+M+ + + P + P+ L +
Sbjct: 15 LYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPANDVPVFSFFALWLA 74

Query: 73 LRYLFLELVLGFFVGLGMRLIAYALQTAGAIIANQSGLAFAMGGDMTNFGEQGAMVGSFL 132
++ ++++G +G M+ A++TAG II Q GL+FA D + ++ +
Sbjct: 75 VQ----QILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPAS-HLNMPVLARIM 129

Query: 133 AMLGTTLVLATNLHYVIIAGIHDSFTLFPPGRILPVGDMAHMAVDTVSHVFVIAAHIGAP 192
ML L L N H +I+ + D+F P G + S +F+ + P
Sbjct: 130 DMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIFLNGLMLALP 189

Query: 193 FILFGLVFYFGLGLLNKLMPQLQIFFVAMPVAILAGFTLLMLLLSTLMGWYLQQYEAVLS 252
I L LGLLN++ PQL IF + P+ + G +L+ L+ + + + + +
Sbjct: 190 LITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFCEHLFSEIFN 249

Query: 253 PF 254

Sbjct: 250 LL 251


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
Phong_00288 | TYPE3IMQPROT | 58 | 4e-15 | Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 58.2 bits (141), Expect = 4e-15
Identities = 25/82 (30%), Positives = 49/82 (59%)

Query: 5 EVLDIASSGIWTLIKLSVPVLLVGLVVGVIVALFQALTQIQEQTLVFVPKIISIFLALLV 64
+++ + ++ ++ LS +V ++G++V LFQ +TQ+QEQTL F K++ + L L +
Sbjct: 3 DLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLFL 62

Query: 65 AFPFMGSVLGSFTEQISQLIVS 86
+ G VL S+ Q+ L ++
Sbjct: 63 LSGWYGEVLLSYGRQVIFLALA 84


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
Phong_00289 | FLGHOOKFLIE | 37 | 1e-06 | Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 37.3 bits (86), Expect = 1e-06
Identities = 19/71 (26%), Positives = 38/71 (53%), Gaps = 2/71 (2%)

Query: 38 FDNLVKSAIAEVREGGQATEAQALNLVEGRA--NVVDVVTAVAETEVAMETVVSVRDKVI 95
F + +A+ + + A QA G + DV+T + + V+M+ + VR+K++
Sbjct: 33 FAGQLHAALDRISDTQTAARTQAEKFTLGEPGVALNDVMTDMQKASVSMQMGIQVRNKLV 92

Query: 96 SAYEEIMRMPI 106
+AY+E+M M +
Sbjct: 93 AAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
Phong_00290 | FLGHOOKAP1 | 29 | 0.007 | Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 28.8 bits (64), Expect = 0.007
Identities = 8/40 (20%), Positives = 22/40 (55%)

Query: 94 LPNVSTMIELADMREAQRSYEANLNVVRSSRSMVQRTLDI 133
+ V+ E +++ Q+ Y AN V++++ ++ ++I
Sbjct: 506 ISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
Phong_00292 | IGASERPTASE | 66 | 1e-13 | IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 66.2 bits (161), Expect = 1e-13
Identities = 62/332 (18%), Positives = 101/332 (30%), Gaps = 33/332 (9%)

Query: 91 LVVEQNILRAMPQGQTQRNQTLQQGQQATQAAAITQPQAVAQSNQAVAQPQAAPTTQAAP 150
L +RNQT+ T +V +N+ +A+ AP AP
Sbjct: 971 LRNVNGRYDLYNPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAP 1030

Query: 151 MQQEHALHASAPQPTATQPQQQIAQPAAQPSTPS-ATRQQRASADYQRHLAASRERLRKA 209
A + + A +Q+ + T Q RE ++A
Sbjct: 1031 -----ATPSETTETVAENSKQESKTVEKNEQDATETTAQN-------------REVAKEA 1072

Query: 210 EATVKANGTNSSVANGGVHVTEAQRQAQQVAAARALAASRASVSPPSAGPAARIVAAQQE 269
++ VKAN + VA G E Q + A +A V ++ +
Sbjct: 1073 KSNVKANTQTNEVAQSGSETKETQTTETKETATVE-KEEKAKVETEKTQEVPKVTSQVSP 1131

Query: 270 KQRQDEA----AQSEVVADTAALAKDQTKPVEETVPTLKLASSLEEALSPEVTTSTQSPE 325
KQ Q E A+ D K+ T T + A + VT ST
Sbjct: 1132 KQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNT 1191

Query: 326 AKEPDATP---TNADKKPENAENVHQLDLEAAIQETVAKASNSDENPTQEDKVEAPAAAD 382
P T A +P + + +V ++ E T + A
Sbjct: 1192 GNSVVENPENTTPATTQPTVNSESSNKP-KNRHRRSVRSVPHNVEPATTSSNDRSTVALC 1250

Query: 383 DATAESKGEKSKNPIPAEALDKEMADLLSELK 414
D T+ + N + ++A K L+ K
Sbjct: 1251 DLTS-----TNTNAVLSDARAKAQFVALNVGK 1277



Score = 38.1 bits (88), Expect = 6e-05
Identities = 35/267 (13%), Positives = 81/267 (30%), Gaps = 23/267 (8%)

Query: 105 QTQRNQTLQ-QGQQATQAAAITQPQAVAQSNQAVAQPQAAPTTQAAPMQQEHALHASAPQ 163
Q ++T++ Q AT+ A + A + A Q Q+ +E
Sbjct: 1044 SKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKE--------- 1094

Query: 164 PTATQPQQQIAQPAAQPSTPSATRQQRASADYQRHLAASRERLRKAEATVKANGTNSSVA 223
T T ++ A + T + + ++ +E+ + + N
Sbjct: 1095 -TQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTV 1153

Query: 224 NGGVHVTEAQRQAQQVAAARALAASRASVSPPSAGPAARIVAAQQEKQRQDEAAQSEVVA 283
N ++ A A+ + + E ++ A
Sbjct: 1154 NIKEPQSQTNTTADTEQPAK--------ETSSNVEQPVTESTTVNTGNSVVENPENTTPA 1205

Query: 284 DTAALAKDQTKPVEETVPTLKLASSLEEALSPEVTTSTQSPEAKEPDATPTNADKKPENA 343
T ++ + + S +++ +S A D T TN + +A
Sbjct: 1206 TTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVA-LCDLTSTNTNAVLSDA 1264

Query: 344 ENVHQ---LDLEAAIQETVAKASNSDE 367
Q L++ A+ + +++ ++E
Sbjct: 1265 RAKAQFVALNVGKAVSQHISQLEMNNE 1291



Score = 35.4 bits (81), Expect = 5e-04
Identities = 43/258 (16%), Positives = 78/258 (30%), Gaps = 30/258 (11%)

Query: 176 PAAQPSTPSATRQQRASADYQRHLAASRERLRKAEATVKANGTNSSVANGGVHVTEAQRQ 235
+TP+ + S R+ +A A T S V E +Q
Sbjct: 993 DTTNITTPNNIQADVPSVPSNN---EEIARVDEAPVPPPAPATPSETTE---TVAENSKQ 1046

Query: 236 AQQVAAARALAASRASVSPPSAGPAARIVAAQQEKQRQDEAAQSEVVADTAALAKDQTKP 295
+ A+ + R VA + + + +EV + + QT
Sbjct: 1047 ESKTVEKNEQDATETT-------AQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTE 1099

Query: 296 VEETVPTLKLASSLEEA--------LSPEVTTSTQSPEAKEPDATPTNADKKPENAENVH 347
+ET K + E ++ +V+ + E +P A P + N +
Sbjct: 1100 TKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQ 1159

Query: 348 QLDLEAAIQETVAKASNSDENP---------TQEDKVEAPAAADDATAESKGEKSKNPIP 398
A E AK ++S+ T VE P AT + + P
Sbjct: 1160 SQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKP 1219

Query: 399 AEALDKEMADLLSELKPA 416
+ + + ++PA
Sbjct: 1220 KNRHRRSVRSVPHNVEPA 1237


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
Phong_00293 | FLGBIOSNFLIP | 272 | 3e-94 | Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 272 bits (696), Expect = 3e-94
Identities = 109/237 (45%), Positives = 156/237 (65%)

Query: 36 IALLLCAFAALWAAPALAQSISINFGEGDTVTERALQIVALLTVLSLAPSILIMVTSFTR 95
+A +L A L S G +Q + +T L+ P+IL+M+TSFTR
Sbjct: 7 VAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLMMTSFTR 66

Query: 96 IVVVLSLLRTAIGLQTAPPNAVMIALALFLTAFVMEPTFTKAYDEAVRPLMDGTIEVEQA 155
I++V LLR A+G +APPN V++ LALFLT F+M P K Y +A +P + I +++A
Sbjct: 67 IIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEKISMQEA 126

Query: 156 AERGVEPFHEFMRSQARVEDVTMFQSLSRKEPPATPEELSLRVLVPAFMISELRRAFEIG 215
E+G +P EFM Q R D+ +F L+ P PE + +R+L+PA++ SEL+ AF+IG
Sbjct: 127 LEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELKTAFQIG 186

Query: 216 FLLYLPFLIIDLVVASVLMSMGMMMLPPVVISLPFKLIFFVLVDGWQMIAGSLVRSF 272
F +++PFLIIDLV+ASVLM++GMMM+PP I+LPFKL+ FVLVDGWQ++ GSL +SF
Sbjct: 187 FTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLAQSF 243


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
Phong_00294 | TONBPROTEIN | 33 | 0.006 | Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 33.0 bits (75), Expect = 0.006
Identities = 30/150 (20%), Positives = 51/150 (34%), Gaps = 14/150 (9%)

Query: 387 APVHAIGGLPVQARPVKKTAVPAAL----QVAKAPPMSLLPEEVSSQRTPASSSPEPVLV 442
VH + LP A+P+ T V A Q + PP ++ E + P PV++
Sbjct: 30 TSVHQVIELPAPAQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVI 89

Query: 443 SQKAAEPRPSVRPTLSRQQGPAPEPNNTTRILREPQLQIQAGRATANDSVLAPNERRSGP 502
+ +P+P +P Q+ P + P R T++ + A
Sbjct: 90 EKPKPKPKPKPKPVKKVQEQPKRDVKPVESRPASPFENTAPARLTSSTATAAT------- 142

Query: 503 DEAGAGDVPALPSERIVKRDQSRLPTDEQM 532
R + R+Q + P Q
Sbjct: 143 ---SKPVTSVASGPRALSRNQPQYPARAQA 169


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00296  IGASERPTASE   29     0.011    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 28.9 bits (64), Expect = 0.011
Identities = 25/128 (19%), Positives = 42/128 (32%), Gaps = 15/128 (11%)

Query: 33 RADEASLRATIAELVTATEIAERAILGLKSTAGNADRTLG--ARLGEAEEVSRKLSNQLQ 90
+ + A T E+A+ S T E EE ++ + + Q
Sbjct: 1067 EVAKEAKSNVKANTQTN-EVAQ-----SGSETKETQTTETKETATVEKEEKAKVETEKTQ 1120

Query: 91 GGEDVLERIGSVAASAQAVAPQAGVSQPAPTQAPEPAAVQPVDEMPARKKVTGDLCRAAN 150
+ V + Q+ QP A E + E ++ T D + A
Sbjct: 1121 -------EVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAK 1173

Query: 151 ETASRLEQ 158
ET+S +EQ
Sbjct: 1174 ETSSNVEQ 1181


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00297  FLGMOTORFLIM  279    6e-94    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 279 bits (716), Expect = 6e-94
Identities = 89/324 (27%), Positives = 155/324 (47%), Gaps = 15/324 (4%)

Query: 78 VLNQEEIDNLLGFNLE-DSLDGDDSGIRALINSAMVSY--------ERLPMLEIVFDRLV 128
VL+Q+EID LL D+ D I + + E++ L ++ +
Sbjct: 4 VLSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHETFA 63

Query: 129 RLTTTSLRNFTSDNVEVSLDSINSVRFGDYLNSIPLPAILGVFKAHEWDGFGLVTVESSL 188
RLTTTSL V V + S++ + + +++ SIP P+ L V G ++ V+ S+
Sbjct: 64 RLTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEVDPSI 123

Query: 189 IYSIIDVLLGGGRGSSAVRVEGRPYTTIETTLVRRMIDIILADAEQAFAPLSPVRLPLER 248
+SIID L GG ++ V+ R T IE +++ +I ILA+ +++ + +R L +
Sbjct: 124 TFSIIDRLFGGTGQAAKVQ---RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRLGQ 180

Query: 249 LETNPRFAAISRPANAAILVELRIDMDDRGGKVEILLPYATLEPIRELLLQMFMGEKLGR 308
+ETNP+FA I P+ +LV L + + G + +PY T+EPI L F + R
Sbjct: 181 IETNPQFAQIVPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSVRR 240

Query: 309 DAK--WEEHLATEVYGSEVQVDAVLYENEISLRRVLGFEVGQTLVF-DTEPKDPVQIKCG 365
+ + L ++ ++ V A + +S+R +LG VG + DT DP + G
Sbjct: 241 SSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPFVLSIG 300

Query: 366 GVPLTEGVMGRLGNNISIEISKQL 389
G +G I+ +I +++
Sbjct: 301 NRKKFLCQPGVVGKKIAAQILERI 324


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00299  FLGHOOKAP1    34     6e-04    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 33.8 bits (77), Expect = 6e-04
Identities = 8/23 (34%), Positives = 14/23 (60%)

Query: 16 RRQLDNVANNIANINTTGFKRQR 38
+ L+ +NNI++ N G+ RQ
Sbjct: 15 QAALNTASNNISSYNVAGYTRQT 37


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00300  FLGHOOKAP1    40     6e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 39.9 bits (93), Expect = 6e-06
Identities = 10/43 (23%), Positives = 22/43 (51%)

Query: 218 LEASNVDAVTELSDLIAAQRAYEMNSKIVKAADEMYATSNNMR 260
S V+ E +L Q+ Y N+++++ A+ ++ N+R
Sbjct: 504 QSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546



Score = 35.7 bits (82), Expect = 2e-04
Identities = 10/34 (29%), Positives = 18/34 (52%)

Query: 4 LYIAATGMKAQELNVEVISNNVANMRTTGYKRQR 37
+ A +G+ A + + SNN+++ GY RQ
Sbjct: 4 INNAMSGLNAAQAALNTASNNISSYNVAGYTRQT 37


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00302  FLGLRINGFLGH  163    2e-52    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 163 bits (414), Expect = 2e-52
Identities = 56/213 (26%), Positives = 97/213 (45%), Gaps = 7/213 (3%)

Query: 33 PLTAIKDPTTTPGYRPVQMPMPPKETQQSKANSLWRSGNRGFFKDQRAKRVGDIVTVTVT 92
P T + T+ +PV P P ++ G + F+D+R + +GD +T+ +
Sbjct: 26 PSTPLVQGATSA--QPVPGPTPVANGSIFQSAQPINYGYQPLFEDRRPRNIGDTLTIVLQ 83

Query: 93 IADKAAFNNTSKQSRADASNVGASGALGAAINTVALPASSNAGALAGIDSTKTYTGTGSI 152
A+ ++++ SR +N G + + + A ++ A T+ G G
Sbjct: 84 ENVSASKSSSANASRDGKTNFGF-DTVPRYLQGLFGNARADVEA----SGGNTFNGKGGA 138

Query: 153 DRKETLETTVAAVVTQVLPNGNLVIEGRQEVRVNYEVRELIVAGVVRPQDISAKNTVDSK 212
+ T T+ V QVL NGNL + G +++ +N + +GVV P+ IS NTV S
Sbjct: 139 NASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRFSGVVNPRTISGSNTVPST 198

Query: 213 KIAEARIGYGGRGQITAVQQPRYGSQVLDIVLP 245
++A+ARI Y G G I Q + + + P
Sbjct: 199 QVADARIEYVGNGYINEAQNMGWLQRFFLNLSP 231


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00305  FLGPRINGFLGI  383    e-135    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 383 bits (986), Expect = e-135
Identities = 187/369 (50%), Positives = 252/369 (68%), Gaps = 9/369 (2%)

Query: 10 RTFKALLLLLLLPYLIHPANSA--SRIKDIADFEGIRENQLIGYGLVVGLNGSGDGLNNA 67
R A L+ LP+L P A SRIKDIA + R+NQLIGYGLVVGL G+GD L ++
Sbjct: 5 RIIAAALVFSALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSS 64

Query: 68 PFTRQSLQAMLERLGVNTNELDLNTKNAAAVMVTANLPPFSTQGSRIDVAVSALGDASSL 127
PFT QS++AML+ LG+ T N KN AAVMVTANLPPF++ GSR+DV VS+LGDA+SL
Sbjct: 65 PFTEQSMRAMLQNLGITTQGGQSNAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSL 124

Query: 128 QGGTLLVTPLIGANGETYAIAQGPVTVAGFEASGDAASITRGVPTSGRVANGGLVEKEVD 187
+GG L++T L GA+G+ YA+AQG + V GF A GDAA++T+GV TS RV NG ++E+E+
Sbjct: 125 RGGNLIMTSLSGADGQIYAVAQGALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELP 184

Query: 188 FKLASLDTLRIALRNPDLTTTRRVALAINEL----IGLPTAEPLDSATVRISLPRLYDGN 243
K L + LRNPD +T RVA +N G P AEP DS + + PR+ D
Sbjct: 185 SKFKDSVNLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKPRVAD-- 242

Query: 244 IVDLLTDIEQLVIEPDMPARIVIDESSGIIVMGKEVKVSTVAVAQGNLTVTIAEAPQVSQ 303
+ L+ +IE L +E D PA++VI+E +G IV+G +V++S VAV+ G LTV + E+PQV Q
Sbjct: 243 LTRLMAEIENLTVETDTPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQ 302

Query: 304 PDPFSLGETTQVPRTNVSVDEDNSHLAIVSEAVTLQQLVDGLNALGISPRDLIAILQAIK 363
P PFS G+T P+T++ ++ S +AI E L+ LV GLN++G+ +IAILQ IK
Sbjct: 303 PAPFSRGQTAVQPQTDIMAMQEGSKVAI-VEGPDLRTLVAGLNSIGLKADGIIAILQGIK 361

Query: 364 AAGALQAEI 372
+AGALQAE+
Sbjct: 362 SAGALQAEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00306  FLGFLGJ       41     2e-07    Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 40.9 bits (95), Expect = 2e-07
Identities = 21/79 (26%), Positives = 39/79 (49%), Gaps = 1/79 (1%)

Query: 22 PTNNLDKKAKEFEAVYLNQLMQSMFSGLQEGGTYGSGPGSDAWRSMLLGEYANQLAQSGG 81
P N+ A++ E +++ +++SM L + G + S + + SM + A Q+ G
Sbjct: 29 PAANIRPVARQVEGMFVQMMLKSMRDALPKDGLFSS-EHTRLYTSMYDQQIAQQMTAGKG 87

Query: 82 IGLAETIKAQMLEIQEANQ 100
+GLAE + QM Q +
Sbjct: 88 LGLAEMMVKQMTPEQPLPE 106


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00309  FLAGELLIN     56     9e-11    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 56.2 bits (135), Expect = 9e-11
Identities = 42/313 (13%), Positives = 83/313 (26%), Gaps = 5/313 (1%)

Query: 14 LSTLQSTATLISSTQERLSTGKKVNSALDNPNSYFTAASLNNRASDLSNLQDDMGQSVST 73
+ L + + +SS ERLS+G ++NSA D+ A + L+ + +S
Sbjct: 14 QNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQASRNANDGISI 73

Query: 74 LTAADKGIKAISKLIDAAKGKANQALQ-TEDVSQRNKYAKEFNDLRTQIQDLAKDSGYKG 132
+ + I+ + + + QA T S E +I ++ + + G
Sbjct: 74 AQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEIDRVSNQTQFNG 133

Query: 133 KNLLGGDGN-DLTVKFNEDGTSKLEIQSVDFTDLGKEDGDLKLGSLEIAKNGDFSATLEG 191
+L D + V N+ T +++Q +D LG + ++ + S
Sbjct: 134 VKVLSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEATVGDLKSSFKNVT 193

Query: 192 TITDGDTKLASLDKLAAGDSIKLTFGEGDDAKTAELEITADSTLNDLTAKMKEVSGSDFK 251
++ + D V
Sbjct: 194 GYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDAENNTAVDLFKTT 253

Query: 252 LDATAKTITGSVGAKLAISYTDNDSGDSADGSDLTAATTIGVKAGAWDTDSGIDSSLTEI 311
T A + T G + I+ +
Sbjct: 254 KSTAG---TAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVSTTINGEKVTL 310

Query: 312 KAAQDALRAQAST 324
A A
Sbjct: 311 TVADITAGAANVD 323



Score = 45.4 bits (107), Expect = 2e-07
Identities = 48/334 (14%), Positives = 93/334 (27%), Gaps = 9/334 (2%)

Query: 68 GQSVSTLTAADKGIKAISKLIDAAKGKANQALQTEDVSQRNKYAKEFNDLRTQIQDLAKD 127
G +T+ K ++ A G + S + ++ A +
Sbjct: 176 GPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVN--SGAVVTDTTAPTVPDKVYVNAAN 233

Query: 128 SGYKGKNLLGGDGNDLTVKFNEDGTSKLEIQSVDFTDLGKEDGDLKLGSLEIAKNGDFSA 187
+ DL + GKE + +
Sbjct: 234 GQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGN 293

Query: 188 TLEGTITDGDTKLASLDKLAAGDSIKLTFGEGDDAKTAELEITADSTLNDLTAKMKEVSG 247
G ++ +A + + + + + K K S
Sbjct: 294 DGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESA 353

Query: 248 SDFKLDATAK-------TITGSVGAKLAISYTDNDSGDSADGSDLTAATTIGVKAGAWDT 300
L+A T+ G+ A +G + + + + A
Sbjct: 354 KLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAAAA 413

Query: 301 DSGIDSSLTEIKAAQDALRAQASTFGTNLSVVKNRKNFTADMINTLEVGAGKLTLADTNR 360
+ L I +A + A S+ G + + + + L ++ AD
Sbjct: 414 KKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYAT 473

Query: 361 EGANLAALQTQQQLATSTLALAARANQSVLQLIR 394
E +N++ Q QQ TS LA A + Q+VL L+R
Sbjct: 474 EVSNMSKAQILQQAGTSVLAQANQVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00310  FLAGELLIN     62     2e-12    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 61.6 bits (149), Expect = 2e-12
Identities = 50/371 (13%), Positives = 104/371 (28%), Gaps = 2/371 (0%)

Query: 12 NNLSTLQSTASMISSTQERLSTGMKVNSALDNPNSYFTAASLNNRASDLSNLQDDMGQSV 71
+ L + S +SS ERLS+G+++NSA D+ A + L+ + +
Sbjct: 12 LTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQASRNANDGI 71

Query: 72 STLTAADKGIKAISKLVDAAKGKANQALQ-SEDKEQRAKYSKEFNDLRTQIQDLAKDSGY 130
S + + I+ + + + QA + E +I ++ + +
Sbjct: 72 SIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEIDRVSNQTQF 131

Query: 131 KGKNLLGGDGN-DLKVKFNEDGTSKLDIKSVDFTDLSAADSDLKLGAVQQLEKGDFTSGA 189
G +L D ++V N+ T +D++ +D L ++ + +
Sbjct: 132 NGVKVLSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEATVGDLKSSFKN 191

Query: 190 MTLGSGTFEAATKLDTLDNYAAGGKITIKYGSGDDAKTKELEITADNTVGDLAAAIKEVT 249
+T A K N A T D + A+
Sbjct: 192 VTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDAENNTAVDLFK 251

Query: 250 GEDFTIDGTAKTLSGKVADDLTFSYEASSGGETGGELSTGDTTTGLVKNGWSEDAGIEAS 309
T G T + + + +
Sbjct: 252 TTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVSTTINGEKVTLT 311

Query: 310 LETLKTAQDSLRAQASTFGTNLSVVQNRKNFTADMINTLEVGAGKLTLADTNKEGANLAA 369
+ + ++ A N+ FT D E A+ +G +
Sbjct: 312 VADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLEANNAVKGESKIT 371

Query: 370 LQTQQQLATST 380
+ + A +
Sbjct: 372 VNGAEYTANAA 382



Score = 45.8 bits (108), Expect = 2e-07
Identities = 44/336 (13%), Positives = 93/336 (27%), Gaps = 11/336 (3%)

Query: 68 GQSVSTLTAADKGIKAISKLVDAAKGKANQ-ALQSEDKEQRAKYSKEFNDLRTQIQDLAK 126
G +T+ K ++ A G + + D +
Sbjct: 176 GPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQ 235

Query: 127 DSGYKGKNLLGGDGNDLKVKFNEDGTSKLDIKSVDFTDLSAADSDLKLGAVQQLEKGDFT 186
+ +N D +K ++ + + G+
Sbjct: 236 LTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDG 295

Query: 187 SGAMTLGSGTFEAATKLDTLDNYAAGGKITIKYGSGDDAKTKELEITADNTVGDLAAAIK 246
+G ++ K+ G + + +K + D +
Sbjct: 296 NGKVSTTIN----GEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNE 351

Query: 247 EVTGEDFTIDGTAKTLSGKVADDLTFSYEASSGGETGGELSTGDTTTGLVKNGWSED--- 303
D + K S + ++ A+ T + T + +
Sbjct: 352 SAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAA 411

Query: 304 ---AGIEASLETLKTAQDSLRAQASTFGTNLSVVQNRKNFTADMINTLEVGAGKLTLADT 360
L ++ +A + A S+ G + + + + L ++ AD
Sbjct: 412 AAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADY 471

Query: 361 NKEGANLAALQTQQQLATSTLALAARANQSVLQLIR 396
E +N++ Q QQ TS LA A + Q+VL L+R
Sbjct: 472 ATEVSNMSKAQILQQAGTSVLAQANQVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00314  FLGHOOKAP1    107    1e-26    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 107 bits (269), Expect = 1e-26
Identities = 129/620 (20%), Positives = 222/620 (35%), Gaps = 85/620 (13%)

Query: 5 SAITNAFAGLKATQSRLAVTSQNIANAGQAGYTRKSMEVAETLGDSGSSSS----VRTSL 60
S I NA +GL A Q+ L S NI++ AGYTR++ +A+ G+ V S
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSG 61

Query: 61 VQRNVDQMLRDTYWQKSSEANFAEKRAVYAGRLDSYFGQLNDKNGLPSLVNNLNGSLIKL 120
VQR D + + ++++ R ++D+ + L + + + SL L
Sbjct: 62 VQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLST--STSSLATQMQDFFTSLQTL 119

Query: 121 VSDPASPSAQANVVSHARTLASELNAGAEYVQALRRETETGITEDVDRVNNLLREVSSLE 180
VS+ P+A+ ++ + L ++ +Y++ ++ I VD++NN ++++SL
Sbjct: 120 VSNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLN 179

Query: 181 KDIVSERAKGISVAALD--DQMDMKLEELSKYLDINVHRDNDGELRISTVSGMTLYDVKP 238
I G + + DQ D + EL++ + + V + G I+ +G +L
Sbjct: 180 DQISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGST 239

Query: 239 A-QLSFVNSPTLGAGVEGNPVEIVTPSGQRSTLQPADFRGGTIGASLKVRDSDLPEAQTR 297
A QL+ V S A V V + + G++G L R DL + +
Sbjct: 240 ARQLAAVPSS---ADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNT 296

Query: 298 LDELASQMALALSREEAKGAAADDGLTPPEKTGLAVNLSGIKSSGDKLELSFVD-ASGAK 356
L +LA A A + + G A+ + V + K
Sbjct: 297 LGQLALAFAEAFNTQHKAGFDAN---------------GDAGEDFFAIGKPAVLQNTKNK 341

Query: 357 RELSFIAVKDPALLPLAGEATAKGGDLVYGLDLSDSAPDLATQIGTALGASFEVSDAGGG 416
+++ A A + D D ++V+
Sbjct: 342 GDVAIGATVTDA-------SAVLATDYKISFD----------------NNQWQVTRLASN 378

Query: 417 AVKILADPTKTSVKLDNAVATNTRTITTQDGLALPMFVDSEHGGIPFTGALEEGGQKTGF 476
+ V D T T T D L D I L K
Sbjct: 379 TTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSD----AIVNMDVLITDEAKIAM 434

Query: 477 AHRITVNDQLVRNPALLSGMDKDGSGFDSARALFLKDSLSTARFHYSSGTGIGSREAPYS 536
A D RN + +D + + + + S + S + IG++ A
Sbjct: 435 ASEEDAGDSDNRN--GQALLDLQSN------SKTVGGAKSFNDAYASLVSDIGNKTAT-L 485

Query: 537 GTISDFANQVIARQGQLTSQVNAQKTASESGMNVAKKAYEMSYKVDVDTELVNLMELQNA 596
T S V+ QL++Q S SG+N +D E NL Q
Sbjct: 486 KTSSATQGNVVT---QLSNQQ-----QSISGVN-------------LDEEYGNLQRFQQY 524

Query: 597 FAANARVLDVSNKLFDQLMQ 616
+ ANA+VL +N +FD L+
Sbjct: 525 YLANAQVLQTANAIFDALIN 544


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00315  FLGHOOKAP1    42     4e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 42.3 bits (99), Expect = 4e-06
Identities = 66/355 (18%), Positives = 112/355 (31%), Gaps = 70/355 (19%)

Query: 255 EYIEFLNQFNGVSADIAIDGEVNIASSVNFDL-SGSGLTPIDIDEETNDPALSS---VSR 310
+ + LNQ GV + G NI + + L GS + + DP+ ++ V
Sbjct: 203 QLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTARQLAAVPSSADPSRTTVAYVDG 262

Query: 311 DAVLAENNTKFLNSTITGGAVTVYDKNGTPVNVELRWALNREDQWSLYYNSAPKAEGTNT 370
A E K LN+ GG +T + +L N Q +L + AE NT
Sbjct: 263 TAGNIEIPEKLLNTGSLGGILTFRSQ-------DLDQTRNTLGQLALAF-----AEAFNT 310

Query: 371 SW-------KKIGDVSFD-AAGRMVTPADG----AMSISGLEVNGVKAADHTLSFGSDK- 417
G+ F ++ A+ + + + V A D+ +SF +++
Sbjct: 311 QHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATDYKISFDNNQW 370

Query: 418 -------------LTQYEDKIGFATSIDIAQNGYPAGNLRSISITADGFV---------- 454
K+ F + ++ +D V
Sbjct: 371 QVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIVNMDVLITDEA 430

Query: 455 ---------TGNYANGKSQQLYKVPIATFAAEQELQRIDGAAFART----PTSGEPDFRN 501
G+ N Q L + + D A + T+
Sbjct: 431 KIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGNKTATLKTSSA 490

Query: 502 GGSIRAKAYEASNADIAG-----EFSKLIITQQAYSANSKVLSTANQMLDSVLNI 551
I+G E+ L QQ Y AN++VL TAN + D+++NI
Sbjct: 491 TQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545



Score = 32.6 bits (74), Expect = 0.005
Identities = 14/36 (38%), Positives = 19/36 (52%)

Query: 7 INAATTGLDAQSTALENISGNVANSGTTGYKRLDTT 42
IN A +GL+A AL S N+++ GY R T
Sbjct: 4 INNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI 39


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00316  ACRIFLAVINRP  30     0.011    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 30.2 bits (68), Expect = 0.011
Identities = 18/89 (20%), Positives = 30/89 (33%), Gaps = 10/89 (11%)

Query: 5 GTDTYVATEDLQVAVNAAITLGRPLLVKGEPGTGKTALAQEIAASQGAPLIEWHIKSTTK 64
GTD +A +Q + A PLL + G + + S + L+ S
Sbjct: 97 GTDPDIAQVQVQNKLQLA----TPLLPQEVQQQGIS-----VEKSSSSYLMVAGFVSDNP 147

Query: 65 AQQGLYEYD-AVSRLRDSQLGDPKVADIS 92
D S ++D+ V D+
Sbjct: 148 GTTQDDISDYVASNVKDTLSRLNGVGDVQ 176


13  Phong_00327  Phong_00333  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
Phong_00327  1       14       1.754104   Apolipoprotein A1/A4/E domain protein
Phong_00328  -2      20       -1.122523  hypothetical protein
Phong_00329  -2      19       -0.711846  mce related protein
Phong_00330  -2      18       -0.320267  putative ABC transporter ATP-binding protein
Phong_00331  -3      17       -0.590604  putative phospholipid ABC transporter permease
Phong_00332  -3      19       -0.839792  putative 3-phenylpropionic acid transporter
Phong_00333  -4      18       0.255158   chemotaxis regulator CheZ
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00327  FLAGELLIN     32     0.029    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 32.3 bits (73), Expect = 0.029
Identities = 57/432 (13%), Positives = 131/432 (30%), Gaps = 27/432 (6%)

Query: 738 EDMQTKVSDTLSTKLDSFTADVDSANTRLSGTLRSTTNEVTEKITQVNDELGFAIAA--G 795
+ + + LS+ L +A D+A ++ S +T+ ND + A
Sbjct: 21 QSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQASRNANDGISIAQTTEGA 80

Query: 796 VNQITQTFSTTQDELNLTLTATGNTVGDRIAAANDDISRTLSQGVSSLNSTVSGASEQLQ 855
+N+I + EL++ T N+ D + D+I + L + N T + L
Sbjct: 81 LNEINNNLQRVR-ELSVQATNGTNSDSDL-KSIQDEIQQRLEEIDRVSNQTQFNGVKVLS 138

Query: 856 ETLNVT---AADVTTRISTANDELSGKLTTGMSALNRTITEAGDKLETQLGETSSRVSGE 912
+ + A+ I+ ++ K + G+ N + + + +
Sbjct: 139 QDNQMKIQVGANDGETITIDLQKIDVK-SLGLDGFNVNGPKEATVGDLKSSFKNVTGYDT 197

Query: 913 LTQTTQKVTEELAGASSQLTTTAASSLGEISDKIAEANIVFGTKVETGVTGITEAVSLAT 972
K ++ + TTA + ++ A + + +
Sbjct: 198 YAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDAENNTAVDLFKTTKSTA 257

Query: 973 EQLEEALSSNSRRVTGELNSASTTLVDTVSGSLDQISEKIDEANTGFSEKLATGVNGISD 1032
E + G + G I K G K++T +NG
Sbjct: 258 GTAEAKA------IAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNG---KVSTTINGEKV 308

Query: 1033 ALSLATSELEETLTNNSMRVTGELNTASTNLVDTVSLALADVSQQVDSANTRLTTTMQGG 1092
L++A + + + + S D ++ + + L
Sbjct: 309 TLTVADITAGAANVDAATLQSSKNVYTSVVNGQF---TFDDKTKNESAKLSDLEAN-NAV 364

Query: 1093 LETLSTSLTEAEQSLSGKVEATNRNIESALQNSQEAIAAATSTATQTLGNSLETSVTGIS 1152
++ AE + + + ++ I S + + + +
Sbjct: 365 KGESKITVNGAEYTANAAGDKVTLAGKTM------FIDKTASGVSTLINEDAAAAKKSTA 418

Query: 1153 EAISSADTALAQ 1164
++S D+AL++
Sbjct: 419 NPLASIDSALSK 430


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00329  OMS28PORIN    29     0.046    OMS28 porin signature.
		>OMS28PORIN#OMS28 porin signature.

Length = 257

Score = 29.0 bits (64), Expect = 0.046
Identities = 44/192 (22%), Positives = 81/192 (42%), Gaps = 23/192 (11%)

Query: 235 AGIVSNSEEFTAKLVDKAEELDGVFADVQEAAGQVQQFTEGL-NSSLQQVDAVIAAVDPN 293
+ ++ +S++ K +D+ ++++ + + V EG+ SSL+ V++ A V
Sbjct: 36 SNVLEHSDQKDNKKLDQKDQVNQALDTINKVTEDVSSKLEGVRESSLELVESNDAGVVKK 95

Query: 294 AVEQAVQGAADLGALVSDNKADVAKTLVNASQISTDLAAVSQELADNKQLVSELGTRAKE 353
V G+ L +DVAK V ASQ +T +A S +A+ V E+ +A +
Sbjct: 96 FV-----GSMSL-------MSDVAKGTVVASQEATIVAKCSGMVAEGANKVVEMSKKAVQ 143

Query: 354 VMERLSVAAGDAQAVIAAV----------EPEKVRAIVANVEDVSAAVAGKSTQIAKTVD 403
++ AG+A +I E E + A VE V + + +TV
Sbjct: 144 ETQKAVSVAGEATFLIEKQIMLNKSPNNKELELTKEEFAKVEQVKETLMASERALDETVQ 203

Query: 404 DVSAAAANVRGI 415
+ V G+
Sbjct: 204 EAQKVLNMVNGL 215


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00332  TCRTETA       51     5e-09    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 50.6 bits (121), Expect = 5e-09
Identities = 81/392 (20%), Positives = 146/392 (37%), Gaps = 49/392 (12%)

Query: 17 ISALFSTFI--FGIGLFLPYFPVYLEGLQFTPVQIA---VLASLPNLVKVVSTPLLTSLS 71
I L + + GIGL +P P L L + A +L +L L++ P+L +LS
Sbjct: 8 IVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALS 67

Query: 72 DKSGRRRRSIALYALFCV-LSLVALSQTDTYAVVFGIVLVFSFFLAPLQPLSDAYAFEAV 130
D+ GRR + A V +++A + VL +A + + A A +
Sbjct: 68 DRFGRRPVLLVSLAGAAVDYAIMATAPFLW-------VLYIGRIVAGITGATGAVAGAYI 120

Query: 131 NNCGIDYGKPRAVG------SGFFIIATMFGGWYIGFGPSVHLLYFIAAAF--CLLIWMA 182
+ + R G + + GG GF P H +F AAA +
Sbjct: 121 ADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSP--HAPFFAAAALNGLNFLTGC 178

Query: 183 LALPPMPSEHRREMQQGTVDGNELKTASLHIMLLATGLVLGSLGALFS-FGSIFWLANG- 240
LP RR +++ ++ + + ++A + + + L + W+ G
Sbjct: 179 FLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGE 238

Query: 241 ----LSDSQVGLLWSA-GVVAEICVFLVGQRLIKRFGVFQLIIIGSLGALVRWVLFPIAD 295
+ +G+ +A G++ + ++ + R G + +++G + ++L A
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFAT 298

Query: 296 GFWSALALQLLHAFNFGMVFIALMNYLSKSISAARLGTAQGLSQTYIGLFTGLTGV---- 351
W A + +L A G+ AL LS+ + R G QG T LT +
Sbjct: 299 RGWMAFPIMVLLAS-GGIGMPALQAMLSRQVDEERQGQLQGSLAA----LTSLTSIVGPL 353

Query: 352 LSGWLYEMSPAY----------AFYSMALIVL 373
L +Y S A Y + L L
Sbjct: 354 LFTAIYAASITTWNGWAWIAGAALYLLCLPAL 385


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00333  IGASERPTASE   36     4e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 35.8 bits (82), Expect = 4e-04
Identities = 24/142 (16%), Positives = 43/142 (30%), Gaps = 15/142 (10%)

Query: 261 EPAKVAPAEAEEALTEPQAQEPEAKPAVPDTIKT-----------EASAEETASEPEVGI 309
+ + +A E + + EAK V +T E ET V
Sbjct: 1049 KTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEK 1108

Query: 310 ADTGSEEA--VLDAPRSTSKQPAQLGA--ISQPDPAPQPEPDEQADASAEEDEEEERPSS 365
+ E + P+ TS+ + QP P E D + + + +
Sbjct: 1109 EEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADT 1168

Query: 366 GRTFKQIAKNSDPLEELSTGER 387
+ K+ + N + ST
Sbjct: 1169 EQPAKETSSNVEQPVTESTTVN 1190


14  Phong_00613  Phong_00617  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
Phong_00613  0       19       1.512525   Transcriptional regulatory protein ZraR
Phong_00614  -1      19       -0.150482  Flagellar motor switch protein FliN
Phong_00615  -1      20       1.232765   flagellar assembly protein H
Phong_00616  -2      14       1.044607   Flagellar motor switch protein FliG
Phong_00617  -2      15       0.227493   Flagellar M-ring protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00613  HTHFIS        432    e-151    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 432 bits (1113), Expect = e-151
Identities = 165/471 (35%), Positives = 245/471 (52%), Gaps = 37/471 (7%)

Query: 2 RLLIVGTLDGQLTTATKIAMDRGASVTHAGELTSALRVLRGGGGADLLMVDV---GVDIA 58
+L+ T + G V + R + G G DL++ DV +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDG-DLVVTDVVMPDENAF 63

Query: 59 DLVGQMEQERIHVPIVACGIGADAKSAVSAIKAGAKEYIPLPPDPELIAAV--------- 109
DL+ ++++ R +P++ +A+ A + GA +Y+P P D + +
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 110 -----LAAVSQDQSSMVFRDEAMAALVSLARQIAPSDASVLITGESGSGKEVIARFVHEN 164
L SQD +V R AM + + ++ +D +++ITGESG+GKE++AR +H+
Sbjct: 124 RRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDY 183

Query: 165 SSRKDKPFISINCAAIPEHLLESELFGHEKGAFTGAVARRIGKFEEASGGTLLLDEISEM 224
R++ PF++IN AAIP L+ESELFGHEKGAFTGA R G+FE+A GGTL LDEI +M
Sbjct: 184 GKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDM 243

Query: 225 DVRLQAKLLRALQERVIDRVGGSKPVAVNIRVLATSNRDLTQAVQDGSFREDLLYRLNVV 284
+ Q +LLR LQ+ VGG P+ ++R++A +N+DL Q++ G FREDL YRLNVV
Sbjct: 244 PMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVV 303

Query: 285 NLKLPPLRERPGDILALSEHFIRIYAKANGMKPMPISAEAQSALAANEWRGNVRELENTM 344
L+LPPLR+R DI L HF++ K G+ EA + A+ W GNVRELEN +
Sbjct: 304 PLRLPPLRDRAEDIPDLVRHFVQQAEK-EGLDVKRFDQEALELMKAHPWPGNVRELENLV 362

Query: 345 HRALLLAQGGEIGLEAI--RMPDGTPIATSPAQQAASRAASAAEALSRTM---------- 392
R L I E I + P + A S + S ++A+ M
Sbjct: 363 RRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDA 422

Query: 393 ------VGRTVADVERDLILDTLDHCLGNRTHAAKILGISIRTLRNKLNQY 437
R +A++E LIL L GN+ AA +LG++ TLR K+ +
Sbjct: 423 LPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIREL 473


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00614  FLGMOTORFLIN  88     5e-26    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 88.4 bits (219), Expect = 5e-26
Identities = 36/81 (44%), Positives = 55/81 (67%)

Query: 32 VSKSAADLEAVFDVPVQISAVLGHSRMKISEMLHLDTGVVMELDRKVGEAIDIYVNDRLV 91
VS + D++ + D+PV+++ LG +RM I E+L L G V+ LD GE +DI +N L+
Sbjct: 47 VSGAMQDIDLIMDIPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLI 106

Query: 92 ARGEVVLVEEKLGVTMTEIIK 112
A+GEVV+V +K GV +T+II
Sbjct: 107 AQGEVVVVADKYGVRITDIIT 127


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00616  FLGMOTORFLIG  281    3e-95    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 281 bits (720), Expect = 3e-95
Identities = 104/331 (31%), Positives = 184/331 (55%), Gaps = 2/331 (0%)

Query: 22 KLSGPERAAVLLLALGEEYGRPIWEGLDDIEVRQVSSAMASLGPITPAMLEELFVDFLKR 81
L+G ++AA+LL+++G E +++ L E+ ++ +A L IT + + + ++F +
Sbjct: 14 ALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFKEL 73

Query: 82 LSTNGAL-TGNVDVTERLLNSFLPGERVELIMEEIRGPAGRNMWEKLSNVPESVLANYLK 140
+ + G +D LL L ++ I+ + +E + + + N+++
Sbjct: 74 MMAQEFIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQSRPFEFVRRADPANILNFIQ 133

Query: 141 NEYPQTIAVVLSKIQPEHSARVLALLNDELALEVVHRMLKMESIQKDILRKVEQTLRIEF 200
E+PQTIA++LS + P+ ++ +L+ L E+ V R+ M+ +++R+VE+ L +
Sbjct: 134 QEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLEKKL 193

Query: 201 MSNLS-QTSRRDSHEIMAEIFNNFDRQTEVRFLSSLEEDNRESAERIRNLMFTFEDLLKL 259
S S + + + EI N DR+TE + SLEE++ E AE I+ MF FED++ L
Sbjct: 194 ASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDIVLL 253

Query: 260 DAASCQTLLRNVEKDKLALALKGANEIAMEFFTSNMSQRAAKMLQEDMEALGPVRLREVD 319
D S Q +LR ++ +LA ALK + E NMS+RAA ML+EDME LGP R ++V+
Sbjct: 254 DDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRKDVE 313

Query: 320 EAQSAMVAKAKELATSGDIEINKKKGEDDLV 350
E+Q +V+ ++L G+I I++ ED LV
Sbjct: 314 ESQQKIVSLIRKLEEQGEIVISRGGEEDVLV 344


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00617  FLGMRINGFLIF  316    e-103    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 316 bits (811), Expect = e-103
Identities = 165/579 (28%), Positives = 267/579 (46%), Gaps = 68/579 (11%)

Query: 6 DFLKRLGPARLGAMGAMAAILVGVFAFIIMRATAPQLVPLYTDLSLEDSSAIVSQLQSSG 65
++L RL + + V + +++ A P L+++LS +D AIV+QL
Sbjct: 14 EWLNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQDGGAIVAQLTQMN 73

Query: 66 TVYEIRENGAALYVPQESVHQLRLTMAGVGLPSGGGVGYEIFDKSDTLGATSFVQNINRL 125
Y A+ VP + VH+LRL +A GLP GG VG+E+ D+ G + F + +N
Sbjct: 74 IPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEK-FGISQFSEQVNYQ 132

Query: 126 RALEGELARTIRSISRVVSARVHLVIPEKQLFQRDREPPTASIAVKVRG--SLDSGQIRS 183
RALEGELARTI ++ V SARVHL +P+ LF R+++ P+AS+ V + +LD GQI +
Sbjct: 133 RALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRALDEGQISA 192

Query: 184 IQHLVASAVEGLEPSYVSIVDERGALLAAGTGADDGAFGSANLDERRISIENRMRAQVED 243
+ HLV+SAV GL P V++VD+ G LL G + + +E+R++ ++E
Sbjct: 193 VVHLVSSAVAGLPPGNVTLVDQSGHLLTQSN--TSGRDLNDAQLKFANDVESRIQRRIEA 250

Query: 244 ILNNVVGSGRARVRVAAELNLNRKTETEEVFDPNGQVVRSSQSREENQRSSSLNQSV--- 300
IL+ +VG+G +V A+L+ K +TEE + PNG +++ + S +
Sbjct: 251 ILSPIVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVGAGYPGG 310

Query: 301 ---TVSNQLP---DAGANDTGGGQEDNSNVT------------------EETVNYEISRS 336
+SNQ +A Q++ N ET NYE+ R+
Sbjct: 311 VPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNETSNYEVDRT 370

Query: 337 TTTEVIEAGGLERLSIAVLVDGTYVNDESGAQVYQPRSQEELDSIAALVRSAVGFDERRG 396
+ G +ERLS+AV+V+ + D P + +++ I L R A+GF ++RG
Sbjct: 371 IRHTKMNVGDIERLSVAVVVNYKTLADGK----PLPLTADQMKQIEDLTREAMGFSDKRG 426

Query: 397 DKIEVVNLRF--IDTPVEEMPEDGQGWMEFTRADIMRLAELGVLLVISVLMLVFAVNPLM 454
D + VVN F +D E+P Q ++ ++LV++ ++ AV P +
Sbjct: 427 DTLNVVNSPFSAVDNTGGELPFWQQQSF---IDQLLAAGRWLLVLVVAWILWRKAVRPQL 483

Query: 455 RRILASEDTAEPSEVIGAGAAVTGTAPAAGAAVGAEAEAAELGPSASLQEAEAQAMQKNG 514
R A AA + E L + E Q
Sbjct: 484 TRR-------------------VEEAKAAQEQAQVR-QETEEAVEVRLSKDE----QLQQ 519

Query: 515 WIADAKTQAALHSSSIKELGNMIDEMPTEAVNIVRSWIN 553
A+ + A + S I+E M D P ++R W++
Sbjct: 520 RRANQRLGAEVMSQRIRE---MSDNDPRVVALVIRQWMS 555


15  Phong_00652  Phong_00659  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
Phong_00652  0       15       -1.923315  putative HTH-type transcriptional regulator
Phong_00653  0       15       -2.161776  Multidrug export protein EmrA
Phong_00654  0       12       -2.600634  putative ABC transporter ATP-binding protein
Phong_00655  -1      12       -2.564464  Inner membrane transport permease YbhR
Phong_00656  1       14       -2.316846  Aldehyde-alcohol dehydrogenase
Phong_00657  2       18       -0.531540  Ribosomal large subunit pseudouridine synthase
Phong_00658  1       19       -0.739812  SCP-2 sterol transfer family protein
Phong_00659  0       17       -0.478800  Fatty acid metabolism regulator protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00652  HTHTETR       61     1e-13    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 61.2 bits (148), Expect = 1e-13
Identities = 26/130 (20%), Positives = 50/130 (38%), Gaps = 14/130 (10%)

Query: 1 MPIDTQTQSTELSSSQKTRLALINAALRLFGSKGYAATSTREIAEAASTNISSIAYHFGG 60
M T+ ++ E TR +++ ALRLF +G ++TS EIA+AA +I +HF
Sbjct: 1 MARKTKQEAQE------TRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKD 54

Query: 61 KQGLHTACAEYFVSLAKTHVNDTSPATSAAWQSLSQENAEEALISFVGRMIDFLLGSKEA 120
K L + E S + + L + +++ + +
Sbjct: 55 KSDLFSEIWELSESNIGELELEYQAKFPG--------DPLSVLREILIHVLESTVTEERR 106

Query: 121 PSFVSFVLRE 130
+ + +
Sbjct: 107 RLLMEIIFHK 116


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00653  RTXTOXIND     61     1e-12    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 61.0 bits (148), Expect = 1e-12
Identities = 43/274 (15%), Positives = 99/274 (36%), Gaps = 72/274 (26%)

Query: 28 VEGEYARLAPVEAAEIAQVYVRRGDSVKAGDIMARLQTEDVEIDLVEAEAAVNEA----- 82
G + P+E + + ++ V+ G+SV+ GD++ +L E D ++ ++++ +A
Sbjct: 92 HSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQT 151

Query: 83 --QAHLDNLKLGKRPE----------------------------------------EINV 100
Q +++L K PE ++
Sbjct: 152 RYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDK 211

Query: 101 LEASIASAAALLTEKE---RVLARQVG----LYKREVMSK-------ATLDKAQTDHDVA 146
A + A + E RV ++ L ++ ++K +A + V
Sbjct: 212 KRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVY 271

Query: 147 SSRLAELKAQLEVAKLPARK----------DQIKAAQSALQRAKARLERTKWRLAERTIH 196
S+L ++++++ AK + D+++ + L + + R I
Sbjct: 272 KSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIR 331

Query: 197 APADGKVFDV-IHHRGEIASPTAPVVSVLPEGAV 229
AP KV + +H G + + ++ ++PE
Sbjct: 332 APVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDT 365


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00655  ABC2TRNSPORT  41     4e-06    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.
Length = 262

Score = 41.1 bits (96), Expect = 4e-06
Identities = 43/170 (25%), Positives = 67/170 (39%), Gaps = 3/170 (1%)

Query: 205 SLTRERERGTMENLLAMPANSAEIMLGKVTPYLCLGAVQMTIILGAGQLLFDVPFVGSLW 264
+ R + T E +L +I+LG++ A+ I L ++ SL
Sbjct: 90 AFGRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGYTQWL-SLL 148

Query: 265 LILFSTLIYVFSLVLLGYFFSTIAKSQMQAMQLMVLSLLPSILLSGFMFPFRGMPHWAQY 324
L + + LG + +A S + L + P + LSG +FP +P Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 325 IGECLPITHFIRIIRGVML--KGADLSQVGNELAILFVFVFVFSLMALSR 372
LP++H I +IR +ML D+ Q L I V F S L R
Sbjct: 209 AARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIVIPFFLSTALLRR 258


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
Phong_00659  HTHTETR       63     2e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 63.5 bits (154), Expect = 2e-14
Identities = 29/174 (16%), Positives = 68/174 (39%), Gaps = 6/174 (3%)

Query: 27 SRQEEKSRRMRREIVNAAISVLHEQGYHRASIKKIAERSAFSQGALQHHFPTKNDLMQHV 86
+ +++++ R+ I++ A+ + +QG S+ +IA+ + ++GA+ HF K+DL +
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 87 LERLLRKSVTLTQDWIAEQGGANI-QLGALTQSWWRLQIRSPEFLAMVEILV----ASRT 141
E L ++ A+ G + L + + ++EI+
Sbjct: 63 WELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGE 122

Query: 142 EETLRGRLRHAFEDYMQQISALLARHTNSDEADAA-DAAMLARVTRCMLFGFIT 194
++ R+ + +I L + A A + R + G +
Sbjct: 123 MAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLME 176



 
Contact Sachin Pundhir for Bugs/Comments.