PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: Fusobacterium_nucleatum_uid295_AE009951.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
 
 
C) Potential GIs and PAIs in AE009951
S.No  Start   End     Bias  Virulence  Insertion elements  Prediction
1     FN1501  FN1555  Y     Y          N                   Pathogenicity Island (biased-composition)

LocusTag  DNBias  CDNBias    %GCBias  Product
FN1501         2        7   0.495066  Nickel transport ATP-binding protein nikD
FN1502         1        7   1.069633  Nickel transport system permease protein nikC
FN1503         0        8   1.574072  Nickel transport system permease protein nikB
FN1504        -1       10   2.091592  Nickel-binding protein
FN1505        -1       15   2.104742  6,7-dimethyl-8-ribityllumazine synthase
FN1506        -2       15   0.769514  Diaminohydroxyphosphoribosylaminopyrimidine
FN1507        -2       15  -0.543200  Riboflavin synthase alpha chain
FN1508        -2       14  -1.054896  GTP cyclohydrolase II
FN1509         3       15  -2.816592  unknown
FN1510         3       14  -3.297444  unknown
FN1511         2       13  -3.162176  Transposase
FN1512         2        8  -1.526915  hypothetical exported 24-amino acid repeat
FN1513         1        7  -1.301728  Phosphatidylinositol-4-phosphate 5-kinase
FN1514         1        6  -1.079524  hypothetical exported 24-amino acid repeat
FN1515         0        9   0.360053  hypothetical exported 24-amino acid repeat
FN1516         0       10   2.067791  unknown
FN1517        -1        8   2.432331  Leucyl-tRNA synthetase
FN1518        -1       10   2.827122  RNA polymerase sigma-H factor
FN1519        -1        9   3.419807  23S rRNA methyltransferase
FN1520        -1        9   3.637296  UDP-N-acetylglucosamine
FN1521         1        7   3.742626  Dipeptide transport system permease protein
FN1522         1        8   3.672153  Dipeptide transport system permease protein
FN1523         2        7   3.544395  Dipeptide-binding protein
FN1524         2        7   3.263027  Dipeptide transport ATP-binding protein dppD
FN1525         2        9   3.058621  Dipeptide transport ATP-binding protein dppF
FN1526         1       10   3.066455  Fusobacterium outer membrane protein family
FN1527         2       11   0.687236  Hypothetical protein
FN1528         1       10   1.356970  Hypothetical protein
FN1529         1        9   2.083993  Hypothetical protein
FN1530         1        9   3.392412  Transposase
FN1531         1       10   3.939585  murein hydrolase export regulator
FN1532         1       11   3.469644  murein hydrolase exporter
FN1533         1       10   2.773443  Electron transfer flavoprotein alpha-subunit
FN1534         2       12   3.171723  Electron transfer flavoprotein beta-subunit
FN1535         3       13   3.820131  Acyl-CoA dehydrogenase, short-chain specific
FN1536         3       13   3.282248  (S)-2-hydroxy-acid oxidase chain D
FN1537         2       16   2.834471  Arsenical pump-driving ATPase
FN1538         3       15   2.843863  Arsenical pump-driving ATPase
FN1539         1       17   4.542288  Iron-sulfur cluster-binding protein
FN1540         1       16   4.138483  Iron-sulfur cluster-binding protein
FN1541        -1       12   3.170418  Heptaprenyl diphosphate synthase component II
FN1542        -2       11   3.723698  1,4-dihydroxy-2-naphthoate
FN1543        -1       10   3.612813  Ubiquinone/menaquinone biosynthesis
FN1544        -1       11   3.928764  Probable electron transfer flavoprotein-quinone
FN1545        -2        9   2.025061  Ferredoxin like protein
FN1546        -2        9   1.185972  Protein Translation Elongation Factor G (EF-G)
FN1547        -1        9  -0.596938  PTS permease for N-acetylglucosamine and
FN1548         0       10  -3.064461  Hypothetical protein
FN1549        -1       11   0.254679  Stomatin like protein
FN1550         0       12   1.091727  Hypothetical protein
FN1551        -1       13   2.517626  Hypothetical protein
FN1552         0       14   3.089437  abortive phage resistance protein
FN1553         0       15   3.956898  abortive phage resistance protein
FN1554         1       14   4.280353  Fusobacterium outer membrane protein family
FN1555         3       17   4.582042  Protein Translation Elongation Factor Tu
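
The three thresholds listed under the input parameters (dinucleotide bias 2, codon bias 4, %GC bias 3) can be read against the DNBias, CDNBias and %GCBias columns of the table above. The sketch below is only an illustration of that comparison. It assumes a measure counts as biased when its absolute value is at least the threshold, which is a guess and not PredictBias's documented criterion. The names THRESHOLDS, ROWS and flag_biased are hypothetical, and the three sample rows are copied from the table.

```python
# Thresholds taken from section A (input parameters).
THRESHOLDS = {"dn": 2, "cdn": 4, "gc": 3}

# (locus tag, DNBias, CDNBias, %GCBias) copied from three rows of island 1.
ROWS = [
    ("FN1501", 2, 7, 0.495066),
    ("FN1509", 3, 15, -2.816592),
    ("FN1555", 3, 17, 4.582042),
]


def flag_biased(dn, cdn, gc, thresholds=THRESHOLDS):
    """Report which of the three measures reach their threshold.

    Assumption (not confirmed by the source page): a measure is treated
    as biased when abs(value) >= threshold.
    """
    return {
        "dn": abs(dn) >= thresholds["dn"],
        "cdn": abs(cdn) >= thresholds["cdn"],
        "gc": abs(gc) >= thresholds["gc"],
    }


if __name__ == "__main__":
    for locus, dn, cdn, gc in ROWS:
        print(locus, flag_biased(dn, cdn, gc))
```

Under this assumed rule FN1555 reaches all three thresholds, while FN1501 and FN1509 reach the dinucleotide and codon thresholds but not the %GC one. How the tool combines per-ORF values into an island call is not stated on this page.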
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1509    PF08280       22     0.029    M protein trans-acting positive regulator
		>PF08280#M protein trans-acting positive regulator

Length = 530

Score = 22.5 bits (48), Expect = 0.029
Identities = 11/21 (52%), Positives = 13/21 (61%)

Query: 1 MFFSVIFLFGLNYLICNSPLF 21
MFFS FLF L + I + LF
Sbjct: 351 MFFSKSFLFNLQHFIPETNLF 371


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1512    NUCEPIMERASE  29     0.022    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 29.4 bits (66), Expect = 0.022
Identities = 13/32 (40%), Positives = 18/32 (56%)

Query: 208 YVDDLREGKTIDYYNDGKVFRLKNYKDNIGNG 239
+ + EGK+ID YN GK+ R Y D+I
Sbjct: 195 FTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEA 226


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1515    ANTHRAXTOXNA  31     0.014    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 31.3 bits (70), Expect = 0.014
Identities = 23/80 (28%), Positives = 39/80 (48%), Gaps = 9/80 (11%)

Query: 24 KNETELQKFRNKVDKVIKEELKNDYKKEYLKRKDNLKKIENN---------GEIGFEDED 74
KN+TE +KF++ ++ ++K E N+ + + +D LKKI + GEI F D D
Sbjct: 51 KNKTEKEKFKDSINNLVKTEFTNETLDKIQQTQDLLKKIPKDVLEIYSELGGEIYFTDID 110

Query: 75 FIFQFEDNALTLASKKLKSI 94
+ E L+ K +
Sbjct: 111 LVEHKELQDLSEEEKNSMNS 130


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1536    PF07675       32     0.007    Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 31.6 bits (71), Expect = 0.007
Identities = 27/86 (31%), Positives = 33/86 (38%), Gaps = 17/86 (19%)

Query: 360 NQIAPYLHYVNEVGKKYDFTVKSFGHAGDGNLHIYACSNDMEI-----GEFKRQVEEFLT 414
NQ A Y + E GKKY FT++ G GDG DME+ + V T
Sbjct: 493 NQPARYDDFAFEAGKKYTFTMRRAG-MGDG--------TDMEVEDDSPASYTYTVYRDGT 543

Query: 415 DIYSKASELGGLISGEHGIGYGKMEY 440
I L E G+ G EY
Sbjct: 544 KI---KEGLTATTFEEDGVAAGNHEY 566


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1546    TCRTETOQM     402    e-134    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family

signature.
Length = 639

Score = 402 bits (1034), Expect = e-134
Identities = 153/669 (22%), Positives = 276/669 (41%), Gaps = 63/669 (9%)

Query: 10 NIRNISLLGHRGSGKTTLVESILYVKDYIKRKGDVENGTTVSDFDKEEIRRIFSINTSLI 69
I NI +L H +GKTTL ES+LY I G V+ GTT +D E +R +I T +
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 70 PVEHNDVKLNFLDTPGYFDFVGEVVSALRVSASAVLVLDATAGVEVGTEKAWKLLEERKL 129
+ + K+N +DTPG+ DF+ EV +L V A+L++ A GV+ T + L + +
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 130 PRIIFVNKMDKGYVNYPKLLTELKEKFGKKIAPFCIPIGEKDEFKGFVNVVDMVGRVFDG 189
P I F+NK+D+ ++ + ++KEK +I
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIK-------------------------- 155

Query: 190 KECVDTPIPADIDVSEVRNLLFEAIAETDEVLMDKYFAGEEFTKEEIVKGLHKGVVNGDI 249
++ P + +E + + E ++ L++KY +G+ E+ + N +
Sbjct: 156 QKVELYPNMCVTNFTESEQW--DTVIEGNDDLLEKYMSGKSLEALELEQEESIRFHNCSL 213

Query: 250 VPVMVGSAQQNIGIHTLLNYLELYMPCPTELFSGQRIGEDPTTQQEKVVKISSENPFSAI 309
PV GSA+ NIGI L+ + T ++
Sbjct: 214 FPVYHGSAKNNIGIDNLIEVITNKFYSSTH---------------------RGQSELCGK 252

Query: 310 VFKTLVDPFIGKISFFKVNSGTIKKETEVFNPKKNKKERIAQILTMQGNKQIELDELRAG 369
VFK ++++ ++ SG + V +K K +I ++ T + ++D+ +G
Sbjct: 253 VFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI-KITEMYTSINGELCKIDKAYSG 311

Query: 370 DIGAT--TKLQFTQT-GDTLCDKNFPVVFNKIRFPKPNIYSGVLPADKNDDEKLSTAIQR 426
+I L+ GDT +I P P + + V P+ E L A+
Sbjct: 312 EIVILQNEFLKLNSVLGDTKLLPQ----RERIENPLPLLQTTVEPSKPQQREMLLDALLE 367

Query: 427 VMEEDPTFVMSRNYETKQLLIGGQGEKHLYIILCKIKNKFGVHAELENVIVSYRETILGK 486
+ + DP + T ++++ G+ + + ++ K+ V E++ V Y E L K
Sbjct: 368 ISDSDPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEKYHVEIEIKEPTVIYMERPLKK 427

Query: 487 AEVQGKHKKQSGGAGQFGDVFIRFE--HSDKDFEFIDEIKGGVVPRNYIPAVEKGLIEAK 544
AE + + + + ++ + G + +++ AV +G+
Sbjct: 428 AE--YTIHIEVPPNPFWASIGLSVSPLPLGSGMQYESSVSLGYLNQSFQNAVMEGIRYGC 485

Query: 545 EKGVLAGYPVINFKATLYDGSYHPVDSNDLSFKLAAILAFKAGMEKAKPVLLEPFVRMEI 604
E+G L G+ V + K G Y+ S F++ A + + ++KA LLEP++ +I
Sbjct: 486 EQG-LYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQVLKKAGTELLEPYLSFKI 544

Query: 605 RIPEEYMGDVMGDLNKRRGRVLGMDHTETGEQLLLAEVPEAEILKYSIDLRALTQGRGEF 664
P+EY+ D K ++ + E +L E+P I +Y DL T GR
Sbjct: 545 YAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCIQEYRSDLTFFTNGRSVC 603

Query: 665 EYEFVRYEE 673
E Y
Sbjct: 604 LTELKGYHV 612


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1555    TCRTETOQM     85     8e-20    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family

signature.
Length = 639

Score = 84.5 bits (209), Expect = 8e-20
Identities = 53/155 (34%), Positives = 81/155 (52%), Gaps = 17/155 (10%)

Query: 13 VNIGTIGHVDHGKTTTT-------AAISKVLS-DKGWASKVDFDQIDAAPEEKERGITIN 64
+NIG + HVD GKTT T AI+++ S DKG + E++RGITI
Sbjct: 4 INIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLL------ERQRGITIQ 57

Query: 65 TAHIEYETEKRHYAHVDCPGHADYVKNMITGAAQMDGAILVVSAADGPMPQTREHILLSR 124
T ++ E +D PGH D++ + + +DGAIL++SA DG QTR R
Sbjct: 58 TGITSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALR 117

Query: 125 QVGVPYIVVYLNKSDMVEDEELLELVEMEVRELLT 159
++G+P I ++NK D + L V +++E L+
Sbjct: 118 KMGIPTI-FFINKIDQNGID--LSTVYQDIKEKLS 149
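
Each hit above carries a BLAST-style statistics block: a "Score = ... bits (...), Expect = ..." line followed by an "Identities = a/b" line. The sketch below shows one way to pull those numbers out of such a block and compare the E-value with the 0.05 cutoff given in the input parameters. The function names are hypothetical and are not part of PredictBias. The one format assumption is that an Expect value printed as "e-134" is shorthand for 1e-134.

```python
import re

# Matches e.g. "Score = 22.5 bits (48), Expect = 0.029"
STATS_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s+bits\s+\((\d+)\),\s*Expect\s*=\s*(\S+)"
)
# Matches e.g. "Identities = 11/21 (52%)"
IDENT_RE = re.compile(r"Identities\s*=\s*(\d+)/(\d+)")

EVALUE_CUTOFF = 0.05  # E-value (RPSBlast) from section A


def parse_evalue(text):
    """Convert an Expect string to float; 'e-134' is read as 1e-134."""
    if text.startswith("e"):
        text = "1" + text
    return float(text)


def parse_hit(block):
    """Extract bit score, raw score, E-value and identity fraction."""
    stats = STATS_RE.search(block)
    ident = IDENT_RE.search(block)
    if stats is None or ident is None:
        raise ValueError("no Score/Identities lines found")
    matched, aligned = int(ident.group(1)), int(ident.group(2))
    evalue = parse_evalue(stats.group(3))
    return {
        "bits": float(stats.group(1)),
        "raw_score": int(stats.group(2)),
        "evalue": evalue,
        "identity": matched / aligned,
        "passes_cutoff": evalue <= EVALUE_CUTOFF,
    }


if __name__ == "__main__":
    example = (
        "Score = 22.5 bits (48), Expect = 0.029\n"
        "Identities = 11/21 (52%), Positives = 13/21 (61%)\n"
    )
    print(parse_hit(example))
```

The example block is the FN1509 hit from this island. A low E-value alone does not imply a long or convincing alignment (that hit covers only 21 residues), so the identity fraction and alignment length are returned alongside it.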


S.No  Start   End     Bias  Virulence  Insertion elements  Prediction
2     FN1662  FN1734  Y     Y          N                   Pathogenicity Island (biased-composition)

LocusTag  DNBias  CDNBias    %GCBias  Product
FN1662         2       10  -2.543466  Hypothetical protein
FN1663         2        9  -2.883568  Hypothetical protein
FN1664         0       11  -2.932507  unknown
FN1665         1       10  -3.356462  Virulence-associated protein I
FN1666         1       10  -4.019529  Hypothetical protein
FN1667         1       10  -4.063058  dTDP-glucose 4,6-dehydratase
FN1668         2       12  -4.943025  Cholinephosphate cytidylyltransferase
FN1669         2       13  -4.636767  Choline transport protein
FN1670         2       13  -4.575568  Choline kinase
FN1671         1       17  -4.068592  hypothetical exported protein
FN1672         3       18  -3.047002  Membrane-bound O-acyltransferase
FN1673         2       22  -1.876974  unknown
FN1674         0       15  -0.192206  Hypothetical protein
FN1675         0       13  -0.726946  Transposase
FN1676         1       15  -0.930913  Transposase
FN1677         0       13  -3.451111  Transposase
FN1678         0       10  -3.442690  unknown
FN1679        -1        9  -2.294877  LPS biosynthesis protein WbpG
FN1680        -1        9  -3.877351  Transposase
FN1681         0       10  -3.974610  unknown
FN1682         0        9  -3.525625  Heteropolysaccharide repeat unit export protein
FN1683         0        9  -2.641353  Acetyltransferase
FN1684        -1       10  -2.018803  N-acetylneuraminate synthase
FN1685         0       12  -2.953989  dTDP-4-dehydrorhamnose reductase
FN1686         0       12  -3.454865  Spore coat polysaccharide biosynthesis protein
FN1687        -1       10  -3.426879  Gluconate 5-dehydrogenase
FN1688         0       10  -4.070152  Oxidoreductase
FN1689         0        9  -3.609588  UDP-N-acetylglucosamine 4,6-dehydratase
FN1690         2       12  -4.267819  Hypothetical protein
FN1691         2        9  -3.759574  Hypothetical membrane-spanning protein
FN1692         3       11  -3.805214  Glycosyl transferase
FN1693         2       10  -3.126845  Hypothetical protein
FN1694         3       11  -3.109184  UDP-N-acetyl-D-quinovosamine 4-epimerase
FN1695         3       10  -3.389991  Probable quinovosaminephosphotransferase
FN1696         2       10  -4.072922  UDP-N-acetylglucosamine 4,6-dehydratase
FN1697         0       10  -3.781871  Hypothetical protein
FN1698        -2       11  -1.943888  dTDP-4-dehydrorhamnose reductase
FN1699         0        8  -2.014893  dTDP-4-dehydrorhamnose 3,5-epimerase
FN1700         0        8  -1.435395  Hypothetical protein
FN1701         0        9  -0.789603  ABC transporter ATP-binding protein
FN1702         2        9  -0.374876  Bacitracin resistance protein (Putative
FN1703        -2       10   0.670625  ADP-L-glycero-D-manno-heptose-6-epimerase
FN1704        -2       11   0.703723  Serine protease
FN1705        -2       13   0.965536  unknown
FN1706        -3       13   0.880730  Transporter
FN1707        -2       13   0.669131  Aldose 1-epimerase
FN1708         0       14   0.996582  Polyribonucleotide nucleotidyltransferase
FN1709         1       13  -1.337191  CDP-diacylglycerol--glycerol-3-phosphate
FN1710         2       12  -2.332286  Integral membrane protein, YGGT family
FN1711         2       12  -1.943877  Methyltransferase
FN1712         0        9  -1.659338  Hypothetical protein
FN1713         1        8  -0.407991  tRNA (Uracil-5-) -methyltransferase
FN1714         1        8  -0.051166  Hypothetical protein
FN1715         1        7  -0.287913  ATPase
FN1716         1        9   0.107964  Hypothetical protein
FN1717         2        8  -0.067839  NAD-dependent DNA ligase
FN1718         1       10   0.705963  Protein translocase subunit secA
FN1719         1       12  -0.500676  Hypothetical protein
FN1720        -1       15  -0.926611  Hypothetical protein
FN1721        -1       13   0.110493  unknown
FN1722        -1       13   1.025400  Glucose inhibited division protein B
FN1723        -1       13   1.270672  Glucose inhibited division protein A
FN1724         2       12  -0.089313  Potassium uptake protein KtrA
FN1725         0       11  -0.470474  Potassium uptake protein KtrB
FN1726         0       11  -0.196995  Na+ driven multidrug efflux pump
FN1727         0        9  -0.581758  Chloride channel protein
FN1728         1        9  -1.480845  Pyrrolidone-carboxylate peptidase
FN1729         1       10   0.324959  4-amino-4-deoxychorismate lyase
FN1730         1       12   2.097840  Para-aminobenzoate synthase component I
FN1731         0       13   3.747105  Anthranilate synthase component II
FN1732         0       11   4.047928  Hypothetical protein
FN1733         1       11   3.631990  V-type sodium ATP synthase subunit D
FN1734         0       11   3.254818  V-type sodium ATP synthase subunit B
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1667    NUCEPIMERASE  168    3e-51    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 168 bits (428), Expect = 3e-51
Identities = 92/399 (23%), Positives = 156/399 (39%), Gaps = 91/399 (22%)

Query: 1 MKTYLITGAAGFIGANFLKYILKKHKDIKVIVVDSLT--YAGNLGTIK-EELKDSRVKFE 57
MK YL+TGAAGFIG + K +L+ V+ +D+L Y +L + E L +F
Sbjct: 1 MK-YLVTGAAGFIGFHVSKRLLEAGHQ--VVGIDNLNDYYDVSLKQARLELLAQPGFQFH 57

Query: 58 KVDIRDRKEIERVFSENKVDYVVNFAAESHVDRSIENPQIFLETNILGTQNLLDNAKKAW 117
K+D+ DR+ + +F+ + V V S+ENP + ++N+ G N+L+ +
Sbjct: 58 KIDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHN- 116

Query: 118 TVSKDENGYPIYREDIKYLQV-STDEVYGSLSKDYDEPIELVIDDEDVKKVIKNRKNLKT 176
I++L S+ VYG K P
Sbjct: 117 --------------KIQHLLYASSSSVYGLNRKM---P---------------------- 137

Query: 177 YGNNFFTEESPVD-PRSPYSASKTGADHIVIAYGETYKLPINITRCSNNYGPYHFPEKLI 235
F+ + VD P S Y+A+K + + Y Y LP R YGP+ P+ +
Sbjct: 138 -----FSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGLRFFTVYGPWGRPDMAL 192

Query: 236 PLMIKNILEGKKLPVYGKGDNIRDWLYVKDHCKGIDLVL------------------REA 277
K +LEGK + VY G RD+ Y+ D + I +
Sbjct: 193 FKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVIPHADTQWTVETGTPAASI 252

Query: 278 KSGEIYNIGGFNEEKNINIVKLVIDVLKEEITNNDEYKKVLKTDISNINYDLITYVQDRL 337
+YNIG + + ++ ++ + D L E N + +
Sbjct: 253 APYRVYNIGNSSPVELMDYIQALEDALGIEAKKN--------------------MLPLQP 292

Query: 338 GHDMRYAINPSRIAKDLGWYPETDFETGIRKTVKWYLEN 376
G + + + + + +G+ PET + G++ V WY +
Sbjct: 293 GDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDF 331


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1685    NUCEPIMERASE  68     5e-15    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 67.5 bits (165), Expect = 5e-15
Identities = 56/285 (19%), Positives = 109/285 (38%), Gaps = 50/285 (17%)

Query: 3 KVLILGSCGMLGSVLCEYLLQNNYQVIGIDKIN------LEN------KFEKYKLYNIDL 50
K L+ G+ G +G + + LL+ +QV+GID +N L+ ++ + IDL
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 51 LDFFKVEEVIFQEKPNIIINAAAIVNLNLCEENYELAELLHVDLNEQ-FLNL---SKKIS 106
D + ++ + + + + EN + D N FLN+ +
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYSLEN----PHAYADSNLTGFLNILEGCRHNK 117

Query: 107 FK-FIYISTDSVF-DGTKSNYIEEDLA-IPLNNYAKTKFLGEEEVKKMEDYIVIRT---- 159
+ +Y S+ SV+ K + +D P++ YA TK E +
Sbjct: 118 IQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGLR 177

Query: 160 --NIYGYSDRQN-SLLKWAYDELN-KNKKIYGYKNVIFNPVSIYQLADAILILIQKNFKG 215
+YG R + +L K+ L K+ +Y Y + + I +A+AI+ L
Sbjct: 178 FFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVIPHA 237

Query: 216 -------------------ILNIVSDKPISKFEFLKIIEEYLKKK 241
+ NI + P+ ++++ +E+ L +
Sbjct: 238 DTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIE 282


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1687    DHBDHDRGNASE  110    5e-31    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 110 bits (275), Expect = 5e-31
Identities = 70/273 (25%), Positives = 111/273 (40%), Gaps = 26/273 (9%)

Query: 8 NLFSLEDKIILITGGNGHLGKAMCHALADYGATLILASRNIEKNKQLCSELTNLYKNQNI 67
N +E KI ITG +G+A+ LA GA + N EK +++ S L ++
Sbjct: 2 NAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAE- 60

Query: 68 ALELDLESKQDTTIKIKNLIEKYGRIDILINNSYYGFSGKFHEMDYESWNRGIEGSLGTV 127
A D+ + + G IDIL+N + G H + E W + V
Sbjct: 61 AFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGV 120

Query: 128 FLCTNAVINEMLKNKKGKIINIASMYGINAPNVYELYEGSLCEKYYNPVNYGVGKAGIIQ 187
F + +V M+ + G I+ + S N V + Y KA +
Sbjct: 121 FNASRSVSKYMMDRRSGSIVTVGS----NPAGVPRTSMAA----------YASSKAAAVM 166

Query: 188 FTKYIAAVYGKEGIISNSISPGPF-----------PNFEIQKNQIFVERLTNKVPLKRIG 236
FTK + + I N +SPG N Q + +E +PLK++
Sbjct: 167 FTKCLGLELAEYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLA 226

Query: 237 KPEDLQGAIVFLCSDSSNYVNGHNLVIDGGWTI 269
KP D+ A++FL S + ++ HNL +DGG T+
Sbjct: 227 KPSDIADAVLFLVSGQAGHITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1689    NUCEPIMERASE  76     1e-17    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 76.0 bits (187), Expect = 1e-17
Identities = 44/238 (18%), Positives = 93/238 (39%), Gaps = 27/238 (11%)

Query: 6 TILITGGTGSFGNRFIERILSEYNPKKIIIYSRDEFKQDLMKKNFVMKYGVEKTKKLRFF 65
L+TG G G +R+L + + + I + +++ +K+ + + +F
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGH-QVVGIDNLNDYYDVSLKQA---RLELLAQPGFQFH 57

Query: 66 IGDVRDKERLYRAFD--GVDYVIHAAAMKQVPACEYNPFEAIKTNIHGAENVIEAAIDRG 123
D+ D+E + F + V + V NP +N+ G N++E
Sbjct: 58 KIDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNK 117

Query: 124 VKKVVALST---------------DKAVNPINLYGGTKLVSDKLFISANAYSGEKGTIFS 168
++ ++ S+ D +P++LY TK ++ + A+ YS G +
Sbjct: 118 IQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELM---AHTYSHLYGLPAT 174

Query: 169 VVRYGNVAGSRGS---VIPFFKQLLSQGKTELPITDIHMTRFWMVLDDAVNLVLKALK 223
+R+ V G G + F + + +GK+ M R + +DD +++
Sbjct: 175 GLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQD 232


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1694    NUCEPIMERASE  38     3e-05    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 37.8 bits (88), Expect = 3e-05
Identities = 28/168 (16%), Positives = 54/168 (32%), Gaps = 42/168 (25%)

Query: 4 INLMITGASGFIGS-----------------NFIDKY-------------KNEYNIIPVD 33
+ ++TGA+GFIG N D Y + + +D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 34 LL-KEKPENLSFDG-VDCILHLAA---LVHQMKGAAKEKYFEVNTELTRRIAEKAKIEGV 88
L +E +L G + + + + ++ Y + N I E + +
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHA--YADSNLTGFLNILEGCRHNKI 118

Query: 89 KHFVFYSTVKVYGYDGDLKNHNYVLNELSECNPKNDPYGESKWEAEKI 136
+H ++ S+ VYG N + + Y +K E +
Sbjct: 119 QHLLYASSSSVYG-----LNRKMPFSTDDSVDHPVSLYAATKKANELM 161


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1696    NUCEPIMERASE  85     3e-20    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 85.2 bits (211), Expect = 3e-20
Identities = 56/295 (18%), Positives = 112/295 (37%), Gaps = 42/295 (14%)

Query: 282 IFVTGGGGSIGSELINQIAKYNPKKIINIEINENASYLMELELKR--KYPYLDYKTEIAS 339
VTG G IG + ++ + +++ I+ N N Y + L+ R ++
Sbjct: 3 YLVTGAAGFIGFHVSKRLLE-AGHQVVGID-NLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 340 VRDFDKLDMLFNNYKPDILFHAAAHKHVPLMENNPEEAIKNNVFGTRNVAECCLKYKLES 399
+ D + + LF + + +F + V NP +N+ G N+ E C K++
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 400 VVLIST---------------DKAVNPTNIMGATKRVCEMIFQKYSERSSETKFVAVRFG 444
++ S+ D +P ++ ATK+ E++ YS +RF
Sbjct: 121 LLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSH-LYGLPATGLRFF 179

Query: 445 NVLGSNGS---VIPIFSKLIEERKNLTL-THKDIIRYFMTIPEAAQLVIEA--------- 491
V G G + F+K + E K++ + + + R F I + A+ +I
Sbjct: 180 TVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVIPHADT 239

Query: 492 ---------ATIGKGGEILILDMGEPVKIYDLAKNMIKLSGSTVGIDIVGLRPGE 537
A + + PV++ D + + G +++ L+PG+
Sbjct: 240 QWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLPLQPGD 294


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1698    NUCEPIMERASE  37     9e-05    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 36.7 bits (85), Expect = 9e-05
Identities = 27/157 (17%), Positives = 53/157 (33%), Gaps = 21/157 (13%)

Query: 33 EVDITNGDFLRAYIKTMHQNYKIDTIINCAAYNDVDKAETEKELCYKANAEAPANLAMIA 92
++D+ + + + + + + V + +N N+
Sbjct: 58 KIDLADREGMTDLFA----SGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGC 113

Query: 93 SEIGATYITY-STDFVFNGMTTNYLYNESTGYTEEDEA-HPLSAYAKAKYEGELLV---S 147
++ Y S+ V Y N ++ +D HP+S YA K EL+ S
Sbjct: 114 RHNKIQHLLYASSSSV-------YGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYS 166

Query: 148 QIIENPENTSRIYIVRTSWVFGKGGM---NFVEKIIE 181
+ P R + V W G+ M F + ++E
Sbjct: 167 HLYGLPATGLRFFTVYGPW--GRPDMALFKFTKAMLE 201


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1703    NUCEPIMERASE  84     1e-20    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 84.4 bits (209), Expect = 1e-20
Identities = 85/367 (23%), Positives = 141/367 (38%), Gaps = 81/367 (22%)

Query: 2 IIVTGGAGMIGSAFVWKLNEMGIKDILIVDKL-------RKEDKWLNIRKREYYDWIDKD 54
+VTG AG IG +L E G ++ +D L K+ + L + + + + D
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAG-HQVVGIDNLNDYYDVSLKQAR-LELLAQPGFQFHKID 60

Query: 55 -NLKEWLVCKENADNIEAVIHMGACSAT--TETDADFLMDNNFGYTKFLWNFCAEKNIKY 111
+E + + + E V A + + D+N + C I++
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 112 -IYASSAATYGMGE-LGYNDDVSPEELQKLMPLNKYGYSKKFFDDWAF---KQKNQPKQW 166
+YASS++ YG+ + ++ D S + P++ Y +KK + A P
Sbjct: 121 LLYASSSSVYGLNRKMPFSTDDSVDH-----PVSLYAATKKANELMAHTYSHLYGLPA-- 173

Query: 167 NGLKFFNVYGPQEYHKGR--MASMVFHTYNQYKENGYVKLFKSYKEG-----FKDGEQLR 219
GL+FF VYGP GR MA K K+ EG + G+ R
Sbjct: 174 TGLRFFTVYGP----WGRPDMA--------------LFKFTKAMLEGKSIDVYNYGKMKR 215

Query: 220 DFVYVKDVVDIMYFMLVN--------DVKSG----------IYNIGTGKARSFMDLSMAT 261
DF Y+ D+ + + + V++G +YNIG MD A
Sbjct: 216 DFTYIDDIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQAL 275

Query: 262 MRAASHNDNLDKNEVVKLIEM-PEDLQGRYQYFTEAKINKLRE-IGYTKEMHSLEEGVKD 319
D L ++ + P D+ T A L E IG+T E ++++GVK+
Sbjct: 276 ------EDALGIEAKKNMLPLQPGDV-----LETSADTKALYEVIGFTPET-TVKDGVKN 323

Query: 320 YVQNYLA 326
+V Y
Sbjct: 324 FVNWYRD 330


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1718    SECA          1106   0.0      SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 1106 bits (2862), Expect = 0.0
Identities = 449/916 (49%), Positives = 596/916 (65%), Gaps = 64/916 (6%)

Query: 1 MIGSLLKKIFGTKNDREIKALTKEVEKINALESEYEKLSDEDLKNKTNIFKERLKNGETL 60
M+ LL K+FG++NDR ++ + K V INA+E E EKLSDE+LK KT F+ RL+ GE L
Sbjct: 1 MLIKLLTKVFGSRNDRTLRRMRKVVNIINAMEPEMEKLSDEELKGKTAEFRARLEKGEVL 60

Query: 61 DDILVEAFATVREASKRVLGLRHYDVQLIGGMVLHQGKITEMKTGEGKTLVATCPVYLNA 120
++++ EAFA VREASKRV G+RH+DVQL+GGMVL++ I EM+TGEGKTL AT P YLNA
Sbjct: 61 ENLIPEAFAVVREASKRVFGMRHFDVQLLGGMVLNERCIAEMRTGEGKTLTATLPAYLNA 120

Query: 121 LAGHGVHVITVNDYLAKRDRDQMSRLYGFLGLSSGVILNGLPTDQRKKSYNSDITYGTNS 180
L G GVHV+TVNDYLA+RD + L+ FLGL+ G+ L G+P ++++Y +DITYGTN+
Sbjct: 121 LTGKGVHVVTVNDYLAQRDAENNRPLFEFLGLTVGINLPGMPAPAKREAYAADITYGTNN 180

Query: 181 EFGFDYLRDNMVSSLDQKVQRELNFCIVDEVDSILIDEARTPLIISGAAEDKIKWYQISF 240
E+GFDYLRDNM S +++VQR+L++ +VDEVDSILIDEARTPLIISG AED + Y+
Sbjct: 181 EYGFDYLRDNMAFSPEERVQRKLHYALVDEVDSILIDEARTPLIISGPAEDSSEMYKRVN 240

Query: 241 QVVSMLNRSYETEKIKNIKEKKAMNIPDEKWGDYEVDEKSRVIVFTEKGVKRVEEILKI- 299
+++ L I+++K + + G + VDEKSR + TE+G+ +EE+L
Sbjct: 241 KIIPHL-----------IRQEKEDSETFQGEGHFSVDEKSRQVNLTERGLVLIEELLVKE 289

Query: 300 ------DNLYAPEYVELTHFLNQALKAKELFKRDRDYLVRDNGEVVIIDEFTGRAMEGRR 353
++LY+P + L H + AL+A LF RD DY+V+D GEV+I+DE TGR M+GRR
Sbjct: 290 GIMDEGESLYSPANIMLMHHVTAALRAHALFTRDVDYIVKD-GEVIIVDEHTGRTMQGRR 348

Query: 354 YSDGLHQAIEAKEGVKIASENQTLATITLQNYFRMYKKLSGMTGTAETEATEFMHTYGLE 413
+SDGLHQA+EAKEGV+I +ENQTLA+IT QNYFR+Y+KL+GMTGTA+TEA EF Y L+
Sbjct: 349 WSDGLHQAVEAKEGVQIQNENQTLASITFQNYFRLYEKLAGMTGTADTEAFEFSSIYKLD 408

Query: 414 VVVIPTNLPVIRKDDADLVYKTKKEKINSIIDRIQGLYEKGQPVLVGTISIKSSEELSEL 473
VV+PTN P+IRKD DLVY T+ EKI +II+ I+ KGQPVLVGTISI+ SE +S
Sbjct: 409 TVVVPTNRPMIRKDLPDLVYMTEAEKIQAIIEDIKERTAKGQPVLVGTISIEKSELVSNE 468

Query: 474 LKKRKIPHNVLNAKYHAKEAEIVAQAGRYKAVTIATNMAGRGTDIMLGGNPEFMALAEVD 533
L K I HNVLNAK+HA EA IVAQAG AVTIATNMAGRGTDI+LGG+ + AEV
Sbjct: 469 LTKAGIKHNVLNAKFHANEAAIVAQAGYPAAVTIATNMAGRGTDIVLGGSWQ----AEVA 524

Query: 534 SRDDENFPEVFAKYQEQCANEKEQVLALGGLFILGTERHESRRIDNQLRGRSGRQGDPGE 593
+ + E K + + VL GGL I+GTERHESRRIDNQLRGRSGRQGD G
Sbjct: 525 AL-ENPTAEQIEKIKADWQVRHDAVLEAGGLHIIGTERHESRRIDNQLRGRSGRQGDAGS 583

Query: 594 SEFYLSLEDDLMRLFGSERVMIWMDRLKLPEGEPITHRMINSAIEKAQKKIEARNFGIRK 653
S FYLS+ED LMR+F S+RV M +L + GE I H + AI AQ+K+E+RNF IRK
Sbjct: 584 SRFYLSMEDALMRIFASDRVSGMMRKLGMKPGEAIEHPWVTKAIANAQRKVESRNFDIRK 643

Query: 654 NLLEFDDVMNKQRTAIYESRNEALAIDNLKDRILGMLQRNITEKVYEKFAPE-MREDWDI 712
LLE+DDV N QR AIY RNE L + ++ + I + + + P+ + E WDI
Sbjct: 644 QLLEYDDVANDQRRAIYSQRNELLDVSDVSETINSIREDVFKATIDAYIPPQSLEEMWDI 703

Query: 713 DGLNEYLKDFYVYE---ERDDKAYLRSTKEEYIERIYNALVEQYNNKEAELGSDLMRKLE 769
GL E LK+ + + +E ERI +E Y KE +G+++MR E
Sbjct: 704 PGLQERLKNDFDLDLPIAEWLDKEPELHEETLRERILAQSIEVYQRKEEVVGAEMMRHFE 763

Query: 770 KHILFDVVDNRWRGHLKSLDALRESIYLRAYGQRDPVTEYKLISSQIFEEMIATIQEQAT 829
K ++ +D+ W+ HL ++D LR+ I+LR Y Q+DP EYK S +F M+ +++ +
Sbjct: 764 KGVMLQTLDSLWKEHLAAMDYLRQGIHLRGYAQKDPKQEYKRESFSMFAAMLESLKYEVI 823

Query: 830 SFLFKVVV------------------------------------NTEPVKDEKNEIEADG 853
S L KV V + + ++ +
Sbjct: 824 STLSKVQVRMPEEVEELEQQRRMEAERLAQMQQLSHQDDDSAAAAALAAQTGERKVGRND 883

Query: 854 LCPCGSGKPYEKCCGR 869
CPCGSGK Y++C GR
Sbjct: 884 PCPCGSGKKYKQCHGR 899


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1719    PF05272       32     0.002    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.0 bits (72), Expect = 0.002
Identities = 21/109 (19%), Positives = 41/109 (37%), Gaps = 19/109 (17%)

Query: 20 EKYSLNKESAHNWETTMSKVMVAEATNP--DWYGEENPLVNFRKQGKISEKEYYFLDYLG 77
Y + SA E ++ +P DW V ++ ++ E + + LG
Sbjct: 507 TTYGTGEASAQTTEQAINVAADMNRVHPFRDW-------VKAQQWDEVPRLEKWLVHVLG 559

Query: 78 KTPANEISDDDFDRFTKILISYVKKLPRKFII-EVTNIKDPKGLVDYMV 125
KTP D + + Y++ + + ++ V + +P DY V
Sbjct: 560 KTP---------DDYKPRRLRYLQLVGKYILMGHVARVMEPGCKFDYSV 599


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1724    NUCEPIMERASE  29     0.020    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 28.6 bits (64), Expect = 0.020
Identities = 16/32 (50%), Positives = 19/32 (59%), Gaps = 2/32 (6%)

Query: 1 MKQYLVIGLGRF-GASVAKTLYSAGEIVLGVD 31
MK YLV G F G V+K L AG V+G+D
Sbjct: 1 MK-YLVTGAAGFIGFHVSKRLLEAGHQVVGID 31


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1727    TCRTETA       31     0.016    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 30.6 bits (69), Expect = 0.016
Identities = 15/52 (28%), Positives = 24/52 (46%), Gaps = 1/52 (1%)

Query: 314 IIFVIKLLFTAISYSTGFAGGIFLPMLVLGAIIGKIFGETVDIFAQTGADFT 365
+ ++ L A+ Y+ A FL +L +G I+ I G T + AD T
Sbjct: 74 PVLLVSLAGAAVDYAI-MATAPFLWVLYIGRIVAGITGATGAVAGAYIADIT 124


S.No  Start   End     Bias  Virulence  Insertion elements  Prediction
3     FN1787  FN1798  Y     Y          N                   Pathogenicity Island (biased-composition)

LocusTag  DNBias  CDNBias    %GCBias  Product
FN1787         2       13   0.071501  Tetratricopeptide repeat family protein
FN1788         0       10   1.641929  2C-methyl-D-erythritol 2,4-cyclodiphosphate
FN1789         1       10   1.735305  Na+ driven multidrug efflux pump
FN1790         2       10   1.168996  Cob(I)alamin adenosyltransferase
FN1791         3       11   0.852722  Mutator MutT protein
FN1792         2       11   1.392714  Hypothetical protein
FN1793        -1        8   0.287970  Phosphoenolpyruvate-protein phosphotransferase
FN1794        -1       10  -0.292270  Phosphocarrier protein HPr
FN1795        -1       10   0.405127  Hypothetical protein
FN1796         2       10   1.369108  unknown
FN1797         2       10   0.976427  Spermidine/putrescine transport ATP-binding
FN1798         3       10   0.017584  Spermidine/putrescine transport system permease
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1787    SYCDCHAPRONE  60     1e-12    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD

chaperone signature.
Length = 168

Score = 59.9 bits (145), Expect = 1e-12
Identities = 20/102 (19%), Positives = 38/102 (37%)

Query: 64 YYNRACSYYCSNKYDKAIEDYDKAIKLNPNDACYFNNRGHSYFALNKYSEAIEDYDKAIK 123
Y+ A + Y S KY+ A + + L+ D+ +F G A+ +Y AI Y
Sbjct: 39 LYSLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAI 98

Query: 124 LDPNNASYYYKRGFSYYALNKYDKAIEDYNKAIKLDPNNAAY 165
+D + + + +A A +L + +
Sbjct: 99 MDIKEPRFPFHAAECLLQKGELAEAESGLFLAQELIADKTEF 140



Score = 55.7 bits (134), Expect = 4e-11
Identities = 23/138 (16%), Positives = 49/138 (35%), Gaps = 11/138 (7%)

Query: 123 KLDPNNASYYYKRGFSYYALNKYDKAIEDYNKAIKLDPNNAAYFSSRGDIYYYEKAYNKS 182
++ + Y F+ Y KY+ A + + LD ++ +F G Y+ +
Sbjct: 30 EISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLA 89

Query: 183 IEDYNKAIKLDPNNAFYYDNRGLAYEKLKKYKEAINDYNKAIKLNPNNAFYCYNRGFTYN 242
I Y+ +D + + + + EA + A +L + +
Sbjct: 90 IHSYSYGAIMDIKEPRFPFHAAECLLQKGELAEAESGLFLAQELIADKTEF--------- 140

Query: 243 KLKKYKEAINDYDKAIKL 260
K+ ++ +AIKL
Sbjct: 141 --KELSTRVSSMLEAIKL 156



Score = 52.6 bits (126), Expect = 3e-10
Identities = 21/103 (20%), Positives = 36/103 (34%), Gaps = 3/103 (2%)

Query: 235 YNRGFTYNKLKKYKEAINDYDKAIKLDPNNASYFNNRGVAYNNLGEYSKALEDYDKAIKL 294
Y+ F + KY++A + LD ++ +F G +G+Y A+ Y +
Sbjct: 40 YSLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIM 99

Query: 295 NPNYTFAYNNKGITFDNLGEFEEAIMNYNKAIEL---DPSYKS 334
+ + GE EA A EL +K
Sbjct: 100 DIKEPRFPFHAAECLLQKGELAEAESGLFLAQELIADKTEFKE 142



Score = 50.3 bits (120), Expect = 2e-09
Identities = 14/100 (14%), Positives = 37/100 (37%)

Query: 168 SRGDIYYYEKAYNKSIEDYNKAIKLDPNNAFYYDNRGLAYEKLKKYKEAINDYNKAIKLN 227
S Y Y + + + LD ++ ++ G + + +Y AI+ Y+ ++
Sbjct: 41 SLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMD 100

Query: 228 PNNAFYCYNRGFTYNKLKKYKEAINDYDKAIKLDPNNASY 267
+ ++ + + EA + A +L + +
Sbjct: 101 IKEPRFPFHAAECLLQKGELAEAESGLFLAQELIADKTEF 140



Score = 49.5 bits (118), Expect = 4e-09
Identities = 15/100 (15%), Positives = 34/100 (34%), Gaps = 3/100 (3%)

Query: 202 NRGLAYEKLKKYKEAINDYNKAIKLNPNNAFYCYNRGFTYNKLKKYKEAINDYDKAIKLD 261
+ + KY++A + L+ ++ + G + +Y AI+ Y +D
Sbjct: 41 SLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMD 100

Query: 262 PNNASYFNNRGVAYNNLGEYSKALEDYDKAIKL---NPNY 298
+ + GE ++A A +L +
Sbjct: 101 IKEPRFPFHAAECLLQKGELAEAESGLFLAQELIADKTEF 140



Score = 49.5 bits (118), Expect = 4e-09
Identities = 15/106 (14%), Positives = 37/106 (34%), Gaps = 2/106 (1%)

Query: 26 NDIYYNNRGLSYFLLKKYEEAINDYNRAIELNLNNASYYYNRACSYYCSNKYDKAIEDYD 85
+Y + + + KYE+A + L+ ++ ++ +YD AI Y
Sbjct: 37 EQLY--SLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYS 94

Query: 86 KAIKLNPNDACYFNNRGHSYFALNKYSEAIEDYDKAIKLDPNNASY 131
++ + + + + +EA A +L + +
Sbjct: 95 YGAIMDIKEPRFPFHAAECLLQKGELAEAESGLFLAQELIADKTEF 140



Score = 48.4 bits (115), Expect = 9e-09
Identities = 21/146 (14%), Positives = 51/146 (34%), Gaps = 11/146 (7%)

Query: 89 KLNPNDACYFNNRGHSYFALNKYSEAIEDYDKAIKLDPNNASYYYKRGFSYYALNKYDKA 148
+++ + + + + KY +A + + LD ++ ++ G A+ +YD A
Sbjct: 30 EISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLA 89

Query: 149 IEDYNKAIKLDPNNAAYFSSRGDIYYYEKAYNKSIEDYNKAIKLDPNNAFYYDNRGLAYE 208
I Y+ +D + + + ++ A +L + +
Sbjct: 90 IHSYSYGAIMDIKEPRFPFHAAECLLQKGELAEAESGLFLAQELIADKTEF--------- 140

Query: 209 KLKKYKEAINDYNKAIKLNPNNAFYC 234
K+ ++ +AIKL C
Sbjct: 141 --KELSTRVSSMLEAIKLKKEMEHEC 164



Score = 37.2 bits (86), Expect = 6e-05
Identities = 9/68 (13%), Positives = 27/68 (39%)

Query: 23 EPNNDIYYNNRGLSYFLLKKYEEAINDYNRAIELNLNNASYYYNRACSYYCSNKYDKAIE 82
+ + ++ G + +Y+ AI+ Y+ +++ + ++ A + +A
Sbjct: 66 DHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMDIKEPRFPFHAAECLLQKGELAEAES 125

Query: 83 DYDKAIKL 90
A +L
Sbjct: 126 GLFLAQEL 133


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1792    IGASERPTASE   27     0.035    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 26.6 bits (58), Expect = 0.035
Identities = 29/110 (26%), Positives = 38/110 (34%), Gaps = 13/110 (11%)

Query: 12 KEEEKPAEQAAVEATATEAPATETTEAAAEAKTFSLKTEDGKEFTLVVAADGSTATLTDA 71
EE A A +ETTE AE KT + E D + T +
Sbjct: 1013 NNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNE------QDATETTAQNR 1066

Query: 72 EGKATELKNAETASGERYADEAGNEVAMKGAEGILTLGDLKEVPVTVEAK 121
E A+ A A+ NEVA G+E T + TVE +
Sbjct: 1067 E-------VAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKE 1109


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1793    PHPHTRNFRASE  706    0.0      Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 706 bits (1825), Expect = 0.0
Identities = 274/568 (48%), Positives = 395/568 (69%)

Query: 11 IKGIPASPGIAIGKAFLYKENKLEIDEKSNLSKEEEIERLVKGREIAKKQLEEIKENTLK 70
I GI AS G+AI KAF++ E ++I++ S EIE+L E +K++L IK+ T
Sbjct: 5 ITGIAASSGVAIAKAFIHLEPNVDIEKTSITDVSTEIEKLTAALEKSKEELRAIKDQTEA 64

Query: 71 KLGKDKADIFEGHITLLEDEELFSEIDSKISQKKCTAEFALNEAIDEYATMLANLEDTYF 130
+G DKA+IF H+ +L+D EL I KI ++ AE+AL E D + +M ++++ Y
Sbjct: 65 SMGADKAEIFAAHLLVLDDPELVDGIKGKIENEQMNAEYALKEVSDMFVSMFESMDNEYM 124

Query: 131 KERAGDLRDIGKRWLYGVMNEQIVDLSKLEPETIIVAKELNPSDTAQINLDNVLAFVTEI 190
KERA D+RD+ KR L ++ + L+ + ET+I+A++L PSDTAQ+N V F T+I
Sbjct: 125 KERAADIRDVSKRVLGHLIGVETGSLATIAEETVIIAEDLTPSDTAQLNKQFVKGFATDI 184

Query: 191 GGKTAHSSIMARSLELPAVVGVGAVLDELEDNQILIVDALKGEVIVSPDVETLQIYKEKR 250
GG+T+HS+IM+RSLE+PAVVG V ++++ ++IVD ++G VIV+P E ++ Y+EKR
Sbjct: 185 GGRTSHSAIMSRSLEIPAVVGTKEVTEKIQHGDMVIVDGIEGIVIVNPTEEEVKAYEEKR 244

Query: 251 EKFLKEKEELKALKDKEAISKDGIKVDVWGNIGSPNDVKGIISNGGFGVGLYRTEFLFME 310
F K+K+E L + + +KDG V++ NIG+P DV G+++NGG G+GLYRTEFL+M+
Sbjct: 245 AAFEKQKQEWAKLVGEPSTTKDGAHVELAANIGTPKDVDGVLANGGEGIGLYRTEFLYMD 304

Query: 311 KDSFPTEDEQFEAYKIVAEELKGYPVTIRTMDIGGDKSLPYMELPKEENPFLGWRAIRVC 370
+D PTE+EQFEAYK V + + G PV IRT+DIGGDK L Y++LPKE NPFLG+RAIR+C
Sbjct: 305 RDQLPTEEEQFEAYKEVVQRMDGKPVVIRTLDIGGDKELSYLQLPKELNPFLGFRAIRLC 364

Query: 371 LDREEILRTQFKALLRASKYGKIKIMLPMIMDIVEVRKAKAIFEECKKELQEKGIEFDKN 430
L++++I RTQ +ALLRAS YG +K+M PMI + E+R+AKAI +E K +L +G++ +
Sbjct: 365 LEKQDIFRTQLRALLRASTYGNLKVMFPMIATLEELRQAKAIMQEEKDKLLSEGVDVSDS 424

Query: 431 IMLGIMVETPAVAFRAKYFAKECDFFSIGTNDLTQYTLAVDRGNEKIANLYDTYNPSVLQ 490
I +GIMVE P+ A A FAKE DFFSIGTNDL QYT+A DR NE+++ LY Y+P++L+
Sbjct: 425 IEVGIMVEIPSTAVAANLFAKEVDFFSIGTNDLIQYTMAADRMNERVSYLYQPYHPAILR 484

Query: 491 AIKMLIDGAHDGGIKISMCGEFAGDENAIAILFGMGLDAFSMSGISIPRVKRIIMKLEKK 550
+ M+I AH G + MCGE AGDE AI +L G+GLD FSMS SI + ++KL K+
Sbjct: 485 LVDMVIKAAHSEGKWVGMCGEMAGDEVAIPLLLGLGLDEFSMSATSILPARSQLLKLSKE 544

Query: 551 ECQNLVERVLSLSTASEIKEEIKKFMEK 578
E + ++ L L TA E+++ +KK K
Sbjct: 545 ELKPFAQKALMLDTAEEVEQLVKKTYLK 572


4  FN1877  FN1886  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
FN1877    3       10       0.247017   Guanine-hypoxanthine permease
FN1878    5       17       -1.353843  unknown
FN1879    5       16       -1.092192  SSU ribosomal protein S20P
FN1880    3       16       -0.676108  Oxygen-insensitive NAD(P)H nitroreductase
FN1881    5       16       -0.358270  Esterase
FN1882    4       20       -0.609614  unknown
FN1883    4       18       -0.176863  Transposase
FN1884    2       15       0.206645   unknown
FN1885    0       15       -0.327573  Hemolysin III
FN1886    3       13       -1.019379  unknown
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1881    TYPE3OMGPROT  27     0.024    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 27.2 bits (60), Expect = 0.024
Identities = 17/72 (23%), Positives = 30/72 (41%), Gaps = 10/72 (13%)

Query: 55 VEATVQYKKQLFLDQKIEVR-----ITKIEIKGLKIIF--EHEI-YNGQDLVITGTATVL 106
+E Q L +Q+ ++R IEI LK + I Y ++ G AT+L
Sbjct: 158 LELVEQTAAAL--EQQTQIRSEKTGALAIEIFPLKYASASDRTIHYRDDEVAAPGVATIL 215

Query: 107 AYNYEEQKVKKV 118
+ +++V
Sbjct: 216 QRVLSDATIQQV 227


5  FN1929  FN1965  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
FN1929    0       11       -3.020305  Competence-damage protein cinA
FN1930    1       9        -3.799835  Phosphatidylglycerophosphatase A
FN1931    0       9        -3.795756  Protease
FN1932    -1      10       -4.585403  Dephospho-CoA kinase
FN1933    -2      10       -4.183834  Hypothetical protein
FN1934    -2      8        -4.551130  Hypothetical protein
FN1935    -2      9        -1.256393  Adenine-specific methyltransferase
FN1936    -2      11       0.572279   Hypothetical protein
FN1937    -2      10       0.149510   unknown
FN1938    0       13       1.364076   unknown
FN1939    2       21       4.855485   Hypothetical protein
FN1940    2       22       5.685200   Na+ driven multidrug efflux pump
FN1941    3       21       5.428738   ClpB protein
FN1942    2       20       5.809918   putative DNA-binding protein
FN1943    2       20       6.356459   Tryptophanase
FN1944    2       24       6.343640   Sodium-dependent tryptophan transporter
FN1948    -3      15       1.966906   LSU Ribosomal RNA
FN1949    -2      15       2.067176   5S Ribosomal RNA
FN1950    -2      18       2.389669   SSU Ribosomal RNA
FN1951    1       15       -1.414799  Hypothetical protein
FN1952    4       17       -2.685832  Xaa-Pro dipeptidase
FN1953    4       10       -0.165815  Serine protease
FN1954    5       9        0.060379   ATPase associated with chromosome
FN1955    5       9        -0.039023  Na(+)-linked D-alanine glycine permease
FN1956    4       8        -0.466384  Na(+)-linked D-alanine glycine permease
FN1957    3       8        -0.123524  Glycerophosphoryl diester phosphodiesterase
FN1964    2       9        0.181786   unknown
FN1965    2       8        -0.550080  Hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
FN1941    HTHFIS  47     2e-07    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 47.1 bits (112), Expect = 2e-07
Identities = 41/178 (23%), Positives = 69/178 (38%), Gaps = 41/178 (23%)

Query: 164 EVLEKYAKDLVELAREGK--------IDPIIGRDSEIRRAIQIISRRTKND-PILI-GEP 213
E++ + L E R P++GR + ++ ++++R + D ++I GE
Sbjct: 110 ELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGES 169

Query: 214 GVGKTAIVEGLAQR----------ILNGDVPESLKNKKIFSLDMGALV-AGAKYKGEFEE 262
G GK + L I +P L ++F + GA A + G FE+
Sbjct: 170 GTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQ 229

Query: 263 RMKGVLKEVEESNGNIILFIDEIHTIVGAGKGEGSLDAGNMLKPMLARGELRVIGATT 320
G LF+DEI G+ +DA L +L +GE +G T
Sbjct: 230 AEGGT------------LFLDEI--------GDMPMDAQTRLLRVLQQGEYTTVGGRT 267



Score = 32.5 bits (74), Expect = 0.008
Identities = 36/210 (17%), Positives = 67/210 (31%), Gaps = 48/210 (22%)

Query: 580 GQDEAIKSVADTMLRSVAGLKDPNRPMGSFIFLGPTGVGKTYLAKTLAYNLFDSEDNVVR 639
G+ A++ + + R + + G +G GK +A+ L V
Sbjct: 141 GRSAAMQEIYRVLAR-LMQTDLT------LMITGESGTGKELVARALHDYGKRRNGPFVA 193

Query: 640 IDMSEYMDKFSVTRLIGAPPGYVGYEEGGQLTEAIRTKPYSV-------ILFDEIEKAHP 692
I+M+ + L G E G T A + DEI
Sbjct: 194 INMAAIPRDLIESELFGH--------EKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPM 245

Query: 693 DVFNVLLQVLDDG---RLTDGQGRIVDFKNTLIIMTSNIGSHFILEDPNLSEDTREKVAD 749
D LL+VL G + D + I+ +N +D ++ +
Sbjct: 246 DAQTRLLRVLQQGEYTTVGGRTPIRSDVR---IVAATN-------------KDLKQSINQ 289

Query: 750 ELKARFKPEFLNRIDEIITFKALDLPAIKE 779
F+ + R++ + L LP +++
Sbjct: 290 ---GLFREDLYYRLNVV----PLRLPPLRD 312


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
FN1942    HTHTETR  30     0.004    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 30.4 bits (68), Expect = 0.004
Identities = 6/52 (11%), Positives = 22/52 (42%), Gaps = 6/52 (11%)

Query: 177 PLERLTKQEREKIVKA----LYEKGIFNLKDAINFVAKKLSCSPTTVYRYVG 224
++ ++ R+ I+ ++G+ + ++ +AK + +Y +
Sbjct: 4 KTKQEAQETRQHILDVALRLFSQQGVSST--SLGEIAKAAGVTRGAIYWHFK 53


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
FN1950    SUBTILISIN  51     1e-08    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 50.6 bits (121), Expect = 1e-08
Identities = 62/360 (17%), Positives = 101/360 (28%), Gaps = 110/360 (30%)

Query: 30 GDKVKVGVLDG-------DFINKKNYLENKYGNIEILERNYNPKHSNHGELVLKVLREKN 82
G VKV VLD D + N + + + + ++ HG V
Sbjct: 40 GRGVKVAVLDTGCDADHPDLKARIIGGRN-FTDDDEGDPEIFKDYNGHGTHV-------- 90

Query: 83 KLGIIAGSIGENDDFSGKEGTTVIPKLE-------------TYEKVLEKFNP--NQKVKV 127
AG+I ++ +G G V P+ + Y+ +++ QKV +
Sbjct: 91 -----AGTIAATENENGVVG--VAPEADLLIIKVLNKQGSGQYDWIIQGIYYAIEQKVDI 143

Query: 128 FSQSWGVPETIGSYISKSEEERMRAIAPGQTSSEGRKILDFYKKEVNKDTLFIWANGNTL 187
S S G G E + KK V L + A GN
Sbjct: 144 ISMSLG-----GPEDVPELHEAV-------------------KKAVASQILVMCAAGNEG 179

Query: 188 VNNQNQVVLFNDAYYQGGLPHLYSELEKGWITVVGVKKKDGNMLNKHYTDTAHLAYPGNA 247
+ L Y I+V + + + N
Sbjct: 180 DGDDRTDELGYPGCY------------NEVISVGAINFDRHA---------SEFSNSNN- 217

Query: 248 KWWAISADAEVVDS-----KGVTHRGSSFAAPRVAKVAALVAEKY---DW--MTADQVRQ 297
+ A E + S K T G+S A P VA AL+ + +T ++
Sbjct: 218 -EVDLVAPGEDILSTVPGGKYATFSGTSMATPHVAGALALIKQLANASFERDLTEPELYA 276

Query: 298 TLFTTTDRTNIDENETYIRNIVTEPDEKYGWGMLNRERALKGPGAFINIRRQYQY-SPSS 356
L T + G G+L + + +R S +S
Sbjct: 277 QLIKRTIPLGNS-------------PKMEGNGLLYLTAVEE-LSRIFDTQRVAGILSTAS 322


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1964    SYCDCHAPRONE  42     1e-06    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 42.2 bits (99), Expect = 1e-06
Identities = 20/87 (22%), Positives = 35/87 (40%), Gaps = 3/87 (3%)

Query: 345 GKYKEAIKKYEYALSLDEEGKDERYINSQLGWCYRQLEEYKKAIKFHKKAKELGRNDIWI 404
GKY++A K ++ LD D R+ LG C + + +Y AI + + +
Sbjct: 50 GKYEDAHKVFQALCVLDHY--DSRFFLG-LGACRQAMGQYDLAIHSYSYGAIMDIKEPRF 106

Query: 405 NMEIGMCYAKLEEYEKAAENYLIAYEM 431
C + E +A +A E+
Sbjct: 107 PFHAAECLLQKGELAEAESGLFLAQEL 133



Score = 34.1 bits (78), Expect = 9e-04
Identities = 16/90 (17%), Positives = 35/90 (38%), Gaps = 1/90 (1%)

Query: 557 YNPDTREEALKYFERAIELGRNDAWVWEMRGTLLFDLRKYEEALDSFRKAYAL-NDDGWY 615
Y E+A K F+ L D+ + G + +Y+ A+ S+ + + +
Sbjct: 47 YQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMDIKEPRF 106

Query: 616 LYSIGRCLRRLERYEEALENLLNSRQISLN 645
+ CL + EA L ++++ +
Sbjct: 107 PFHAAECLLQKGELAEAESGLFLAQELIAD 136



Score = 33.7 bits (77), Expect = 0.001
Identities = 22/109 (20%), Positives = 39/109 (35%)

Query: 79 LKNIINKSKNDEWLYSELGYCLAEQGKQEEALESYFKAIELNRNDAWIFTRIGMCYKNMD 138
+ + S + L + + GK E+A + + L+ D+ F +G C + M
Sbjct: 25 IAMLNEISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMG 84

Query: 139 RKEEAIEYYLKALEQKEDDIFIMSDIAWLYDSLGEFEKALKYLERLEEL 187
+ + AI Y + A GE +A L +EL
Sbjct: 85 QYDLAIHSYSYGAIMDIKEPRFPFHAAECLLQKGELAEAESGLFLAQEL 133



Score = 31.4 bits (71), Expect = 0.007
Identities = 20/115 (17%), Positives = 44/115 (38%), Gaps = 3/115 (2%)

Query: 181 LERLEELGQNDAWTSTEYGYCLAKLKRFDEAIVKINRALEAEDEDKDTAYIYSQLGWCQR 240
+ L E+ + + + ++++A K+ +AL D D+ + LG C++
Sbjct: 25 IAMLNEISSDTLEQLYSLAFNQYQSGKYEDAH-KVFQALCVLDH-YDSRFFLG-LGACRQ 81

Query: 241 HLEKYDEAIETFLKAKKWARNDAWINIELGHCYKAKNEKEKALEFYLKAEKFDKN 295
+ +YD AI ++ + C K E +A A++ +
Sbjct: 82 AMGQYDLAIHSYSYGAIMDIKEPRFPFHAAECLLQKGELAEAESGLFLAQELIAD 136



Score = 30.3 bits (68), Expect = 0.016
Identities = 24/105 (22%), Positives = 41/105 (39%), Gaps = 3/105 (2%)

Query: 296 DISLLSDIAWHYDALDRNEEALKYIKRVVRLGRDDAWINEEYGACLSGLGKYKEAIKKYE 355
+ L +A++ + E+A K + + L D+ GAC +G+Y AI Y
Sbjct: 35 TLEQLYSLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYS 94

Query: 356 YALSLDEEGKDERYINSQLGWCYRQLEEYKKAIKFHKKAKELGRN 400
Y +D + + C Q E +A A+EL +
Sbjct: 95 YGAIMDIKEPRFPF---HAAECLLQKGELAEAESGLFLAQELIAD 136



Score = 29.1 bits (65), Expect = 0.037
Identities = 14/62 (22%), Positives = 25/62 (40%)

Query: 374 LGWCYRQLEEYKKAIKFHKKAKELGRNDIWINMEIGMCYAKLEEYEKAAENYLIAYEMDR 433
L + Q +Y+ A K + L D + +G C + +Y+ A +Y MD
Sbjct: 42 LAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMDI 101

Query: 434 DD 435
+
Sbjct: 102 KE 103


6  FN2083  FN2091  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
FN2083    3       10       -2.572465  Hypothetical protein
FN2084    3       11       -2.537418  MunI regulatory protein
FN2085    3       15       -3.195503  NA+/H+ antiporter NHAC
FN2086    4       17       -3.678150  Transcriptional regulator, DeoR family
FN2087    5       17       -4.744877  ABC transporter ATP-binding protein
FN2088    1       18       -3.831523  ABC transporter integral membrane protein
FN2089    1       16       -3.846992  ABC transporter substrate-binding protein
FN2090    0       13       -3.212191  Formate--tetrahydrofolate ligase
FN2091    1       13       -3.051628  Hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
FN2085    THERMOLYSIN  33     4e-04    Thermolysin metalloprotease (M4) family signature.
		>THERMOLYSIN#Thermolysin metalloprotease (M4) family signature.

Length = 544

Score = 33.1 bits (75), Expect = 4e-04
Identities = 23/123 (18%), Positives = 44/123 (35%), Gaps = 12/123 (9%)

Query: 23 AELNEQQAKDIIKKEVPNGQITKFKLDKENGKMVYEIKVMDGNIEK---EYEIDAETGAI 79
++ ++ K+ E D+E ++ YE+ V Y IDA G +
Sbjct: 149 QDVADRVTKERPAAEEGKPTRLVIYPDEETPRLAYEVNVRFLTPVPGNWIYMIDAADGKV 208

Query: 80 LK----MEQEQKGNK----NANSVNNPKISSDKAKEI-ALKNSKNGKFKEIELKHKNGVL 130
L M++ + G ++V + K I +S G + + +G+
Sbjct: 209 LNKWNQMDEAKPGGAQPVAGTSTVGVGRGVLGDQKYINTTYSSYYGYYYLQDNTRGSGIF 268

Query: 131 VYD 133
YD
Sbjct: 269 TYD 271



Score = 32.7 bits (74), Expect = 5e-04
Identities = 20/145 (13%), Positives = 47/145 (32%), Gaps = 21/145 (14%)

Query: 26 NEQQAKDIIKKEVPNGQITKFKLDKE-NGKMVYE----IKVMDGNI-----EKEYEIDAE 75
++ +I ++ T + ++ + V DG + +D
Sbjct: 71 QARERLSLIGNKLDELGHTVMRFEQAIAASLCMGAVLVAHVNDGELSSLSGTLIPNLDKR 130

Query: 76 TGAILKMEQEQKGNKNANSVNNPKISSDKAKEIALKNSKNGKF--KEIELKHKNGVLVYD 133
T Q+ A +++ ++ ++ GK I + L Y+
Sbjct: 131 TLKTEAAISIQQAEMIAKQDVADRVTKERPA------AEEGKPTRLVIYPDEETPRLAYE 184

Query: 134 VEI---AEGFMDREFLIDANTGEIL 155
V + + ++IDA G++L
Sbjct: 185 VNVRFLTPVPGNWIYMIDAADGKVL 209


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN2086    BCTERIALGSPD  125    2e-33    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 125 bits (315), Expect = 2e-33
Identities = 72/300 (24%), Positives = 134/300 (44%), Gaps = 29/300 (9%)

Query: 114 ETFGENIKVSTLSKVNKLVVSAERDILENAISIIEDIDKNPKQVKITSQILDISNNLFEE 173
+NI + + N L+V+A D++ + +I +D QV + + I ++ +
Sbjct: 304 AALDKNIIIKAHGQTNALIVTAAPDVMNDLERVIAQLDIRRPQVLVEAIIAEVQDADGLN 363

Query: 174 LGFDWVYKQNVESQERNSLTAMILGKAGLN--------------GVGSTVNIVRQFHNKS 219
LG W K +Q NS + AG N + S I F+ +
Sbjct: 364 LGIQWANKNAGMTQFTNSGLPISTAIAGANQYNKDGTVSSSLASALSSFNGIAAGFYQGN 423

Query: 220 DVLSTGINLLEATNDLVVSSVPTLMIASGEEGEFKVTEEV--VVGIKTHREDKKDRYSEP 277
+ + L ++ + + P+++ E F V +EV + G +T D ++
Sbjct: 424 WAML--LTALSSSTKNDILATPSIVTLDNMEATFNVGQEVPVLTGSQTTSGDNI--FNTV 479

Query: 278 VFKEAGLIMKVKPIIKDNDYIILEISLELSDFKFKKNVLNLKDINSGTYNSEGGSKVGRG 337
K G+ +KVKP I + D ++LEI E+S ++ D S T + G + R
Sbjct: 480 ERKTVGIKLKVKPQINEGDSVLLEIEQEVS---------SVADAASSTSSDLGATFNTRT 530

Query: 338 LTTKVRVKNGDTILLGGLKKSIQQNIESKIPILGDIPIISFFFKNTTKKNENSDMYIKLK 397
+ V V +G+T+++GGL + K+P+LGDIP+I F++T+KK ++ + ++
Sbjct: 531 VNNAVLVGSGETVVVGGLLDKSVSDTADKVPLLGDIPVIGALFRSTSKKVSKRNLMLFIR 590



Score = 31.4 bits (71), Expect = 0.007
Identities = 12/71 (16%), Positives = 32/71 (45%), Gaps = 2/71 (2%)

Query: 95 TKTFSLFNVSPDEVSKILHETFGEN--IKVSTLSKVNKLVVSAERDILENAISIIEDIDK 152
T+ L NV+ +++ +L + V N L+++ +++ ++I+E +D
Sbjct: 129 TRVVPLTNVAARDLAPLLRQLNDNAGVGSVVHYEPSNVLLMTGRAAVIKRLLTIVERVDN 188

Query: 153 NPKQVKITSQI 163
+ +T +
Sbjct: 189 AGDRSVVTVPL 199


7  FN2105  FN0001  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
FN2105    -1      11       3.038395   unknown
FN2106    -2      10       1.987495   MRP-family nucleotide-binding protein
FN2107    -1      10       0.935106   unknown
FN2108    0       10       0.555283   Hypothetical protein
FN2109    2       11       0.688603   Transporter
FN2110    7       20       -1.691828  ABC transporter ATP-binding protein
FN2111    6       21       -1.637333  tricarboxylate-binding protein
FN2112    7       21       -1.883073  tricarboxylate transport membrane protein TctB
FN2113    6       21       -1.215436  tricarboxylate transport membrane protein RctA
FN2114    7       19       -1.533076  Transporter
FN2115    7       16       -1.393574  Galactokinase
FN2116    5       15       -0.824067  Galactose-1-phosphate uridylyltransferase
FN2117    1       11       -0.403135  UDP-glucose 4-epimerase
FN2118    1       9        0.276986   Hypothetical exported 24-amino acid repeat
FN2119    1       8        0.187794   Hypothetical exported 24-amino acid repeat
FN2120    1       9        1.057865   Hypothetical protein
FN2121    1       9        1.351733   Hypothetical protein
FN2122    2       9        1.033581   Hypothetical protein
FN2123    2       11       0.361701   Hypothetical protein
FN2124    2       11       -0.063389  Hypothetical exported 24-amino acid repeat
FN2125    1       8        -1.570103  Hypothetical exported 24-amino acid repeat
FN2126    0       8        -3.349488  Hypothetical exported 24-amino acid repeat
FN2127    2       9        -5.799080  Hypothetical exported 24-amino acid repeat
FN2128    2       9        -5.049007  Hypothetical exported 24-amino acid repeat
FN2129    3       10       -4.777118  Hypothetical exported 24-amino acid repeat
FN0001    1       9        -3.714631  Phenylalanyl-tRNA synthetase beta chain
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN2109    NUCEPIMERASE  171    3e-53    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 171 bits (436), Expect = 3e-53
Identities = 85/348 (24%), Positives = 148/348 (42%), Gaps = 44/348 (12%)

Query: 1 MSILVCGGAGYIGSHVVKYLLEKNEDVVVVDSLITGHIDAVDEKAHLEL----------G 50
M LV G AG+IG HV K LLE VV +D+L + D ++A LEL
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYY-DVSLKQARLELLAQPGFQFHKI 59

Query: 51 DLKDEEFLNRVFEKYQIDGVIDFAAFSLVGESVGEPLKYFENNFYGTLCLLKVMKNNNVD 110
DL D E + +F + V V S+ P Y ++N G L +L+ ++N +
Sbjct: 60 DLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQ 119

Query: 111 KIVFSSTAATYGEAENMPILETDRTE-PTNPYGESKLAVEKMFKWCANAYGLKYTALRYF 169
++++S+++ YG MP D + P + Y +K A E M ++ YGL T LR+F
Sbjct: 120 HLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGLRFF 179

Query: 170 NVAGAYPSGEIGEAHTCETHLIPLILQVALGQREKISIYGDDYPTPDGTCIRDYIHVMDL 229
V G P G A T + + + I +Y G RD+ ++ D+
Sbjct: 180 TVYG--PWGRPDMALFKFTKAML--------EGKSIDVYN------YGKMKRDFTYIDDI 223

Query: 230 ADAHYLALNRL---------RNGGDS------QIFNLGNGEGFSVKEVIEVTRKVTGYPI 274
A+A + + G + +++N+GN + + I+ G
Sbjct: 224 AEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEA 283

Query: 275 PAEVSPRRAGDPARLIASSKKAIDELKWKPKYDKLEQIIETAWNWHKN 322
+ P + GD A +K + + + P+ ++ ++ NW+++
Sbjct: 284 KKNMLPLQPGDVLETSADTKALYEVIGFTPETT-VKDGVKNFVNWYRD 330


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
FN2117    MICOLLPTASE  29     0.023    Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 28.9 bits (64), Expect = 0.023
Identities = 26/102 (25%), Positives = 36/102 (35%), Gaps = 4/102 (3%)

Query: 114 TSYKNGLIDGLMKNYYPNGNIRSEMLLKKEVLDGITRTYYKNGKLKVEVSYQNGVKVGVQ 173
T + D + GN++ + V GIT T YK G L V Y G V
Sbjct: 889 TLSEEDYSDKYYFDVAKKGNVKITLNNLNSV--GITWTLYKEGDLNNYVLYATGNDGTVL 946

Query: 174 KGYY--QNGKLKIEHPIDKNGLTTGTVKVYYPSGKIMAEESY 213
KG + G+ + N T TV V + E +
Sbjct: 947 KGEKTLEPGRYYLSVYTYDNQSGTYTVNVKGNLKNEVKETAK 988


8  FN0010  FN0022  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
FN0010    2       9        -2.739893  Ribonuclease P protein component
FN0011    4       10       -3.888714  hypothetical cytosolic protein
FN0012    4       9        -6.062300  Inner membrane protein
FN0013    0       9        -5.260076  Jag protein
FN0014    2       7        -3.737883  Thiophene and furan oxidation protein THDF
FN0015    3       7        -3.638891  Glucose inhibited division protein A
FN0016    2       8        -3.454235  Quinolinate synthetase A
FN0017    0       8        -2.965638  L-aspartate oxidase
FN0018    -1      8        -0.062833  Nicotinate-nucleotide pyrophosphorylase
FN0019    0       8        0.298639   Transcriptional regulator
FN0020    1       15       1.846227   DNA topology modulation protein FLAR-related
FN0021    1       14       2.095887   Transporter
FN0022    2       16       2.889829   Transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN0020    CLENTEROTOXN  26     0.032    Clostridium enterotoxin signature.
		>CLENTEROTOXN#Clostridium enterotoxin signature.

Length = 319

Score = 25.8 bits (56), Expect = 0.032
Identities = 9/47 (19%), Positives = 20/47 (42%)

Query: 52 YNKYFKFEILQVPLGNVSKDKTSDLVKLIETKGLDIEINLDKDEDFF 98
Y Y K++ +++ GN+S D + + I + + D+
Sbjct: 133 YATYRKYQAIRISHGNISDDGSIYKLTGIWLSKTSADSLGNIDQGSL 179


9  FN0076  FN0081  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
FN0076    1       11       3.591244   Lipoprotein signal peptidase
FN0077    2       12       3.865912   Glycyl-tRNA synthetase alpha chain
FN0078    4       15       5.495304   Glycyl-tRNA synthetase beta chain
FN0079    4       16       5.830551   GTP cyclohydrolase I
FN0080    4       13       4.016639   2-amino-4-hydroxy-6-
FN0081    2       13       2.759321   Dihydropteroate synthase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
FN0076    HTHFIS  60     4e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 60.2 bits (146), Expect = 4e-13
Identities = 24/113 (21%), Positives = 48/113 (42%), Gaps = 3/113 (2%)

Query: 4 RVVVVEDETLTRIDLIEILKENGYDVVGEAADGIEAVEVCKKLQPDIVLLDIKIPYISGL 63
++V +D+ R L + L GYDV ++ D+V+ D+ +P +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 64 KVANILKEEGFKGCVIILTAYNIAEYIQEASNTIVMGYILKP--IDEVIFLER 114
+ +K+ V++++A N +AS Y+ KP + E+I +
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
FN0077    PF06580  43     2e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 42.5 bits (100), Expect = 2e-06
Identities = 28/187 (14%), Positives = 72/187 (38%), Gaps = 28/187 (14%)

Query: 279 NNLQTVASLLRIQKRRVKNAETKKILDETINRILSIAITHEILSATGMDTISIKHILEIL 338
N L + +L+ + +++L LS + L + +S+ L +
Sbjct: 177 NALNNIRALILED-----PTKAREMLTS-----LS-ELMRYSLRYSNARQVSLADELT-V 224

Query: 339 CQNYFK-NNVDKSKKIEFNINGDEFSISSDKATSVALVVNEIVQNATEHAFIGR-DSGKV 396
+Y + ++ +++F + + ++V +V+N +H GK+
Sbjct: 225 VDSYLQLASIQFEDRLQFENQINP---AIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKI 281

Query: 397 IIKILKGEKFSKIKISDNGVGMEVN-RETNNMGLLIISSLVKDKLKG------NLEIRSK 449
++K K +++ + G N +E+ GL V+++L+ +++ K
Sbjct: 282 LLKGTKDNGTVTLEVENTGSLALKNTKESTGTGL----QNVRERLQMLYGTEAQIKLSEK 337

Query: 450 KDKGTTI 456
+ K +
Sbjct: 338 QGKVNAM 344


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
FN0079    MICOLLPTASE  29     0.037    Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 29.3 bits (65), Expect = 0.037
Identities = 14/52 (26%), Positives = 25/52 (48%), Gaps = 3/52 (5%)

Query: 55 TLKDLKENPAVPYEEDE-VTRIIIDDLNLQIYDEIKDW-TVSDLREWLLSEE 104
+L + +N VP DE V D+N +I ++IK+ + DL + +
Sbjct: 639 SLLNNIDNLDVPLVSDEYVNGHEAKDIN-EITNDIKEVSNIKDLSSNVEKSQ 689


10  FN0109  FN0120  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
FN0109    2       10       -0.263100  5S Ribosomal RNA
FN0110    2       12       0.742490   *tRNA-Asn
FN0111    3       10       2.430652   Inorganic pyrophosphatase
FN0112    2       11       2.345358   Flavodoxins/hemoproteins
FN0113    2       9        3.311380   Glutaredoxin
FN0114    2       10       3.363061   Ribonucleoside-diphosphate reductase alpha
FN0115    0       12       3.368352   Ribonucleoside-diphosphate reductase beta chain
FN0116    0       12       4.010999   Acetyltransferase
FN0117    2       15       1.684420   Hypothetical cytosolic protein
FN0118    1       12       -0.411749  Hypothetical protein
FN0119    2       13       -2.885430  Sodium/proline symporter
FN0120    2       11       -1.796847  Microcin C7 self-immunity protein mccF
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
FN0114    IGASERPTASE  28     0.019    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 28.5 bits (63), Expect = 0.019
Identities = 29/179 (16%), Positives = 54/179 (30%), Gaps = 25/179 (13%)

Query: 2 KDKEIKEEVLKEEINKEVNEKKKCECEEGKEEAHEHKNDEHACCGKHNHKEEIEKLKAEI 61
+ KE + KE E EK K E E+ +E K E ++
Sbjct: 1091 ETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVS----PKQEQSETVQPQAEPA 1146

Query: 62 EEWKNSFLRKQADFQNFTKRKEKEVDELKKFASEKIITQFLGSLDNFERAIESSSESKDF 121
E + K+ Q + ++ K S N E+ + S+
Sbjct: 1147 RENDPTVNIKEPQSQT---NTTADTEQPAKETSS-----------NVEQPVTESTTVNTG 1192

Query: 122 DSLLQGVEMIVRNLKDIMSSEDVEEIPTEGAFNPEYHHAVGVETSEDKKEDEIVKVLQK 180
+S +V N ++ + + +E + P+ H V + E +
Sbjct: 1193 NS-------VVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDR 1244


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN0116    SHAPEPROTEIN  165    3e-48    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 165 bits (420), Expect = 3e-48
Identities = 70/354 (19%), Positives = 135/354 (38%), Gaps = 39/354 (11%)

Query: 2 SKIIGIDLGTTNSCVAVMEGGSATIIPNSEGARTTPSVVNIKDNGEVVVGEIAKRQAVTN 61
S + IDLGT N+ + V G P+ R + VG AK+
Sbjct: 10 SNDLSIDLGTANTLIYVKGQGIVLNEPSVVAIRQDRAGSP---KSVAAVGHDAKQMLGRT 66

Query: 62 PTSTVSSIKTHMGSDYKVEIFGKKYTPQEISAKTLQKLKKDAEAYLGEEVKEAVITVPAY 121
P + +++I+ K + + +++ ++++ ++ ++ VP
Sbjct: 67 PGN-IAAIR-----PMKDGVIADFFVTEKMLQHFIKQVHSNS---FMRPSPRVLVCVPVG 117

Query: 122 FTDSQRQATKDAGTIAGLDVKRIINEPTAAALAYGLEKKKEEKVLVFDLGGGTFDVSVLE 181
T +R+A +++ AG +I EP AAA+ GL + +V D+GGGT +V+V+
Sbjct: 118 ATQVERRAIRESAQGAGAREVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVIS 177

Query: 182 ISDGVIEVISTAGNNHLGGDDFDNEIINWLVAEFKKETGIDLSNDKMAYQRLKDAAEKAK 241
++ V + + +GGD FD IIN++ + G + AE+ K
Sbjct: 178 LNGVV-----YSSSVRIGGDRFDEAIINYVRRNYGSLIG-------------EATAERIK 219

Query: 242 KELSTLM----ETSISLPFITMDATGPKHLEMKLTRAKFNDLTKHLVEATQGPTKTALKD 297
E+ + I + + P+ + + L + L AL+
Sbjct: 220 HEIGSAYPGDEVREIEVRGRNLAEGVPRGFTLN-SNEILEALQEPLTGIVSA-VMVALEQ 277

Query: 298 AS--LEANQIDE-ILLVGGSTRIPAVQEWVENFFGKKPNKGINPDEVVAAGAAI 348
L ++ + ++L GG + + + G +P VA G
Sbjct: 278 CPPELASDISERGMVLTGGGALLRNLDRLLMEETGIPVVVAEDPLTCVARGGGK 331


11  FN0133  FN0147  Y  N  Y  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
FN0133    4       14       -1.484004  Chaperone protein dnaJ
FN0134    3       19       -2.131055  Flavodoxin
FN0135    6       23       -2.095446  unknown
FN0136    5       24       -1.788697  unknown
FN0137    3       15       -3.023954  unknown
FN0138    4       13       -3.500123  ATPase
FN0139    5       14       -3.843180  ***Fe-S oxidoreductase
FN0140    4       14       -4.757996  Spermidine/putrescine-binding protein
FN0141    8       11       -7.005048  Urease accessory protein ureG
FN0142    3       11       -3.700469  ABC transporter ATP-binding protein
FN0143    -2      12       -1.151854  Hemolysin activator protein precursor
FN0144    -1      10       -0.058313  Hemolysin
FN0145    0       10       0.125137   Hypothetical protein
FN0146    0       12       2.431821   Hypothetical protein
FN0147    1       12       3.498716   Transposase
12  FN0184  FN0206  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
FN0184    4       11       -1.585463  RNA binding protein
FN0185    3       9        -1.658916  Hypothetical protein
FN0186    3       12       -1.090650  Enoyl-[acyl-carrier-protein] reductase
FN0187    3       11       -0.567167  Cell division inhibitor MinC
FN0188    2       13       -0.730421  Cell division inhibitor MinD
FN0189    2       14       -0.546851  Cell division inhibitor MinE
FN0190    2       12       -0.213222  UNC-44 ankyrins
FN0191    2       13       0.544596   Ankyrin repeat proteins
FN0192    2       13       0.371314   Tetratricopeptide repeat family protein
FN0193    2       11       -2.230685  Hypothetical protein
FN0194    2       9        -2.166145  Sarcosine oxidase alpha subunit
FN0195    1       9        -2.020660  Glycerol-3-phosphate dehydrogenase
FN0196    0       10       -0.970238  unknown
FN0197    -1      12       -0.197097  Hypothetical protein
FN0198    0       14       0.897600   Cytochrome c-type biogenesis protein ccdA
FN0199    2       18       5.647056   Thiol:disulfide interchange protein tlpA
FN0200    3       18       5.604648   Peptide methionine sulfoxide reductase
FN0201    4       17       5.546827   Two-component response regulator yesN
FN0202    3       17       5.542326   Two-component sensor kinase yesM
FN0203    3       15       5.241204   helix-turn-helix DNA-binding protein
FN0204    3       14       4.843797   Dipeptide-binding protein
FN0205    4       14       4.076125   Dipeptide transport system permease protein
FN0206    1       13       3.433166   Dipeptide transport system permease protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
FN0189    HTHFIS  87     1e-21    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 86.8 bits (215), Expect = 1e-21
Identities = 26/126 (20%), Positives = 57/126 (45%), Gaps = 3/126 (2%)

Query: 3 KLMIADDEPLIRRGIKQLIDLSSLQIGEIYEASTGEEALKVFEEFKPEIVLMDINMPKID 62
+++ADD+ IR + Q + + ++ S + ++V+ D+ MP +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGY---DVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 63 GLSVAKKIKSINPDTKIAIITGYNYFDYAQTAIKIGVEDYILKPISKSDVSEIIVKLVSS 122
+ +IK PD + +++ N F A A + G DY+ KP +++ II + ++
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 123 LQKERK 128
++
Sbjct: 122 PKRRPS 127


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
FN0190    PF06580  196    5e-60    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 196 bits (500), Expect = 5e-60
Identities = 57/195 (29%), Positives = 99/195 (50%), Gaps = 7/195 (3%)

Query: 355 LREYEINALYSQINPHFLYNTLDTIIWMAEFQDTEKVISITKALSNFFRISLSNGKEK-I 413
+E ++ AL +QINPHF++N L+ I + +D K + +LS R SL + +
Sbjct: 158 AQEAQLMALKAQINPHFMFNALNNIRALIL-EDPTKAREMLTSLSELMRYSLRYSNARQV 216

Query: 414 PLKEEINHIKEYLYIQKQRYEDKLEYKISIQEELENIEVPKIILQPFVENAIYHGIKNLD 473
L +E+ + YL + ++ED+L+++ I + +++VP +++Q VEN I HGI L
Sbjct: 217 SLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLP 276

Query: 474 TTGIISIYSQIVENKIELIIEDNGIGFEAAKKQALMKMGGVGIKNVNKRIQYYYGKEYGA 533
G I + + L +E+ G K+ G G++NV +R+Q YG E
Sbjct: 277 QGGKILLKGTKDNGTVTLEVENTGSLALKNTKE----STGTGLQNVRERLQMLYGTEAQI 332

Query: 534 KIDSSF-KTGARIII 547
K+ K A ++I
Sbjct: 333 KLSEKQGKVNAMVLI 347


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
FN0198    PF05043  31     0.019    Transcriptional activator
		>PF05043#Transcriptional activator

Length = 493

Score = 30.7 bits (69), Expect = 0.019
Identities = 76/497 (15%), Positives = 169/497 (34%), Gaps = 74/497 (14%)

Query: 11 LPKIAELLTLTERSIRYKIDEIN---EELGTKKIEIKKREFFSSLTDEDMTKLVENVEGE 67
++AELL TER+++ + + +L R + +D +M
Sbjct: 28 RSELAELLNCTERAVKDDLSHVKSAFPDLIFHSSTNGIRIINTDDSDIEMV--------- 78

Query: 68 NYIYNQKERQELIILYTLMKKDNFLLKEVAEKLGTSKSTIRNDLKNLKKILLDYNIKLLQ 127
Y + K IL + + + + ++ S S++ + + K++ +
Sbjct: 79 -YHHFFKHSTHFSILEFIFFNEGCQAESICKEFYISSSSLYRIISQINKVIKRQFQFEVS 137

Query: 128 DDKLKYYFDYLEDDYRYFIATYLYKYVSFDEKYDKIFFDDISYFRKVIYKEIKDEYITEI 187
++ + E D RYF A Y F EKY F ++
Sbjct: 138 LTPVQIIGN--ERDIRYFFAQY------FSEKY-YFLEWPFENFSSEPLSQL-------- 180

Query: 188 NSVSKKMKKAELDYMDETLNILAILMVISQKREKKNSNLKIENIKILEKRKEYLQLKKIF 247
+ K+ T +L +L+V + R K ++++ ++ ++L +
Sbjct: 181 --LELVYKETSFPMNLSTHRMLKLLLVTNLYRIKFGHFMEVDKDSFNDQSLDFLMQAEGI 238

Query: 248 TDFSNT----------------------NLMFFTDYLFRITRDEKDIFVKFKNWLDIIVA 285
+ + MFF D + +KD +V + ++
Sbjct: 239 EGVAQSFESEYNISLDEEVVCQLFVSYFQKMFFIDESLFMKCVKKDSYV--EKSYHLL-- 294

Query: 286 VNKIVRTFEIKIKVDLKNID---IFLDEIFYYIKPLIFRTKKRIKLKNSILKDVKKLYPS 342
+ + +K +++++N D L + + +F K + +++ + ++P
Sbjct: 295 -SDFIDQISVKYQIEIENKDNLIWHLHNTAHLYRQELFTEFILFDQKGNTIRNFQNIFPK 353

Query: 343 IFNFLKKNFYFLEEVINEKVSDEEIAYLVPFF-----HKALQNNNKINKKGILVTTYKEN 397
+ +KK E + S + +L F H + K +LV + +
Sbjct: 354 FVSDVKKELSHYLETLEVCSSSMMVNHLSYTFITHTKHLVINLLQNQPKLKVLVMSNFDQ 413

Query: 398 IALFLKENIEAEFLIDIDKILTLKSFEQIVKDLEN--YDYILTTFSVEKDFVKEIKRTKI 455
+ + + ++ E + LE+ YD I++ F + I+ ++
Sbjct: 414 YHAKFVAETLSYYCSNNFELEVWTELELSKESLEDSPYDIIISNFIIPP-----IENKRL 468

Query: 456 IELNPILTEKDIKKLEE 472
I N I T I L
Sbjct: 469 IYSNNINTVSLIYLLNA 485


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
FN0200    RTXTOXIND  29     0.004    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.4 bits (66), Expect = 0.004
Identities = 11/42 (26%), Positives = 17/42 (40%)

Query: 67 TITSPMPGSILDVKVNVGDKVKFGQTLAILEAMKMENDIPAT 108
I + ++ V G+ V+ G L L A+ E D T
Sbjct: 98 EIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKT 139


13  FN0246  FN0271  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
FN0246    2       16       0.865852   unknown
FN0247    2       18       1.180407   ABC transporter ATP-binding protein
FN0248    6       25       1.863619   ABC transporter substrate-binding protein
FN0249    3       24       4.156101   ABC transporter permease protein
FN0250    3       23       4.151307   Hypothetical protein
FN0251    2       22       3.484368   unknown
FN0252    3       21       2.804344   Thymidylate synthase
FN0253    2       15       2.752677   Dihydrofolate reductase
FN0254    2       14       2.764897   Trk system potassium uptake protein trkA
FN0255    0       9        0.454008   Poly(A) polymerase
FN0256    -1      8        0.631795   COP associated protein
FN0257    0       10       2.103968   Copper-exporting ATPase
FN0258    0       11       2.318705   unknown
FN0259    2       13       2.642237   Hypothetical cytosolic protein
FN0260    1       11       2.042676   Hypothetical Exported Protein
FN0261    0       10       2.027858   unknown
FN0262    1       10       1.672970   unknown
FN0263    4       10       -0.156719  Hypothetical membrane-spanning Protein
FN0264    3       10       -0.961916  unknown
FN0265    2       10       -1.245292  Outer membrane protein
FN0266    1       11       -0.285718  Fusobacterium outer membrane protein family
FN0267    -1      11       0.491366   Hypothetical cytosolic protein
FN0268    -2      12       1.830305   Hypothetical cytosolic protein
FN0269    -3      13       2.583763   Transporter
FN0270    -1      17       3.188767   Zinc-transporting ATPase
FN0271    2       16       2.975938   Zinc-transporting ATPase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN0250    IGASERPTASE   27     0.039    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 27.3 bits (60), Expect = 0.039
Identities = 30/154 (19%), Positives = 49/154 (31%), Gaps = 16/154 (10%)

Query: 24 APDVTSGTT-MSAEEQKEAMDILDRMREKIEKEEAEKAKLIAEAKELGMSPSEVASMDNV 82
AP S TT AE K+ E K +A E EVA
Sbjct: 1029 APATPSETTETVAENSKQ--------------ESKTVEKNEQDATETTAQNREVAKEAKS 1074

Query: 83 EEMLEAKRAAEAKPKTEAEKLELTRKKALNKLDFYERVVRSVAREENEVSDYYGVMGEEK 142
+ A+ +E ++ + T K ++ E + + EV + ++
Sbjct: 1075 NVKANTQTNEVAQSGSETKETQTTETKETATVE-KEEKAKVETEKTQEVPKVTSQVSPKQ 1133

Query: 143 QRSTVYLGTAEAAAEQQVEQNAAPAEIQPETPEE 176
++S AE A E N + Q T +
Sbjct: 1134 EQSETVQPQAEPARENDPTVNIKEPQSQTNTTAD 1167


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN0251    IGASERPTASE   30     0.002    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.4 bits (68), Expect = 0.002
Identities = 17/68 (25%), Positives = 26/68 (38%), Gaps = 5/68 (7%)

Query: 24 EEYDKMQAEKAKEAERMAKENPQATEVVGENGEVVVTEGEEVAMAPKKSEKDMTESERMD 83
E E AKEA+ K N Q EV E +E K + + E+
Sbjct: 1059 TETTAQNREVAKEAKSNVKANTQTNEVAQSGSET-----KETQTTETKETATVEKEEKAK 1113

Query: 84 VEVQRIKK 91
VE ++ ++
Sbjct: 1114 VETEKTQE 1121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN0253    OMPADOMAIN    107    8e-31    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 107 bits (268), Expect = 8e-31
Identities = 39/128 (30%), Positives = 66/128 (51%), Gaps = 16/128 (12%)

Query: 38 FDFDKSNVKPQYYDLLNNIKEFVEQ---NNYEITIVGHTDSIGSNAYNFKLSRRRAESVK 94
F+F+K+ +KP+ L+ + + + + ++G+TD IGS+AYN LS RRA+SV
Sbjct: 223 FNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQSVV 282

Query: 95 AKLLEFGLSEDRIVGIEAMGEEQPIATN---------ATKEGRAQNRRVEFKLVQRETVQ 145
L+ G+ D+I MGE P+ N A + A +RRVE ++ + ++
Sbjct: 283 DYLISKGIPADKI-SARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVE---IEVKGIK 338

Query: 146 GTTATPEA 153
P+A
Sbjct: 339 DVVTQPQA 346


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN0266    GPOSANCHOR    43     2e-06    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 42.7 bits (100), Expect = 2e-06
Identities = 26/187 (13%), Positives = 61/187 (32%)

Query: 22 SVKDMNKRLKNIDKEIEKKNTRIKAIDTETSKLEKMIKELEEEIKKLEHERKEIEDEITV 81
D+ K L+ + +IK ++ E + LE ELE+ ++ + +I
Sbjct: 156 RKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKT 215

Query: 82 VKKNIDYSRKNLEISEVEHNRKESEFVAKIIAWDKYSKIHRKEIDEKVLLTKNYREMLHG 141
++ E + A + L K ++
Sbjct: 216 LEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNF 275

Query: 142 DLQRMGHIEKVTGSIKEVKEKIEAEKRKLDRLEAELRENLRKSDIKKEEQKKLKEKLQVE 201
I+ + ++ + + + L A + R D +E +K+L+ + Q
Sbjct: 276 STADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKL 335

Query: 202 KKGHQSS 208
++ ++ S
Sbjct: 336 EEQNKIS 342



Score = 40.0 bits (93), Expect = 1e-05
Identities = 31/232 (13%), Positives = 71/232 (30%), Gaps = 3/232 (1%)

Query: 24 KDMNKRLKNIDKEIEKKNTRIKAIDTETSKLEKMIKELEEEIKKLEHERKEIEDEITVVK 83
K + + ++ ++ IK LE E LE + E+E +
Sbjct: 144 KTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAM 203

Query: 84 KNIDYSRKNLEISEVEHNRKESEFVAKIIAWDKYSKIHRKEIDEKVLLTKNYREMLHGDL 143
++ E E + A + + + L +
Sbjct: 204 NFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQA 263

Query: 144 QRMGHIEKVTGSIKEVKEKIEAEKRKLDRLEAELRENLRKSDIKKEEQKKLKEKLQV--- 200
+ +E KI+ + + LEAE + +S + ++ L+ L
Sbjct: 264 ELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASRE 323

Query: 201 EKKGHQSSIEKLKKEKQRISKEIERIIRENARRAAEKAAREKAAKEAAKNKG 252
KK ++ +KL+++ + + + R+ K E ++ +
Sbjct: 324 AKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNK 375



Score = 32.3 bits (73), Expect = 0.003
Identities = 29/220 (13%), Positives = 66/220 (30%), Gaps = 22/220 (10%)

Query: 38 EKKNTRIKAIDTETSKLEKMIKELEEEIKKLEHERKEIEDEITVVKKNIDYSRKNLEISE 97
E ++ K+++ + E E L+ + ++ +K + D + L ++
Sbjct: 39 EVSAVATRSQTDTLEKVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAK 98

Query: 98 VEHNRKESEFVAKIIAWDKYSKIHRKEIDEKVLLTKNYREMLHGDLQRMGHIEKVTGSIK 157
+ + K + EK + + + + +
Sbjct: 99 EKLR------------------KNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADS 140

Query: 158 EVKEKIEAEKRKL----DRLEAELRENLRKSDIKKEEQKKLKEKLQVEKKGHQSSIEKLK 213
+ +EAEK L LE L + S + K L+ + + + L+
Sbjct: 141 AKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALE 200

Query: 214 KEKQRISKEIERIIRENARRAAEKAAREKAAKEAAKNKGK 253
+ + +I A +AA A + K
Sbjct: 201 GAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNF 240



Score = 30.0 bits (67), Expect = 0.019
Identities = 55/282 (19%), Positives = 104/282 (36%), Gaps = 24/282 (8%)

Query: 11 LLSASIYPASKSVKDMNKRLKNIDKEIEKKNTRIKAIDTETSKLEKMIKELEEEIKKLEH 70
SA I + R +++K +E A + LE LE +LE
Sbjct: 208 ADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEK 267

Query: 71 ERKEIEDEITVVKKNIDYSRKNLEISEVEHNRKESEFVAKIIAWDKYSKIHRKEIDEKVL 130
+ + T I E E E + + +R+ + +
Sbjct: 268 ALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQ--------SQVLNANRQSLRRDLD 319

Query: 131 LTKNYREMLHGDLQRMGHIEKVT-GSIKEVKEKIEAEKRKLDRLEAEL------------ 177
++ ++ L + Q++ K++ S + ++ ++A + +LEAE
Sbjct: 320 ASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEA 379

Query: 178 -RENLRKSDIKKEEQKKLKEKLQVEKKGHQSSIEKLKKEKQRISKEIERIIRE-NARRAA 235
R++LR+ E KK EK E +++EKL KE + K E+ E A+ A
Sbjct: 380 SRQSLRRDLDASREAKKQVEKALEEANSKLAALEKLNKELEESKKLTEKEKAELQAKLEA 439

Query: 236 E-KAAREKAAKEAAKNKGKGSKRSGGTKVTTTTVDMPKISNP 276
E KA +EK AK+A + + ++ ++ +
Sbjct: 440 EAKALKEKLAKQAEELAKLRAGKASDSQTPDAKPGNKAVPGK 481



Score = 29.3 bits (65), Expect = 0.029
Identities = 45/245 (18%), Positives = 88/245 (35%), Gaps = 7/245 (2%)

Query: 19 ASKSVKDMNKRLKNIDKEIEKKNTRIKAIDTETSKLEKMIKELEEEIKKLEHERKEIEDE 78
+ R ++K +E A + LE LE E LEH+ + +
Sbjct: 251 LEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNAN 310

Query: 79 ITVVKKNIDYSRKNLEISEVEHNRKESEFVAKIIAWDKYSKIHRKEIDEK----VLLTKN 134
+++++D SR+ + E EH + E + + + + K K
Sbjct: 311 RQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKL 370

Query: 135 YREMLHGDLQRMGHIEKVTGSI---KEVKEKIEAEKRKLDRLEAELRENLRKSDIKKEEQ 191
+ + R + S K+V++ +E KL LE +E + ++E+
Sbjct: 371 EEQNKISEASRQSLRRDLDASREAKKQVEKALEEANSKLAALEKLNKELEESKKLTEKEK 430

Query: 192 KKLKEKLQVEKKGHQSSIEKLKKEKQRISKEIERIIRENARRAAEKAAREKAAKEAAKNK 251
+L+ KL+ E K + + K +E ++ + + KA K A K
Sbjct: 431 AELQAKLEAEAKALKEKLAKQAEELAKLRAGKASDSQTPDAKPGNKAVPGKGQAPQAGTK 490

Query: 252 GKGSK 256
+K
Sbjct: 491 PNQNK 495


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN0268    PF01540       36     4e-04    Adhesin lipoprotein
		>PF01540#Adhesin lipoprotein

Length = 475

Score = 35.9 bits (82), Expect = 4e-04
Identities = 60/286 (20%), Positives = 117/286 (40%), Gaps = 48/286 (16%)

Query: 156 KDEKDIKENLASLLYQYREIDSKIEDIEREKRETLEKKEFYEYQLEEIEKLKLKDGEDEL 215
+ IKE LL ++ KI+ T+ K E ++Q++E K +L + L
Sbjct: 118 DENLKIKEGAKELL----KLSEKIQSFADTIALTITKLEGKKFQIDETFKKQLISTIELL 173

Query: 216 LE--VEYKRVFNAEKIREK-VYESLEYLKNDDDSALSLITNSIRNI---------EYLGK 263
+ E K I++ + LE K + S L I + + E +
Sbjct: 174 NKKSAEVKTFATVNTIKKDFLLSELESFKEFNTSWLEKIVSEWEEVKKAWSKELAEIKAE 233

Query: 264 YDERYTELAKRIENAYYELEDCANEIEDISKGIDVTENDLDKIASRMNTLKRIKEKYKRT 323
D++ E ++I+ EL + +I+ + I +T L++ +I EK+K+
Sbjct: 234 DDKKLAEENQKIKEGAKELLKLSEKIQSFADTIALTITKLERKF-------QIDEKFKKQ 286

Query: 324 LPELIEYREDLKEKLSDIDS--------GDFKTKELKK-------ELNKIKDEYDKIAEK 368
L IE L +K ++ + DF EL+ L KI E++++ +
Sbjct: 287 LISTIE---LLNKKSVEVKTFATVNTIKKDFLLSELESFKEFNTSWLEKIVSEWEEVKKA 343

Query: 369 LTNSRKEIAVKIENELLNELKFLNMEDAKLKVQINKLERMTNEGYD 414
+ EI + + +L E+ K+K + +L+++ NE ++
Sbjct: 344 WSKELAEIKAEDDKKLAE-------ENQKIKNGVEELKKINNEAFE 382


14  FN0517  FN0530  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
FN0517    3       9        -4.045039  Ornithine decarboxylase
FN0518    5       8        -4.483412  Phosphoheptose isomerase
FN0519    4       8        -4.648234  Transcriptional regulatory protein, LYSR family
FN0520    2       9        -3.296086  Arginine permease
FN0521    2       8        -3.033499  Anthranilate synthase component II
FN0522    2       8        -3.159262  Arginyl-tRNA synthetase
FN0523    0       9        -1.967854  unknown
FN0524    -1      8        -1.893124  Serine protease
FN0525    -1      9        -0.702612  Transposase
FN0526    -1      9        -2.364187  unknown
FN0527    -1      13       -1.742160  D-lactate dehydrogenase
FN0528    2       15       -0.591235  Flavoprotein
FN0529    3       15       -0.953561  Flavodoxin
FN0530    4       15       0.207385   Hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN0522    GPOSANCHOR    45     9e-07    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 45.4 bits (107), Expect = 9e-07
Identities = 39/294 (13%), Positives = 94/294 (31%), Gaps = 6/294 (2%)

Query: 224 NINVVSKNLENEIKDYETTEIELNNLVKNIKDEENKIKKYLNILKENIIEAKQAKKSKII 283
+ V + + + T +++ ++L N K ++ + L + ++ KS
Sbjct: 51 TLEKVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSE 110

Query: 284 VKETEKSYLEYLEIENRLKDLRENLDNLLEEQKLNIQYQNNIEKLELSNKNLKNDIINLE 343
+ + + N + ++ + L +L+ +
Sbjct: 111 KASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEK--AALAARKADLEKALEGAM 168

Query: 344 ENISKNSEKKENLESEISNLKIKEEDLDLKLKKYISLLDELEKLENFKDKKLEDKLKKTT 403
+ +S K + LE+E + L+ ++ +L+ L+ ++ K K LE +
Sbjct: 169 NFSTADSAKIKTLEAEKAALEARQAELEKALEG----AMNFSTADSAKIKTLEAEKAALA 224

Query: 404 EIDILKKELISKKDLFKTINIEIIEEKLSNFQELEKELKLLEEQKIIFEIEIKTLKKSSK 463
++ + F T + I+ + LE LE+ K
Sbjct: 225 ARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIK 284

Query: 464 ELSDKICPFLNEKCQNLEDKEAEDYFSSKISIKKEDLENLKKNIKEKTQILVEK 517
L + EK + + + + KK ++ + Q L E+
Sbjct: 285 TLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQ 338



Score = 41.2 bits (96), Expect = 2e-05
Identities = 52/306 (16%), Positives = 97/306 (31%), Gaps = 27/306 (8%)

Query: 343 EENISKNSEKKENLESEISNLKIKEEDLDLKLKKYISLLDELEKLENFKDKKLEDKLKKT 402
+ + K E+ + E E + LK+K DL K DEL + + +KL K
Sbjct: 49 TDTLEKVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKS- 107

Query: 403 TEIDILKKELISKKDLFKTINIEIIEEKLSNFQELEKELKLLEEQKIIFEIEIKTLKKSS 462
+ EK S QELE LE+
Sbjct: 108 ------------------------LSEKASKIQELEARKADLEKALEGAMNFSTADSAKI 143

Query: 463 KELSDKICPFLNEKCQNLEDKEAEDYFSSKISIKKEDLENLKKNIKEKTQILVEKVVFED 522
K L + K + E FS+ S K + LE K ++ + L + E
Sbjct: 144 KTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKA--LEG 201

Query: 523 KKKQYFELEKSIKDLEISLKNEEINLKEIELDIKNLDIDIQKLIENQEFQNSQMLREKKT 582
IK LE ++E ++ + ++ +
Sbjct: 202 AMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEAR 261

Query: 583 ELEVELRNLNLDEKRENLKNILENLEIEKEKILKNQNSIKSNLEEIDVFSKKIKEDINKN 642
+ E+E ++ LE EK + + ++ + ++ + ++ D++ +
Sbjct: 262 QAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDAS 321

Query: 643 IESIKS 648
E+ K
Sbjct: 322 REAKKQ 327


15  FN0563  FN0573  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
FN0563    1       11       -3.576142  RecA protein
FN0564    0       11       -6.058701  Regulatory protein recX
FN0565    -1      11       -6.372172  O-sialoglycoprotein endopeptidase
FN0566    -1      9        -6.202626  hypothetical Protein
FN0567    -2      8        -7.035058  Bacterial regulatory proteins, crp family
FN0568    0       8        -6.472034  Serine racemase
FN0569    0       8        -6.287176  D-serine dehydratase
FN0570    1       10       -6.039103  D-serine permease
FN0571    2       11       -4.006591  Transcriptional regulator, MerR family
FN0572    1       10       -3.329011  unknown
FN0573    2       10       -2.092103  unknown
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN0563    FbpA_PF05833  48     3e-08    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 48.3 bits (115), Expect = 3e-08
Identities = 22/101 (21%), Positives = 44/101 (43%), Gaps = 5/101 (4%)

Query: 208 LEDDGLLKDEHYWLFKLIKEARFFRF--SQARYLFVGRNKESNDKIDEFRKEKNLDFYIQ 265
L + G +K + + K K ++ F ++VG+N ND + + D +
Sbjct: 437 LIETGYIKFKKIYKSKKSKTSKPMHFISKDGIDIYVGKNNIQNDYL-TLKFANKHDIWFH 495

Query: 266 SSEVPGPHII--ANTNLTDEEIEFAKKLFSRYSKVKGNEKV 304
+ +PG H+I ++ + + A L + YSK + + V
Sbjct: 496 TKNIPGSHVIVKNIMDIPESTLLEAANLAAYYSKSQNSSNV 536


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN0566    NAFLGMOTY     29     0.033    Sodium-type flagellar protein MotY precursor signature.
		>NAFLGMOTY#Sodium-type flagellar protein MotY precursor signature.

Length = 293

Score = 28.6 bits (63), Expect = 0.033
Identities = 26/109 (23%), Positives = 51/109 (46%), Gaps = 12/109 (11%)

Query: 27 QMFYNRELKTEISIDKINLLVKFNAFNGKEIKEVEYLYEDKNNHGTIKYFDKEGKLKLKA 86
Q + +R+ + E+++ + K+NAF+ ++Y +ED TI +++++G KA
Sbjct: 140 QDWQSRDQRIEVALSSVLFQSKYNAFSDCIANLLKYSFEDIAF--TILHYERQGDQLTKA 197

Query: 87 AISTNNLLNGNSRLDSIEAYYENGNKIDMNLL-RYPVEIDGLWKNNKLT 134
+ RL I Y + ID+ L+ Y DG ++ L+
Sbjct: 198 S---------KKRLAQIADYVRHNQDIDLVLVATYTDSTDGKSESQSLS 237


16  FN0665  FN0670  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
FN0665    2       9        -0.006005  Exoenzymes regulatory protein aepA precursor
FN0666    2       9        -0.167616  NA+/H+ antiporter NHAC
FN0667    2       8        -0.395975  Ribosomal large subunit pseudouridine synthase
FN0668    2       7        -0.285442  Glyceraldehyde 3-phosphate dehydrogenase
FN0669    2       9        -0.867741  unknown
FN0670    2       9        0.417160   Phosphoglycerate kinase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN0668    ADHESNFAMILY  264    1e-89    Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 264 bits (676), Expect = 1e-89
Identities = 94/315 (29%), Positives = 171/315 (54%), Gaps = 16/315 (5%)

Query: 5 MKKIFKLLTVMMISLFIVACGEKKEEVGTSNEIQKIKVTTTLNYYQNLIEEIGGDKVEVI 64
MKK+ LL + + ++ +VAC K++ + QK+KV T + ++ + I GDK+++
Sbjct: 1 MKKLGTLLVLFLSAIILVACASGKKDTTSG---QKLKVVATNSIIADITKNIAGDKIDLH 57

Query: 65 GLMKEGEDPHLYVATAGDVEKLQNADLVVYGGLHLEGKMTDIFANLS-------NKYILN 117
++ G+DPH Y DV+K ADL+ Y G++LE F L NK
Sbjct: 58 SIVPIGQDPHEYEPLPEDVKKTSEADLIFYNGINLETGGNAWFTKLVENAKKTENKDYFA 117

Query: 118 LGDQLDKSLLHKEDE-NTYDPHVWFNTKFWAIQAKSVADKLSEILPENKDYFENNLQVYL 176
+ D +D L ++E DPH W N + I AK++A +LS P NK+++E NL+ Y
Sbjct: 118 VSDGVDVIYLEGQNEKGKEDPHAWLNLENGIIFAKNIAKQLSAKDPNNKEFYEKNLKEYT 177

Query: 177 KSLDEATEYIQAKINEIPEESRYLITAHDAFAYFAEQFGLQVKAIQGVSTDSEIGTKQIE 236
LD+ + + K N+IP E + ++T+ AF YF++ +G+ I ++T+ E +QI+
Sbjct: 178 DKLDKLDKESKDKFNKIPAEKKLIVTSEGAFKYFSKAYGVPSAYIWEINTEEEGTPEQIK 237

Query: 237 DLANFIVEHNIKAIFVESSVNHKSIEALQEAVKAKGGNVEIGGELYSDSMGDKENNTETY 296
L + + + ++FVESSV+ + ++ + ++ N+ I ++++DS+ ++ ++Y
Sbjct: 238 TLVEKLRQTKVPSLFVESSVDDRPMKTV-----SQDTNIPIYAQIFTDSIAEQGKEGDSY 292

Query: 297 IKTIKANADTISNAL 311
+K N D I+ L
Sbjct: 293 YSMMKYNLDKIAEGL 307


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN0670    RTXTOXINA     28     0.049    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 28.4 bits (63), Expect = 0.049
Identities = 14/57 (24%), Positives = 28/57 (49%), Gaps = 11/57 (19%)

Query: 229 SVVTLLSAIIGGISGA-MGSIISTFDVALPTGPLIILVSGIFALISFL--FSKKGII 282
++ T+L+++ GIS A S++ P+ LV + +IS + SK+ +
Sbjct: 370 TISTVLASVSSGISAAATTSLVGA--------PVSALVGAVTGIISGILEASKQAMF 418


17  FN0833  FN0861  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
FN0833    1       10       -3.395247  ABC transporter ATP-binding protein
FN0834    2       11       -2.906054  DNA-binding protein HU
FN0835    1       16       -2.847011  Tetratricopeptide repeat family protein
FN0836    5       16       -2.668560  Mercuric reductase
FN0837    5       16       -2.341744  Hypothetical protein
FN0838    5       19       -2.327722  Shikimate kinase
FN0839    4       19       -2.769815  GTP-binding protein hflX
FN0840    6       17       -2.546666  hypothetical cytosolic protein
FN0841    1       15       -2.539570  Hypothetical cytosolic protein
FN0842    0       13       -2.592721  periplasmic component of efflux system
FN0843    1       13       -2.763805  ABC transporter ATP-binding protein
FN0844    0       10       -2.644774  ABC transporter permease protein
FN0845    0       9        -2.751725  unknown
FN0846    -1      10       -2.897063  Hypothetical protein
FN0847    -1      11       -3.057140  Hemin receptor
FN0848    1       11       -2.338932  Hypothetical protein
FN0849    0       11       -2.389789  Hypothetical protein
FN0850    3       11       -1.567774  Hypothetical Exported Protein
FN0851    3       10       -0.160378  Hypothetical protein
FN0852    2       8        0.731518   Hypothetical protein
FN0853    2       8        0.970125   Integrase/recombinase
FN0854    2       7        0.723583   unknown
FN0855    3       7        1.118307   unknown
FN0856    1       8        0.946866   Transposase
FN0857    -1      7        0.408760   Transposase
FN0858    2       9        -0.858918  Hypothetical protein
FN0859    2       15       -1.701776  unknown
FN0860    4       14       -0.868582  unknown
FN0861    2       12       -1.238334  unknown
18  FN0890  FN0897  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
FN0890    -1      9        -3.481158  Phosphohydrolase (MUTT/NUDIX family protein)
FN0891    -1      10       -2.804822  23S rRNA methyltransferase
FN0892    1       9        -4.166318  Gamma-glutamyltranspeptidase
FN0893    1       8        -5.765624  Sodium/proline symporter
FN0894    1       7        -5.673759  Transcriptional regulator, GntR family
FN0895    0       8        -4.662599  ABC transporter integral membrane protein
FN0896    -1      8        -4.144363  ABC transporter ATP-binding protein
FN0897    -1      8        -4.139437  ABC transporter integral membrane protein
19  FN0913  FN0939  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
FN0913    1       11       -3.304940  Hypothetical cytosolic protein
FN0914    1       12       -3.829048  Hypothetical protein
FN0915    -1      11       -2.286035  3-oxoacyl-[acyl-carrier protein] reductase
FN0916    1       9        -3.595746  Metal dependent hydrolase
FN0917    3       10       -5.031879  DNA polymerase, bacteriophage-type
FN0918    1       7        -4.527118  5-formyltetrahydrofolate cyclo-ligase
FN0919    0       8        -4.294549  Polysialic acid capsule expression protein kpsF
FN0920    -1      8        -4.514070  NAD(FAD)-utilizing dehydrogenases
FN0921    1       8        -5.167972  Hypothetical protein
FN0922    2       10       -4.996338  Glycerol-3-phosphate dehydrogenase [NAD(P)+]
FN0923    0       11       -3.844262  Ribosomal RNA small subunit methyltransferase C
FN0924    -1      12       -3.158775  Tpl protein
FN0925    0       11       -2.501344  DNA repair protein radC
FN0926    0       11       -1.731066  Nicotinate-nucleotide--dimethylbenzimidazole
FN0927    0       11       -2.592065  Alpha-ribazole-5'-phosphate phosphatase
FN0928    -1      11       -2.462596  Cobalamin [5'-phosphate] synthase
FN0929    1       10       -2.126756  Cobinamide kinase
FN0930    3       12       -2.580897  Hypothetical protein
FN0931    3       13       -3.193938  PTS system, N-acetylglucosamine-specific IIA
FN0932    3       12       -3.226474  Hypothetical Exported Protein
FN0933    2       12       -1.756996  Hypothetical protein
FN0934    1       13       -0.570536  unknown
FN0935    2       11       -0.258944  hypothetical protein
FN0936    0       11       1.716572   Protease HTPX
FN0937    0       11       0.549813   Hypothetical protein
FN0938    1       11       0.951861   Homoserine kinase
FN0939    2       9        0.266536   Cardiolipin synthetase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN0922    YERSSTKINASE  29     0.024    Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 29.3 bits (65), Expect = 0.024
Identities = 15/47 (31%), Positives = 22/47 (46%), Gaps = 13/47 (27%)

Query: 183 SGIIHGDIFPDNVLLDEYNNIKVIFD-------------FNESYYAP 216
+G++H DI P NV+ D + V+ D F ES+ AP
Sbjct: 264 AGVVHNDIKPGNVVFDRASGEPVVIDLGLHSRSGEQPKGFTESFKAP 310


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN0924    LCRVANTIGEN   29     0.013    Low calcium response V antigen signature.
		>LCRVANTIGEN#Low calcium response V antigen signature.

Length = 326

Score = 28.9 bits (64), Expect = 0.013
Identities = 30/137 (21%), Positives = 58/137 (42%), Gaps = 23/137 (16%)

Query: 75 IAFYKVYGDYKDRD-----KVVISTQDIKRIKMLAFYNEFWEKLR------AEFNKRPTI 123
+ + + D D D ++ R K+ E +L+ AE NK +
Sbjct: 121 VMHFSLTADRIDDDILKVIVDSMNHHGDARSKLREELAELTAELKIYSVIQAEINKHLSS 180

Query: 124 LFTVNLEDKVFLDVLDFIIAKTDRLQPIYLYTGDEIDKLLTDKDIISFINKYSIEIIKGE 183
T+N+ DK ++++D + +Y YT +EI K + I+ + + +I++ E
Sbjct: 181 SGTINIHDKS-INLMD---------KNLYGYTDEEIFKASAEYKILEKMPQTTIQVDGSE 230

Query: 184 NKEFIANVKEKFFGEKK 200
K I ++K+ E K
Sbjct: 231 KK--IVSIKDFLGSENK 245


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN0926    PHPHLIPASEA1  29     0.019    Bacterial phospholipase A1 protein signature.
		>PHPHLIPASEA1#Bacterial phospholipase A1 protein signature.

Length = 289

Score = 28.8 bits (64), Expect = 0.019
Identities = 8/25 (32%), Positives = 12/25 (48%)

Query: 203 RYPYDVDNPIITEYLGIFNRIVGSA 227
DNP IT+Y+G + +G
Sbjct: 198 VVGNTDDNPDITKYMGYYQLKIGYH 222


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN0934    cloacin       29     0.036    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 28.9 bits (64), Expect = 0.036
Identities = 14/49 (28%), Positives = 27/49 (55%), Gaps = 2/49 (4%)

Query: 300 SVVVKPTSSISLEQKTVNIKEMKEDILKINGRHDTCI--VPRVLPVIEA 346
+ P SS+ L++ TVN+ D +K ++ + + VP +PV++A
Sbjct: 163 DITESPVSSLPLDKATVNVNVRVVDDVKDERQNISVVSGVPMSVPVVDA 211


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN0935    TYPE4SSCAGX   29     0.036    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 28.6 bits (63), Expect = 0.036
Identities = 36/122 (29%), Positives = 53/122 (43%), Gaps = 25/122 (20%)

Query: 182 EEEFDKNLENRY--EFYKNESQNNIKAFINIGGNLLSLGENADIINN-------QKILLD 232
EEE K E + E K E+ N A+IN + + N IIN QKI+LD
Sbjct: 324 EEELKKREEAKRQRELIKQENLNT-TAYIN----RVMMASNEQIINKEKIREEKQKIILD 378

Query: 233 ESSPLKTGLVGKFLKDDIPVFYLLNIKSIALYYNLEHDPDKFSK-IGTSSIYYDSSKNFW 291
++ L+T V LK + + YN P+K SK I S I+ D + ++
Sbjct: 379 QAKALETQYVHNALKRN----------PVPRNYNYYQAPEKRSKHIMPSEIFDDGTFTYF 428

Query: 292 NY 293
+
Sbjct: 429 GF 430


20  FN1032  FN1056  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
FN1032    1       11       -3.531855  Lipid A biosynthesis lauroyl acyltransferase
FN1033    2       9        -2.876230  Hypothetical Exported Protein
FN1034    4       11       -3.489511  unknown
FN1035    5       12       -2.999341  3-hydroxybutyryl-CoA dehydrogenase
FN1036    4       12       -4.202195  3-hydroxybutyryl-CoA dehydratase
FN1037    2       11       -3.307401  hypothetical cytosolic protein
FN1038    1       12       -2.727396  Calcium-transporting ATPase
FN1039    0       14       -3.630654  5-Nitroimidazole antibiotic resistance protein
FN1040    1       14       -3.445777  DNA-binding protein HU
FN1041    1       14       -3.295737  Guanine-hypoxanthine permease
FN1042    -1      15       -2.896957  tRNA pseudouridine synthase A
FN1043    -1      16       -4.394140  Rod shape-determining protein rodA
FN1044    0       15       -4.392938  Deoxyuridine 5'-triphosphate
FN1045    -1      14       -4.085942  Zinc protease
FN1046    -1      13       -4.711955  Hypothetical membrane-spanning protein
FN1047    -1      13       -5.131957  Hypothetical membrane-spanning protein
FN1048    1       14       -4.132974  Hypothetical protein
FN1049    2       14       -3.558435  Methyltransferase
FN1050    0       12       -2.486520  Transcriptional regulator, TetR family
FN1051    -1      13       -3.503903  Iron-sulfur flavoprotein
FN1052    0       13       -3.451908  Hypothetical protein
FN1053    1       14       -3.691450  Hypothetical cytosolic protein
FN1054    2       11       -2.957637  Transporter
FN1055    2       9        -2.562025  Branched-chain amino acid transport protein
FN1056    2       8        -3.731509  Branched-chain amino acid transport protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1034    TETREPRESSOR  50     6e-10    Tetracycline repressor protein signature.
		>TETREPRESSOR#Tetracycline repressor protein signature.

Length = 218

Score = 50.3 bits (120), Expect = 6e-10
Identities = 21/55 (38%), Positives = 32/55 (58%)

Query: 18 KEELIKKGIELINEVGENKLSLRKLAIICGVSNSAPYTHFKSKDELLKGMSLYIL 72
+E +I +EL+NE G + L+ RKLA G+ Y H K+K LL +++ IL
Sbjct: 6 RESVIDAALELLNETGIDGLTTRKLAQKLGIEQPTLYWHVKNKRALLDALAVEIL 60


21  FN1073  FN1082  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
FN1073    2       10       -2.969896  Diamine acetyltransferase
FN1074    -1      11       -3.006401  Hypothetical protein
FN1075    0       11       -2.812913  Branched chain amino acid transport system II
FN1076    -2      12       -3.336642  hypothetical cytosolic protein
FN1077    0       14       -2.136228  unknown
FN1078    -1      14       -2.064695  Hydrolase
FN1079    1       14       -1.629899  N-acyl-L-amino acid amidohydrolase
FN1080    2       13       -1.307104  Hypothetical protein
FN1081    -1      16       -3.656472  Hypothetical protein
FN1082    1       14       -3.761777  Exodeoxyribonuclease VII large subunit
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1079    HELNAPAPROT   104    8e-32    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 104 bits (262), Expect = 8e-32
Identities = 42/138 (30%), Positives = 73/138 (52%), Gaps = 1/138 (0%)

Query: 4 KENLNKYLSNLGILITKTHNLHWNVVGARFKAIHEYTESLYDYYFEKFDEVAEAFKMKGE 63
+ +LN LSN +L +K H HW V G F +HE E LYD+ E D +AE G
Sbjct: 14 ENSLNTQLSNWFLLYSKLHRFHWYVKGPHFFTLHEKFEELYDHAAETVDTIAERLLAIGG 73

Query: 64 FPLVKVADYLKHATVKELEAKDFTIPEVVTSIKEDIELMLADARKIREVANEEDDFLVAN 123
P+ V +Y +HA++ + + + E+V ++ D + + ++++ + +A E D A+
Sbjct: 74 QPVATVKEYTEHASITD-GGNETSASEMVQALVNDYKQISSESKFVIGLAEENQDNATAD 132

Query: 124 MMEDQIEYFVKQLWFISA 141
+ IE KQ+W +S+
Sbjct: 133 LFVGLIEEVEKQVWMLSS 150


22  FN1092  FN1097  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
FN1092    -1      8        -3.177943  Hypothetical protein
FN1093    1       10       -4.100560  Hypothetical protein
FN1094    2       10       -4.102211  Hypothetical exported 24-amino acid repeat
FN1095    3       10       -4.288758  Neutrophil-activating protein A
FN1096    2       9        -3.564068  Export ABC transporter
FN1097    3       9        -2.780312  unknown
23  FN1165  FN1180  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
FN1165    2       11       2.850136   ATP-dependent nuclease subunit A
FN1166    1       10       2.655636   unknown
FN1167    3       11       3.384932   Na+ driven multidrug efflux pump
FN1168    1       10       3.131984   Aspartate aminotransferase
FN1169    1       12       4.052499   Hypothetical protein
FN1170    2       14       3.674657   Ribonuclease BN
FN1171    -1      15       1.466471   Cell division protein ftsI
FN1172    -1      13       -1.355350  Primosomal protein N'
FN1173    6       14       -4.560679  Polypeptide deformylase
FN1174    3       9        -3.915633  Hypothetical protein
FN1175    3       8        -3.752484  Fructose-1,6-bisphosphatase
FN1176    3       7        -3.048999  SWF/SNF family helicase
FN1177    3       7        -3.634261  Glutamate racemase
FN1178    3       8        -3.345472  Hydroxyacylglutathione hydrolase
FN1179    3       8        -3.642109  Thioredoxin reductase
FN1180    2       10       -3.287384  Glucokinase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1168    TCRTETA       46     6e-08    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 46.4 bits (110), Expect = 6e-08
Identities = 49/304 (16%), Positives = 116/304 (38%), Gaps = 21/304 (6%)

Query: 5 ESNIKLLLLGRVVSLFGNTIYLI--VLPLYILNIT---QNLKTTGIFFASINLPTIIISP 59
+ N L+++ V+L I LI VLP + ++ GI A L +P
Sbjct: 2 KPNRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAP 61

Query: 60 LIGTLIEKFNKKNIILICDFLTSMLYFIPFLYFKNSNSVIFLLIVSLILNIISKFFEIAS 119
++G L ++F ++ ++L+ ++ Y I + +++L + I+ I+ +
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAI-----MATAPFLWVLYIGRIVAGITGATGAVA 116

Query: 120 KVLFSEINTPETLEKYNGLQSFLENAVMIFGPVVGTYLFATFTFNFVLIIISLAYFLSFL 179
++I + ++ G S M+ GPV+G + F+ + + L+FL
Sbjct: 117 GAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLM-GGFSPHAPFFAAAALNGLNFL 175

Query: 180 QELFIKYEKNTNLSKEKSSFFKDFKEGISYIKSKKIIFNFFILAMFLNFFIANNDEIINP 239
F+ E + + + + + ++ + F+ + +
Sbjct: 176 TGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVF-FIMQLVGQVPAALWV 234

Query: 240 GILIKKYQISEKLFGFSVACYGL-----GSVVAGYLSSYIGAD----VTYIIDNVAIIII 290
++ G S+A +G+ +++ G +++ +G + I D I++
Sbjct: 235 IFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILL 294

Query: 291 VFLV 294
F
Sbjct: 295 AFAT 298



Score = 30.2 bits (68), Expect = 0.011
Identities = 23/122 (18%), Positives = 49/122 (40%), Gaps = 6/122 (4%)

Query: 39 NLKTTGIFFASINLPTIIISPLI-GTLIEKFNKKNIILICDFLTSMLYFIPFLYFKNSNS 97
+ T GI A+ + + +I G + + ++ +++ + +I L F
Sbjct: 244 DATTIGISLAAFGILHSLAQAMITGPVAARLGERRALML-GMIADGTGYI-LLAFATRGW 301

Query: 98 VIFLLIVSLILNIISKFFEIASKVLFSEINTPETLEKYNGLQSFLENAVMIFGPVVGTYL 157
+ F ++V L I A + + S E + G + L + I GP++ T +
Sbjct: 302 MAFPIMVLLASGGI---GMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAI 358

Query: 158 FA 159
+A
Sbjct: 359 YA 360


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1171    ACETATEKNASE  518    0.0      Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 518 bits (1336), Expect = 0.0
Identities = 220/400 (55%), Positives = 282/400 (70%), Gaps = 5/400 (1%)

Query: 1 MKILVINCGSSSLKYQLINPETEEVFAKGLCERIGIDGSKLEYEVVAKDFEKKLETPMPS 60
MKILVINCGSSSLKYQLI + V AKGL ERIGI+ S L + + + K++ M
Sbjct: 1 MKILVINCGSSSLKYQLIESKDGNVLAKGLAERIGINDSLLTHNANGE--KIKIKKDMKD 58

Query: 61 HKEALELVISHLTDKEIGVIASVDEVDAIGHRVVHGGEEFAQSVLINDAVLKAIEANNDL 120
HK+A++LV+ L + + GVI + E+DA+GHRVVHGGE F SVLI D VLKAI +L
Sbjct: 59 HKDAIKLVLDALVNSDYGVIKDMSEIDAVGHRVVHGGEYFTSSVLITDDVLKAITDCIEL 118

Query: 121 APLHNPANLMGIRTCMELMPGKKNVAVFDTAFHQTMKPEAFMYPLPYEDYKELKVRKYGF 180
APLHNPAN+ GI+ C ++MP VAVFDTAFHQTM A++YP+PYE Y + K+RKYGF
Sbjct: 119 APLHNPANIEGIKACTQIMPDVPMVAVFDTAFHQTMPDYAYLYPIPYEYYTKYKIRKYGF 178

Query: 181 HGTSHLYVSGIMREIMGNP-EHSKIIVCHLGNGASITAVKDGKSVDTSMGLTPLQGLMMG 239
HGTSH YVS EI+ P E KII CHLGNG+SI AVK+GKS+DTSMG TPL+GL MG
Sbjct: 179 HGTSHKYVSQRAAEILNKPIESLKIITCHLGNGSSIAAVKNGKSIDTSMGFTPLEGLAMG 238

Query: 240 TRCGDIDPAAVLFVKNKRGLTDAQMDDRMNKKSGILGLFGKSSDCRDL-ENAVVEGDERA 298
TR G IDP+ + ++ K ++ ++ + +NKKSG+ G+ G SSD RDL + A GD+RA
Sbjct: 239 TRSGSIDPSIISYLMEKENISAEEVVNILNKKSGVYGISGISSDFRDLEDAAFKNGDKRA 298

Query: 299 ILAESVSMHRLRSYIGAYAAIMGGVDAICFTGGIGENSSMTREKALEGLEFLGVELDKEI 358
LA +V +R++ IG+YAA MGGVD I FT GIGEN RE L+GLEFLG +LDKE
Sbjct: 299 QLALNVFAYRVKKTIGSYAAAMGGVDVIVFTAGIGENGPEIREFILDGLEFLGFKLDKEK 358

Query: 359 NSVRKKGNVKLSKDSSKVLIYKIPTNEELVIARDTFRLAK 398
N VR + + +S SKV + +PTNEE +IA+DT ++ +
Sbjct: 359 NKVRGEEAI-ISTADSKVNVMVVPTNEEYMIAKDTEKIVE 397


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1179    TYPE4SSCAGX   31     0.030    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 30.5 bits (68), Expect = 0.030
Identities = 35/161 (21%), Positives = 70/161 (43%), Gaps = 10/161 (6%)

Query: 635 SQDYSELFKNYSKQINEGDKIKLIEENLSFENLKESNFVDEFEKAYEKYQRILNSDENSQ 694
++DY E K ++ D +L E+ + E KE+ E + +K +R +E ++
Sbjct: 119 TRDYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAK---EQAQKAQKDKREKRKEERAK 175

Query: 695 DALKLRDIQSVTVIPYDIYEENEENIKELIKKIEDKNLGLEER-----QNAKTELLKKTL 749
+ L ++ + P ++ N +N+ ELIK+ + L ER + A+ LK+
Sbjct: 176 NRANLENLTNAMSNPQNL--SNNKNLSELIKQQRENELDQMERLEDMQEQAQANALKQIE 233

Query: 750 SIQYYQLSKYISEILKGKADANKYKSESINKFEKITVMEAD 790
+ Q + + + K K KS+ + I + +D
Sbjct: 234 ELNKKQAEEAVRQRAKDKISIKTDKSQKSPEDNSIELSPSD 274


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1180    cloacin       30     0.006    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 30.5 bits (68), Expect = 0.006
Identities = 18/64 (28%), Positives = 28/64 (43%), Gaps = 11/64 (17%)

Query: 19 NSEYKKKLEEFKALKKEIAD-----------KKKKEDKKLETFKQLSEEEKKVKLEEEKY 67
++ K F A KE +D +KKKEDKK L++E+ K + + Y
Sbjct: 401 QTDVNNKQAAFDAAAKEKSDADAALSSAMESRKKKEDKKRSAENNLNDEKNKPRKGFKDY 460

Query: 68 KEDF 71
D+
Sbjct: 461 GHDY 464


24  FN1225  FN1253  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
FN1225    2       10       -0.721096  Hypothetical RNA binding protein
FN1226    -1      11       1.750846   Metal dependent hydrolase
FN1227    -1      12       1.533663   Cell division protein ftsI
FN1228    -1      11       1.123662   unknown
FN1229    -1      10       0.579589   Hypothetical protein
FN1230    -1      9        0.553780   Fe-S oxidoreductase
FN1231    0       9        -0.285345  Hypothetical cytosolic protein
FN1232    3       10       -3.065407  RRF2 family protein
FN1233    2       9        -3.323899  Holliday junction DNA helicase ruvB
FN1234    2       9        -3.926307  unknown
FN1235    3       9        -4.042488  Hypothetical protein
FN1236    6       10       -4.748012  Cysteine synthase
FN1237    5       10       -4.949731  Hypothetical protein
FN1238    2       12       -4.689993  Hypothetical protein
FN1239    3       13       -4.867175  Oxygen-insensitive NAD(P)H nitroreductase
FN1240    3       14       -4.831478  2-dehydro-3-deoxyphosphooctonate aldolase
FN1241    3       13       -4.780103  UDP-N-acetylmuramoyl-L-alanyl-D-glutamate--meso-
FN1242    3       13       -4.978886  Uracil-DNA glycosylase
FN1243    3       12       -4.070836  Hypothetical protein
FN1244    3       10       -3.525493  Hypothetical protein
FN1245    -1      10       -1.586618  Hypothetical protein
FN1246    -2      9        0.103077   Hypothetical cytosolic protein
FN1247    -1      9        1.615194   Inosine-5'-monophosphate dehydrogenase
FN1248    1       12       3.701494   Transcriptional regulator, MarR family
FN1249    2       9        3.537317   Putative NAD(P)H oxidoreductase
FN1250    0       9        3.233588   Hypothetical protein
FN1251    1       11       3.096136   Ankyrin repeat proteins
FN1252    1       11       2.563831   Choline transport protein
FN1253    2       11       1.846295   Choline kinase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
FN1241    PF05704  44     3e-07    Capsular polysaccharide synthesis protein
		>PF05704#Capsular polysaccharide synthesis protein

Length = 307

Score = 43.7 bits (103), Expect = 3e-07
Identities = 24/97 (24%), Positives = 45/97 (46%), Gaps = 2/97 (2%)

Query: 1 MIEKKIHYVWFG--NAKSEKVLKCIESWKKNLPDYEIIEWNEKNFNIEEELKSNKFFREC 58
M +K I W V +C+ S KKN D+++I + N+ ++ R
Sbjct: 66 MRQKYIFICWLQGIEKAPYIVQQCVASVKKNSGDFKVIIIDGNNYKEWVDIPDFLIKRWQ 125

Query: 59 YNRKLWAFVSDYVRVKVLYNYGGIYLDTDMEIIKDIT 95
+ L A+ SD +R+ +L YGG+++D + + +
Sbjct: 126 EGKMLDAWFSDILRLFLLCKYGGLWIDATVYMFDKVP 162


25  FN1281  FN1291  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
FN1281    1       14       3.415378   Outer membrane protein
FN1282    5       19       3.089278   UTP--glucose-1-phosphate uridylyltransferase
FN1283    4       21       2.477192   Hypothetical protein
FN1284    2       14       -0.477493  Methionyl-tRNA synthetase
FN1285    2       10       -1.637333  Hypothetical lipoprotein
FN1286    4       8        -3.264492  Hypothetical cytosolic protein
FN1287    3       9        -3.611928  Protease IV
FN1288    2       8        -3.856341  Transcriptional regulator, TetR family
FN1289    2       7        -3.656485  Outer membrane protein tolC
FN1290    1       9        -3.602036  Acriflavin resistance protein E
FN1291    0       9        -3.691404  Acriflavin resistance protein B
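
Judging only from the summary rows on this page, the Prediction label appears to track the Bias and Virulence flags: Y/Y rows are labelled "Pathogenicity Island (biased-composition)", N/Y rows "Pathogenicity Island (unbiased-composition)", and Y/N rows "Genomic Island". The Python sketch below encodes that observed mapping. It is an illustration inferred from the rows shown here, not PredictBias code; the function name `classify_island` and the "not reported" fallback for N/N are assumptions.

```python
def classify_island(bias: str, virulence: str) -> str:
    """Map the Bias / Virulence flags ('Y' or 'N') of a summary row to a label.

    The mapping is inferred from the rows on this page; the N/N branch is an
    assumption, since no such row is listed.
    """
    biased = bias.strip().upper() == "Y"
    virulent = virulence.strip().upper() == "Y"
    if biased and virulent:
        return "Pathogenicity Island (biased-composition)"
    if virulent:
        return "Pathogenicity Island (unbiased-composition)"
    if biased:
        return "Genomic Island"
    return "not reported"  # assumption: this case does not occur on the page


# Flags copied from summary rows 24, 25 and 30 of this report.
rows = {24: ("Y", "Y"), 25: ("Y", "N"), 30: ("N", "Y")}
for number, (bias, virulence) in rows.items():
    print(number, classify_island(bias, virulence))
```

The Insertion elements flag is left out because, in the rows listed here, it does not change the label.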
26  FN1307  FN1317  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
FN1307    2       11       -0.348268  Hypothetical protein
FN1308    -1      11       -0.695130  Hypothetical protein
FN1309    0       10       0.314635   Hypothetical protein
FN1310    0       10       -0.084546  Ribosomal-protein-alanine acetyltransferase
FN1311    0       11       -1.993511  Acetyltransferase
FN1312    3       12       -1.827568  unknown
FN1313    5       12       -1.322804  Methionine aminopeptidase
FN1314    4       13       -1.598233  Adenylate kinase
FN1315    4       11       -1.190245  dTDP-glucose 4,6-dehydratase
FN1316    3       10       -1.044599  Integral membrane protein
FN1317    2       9        -0.428902  ABC transporter ATP-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
FN1310    PF03544  62     1e-13    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 61.9 bits (150), Expect = 1e-13
Identities = 37/184 (20%), Positives = 74/184 (40%), Gaps = 19/184 (10%)

Query: 46 PPKPKVEKKPEEKKPEEKKVEKKPE-KKEIESNIPSKDAKPVEKQPETTTSENTQS-TES 103
P V+ PE E + E PE KE I KP K E + +
Sbjct: 61 EPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVKP 120

Query: 104 ADKVESTPSDNNSSSGGGNTSSTGSGDEGFGSNFISDGDGSYIALSSKGINYQIINEVEP 163
+ ++P +N + + ++++T + + S + ++ +P
Sbjct: 121 VESRPASPFENTAPARPTSSTATAATSKPVTS---VASG------------PRALSRNQP 165

Query: 164 DYPSQAESIGYSKQVKVTVKFLVGLKGNVEKAEITQSHKDLGFDAEVMKAIKKWKFKPIY 223
YP++A+++ + +V VKF V G V+ +I + F+ EV A+++W+++P
Sbjct: 166 QYPARAQALR--IEGQVKVKFDVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGK 223

Query: 224 HNGK 227

Sbjct: 224 PGSG 227


27  FN1326  FN1337  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
FN1326    2       11       -0.665262  TonB protein
FN1327    3       9        0.084420   Biopolymer transport exbD protein
FN1328    2       9        -0.135918  Biopolymer transport exbB protein
FN1329    2       9        0.333993   Oligopeptide-binding protein oppA
FN1330    2       9        0.482501   unknown
FN1331    2       9        0.149722   Hypothetical protein
FN1332    -1      7        -0.457484  NGG1-interacting factor 3
FN1333    0       9        -2.071506  RNA polymerase sigma factor
FN1334    0       9        -1.944553  RNA polymerase sigma factor rpoD
FN1335    2       7        -1.613889  DNA primase
FN1336    2       5        -1.139930  Peptidyl-prolyl cis-trans isomerase
FN1337    2       6        -1.122487  Acetoacetate metabolism regulatory protein atoC
28  FN1358  FN1375  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
FN1358    4       13       -0.994714  Holo-[acyl-carrier protein] synthase
FN1359    3       14       -0.452727  seC-independent protein TATD
FN1360    2       12       0.047157   unknown
FN1361    -1      10       0.976525   2-hydroxy-6-oxo-6-phenylhexa-2,4-dienoate
FN1362    0       9        0.782021   Hypothetical cytosolic protein
FN1363    0       12       0.571196   Hypothetical cytosolic protein
FN1364    2       14       0.803341   ABC transporter ATP-binding protein
FN1365    0       14       0.261133   ABC transporter permease protein
FN1366    3       15       -0.397411  Integral membrane protein
FN1367    1       17       -1.543873  15 kDa lipoprotein precursor
FN1368    -1      17       -0.655906  ABC transporter ATP-binding protein
FN1369    -2      15       -0.931890  ABC transporter permease protein
FN1370    -2      13       1.328372   ABC transporter permease protein
FN1371    -1      11       2.272944   Integral membrane protein
FN1372    -1      11       1.427877   unknown
FN1373    0       13       2.088379   Transposase
FN1374    2       15       2.599599   Hypothetical protein
FN1375    2       18       4.265086   Dipeptide-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1363    BINARYTOXINB  28     0.046    Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 27.7 bits (61), Expect = 0.046
Identities = 15/58 (25%), Positives = 24/58 (41%), Gaps = 3/58 (5%)

Query: 58 LKPTMGTILFEGKPLSEIPRKDIQAIFQDPYSALNPSLKIGEILEEPLIA---NGIFQ 112
++ T I+F GK L+ + R+ DP P + + E L+ NG Q
Sbjct: 513 IQETTARIIFNGKDLNLVERRIAAVNPSDPLETTKPDMTLKEALKIAFGFNEPNGNLQ 570


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
FN1374    HTHFIS  54     2e-10    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 53.7 bits (129), Expect = 2e-10
Identities = 9/49 (18%), Positives = 23/49 (46%)

Query: 219 KVTMNDVEKDRLIEGLKRNGFSLSNTAKDLGMSRTTLWRKLKKFNIIIE 267
+ ++E ++ L + A LG++R TL +K+++ + +
Sbjct: 430 DRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVSVY 478


29  FN1384  FN1428  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
FN1384    2       21       2.111999   COMF operon protein 3
FN1385    2       19       2.452195   Zinc finger protein
FN1386    2       21       4.134335   Endonuclease
FN1387    5       30       8.308161   Ribonuclease HII
FN1391    1       11       -0.219049  Ribosomal large subunit pseudouridine synthase
FN1392    0       8        0.490701   regulatory protein
FN1393    0       9        1.885950   Transcriptional regulator
FN1394    -1      10       -1.042605  Citrate-sodium symport
FN1395    -1      10       -3.019890  Oxaloacetate decarboxylase alpha chain
FN1396    -1      12       -0.762605  CITG protein
FN1397    0       12       -0.693085  Citrate lyase acyl carrier protein
FN1398    1       12       -0.134424  Citrate lyase beta chain
FN1399    0       11       -1.004822  Citrate lyase beta chain
FN1400    1       11       -0.199033  unknown
FN1401    3       13       2.762074   ATPase
FN1402    3       12       1.534589   DNA polymerase III alpha subunit
FN1403    1       13       1.912438   IAA acetyltransferase
FN1404    1       11       1.436800   Hypothetical protein
FN1405    1       12       2.324392   SWF/SNF family helicase
FN1406    1       12       2.513622   Metal dependent hydrolase
FN1407    -2      10       1.257121   LSU Ribosomal RNA
FN1408    -1      9        1.237247   5S Ribosomal RNA
FN1409    3       9        1.511590   SSU Ribosomal RNA
FN1410    1       10       1.023830   Acetyltransferase
FN1411    -1      11       -0.477493  SSU ribosomal protein S16P
FN1412    0       9        -0.684067  Signal recognition particle, subunit FFH/SRP54
FN1413    0       8        -0.583923  Signal recognition particle associated protein
FN1414    -1      9        -0.033550  unknown
FN1415    -1      10       0.310689   unknown
FN1416    0       13       2.126613   Glutaminase
FN1417    1       12       3.435816   Amino acid carrier protein alsT
FN1418    1       12       3.332255   Hypothetical cytosolic protein
FN1419    2       12       4.640722   serine/threonine kinase
FN1420    2       11       4.436240   Urocanate hydratase
FN1421    1       9        4.805104   Hypothetical membrane-spanning protein
FN1422    0       10       3.996804   Histidine permease
FN1423    -1      10       3.851591   Imidazolonepropionase
FN1424    -2      12       3.961285   Formiminotetrahydrofolate cyclodeaminase
FN1425    -2      12       2.872104   Histidine ammonia-lyase
FN1426    -1      12       3.166144   Glutamate formiminotransferase
FN1427    0       10       1.863965   Aminoacyl-histidine dipeptidase
FN1428    2       14       2.094264   Transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1384    SACTRNSFRASE  33     2e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 32.6 bits (74), Expect = 2e-04
Identities = 21/80 (26%), Positives = 35/80 (43%), Gaps = 5/80 (6%)

Query: 11 MYLLDDNGIKGECVVTDEGEGILEIKNIAIEPDCQRKGYGKALIDFIAK--KYRGQYSI- 67
+Y L++N I G + G I++IA+ D ++KG G AL+ + K +
Sbjct: 69 LYYLENNCI-GRIKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLM 127

Query: 68 LQVGTGDSPMIISFYEKCGF 87
L+ + FY K F
Sbjct: 128 LETQDINISA-CHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1391    SACTRNSFRASE  40     3e-07    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 39.5 bits (92), Expect = 3e-07
Identities = 15/64 (23%), Positives = 34/64 (53%), Gaps = 1/64 (1%)

Query: 4 IHSEGKGFYIYDENKEILARLEYKKNDN-VLTFDHTVVSDKLKGQGIAQKLLDEAVDYAR 62
+ EGK ++Y + R++ + N N + V+ + +G+ LL +A+++A+
Sbjct: 60 VEEEGKAAFLYYLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAK 119

Query: 63 KNNF 66
+N+F
Sbjct: 120 ENHF 123


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
FN1393    TCRTETOQM  30     0.027    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 29.8 bits (67), Expect = 0.027
Identities = 22/98 (22%), Positives = 41/98 (41%), Gaps = 6/98 (6%)

Query: 288 RLVSRILGMGDVVSLVEKAQEVIDENEAKSLEEKIKSQKFDLNDFLKQLQTIKRLGSLGG 347
RL S +L + D V + EK + I E E K K + + +L S+ G
Sbjct: 269 RLYSGVLHLRDSVRISEKEKIKITEMYTSINGELCKIDKAYSGEIVILQNEFLKLNSVLG 328

Query: 348 ILKLIPGMPKIDDLAPAEKEMKKVEAIIQSMTKEERKK 385
KL+P +I++ P ++ ++ ++R+
Sbjct: 329 DTKLLPQRERIENPLPL------LQTTVEPSKPQQREM 360


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
FN1400    HTHFIS  32     0.016    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.7 bits (72), Expect = 0.016
Identities = 11/33 (33%), Positives = 22/33 (66%), Gaps = 3/33 (9%)

Query: 269 SNKSILITGESGIGKTILKKEILN---RNSGKF 298
++ +++ITGESG GK ++ + + + R +G F
Sbjct: 159 TDLTLMITGESGTGKELVARALHDYGKRRNGPF 191


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
FN1426    IGASERPTASE  43     5e-06    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 43.1 bits (101), Expect = 5e-06
Identities = 55/263 (20%), Positives = 90/263 (34%), Gaps = 21/263 (7%)

Query: 593 NPVEYIGKNAEASTKNVAENVE--NVFQDLDKKVMSGTATKEELAMGAIVQNMTTMGFTS 650
N V +N +T N E N ++ ++ + E A + + +T+
Sbjct: 1193 NSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTS-SNDRSTVALCD 1251

Query: 651 ATEMMSGEIYASAQALTFSQAQNINRDLSNRLAGLDNFKNSNKDSEVWFSAIGSGGKLKR 710
T + + + A+A A N+ + +S ++ L N+ VW S
Sbjct: 1252 LTSTNTNAVLSDARAKAQFVALNVGKAVSQHISQL--EMNNEGQYNVWVSNTSMNKNYSS 1309

Query: 711 DGYASADTRVTGGQFGIDTKYKGTTTLGVAMNYSYAKANFNRYAGESKSDMVGVSFYAKQ 770
Y ++ T Q G D LG Y NF++ SK+ + V+FY+K
Sbjct: 1310 SQYRRFSSKSTQTQLGWDQTISNNVQLGGVFTYVRNSNNFDKAT--SKNTLAQVNFYSKY 1367

Query: 771 DLPYGFYTAGRLGLSNISSKVERELLTSTGETVTGKIKHHDKMLSAYVEIGKKF--GWF- 827
+Y LG SK++ K + GK F G F
Sbjct: 1368 YADNHWYLGIDLGYGKFQSKLQTN----------HNAKFARHTAQFGLTAGKAFNLGNFG 1417

Query: 828 -TPFIGYSQDYLRRGSFNESEAS 849
TP +G YL F +A
Sbjct: 1418 ITPIVGVRYSYLSNADFALDQAR 1440


30  FN1808  FN1813  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
FN1808    0       12       0.341189   Hypothetical protein
FN1809    -1      11       -0.090174  Iron/zinc/copper-binding protein
FN1810    -1      9        -0.086824  Manganese transport system membrane protein
FN1811    -3      11       -0.777118  Manganese transport system ATP-binding protein
FN1812    0       12       -0.039939  Manganese-binding protein
FN1813    1       13       -0.238078  Manganese-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1808    CHLAMIDIAOMP  26     0.040    Chlamydia major outer membrane protein signature.
		>CHLAMIDIAOMP#Chlamydia major outer membrane protein signature.

Length = 393

Score = 26.1 bits (57), Expect = 0.040
Identities = 20/54 (37%), Positives = 24/54 (44%), Gaps = 7/54 (12%)

Query: 73 KPATEKYEVYFNAGEGHVVSKKG------P-ALTAGEKANWDKATASFDFGEWK 119
KP E+ V NA E + KG P AL AG A AS D+ EW+
Sbjct: 219 KPKVEELNVLCNAAEFTINKPKGYVGQEFPLALIAGTDAATGTKDASIDYHEWQ 272


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits      Score  E-value  Comments
FN1809    adhesinb  56     4e-11    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 55.6 bits (134), Expect = 4e-11
Identities = 51/274 (18%), Positives = 97/274 (35%), Gaps = 30/274 (10%)

Query: 22 YSLTSYLTKGTDIKVYTPFGSDISMTMSKESIREEGFNLAVAKKA--QAVVDIAKIWPED 79
+L S + G D Y P D+ T + I G NL A +V+ AK
Sbjct: 55 INLHSIVPVGQDPHEYEPLPEDVKKTSQADLIFYNGINLETGGNAWFTKLVENAKKKENK 114

Query: 80 VIYGKARMNKINIVEIDASHPYDEKMTTLFFSDYSNGKVNPYIWTGSKNLVRMVNIIGRD 139
Y A ++++ + GK +P+ W +N + I +
Sbjct: 115 DYY--AVSEGVDVIYL-EGQSE-------------KGKEDPHAWLNLENGIIYAQNIAKR 158

Query: 140 LIRLYPQNKAKIEKNITKFTADLLKIENEANEKLLAVGDA--EVISLSENLQYFLNDMNI 197
L P NK EKN+ + L ++ EA EK + +++ +YF N+
Sbjct: 159 LSEKDPANKETYEKNLKAYVEKLSALDKEAKEKFNNIPGEKKMIVTSEGCFKYFSKAYNV 218

Query: 198 YTEYV-------DYDSVNAQNIAKLIKDKGIKVIVSDRWLKKDAIKAL-KEAGGEFVVIN 249
+ Y+ + + + + ++ + + + + +K + K+
Sbjct: 219 PSAYIWEINTEEEGTPDQIKTLVEKLRKTKVPSLFVESSVDDRPMKTVSKDTNIPIYAKI 278

Query: 250 TLDIPMDKDGKMDPEAILKAFKENTDNLIEALSK 283
D +K + D + K N + + E LSK
Sbjct: 279 FTDSVAEKGEEGD--SYYSMMKYNLEKIAEGLSK 310


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1812    ADHESNFAMILY  142    2e-42    Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 142 bits (359), Expect = 2e-42
Identities = 65/287 (22%), Positives = 138/287 (48%), Gaps = 11/287 (3%)

Query: 14 SFSAFAKEKLKIGVTLQPYYSFAANIVKDKAEVVPVVRLDQYDSHSYQPKPEDIKRMNTL 73
+ +KLK+ T NI DK ++ +V D H Y+P PED+K+ +
Sbjct: 24 KKDTTSGQKLKVVATNSIIADITKNIAGDKIDLHSIVP-IGQDPHEYEPLPEDVKKTSEA 82

Query: 74 DVLIVNGV----GHDEFIFDILNAADRKKDIKVIYANKNVSLMPIAGSIRAEKVMNPHTF 129
D++ NG+ G + + ++ A + ++ + V ++ + G K +PH +
Sbjct: 83 DLIFYNGINLETGGNAWFTKLVENAKKTENKDYFAVSDGVDVIYLEGQNEKGK-EDPHAW 141

Query: 130 ISITTSIQQVYNIAKELGEIDPANKEFYLKNARDYAKKLRKLKTDALNEVKHLGNIDIRV 189
+++ I NIAK+L DP NKEFY KN ++Y KL KL ++ ++ + +
Sbjct: 142 LNLENGIIFAKNIAKQLSAKDPNNKEFYEKNLKEYTDKLDKLDKESKDKFNKIPAEKKLI 201

Query: 190 ATLHGGYDYLLSEFGIDVKAVIEPSHGAQPSAADLEKVIKIIKNEKIDIIFGEKNFNNKF 249
T G + Y +G+ + E + + + ++ +++ ++ K+ +F E + +++
Sbjct: 202 VTSEGAFKYFSKAYGVPSAYIWEINTEEEGTPEQIKTLVEKLRQTKVPSLFVESSVDDRP 261

Query: 250 VDTIHKETGV----EVRSLSHMTNGPYEVDSFEKFIKIDLDEVVKAI 292
+ T+ ++T + ++ + S G E DS+ +K +LD++ + +
Sbjct: 262 MKTVSQDTNIPIYAQIFTDSIAEQGK-EGDSYYSMMKYNLDKIAEGL 307


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits      Score  E-value  Comments
FN1813    adhesinb  143    1e-42    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 143 bits (361), Expect = 1e-42
Identities = 73/310 (23%), Positives = 138/310 (44%), Gaps = 21/310 (6%)

Query: 1 MKRIIFIVFLIFNTFLL------------GQEKLKVGITLLPYYSFVANIVKDKAEVIPI 48
MK+ F+V L+ L G KL V T NI DK + I
Sbjct: 1 MKKCRFLVLLLLAFVGLAACSSQKSSTETGSSKLNVVATNSIIADITKNIAGDKINLHSI 60

Query: 49 VKAESFDSHTYQPKVEDIERASKVDVVVVNGI----GHDEFIYKILDAVDKNKRPVIINA 104
V D H Y+P ED+++ S+ D++ NGI G + + K+++ K +
Sbjct: 61 VPV-GQDPHEYEPLPEDVKKTSQADLIFYNGINLETGGNAWFTKLVENAKKKENKDYYAV 119

Query: 105 NKDVPLMPVAGTLNDEKIMDSHTFISITAAIQQVHNITKELVKLDPKNKDFYLANSREYV 164
++ V ++ + G +++ D H ++++ I NI K L + DP NK+ Y N + YV
Sbjct: 120 SEGVDVIYLEGQ-SEKGKEDPHAWLNLENGIIYAQNIAKRLSEKDPANKETYEKNLKAYV 178

Query: 165 KKLRKLKTDALKEVQNVNGEDVKVATFLGGYNYLLAEFGIDVKAVLEPTHGSQISMSSLQ 224
+KL L +A ++ N+ GE + T G + Y + + + E + + ++
Sbjct: 179 EKLSALDKEAKEKFNNIPGEKKMIVTSEGCFKYFSKAYNVPSAYIWEINTEEEGTPDQIK 238

Query: 225 KIIEKIKKDKVDIIFGEKNYSDEYVTIIKNETGIEVRK---LEHLTTGAYTADSFEKFIK 281
++EK++K KV +F E + D + + +T I + + + DS+ +K
Sbjct: 239 TLVEKLRKTKVPSLFVESSVDDRPMKTVSKDTNIPIYAKIFTDSVAEKGEEGDSYYSMMK 298

Query: 282 VDLDEVVSAI 291
+L+++ +
Sbjct: 299 YNLEKIAEGL 308


31  FN1831  FN1836  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
FN1831    3       10       -0.140196  Nitrogen assimilation regulatory protein
FN1832    4       11       -0.592977  TonB protein
FN1833    3       10       -0.204840  Biopolymer transport exbD protein
FN1834    1       9        1.035774   Biopolymer transport exbB protein
FN1835    1       7        1.310163   Hypothetical protein
FN1836    0       7        1.173222   Tetratricopeptide repeat family protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
FN1831    HTHFIS  337    e-114    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 337 bits (866), Expect = e-114
Identities = 122/450 (27%), Positives = 239/450 (53%), Gaps = 10/450 (2%)

Query: 23 DLVFVENIMSFMEAIKSRKYEAIVIDERNSKEEALINLIVKVTEIQKKAVIIILGETSNW 82
D+ N + I + + +V D E A +L+ ++ + + ++++ + +
Sbjct: 29 DVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF-DLLPRIKKARPDLPVLVMSAQNTF 87

Query: 83 RIIAGSIKAGAYDYILKPELPRTIVRIVEKSVKDYKGLVERVDKTKSTGEKLIGRSKLMI 142
+ + GAYDY+ KP ++ I+ +++ + K +++ G L+GRS M
Sbjct: 88 MTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQ 147

Query: 143 DLYKVIGKVANNSAPVLVTGERGTGKTSVAKAIHQFSNVYDKPLISINCNSYRANLLERK 202
++Y+V+ ++ +++TGE GTGK VA+A+H + + P ++IN + +L+E +
Sbjct: 148 EIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESE 207

Query: 203 LFGYEKGSFEGAAFSQYGELEKAEGGILHLANIESLSLDMQSKILFLLEENKFLRLGGME 262
LFG+EKG+F GA G E+AEGG L L I + +D Q+++L +L++ ++ +GG
Sbjct: 208 LFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRT 267

Query: 263 PINAFVRIIASTSVNLEELIEKGLFIDELYRKLKVLEIDIPNLKERKEDIPFIIDHYMAE 322
PI + VRI+A+T+ +L++ I +GLF ++LY +L V+ + +P L++R EDIP ++ H++ +
Sbjct: 268 PIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQ 327

Query: 323 CNQEMNKNIKGVTKVALKKIMRYDWPGNVNELKNAIKYAVAMCRGSSILIEDLPPNVVGE 382
+E ++K + AL+ + + WPGNV EL+N ++ A+ I E + + E
Sbjct: 328 AEKE-GLDVKRFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVITREIIENELRSE 386

Query: 383 --------KILNGKEESKTVSIENLVKNEINQLRSKNKKRDYYFEIISKIEKEIIKQVLE 434
S + ++E ++ Y +++++E +I L
Sbjct: 387 IPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAALT 446

Query: 435 ITNGKKVETAEILGITRNTLRTKMNYYDLE 464
T G +++ A++LG+ RNTLR K+ +
Sbjct: 447 ATRGNQIKAADLLGLNRNTLRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
FN1832    PF03544  54     6e-11    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 53.8 bits (129), Expect = 6e-11
Identities = 33/178 (18%), Positives = 63/178 (35%), Gaps = 14/178 (7%)

Query: 56 EEKPEKKVEEDKKAEKTVQVEDKPKTTPKKEKPSLADLKKQISDSQPKTSNGGFSPSADP 115
E +PE + + E V +E K KP +++ D +P S P++
Sbjct: 75 EPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPV-KKVEQPKRDVKPVES----RPASPF 129

Query: 116 DGEEVVDRVLQNVTYSNGLVSGSKMGNSEGGLLVDWNDSNRAPEFPQSARASGKHGKIKI 175
+ + + S + G + S P++P A+A G++K+
Sbjct: 130 ENTAPARP---TSSTATAATSKPVTSVASGPRAL----SRNQPQYPARAQALRIEGQVKV 182

Query: 176 KLKVDKAGNVLSYVIVEGSGVPEIDASVERVVGSWRVKLLKKGKPVNGTFYLNYNFNF 233
K V G V + I+ + V+ + WR + K G + + + N
Sbjct: 183 KFDVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGI--VVNILFKING 238


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
FN1835    PF07675  30     0.002    Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 30.1 bits (67), Expect = 0.002
Identities = 23/80 (28%), Positives = 35/80 (43%), Gaps = 3/80 (3%)

Query: 33 ENVATTELTKQVTSENNQQLDVKEIDTEDLILQNQNLESSSVNITGENLKE--NGDKVKV 90
VAT +TKQ+T N + + + +I Q Q E S NL G KV +
Sbjct: 292 SGVATVNMTKQITENGNYDVVITRSNYLPVIKQIQAGEPSPYQPV-SNLTATAQGQKVTL 350

Query: 91 NQENSATLEEELSRGVEKKG 110
+ + + E SR V++ G
Sbjct: 351 KWDAPSAKKAEGSREVKRIG 370


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1836    SYCDCHAPRONE  33     0.003    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 33.0 bits (75), Expect = 0.003
Identities = 14/34 (41%), Positives = 24/34 (70%)

Query: 300 PKLIYGEAYSLYKNGKYEEALKKFQSLKNSDYYN 333
+ +Y A++ Y++GKYE+A K FQ+L D+Y+
Sbjct: 36 LEQLYSLAFNQYQSGKYEDAHKVFQALCVLDHYD 69



Score = 29.9 bits (67), Expect = 0.026
Identities = 21/88 (23%), Positives = 37/88 (42%), Gaps = 5/88 (5%)

Query: 131 FAVAQNFLAKENNEAAQKAYKEIIDNKYENYKESMMGLGIVYYNLKDYDKAIYWLSEFSK 190
+++A N E A K ++ + + + + +GLG + YD AI+ +S
Sbjct: 40 YSLAFNQYQSGKYEDAHKVFQALCVLDHYDSR-FFLGLGACRQAMGQYDLAIH---SYSY 95

Query: 191 EMPKENKE-MVSYLRASALYRKGNTDEA 217
+ KE + A L +KG EA
Sbjct: 96 GAIMDIKEPRFPFHAAECLLQKGELAEA 123



Score = 29.5 bits (66), Expect = 0.039
Identities = 20/93 (21%), Positives = 33/93 (35%), Gaps = 6/93 (6%)

Query: 543 YLNRVRNYFLAERYNEAVQAGEQYLSKLSPDKEKVIYSEMLDKIGLSYFRLGKYDQARSY 602
+ N + + +Y +A + Q L L + +G +G+YD A
Sbjct: 39 LYSLAFNQYQSGKYEDAHKV-FQALCVLDHYDSRFFLG-----LGACRQAMGQYDLAIHS 92

Query: 603 YSKIASMKGYEVYGKFQIADSYYNEKNYEKAGS 635
YS A M E F A+ + +A S
Sbjct: 93 YSYGAIMDIKEPRFPFHAAECLLQKGELAEAES 125


32  FN2042  FN2062  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
FN2042    1       17       2.687353   Protein yicC
FN2044    1       19       2.510966   DNA-directed RNA polymerase beta' chain
FN2045    1       19       2.594127   DNA-directed RNA polymerase beta chain
FN2046    0       20       3.349729   LSU ribosomal protein L12P (L7/L12)
FN2047    1       23       3.679925   LSU ribosomal protein L10P
FN2048    4       24       3.337563   LSU ribosomal protein L1P
FN2049    2       20       3.085120   LSU ribosomal protein L11P
FN2050    0       18       2.446669   Transcription antitermination protein nusG
FN2051    0       16       2.037601   Protein translocase subunit SecE
FN2052    -2      14       0.941280   *LSU ribosomal protein L33P
FN2053    -1      14       2.806453   Ferric uptake regulation protein
FN2054    -1      16       2.548205   Acetyltransferase
FN2055    -1      18       2.630813   Fusobacterium outer membrane protein family
FN2056    0       19       3.106113   Outer membrane protein
FN2057    1       20       3.753494   unknown
FN2058    1       21       3.922380   Hypothetical membrane-spanning protein
FN2059    8       31       2.154609   unknown
FN2060    9       21       1.852778   unknown
FN2061    10      16       1.844605   Serine/threonine sodium symporter
FN2062    7       15       1.327767   Glucose-6-phosphate isomerase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN2042    SECETRNLCASE  37     6e-07    Bacterial translocase SecE signature.
		>SECETRNLCASE#Bacterial translocase SecE signature.

Length = 127

Score = 36.8 bits (85), Expect = 6e-07
Identities = 19/55 (34%), Positives = 33/55 (60%)

Query: 3 LFQKVKMEYSKVEWPSRTEVIHSTLWVVTMTVLVSIYLGIFDILAVRALNFLEAL 57
++ + E KV WP+R E +H+TL V +T ++S+ L D + VR ++F+ L
Sbjct: 71 FAREARTEVRKVIWPTRQETLHTTLIVAAVTAVMSLILWGLDGILVRLVSFITGL 125


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN2046    SACTRNSFRASE  33     2e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 33.4 bits (76), Expect = 2e-04
Identities = 19/84 (22%), Positives = 36/84 (42%), Gaps = 1/84 (1%)

Query: 43 GLVFVIKENDKIVCIVEYMQVFNKKSLFLYGISTLKEYRHKGYGNYILNETEKILKNLSY 102
F+ + + ++ +N + I+ K+YR KG G +L++ + K +
Sbjct: 65 KAAFLYYLENNCIGRIKIRSNWNGY-ALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHF 123

Query: 103 EEIELTVAPENDIAINFYKKHGYI 126
+ L N A +FY KH +I
Sbjct: 124 CGLMLETQDINISACHFYAKHHFI 147


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
FN2048    OMPADOMAIN  109    1e-31    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 109 bits (273), Expect = 1e-31
Identities = 37/113 (32%), Positives = 60/113 (53%), Gaps = 13/113 (11%)

Query: 38 FDFDKSNVKPQYYDLLNNIKEFVEQ---NNYEITIVGHTDSIGSNAYNFKLSRRRAESVK 94
F+F+K+ +KP+ L+ + + + + ++G+TD IGS+AYN LS RRA+SV
Sbjct: 223 FNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQSVV 282

Query: 95 AKLLEFGLSEDRIVGIEAMGEEQPIATN---------ATKEGRAQNRRVEFKL 138
L+ G+ D+I MGE P+ N A + A +RRVE ++
Sbjct: 283 DYLISKGIPADKI-SARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
FN2050    IGASERPTASE  30     0.002    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.4 bits (68), Expect = 0.002
Identities = 17/68 (25%), Positives = 26/68 (38%), Gaps = 5/68 (7%)

Query: 24 EEYDKMQAEKAKEAERMAKENPQATEVVGENGEVVVTEGEEVAMAPKKSEKDMTESERMD 83
E E AKEA+ K N Q EV E +E K + + E+
Sbjct: 1059 TETTAQNREVAKEAKSNVKANTQTNEVAQSGSET-----KETQTTETKETATVEKEEKAK 1113

Query: 84 VEVQRIKK 91
VE ++ ++
Sbjct: 1114 VETEKTQE 1121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
FN2051    IGASERPTASE  27     0.039    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 27.3 bits (60), Expect = 0.039
Identities = 30/154 (19%), Positives = 49/154 (31%), Gaps = 16/154 (10%)

Query: 24 APDVTSGTT-MSAEEQKEAMDILDRMREKIEKEEAEKAKLIAEAKELGMSPSEVASMDNV 82
AP S TT AE K+ E K +A E EVA
Sbjct: 1029 APATPSETTETVAENSKQ--------------ESKTVEKNEQDATETTAQNREVAKEAKS 1074

Query: 83 EEMLEAKRAAEAKPKTEAEKLELTRKKALNKLDFYERVVRSVAREENEVSDYYGVMGEEK 142
+ A+ +E ++ + T K ++ E + + EV + ++
Sbjct: 1075 NVKANTQTNEVAQSGSETKETQTTETKETATVE-KEEKAKVETEKTQEVPKVTSQVSPKQ 1133

Query: 143 QRSTVYLGTAEAAAEQQVEQNAAPAEIQPETPEE 176
++S AE A E N + Q T +
Sbjct: 1134 EQSETVQPQAEPARENDPTVNIKEPQSQTNTTAD 1167


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN2054    ACRIFLAVINRP  31     0.013    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 31.0 bits (70), Expect = 0.013
Identities = 30/179 (16%), Positives = 54/179 (30%), Gaps = 21/179 (11%)

Query: 42 NNFLGWLDLPINYDKEEFSRIKKASEKIKADSDVLIVIGIGGS-YLGARAVIECLSHSFF 100
F GW + ++ ++ ++ + + G L R L SF
Sbjct: 509 GGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYLLIYALIVAGMVVLFLR-----LPSSFL 563

Query: 101 NSLNK----------EKRNAPEIYFAGQNISGRYLKDLIEIIGDRDFSVNIISKSGTTTE 150
++ ++ YLK+ + F+VN S SG
Sbjct: 564 PEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNEKANVE-SVFTVNGFSFSGQAQN 622

Query: 151 PAIAFRVFKELLENKYGEKAKDRIYVTTDKNKGALKKLADEKGYEKFVIPDDVGGRFSV 209
+AF K E E + + + + K L K+ D F +P V +
Sbjct: 623 AGMAFVSLKPWEERNGDENSAEAVI---HRAKMELGKIRDGFVIP-FNMPAIVELGTAT 677


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
FN2056    PF04183  27     0.048    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 27.2 bits (60), Expect = 0.048
Identities = 9/32 (28%), Positives = 15/32 (46%)

Query: 68 DEYDSPLVSAYINGYYESAKDFFKTFYSTVLK 99
DE + PL AYI+ A+ + + V+
Sbjct: 375 DENNQPLAGAYIDRSGLDAETWLTQLFRVVVV 406


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
FN2059    OMPADOMAIN  109    1e-31    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 109 bits (273), Expect = 1e-31
Identities = 37/113 (32%), Positives = 60/113 (53%), Gaps = 13/113 (11%)

Query: 38 FDFDKSNVKPQYYDLLNNIKEFVEQ---NNYEITIVGHTDSIGSNAYNFKLSRRRAESVK 94
F+F+K+ +KP+ L+ + + + + ++G+TD IGS+AYN LS RRA+SV
Sbjct: 223 FNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQSVV 282

Query: 95 AKLLEFGLSEDRIVGIEAMGEEQPIATN---------ATKEGRAQNRRVEFKL 138
L+ G+ D+I MGE P+ N A + A +RRVE ++
Sbjct: 283 DYLISKGIPADKI-SARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
FN2061    IGASERPTASE  30     0.002    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.4 bits (68), Expect = 0.002
Identities = 17/68 (25%), Positives = 26/68 (38%), Gaps = 5/68 (7%)

Query: 24 EEYDKMQAEKAKEAERMAKENPQATEVVGENGEVVVTEGEEVAMAPKKSEKDMTESERMD 83
E E AKEA+ K N Q EV E +E K + + E+
Sbjct: 1059 TETTAQNREVAKEAKSNVKANTQTNEVAQSGSET-----KETQTTETKETATVEKEEKAK 1113

Query: 84 VEVQRIKK 91
VE ++ ++
Sbjct: 1114 VETEKTQE 1121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
FN2062    IGASERPTASE  27     0.039    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 27.3 bits (60), Expect = 0.039
Identities = 30/154 (19%), Positives = 49/154 (31%), Gaps = 16/154 (10%)

Query: 24 APDVTSGTT-MSAEEQKEAMDILDRMREKIEKEEAEKAKLIAEAKELGMSPSEVASMDNV 82
AP S TT AE K+ E K +A E EVA
Sbjct: 1029 APATPSETTETVAENSKQ--------------ESKTVEKNEQDATETTAQNREVAKEAKS 1074

Query: 83 EEMLEAKRAAEAKPKTEAEKLELTRKKALNKLDFYERVVRSVAREENEVSDYYGVMGEEK 142
+ A+ +E ++ + T K ++ E + + EV + ++
Sbjct: 1075 NVKANTQTNEVAQSGSETKETQTTETKETATVE-KEEKAKVETEKTQEVPKVTSQVSPKQ 1133

Query: 143 QRSTVYLGTAEAAAEQQVEQNAAPAEIQPETPEE 176
++S AE A E N + Q T +
Sbjct: 1134 EQSETVQPQAEPARENDPTVNIKEPQSQTNTTAD 1167


33  FN0075  FN0079  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
FN0075    1       9        2.853277   Isoleucyl-tRNA synthetase
FN0076    1       11       3.591244   Lipoprotein signal peptidase
FN0077    2       12       3.865912   Glycyl-tRNA synthetase alpha chain
FN0078    4       15       5.495304   Glycyl-tRNA synthetase beta chain
FN0079    4       16       5.830551   GTP cyclohydrolase I
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
FN0075    TCRTETOQM  26     0.047    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 26.4 bits (58), Expect = 0.047
Identities = 23/78 (29%), Positives = 32/78 (41%), Gaps = 12/78 (15%)

Query: 37 KSKIIDTPGEYVENKMYYKSLLVLSADAKIIVLVQSAIDGATLFPPKFSTMFPKKEVIGL 96
K IIDTPG Y+SL VL +L+ SA DG + +F +G+
Sbjct: 69 KVNIIDTPGHMDFLAEVYRSLSVLDG----AILLISAKDGVQ---AQTRILFHALRKMGI 121

Query: 97 -----VTKIDLADADIER 109
+ KID D+
Sbjct: 122 PTIFFINKIDQNGIDLST 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
FN0076    HTHFIS  60     4e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 60.2 bits (146), Expect = 4e-13
Identities = 24/113 (21%), Positives = 48/113 (42%), Gaps = 3/113 (2%)

Query: 4 RVVVVEDETLTRIDLIEILKENGYDVVGEAADGIEAVEVCKKLQPDIVLLDIKIPYISGL 63
++V +D+ R L + L GYDV ++ D+V+ D+ +P +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 64 KVANILKEEGFKGCVIILTAYNIAEYIQEASNTIVMGYILKP--IDEVIFLER 114
+ +K+ V++++A N +AS Y+ KP + E+I +
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
FN0077    PF06580  43     2e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 42.5 bits (100), Expect = 2e-06
Identities = 28/187 (14%), Positives = 72/187 (38%), Gaps = 28/187 (14%)

Query: 279 NNLQTVASLLRIQKRRVKNAETKKILDETINRILSIAITHEILSATGMDTISIKHILEIL 338
N L + +L+ + +++L LS + L + +S+ L +
Sbjct: 177 NALNNIRALILED-----PTKAREMLTS-----LS-ELMRYSLRYSNARQVSLADELT-V 224

Query: 339 CQNYFK-NNVDKSKKIEFNINGDEFSISSDKATSVALVVNEIVQNATEHAFIGR-DSGKV 396
+Y + ++ +++F + + ++V +V+N +H GK+
Sbjct: 225 VDSYLQLASIQFEDRLQFENQINP---AIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKI 281

Query: 397 IIKILKGEKFSKIKISDNGVGMEVN-RETNNMGLLIISSLVKDKLKG------NLEIRSK 449
++K K +++ + G N +E+ GL V+++L+ +++ K
Sbjct: 282 LLKGTKDNGTVTLEVENTGSLALKNTKESTGTGL----QNVRERLQMLYGTEAQIKLSEK 337

Query: 450 KDKGTTI 456
+ K +
Sbjct: 338 QGKVNAM 344


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
FN0079    MICOLLPTASE  29     0.037    Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 29.3 bits (65), Expect = 0.037
Identities = 14/52 (26%), Positives = 25/52 (48%), Gaps = 3/52 (5%)

Query: 55 TLKDLKENPAVPYEEDE-VTRIIIDDLNLQIYDEIKDW-TVSDLREWLLSEE 104
+L + +N VP DE V D+N +I ++IK+ + DL + +
Sbjct: 639 SLLNNIDNLDVPLVSDEYVNGHEAKDIN-EITNDIKEVSNIKDLSSNVEKSQ 689


34  FN0216  FN0222  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
FN0216-280.502009Glutaconyl-CoA decarboxylase A subunit
FN0217-170.712724Sodium/glutamate symport carrier protein
FN0218-170.011322Activator of (R)-2-hydroxyglutaryl-CoA
FN0219-270.131973(R)-2-hydroxyglutaryl-CoA dehydratase
FN0220-170.814149(R)-2-hydroxyglutaryl-CoA dehydratase
FN0221-191.530667Hypothetical cytosolic protein
FN0222-380.514795Transcriptional regulator, COPG family
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
FN0216  DHBDHDRGNASE  74  1e-17  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 73.5 bits (180), Expect = 1e-17
Identities = 51/192 (26%), Positives = 81/192 (42%), Gaps = 6/192 (3%)

Query: 3 IFIVGGSSGIGLSLAKRYASLGNEVAICGTNEEKLKKIEECNKNI----KIYKVDVRNKE 58
FI G + GIG ++A+ AS G +A N EKL+K+ K + + DVR+
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDSA 70

Query: 59 ELKSAIDDFSK--GNLDLIINSAGIYTNNRTTKLTDKEAYAMIDINLTGVLNTFEAVRDV 116
+ + G +D+++N AG+ L+D+E A +N TGV N +V
Sbjct: 71 AIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSKY 130

Query: 117 MFKNNRGHIAIISSVAGLLDYPKASVYARTKMTIMGVCETYRSFFRNYNINITTIVPGYI 176
M G I + S + + YA +K + + YNI + PG
Sbjct: 131 MMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPGST 190

Query: 177 ATDKLKSLSEED 188
TD SL ++
Sbjct: 191 ETDMQWSLWADE 202


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
FN0219  HTHFIS  71  1e-16  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 71.4 bits (175), Expect = 1e-16
Identities = 26/107 (24%), Positives = 48/107 (44%), Gaps = 4/107 (3%)

Query: 5 IIVEDELPAREELKYFVNEEKEIKLIAEFDNPLDTLNFLENNTADVIFLDINMPDMNGIS 64
++ +D+ R L ++ I N ++ D++ D+ MPD N
Sbjct: 7 LVADDDAAIRTVLNQALSRAGYDVRITS--NAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 65 LGKIITKMYPDMKIVFITAYKDY--AVDAFEIKAFDYLLKPYSESRI 109
L I K PD+ ++ ++A + A+ A E A+DYL KP+ + +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTEL 111


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
FN0220  PF06580  208  7e-65  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 208 bits (532), Expect = 7e-65
Identities = 66/201 (32%), Positives = 117/201 (58%), Gaps = 7/201 (3%)

Query: 336 ENLISLLKYSELKALQSQINPHFLFNVLNTMTSLIRTNPEKAREVTIDLSNYLRYNLDN- 394
+ S+ + ++L AL++QINPHF+FN LN + +LI +P KARE+ LS +RY+L
Sbjct: 152 WKMASMAQEAQLMALKAQINPHFMFNALNNIRALILEDPTKAREMLTSLSELMRYSLRYS 211

Query: 395 NVKSVELIKELNQIDNYIKIEKARFGDKLNIVYDVDESLYNFQIPSLIIQPLVENSIKHG 454
N + V L EL +D+Y+++ +F D+L ++ ++ + Q+P +++Q LVEN IKHG
Sbjct: 212 NARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVENGIKHG 271

Query: 455 ILKKRENGCVKVIVKKIDKDIEVIIEDDGVGIEQTVIDNLDKQIQENIGLKNVHQRLKLL 514
I + + G + + K + + + +E+ G + ++ GL+NV +RL++L
Sbjct: 272 IAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKES------TGTGLQNVRERLQML 325

Query: 515 YGEGLNIKKLEQGTRINFRIL 535
YG IK E+ ++N +L
Sbjct: 326 YGTEAQIKLSEKQGKVNAMVL 346


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
FN0222  TCRTETA  29  0.036  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 29.0 bits (65), Expect = 0.036
Identities = 54/337 (16%), Positives = 109/337 (32%), Gaps = 53/337 (15%)

Query: 52 IALAVSRFVDMITDPLVGFMSDKYNSKYGRRIPFVAVGTIPLILVTIAFFYPPTSNERAS 111
I LA+ + P++G +SD++ GRR P++LV++A A
Sbjct: 47 ILLALYALMQFACAPVLGALSDRF----GRR---------PVLLVSLAGA--------AV 85

Query: 112 FYYLMIVGSLFFTFYT------IVGAPY---NALIPEIGRTPEERLNLSTWQSVFRLSYT 162
Y +M + Y I GA A I +I T + + +
Sbjct: 86 DYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADI--TDGD-------ERARHFGFM 136

Query: 163 AVAIILPGILIKMIGGDNVLFGIRGMIIFLCVIVFIGLAVTVFIIRERDYSTG-EVSNVS 221
+ + ++GG F + + F++ E + +
Sbjct: 137 SACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREA 196

Query: 222 FKDTIGIIIKNKNFILYLFGMMFFFIGFNN--LRAIMNYYVEDIMGYGKREITLVSAVLF 279
++ +FF + A+ + ED + I +S F
Sbjct: 197 LNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIG-ISLAAF 255

Query: 280 GSAAICF--YPTNKLSKKYGYRKIMLCCLAMLIVTTSMLFFLGKIFPVNFGFVLFGLIGI 337
G T ++ + G R+ ++ + +L F + + VL GI
Sbjct: 256 GILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGI 315

Query: 338 PLAGAAFIFPPAMLSEISTQISEDSGARIEGISFGIQ 374
+ PA+ + +S Q+ E+ +++G +
Sbjct: 316 GM--------PALQAMLSRQVDEERQGQLQGSLAALT 344


35  FN0469  FN0474  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
FN0469    -2      12       1.262970   Xaa-Pro aminopeptidase
FN0470    -2      10       1.467652   Aldehyde dehydrogenase B
FN0471    -2      10       0.984526   Rubrerythrin
FN0472    -2      10       1.026526   Hypothetical cytosolic protein
FN0473    -1      9        0.761343   unknown
FN0474    -1      9        0.759437   Hypothetical Exported Protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
FN0469  PF06340  28  0.033  Vibrio cholerae toxin co-regulated pilus biosynthesis pr...
		>PF06340#Vibrio cholerae toxin co-regulated pilus biosynthesis protein F (TcpF)
Length = 338

Score = 27.7 bits (61), Expect = 0.033
Identities = 19/68 (27%), Positives = 27/68 (39%), Gaps = 5/68 (7%)

Query: 131 DYIDGLVNIGIKRILTSGGKATALEGKDLINEMIKKTNGRLKIVVAGKVTKENLNDLSNL 190
D I+ LVN + L++ + G+ L K NGR IV G V L +
Sbjct: 133 DQIETLVNYANEGKLSTALNQEYITGRFLT-----KENGRYDIVNVGGVPDNTPVKLPAI 187

Query: 191 ISADEFHG 198
+S G
Sbjct: 188 VSKRGLMG 195


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
FN0471  OMADHESIN  46  2e-07  Yersinia outer membrane adhesin signature.
		>OMADHESIN#Yersinia outer membrane adhesin signature.

Length = 455

Score = 45.7 bits (107), Expect = 2e-07
Identities = 49/182 (26%), Positives = 82/182 (45%), Gaps = 9/182 (4%)

Query: 10 AAGVSYSA-PTIEAGTGADSTKAGIGNEASGDNSSAFGFENTASGKNSSAIGNKNEASGF 68
A G+ Y P + G +++ GI + A G + A A G S A G + A G
Sbjct: 46 ALGLEYPVRPPVPGAGGLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGP 105

Query: 69 ESSVLGGKYKVTGSHSGAFGDPNVV-----TGNGSYAFGNDNTINGNNNFVLGNNVTIGS 123
S LG G+ S A D + T + A G ++ + N+ +G++ + +
Sbjct: 106 LSKALGDSAVTYGAASTAQKDGVAIGARASTSDTGVAVGFNSKADAKNSVAIGHSSHVAA 165

Query: 124 GIQNSVALGNGSVVSFSNEVSVGSKGKERKITNVADGEVSATSTDAVNGRQLYNAMQNSN 183
S+A+G+ S N VS+G + R++T++A G TDAVN QL ++ +
Sbjct: 166 NHGYSIAIGDRSKTDRENSVSIGHESLNRQLTHLAAG---TKDTDAVNVAQLKKEIEKTQ 222

Query: 184 DN 185
+N
Sbjct: 223 EN 224


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
FN0473  HTHTETR  58  7e-13  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 58.1 bits (140), Expect = 7e-13
Identities = 21/83 (25%), Positives = 41/83 (49%), Gaps = 1/83 (1%)

Query: 6 KEEIKNRIYKAASKIFYEKGFLKTKMKDISEEAKIPVGLVYTYYKNKEELFDEIVNPIYY 65
+E + I A ++F ++G T + +I++ A + G +Y ++K+K +LF EI
Sbjct: 9 AQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSES 68

Query: 66 YLN-LAIEKEEKEEGSALERFKA 87
+ L +E + K G L +
Sbjct: 69 NIGELELEYQAKFPGDPLSVLRE 91


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
FN0474  ACRIFLAVINRP  595  0.0  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 595 bits (1536), Expect = 0.0
Identities = 222/1040 (21%), Positives = 455/1040 (43%), Gaps = 50/1040 (4%)

Query: 11 LIKIAIKRTIVTTMISISLLILGIFAMKSMRTELLPDIEYPVVKIITHWSGASAEDVEKQ 70
+ I+R I +++I L++ G A+ + P I P V + ++ GA A+ V+
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 71 ITNKIERILPNVEGIENISSES-SYENSSISVEFNYGVNIQDKVTEIQREVFQIKNDLPN 129
+T IE+ + ++ + +SS S S + +I++ F G + ++Q ++ LP
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 130 SAKNPIIKKTEVGAGAITLFLTFVSPDKKA----LFSYLENYVKPNLETISGVAEVSILG 185
+ I E + + + FVS + + Y+ + VK L ++GV +V + G
Sbjct: 121 EVQQQGISV-EKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG 179

Query: 186 GTKKQLQIQIEPAKLASYNLTPMDIYQLIRKSSMVIPLGSLMNG------REEYVIQALG 239
+ ++I ++ L Y LTP+D+ ++ + I G L + I A
Sbjct: 180 A-QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQT 238

Query: 240 ELESVEEYENILLHSN--GDTLRLKDIANVVLTEEDPLNLGFNRGKPATTIAISKSSDGS 297
++ EE+ + L N G +RLKD+A V L E+ + GKPA + I ++ +
Sbjct: 239 RFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGAN 298

Query: 298 TIEINEKIMKAIKNMEETMPSNITYFKIFDSSESIKKSINTVGKSALQGLILASLFLWIF 357
++ + I + ++ P + +D++ ++ SI+ V K+ + ++L L +++F
Sbjct: 299 ALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLF 358

Query: 358 FKNKKMTLIVSFAFPLAISTTFILMKGIHSTFNLISLMGLAIGVGMLTDNSVVVIDNIYN 417
+N + TLI + A P+ + TF ++ + N +++ G+ + +G+L D+++VV++N+
Sbjct: 359 LQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVER 418

Query: 418 HIQEEK-NSIEAAFIGTNQVFSSVLASTLTSIIVFLPIIFTKGIFKEMFQDMVWAIIFSN 476
+ E+K EA +Q+ +++ + VF+P+ F G +++ I+ +
Sbjct: 419 VMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAM 478

Query: 477 VAALLVSVTFIPMLASKFMKKNIIQTEG----------KYFSKIQRKYQKFLSISLKHKK 526
++LV++ P L + +K + F Y + L
Sbjct: 479 ALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTG 538

Query: 527 KTVLISLLFAFLIFGIGGKFVKFGFLTKQDYGYYSVIAEFQNGSDFEKIQELRNEIETII 586
+ +LI L + + + FL ++D G + + + G+ E+ Q++ +++
Sbjct: 539 RYLLIYALIVAGMV-VLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYY 597

Query: 587 KK--ESHTKSYFSIIQKRNGTISVNVDVGF--------KEKRKESIFEIVKKVRKEVEKI 636
K +++ +S F++ + N + F + + S ++ + + E+ KI
Sbjct: 598 LKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKI 657

Query: 637 PDIRTTFF-----YEYAKGKPKKDIEFQIVGTDLETIQILARQIYKDVLK-IKGVTDVSS 690
D F E G + + Q+ + + V
Sbjct: 658 RDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRP 717

Query: 691 TLDSGGKKLEIIFKRDKIQSLNISIKQIEETISYYLLGGDRANTITIKSGNEEIEVLVRL 750
+ ++ ++K Q+L +S+ I +TIS LGG N ++ V+
Sbjct: 718 NGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTA-LGGTYVNDFI--DRGRVKKLYVQA 774

Query: 751 SKDNRKSIKQLENLKIKVTDNSFINLSEIADIRKVENQLSIDKINRFYSVSIYVND-GGI 809
R + ++ L ++ + + S V +++ N S+ I G
Sbjct: 775 DAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGT 834

Query: 810 GTQKIQEELVKIFSEKNKDSSIQYRWGGDAEKMQRAMKELMLTFLIAIFLIYALLASQFE 869
+ + + + I Y W G + + + + + I+ +++ LA+ +E
Sbjct: 835 SSGDAMALMENL--ASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYE 892

Query: 870 SFLFPFLVMGSIPFSLVGVIIGFLITQHTLDAVAMVGIILLIGIVVNNAIVLLDFIQQ-E 928
S+ P VM +P +VGV++ + D MVG++ IG+ NAI++++F +
Sbjct: 893 SWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLM 952

Query: 929 EKESKNKKEAIEKACNLRLRPILLTSLTTIVGMIPLSLGIGDGSEVYQGLGISIIFGMSF 988
EKE K EA A +RLRPIL+TSL I+G++PL++ G GS +GI ++ GM
Sbjct: 953 EKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVS 1012

Query: 989 STLLTLIFVPTTYYMLTSIF 1008
+TLL + FVP + ++ F
Sbjct: 1013 ATLLAIFFVPVFFVVIRRCF 1032


36  FN0695  FN0700  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
FN0695    -2      13       2.129286   GTPase
FN0696    -2      10       0.801373   Ribulose-phosphate 3-epimerase
FN0697    -2      10       0.818673   Transcriptional regulator, MarR family
FN0698    0       9        0.231869   Fibronectin-binding protein-like protein A
FN0699    0       10       0.583601   Hypothetical protein
FN0700    -2      8        -0.913173  Prismane protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
FN0695  IGASERPTASE  30  0.010  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.0 bits (67), Expect = 0.010
Identities = 13/66 (19%), Positives = 25/66 (37%), Gaps = 9/66 (13%)

Query: 181 HLKKRDLGILITDHNV--RETLSITDKSYIMAKG-------KVLIEGTPREIANNPEARR 231
H++ D G + +HN+ ++IT +S I E P + +
Sbjct: 533 HIRNIDDGARLVNHNMTNASNITITGESLITDPNTITPYNIDAPDEDNPYAFRRIKDGGQ 592

Query: 232 IYLGEK 237
+YL +
Sbjct: 593 LYLNLE 598


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
FN0697  adhesinb  31  0.013  Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 31.4 bits (71), Expect = 0.013
Identities = 28/144 (19%), Positives = 56/144 (38%), Gaps = 16/144 (11%)

Query: 706 LKNIEQKLKATNSNLVEKVEKNL----ETLKDTEKEL-----EILKQKLALFETKAAISG 756
+NI ++L + E EKNL E L +KE I +K + ++
Sbjct: 152 AQNIAKRLSEKDPANKETYEKNLKAYVEKLSALDKEAKEKFNNIPGEKKMIVTSEGCFKY 211

Query: 757 MEEIGGVKVL----IAAFKDKSTEDLRTMIDTIKDNNEKAIIVLASTQDKLAFAVGVTKT 812
+ V I ++ + + ++T+++ ++ ++ V +S D+ V+K
Sbjct: 212 FSKAYNVPSAYIWEINTEEEGTPDQIKTLVEKLRKTKVPSLFVESSVDDRPMKT--VSKD 269

Query: 813 LTDKIKAGDLVKKLAEITGGKGGG 836
I A +AE G +G
Sbjct: 270 TNIPIYAKIFTDSVAE-KGEEGDS 292


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
FN0699  SECFTRNLCASE  64  2e-13  Bacterial translocase SecF protein signature.
		>SECFTRNLCASE#Bacterial translocase SecF protein signature.

Length = 333

Score = 64.1 bits (156), Expect = 2e-13
Identities = 31/167 (18%), Positives = 76/167 (45%), Gaps = 3/167 (1%)

Query: 228 TVGATLGDESIAQSKNAGMVAIVLIWVFMII-FYRLPGIIADLAIIIFGFITFACLNFID 286
+VG + E + + + + A V+I ++ + F + A +A++ +T +
Sbjct: 142 SVGPKVSGELVWTAVWSLLAATVVIMFYIWVRFEWQFALGAVVALVHDVLLTVGLFAVLQ 201

Query: 287 ATLTLPGIAGFILSLGMAVDANVIIFERIKEELRF--GNSIRNSIDSGFNKGFIAIFDSN 344
L +A + G +++ V++F+R++E L +R+ ++ N+ +
Sbjct: 202 LKFDLTTVAALLTITGYSINDTVVVFDRLRENLIKYKTMPLRDVMNLSVNETLSRTVMTG 261

Query: 345 LTTLIITTILFVFGTGPIKGFAVTLALGTLASMFTAITVTKVLLLTF 391
+TTL+ + ++G I+GF + G ++++ V K ++L
Sbjct: 262 MTTLLALVPMLIWGGDVIRGFVFAMVWGVFTGTYSSVYVAKNIVLFI 308


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
FN0700  SECFTRNLCASE  258  4e-87  Bacterial translocase SecF protein signature.
		>SECFTRNLCASE#Bacterial translocase SecF protein signature.

Length = 333

Score = 258 bits (661), Expect = 4e-87
Identities = 91/310 (29%), Positives = 167/310 (53%), Gaps = 15/310 (4%)

Query: 2 KTNLHVIKNIKIYLSISLVLVTLSIVIFFTKGLNYGIDFSGGNLFQLKYNGTTVTLNQIN 61
KTN + ++V++ S+++ GLN+GIDF GG + + T + +
Sbjct: 11 KTNFDFFRWQWATFGAAIVMMIASVILPLVIGLNFGIDFKGGTTIRTEST-TAIDVGVYR 69

Query: 62 ENLDKLAKELPQI------------NSNSRKVQISDDGTIIVRVPEISENDKGKVLNNLK 109
L+ L I + ++Q+ +DG + KV L
Sbjct: 70 AALEPLELGDVIISEVRDPSFREDQHVAMIRIQMQEDGQGAEGQGAQGQELVNKVETALT 129

Query: 110 ELG-SYTLDKEDKVGASIGDDLKKSAIYSLGIGAILIVIYITMRFEFSFAIGGILSLLHD 168
+ + + + VG + +L +A++SL ++I+ YI +RFE+ FA+G +++L+HD
Sbjct: 130 AVDPALKITSFESVGPKVSGELVWTAVWSLLAATVVIMFYIWVRFEWQFALGAVVALVHD 189

Query: 169 IIIAVGFIALMGYEVDTPFIAAILTILGYSINDTIVIYDRIRENLKRKHKGWILEQCMDE 228
+++ VG A++ + D +AA+LTI GYSINDT+V++DR+RENL K+K L M+
Sbjct: 190 VLLTVGLFAVLQLKFDLTTVAALLTITGYSINDTVVVFDRLRENL-IKYKTMPLRDVMNL 248

Query: 229 SINQTAIRSLNTSVTTLFSVIAILVFGGASLKTFIMTLLIGILAGTYSSIFVATPVVYLL 288
S+N+T R++ T +TTL +++ +L++GG ++ F+ ++ G+ GTYSS++VA +V +
Sbjct: 249 SVNETLSRTVMTGMTTLLALVPMLIWGGDVIRGFVFAMVWGVFTGTYSSVYVAKNIVLFI 308

Query: 289 NKRKGNNMED 298
+ +D
Sbjct: 309 GLDRNKEKKD 318


37  FN1124  FN1132  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
FN1124    -1      6        -1.037969  Phosphohydrolase (MUTT/NUDIX family protein)
FN1125    0       5        -1.260921  Dipeptide transport ATP-binding protein dppF
FN1126    1       5        -1.959713  Dipeptide transport ATP-binding protein dppD
FN1127    2       5        -1.919094  Dipeptide-binding protein
FN1128    2       6        -2.211997  Dipeptide transport system permease protein
FN1129    4       7        -2.460352  Dipeptide transport system permease protein
FN1130    -2      9        -1.882688  unknown
FN1131    -1      9        -2.585768  Hypothetical membrane-spanning protein
FN1132    0       9        -1.759810  Peptidase E
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
FN1124  OMPADOMAIN  112  9e-31  OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 112 bits (281), Expect = 9e-31
Identities = 48/132 (36%), Positives = 71/132 (53%), Gaps = 12/132 (9%)

Query: 193 PPVPSIEANSLLITLDSGILFDVDKYDVRPEAEEVLKNLVIVLKEADIK--AFEIDGHTD 250
P P+ E + TL S +LF+ +K ++PE + L L L D K + + G+TD
Sbjct: 203 APAPAPEVQTKHFTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTD 262

Query: 251 SDASDEHNQVLSENRANAVKNFLTSQGIMAE-ITIKGYGESRPIASNDTPEGKQK----- 304
SD +NQ LSE RA +V ++L S+GI A+ I+ +G GES P+ N KQ+
Sbjct: 263 RIGSDAYNQGLSERRAQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALID 322

Query: 305 ----NRRVEIVI 312
+RRVEI +
Sbjct: 323 CLAPDRRVEIEV 334


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
FN1126  SACTRNSFRASE  35  8e-05  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 34.5 bits (79), Expect = 8e-05
Identities = 13/51 (25%), Positives = 23/51 (45%)

Query: 92 IDEQHRNNGLGHLLVDKHLDWLKDNKCESIFVNVLVENESTISFYESLGFK 142
+ + +R G+G L+ K ++W K+N + + N S FY F
Sbjct: 97 VAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFI 147


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
FN1127  cloacin  32  0.005  Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 32.4 bits (73), Expect = 0.005
Identities = 17/59 (28%), Positives = 22/59 (37%), Gaps = 2/59 (3%)

Query: 549 NNNFSRSFNNLNGMVSK--TNSRASSAIASSRRSSSSGGGGGFSSGSSGGGGSRGGGGG 605
N + N+NG + AS S ++ GGG G GG G GGG
Sbjct: 10 NTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGN 68


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
FN1129  GPOSANCHOR  58  1e-10  Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 58.2 bits (140), Expect = 1e-10
Identities = 43/323 (13%), Positives = 106/323 (32%), Gaps = 10/323 (3%)

Query: 195 EINLDKVEFILNETRENKNKIEKQAELAQKYIDLRDEKSSLAKGI---YITELEQKEKNL 251
L+KV+ ++ N ++ + + + +L + +K+L
Sbjct: 49 TDTLEKVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSL 108

Query: 252 SENENIKEKYQTECFELQEKLNKTLERLNTIDLEKEEVKKEKLLIDSRNKELRNIISEKE 311
SE + ++ + +L++ L + + + ++ EK + +R +L +
Sbjct: 109 SEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAM 168

Query: 312 KEKAVTSERLDNVKKEKLVKEEYILHLDNKIEKKLEEVTESKNKKDEISKNIVEMAAANK 371
S ++ ++ EK E L+ +E + T K + +AA
Sbjct: 169 NFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKA 228

Query: 372 EFENKIFNLENIKVEKFDLIENRAKKVRDLELEKQLASNEIENNEKKLKSSQDEVENFKQ 431
+ E + N I+ + LE + +E + +++ +
Sbjct: 229 DLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEA 288

Query: 432 ELEEANKKLLANNKEKDL-------VHSQLEARKEELTKTEERNEFLVNQLSEISKSINK 484
E + + + + L+A +E + E ++ L Q S
Sbjct: 289 EKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQS 348

Query: 485 LSQDIREFEYQEKTSSGKLEALV 507
L +D+ +K + + L
Sbjct: 349 LRRDLDASREAKKQLEAEHQKLE 371



Score = 47.0 bits (111), Expect = 4e-07
Identities = 44/287 (15%), Positives = 102/287 (35%), Gaps = 7/287 (2%)

Query: 683 KKEIKTLEEKVTDLKSKITEGSKKREDLSIKLENYENEVDKIDSLEDSIRKDIDLLKKDF 742
+ E L + DL+ + S K++ E E +++ + + K ++
Sbjct: 147 EAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFS 206

Query: 743 ESLSEKSEKLSKDIRSISFNIEDAEKYKTSYQDRINSSFSTIEETEKHIASLKKDIEADE 802
+ S K + L + +++ D EK + + + I+ E A+L+ E
Sbjct: 207 TADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELE 266

Query: 803 NLLKQTISEIDSLNKQFSDTRILFLNNQSTIEQLEKDIHSKEIENVELQEEKEKNSKIVI 862
L+ ++ + + + ++ LE ++ N Q +
Sbjct: 267 KALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQ---SQVLNANRQSLRRDLDASRE 323

Query: 863 ELSHNIEELETLEEELQSQIEEHTKIYNSENRDIETLNEREQNLSNEERELSKDKSKLET 922
E + LEE+ + + RD++ E ++ L E ++L + E
Sbjct: 324 AKKQLEAEHQKLEEQNKISEASRQSL----RRDLDASREAKKQLEAEHQKLEEQNKISEA 379

Query: 923 DSLHANDRFEKIVEVIEKIKVDILNINEKLNELVEITAQVIEVEKLK 969
+ E ++++ + N KL L ++ ++ E +KL
Sbjct: 380 SRQSLRRDLDASREAKKQVEKALEEANSKLAALEKLNKELEESKKLT 426



Score = 44.3 bits (104), Expect = 3e-06
Identities = 62/360 (17%), Positives = 132/360 (36%), Gaps = 9/360 (2%)

Query: 154 GKVERIINSSPKEIKSIIEEAAGIKKLQANRIEAQKNLANIEINLDKVEFILNETRENKN 213
++ + +E+ + E+ K + + + L + +L+K +
Sbjct: 81 KALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADS 140

Query: 214 KIEKQAELAQKYIDLRDEKSSLAKGIYITELEQKEKNLSENENIKEKYQTECFELQEKLN 273
K E + + R A + + E K + EL++ L
Sbjct: 141 AKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALE 200

Query: 274 KTLERLNTIDLEKEEVKKEKLLIDSRNKELRNIISEKEKEKAVTSERLDNVKKEKLVKEE 333
+ + + ++ EK + +R +L + S ++ ++ EK E
Sbjct: 201 GAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEA 260

Query: 334 YILHLDNKIEKKLEEVTESKNKKDEISKNIVEMAAANKEFENKIFNLE-NIKVEKFDLIE 392
L+ +E + T K + + A + E++ L N + + DL
Sbjct: 261 RQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDA 320

Query: 393 NRAKKVRDLELEKQLASNEIENNEKKLKSSQ---DEVENFKQELEEANKKLLANNKEKDL 449
+R K + LE E Q + + +E +S + D K++LE ++KL NK +
Sbjct: 321 SREAK-KQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEA 379

Query: 450 ----VHSQLEARKEELTKTEERNEFLVNQLSEISKSINKLSQDIREFEYQEKTSSGKLEA 505
+ L+A +E + E+ E ++L+ + K +L + + E ++ KLEA
Sbjct: 380 SRQSLRRDLDASREAKKQVEKALEEANSKLAALEKLNKELEESKKLTEKEKAELQAKLEA 439



Score = 40.8 bits (95), Expect = 3e-05
Identities = 30/215 (13%), Positives = 77/215 (35%)

Query: 743 ESLSEKSEKLSKDIRSISFNIEDAEKYKTSYQDRINSSFSTIEETEKHIASLKKDIEADE 802
E + E+++K + ++ D + +D + + ++ + K +
Sbjct: 53 EKVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKA 112

Query: 803 NLLKQTISEIDSLNKQFSDTRILFLNNQSTIEQLEKDIHSKEIENVELQEEKEKNSKIVI 862
+ +++ + L K + + I+ LE + + +L++ E
Sbjct: 113 SKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFST 172

Query: 863 ELSHNIEELETLEEELQSQIEEHTKIYNSENRDIETLNEREQNLSNEERELSKDKSKLET 922
S I+ LE + L+++ E K + + + L E+ L+ K+ LE
Sbjct: 173 ADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEK 232

Query: 923 DSLHANDRFEKIVEVIEKIKVDILNINEKLNELVE 957
A + I+ ++ + + + EL +
Sbjct: 233 ALEGAMNFSTADSAKIKTLEAEKAALEARQAELEK 267


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
FN1132  LPSBIOSNTHSS  31  0.001  Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.
Length = 166

Score = 31.3 bits (71), Expect = 0.001
Identities = 15/69 (21%), Positives = 32/69 (46%), Gaps = 6/69 (8%)

Query: 3 IAIYGGSFNPMHIGHEKIVDYVLKNLDM-DKIIIIPVGIPSHRENNLEQSDTRLKICKEI 61
AIY GSF+P+ GH +D + + + D++ + + P+ + + RL+ +
Sbjct: 2 NAIYPGSFDPITFGH---LDIIERGCRLFDQVYVAVLRNPN--KQPMFSVQERLEQIAKA 56

Query: 62 FKNNKKVEV 70
+ +V
Sbjct: 57 IAHLPNAQV 65


38  FN1260  FN1267  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
FN1260    -3      7        -1.670973  Polysaccharide deacetylase
FN1261    -3      8        -0.160131  Glycosyl transferase
FN1262    -3      9        -0.182171  Lipooligosaccharide cholinephosphotransferase
FN1263    -2      8        1.097763   LOS biosynthesis enzyme LBGB
FN1264    -3      8        1.204220   Hypothetical cytosolic protein
FN1265    -2      9        1.538202   Transcriptional regulator, DeoR family
FN1266    -2      7        -0.180103  Guanine-hypoxanthine permease
FN1267    -2      7        -1.263229  High-affinity iron permease
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
FN1260  PF06580  32  0.005  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.8 bits (72), Expect = 0.005
Identities = 25/123 (20%), Positives = 46/123 (37%), Gaps = 30/123 (24%)

Query: 316 EIFIDSDIG---LLKLLFKNLIENAIKYG-----NDNPINIE-LKKEKKIKVIIEDFGIG 366
E I+ I + +L + L+EN IK+G I ++ K + + +E+ G
Sbjct: 243 ENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSL 302

Query: 367 ISKEAIPHIFERFYREDEARNREIKSYGLGLSIVKEIVALL---NIDIQIESQINKGTKI 423
K +S G GL V+E + +L I++ + K +
Sbjct: 303 ALKN------------------TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAM 344

Query: 424 TLL 426
L+
Sbjct: 345 VLI 347


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
FN1261  HTHFIS  96  4e-25  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 96.1 bits (239), Expect = 4e-25
Identities = 28/114 (24%), Positives = 60/114 (52%), Gaps = 1/114 (0%)

Query: 12 KILIIEDDKNIQRLLTLELRHKNYSVDSAYDGEQGIEMFSKNSYDLVLLDLMLPKKSGKE 71
IL+ +DD I+ +L L Y V + + DLV+ D+++P ++ +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 72 VCQELRKL-AETPIIIITAKDSVLDKVELLDLGANDYICKPFAMEELLARIRVA 124
+ ++K + P+++++A+++ + ++ + GA DY+ KPF + EL+ I A
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA 118


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
FN1265  OMPADOMAIN  108  2e-30  OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 108 bits (270), Expect = 2e-30
Identities = 41/124 (33%), Positives = 65/124 (52%), Gaps = 11/124 (8%)

Query: 90 INLPGGVTFASDSANITSGFYSALNGIAQSLNNY--PETRIQVNGYTDNTGKDAHNQELS 147
L V F + A + +AL+ + L+N + + V GYTD G DA+NQ LS
Sbjct: 215 FTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLS 274

Query: 148 QRRANAVAQYLIAQGVSSNRIVANGFGSSNPIASNST---------PEGRLQNRRVEIKI 198
+RRA +V YLI++G+ +++I A G G SNP+ N+ + +RRVEI++
Sbjct: 275 ERRAQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334

Query: 199 LPAQ 202
+
Sbjct: 335 KGIK 338


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
FN1267  SYCDCHAPRONE  35  1e-04  Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.
Length = 168

Score = 34.5 bits (79), Expect = 1e-04
Identities = 24/152 (15%), Positives = 53/152 (34%), Gaps = 18/152 (11%)

Query: 13 ENVEYFSEIIDRIND---IQENNNYSDEEMDNDLDVALWRAFVYINLWSYKGYAKAERIL 69
+ EY + + I N S + ++ +A N + Y A ++
Sbjct: 7 DTQEYQLAMESFLKGGGTIAMLNEISSDTLEQLYSLAF-------NQYQSGKYEDAHKVF 59

Query: 70 KKVENKGIKNPIWCYRYAVSIA----RLRKYEEALKYFLIGTEVDSTYPWNWLELGRLYY 125
+ + + + + R+ + + + +Y+ A+ + G +D P
Sbjct: 60 QAL---CVLDH-YDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMDIKEPRFPFHAAECLL 115

Query: 126 KFGELDKVFECIEKGLELVPNDYEFLTLKDDV 157
+ GEL + + EL+ + EF L V
Sbjct: 116 QKGELAEAESGLFLAQELIADKTEFKELSTRV 147


39  FN1272  FN1280  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
FN1272    1       8        -0.342281  C4-dicarboxylate transporter large subunit
FN1273    1       8        0.177498   C4-dicarboxylate transporter small subunit
FN1274    0       8        0.651260   C4-dicarboxylate-binding protein
FN1275    -2      10       0.823579   hypothetical protein
FN1276    -1      14       1.154125   Sensory Transduction Protein Kinase
FN1277    -2      13       1.355672   Two-component response regulator
FN1278    -2      11       1.986305   Integral membrane protein
FN1279    -1      11       2.421749   Cobalt chelatase
FN1280    0       12       2.925707   Hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
FN1272  HTHTETR  51  3e-10  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 51.2 bits (122), Expect = 3e-10
Identities = 23/139 (16%), Positives = 56/139 (40%), Gaps = 7/139 (5%)

Query: 6 DKKYLILEKAKDMIITESYSSLSISKLTSELSISKGSFYTYFPSKDKMLSEILDEYIKNI 65
+ + IL+ A + + SS S+ ++ +++G+ Y +F K + SEI + NI
Sbjct: 11 ETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNI 70

Query: 66 IIFKNNLLE-----NSKNIDECLDYYINSILSLTDEELKLELVITNLKRNYEVFNEENFK 120
+ + E L + + S ++ L +E++ + E+ + +
Sbjct: 71 GELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQ--Q 128

Query: 121 KLKIIACTMIDFVKEVLNK 139
+ + D +++ L
Sbjct: 129 AQRNLCLESYDRIEQTLKH 147


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
FN1274  RTXTOXIND  54  5e-10  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 53.7 bits (129), Expect = 5e-10
Identities = 43/252 (17%), Positives = 75/252 (29%), Gaps = 59/252 (23%)

Query: 105 AQTEADFLQAKANYQSATSNYNIARNNYQKFKTLYDKQLISYLEFSNYQASYTSAQGNLE 164
+ A+ L A + + ++ F +L KQ I+ + Y A L
Sbjct: 210 DKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELR 269

Query: 165 V--------------AKATYMNAQNSYSKLVA---------------------------- 182
V AK Y + +
Sbjct: 270 VYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASV 329

Query: 183 -RAEISGIVGNLFIK-EGNDIAAKEVLFTIL-NDKQMQSYVGITPEAISKVKLGDEINVK 239
RA +S V L + EG + E L I+ D ++ + + I + +G +K
Sbjct: 330 IRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIK 389

Query: 240 IDAL----AKEYKAKITELNP--IADSTTKN-FKVKLALDNPDG-------EIKDGMFGN 285
++A K+ +N I D F V ++++ + GM
Sbjct: 390 VEAFPYTRYGYLVGKVKNINLDAIEDQRLGLVFNVIISIEENCLSTGNKNIPLSSGMAVT 449

Query: 286 VIIPVGESSVLS 297
I G SV+S
Sbjct: 450 AEIKTGMRSVIS 461



Score = 49.4 bits (118), Expect = 1e-08
Identities = 21/109 (19%), Positives = 46/109 (42%), Gaps = 4/109 (3%)

Query: 78 KGGTVKTIYKKNGEYVKKGEVIVKLSDAQTEADFLQAKANYQSATSNYNIARNNYQKFKT 137
+ VK I K GE V+KG+V++KL+ EAD L+ +++ A + + YQ
Sbjct: 103 ENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQAR----LEQTRYQILSR 158

Query: 138 LYDKQLISYLEFSNYQASYTSAQGNLEVAKATYMNAQNSYSKLVARAEI 186
+ + L+ + ++ + + +++ + E+
Sbjct: 159 SIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKEL 207


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
FN1275  ACRIFLAVINRP  647  0.0  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 647 bits (1670), Expect = 0.0
Identities = 227/1041 (21%), Positives = 487/1041 (46%), Gaps = 54/1041 (5%)

Query: 3 LAGISIRRPVATTMVMVSFIFIGLLAMFSMKKELIPNIKIPVVTITTTWTGAVSENVETQ 62
+A IRRP+ ++ + + G LA+ + P I P V+++ + GA ++ V+
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 63 VTKKIKDSLSNVDAIDKIQTVSA-YGVSNVVVNFDYGVDTDEKVTQIQREVSKIANNLPN 121
VT+ I+ +++ +D + + + S G + + F G D D Q+Q ++ LP
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 122 DANTPLVRKIEAAGGNMTAIIAFNADSK----TALTTFIKEQLKPRLESLPGIGQVDIFG 177
+ + +E + + + F +D+ ++ ++ +K L L G+G V +FG
Sbjct: 121 EVQQQGIS-VEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG 179

Query: 178 NPDKQLQIQVDSDKLASYNLSPMELYNIVRTSVATYPIGKLSTG------NKDMIIRFMG 231
++I +D+D L Y L+P+++ N ++ G+L + I
Sbjct: 180 -AQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQT 238

Query: 232 ELDYIDQYKNILI--SSDGNTLRLKDVADIVLTTEDADNVGYLNGKESVVVLLQKSSDGD 289
+++ + + +SDG+ +RLKDVA + L E+ + + +NGK + + ++ ++ +
Sbjct: 239 RFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGAN 298

Query: 290 TITLNNAAFKVIEEMKPYMPAGTEYSIEMDASENINSSISNVSSSAIQGLVLATIILFIF 349
+ A + E++P+ P G + D + + SI V + + ++L +++++F
Sbjct: 299 ALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLF 358

Query: 350 LKSFRTTILISLALPVAIIFTFAFLSMRGTTLNLISLMGLSIGVGMLTDNSVVVVDNIYR 409
L++ R T++ ++A+PV ++ TFA L+ G ++N +++ G+ + +G+L D+++VVV+N+ R
Sbjct: 359 LQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVER 418

Query: 410 HITELNSPVMEASENATEEVTFSVIASALTTIVVFLPILFIPGLAREFFRDMSYAIIFSN 469
+ E P EA+E + ++ +++ A+ VF+P+ F G +R S I+ +
Sbjct: 419 VMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAM 478

Query: 470 LAAIIVAITMIPMLASRFLDRKSMKSEDG-----------RLFKKIKAFYLKVINKAISH 518
+++VA+ + P L + L K + +E F Y + K +
Sbjct: 479 ALSVLVALILTPALCATLL--KPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGS 536

Query: 519 KGLTVLIMVVLFFFSIFVGPKLLKFEFMPKQDEGKYSLTAELQNGTDLNKAERIAKELEE 578
G +LI ++ + + +L F+P++D+G + +L G + +++ ++ +
Sbjct: 537 TGRYLLIYALIVAGMVVLFLRLPS-SFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTD 595

Query: 579 II-KSDPHTQSYLMLVSTSSISVNA-NVGK----------KNTRDDSVFTIMNDIRNKTS 626
K++ + V+ S S A N G +N ++S +++ + +
Sbjct: 596 YYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELG 655

Query: 627 KVLDARVSMTNQFS----GGQTNKDVQ-FLLQGSNQNEIKQLGKQLLEKLQSYNG-MVDI 680
K+ D V N + G T D + G + + Q QLL + +V +
Sbjct: 656 KIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSV 715

Query: 681 SSTLDPGIIELRVNIDRDKIASYGISPTVVAQTISYYMLGGDKANTATLKTDTEEIDVLV 740
+ ++ +D++K + G+S + + QTIS LGG N + V
Sbjct: 716 RPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTA-LGGTYVN--DFIDRGRVKKLYV 772

Query: 741 RLPKDKRNDINTLASLNIKVGDNKFVKLSDVATLQYAEGTSEIRKKNGIYTVTISGND-G 799
+ R + L ++ + + V S T + G+ + + NG+ ++ I G
Sbjct: 773 QADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAP 832

Query: 800 GVGLGAIQSKIIEEFNNLNPPSTVSYSWGGQTENMQKTMSQLSFALSISIFLIYALLASQ 859
G G + + + L P+ + Y W G + + + +Q ++IS +++ LA+
Sbjct: 833 GTSSGDAMALMENLASKL--PAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAAL 890

Query: 860 FESFIMPIIIIGSIPLALIGVIWGLVILRQPIDIMVMIGVILLAGVVVNNAIVLIDFIK- 918
+ES+ +P+ ++ +PL ++GV+ + Q D+ M+G++ G+ NAI++++F K
Sbjct: 891 YESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKD 950

Query: 919 TMRTRGYDKDYSIIYSCETRLRPILMTTMTTVFGMIPMALGLGEGSEFYRGMAITVIFGL 978
M G + + + RLRPILMT++ + G++P+A+ G GS + I V+ G+
Sbjct: 951 LMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGM 1010

Query: 979 SFSTILTLVLIPILYSIVDSF 999
+T+L + +P+ + ++
Sbjct: 1011 VSATLLAIFFVPVFFVVIRRC 1031


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
FN1278    SACTRNSFRASE  36     1e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 36.5 bits (84), Expect = 1e-05
Identities = 31/118 (26%), Positives = 53/118 (44%), Gaps = 8/118 (6%)

Query: 10 KSFKDIDVNVDELCKRIKKNKQYQLFIKYSENIPVAYLGILYMSNLHYDGAWIDLVAVVE 69
K ++D D++V + + + F+ Y EN + + I N A I+ +AV +
Sbjct: 48 KQYEDDDMDVSYV-----EEEGKAAFLYYLENNCIGRIKIRSNWN---GYALIEDIAVAK 99

Query: 70 EHRNKGIGKELLKFAENMVKEKKGTVLTGLVRKDNVSSSTMFLNSNFESSEKDFILYS 127
++R KG+G LL A KE L + N+S+ + +F D +LYS
Sbjct: 100 DYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGAVDTMLYS 157


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
FN1280    V8PROTEASE  74     2e-16    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 74.3 bits (182), Expect = 2e-16
Identities = 39/178 (21%), Positives = 75/178 (42%), Gaps = 10/178 (5%)

Query: 81 SVCRISIRDSRGVVVGYGTGFLVAPNIILTNNHVINS-YEVASNSIAEFNYQDDENFMPC 139
V I + G + +G +V + +LTN HV+++ + A + + +N
Sbjct: 89 PVTYIQVEAPTGTFIA--SGVVVGKDTLLTNKHVVDATHGDPHALKAFPSAINQDN---- 142

Query: 140 PTYNFRLNPQTFFITNVKLDFTLVALNENVTNQKHLEDFGYLKMTQKEGTILPEEYVSII 199
N + + + D +V + N N+ E M+ + +++
Sbjct: 143 -YPNGGFTAEQITKYSGEGDLAIVKFSPNEQNKHIGEVVKPATMSN-NAETQVNQNITVT 200

Query: 200 QHPKGGPKSVTLR-ENKVSGLKENFIHYLTDTEPGSSGSPVFNDQWTLVALHHSGVPN 256
+P P + + K++ LK + Y T G+SGSPVFN++ ++ +H GVPN
Sbjct: 201 GYPGDKPVATMWESKGKITYLKGEAMQYDLSTTGGNSGSPVFNEKNEVIGIHWGGVPN 258



 
Contact Sachin Pundhir for Bugs/Comments.