PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: Rif1.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
 
 
C) Potential GIs and PAIs in CP003905
S.No    Start    End    Bias    Virulence    Insertion elements    Prediction
1    C695_00225    C695_00390    Y    Y    Y    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
C695_00225-214-3.596402hypothetical protein
C695_00230-115-5.330621adenine-specific DNA methyltransferase
C695_00235-115-4.155479cytosine specific DNA methyltransferase
C695_00240-110-2.602861hypothetical protein
C695_00245-19-2.475462hypothetical protein
C695_00250-19-2.331303hypothetical protein
C695_0025509-1.288057adenine/cytosine DNA methyltransferase
C695_002601100.372408sodium/proline symporter
C695_00265112-0.057791delta-1-pyrroline-5-carboxylate dehydrogenase
C695_00270315-1.026038hypothetical protein
C695_00275313-0.991632Myosin-3
C695_00280312-0.617845hypothetical protein
C695_00285212-0.578000hypothetical protein
C695_00290114-0.019418hypothetical protein
C695_002951100.463960hypothetical protein
C695_003000110.960622hypothetical protein
C695_003050101.488847hypothetical protein
C695_003100111.576896hypothetical protein
C695_00315-1112.151310ATP-binding protein
C695_003200142.805215urease accessory protein UreH
C695_003254233.259208urease accessory protein UreG
C695_003304242.824389urease accessory protein UreF
C695_003354242.468908urease accessory protein UreE
C695_003403212.389553urease accessory protein UreI
C695_003451202.009561hypothetical protein
C695_003500172.480532urease subunit beta
C695_00355-3101.672394urease subunit alpha
C695_003600132.215550*lipoprotein signal peptidase
C695_003652132.666633phosphoglucosamine mutase
C695_003703152.86193230S ribosomal protein S20
C695_003752132.056975peptide chain release factor 1
C695_003803131.924329hypothetical protein
C695_003853131.882538hypothetical protein
C695_003902141.132456hypothetical protein
ORFs having significant similarity with known virulence factors
LocusTag    Hits    Score    E-value    Comments
C695_00265    ANTHRAXTOXNA    31    0.034    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 30.9 bits (69), Expect = 0.034
Identities = 36/173 (20%), Positives = 71/173 (41%), Gaps = 19/173 (10%)

Query: 121 QEESQLKERILKRKNEKIILNVNFIGEEVLGEEEANARFEKY---SQALKSNYIQYISIK 177
Q+ S+ ++ + + EK+ F+ E+ + + Y S+ K Y +
Sbjct: 118 QDLSEEEKNSMNSRGEKVPFASRFVFEKKRETPKLIINIKDYAINSEQSKEVYYEIGKGI 177

Query: 178 ITTIFSQINILDFEY-----SKKEIVKRLDALYALALEEEKKQGMPKFINLDMEEFRDLE 232
I S+ LD E+ S + D L++ +E K + K I+++ ++
Sbjct: 178 SLDIISKDKSLDPEFLNLIKSLSDDSDSSDLLFSQKFKE-KLELNNKSIDINF-----IK 231

Query: 233 LTVESFMESIAK-----FDLNAGIVLQAYIPDSYEYLKKLHAFSKERVLKGLK 280
+ F + + F + VL+ Y PD +EY+ KL E++ + LK
Sbjct: 232 ENLTEFQHAFSLAFSYYFAPDHRTVLELYAPDMFEYMNKLEKGGFEKISESLK 284


ORFs having significant similarity with known virulence factors
LocusTag    Hits    Score    E-value    Comments
C695_00280    GPOSANCHOR    46    1e-07    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 45.8 bits (108), Expect = 1e-07
Identities = 40/217 (18%), Positives = 74/217 (34%)

Query: 13 QVRKELEARISGLEDENAELFAENEKLALGTSELKDANNQLRQKNDKLFTTKENLTQEKT 72
+ +LE + G + + A+ + L + L L + + + +
Sbjct: 120 ARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIK 179

Query: 73 ELTEKNKVLTTEKGNLDNQLNASQKQVQALEQSQQVLENEKVELTNKITDLSKEKENLTK 132
L + L + L+ L + A + LE EK L + DL K E
Sbjct: 180 TLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMN 239

Query: 133 ANTELKTENDKLNHQVIALTKEQDSLKQERAQLQDAHGFLEELCANLEKDNQHLTDKLKK 192
+T + L + AL Q L++ + LE + L +
Sbjct: 240 FSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKAD 299

Query: 193 LESAQKNLENSNDQLLQAIENIAEEKTELEREIARLK 229
LE + L + L + ++ E K +LE E +L+
Sbjct: 300 LEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLE 336



Score = 45.1 bits (106), Expect = 2e-07
Identities = 48/262 (18%), Positives = 86/262 (32%), Gaps = 11/262 (4%)

Query: 16 KELEARISGLEDENAELFAENEKLALGTSELKDANNQLRQKNDKLFTTKENLTQEKTELT 75
EL +S +++ + + A EL+ L + + + + L
Sbjct: 88 DELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLE 147

Query: 76 EKNKVLTTEKGNLDNQLNASQKQVQALEQSQQVLENEKVELTNKITDLSKEKENLTKANT 135
+ L K +L+ L + A + LE EK L + +L K E +T
Sbjct: 148 AEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFST 207

Query: 136 ELKTENDKLNHQVIALTKEQDSLKQERAQLQDAHGFLEELCANLEKDNQHLTDKLKKLES 195
+ L + AL + L++ + LE + L + +LE
Sbjct: 208 ADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEK 267

Query: 196 AQKNLENSNDQLLQAIENIAEEKTELEREIARLKSLEATDKSELDLQNCRFKSAIEDLKR 255
A + N + I+ + EK LE E A L+ + + L+R
Sbjct: 268 ALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANR-----------QSLRR 316

Query: 256 QNRKLEEENIALKERAYGLKEQ 277
E L+ L+EQ
Sbjct: 317 DLDASREAKKQLEAEHQKLEEQ 338


ORFs having significant similarity with known virulence factors
LocusTag    Hits    Score    E-value    Comments
C695_00350    UREASE    1045    0.0    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 1045 bits (2703), Expect = 0.0
Identities = 354/569 (62%), Positives = 443/569 (77%), Gaps = 4/569 (0%)

Query: 3 KISRKEYVSMYGPTTGDKVRLGDTDLIAEVEHDYTIYGEELKFGGGKTLREGMSQSN-NP 61
++SR Y +M+GPT GDKVRL DT+L EVE D+T +GEE+KFGGGK +R+GM QS
Sbjct: 4 RMSRAAYANMFGPTVGDKVRLADTELFIEVEKDFTTHGEEVKFGGGKVIRDGMGQSQVTR 63

Query: 62 SKEELDLIITNALIVDYTGIYKADIGIKDGKIAGIGKGGNKDMQDGVKNNLSVGPATEAL 121
+D +ITNALI+D+ GI KADIG+KDG+IA IGK GN DMQ GV + VGP TE +
Sbjct: 64 EGGAVDTVITNALILDHWGIVKADIGLKDGRIAAIGKAGNPDMQPGV--TIIVGPGTEVI 121

Query: 122 AGEGLIVTAGGIDTHIHFISPQQIPTAFASGVTTMIGGGTGPADGTNATTITPGRRNLKW 181
AGEG IVTAGG+D+HIHFI PQQI A SG+T M+GGGTGPA GT ATT TPG ++
Sbjct: 122 AGEGKIVTAGGMDSHIHFICPQQIEEALMSGLTCMLGGGTGPAHGTLATTCTPGPWHIAR 181

Query: 182 MLRAAEEYSMNLGFLAKGNASNDASLADQIEAGAIGFKIHEDWGTTPSAINHALDVADKY 241
M+ AA+ + MNL F KGNAS +L + + GA K+HEDWGTTP+AI+ L VAD+Y
Sbjct: 182 MIEAADAFPMNLAFAGKGNASLPGALVEMVLGGATSLKLHEDWGTTPAAIDCCLSVADEY 241

Query: 242 DVQVAIHTDTLNEAGCVEDTMAAIAGRTMHTFHTEGAGGGHAPDIIKVAGEHNILPASTN 301
DVQV IHTDTLNE+G VEDT+AAI GRT+H +HTEGAGGGHAPDII++ G+ N++P+STN
Sbjct: 242 DVQVMIHTDTLNESGFVEDTIAAIKGRTIHAYHTEGAGGGHAPDIIRICGQPNVIPSSTN 301

Query: 302 PTIPFTVNTEAEHMDMLMVCHHLDKSIKEDVQFADSRIRPQTIAAEDTLHDMGIFSITSS 361
PT P+TVNT AEH+DMLMVCHHL +I ED+ FA+SRIR +TIAAED LHD+G FSI SS
Sbjct: 302 PTRPYTVNTLAEHLDMLMVCHHLSPTIPEDIAFAESRIRKETIAAEDILHDIGAFSIISS 361

Query: 362 DSQAMGRVGEVITRTWQTADKNKKEFGRLKEEKGDNDNFRIKRYLSKYTINPAIAHGISE 421
DSQAMGRVGEV RTWQTADK K++ GRLKEE GDNDNFR+KRY++KYTINPAIAHG+S
Sbjct: 362 DSQAMGRVGEVAIRTWQTADKMKRQRGRLKEETGDNDNFRVKRYIAKYTINPAIAHGLSH 421

Query: 422 YVGSVEVGKVADLVLWSPAFFGVKPNMIIKGGFIALSQMGDANASIPTPQPVYYREMFAH 481
+GS+EVGK ADLVLW+PAFFGVKP+M++ GG IA + MGD NASIPTPQPV+YR MF
Sbjct: 422 EIGSLEVGKRADLVLWNPAFFGVKPDMVLLGGTIAAAPMGDPNASIPTPQPVHYRPMFGA 481

Query: 482 HGKAKYDANITFVSQAAYDKGIKEELGLERQVLPVKNCR-NITKKDMQFNDTTAHIEVNP 540
+G+++ ++++TFVSQA+ D G+ LG+ ++++ V+N R I K M N T HIEV+P
Sbjct: 482 YGRSRTNSSVTFVSQASLDAGLAGRLGVAKELVAVQNTRGGIGKASMIHNSLTPHIEVDP 541

Query: 541 ETYHVFVDGKEVTSKPANKVSLAQLFSIF 569
ETY V DG+ +T +PA + +AQ + +F
Sbjct: 542 ETYEVRADGELLTCEPATVLPMAQRYFLF 570


ORFs having significant similarity with known virulence factors
LocusTag    Hits    Score    E-value    Comments
C695_00385    IGASERPTASE    42    8e-06    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 41.6 bits (97), Expect = 8e-06
Identities = 49/253 (19%), Positives = 83/253 (32%), Gaps = 10/253 (3%)

Query: 52 KLTSDSPTQQQDQKVAQNTASNDSQEATTLENTASTDNTTATTDETYTKSTDTTVAGAAQ 111
K D+ + A ++ + T A + + T T T TK T T
Sbjct: 1053 KNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKA 1112

Query: 112 KVETDNTA----VQSAEQTLKTDVAKVQADASAKDFDETTFQADQAAEQTAEKALQQAES 167
KVET+ T V S + VQ A ++ T + QT A + +
Sbjct: 1113 KVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPA 1172

Query: 168 KLNTDQQTLNTALQDQTKTPTPSTPPTKEEPKHTASSGTPPAPESPPAKKDETSGTPSAS 227
K + N T + E P++T + T P S + K + S
Sbjct: 1173 KETSS----NVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSV- 1227

Query: 228 GSSVASQLTKDTTMVNNLKSVSVSAMNTTLSGVETMSQQTATIGNLLNSSTDLSSVIPNA 287
SV + TT N+ +V++ + +T + + LN +S I
Sbjct: 1228 -RSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAVSQHISQL 1286

Query: 288 QGLNSAFSTLESA 300
+ N + +
Sbjct: 1287 EMNNEGQYNVWVS 1299


2    C695_00700    C695_00735    Y    N    N    Genomic Island
LocusTag    DNBias    CDNBias    %GCBias    Product
C695_007002131.620104DNA glycosylase MutY
C695_007053142.503938citrate:succinate antiporter
C695_007104132.408870citrate:succinate antiporter
C695_007154141.643125cbb3-type cytochrome c oxidase subunit I
C695_007200160.023080cbb3-type cytochrome c oxidase subunit II
C695_00725218-1.658263cytochrome c oxidase subunit Q
C695_00730417-1.247917cytochrome c oxidase, cbb3-type, subunit III
C695_00735419-1.861109hypothetical protein
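
The Prediction column in the island summary rows on this page follows a visible pattern: rows with Bias = Y and Virulence = Y are labelled "Pathogenicity Island (biased-composition)", and the row with Bias = Y and Virulence = N is labelled "Genomic Island". The Python sketch below encodes only that observed pattern. It is an inference from the rows shown here, not PredictBias's documented rule, and the function name is hypothetical. Combinations that do not appear on this page are deliberately left unclassified.

```python
def predicted_label(bias, virulence, insertion_elements):
    """Reproduce the Prediction label from the Y/N flags, as observed on this page.

    Hypothetical helper: the mapping is inferred from the summary rows only.
    """
    flags = (bias, virulence, insertion_elements)
    if any(flag not in ("Y", "N") for flag in flags):
        raise ValueError("flags must be 'Y' or 'N'")
    if bias == "Y" and virulence == "Y":
        # Seen with insertion elements both Y and N, so that flag
        # does not change the label in the rows shown.
        return "Pathogenicity Island (biased-composition)"
    if bias == "Y" and virulence == "N":
        # Only observed with insertion elements = N on this page.
        return "Genomic Island"
    # Bias = N never appears in the rows shown; make no claim about it.
    return None
```

For island 2 above (Y, N, N) this returns "Genomic Island".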
3    C695_00910    C695_01020    Y    Y    N    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
C695_009103181.075495serine hydroxymethyltransferase
C695_009153210.947760hypothetical protein
C695_009201180.066229hypothetical protein
C695_009252181.089040hypothetical protein
C695_00930-1123.176463hypothetical protein
C695_00935-1102.861814hypothetical protein
C695_00940-1102.112765hypothetical protein
C695_00945-2102.253503hypothetical protein
C695_00950-1103.049904fumarate reductase iron-sulfur subunit
C695_00955-1103.094487fumarate reductase flavoprotein subunit
C695_00960-2141.594016fumarate reductase cytochrome b-556 subunit
C695_00965-2151.688591triosephosphate isomerase
C695_00970-2162.845314enoyl-(acyl carrier protein) reductase
C695_00975-2163.115653UDP-3-O-[3-hydroxymyristoyl] glucosamine
C695_00980-2163.468187S-adenosylmethionine synthetase
C695_00985-2172.665732multifunctional nucleoside diphosphate
C695_009900181.869182hypothetical protein
C695_00995-2171.76395350S ribosomal protein L32
C695_01000-111-2.375479phosphate acyltransferase
C695_01005114-4.1304753-oxoacyl-(acyl carrier protein) synthase III
C695_01010214-3.924814hypothetical protein
C695_01015213-3.932591hypothetical protein
C695_01020213-3.530627hypothetical protein
ORFs having significant similarity with known virulence factors
LocusTag    Hits    Score    E-value    Comments
C695_00920    IGASERPTASE    32    0.003    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 32.0 bits (72), Expect = 0.003
Identities = 28/148 (18%), Positives = 54/148 (36%), Gaps = 6/148 (4%)

Query: 50 PKETFLQTDSGMQKIGNTKDEKKDDEFESLNLDPSKQEDKLDKVADNVKKQENDAFNMPT 109
P ET ++ T ++ + D E+ + ++ V N Q N+ +
Sbjct: 1033 PSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKAN--TQTNEVAQSGS 1090

Query: 110 QTDQTQTEMKTTEETQEAQKGLKVVEHTSTQKESQAVAKKEISHKKPKATPKDKEAHKDK 169
+T +TQT T E ++ K VE TQ+ + ++ ++ + E ++
Sbjct: 1091 ETKETQTTETKETATVEKEEKAK-VETEKTQEVPKVTSQVSPKQEQSETVQPQAEPAREN 1149

Query: 170 D---KHAVKELKVKKEAHKEVPKKANSK 194
D + + A E P K S
Sbjct: 1150 DPTVNIKEPQSQTNTTADTEQPAKETSS 1177


ORFs having significant similarity with known virulence factors
LocusTag    Hits    Score    E-value    Comments
C695_00970    DHBDHDRGNASE    60    8e-13    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 60.4 bits (146), Expect = 8e-13
Identities = 60/263 (22%), Positives = 109/263 (41%), Gaps = 29/263 (11%)

Query: 4 LKGKKGLIVGVANNKSIAYGIAQSCFNQGATL-AFTYLNESLEKRVRPIAQELNSPYVYE 62
++GK I G A + I +A++ +QGA + A Y E LEK V + E +
Sbjct: 6 IEGKIAFITGAA--QGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 63 LDVSKEEHFKSLYNSVKKDLGSLDFIVHSVAF--------APKEALEGSLLETSKSAFNT 114
DV + +++++G +D +V+ E E + S FN
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 115 AMEISVYSLIELTNTLKPLLNNGASVLTLSYLGSTKYMAHYNVMGLAKAALESAVRYLAV 174
+ +S Y + + ++ + +N A V S MA Y +KAA + L +
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTS-------MAAY---ASSKAAAVMFTKCLGL 173

Query: 175 DLGKHHIRVNALSAGPIRT-----LASSGIADFRMILKWNE---INAPLRKNVSLEEVGN 226
+L +++IR N +S G T L + ++I E PL+K ++ +
Sbjct: 174 ELAEYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIAD 233

Query: 227 AGMYLLSSLSSGVSGEVHFVDAG 249
A ++L+S + ++ VD G
Sbjct: 234 AVLFLVSGQAGHITMHNLCVDGG 256


4    C695_01490    C695_01720    Y    Y    Y    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
C695_014902163.27013150S ribosomal protein L21
C695_014952163.34665350S ribosomal protein L27
C695_015001153.565391dipeptide ABC transporter periplasmic
C695_015052153.935782dipeptide ABC transporter permease (dppB)
C695_015100153.292514dipeptide ABC transporter permease (dppC)
C695_01515-2143.051082dipeptide ABC transporter ATP-binding protein
C695_01520-2142.806571dipeptide ABC transporter ATP-binding protein
C695_01525-2122.205242GTPase CgtA
C695_01530-1131.549621hypothetical protein
C695_01535-1162.260732hypothetical protein
C695_015400172.745241glutamate-1-semialdehyde aminotransferase
C695_015452172.277980hypothetical protein
C695_015501162.252191hypothetical protein
C695_015551162.554980hypothetical protein
C695_01560-1162.013521hypothetical protein
C695_01565-116-0.222737hypothetical protein
C695_01570-116-1.003386ATP-binding protein
C695_01575-1140.910755nitrite extrusion protein (narK)
C695_015801161.245145hypothetical protein
C695_015852151.365898virulence-associated protein D
C695_015900160.976565hypothetical protein
C695_015950171.676235hypothetical protein
C695_01600-1171.639151hypothetical protein
C695_01605318-0.702194hypothetical protein
C695_01610216-1.138565heme iron utilization protein
C695_01615115-1.177920arginyl-tRNA ligase
C695_01620213-0.692361sec-independent protein translocase protein
C695_01625311-1.294851guanylate kinase
C695_01630311-2.129434poly E-rich protein
C695_01635112-2.109986nuclease NucT
C695_01640114-1.890618hypothetical protein
C695_01645314-1.418195flagellar basal body L-ring protein
C695_01650213-1.245892CMP-N-acetylneuraminic acid synthetase
C695_01655213-0.735007flagellar protein FlaG
C695_016601120.818136tetraacyldisaccharide 4'-kinase
C695_016650151.880346NAD synthetase
C695_01670-1172.422950*ketol-acid reductoisomerase
C695_016750181.499627cell division inhibitor
C695_016801170.526813cell division topological specificity factor
C695_016850190.632273DNA processing chain A (dprA)
C695_01690120-0.383255Holliday junction resolvase-like protein
C695_01695322-1.443646hypothetical protein
C695_01710220-1.500442hypothetical protein
C695_01715227-1.322544hypothetical protein
C695_01720228-0.362921hypothetical protein
ORFs having significant similarity with known virulence factors
LocusTag    Hits    Score    E-value    Comments
C695_01575    TCRTETB    31    0.005    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 31.4 bits (71), Expect = 0.005
Identities = 37/207 (17%), Positives = 83/207 (40%), Gaps = 1/207 (0%)

Query: 23 VLIPLLILSGSLTPHQSFQLGIAVLMGYVFGSFLIQFLSPLMSLESIAKISFGLIALSFL 82
V +P + + P + + A ++ + G+ + LS + ++ + + +
Sbjct: 35 VSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSV 94

Query: 83 VCYFDSIPFFWLWIWRFIAGVASSALMILVAPLSLPYVKEHKKALVGGLIFSAVGIGSVF 142
+ + F L + RFI G ++A LV + Y+ + + GLI S V +G
Sbjct: 95 IGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGV 154

Query: 143 SGFVLPWISSYNIKWAWIFLGGSCLIAFILSLVGLKTRSLRKKSVKKEESAFKIPFHLWL 202
+ I+ Y I W+++ L I + L+ L + +R K + + +
Sbjct: 155 GPAIGGMIAHY-IHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVF 213

Query: 203 LLISCALNAIGFLPHTLFWVDYLIRHL 229
++ +I FL ++ ++H+
Sbjct: 214 FMLFTTSYSISFLIVSVLSFLIFVKHI 240


ORFs having significant similarity with known virulence factors
LocusTag    Hits    Score    E-value    Comments
C695_01585    PF04605    102    5e-32    Virulence-associated protein D (VapD)
		>PF04605#Virulence-associated protein D (VapD)

Length = 125

Score = 102 bits (256), Expect = 5e-32
Identities = 21/93 (22%), Positives = 45/93 (48%), Gaps = 3/93 (3%)

Query: 3 ALAFDLKIEILKKEYGEPYNKAYDDLRQELELLGFEWTQGSVYVNYSKENTLAQVYKAIN 62
A+ FDL + L+K + + + Y +++ + GFE Q S Y + N +V + +N
Sbjct: 7 AINFDLSTKSLEKYF-KDTREPYSLIKKFMLENGFEHRQYSGYTSKEPIN-ERRVIRIVN 64

Query: 63 KLS-QIEWFKKSVRDIRAFKVEDFSDFTEIVKS 94
KL+ + W + V++ ++ + E ++
Sbjct: 65 KLTKKFTWLGECVKEFDITEIGEQYSLKETIQD 97


ORFs having significant similarity with known virulence factors
LocusTag    Hits    Score    E-value    Comments
C695_01625    PF05272    29    0.011    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.011
Identities = 9/18 (50%), Positives = 11/18 (61%)

Query: 8 LILSGPSGAGKSTLTKYL 25
++L G G GKSTL L
Sbjct: 599 VVLEGTGGIGKSTLINTL 616


ORFs having significant similarity with known virulence factors
LocusTag    Hits    Score    E-value    Comments
C695_01630    IGASERPTASE    60    2e-11    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 60.1 bits (145), Expect = 2e-11
Identities = 51/266 (19%), Positives = 87/266 (32%), Gaps = 19/266 (7%)

Query: 140 ELENLGDLEALVQEEPNNEEQLLPTLNDQEEKEEVKEEE------KEEVKEEEKEEVKEE 193
E+E N Q +E + +E E E V E
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAEN 1043

Query: 194 EKEEVKETPQEEKKPKDDETQEGETLKDKEVSKELEA-PQELEIPKEETQEQDPIKEETQ 252
K+E K + E+ + Q E KE ++A Q E+ + + + +T
Sbjct: 1044 SKQESKTVEKNEQDATETTAQNRE--VAKEAKSNVKANTQTNEVAQ---SGSETKETQTT 1098

Query: 253 ENKEEKQEKTQDSPSAQELEAMQELVKEIQENSNGQENKEKTQESAEIPQDKEIQEVVTE 312
E KE + ++ E E QE+ K + S QE E Q AE ++ + + E
Sbjct: 1099 ETKETATVEKEEKAKV-ETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKE 1157

Query: 313 KTQAQ----ELEVPKEKTQESAEALQETQAHELEKQEIAETPQDVEIPQSQDKEVQELE- 367
+ E P ++T + E + E P++ +Q E
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSN 1217

Query: 368 IPKEETQENTET-PQDVETPQEKETQ 392
PK + + + P +VE
Sbjct: 1218 KPKNRHRRSVRSVPHNVEPATTSSND 1243



Score = 54.3 bits (130), Expect = 1e-09
Identities = 39/252 (15%), Positives = 87/252 (34%), Gaps = 15/252 (5%)

Query: 257 EKQEKTQDSPSAQELEAMQELVKEIQENSNGQENKEKTQESAEIPQDKEIQEVVTEKTQA 316
EK+ +T D+ + +Q V + N+ ++ P +
Sbjct: 986 EKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSK 1045

Query: 317 QELEVPKEKTQESAEAL-QETQAHELEKQEIAETPQDVEIPQSQDKE------------- 362
QE + ++ Q++ E Q + + K + Q E+ QS +
Sbjct: 1046 QESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETAT 1105

Query: 363 VQELEIPKEETQENTETPQ-DVETPQEKETQEDHYESIEDIPEPVMAKAMGEELPFLNEA 421
V++ E K ET++ E P+ + ++E E E E + E N
Sbjct: 1106 VEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTT 1165

Query: 422 VAKIPNNENDTETPKESVTETSKNENNTETPQEKEESDKTSSPLELRLNLQDLLKSLNQE 481
+ + ++ VTE++ + E + ++ + + K+ ++
Sbjct: 1166 ADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRR 1225

Query: 482 SLKSLLENKTLS 493
S++S+ N +
Sbjct: 1226 SVRSVPHNVEPA 1237



Score = 50.1 bits (119), Expect = 2e-08
Identities = 27/165 (16%), Positives = 57/165 (34%), Gaps = 3/165 (1%)

Query: 168 QEEKEEVKEEEKEEVKEEEKEEVKEEEKEEVKETPQEEKKPKDDETQEGETLKDKEVSKE 227
E +E + E +E EKEE + E E+ +E P+ + + Q E ++E
Sbjct: 1089 GSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARE 1148

Query: 228 LEAPQELEIPKEETQEQDPIKEETQENKEEKQEKTQDSPSAQELEAMQELVKEIQENSNG 287
+ ++ P+ +T + Q KE Q + + +V+ + +
Sbjct: 1149 NDPTVNIKEPQSQTNTT---ADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPA 1205

Query: 288 QENKEKTQESAEIPQDKEIQEVVTEKTQAQELEVPKEKTQESAEA 332
ES+ P+++ + V + + A
Sbjct: 1206 TTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALC 1250


ORFs having significant similarity with known virulence factors
LocusTag    Hits    Score    E-value    Comments
C695_01645    FLGLRINGFLGH    195    9e-65    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 195 bits (496), Expect = 9e-65
Identities = 52/172 (30%), Positives = 84/172 (48%), Gaps = 18/172 (10%)

Query: 56 GERPLFADRRAMKPNDLITIIVSEKASANYSSS----KDYKSASGGNSTPPRLTYNGLDE 111
G +PLF DRR D +TI++ E SA+ SSS +D K+ G ++ P L GL
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYL--QGLFG 118

Query: 112 RKKKEAEYLDDKNNYNFTKSSNNTNFKGGGSQKKSEDLEIVLSARIIKVLENGNYFIYGN 171
+ + E S F G G S L+ + +VL NGN + G
Sbjct: 119 NARADVEA------------SGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGE 166

Query: 172 KEVLVDGEKQILKVSGVIRPYDIERNNTIQSKFLADAKIEYTNLGHLSDSNK 223
K++ ++ + ++ SGV+ P I +NT+ S +ADA+IEY G+++++
Sbjct: 167 KQIAINQGTEFIRFSGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQN 218


ORFs having significant similarity with known virulence factors
LocusTag    Hits    Score    E-value    Comments
C695_01655    SACTRNSFRASE    27    0.029    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 27.2 bits (60), Expect = 0.029
Identities = 14/48 (29%), Positives = 20/48 (41%), Gaps = 3/48 (6%)

Query: 103 GETILKALEFIAFE---EFQLHSLHLEVMENNFKAIAFYEKNHYELEG 147
+ + AL A E E L LE + N A FY K+H+ +
Sbjct: 103 KKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGA 150


5    C695_02165    C695_02365    Y    Y    Y    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
C695_02165015-3.376998hypothetical protein
C695_02170217-3.914583hypothetical protein
C695_02175227-5.522528phage/colicin/tellurite resistance cluster terY
C695_02180427-6.247917hypothetical protein
C695_02185421-4.986246hypothetical protein
C695_02190520-5.335243protein phosphatase 2C
C695_02195619-5.793961protein kinase C-like protein
C695_02200720-6.533594hypothetical protein
C695_02220720-6.393825IS605 transposase (tnpA)
C695_02225722-6.409794IS605 transposase (tnpB)
C695_02230723-7.084190hypothetical protein
C695_02235723-7.153555DNA topoisomerase I (topA)
C695_02240625-7.732180VirB4-like protein
C695_02245524-6.062474hypothetical protein
C695_02250219-5.910506hypothetical protein
C695_02255219-5.779940hypothetical protein
C695_02260220-5.779116hypothetical protein
C695_02265421-5.979989hypothetical protein
C695_02270320-5.564156hypothetical protein
C695_02300220-6.469334hypothetical protein
C695_023051027-7.884195hypothetical protein
C695_02320826-7.903464hypothetical protein
C695_02325825-7.562044hypothetical protein
C695_02330724-7.600283hypothetical protein
C695_02335723-7.798916hypothetical protein
C695_02340423-7.476895hypothetical protein
C695_02345421-7.225694protein VirB4
C695_02350019-5.214554VirB7 type IV secretion protein
C695_02355-218-4.127555hypothetical protein
C695_02360-117-3.652341hypothetical protein
C695_02365-116-3.504481type I restriction enzyme S protein (hsdS)
ORFs having significant similarity with known virulence factors
LocusTag    Hits    Score    E-value    Comments
C695_02230    PF04335    93    3e-24    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 93.0 bits (231), Expect = 3e-24
Identities = 35/213 (16%), Positives = 72/213 (33%), Gaps = 13/213 (6%)

Query: 144 FEEVRD-ASVIYHLEKKLGDYIFYVACFFFGTTALLIILLTILLPLKQKEPYLVQFSNNK 202
FEE ++ + VA ++ + L PLK EPY++ N
Sbjct: 14 FEEAASWERDKLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNT 73

Query: 203 ENFALVQ--KADSSITANKALIRSLVGAYVLNRESITHIEQHEKMRQNTIKEQSSNEVWY 260
++ D++IT ++A+ + + YV RE + + + + S+
Sbjct: 74 GEASIAAKLHGDATITYDEAVRKYFLATYVRYREGWIAAAR--EEYFDAVMVMSARPEQD 131

Query: 261 EFEKLIA-----HYDSIYTNPLLTRKVKIANI-YLDKDLAYIDIEVSLYHSGELESLKRY 314
+ + +I N V+I + +L ++A + +G +
Sbjct: 132 RWSRFYKTDNPQSPQNILAN-RTDVFVEIKRVSFLGGNVAQVYFTKESV-TGSNSTKTDA 189

Query: 315 KVVMSFEFKKQEINFDSMSLNPTGFMVTSYDVT 347
+ ++ NP G+ V SY
Sbjct: 190 VATIKYKVDGTPSKEVDRFKNPLGYQVESYRAD 222


ORFs having significant similarity with known virulence factors
LocusTag    Hits    Score    E-value    Comments
C695_02270    adhesinb    29    0.032    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 29.0 bits (65), Expect = 0.032
Identities = 30/123 (24%), Positives = 48/123 (39%), Gaps = 12/123 (9%)

Query: 53 ERTPWDLESLLRDYNFVFSNTTGQHNQALERKETPYFDSVIVDEAAKANPLELLMVMALA 112
E P D++ + +F N G + LE +F ++ E AK + ++
Sbjct: 71 EPLPEDVKKT-SQADLIFYN--GIN---LETGGNAWFTKLV--ENAKKKENKDYYAVSEG 122

Query: 113 KERIILVGDDRQL---PHY-LDDEIGKKLEDESQDAQDEIEKALKDSMFKKLKERAQKLK 168
+ I L G + PH L+ E G E + A K++ K LK +KL
Sbjct: 123 VDVIYLEGQSEKGKEDPHAWLNLENGIIYAQNIAKRLSEKDPANKETYEKNLKAYVEKLS 182

Query: 169 ELD 171
LD
Sbjct: 183 ALD 185


6    C695_02425    C695_02510    Y    Y    N    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
C695_02425014-3.459637molybdenum ABC transporter periplasmic
C695_02430012-3.661176molybdenum ABC transporter ModB
C695_02435-19-2.079601molybdenum ABC transporter ATP-binding protein
C695_02440-110-2.109893glutamyl-tRNA ligase
C695_02445-111-2.685576hypothetical protein
C695_02450-211-2.772403adenine-specific DNA methyltransferase
C695_02455013-1.275278hypothetical protein
C695_02460017-0.310190GTP-binding protein TypA
C695_02465224-4.309932adenine-specific DNA methyltransferase
C695_02470018-3.164216type II adenine specific DNA methyltransferase
C695_02475416-0.096997hypothetical protein
C695_024806180.465046Type II DNA modification enzyme
C695_02485116-0.364859type II DNA modification
C695_02490217-0.244364hypothetical protein
C695_024952160.479912catalase-like protein
C695_025003170.515368hypothetical protein
C695_02505314-1.168031hypothetical protein
C695_02510314-1.252926hypothetical protein
ORFs having significant similarity with known virulence factors
LocusTag    Hits    Score    E-value    Comments
C695_02435    PF05272    30    0.007    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.007
Identities = 11/23 (47%), Positives = 14/23 (60%)

Query: 30 VVALLGESGAGKSTILRILAGLE 52
V L G G GKST++ L GL+
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with known virulence factors
LocusTag    Hits    Score    E-value    Comments
C695_02460    TCRTETOQM    196    3e-57    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 196 bits (501), Expect = 3e-57
Identities = 115/461 (24%), Positives = 190/461 (41%), Gaps = 67/461 (14%)

Query: 3 NIRNIAVIAHVDHGKTTLVDGLLSQSGTFSEREKVDE--RVMDSNDLERERGITILSKNT 60
I NI V+AHVD GKTTL + LL SG +E VD+ D+ LER+RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 61 AIYYKDTKINIIDTPGHADFGGEVERVLKMVDGVLLLVDAQEGVMPQTKFVVKKALSFGI 120
+ +++TK+NIIDTPGH DF EV R L ++DG +LL+ A++GV QT+ + GI
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 121 CPIVVVNKIDKPAAEPDRVVDEVFDLF---------VAMGASDKQLDFPV-----VYAAA 166
I +NKID+ + V ++ + V + + +F
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEG 181

Query: 167 RDGYAMKSLDDE----------------------------KKNL--EPLFETILEHVPSP 196
D K + + K N+ + L E I S
Sbjct: 182 NDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSS 241

Query: 197 SGSVDEPLQMQIFTLDYDNYVGKIGIARVFNGSVKKNESVLLMKSDGSKENGRITKLIGF 256
+ L ++F ++Y ++ R+++G + +SV + KE +IT++
Sbjct: 242 THRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRI----SEKEKIKITEMYTS 297

Query: 257 LGLARTEIENAYAGDIVAIAG--FNAMDV-GDSVVDPANPMPLDPMHLEEPTMSVYFAVN 313
+ +I+ AY+G+IV + V GD+ + P +P P + +
Sbjct: 298 INGELCKIDKAYSGEIVILQNEFLKLNSVLGDTKLLPQRERIENP----LPLLQTTVEPS 353

Query: 314 DSPLAGLEGKHVTANKLKDRLLKEMQTNIAMKCEEMGEGKFKVSGRGELQITILAENLRR 373
+ + D LL+ + + +S G++Q+ + L+
Sbjct: 354 KPQQREMLLDALLEISDSDPLLRYYVDSAT--------HEIILSFLGKVQMEVTCALLQE 405

Query: 374 E-GFEFSISRPEVIIKEENGVKCEPFEHLVIDTPQDFSGAI 413
+ E I P VI E K E H+ + P F +I
Sbjct: 406 KYHVEIEIKEPTVIYMERPLKKAEYTIHIEVP-PNPFWASI 445



Score = 41.8 bits (98), Expect = 8e-06
Identities = 20/80 (25%), Positives = 30/80 (37%), Gaps = 1/80 (1%)

Query: 396 EPFEHLVIDTPQDFSGAIIERLGKRKAEMKAMNPMSDGYTRLEFEIPARGLIGYRSEFLT 455
EP+ I PQ++ K A + + + L EIPAR + YRS+
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCIQEYRSDLTF 595

Query: 456 DTKGEGVMNHSFLEFRPFSG 475
T G V + +G
Sbjct: 596 FTNGRSVCLTELKGYHVTTG 615


7    C695_02660    C695_02810    Y    Y    N    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
C695_02660019-3.193815GTPase Era
C695_02665317-2.954805hypothetical protein
C695_02670619-3.129176hypothetical protein
C695_02675720-2.317964cag pathogenicity island protein (cag1)
C695_02680820-2.108248cag pathogenicity island protein Epsilon
C695_02685819-2.222887cag island protein
C695_02690819-2.089634cag pathogenicity island protein (cag3)
C695_02695916-2.394296cag pathogenicity island protein (cag4)
C695_02700917-2.753506cag pathogenicity island protein Beta
C695_02705920-3.132826cag pathogenicity island protein alpha
C695_02710820-3.377970cag pathogenicity island protein (cag6)
C695_02715821-3.384000hypothetical protein
C695_02720920-3.342676cag pathogenicity island protein (cag7)
C695_02725926-4.249027cag pathogenicity island protein (cag8)
C695_027301030-4.372210cag pathogenicity island protein (cag9)
C695_027351431-5.097389cag pathogenicity island protein (cag10)
C695_027401435-5.267525cag pathogenicity island protein (cag11)
C695_027451228-5.255853cag pathogenicity island protein T
C695_027501024-5.514644cag pathogenicity island protein T
C695_027551023-5.416532cag pathogenicity island protein S
C695_02760620-4.045374cag pathogenicity island protein R
C695_02765619-2.847340hypothetical protein
C695_02770519-2.819240cag pathogenicity island protein (cag16)
C695_02775620-3.129008cag pathogenicity island protein (cag17)
C695_02780519-2.938323cag pathogenicity island protein (cag18)
C695_02785520-3.164902cag pathogenicity island protein (cag19)
C695_02790520-3.203122cag pathogenicity island protein (cag20)
C695_02795622-4.132104cag pathogenicity island protein (cag21)
C695_02800622-3.419278cag pathogenicity island protein (cag22)
C695_02805420-2.486568cag pathogenicity island protein (cag23)
C695_02810218-1.018586cag pathogenicity island protein (cag24)
ORFs having significant similarity with known virulence factors
LocusTag    Hits    Score    E-value    Comments
C695_02660    PF03944    31    0.005    delta endotoxin
		>PF03944#delta endotoxin

Length = 633

Score = 31.2 bits (70), Expect = 0.005
Identities = 25/94 (26%), Positives = 47/94 (50%), Gaps = 3/94 (3%)

Query: 68 LHHQEKLLNQCMLSQALKAMGDAELCVFLASVHDDLKGYEEFLSLCQKPHILALSKIDTA 127
L E+ LNQ + + + A +AEL A+V + + + FL+ + L+++
Sbjct: 94 LRETERFLNQRLNTDTV-ARVNAELTGLQANVEEFNRQVDNFLNPNRNAVPLSITSSVNT 152

Query: 128 THKQVLQKLQEYQQYDSQFLALVPLSAKKSQNLN 161
+ L +L ++Q Q L L+PL A+ + NL+
Sbjct: 153 MQQLFLNRLPQFQMQGYQLL-LLPLFAQAA-NLH 184


ORFs having significant similarity with known virulence factors
LocusTag    Hits    Score    E-value    Comments
C695_02690    PF07201    30    0.019    Hypersensitivity response secretion protein HrpJ
		>PF07201#Hypersensitivity response secretion protein HrpJ

Length = 293

Score = 30.2 bits (68), Expect = 0.019
Identities = 14/76 (18%), Positives = 26/76 (34%), Gaps = 15/76 (19%)

Query: 277 APENSKEKLIEELIANSQLVANEEEREKKLLAEKEKQ--------EAELAKY--KLKDLE 326
S + EE+ E +E L K E ++ +Y K+ +LE
Sbjct: 44 GTLQSIADMAEEVTF-----VFSERKELSLDKRKLSDSQARVSDVEEQVNQYLSKVPELE 98

Query: 327 NQKKLKALEAELKKKN 342
++ + L + L
Sbjct: 99 QKQNVSELLSLLSNSP 114


ORFs having significant similarity with known virulence factors
LocusTag    Hits    Score    E-value    Comments
C695_02720    IGASERPTASE    35    0.003    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 35.4 bits (81), Expect = 0.003
Identities = 33/173 (19%), Positives = 71/173 (41%), Gaps = 23/173 (13%)

Query: 970 KARNEKEKKECEKLLTPEAKKLLEEAKKSVKAYLDCVSQAKTEAEKKECEKLLTPEARKL 1029
+ NE+ + E + P A ++ + + ++KT + ++ T + R++
Sbjct: 1011 PSNNEEIARVDEAPVPPPAPAT--PSETTETVAENSKQESKTVEKNEQDATETTAQNREV 1068

Query: 1030 LEEAKESVKAYKDCVSKARNEKEKKECEKLLTPEAKKLLEQQVLDCLKNAKTEADKKRCV 1089
+EAK +VKA A++ E KE + T E + +++ AK E +K + V
Sbjct: 1069 AKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEE------KAKVETEKTQEV 1122

Query: 1090 KDLPKDLQKKVLAKESVKAYLDCVSRARNEKEKKECEKLLTPEAKKLLEEAKE 1142
KV ++ S K + ++E + + E + ++E +
Sbjct: 1123 --------PKVTSQVSPK-------QEQSETVQPQAEPARENDPTVNIKEPQS 1160



Score = 35.4 bits (81), Expect = 0.003
Identities = 30/179 (16%), Positives = 65/179 (36%), Gaps = 12/179 (6%)

Query: 713 TPEAKKLLEEEAKESVKAYLDCVSQAKTEAEKKECEKLLTPEAKKKLEEAKKSVKAYLDC 772
P ++ + + ++KT + ++ T + ++ +EAK +VKA
Sbjct: 1023 APVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQT 1082

Query: 773 VSRARNEKEKKECEKLLTPEAKKLLEQQALDCLKNAKTDKERKKCLKDLPKDLQKKVLAK 832
A++ E KE + T E + +++ KT + K + PK Q + +
Sbjct: 1083 NEVAQSGSETKETQTTETKETATVEKEEKAKVET-EKTQEVPKVTSQVSPKQEQSETVQP 1141

Query: 833 ESVKAYLDCVSQAKTEAEKKECEKYLDCVSQAKNEAEKKECEKLLTLESKKKLEEAKKS 891
QA+ E + SQ A+ ++ K + ++ + E+
Sbjct: 1142 -----------QAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTV 1189



Score = 34.7 bits (79), Expect = 0.004
Identities = 25/191 (13%), Positives = 62/191 (32%), Gaps = 10/191 (5%)

Query: 842 VSQAKTEAEKKECEKYLDCVSQAKNEAEKKECEKLLTLESKKKLEEAKKSVKAYLDCVSQ 901
+ +E + + A E + + SK++ + +K+ Q
Sbjct: 1005 ADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKN--------EQ 1056

Query: 902 AKTEAEKKECEKLLTPEAKKLLEQQALDCLKNAKTEADKKRCVKDLPKDLQKKVLAKESL 961
TE + E ++ Q + ++ + + ++K+ AK
Sbjct: 1057 DATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVET 1116

Query: 962 KAYKDCVSKARNEKEKKECEKLLTPEAKKLLEEAKKSVKAYLDCVSQAKTEAEKKECEKL 1021
+ ++ K+E + + P+A+ E + SQ T A+ ++ K
Sbjct: 1117 EKTQEVPKVTSQVSPKQEQSETVQPQAEPAREND--PTVNIKEPQSQTNTTADTEQPAKE 1174

Query: 1022 LTPEARKLLEE 1032
+ + + E
Sbjct: 1175 TSSNVEQPVTE 1185



Score = 34.7 bits (79), Expect = 0.005
Identities = 33/185 (17%), Positives = 75/185 (40%), Gaps = 8/185 (4%)

Query: 968 VSKARNEKEKKECEKLLTPEAKKLLEEAKKSVKAYLDC--VSQAKTEAEKKECEKLLTPE 1025
SK + E+ E T + +++ +EAK +VKA V+Q+ +E ++ + +
Sbjct: 1047 ESKTVEKNEQDATET--TAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETA 1104

Query: 1026 ARKLLEEAKESVKAYKDCVSKARNEKEKKECEKLLTPEAKKLLEQQVLDCLKNAK----T 1081
+ E+AK + ++ K+E + + P+A+ E +K + T
Sbjct: 1105 TVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNT 1164

Query: 1082 EADKKRCVKDLPKDLQKKVLAKESVKAYLDCVSRARNEKEKKECEKLLTPEAKKLLEEAK 1141
AD ++ K+ ++++ V +V V N + + + K +
Sbjct: 1165 TADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHR 1224

Query: 1142 ESLKA 1146
S+++
Sbjct: 1225 RSVRS 1229


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C695_02725  TYPE4SSCAGX  875    0.0      Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 875 bits (2262), Expect = 0.0
Identities = 514/522 (98%), Positives = 518/522 (99%)

Query: 1 MGQAFFKKIVGCFCLGYLFLSSAIEAAALDIKNFNRGRVKVVNKKIAYLGDEKPITIWTS 60
MGQAFFKKIVGCFCLGYLFLSSAIEA ALDIKNFNRGRVKVVNKKIAYLGDEKPITIWTS
Sbjct: 1 MGQAFFKKIVGCFCLGYLFLSSAIEAVALDIKNFNRGRVKVVNKKIAYLGDEKPITIWTS 60

Query: 61 LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR 120
LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR
Sbjct: 61 LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR 120

Query: 121 DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANL 180
DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANL
Sbjct: 121 DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANL 180

Query: 181 ENLTNAMSNPQNLSNNKNLSEFIKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA 240
ENLTNAMSNPQNLSNNKNLSE IKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA
Sbjct: 181 ENLTNAMSNPQNLSNNKNLSELIKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA 240

Query: 241 EETIKQRAKDKINIKTDKPQKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD 300
EE ++QRAKDKI+IKTDK QKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD
Sbjct: 241 EEAVRQRAKDKISIKTDKSQKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD 300

Query: 301 NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQKELIKQENLNTTAYINRVMMASNE 360
NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQ+ELIKQENLNTTAYINRVMMASNE
Sbjct: 301 NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELIKQENLNTTAYINRVMMASNE 360

Query: 361 QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF 420
QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF
Sbjct: 361 QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF 420

Query: 421 DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTNSGLRWYRVNEIAEKFKLIK 480
DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTNSGLRWYRVNEIAEKFKLIK
Sbjct: 421 DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTNSGLRWYRVNEIAEKFKLIK 480

Query: 481 DKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVRDK 522
DKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVRDK
Sbjct: 481 DKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVRDK 522


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
C695_02735  PF04335  118    8e-35    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 118 bits (297), Expect = 8e-35
Identities = 44/205 (21%), Positives = 74/205 (36%), Gaps = 10/205 (4%)

Query: 27 KLNKANRTFKRAFYL---SMALNVAAVTSIVMMMPLKKTDIFVYGIDRYTGEFKIVKRSD 83
KL A R+ K A+ + + AL A V ++ + PLK + +V +DR TGE I +
Sbjct: 24 KLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNTGEASIAAKLH 83

Query: 84 A-RQIVNSEAVVDSATSKFVSLLFGYSKNSLRDRKDQLMQYCDVSFQTQAMRMFNENIRQ 142
I EAV + +V G+ + + D +M Q + R + + Q
Sbjct: 84 GDATITYDEAVRKYFLATYVRYREGWIAAAREEYFDAVMVMSARPEQDRWSRFYKTDNPQ 143

Query: 143 FVDKVRA-EAIISSNIQREKVKNSPLTRLTFFITIKITPDTMENYEYITKKQVTIYYDFA 201
+ A + I + +F +T T TI Y
Sbjct: 144 SPQNILANRTDVFVEI-KRVSFLGGNVAQVYFTKESVTGSNS----TKTDAVATIKYKVD 198

Query: 202 RGNSSQENLIINPFGFKVFDIQITD 226
S + + NP G++V +
Sbjct: 199 GTPSKEVDRFKNPLGYQVESYRADV 223


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C695_02775  TYPE4SSCAGX  28     0.042    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 28.2 bits (62), Expect = 0.042
Identities = 28/119 (23%), Positives = 55/119 (46%), Gaps = 16/119 (13%)

Query: 24 AINTALLPSEYKELVALGFKKIKTLHQRHDDEEVTKEEKEFATNALREKLRNDRARAEQI 83
A+N AL+ +Y+E + K K + D +E+ +++K EK + + +A++
Sbjct: 112 AVNFALMTRDYQEFL----KTKKLIVDAPDPKELEEQKKAL------EKEKEAKEQAQKA 161

Query: 84 QKNIEAFEKKNNSSIQKKAAKHKGLQELNEINATPLNDNPNSNSSTETKSNKDDNFDEM 142
QK+ K +++A L+ L + P N + N N S K +++ D+M
Sbjct: 162 QKD------KREKRKEERAKNRANLENLTNAMSNPQNLSNNKNLSELIKQQRENELDQM 214


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C695_02805  ACRIFLAVINRP  32     0.012    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 32.1 bits (73), Expect = 0.012
Identities = 20/88 (22%), Positives = 32/88 (36%), Gaps = 18/88 (20%)

Query: 19 EVQKRQFKKIEELKADMQKGVNPFFKVLFDGGNRLFGFPETFIYSSI-------FILFVT 71
+ K K+ EL+ +G+ +D F+ SI F +
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMK--VLYPYD--------TTPFVQLSIHEVVKTLFEAIML 350

Query: 72 IVLSVILF-QAYEPVLIVAIVIVLVALG 98
+ L + LF Q LI I + +V LG
Sbjct: 351 VFLVMYLFLQNMRATLIPTIAVPVVLLG 378


8  C695_03470  C695_03595  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
C695_03470  0       15       -3.671497  hypothetical protein
C695_03475  0       14       -2.208476  hypothetical protein
C695_03480  0       14       -2.604759  aspartate aminotransferase
C695_03485  0       11       -2.256844  hypothetical protein
C695_03490  -1      12       0.153635   integrase-recombinase protein
C695_03495  2       11       0.636543   methylated-DNA--protein-cysteine
C695_03500  1       11       1.220295   hypothetical protein
C695_03505  0       12       1.699333   putative lipopolysaccharide biosynthesis
C695_03510  0       12       1.720593   ribonucleotide-diphosphate reductase subunit
C695_03515  5       14       1.079586   hypothetical protein
C695_03520  3       10       0.680666   hypothetical protein
C695_03525  1       11       0.197756   bifunctional N-acetylglucosamine-1-phosphate
C695_03530  1       10       0.920697   flagellar biosynthesis protein
C695_03535  1       11       1.053150   flagellar biosynthesis protein FliP
C695_03540  1       11       1.611593   iron(III) dicitrate transport protein (fecA)
C695_03545  -2      11       2.153537   ferrous iron transport protein B
C695_03550  -1      14       3.053456   hypothetical protein
C695_03555  1       14       4.331793   acetyl coenzyme A acetyltransferase
C695_03560  2       16       3.553978   succinyl-CoA-transferase subunit A
C695_03565  3       15       3.994508   succinyl-CoA-transferase subunit B
C695_03570  4       15       3.617762   short-chain fatty acids transporter
C695_03580  4       16       3.017675   hypothetical protein
C695_03590  4       15       2.907681   hydantoin utilization protein A (hyuA)
C695_03595  2       13       1.972905   Hydantoin utilization protein B
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C695_03530  FLGBIOSNFLIP  75     4e-20    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 75.3 bits (185), Expect = 4e-20
Identities = 33/97 (34%), Positives = 49/97 (50%), Gaps = 3/97 (3%)

Query: 1 MRFFIFLILICPLICPLMSADSALPSVNLSLNAPNDPKQLVTTLNVIALLTLLVLAPSLI 60
MR + + + L A + LP + S P + + + +T L P+++
Sbjct: 1 MRRLLSVAPVL-LWLITPLAFAQLPGIT-SQPLPGGGQSWSLPVQTLVFITSLTFIPAIL 58

Query: 61 LVMTSFTRLIVVFSFLRTALGTQQTPPHSNF-SLALF 96
L+MTSFTR+I+VF LR ALGT PP+ LALF
Sbjct: 59 LMMTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALF 95


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C695_03535  FLGBIOSNFLIP  174    4e-58    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 174 bits (442), Expect = 4e-58
Identities = 70/127 (55%), Positives = 98/127 (77%)

Query: 1 MDKKISYTEAFEKSALPFKEFMLKNTREKDLALFFRIRNLPNPKTPDEVSLSVLIPAFMI 60
++KIS EA EK A P +EFML+ TRE DL LF R+ N + P+ V + +L+PA++
Sbjct: 117 SEEKISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVT 176

Query: 61 SELKTAFQIGFLLYLPFLVIDMVISSILMAMGMMMLPPVMISLPFKILVFILVDGFNLLT 120
SELKTAFQIGF +++PFL+ID+VI+S+LMA+GMMM+PP I+LPFK+++F+LVDG+ LL
Sbjct: 177 SELKTAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLV 236

Query: 121 ENLVASF 127
+L SF
Sbjct: 237 GSLAQSF 243


9  C695_03680  C695_03760  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
C695_03680  3       13       -0.675718  RNA polymerase factor sigma-54
C695_03685  2       14       0.327007   putative ABC transporter ATP-binding protein
C695_03690  2       12       0.118204   hypothetical protein
C695_03695  1       13       0.939341   DNA polymerase III subunits gamma and tau
C695_03700  1       14       2.766616   hypothetical protein
C695_03705  2       14       3.883379   hypothetical protein
C695_03710  3       16       3.658791   hypothetical protein
C695_03715  3       16       3.540337   hypothetical protein
C695_03720  2       15       3.233233   hypothetical protein
C695_03725  1       13       2.490218   L-asparaginase II
C695_03730  0       12       0.949210   anaerobic C4-dicarboxylate transporter
C695_03735  0       13       -0.212185  hypothetical protein
C695_03740  1       13       -1.645059  hypothetical protein
C695_03745  2       15       -3.154402  transcriptional regulator
C695_03750  2       13       -2.779082  tRNA(Ile)-lysidine synthetase
C695_03755  2       13       -2.555609  hypothetical protein
C695_03760  2       11       -1.384075  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C695_03695  IGASERPTASE  34     0.002    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 33.9 bits (77), Expect = 0.002
Identities = 19/63 (30%), Positives = 27/63 (42%), Gaps = 1/63 (1%)

Query: 510 SKPKPTTETTAETKEKETKEKEIQENDTKEIQEVQPKQAPTALQEFMANHSEL-IEEIKS 568
+ P TTET AE ++E+K E E D E + A A AN + + S
Sbjct: 1031 ATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGS 1090

Query: 569 EFE 571
E +
Sbjct: 1091 ETK 1093


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits  Score  E-value  Comments
C695_03710  SECA  24     0.038    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 23.7 bits (51), Expect = 0.038
Identities = 8/17 (47%), Positives = 12/17 (70%)

Query: 8 ELEEKTKGLSDEEIKAK 24
+E + + LSDEE+K K
Sbjct: 30 AMEPEMEKLSDEELKGK 46


10  C695_04475  C695_04700  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
C695_04475  2       14       2.560373   alkylphosphonate uptake protein
C695_04480  2       15       2.680021   hypothetical protein
C695_04485  3       14       2.736855   hypothetical protein
C695_04490  3       14       2.328224   catalase
C695_04495  2       15       1.905739   iron-regulated outer membrane protein
C695_04500  3       21       -2.053656  Holliday junction resolvase
C695_04505  6       25       -3.725763  hypothetical protein
C695_04510  3       18       -1.803472  hypothetical protein
C695_04515  0       15       -2.526841  hypothetical protein
C695_04520  3       13       -1.833375  hypothetical protein
C695_04525  3       12       -1.264168  hypothetical protein
C695_04530  3       12       -1.095958  hypothetical protein
C695_04535  2       12       0.118964   Holliday junction DNA helicase RuvA
C695_04540  2       12       0.257150   hypothetical protein
C695_04545  1       14       1.341518   virulence factor MviN protein
C695_04550  0       15       1.970690   cysteinyl-tRNA ligase
C695_04565  1       17       2.248283   iron compounds ABC transporter ATP-binding
C695_04570  0       19       2.234647   iron(III) dicitrate ABC transporter permease
C695_04575  0       13       -1.015359  short-chain oxidoreductase
C695_04580  2       15       0.854209   hypothetical protein
C695_04585  2       17       0.841324   hypothetical protein
C695_04590  0       16       2.158664   hypothetical protein
C695_04595  1       16       2.718545   hypothetical protein
C695_04600  0       15       2.812971   hypothetical protein
C695_04605  1       16       3.288334   hypothetical protein
C695_04610  0       15       2.392720   **hypothetical protein
C695_04615  0       15       1.796854   hydrogenase isoenzymes formation protein HypD
C695_04620  0       17       0.456845   hydrogenase expression/formation protein
C695_04625  -1      17       0.680490   hydrogenase expression/formation protein HypB
C695_04630  -3      18       0.835909   hypothetical protein
C695_04635  2       16       1.804096   hypothetical protein
C695_04640  1       15       2.047414   acetate kinase
C695_04645  1       14       2.522278   acetate kinase A/propionate kinase 2
C695_04650  1       15       1.692701   phosphotransacetylase
C695_04665  2       14       0.861018   phosphotransacetylase (pta)
C695_04670  0       14       0.732563   hypothetical protein
C695_04675  -2      13       1.458710   flagellar basal body rod modification protein
C695_04680  -1      14       1.873795   flagellar hook protein FlgE
C695_04685  -1      12       1.474365   hypothetical protein
C695_04690  -1      12       2.211045   adenine-specific DNA methyltransferase
C695_04695  0       12       2.790335   rep helicase, single-stranded DNA-dependent
C695_04700  2       14       3.524116   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
C695_04550  OMS28PORIN  30     0.015    OMS28 porin signature.
		>OMS28PORIN#OMS28 porin signature.

Length = 257

Score = 30.1 bits (67), Expect = 0.015
Identities = 17/51 (33%), Positives = 32/51 (62%), Gaps = 4/51 (7%)

Query: 309 EEDLLVSKKRLDKIYRLKQRVLGTLGGINPNFKKEILECMQDDLNVSKALS 359
+E L+ S++ LD+ + Q+VL + G+NP+ K ++L +V+KA+S
Sbjct: 188 KETLMASERALDETVQEAQKVLNMVNGLNPSNKDQVLA----KKDVAKAIS 234


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C695_04575  DHBDHDRGNASE  93     2e-24    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 92.8 bits (230), Expect = 2e-24
Identities = 60/243 (24%), Positives = 110/243 (45%), Gaps = 12/243 (4%)

Query: 1 MGEKKESQKVAVITGASSGIGLECALMLLDQGYKVYALSRHATLCVALNHALC------E 54
M K K+A ITGA+ GIG A L QG + A+ + + +L E
Sbjct: 1 MNAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAE 60

Query: 55 SVDIDVSDSNALKEVFSNISAKEKYCDVLINSAGYGVFGSVEDTPIEEVKKQFSVNFFAL 114
+ DV DS A+ E+ + I + D+L+N AG G + EE + FSVN +
Sbjct: 61 AFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGV 120

Query: 115 CEVVQFCLPLLKNKPHSKIFNLSSIAGRVSMLFLGHYSASKHALEAYSDALRLELKPFNV 174
+ + ++ I + S V + Y++SK A ++ L LEL +N+
Sbjct: 121 FNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNI 180

Query: 175 QVCLIEPGPVKSNWEKTAFSVENFESEDSLYALEVNAAKSFYSGVYQNALS-PKAVAQKI 233
+ ++ PG +++ + + ++ EN + + + ++F +G+ L+ P +A +
Sbjct: 181 RCNIVSPGSTETDMQWSLWADENGAEQ-----VIKGSLETFKTGIPLKKLAKPSDIADAV 235

Query: 234 VFL 236
+FL
Sbjct: 236 LFL 238


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C695_04640  ACETATEKNASE  123    5e-37    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 123 bits (310), Expect = 5e-37
Identities = 48/117 (41%), Positives = 72/117 (61%), Gaps = 2/117 (1%)

Query: 1 MRNIEARK-EKGDKEAKLAFEMCAYRIKKHIGAYMVVLKKVDAIIFTGGLGENYSALRES 59
R++E + GDK A+LA + AYR+KK IG+Y + VD I+FT G+GEN +RE
Sbjct: 283 FRDLEDAAFKNGDKRAQLALNVFAYRVKKTIGSYAAAMGGVDVIVFTAGIGENGPEIREF 342

Query: 60 VCEGLENLGIALCKPTNDNPGSGLVNLSQPDAKIQILRIPTDEELEIALQTKKVLEK 116
+ +GLE LG L K N G + +S D+K+ ++ +PT+EE IA T+K++E
Sbjct: 343 ILDGLEFLGFKLDKEKNKVRGEEAI-ISTADSKVNVMVVPTNEEYMIAKDTEKIVES 398


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C695_04645  ACETATEKNASE  353    e-123    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 353 bits (907), Expect = e-123
Identities = 142/282 (50%), Positives = 192/282 (68%), Gaps = 6/282 (2%)

Query: 1 MEILVLNLGSSSIKFKLFDMKENKPLASGLAEKIGEEIGQLKIKSHLHHNDQELKEKFVI 60
M+ILV+N GSSS+K++L + K+ LA GLAE+IG L N +++K K +
Sbjct: 1 MKILVINCGSSSLKYQLIESKDGNVLAKGLAERIGINDSLLTHN----ANGEKIKIKKDM 56

Query: 61 KDHASGLLMIRENLT--KMGIIKDFNQIDAIGHRVVQGGDKFHAPVLVNEKVMQEIGNLS 118
KDH + ++ + L G+IKD ++IDA+GHRVV GG+ F + VL+ + V++ I +
Sbjct: 57 KDHKDAIKLVLDALVNSDYGVIKDMSEIDAVGHRVVHGGEYFTSSVLITDDVLKAITDCI 116

Query: 119 ILAPLHNPANLAGIEFVQKAHPHIPQIAVFDTAFHATMPSYAYMYALPYELYEKYQIRHY 178
LAPLHNPAN+ GI+ + P +P +AVFDTAFH TMP YAY+Y +PYE Y KY+IR Y
Sbjct: 117 ELAPLHNPANIEGIKACTQIMPDVPMVAVFDTAFHQTMPDYAYLYPIPYEYYTKYKIRKY 176

Query: 179 GFHRTSHHYVAKEAAKFLNTAYEEFNAISLHLGNGSSAAAIQKGKSVDTSMGLTPLEGLI 238
GFH TSH YV++ AA+ LN E I+ HLGNGSS AA++ GKS+DTSMG TPLEGL
Sbjct: 177 GFHGTSHKYVSQRAAEILNKPIESLKIITCHLGNGSSIAAVKNGKSIDTSMGFTPLEGLA 236

Query: 239 MGTRCGDIDPTVVEYTAQCANKSLEEVMKMLNHESGLKGICG 280
MGTR G IDP+++ Y + N S EEV+ +LN +SG+ GI G
Sbjct: 237 MGTRSGSIDPSIISYLMEKENISAEEVVNILNKKSGVYGISG 278


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C695_04670  IGASERPTASE  36     5e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 35.8 bits (82), Expect = 5e-04
Identities = 40/227 (17%), Positives = 70/227 (30%), Gaps = 11/227 (4%)

Query: 294 KKSEKTPIHAKTQTTAPSATPENAPKIPLKTPPLMPLIGANPPPNDNIPTPLEKEEKAKE 353
+ Q PS N + P+ P A TP E E E
Sbjct: 992 VDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPA---------TPSETTETVAE 1042

Query: 354 ASDNKEKTKETSNSAQNAQNTQASDKTSDNKSTAPKETIKHFTQQLKQEIQEYKPPMSRI 413
S + KT E + Q + + KS T + Q E +E + ++
Sbjct: 1043 NSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKE 1102

Query: 414 SMDLFPKELGKVEVIIQKVGKNLKVSVISHNNSLQTFLDNQQDLKNSLNALGFEGVDLSF 473
+ + +E KVE +K + KV+ Q+ Q N +
Sbjct: 1103 TATVEKEEKAKVET--EKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQS 1160

Query: 474 SQDSSKEQQAPKDQPKEPFKEQELTPLKENALKSYQENTDNENQETS 520
+++ + + P + ++ N S EN +N T+
Sbjct: 1161 QTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATT 1207



Score = 30.4 bits (68), Expect = 0.025
Identities = 48/240 (20%), Positives = 82/240 (34%), Gaps = 18/240 (7%)

Query: 184 PKTLKDIQTLSQKHDLNASNIQAATTPENKN------PLNASDQLALKTTQTPTNHTLAK 237
P+ K QT+ + +NIQA N A T + T T+A+
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAE 1042

Query: 238 NDAKNTANLSSVLQSLEKKEPQNKEHANPLNNEKKTPPLK--------EALEMNAIKRDK 289
N + + + Q + QN+E A + K E E + +
Sbjct: 1043 NSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKE 1102

Query: 290 TLSKKKSEKTPIHAKTQTTAPSATPENAPKIPLKTPPLMPLIGANP-PPNDNIPTPLEKE 348
T + +K EK + + P T + +PK + A P ND E +
Sbjct: 1103 TATVEKEEKAKVETEKTQEVPKVTSQVSPK---QEQSETVQPQAEPARENDPTVNIKEPQ 1159

Query: 349 EKAKEASDNKEKTKETSNSAQNAQNTQASDKTSDNKSTAPKETIKHFTQQLKQEIQEYKP 408
+ +D ++ KETS++ + + T ++ P+ T TQ KP
Sbjct: 1160 SQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKP 1219


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
C695_04680  FLGHOOKAP1  35     7e-04    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 35.3 bits (81), Expect = 7e-04
Identities = 12/33 (36%), Positives = 20/33 (60%)

Query: 2 NDTLLNAYSGIKTHQFGIDSLSNNIANVNTLGY 34
+ + NA SG+ Q +++ SNNI++ N GY
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGY 33



Score = 33.0 bits (75), Expect = 0.004
Identities = 10/48 (20%), Positives = 20/48 (41%)

Query: 557 IRHKYLETSNVNAGNALTNLILMQRGYSMNARAFGAGDDMIKEAISLK 604
+ ++ S VN NL Q+ Y NA+ + + I+++
Sbjct: 499 LSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546


11  C695_05040  C695_05200  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
C695_05040  3       15       1.192560   cell division protein FtsA
C695_05045  4       18       0.302152   cell division protein FtsZ
C695_05050  3       18       -2.028017  hypothetical protein
C695_05055  4       18       -2.486412  hypothetical protein
C695_05060  4       21       -2.715639  hypothetical protein
C695_05065  3       19       -2.894563  hypothetical protein
C695_05070  1       20       -4.042340  hypothetical protein
C695_05075  0       22       -4.796729  hypothetical protein
C695_05080  4       27       -6.567405  hypothetical protein
C695_05085  4       23       -4.839927  hypothetical protein
C695_05090  3       22       -5.318765  hypothetical protein
C695_05095  4       22       -5.234403  hypothetical protein
C695_05100  6       24       -5.892725  hypothetical protein
C695_05105  7       23       -6.403680  IS605 transposase (tnpA)
C695_05110  10      24       -6.383694  IS605 transposase (tnpB)
C695_05115  10      24       -6.366722  hypothetical protein
C695_05130  10      23       -6.318538  hypothetical protein
C695_05135  10      22       -6.088596  hypothetical protein
C695_05140  10      20       -5.066528  integrase/recombinase (xerD)
C695_05145  9       18       -4.425497  relaxase
C695_05150  4       17       -3.623438  IS605 transposase (tnpB)
C695_05155  3       15       -4.285240  IS605 transposase (tnpA)
C695_05160  5       15       -4.080184  adenine specific DNA methyltransferase
C695_05165  4       15       -4.287132  PARA protein
C695_05170  4       17       -4.367039  hypothetical protein
C695_05175  3       19       -5.833838  hypothetical protein
C695_05190  4       24       -6.771477  hypothetical protein
C695_05195  3       23       -6.678951  hypothetical protein
C695_05200  0       12       -3.660627  conjugal transfer protein (traG)
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C695_05040  SHAPEPROTEIN  42     3e-06    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 42.1 bits (99), Expect = 3e-06
Identities = 38/176 (21%), Positives = 67/176 (38%), Gaps = 12/176 (6%)

Query: 210 AASIATLSNDERELGVACVDMGGETCNLTIYSGNSIRYNKYLPVGSHHLTTDL------S 263
AA+I G VD+GG T + + S N + Y+ + +G + +
Sbjct: 146 AAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNGVVYSSSVRIGGDRFDEAIINYVRRN 205

Query: 264 HMLNTPFPYAEEVKIKYGDLSFEGGEETPSQNVQIPTTGSDGHESHIVPLSEIQTIMRER 323
+ AE +K + G S G+E V+ + +EI ++E
Sbjct: 206 YGSLIGEATAERIKHEIG--SAYPGDEVREIEVRGRNLAEGVPRGFTLNSNEILEALQEP 263

Query: 324 ALETFKIIHRSIQDSGLE---EHLGGGVVLTGGMALMKGIKELARTHFTNYPVRLA 376
+ +++ E + G+VLTGG AL++ + L T PV +A
Sbjct: 264 LTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNLDRLLM-EETGIPVVVA 318


12  C695_05695  C695_05815  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
C695_05695  2       9        1.737133   glucokinase
C695_05700  3       10       0.908499   cinnamyl-alcohol dehydrogenase ELI3-2 (cad)
C695_05705  1       11       1.814526   LPS biosynthesis protein
C695_05710  2       12       2.566375   hypothetical protein
C695_05715  0       14       2.990682   hypothetical protein
C695_05720  0       12       2.577841   pyruvate flavodoxin oxidoreductase subunit
C695_05725  -1      11       2.324365   pyruvate flavodoxin oxidoreductase subunit
C695_05730  -1      11       1.792141   pyruvate flavodoxin oxidoreductase subunit
C695_05735  0       11       -0.067361  ferrodoxin oxidoreductase beta subunit
C695_05740  2       12       -0.601803  adenylosuccinate lyase
C695_05745  3       15       -1.132474  hypothetical protein
C695_05750  1       15       -0.015657  excinuclease ABC subunit B
C695_05755  2       15       -0.321152  hypothetical protein
C695_05760  2       14       -0.334986  hypothetical protein
C695_05765  -1      15       0.036806   hypothetical protein
C695_05770  -1      14       0.222332   hypothetical protein
C695_05775  0       14       -0.070892  gamma-glutamyltranspeptidase
C695_05780  -1      13       -1.132643  flagellar hook-associated protein FlgK
C695_05785  0       16       -1.785503  hypothetical protein
C695_05790  1       19       -1.224688  cytosine specific DNA methyltransferase
C695_05795  2       16       -0.847544  hypothetical protein
C695_05800  2       14       -1.841524  hypothetical protein
C695_05805  3       15       -1.745184  peptidyl-prolyl cis-trans isomerase
C695_05810  3       17       -2.454588  hypothetical protein
C695_05815  3       15       -2.109638  peptidoglycan-associated lipoprotein precursor
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C695_05720  YERSSTKINASE  29     0.010    Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 29.3 bits (65), Expect = 0.010
Identities = 18/63 (28%), Positives = 33/63 (52%), Gaps = 9/63 (14%)

Query: 50 YNRVDDEPILNHERFMQPDYVLVIDPGLVFIENIFANEKEDTTYIITSYLNKEELFEKKP 109
++R ++P E F P+ + + N+ A+EK D ++++ L+ E FEK P
Sbjct: 293 HSRSGEQPKGFTESFKAPE---------LGVGNLGASEKSDVFLVVSTLLHCIEGFEKNP 343

Query: 110 ELK 112
E+K
Sbjct: 344 EIK 346


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
C695_05780  FLGHOOKAP1  570    0.0      Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 570 bits (1471), Expect = 0.0
Identities = 129/610 (21%), Positives = 229/610 (37%), Gaps = 75/610 (12%)

Query: 6 SSLNTSYTGLQAHQSMVDVTGNNISNASDEFYSRQRVIAKPQAAYMYGTKNVNMGVDVEA 65
S +N + +GL A Q+ ++ NNIS+ + Y+RQ I + + V GV V
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSG 61

Query: 66 IERVHDEFVFARYTKANYENTYYDTEFSHLKEASAYFPDIDEASLFTDLQDYFNSWKELS 125
++R +D F+ + A +++ + + + +SL T +QD+F S + L
Sbjct: 62 VQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNML-STSTSSLATQMQDFFTSLQTLV 120

Query: 126 KNAKDSAQKQALAQKTEALTHNIKDTRERLTTLQHKASEELKSVIKEVNSLGSQIAEINK 185
NA+D A +QAL K+E L + K T + L + + + + + ++N+ QIA +N
Sbjct: 121 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 180

Query: 186 RIKEVENNKSLKHANELRDKRDELEFHLRELLGGNVFKSSIKTHSLTDKDSADFDESYNL 245
+I + + N L D+RD+L L +++G V S +YN+
Sbjct: 181 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEV--------------SVQDGGTYNI 226

Query: 246 NIGHGFNIIDGSIFHPLVVKESENKGGLNQVYFQSDDFKVTNITDK-LNQGRVGALLNVY 304
+ +G++++ GS L S V + I +K LN G +G +L
Sbjct: 227 TMANGYSLVQGSTARQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFR 286

Query: 305 NDGSNGTLKGKLQDYIDLLDSFAKGLIESTNAIYAQSASHYIEGEPVEFNSDEAFKDTNY 364
+ L + L A E+ N + +A D N
Sbjct: 287 SQ--------DLDQTRNTLGQLALAFAEAFNTQH------------------KAGFDANG 320

Query: 365 NIKNGSFDL----IAYNTDGKEIARKTIAITPITTMNDIIQAINANTDDNQ-----DNNT 415
+ F + + NT K +T + + I+ + + Q N T
Sbjct: 321 DAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATDYKISFDNNQWQVTRLASNTT 380

Query: 416 ENDFDDYFTAGFNNETKKFVIQPKNASQGLFVSMKDNGTNFMGALKLNPFFQGDDASNIS 475
D + + + + + M L D + I+
Sbjct: 381 FTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIVNMDVLI-------TDEAKIA 433

Query: 476 LNKEYKKEPTTIRPWLAPINGNFDVANMMQQLQYDSVDFYNDKFDIKPMKISEFYQFLTG 535
+ E E G+ D N L S N K ++ Y L
Sbjct: 434 MASE---EDA----------GDSDNRNGQALLDLQS----NSKTVGGAKSFNDAYASLVS 476

Query: 536 KINTDAEKSGRILDTKKSMLETIKKEQLSISQVSVDEEMVNLIKFQSGYAANAKVITAID 595
I T+ +++ + +Q SIS V++DEE NL +FQ Y ANA+V+ +
Sbjct: 477 DIGNKTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTAN 536

Query: 596 RMIDTLLGIK 605
+ D L+ I+
Sbjct: 537 AIFDALINIR 546


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
C695_05810  GPOSANCHOR  33     0.002    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 32.7 bits (74), Expect = 0.002
Identities = 25/145 (17%), Positives = 49/145 (33%)

Query: 27 GATKKELKQLQINSKNFSNILTKIHSQVEANTQAQEGLRSVYEGQANKIKDLNNAILSQE 86
K + ++ +A EG + + KIK L + E
Sbjct: 200 EGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALE 259

Query: 87 ESLRALKASQEVQANTLKQQSQTLEDLRNEIHANQQAIQQLDKQNKEMSELLTKLSQDLV 146
L+ + E N S ++ L E A + L+ Q++ ++ L +DL
Sbjct: 260 ARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLD 319

Query: 147 SQIALIQKALKEQEEKAEKPLKSNA 171
+ ++ E ++ E+ S A
Sbjct: 320 ASREAKKQLEAEHQKLEEQNKISEA 344


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
C695_05815  OMPADOMAIN  147    6e-46    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 147 bits (373), Expect = 6e-46
Identities = 48/169 (28%), Positives = 75/169 (44%), Gaps = 24/169 (14%)

Query: 22 KMDNKTVAGDVSAKTVQTAPV-TTEPAPEKEEPKQEPAPVVEEKPAVESGTIIASIYFDF 80
+ DN ++ VS + Q PAP PAP V+ K T+ + + F+F
Sbjct: 177 RPDNGMLSLGVSYRFGQGEAAPVVAPAPA-------PAPEVQTK----HFTLKSDVLFNF 225

Query: 81 DKYEIKESDQETLDEIVQKAKE---NHMQVLLEGNTDEFGSSEYNQALGVKRTLSVKNAL 137
+K +K Q LD++ + V++ G TD GS YNQ L +R SV + L
Sbjct: 226 NKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQSVVDYL 285

Query: 138 VIKGVEKDMIKTISFGETKPKC-----AQKTR----ECYKENRRVDVKL 177
+ KG+ D I GE+ P K R +C +RRV++++
Sbjct: 286 ISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334


13  C695_05870  C695_05895  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
C695_05870  0       18       -3.682516  F0F1 ATP synthase subunit B'
C695_05875  1       18       -3.288553  plasmid replication-partition related protein
C695_05880  2       18       -3.741531  SpoOJ regulator
C695_05885  2       20       -4.955792  biotin--protein ligase
C695_05890  2       21       -5.558658  methionyl-tRNA formyltransferase
C695_05895  2       22       -5.840890  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
C695_05880  PF07675  31     0.004    Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 31.2 bits (70), Expect = 0.004
Identities = 30/105 (28%), Positives = 40/105 (38%), Gaps = 7/105 (6%)

Query: 69 QISQVILKTQMPFLDLVPSNLGLAGFEKTFYDSQDENKRGELMLKNALESVV---GLYDY 125
VI T F SNL A FE + D + ++ VV G+YDY
Sbjct: 414 TFGSVIPATGPLFTGTASSNLYSANFEYLTPANADPVVTTQNIIVTGQGEVVIPGGVYDY 473

Query: 126 IIIDSPPALGPLTINSLSAAHSVIIPIQCEFFALEGTKLLLNTIR 170
I + PA G + I A P + + FA E K T+R
Sbjct: 474 CITNPEPASGKMWI----AGDGGNQPARYDDFAFEAGKKYTFTMR 514


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C695_05890  FERRIBNDNGPP  30     0.008    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 30.3 bits (68), Expect = 0.008
Identities = 12/33 (36%), Positives = 21/33 (63%)

Query: 70 EPEVQILKDLKPNFIVVVAYGKILPKEVLTIAP 102
EP +++L ++KP+F+V A P+ + IAP
Sbjct: 86 EPNLELLTEMKPSFMVWSAGYGPSPEMLARIAP 118


14  C695_07165  C695_07220  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
C695_07165  2       17       -2.833283  ribulose-phosphate 3-epimerase
C695_07170  2       19       -3.672494  DNA polymerase III subunit epsilon
C695_07175  4       17       -6.345288  hypothetical protein
C695_07180  6       13       -3.980801  hypothetical protein
C695_07185  4       11       -2.194419  hypothetical protein
C695_07190  5       12       -2.354036  hypothetical protein
C695_07195  1       11       -1.719153  hypothetical protein
C695_07200  0       13       -1.989876  fibronectin/fibrinogen-binding protein
C695_07205  -2      11       -1.087274  DNA repair protein (recN)
C695_07210  1       13       -0.003414  inorganic polyphosphate/ATP-NAD kinase
C695_07215  0       11       -0.284811  hypothetical protein
C695_07220  2       12       2.503856   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C695_07200  FbpA_PF05833  112    5e-29    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 112 bits (282), Expect = 5e-29
Identities = 74/361 (20%), Positives = 143/361 (39%), Gaps = 31/361 (8%)

Query: 97 AKDLAYKSENFILRLEMIPKKANLMILDKEKCVIEA--FRFNDRVAKNDILGALPPN-IY 153
+ ++ ++ +N + L + K + + I++ F FN N +G N +
Sbjct: 209 SSEICFRLKNNSIDLSLSNLKEIVEVCKDLFKEIQSNKFEFNCYTKNNSFVGFYCLNLMS 268

Query: 154 EHQEEDLDFKGLLDILEKDFLFYQHKE----LEHKKNQIIKRLNAQKERLKEKLEKLEDP 209
+ + + + +LE FY K+ L+ K + + K + R +K + L +
Sbjct: 269 KEDYKKIQYDSSSKLLEN---FYYAKDKSDRLKSKSSDLQKIVMNNINRCTKKDKILNNT 325

Query: 210 KNLQLEAKELQTQASLLLTYQHLIHKHESRVVLKDFED---KERAIEIDKSMPLNAFINK 266
+ + LL + + K S + L ++ I +D++ + +
Sbjct: 326 LKKCEDKDIFKLYGELLTANIYALKKGLSHIELANYYSENYDTVKITLDENKTPSQNVQS 385

Query: 267 KFTLSKKKKQKSQFLYLEEENLKEKIAFKENQINYVKGAQEESVLE------------MF 314
+ K K+ + + +E++ + + + + A +E F
Sbjct: 386 YYKKYNKLKKSEEAANEQLLQNEEELNYLYSVLTNINNADNYDEIEEIKKELIETGYIKF 445

Query: 315 MPSKNSKIKRPMSGYEVLYYKDFKIGLGKNQKENIKL-LQDARANDLWMHVRNIPGSHLI 373
SK + + I +GKN +N L L+ A +D+W H +NIPGSH+I
Sbjct: 446 KKIYKSKKSKTSKPMHFISKDGIDIYVGKNNIQNDYLTLKFANKHDIWFHTKNIPGSHVI 505

Query: 374 VFCQKNAPKDDVIMELAKMLIKMQKDAFNS-YEIDYTQRKFVKIIKGAN---VIYSKYRT 429
V + P + ++E A + K +S +DYT+ K VK GA VIYS +T
Sbjct: 506 VKNIMDIP-ESTLLEAANLAAYYSKSQNSSNVPVDYTEVKNVKKPNGAKPGMVIYSTNQT 564

Query: 430 I 430
I
Sbjct: 565 I 565



Score = 35.2 bits (81), Expect = 5e-04
Identities = 20/92 (21%), Positives = 48/92 (52%), Gaps = 5/92 (5%)

Query: 46 SAPYIGLSKKPPESVLKNTLALDFCLNKFTRNAKILQANIIDNDRI--LEINGAKDLAYK 103
+ P I L+ + +K + L K+ NAKI+ + I+ DRI ++ +L +
Sbjct: 55 NYPRIHLTDLTKPNPIKAPMFCMV-LRKYISNAKIVDIHQINQDRIVVIDFESTDELGFN 113

Query: 104 SENFILRLEMIPKKANLMILDK-EKCVIEAFR 134
S + L +E++ + +N+ ++ K + ++++ +
Sbjct: 114 SI-YSLIIEIMGRHSNMTLIRKRDNIIMDSIK 144
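
The "Score = ..., Expect = ..." and "Identities = ..." lines that open each alignment in this report follow the usual BLAST text layout, so they can be pulled out with two regular expressions. The following is only a minimal sketch, not part of PredictBias: the function names `parse_expect` and `parse_hsp` are hypothetical, and it assumes the statistics lines look exactly like the ones shown in this report (including the abbreviated form "e-130" for 1e-130).

```python
import re

# Patterns for the two statistics lines that precede each alignment.
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)
IDENT_RE = re.compile(r"Identities\s*=\s*(\d+)/(\d+)")


def parse_expect(text):
    """Convert an E-value string to float.

    The report abbreviates values such as 1e-130 as "e-130",
    which float() would reject, so a leading "1" is restored.
    """
    return float("1" + text if text.startswith("e") else text)


def parse_hsp(score_line, ident_line):
    """Return bit score, raw score, E-value and percent identity."""
    s = SCORE_RE.search(score_line)
    i = IDENT_RE.search(ident_line)
    if s is None or i is None:
        raise ValueError("lines do not look like BLAST statistics lines")
    matches, length = int(i.group(1)), int(i.group(2))
    return {
        "bits": float(s.group(1)),
        "raw_score": int(s.group(2)),
        "expect": parse_expect(s.group(3)),
        "identity_pct": 100.0 * matches / length,
    }


if __name__ == "__main__":
    hsp = parse_hsp(
        "Score = 112 bits (282), Expect = 5e-29",
        "Identities = 74/361 (20%), Positives = 143/361 (39%), Gaps = 31/361 (8%)",
    )
    print(hsp)
```

The percentages printed in the report appear to be truncated rather than rounded (74/361 is about 20.5% and is shown as 20%), so the sketch returns the unrounded value and leaves the choice to the caller.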


15  C695_07610  C695_07670  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias    %GCBias  Product
C695_07610       2       10   0.763301  hypothetical protein
C695_07615       1        8   0.363949  hypothetical protein
C695_07620       1        9   0.399412  branched-chain amino acid aminotransferase
C695_07625       1       11  -0.319719  hypothetical protein
C695_07630       1       12  -0.502616  DNA polymerase I
C695_07635      -1       15   0.146791  type IIS restriction enzyme R protein (BCGIB)
C695_07640       0       15   0.487954  type IIS restriction enzyme M protein (mod)
C695_07645       4       17   0.662652  hypothetical protein
C695_07650       3       13   0.423273  thymidylate kinase
C695_07655       3       12   0.161737  phosphopantetheine adenylyltransferase
C695_07660       2       12   0.404774  3-octaprenyl-4-hydroxybenzoate carboxy-lyase
C695_07665       3       12   0.153491  hypothetical protein
C695_07670       2       12   0.153674  flagellar basal body P-ring biosynthesis protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C695_07650  BLACTAMASEA  27     0.030    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 27.4 bits (61), Expect = 0.030
Identities = 11/45 (24%), Positives = 21/45 (46%), Gaps = 7/45 (15%)

Query: 22 DRFKNALFTKEPGGTR-------MGESLRRIALNENISELARAFL 59
DR++ L PG R M +LR++ ++ +S ++ L
Sbjct: 159 DRWETELNEALPGDARDTTTPASMAATLRKLLTSQRLSARSQRQL 203


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C695_07655  LPSBIOSNTHSS  224    1e-78    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 224 bits (573), Expect = 1e-78
Identities = 64/147 (43%), Positives = 93/147 (63%)

Query: 4 IGIYPGTFDPVTNGHIDIIHRSSELFEKLIVAVAHSSAKNPMFSLDERLKMIQLATKSFK 63
IYPG+FDP+T GH+DII R LF+++ VAV + K PMFS+ ERL+ I A
Sbjct: 2 NAIYPGSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAHLP 61

Query: 64 NVECVAFEGLLANLAKEYHCKVLVRGLRVVSDFEYELQMGYANKSLNHELETLYFMPTLQ 123
N + +FEGL N A++ ++RGLRV+SDFE ELQM NK+L +LET++ + +
Sbjct: 62 NAQVDSFEGLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDLETVFLTTSTE 121

Query: 124 NAFISSSIVRSIIAHKGDASHLVPKEI 150
+F+SSS+V+ + G+ H VP +
Sbjct: 122 YSFLSSSLVKEVARFGGNVEHFVPSHV 148


16  C695_07825  C695_07975  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias    %GCBias  Product
C695_07825       2       13   2.935912  saccharopine dehydrogenase
C695_07830       2       14   2.959479  ferrodoxin-like protein
C695_07835       1       12   2.042535  glycerol-3-phosphate acyltransferase PlsY
C695_07840      -1       12   1.842866  hypothetical protein
C695_07845      -1       11   0.700724  hypothetical protein
C695_07850      -1       11  -0.831382  iron-regulated outer membrane protein
C695_07855       0       15  -3.595285  hypothetical protein
C695_07860       0       11  -3.536496  selenocysteine synthase
C695_07865       0       10  -3.413086  transcription elongation factor NusA
C695_07870       1       10  -4.608055  hypothetical protein
C695_07875       1       10  -3.638419  type IIS restriction enzyme R and M protein
C695_07880       1       13  -3.949535  hypothetical protein
C695_07895       2       12  -3.469057  type III restriction enzyme R protein (res)
C695_07900       2       13  -3.274944  type III R-M system modification enzyme
C695_07905       1       12  -2.936451  type III DNA modification enzyme
C695_07910       0       13  -1.781367  ATP-dependent DNA helicase RecG
C695_07915       0       16  -1.106417  hypothetical protein
C695_07920      -1       14  -0.858199  hypothetical protein
C695_07925      -1       12  -1.121449  exodeoxyribonuclease III
C695_07930       1       12  -0.177337  *hypothetical protein
C695_07935       3       16  -0.040872  hypothetical protein
C695_07940       2       14   0.300457  chromosomal replication initiation protein
C695_07945       1       16  -0.670828  purine nucleoside phosphorylase (punB)
C695_07950       1       15  -1.082354  hypothetical protein
C695_07955       1       15  -1.112842  glucosamine--fructose-6-phosphate
C695_07960       2       16  -3.430598  FAD-dependent thymidylate synthase
C695_07965       1       15  -2.195738  hypothetical protein
C695_07970       0       13  -0.848856  IS605 transposase (tnpB)
C695_07975       2       11   0.715729  IS605 transposase (tnpA)
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
C695_07840  PF08280  26     0.032    M protein trans-acting positive regulator
		>PF08280#M protein trans-acting positive regulator

Length = 530

Score = 26.3 bits (58), Expect = 0.032
Identities = 13/38 (34%), Positives = 21/38 (55%)

Query: 76 LSHALKTRYKEITELYLKISKLEISPNSQVGASVKIRY 113
LS++ R +E L+ +L++S N VG +IRY
Sbjct: 147 LSNSSAYRMREALIPLLRNFELKLSKNKIVGEEYRIRY 184


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
C695_07940  HTHFIS  35     5e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 35.2 bits (81), Expect = 5e-04
Identities = 9/51 (17%), Positives = 24/51 (47%), Gaps = 4/51 (7%)

Query: 127 TVYEIAKKVAQSDTPPYNPVLFYGGTGLGKTHILNAIGNHALEKHKKVVLV 177
+Y + ++ Q+D ++ G +G GK + A+ ++ ++ V +
Sbjct: 148 EIYRVLARLMQTDLT----LMITGESGTGKELVARALHDYGKRRNGPFVAI 194


17  C695_00180  C695_00205  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias    %GCBias  Product
C695_00180      -2       13   0.402990  hypothetical protein
C695_00185      -2       13   0.356021  conjugal plasmid transfer system protein
C695_00190      -2       14   1.147917  ComB10 competence protein
C695_00195      -1       12   1.343706  mannose-6-phosphate isomerase
C695_00200      -2       12   1.620733  GDP-D-mannose dehydratase
C695_00205      -1       13   1.490476  nodulation protein (nolK)
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
C695_00180  PF04335  132    2e-40    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 132 bits (334), Expect = 2e-40
Identities = 37/202 (18%), Positives = 73/202 (36%), Gaps = 4/202 (1%)

Query: 40 QSVFRLERNRLKIAYKLLGLMSFIALVLAIVLISVLPLQKTEHHF--VDFLNQDKHYAII 97
+ K+A+ + G+ +A + + ++ PL+ E + VD + A
Sbjct: 22 RDKLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNTGEASIAAK 81

Query: 98 QRADKSISSNEALARSLIGAYVLNRESINRIDDKSRYELVRLQSSSKVWQRFEDLIKTQN 157
D +I+ +EA+ + + YV RE + ++ V + S+ R+ KT N
Sbjct: 82 LHGDATITYDEAVRKYFLATYVRYREGWIAAAREEYFDAVMVMSARPEQDRWSRFYKTDN 141

Query: 158 SIYVQSHLEREVHI-VNIAIYQQDNNPIASVSIAAKLLNENKLVYEKRYKIVLSYLFDTP 216
Q+ L + V I +A V + + + + + Y D
Sbjct: 142 PQSPQNILANRTDVFVEIKRVSFLGGNVAQVYFTKESVTGSNST-KTDAVATIKYKVDGT 200

Query: 217 DFDYASMPKNPTGFKITRYSIT 238
KNP G+++ Y
Sbjct: 201 PSKEVDRFKNPLGYQVESYRAD 222


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C695_00185  TYPE4SSCAGX  30     0.017    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 29.8 bits (66), Expect = 0.017
Identities = 27/110 (24%), Positives = 51/110 (46%), Gaps = 18/110 (16%)

Query: 155 FIEDKNYYSNAFLKPQKENMAENAPKDAPTNNKPLKEEKEETKEKEEETITIGDNTNAMK 214
I+ +N + A++ N A + N + ++EEK++ + + + NA+K
Sbjct: 339 LIKQENLNTTAYI-----NRVMMASNEQIINKEKIREEKQKIILDQAKALETQYVHNALK 393

Query: 215 IVKKDIQKGYKALKSSQRKWYCLGICSKKSKLSLMPKEIFNDKQFTYFKF 264
+ + + Y ++ + K+SK +MP EIF+D FTYF F
Sbjct: 394 --RNPVPRNYNYYQAPE----------KRSK-HIMPSEIFDDGTFTYFGF 430


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C695_00195  FLGMRINGFLIF  31     0.015    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 30.7 bits (69), Expect = 0.015
Identities = 17/88 (19%), Positives = 32/88 (36%), Gaps = 3/88 (3%)

Query: 272 ALFEEAANEPKENVSLNQTPVFAKESENNLVFSHKVSAL---LGVENLAVIDTKDALLIA 328
+LF P +V++ P A + H VS+ L N+ ++D LL
Sbjct: 162 SLFVREQKSPSASVTVTLEPGRALDEGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQ 221

Query: 329 HKDKAKDLKALVNEVETNNQELLQTHTK 356
+DL + + + +Q +
Sbjct: 222 SNTSGRDLNDAQLKFANDVESRIQRRIE 249


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C695_00200  NUCEPIMERASE  88     2e-21    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 87.5 bits (217), Expect = 2e-21
Identities = 46/180 (25%), Positives = 72/180 (40%), Gaps = 19/180 (10%)

Query: 7 LITGVTGQDGSYLAEYLLNLGYEVHGLKRRSSSINTSRIDHLYEDLHSDHKRRFFLHYGD 66
L+TG G G ++++ LL G++V G+ + + S E L F H D
Sbjct: 4 LVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQP---GFQFHKID 60

Query: 67 MTDSSNLIHLIATTKPTEIYNLAAQSHVKVSFETPEYTANADGIGTLRILEAMRILGLEK 126
+ D + L A+ ++ + V+ S E P A+++ G L ILE R ++
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQ- 119

Query: 127 KTRFYQASTSELYGEVLETPQNENTPF-------NPRSPYAVAKMYAFYITKNYREAYNL 179
AS+S +YG N PF +P S YA K + Y Y L
Sbjct: 120 --HLLYASSSSVYGL------NRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGL 171


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C695_00205  NUCEPIMERASE  53     4e-10    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 52.9 bits (127), Expect = 4e-10
Identities = 51/346 (14%), Positives = 106/346 (30%), Gaps = 54/346 (15%)

Query: 5 ILITGAYGMVGQNTALYFKKNKPDV-----------TLLTPKKSELC-----------LL 42
L+TGA G +G + + + V L + EL L
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDLA 62

Query: 43 DKDNVQAYLKEYKPTGIIHCAGRVGGIVANMNDLSTYMVENLLMGLYLFSSALDSGVKKA 102
D++ + + R + ++ + Y NL L + + ++
Sbjct: 63 DREGMTDLFASGHFERVFISPHR-LAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHL 121

Query: 103 INLASSCAYPKFAPNPLKESDLLNGSLEPTNEGYALAKLSVMKYCEYVSAEKGVFYKTLV 162
+ +SS Y P D ++ + YA K + S G+ L
Sbjct: 122 LYASSSSVYGLNRKMPFSTDDSVDHPVSL----YAATKKANELMAHTYSHLYGLPATGLR 177

Query: 163 PCNLYGEFDKFEEKIAHMIPGLIARMHTAKLKNEKEFAMWGDGTARREYLNAKDLARFIS 222
+YG + + P + T + K ++ G +R++ D+A I
Sbjct: 178 FFTVYGPWGR---------PDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAII 228

Query: 223 LAYENIASIPS-----------------VMNVGSGVDYSIEEYYEKVAQVLDYKGVFVKD 265
+ I + V N+G+ + +Y + + L +
Sbjct: 229 RLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNML 288

Query: 266 LSKPVGMQQKLMDISK-QRALKWELEIPLEQGIKEAYEYYLKLLEV 310
+P + + D + + E ++ G+K +Y +V
Sbjct: 289 PLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDFYKV 334


18  C695_00565  C695_00595  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias    %GCBias  Product
C695_00565       0       17  -1.409970  flagellin B
C695_00570       0       13  -1.701490  DNA topoisomerase I
C695_00575       1       17  -2.222551  hypothetical protein
C695_00580       1       15  -1.496508  hypothetical protein
C695_00585       0       12  -0.740670  hypothetical protein
C695_00590      -1       10   0.628158  hypothetical protein
C695_00595       0       13   1.873819  phosphoenolpyruvate synthase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
C695_00565  FLAGELLIN  284    3e-92    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 284 bits (728), Expect = 3e-92
Identities = 130/519 (25%), Positives = 221/519 (42%), Gaps = 18/519 (3%)

Query: 2 SFRINTNIAALTSHAVGVQNNRDLSSSLEKLSSGLRINKAADDSSGMAIADSLRSQSANL 61
+ INTN +L + ++ LSS++E+LSSGLRIN A DD++G AIA+ S L
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 GQAIRNANDAIGMVQTADKAMDEQIKILDTIKTKAVQAAQDGQTLESRRALQSDIQRLLE 121
QA RNAND I + QT + A++E L ++ +VQA + +++Q +IQ+ LE
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 ELDNIANTTSFNGQQMLSGSFSNKEFQIGAYSNTTVKASIGSTSSDKIGHVRMETSSFSG 181
E+D ++N T FNG ++LS + Q+GA T+ + +G +
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDN-QMKIQVGANDGETITIDLQKIDVKSLGLDGFNVN---- 175

Query: 182 AGMLASAAAQNLTEVGLNFKQVNGVNDYKIETVRISTSAGTGIGALSEIINRFSNTLGVR 241
+ ++ +FK V G + Y + + +G + + V
Sbjct: 176 -----GPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVN 230

Query: 242 ASYNVMATG----GTPVQSGTVRELTINGVEIGTVNDVHKNDADGRLTNAINSVKDRTGV 297
A+ + T T V + T E + K +G T V
Sbjct: 231 AANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGD-TFDYKGVTFTIDT 289

Query: 298 EASLDIQGRINLHSIDGRAISVHAASASGQVFGGGNFAGISGTQHAVIGRLTLTRTDARD 357
+ D G+++ +I+G +++ A + S D +
Sbjct: 290 KTGNDGNGKVST-TINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKT 348

Query: 358 IIVSGVNFSHVGFHSAQGVAEYTVNLRAVRGIFDANVASAAGANANGAQAETNSQGIGAG 417
S ++ +G ++ TVN + + AG + + +
Sbjct: 349 KNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINE 408

Query: 418 --VTSLKGAMIVMDMADSARTQLDKIRSDMGSVQMELVTTINNISVTQVNVKAAESQIRD 475
+ K + DSA +++D +RS +G++Q + I N+ T N+ +A S+I D
Sbjct: 409 DAAAAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIED 468

Query: 476 VDFAEESANFSKYNILAQSGSFAMAQANAVQQNVLRLLQ 514
D+A E +N SK IL Q+G+ +AQAN V QNVL LL+
Sbjct: 469 ADYATEVSNMSKAQILQQAGTSVLAQANQVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C695_00585  IGASERPTASE  55     3e-10    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 55.5 bits (133), Expect = 3e-10
Identities = 29/184 (15%), Positives = 59/184 (32%), Gaps = 10/184 (5%)

Query: 96 DDQSKKEVAETQKEAENARDRANKSGIELEQEQQKTEQEKQKTEQEKQKTEQEKQKTSNI 155
+ EVA++ E + + K + ++EK K E E KT++ + TS +
Sbjct: 1078 ANTQTNEVAQSGSETKETQTTETK------ETATVEKEEKAKVETE--KTQEVPKVTSQV 1129

Query: 156 E-TNNQIKVEQKQQKTEQEKQKTEQEKQKTEQE-KQKTEQEKQKTSNIETNNQIKVEQKQ 213
Q + Q Q + +E T K+ Q ++ K ++ +
Sbjct: 1130 SPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTV 1189

Query: 214 QKTEQEKQKTEQEKQKTEQEKQKTEQEKQKTSNIETNNQIKVEQEQQKTEQEKQKTNNTQ 273
+ E T Q +E + + + + + T ++
Sbjct: 1190 NTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVAL 1249

Query: 274 KDLV 277
DL
Sbjct: 1250 CDLT 1253



Score = 54.7 bits (131), Expect = 6e-10
Identities = 38/231 (16%), Positives = 79/231 (34%), Gaps = 14/231 (6%)

Query: 102 EVAETQKEAENARDRANKSGIELEQEQQKTEQEKQKTEQEKQKTEQEKQKTSNIETNNQI 161
T+ AEN++ + + + T Q ++ ++ K + Q T+ + +
Sbjct: 1033 PSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQ-TNEVAQSGSE 1091

Query: 162 KVEQKQQKTEQEKQKTEQEKQKTEQEKQKTEQEKQKTSNIE-TNNQIKVEQKQQKTEQEK 220
E + +T++ ++EK K E E KT++ + TS + Q + Q Q + +E
Sbjct: 1092 TKETQTTETKETATVEKEEKAKVETE--KTQEVPKVTSQVSPKQEQSETVQPQAEPAREN 1149

Query: 221 QKTEQEKQKTEQE-KQKTEQEKQKTSNIETNNQIKVEQEQQKTEQEKQKTNNTQKDLVKY 279
T K+ Q ++ K ++ + + NT +
Sbjct: 1150 DPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQ- 1208

Query: 280 AEQNCQENHNQFFIKKLGIKGGIAIEVEAECKTPKPAKTNQTPIQPKHLPN 330
E+ N+ K V + +PA T+ L +
Sbjct: 1209 -PTVNSESSNK-------PKNRHRRSVRSVPHNVEPATTSSNDRSTVALCD 1251



Score = 52.4 bits (125), Expect = 3e-09
Identities = 27/188 (14%), Positives = 66/188 (35%), Gaps = 4/188 (2%)

Query: 98 QSKKEVAETQKEAENARDRANKSGIELEQEQQKTEQEKQKTEQEKQKTEQEKQKTSNIET 157
Q+++ E + + + E ++ +T + K+ EK++ + + + +
Sbjct: 1064 QNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVP 1123

Query: 158 NNQIKVEQKQQKTEQEKQKTEQEKQKTEQEKQKTEQEKQKTSNIETNNQIKVEQKQQKTE 217
+V KQ+++E + + E ++ K Q + T+ + ++
Sbjct: 1124 KVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPV 1183

Query: 218 QEKQKTEQEKQKTEQEKQKTEQEKQKTSNIETNNQIKVEQEQ----QKTEQEKQKTNNTQ 273
E E + T Q T N E++N+ K + E T++
Sbjct: 1184 TESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSND 1243

Query: 274 KDLVKYAE 281
+ V +
Sbjct: 1244 RSTVALCD 1251



Score = 50.1 bits (119), Expect = 2e-08
Identities = 44/232 (18%), Positives = 79/232 (34%), Gaps = 12/232 (5%)

Query: 109 EAENARDRANKSGIE-LEQEQQKTEQEKQKTEQEKQKTEQEKQKTSNI---ETNNQIKVE 164
E E + + I Q E+ + E + ET +
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAEN 1043

Query: 165 QKQQKTEQEKQKTEQEKQKTEQEKQKTEQEKQKTSNIETNNQI--KVEQKQQKTEQEKQK 222
KQ+ EK E+ TE Q E K+ SN++ N Q + + E + +
Sbjct: 1044 SKQESKTVEK----NEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTE 1099

Query: 223 TEQEKQKTEQEKQKTEQEKQKTSNIETNNQIKVEQEQQKTEQEKQKTNNTQKDLVKYAEQ 282
T++ ++EK K E EK + T +Q+ +QEQ +T Q Q + D ++
Sbjct: 1100 TKETATVEKEEKAKVETEKTQEVPKVT-SQVSPKQEQSETVQ-PQAEPARENDPTVNIKE 1157

Query: 283 NCQENHNQFFIKKLGIKGGIAIEVEAECKTPKPAKTNQTPIQPKHLPNSKQP 334
+ + ++ + +E T + P + QP
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQP 1209


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C695_00590  IGASERPTASE  29     0.045    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 28.9 bits (64), Expect = 0.045
Identities = 27/142 (19%), Positives = 45/142 (31%), Gaps = 29/142 (20%)

Query: 163 ELANSQIKAEQERQKTEQEKQ-----------------------KANKSAIELEQQKQKT 199
++ AE +Q+++ ++ KAN E+ Q +T
Sbjct: 1033 PSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSET 1092

Query: 200 INTQRDLIKEQKDFIKETEQNCQENHNQFFIKKLGIKGGIAIEVEAECKTPKPAKTNQTP 259
TQ KE KE + + Q K + E +PA+ N
Sbjct: 1093 KETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPT 1152

Query: 260 IQPKHLPNSKQPHSQRGSKAQE 281
+ N K+P SQ + A
Sbjct: 1153 V------NIKEPQSQTNTTADT 1168


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C695_00595  PHPHTRNFRASE  294    4e-92    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 294 bits (755), Expect = 4e-92
Identities = 106/446 (23%), Positives = 187/446 (41%), Gaps = 68/446 (15%)

Query: 388 DLEHMNSFKEGEILVTDN-TDPDWEPCMKK-ASAVITNRGGRTCHAAIVAREIGVPAIVG 445
+ + + E +++ ++ T D K+ T+ GGRT H+AI++R + +PA+VG
Sbjct: 146 ETGSLATIAEETVIIAEDLTPSDTAQLNKQFVKGFATDIGGRTSHSAIMSRSLEIPAVVG 205

Query: 446 VSGATDSLYTGMEITVSCAEGE---------EGYVYAGIYEHEIERVELSNMQETQT--- 493
T+ + G + V EG E ++ E + + +
Sbjct: 206 TKEVTEKIQHGDMVIVDGIEGIVIVNPTEEEVKAYEEKRAAFEKQKQEWAKLVGEPSTTK 265

Query: 494 -----KIYINIGNPEKAFGFSQLPNHGVGLARMEMIILNQIKAHPLALVDLHHKKSVKEK 548
++ NIG P+ G G+GL R E + +++ + P
Sbjct: 266 DGAHVELAANIGTPKDVDGVLANGGEGIGLYRTEFLYMDRDQ-LPTE------------- 311

Query: 549 NEIENLMAGYANPKDFFVKKIAEGIGMISAAFYPKPVIVRTSDFKSNEYMRMLGGSSYEP 608
E Y K++ + KPV++RT D ++ + L P
Sbjct: 312 ---EEQFEAY--------KEVVQ-------RMDGKPVVIRTLDIGGDKELSYL----QLP 349

Query: 609 NEENPMLGYRGASRYYSESYNEAFSWECEALALVREEMGLTNMKVMIPFLRTIEEGKKVL 668
E NP LG+R + F + AL N+KVM P + T+EE ++
Sbjct: 350 KELNPFLGFRAIRLCLE--KQDIFRTQLRALL---RASTYGNLKVMFPMIATLEELRQAK 404

Query: 669 EILRKNNLESGKNG------LEIYIMCELPVNVILADDFLSLFDGFSIGSNDLTQLTLGV 722
I+++ + G +E+ IM E+P + A+ F D FSIG+NDL Q T+
Sbjct: 405 AIMQEEKDKLLSEGVDVSDSIEVGIMVEIPSTAVAANLFAKEVDFFSIGTNDLIQYTMAA 464

Query: 723 DRDSELVSHVFDERNEAMLKMFKKAIEACKRHNKYCGICGQAPSDYPEVTEFLVKEGITS 782
DR +E VS+++ + A+L++ I+A K+ G+CG+ D L+ G+
Sbjct: 465 DRMNERVSYLYQPYHPAILRLVDMVIKAAHSEGKWVGMCGEMAGD-EVAIPLLLGLGLDE 523

Query: 783 ISLNPDSVIPTWNAVAKLE-KELKEH 807
S++ S++P + + KL +ELK
Sbjct: 524 FSMSATSILPARSQLLKLSKEELKPF 549


19  C695_01225  C695_01260  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias    %GCBias  Product
C695_01225      -2       13   1.102884  neutrophil activating protein (napA)
C695_01230      -3       12   0.893665  signal-transducing protein, histidine kinase
C695_01235      -3       11   1.594241  hypothetical protein
C695_01240      -3       11   2.036989  flagellar basal body P-ring protein
C695_01245      -2       11   1.769399  DEAD/DEAH box helicase
C695_01250      -2       10   1.668750  hypothetical protein
C695_01255      -2        9   1.194173  hypothetical protein
C695_01260      -2        9   2.111216  oligopeptide ABC transporter ATP-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C695_01225  HELNAPAPROT  151    5e-50    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 151 bits (382), Expect = 5e-50
Identities = 39/140 (27%), Positives = 75/140 (53%), Gaps = 1/140 (0%)

Query: 5 EILKHLQADAIVLFMKVHNFHWNVKGTDFFNVHKATEEIYEEFADMFDDLAERIVQLGHH 64
L ++ +L+ K+H FHW VKG FF +H+ EE+Y+ A+ D +AER++ +G
Sbjct: 15 NSLNTQLSNWFLLYSKLHRFHWYVKGPHFFTLHEKFEELYDHAAETVDTIAERLLAIGGQ 74

Query: 65 PLVTLSEAIKLTRVKEETKTSFHSKDIFKEILEDYKYLEKEFKELSNTAEKEGDKVTVTY 124
P+ T+ E + + + + + ++ + ++ DYK + E K + AE+ D T
Sbjct: 75 PVATVKEYTEHASITDGGNET-SASEMVQALVNDYKQISSESKFVIGLAEENQDNATADL 133

Query: 125 ADDQLAKLQKSIWMLQAHLA 144
+ +++K +WML ++L
Sbjct: 134 FVGLIEEVEKQVWMLSSYLG 153


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
C695_01230  PF06580  30     0.014    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 30.2 bits (68), Expect = 0.014
Identities = 10/71 (14%), Positives = 25/71 (35%), Gaps = 13/71 (18%)

Query: 281 IVLQNFLYNAIDAIEALEESEQ-GQVKIEAFIQNEFIVFTIIDNGKEVENKSALFEPFET 339
+++Q + N I + + Q G++ ++ N + + + G +
Sbjct: 258 MLVQTLVENGI--KHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTK------- 308

Query: 340 TKLKGNGLGLA 350
+ G GL
Sbjct: 309 ---ESTGTGLQ 316


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C695_01240  FLGPRINGFLGI  371    e-130    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 371 bits (955), Expect = e-130
Identities = 118/345 (34%), Positives = 191/345 (55%), Gaps = 26/345 (7%)

Query: 19 AEKIGDIASVVGVRDNQLIGYGLVIGLNGTGDK-SGSKFTMQSISNMLESVNVKISADDI 77
+I DIAS+ RDNQLIGYGLV+GL GTGD S FT QS+ ML+++ +
Sbjct: 28 TSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMRAMLQNLGITTQGGQS 87

Query: 78 KSKNVAAVMITASLPPFARQGDKIDIHISSIGDAKSIQGGTLVMTPLNAVDGNIYALAQG 137
+KN+AAVM+TA+LPPFA G ++D+ +SS+GDA S++GG L+MT L+ DG IYA+AQG
Sbjct: 88 NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLIMTSLSGADGQIYAVAQG 147

Query: 138 AIVSGN-----------SNNLLSANIINGATIEREVSYDLFHKNAMTLSLKNPNFKNAIQ 186
A++ SA + NGA IERE+ + L L+NP+F A++
Sbjct: 148 ALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDSVNLVLQLRNPDFSTAVR 207

Query: 187 VQNTLNKV----FGNKVAIALDPKTIQITRPERFSMVEFLALVQEIPINYSAKNKIIVDE 242
V + +N +G+ +A D + I + +P + +A ++ + + K++++E
Sbjct: 208 VADVVNAFARARYGDPIAEPRDSQEIAVQKPRVADLTRLMAEIENLTVETDTPAKVVINE 267

Query: 243 KSGTIVSGVDIMVHPVVVTSQDITLKITKDP--------LNDSKNTQDLDNNMSLDTAHN 294
++GTIV G D+ + V V+ +T+++T+ P Q + M++
Sbjct: 268 RTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQPAPFSRGQTAVQPQTDIMAMQEGSK 327

Query: 295 TLSSNGKNITIAGVVKALQKIGVSAKGMVSILQALKKSGAISAEM 339
G ++ +V L IG+ A G+++ILQ +K +GA+ AE+
Sbjct: 328 VAIVEGPDLR--TLVAGLNSIGLKADGIIAILQGIKSAGALQAEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
C695_01260  HTHFIS  30     0.019    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 30.2 bits (68), Expect = 0.019
Identities = 16/50 (32%), Positives = 20/50 (40%), Gaps = 7/50 (14%)

Query: 30 VAIVGESGIGKSSIANIIMRLNPR----FKPHNGEVLFETTNLLKESEEF 75
+ I GESG GK +A + R F N + L ESE F
Sbjct: 163 LMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRD---LIESELF 209


20  C695_01625  C695_01655  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias    %GCBias  Product
C695_01625       3       11  -1.294851  guanylate kinase
C695_01630       3       11  -2.129434  poly E-rich protein
C695_01635       1       12  -2.109986  nuclease NucT
C695_01640       1       14  -1.890618  hypothetical protein
C695_01645       3       14  -1.418195  flagellar basal body L-ring protein
C695_01650       2       13  -1.245892  CMP-N-acetylneuraminic acid synthetase
C695_01655       2       13  -0.735007  flagellar protein FlaG
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
C695_01625  PF05272  29     0.011    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.011
Identities = 9/18 (50%), Positives = 11/18 (61%)

Query: 8 LILSGPSGAGKSTLTKYL 25
++L G G GKSTL L
Sbjct: 599 VVLEGTGGIGKSTLINTL 616


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C695_01630  IGASERPTASE  60     2e-11    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 60.1 bits (145), Expect = 2e-11
Identities = 51/266 (19%), Positives = 87/266 (32%), Gaps = 19/266 (7%)

Query: 140 ELENLGDLEALVQEEPNNEEQLLPTLNDQEEKEEVKEEE------KEEVKEEEKEEVKEE 193
E+E N Q +E + +E E E V E
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAEN 1043

Query: 194 EKEEVKETPQEEKKPKDDETQEGETLKDKEVSKELEA-PQELEIPKEETQEQDPIKEETQ 252
K+E K + E+ + Q E KE ++A Q E+ + + + +T
Sbjct: 1044 SKQESKTVEKNEQDATETTAQNRE--VAKEAKSNVKANTQTNEVAQ---SGSETKETQTT 1098

Query: 253 ENKEEKQEKTQDSPSAQELEAMQELVKEIQENSNGQENKEKTQESAEIPQDKEIQEVVTE 312
E KE + ++ E E QE+ K + S QE E Q AE ++ + + E
Sbjct: 1099 ETKETATVEKEEKAKV-ETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKE 1157

Query: 313 KTQAQ----ELEVPKEKTQESAEALQETQAHELEKQEIAETPQDVEIPQSQDKEVQELE- 367
+ E P ++T + E + E P++ +Q E
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSN 1217

Query: 368 IPKEETQENTET-PQDVETPQEKETQ 392
PK + + + P +VE
Sbjct: 1218 KPKNRHRRSVRSVPHNVEPATTSSND 1243



Score = 54.3 bits (130), Expect = 1e-09
Identities = 39/252 (15%), Positives = 87/252 (34%), Gaps = 15/252 (5%)

Query: 257 EKQEKTQDSPSAQELEAMQELVKEIQENSNGQENKEKTQESAEIPQDKEIQEVVTEKTQA 316
EK+ +T D+ + +Q V + N+ ++ P +
Sbjct: 986 EKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSK 1045

Query: 317 QELEVPKEKTQESAEAL-QETQAHELEKQEIAETPQDVEIPQSQDKE------------- 362
QE + ++ Q++ E Q + + K + Q E+ QS +
Sbjct: 1046 QESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETAT 1105

Query: 363 VQELEIPKEETQENTETPQ-DVETPQEKETQEDHYESIEDIPEPVMAKAMGEELPFLNEA 421
V++ E K ET++ E P+ + ++E E E E + E N
Sbjct: 1106 VEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTT 1165

Query: 422 VAKIPNNENDTETPKESVTETSKNENNTETPQEKEESDKTSSPLELRLNLQDLLKSLNQE 481
+ + ++ VTE++ + E + ++ + + K+ ++
Sbjct: 1166 ADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRR 1225

Query: 482 SLKSLLENKTLS 493
S++S+ N +
Sbjct: 1226 SVRSVPHNVEPA 1237



Score = 50.1 bits (119), Expect = 2e-08
Identities = 27/165 (16%), Positives = 57/165 (34%), Gaps = 3/165 (1%)

Query: 168 QEEKEEVKEEEKEEVKEEEKEEVKEEEKEEVKETPQEEKKPKDDETQEGETLKDKEVSKE 227
E +E + E +E EKEE + E E+ +E P+ + + Q E ++E
Sbjct: 1089 GSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARE 1148

Query: 228 LEAPQELEIPKEETQEQDPIKEETQENKEEKQEKTQDSPSAQELEAMQELVKEIQENSNG 287
+ ++ P+ +T + Q KE Q + + +V+ + +
Sbjct: 1149 NDPTVNIKEPQSQTNTT---ADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPA 1205

Query: 288 QENKEKTQESAEIPQDKEIQEVVTEKTQAQELEVPKEKTQESAEA 332
ES+ P+++ + V + + A
Sbjct: 1206 TTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALC 1250


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C695_01645  FLGLRINGFLGH  195    9e-65    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 195 bits (496), Expect = 9e-65
Identities = 52/172 (30%), Positives = 84/172 (48%), Gaps = 18/172 (10%)

Query: 56 GERPLFADRRAMKPNDLITIIVSEKASANYSSS----KDYKSASGGNSTPPRLTYNGLDE 111
G +PLF DRR D +TI++ E SA+ SSS +D K+ G ++ P L GL
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYL--QGLFG 118

Query: 112 RKKKEAEYLDDKNNYNFTKSSNNTNFKGGGSQKKSEDLEIVLSARIIKVLENGNYFIYGN 171
+ + E S F G G S L+ + +VL NGN + G
Sbjct: 119 NARADVEA------------SGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGE 166

Query: 172 KEVLVDGEKQILKVSGVIRPYDIERNNTIQSKFLADAKIEYTNLGHLSDSNK 223
K++ ++ + ++ SGV+ P I +NT+ S +ADA+IEY G+++++
Sbjct: 167 KQIAINQGTEFIRFSGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQN 218


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C695_01655  SACTRNSFRASE  27     0.029    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 27.2 bits (60), Expect = 0.029
Identities = 14/48 (29%), Positives = 20/48 (41%), Gaps = 3/48 (6%)

Query: 103 GETILKALEFIAFE---EFQLHSLHLEVMENNFKAIAFYEKNHYELEG 147
+ + AL A E E L LE + N A FY K+H+ +
Sbjct: 103 KKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGA 150


21  C695_01780  C695_01835  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias    %GCBias  Product
C695_01780      -3       10   0.815515  flagellar MS-ring protein
C695_01785      -3       10   1.406830  flagellar motor switch protein G
C695_01790      -2        9   0.528567  flagellar assembly protein H
C695_01795      -2        9   0.884769  1-deoxy-D-xylulose-5-phosphate synthase
C695_01800      -2       11   0.249130  GTP-binding protein LepA
C695_01805      -2       12  -0.846207  hypothetical protein
C695_01810      -1       12   0.062169  short chain alcohol dehydrogenase
C695_01815      -1       11  -0.358694  hypothetical protein
C695_01825       1       11   0.178865  **UDP-glucose 4-epimerase
C695_01830       1       12  -0.201405  tRNA pseudouridine synthase A
C695_01835       0       12  -1.358487  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C695_01780  FLGMRINGFLIF  559    0.0      Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 559 bits (1443), Expect = 0.0
Identities = 178/582 (30%), Positives = 294/582 (50%), Gaps = 66/582 (11%)

Query: 11 VDFFIKLNKKQKIALIAAGVLITALLVFLLLYPFKEKDYTQGGYGVLFEGLDSSDNALIL 70
+++ +L +I LI AG A++V ++L+ K DY LF L D I+
Sbjct: 13 LEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWA-KTPDYR-----TLFSNLSDQDGGAIV 66

Query: 71 QHLQQNQIPYKVSKDD-TILIPKDKVYEERITLASQGIPKTSKVGFEIFDTKDFGATDFD 129
L Q IPY+ + I +P DKV+E R+ LA QG+PK VGFE+ D + FG + F
Sbjct: 67 AQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQFS 126

Query: 130 QNIKLIRAIEGELSRTIESLNPILKANVHIAIPKDSVFVAKEVPPSASVMLKLKPDMKLS 189
+ + RA+EGEL+RTIE+L P+ A VH+A+PK S+FV ++ PSASV + L+P L
Sbjct: 127 EQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRALD 186

Query: 190 PTQILGIKNLIAAAVPKLTIENVKIVNENGESIGEGDILENSKELALEQLHYKQNFENIL 249
QI + +L+++AV L NV +V+++G + + + + ++L QL + + E+ +
Sbjct: 187 EGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNT--SGRDLNDAQLKFANDVESRI 244

Query: 250 ENKIVNILAPIVGGKNKVVARVNAEFDFSQKKSTKETFDPNN-----VVRSEQNLEEKKE 304
+ +I IL+PIVG N V A+V A+ DF+ K+ T+E + PN +RS Q ++
Sbjct: 245 QRRIEAILSPIVGNGN-VHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQV 303

Query: 305 GASKKQVGGVPGVVSN-IGPVQGLKDNKEPEKYEKSQN---------------------- 341
GA GGVPG +SN P P + +QN
Sbjct: 304 GAGYP--GGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNE 361

Query: 342 TTNYEVGKTISEIKGEFGTLVRLNAAVVVDGKYKIALKDGVNTLEYEPLSDESLQKINAL 401
T+NYEV +TI K G + RL+ AVVV+ K L DG + PL+ + +++I L
Sbjct: 362 TSNYEVDRTIRHTKMNVGDIERLSVAVVVNYK---TLADG----KPLPLTADQMKQIEDL 414

Query: 402 VKQAIGYNQNRGDDVAVSNFEFNPMAPVIDNATLSEKIMHKTQKILGSFTPLIKYILVFI 461
++A+G++ RGD + V N F+ + T E + Q + +++LV +
Sbjct: 415 TREAMGFSDKRGDTLNVVNSPFSAVDN-----TGGELPFWQQQSFIDQLLAAGRWLLVLV 469

Query: 462 VLFIFYKKVIVPFSERMLEVVPDEYKEVKSMFEEMDEEEDELNKLGDLRKKVEDQLGLNA 521
V +I ++K + P R +E ++ + E + E L+K L+++ +Q
Sbjct: 470 VAWILWRKAVRPQLTRRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQ----- 524

Query: 522 SFSEEEVRYEIILEKIRGTLKERPDEIAMLFKLLIKDEISSD 563
+ E++ ++IR E D + L+I+ +S+D
Sbjct: 525 -----RLGAEVMSQRIR----EMSDNDPRVVALVIRQWMSND 557


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C695_01785  FLGMOTORFLIG  351    e-123    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 351 bits (902), Expect = e-123
Identities = 122/338 (36%), Positives = 209/338 (61%), Gaps = 4/338 (1%)

Query: 8 KQKAQLDELSMSEKIAILLIQVGEDTTGEILRHLDIDSITEISKQIVQLNGTDKQIGAAV 67
K+ + L+ +K AILL+ +G + + ++ ++L + I ++ +I +L ++ V
Sbjct: 7 KEILDVSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNV 66

Query: 68 LEEFFAIFQSNQYINTGGLEYARELLTRTLGSEEAKKVMDKLTKSLQTQKNFAYLGKIKP 127
L EF + + ++I GG++YARELL ++LG+++A +++ L +LQ+ + F ++ + P
Sbjct: 67 LLEFKELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQS-RPFEFVRRADP 125

Query: 128 QQLADFIINEHPQTIALILAHMEAPNAAETLSYFPDEMKAEISIRMANLGEISPQVVKRV 187
+ +FI EHPQTIALIL++++ A+ LS P E++ ++ R+A + SP+VV+ V
Sbjct: 126 ANILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREV 185

Query: 188 STVLENKLESLTSYK-IEVGGLRAVAEIFNRLGQKSAKTTLARIESVDNKLAGAIKEMMF 246
VLE KL SL+S GG+ V EI N +K+ K + +E D +LA IK+ MF
Sbjct: 186 ERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMF 245

Query: 247 TFEDIVKLDNFAIREILKVADKKDLSLALKTSTKDLTDKFLNNMSSRAAEQFVEEMQYLG 306
FEDIV LD+ +I+ +L+ D ++L+ ALK+ + +K NMS RAA E+M++LG
Sbjct: 246 VFEDIVLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLG 305

Query: 307 AVKIKDVDVAQRKIIEIVQSLQEKG--VIQTGEEEDVI 342
+ KDV+ +Q+KI+ +++ L+E+G VI G EEDV+
Sbjct: 306 PTRRKDVEESQQKIVSLIRKLEEQGEIVISRGGEEDVL 343



Score = 30.2 bits (68), Expect = 0.010
Identities = 20/102 (19%), Positives = 41/102 (40%), Gaps = 3/102 (2%)

Query: 4 KLTPKQKAQLDELSMSEKIAILLIQVGEDTTGEILRHLDIDSITEISKQIVQLNGTDKQI 63
+ P + + IA++L + IL L + T ++++I ++ T ++
Sbjct: 122 RADPANILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEV 181

Query: 64 GAA---VLEEFFAIFQSNQYINTGGLEYARELLTRTLGSEEA 102
VLE+ A S Y + GG++ E++ E
Sbjct: 182 VREVERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEK 223


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
C695_01800  TCRTETOQM  141    7e-38    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family

signature.
Length = 639

Score = 141 bits (358), Expect = 7e-38
Identities = 100/437 (22%), Positives = 176/437 (40%), Gaps = 85/437 (19%)

Query: 3 NIRNFSIIAHIDHGKSTLADCLISECNAIS---NREMKSQVMDTMDIEKERGITIKAQSV 59
I N ++AH+D GK+TL + L+ AI+ + + + D +E++RGITI+
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 60 RLNYTFKGEDYVLNLIDTPGHVDFSYEVSRSLCSCEGALLVVDATQGVEAQTIANTYIAL 119
+F+ E+ +N+IDTPGH+DF EV RSL +GA+L++ A GV+AQT +
Sbjct: 62 ----SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALR 117

Query: 120 DNHLEILPVINKIDLPNANVLEVKQDIEDTIGIDCSNTNEVSAKARLGIKD--------- 170
+ + INKID ++ V QDI++ + + +V + + +
Sbjct: 118 KMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDT 177

Query: 171 -------LLEKIITTIPAPSGDFNAPLKALIYD-------------------------SW 198
LLEK ++ + + ++ +
Sbjct: 178 VIEGNDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNK 237

Query: 199 F--------------------DNYLGALALVRIMDGSINTEQEILVMGTGKKHGVLGLYY 238
F LA +R+ G ++ + + K + +Y
Sbjct: 238 FYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI-KITEMYT 296

Query: 239 PNPLKKIPTKSLECGEIGIV---SLGLKSVTDIAVGDTLTDAKNPTSKPIEGFMPAKPFV 295
+ GEI I+ L L SV +GDT P + IE P +
Sbjct: 297 SINGELCKIDKAYSGEIVILQNEFLKLNSV----LGDTKLL---PQRERIEN---PLPLL 346

Query: 296 FAGLYPIETDRFEDLREALLKLQLNDCALNFEPESSVALGFGFRVGFLGLLHMEVIKERL 355
+ P + + E L +ALL++ +D L + +S+ + FLG + MEV L
Sbjct: 347 QTTVEPSKPQQREMLLDALLEISDSDPLLRYYVDSATH---EIILSFLGKVQMEVTCALL 403

Query: 356 EREFGLNLIATAPTVVY 372
+ ++ + + PTV+Y
Sbjct: 404 QEKYHVEIEIKEPTVIY 420



Score = 31.4 bits (71), Expect = 0.011
Identities = 15/82 (18%), Positives = 28/82 (34%), Gaps = 2/82 (2%)

Query: 399 IKEPFVRATIITPSEFLGNLMQLLNNKRGIQEKMEYLNQSRVMLTYSLPSNEIVMDFYDK 458
+ EP++ I P E+L + L + V+L+ +P+ I ++
Sbjct: 535 LLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCI-QEYRSD 592

Query: 459 LKSCTKGYASFDYEPIENREAN 480
L T G + E
Sbjct: 593 LTFFTNGRSVCLTELKGYHVTT 614


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C695_01810  DHBDHDRGNASE  83     5e-21    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 82.8 bits (204), Expect = 5e-21
Identities = 55/233 (23%), Positives = 96/233 (41%), Gaps = 21/233 (9%)

Query: 4 ILVSGATSGFGLEIAKAFLQKNHVVFGTGRRKENLQKL------QLAYPKHFIPLCFDLQ 57
++GA G G +A+ + + E L+K+ + + + F P D++
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAF-PA--DVR 67

Query: 58 NKLETKRALEAIFSMTDRIDALINNAGLALGLNKAYECELDDWEIMIDTNIKGLLHLTRL 117
+ I ID L+N AG+ L + ++WE N G+ + +R
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGV-LRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 118 ILPSMIEHDQGTIINLGSIAGTYAYPGGNVYGASKAFVKQFSLNLRADLAGTNIRVSNVE 177
+ M++ G+I+ +GS Y +SKA F+ L +LA NIR + V
Sbjct: 127 VSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVS 186

Query: 178 PGLCGETEFSMVRFKGDKIKAQSV------YENTIYL----KPQDIANIVLWI 220
PG ET+ + + Q + ++ I L KP DIA+ VL++
Sbjct: 187 PG-STETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFL 238


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C695_01825  NUCEPIMERASE  120    2e-33    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 120 bits (302), Expect = 2e-33
Identities = 62/341 (18%), Positives = 131/341 (38%), Gaps = 48/341 (14%)

Query: 1 MALLFTGACGYIGSHTARAFLEKTKENIIIVDDLSTGF---LEHLKALEHYYPNRVVFIQ 57
M L TGA G+IG H ++ LE + ++ +D+L+ + L+ + P F +
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQ-VVGIDNLNDYYDVSLKQARLELLAQPG-FQFHK 58

Query: 58 ANLNETHKLDAFLNKQQLKDPIEAILHFGAKISVEESTHLPLEYYTNNTLNTLELVKLCL 117
+L + + E + +++V S P Y +N L +++ C
Sbjct: 59 IDLADREGMTDLFASGH----FERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCR 114

Query: 118 KHAIKRFIFSSTAVVYGKSSS-SLNEESPLN-PINPYGASKMMSERILLDTSKIADFKCV 175
+ I+ +++S++ VYG + + + ++ P++ Y A+K +E + S +
Sbjct: 115 HNKIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPAT 174

Query: 176 ILRYFNVAGACMHNDYTTPYTLGQRTLNATHLIKIACECAVGKRKKMGIFGTNYPTRDGT 235
LR+F V G D + + L GK ++ G
Sbjct: 175 GLRFFTVYGPWGRPD-MALFKFTKAMLE-------------GKSID--VYN------YGK 212

Query: 236 CIRDYIHVDDLANAHLASYQTLLEKNKS---------------EIYNVGYNQGHSVKEVI 280
RD+ ++DD+A A + + + +YN+G + + + I
Sbjct: 213 MKRDFTYIDDIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYI 272

Query: 281 EKVKEISNNDFLVEILDKRQGDPASLIANNAKILQNTSFKP 321
+ +++ + +L + GD A+ + + F P
Sbjct: 273 QALEDALGIEAKKNMLPLQPGDVLETSADTKALYEVIGFTP 313


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
C695_01835  RTXTOXINA  29     0.033    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family

signature.
Length = 1024

Score = 29.2 bits (65), Expect = 0.033
Identities = 31/133 (23%), Positives = 53/133 (39%), Gaps = 24/133 (18%)

Query: 100 VFAPLSLLVSAILLVFSLILIPTSKSTYYGFLRQKKDKIDINIRAGEFGQKLGDWLV--- 156
V AP+S LV A+ + S IL + ++ + + D I E+ +K G
Sbjct: 391 VGAPVSALVGAVTGIISGILEASKQAMFEHVASKMADVIA------EWEKKHGKNYFENG 444

Query: 157 YVDKTKNNSYDNLVLFS--NKSLSQESFILAQKGNINNQNGVFEL--------NLYNGHA 206
Y + DN + S NK S E +L + + + G EL +G +
Sbjct: 445 YDARHAAFLEDNFKILSQYNKEYSVERSVLITQQHWDTLIG--ELAGVTRNGDKTLSGKS 502

Query: 207 Y---FTQGDKMRK 216
Y + +G ++ K
Sbjct: 503 YIDYYEEGKRLEK 515


22  C695_02990  C695_03020  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
C695_02990  2       13       -0.948142  hypothetical protein
C695_02995  2       14       -0.247515  hypothetical protein
C695_03000  1       16       -0.338271  dihydroorotase
C695_03005  0       16       -2.570100  hypothetical protein
C695_03010  -2      14       -2.894663  hypothetical protein
C695_03015  -2      14       -2.298641  flagellar motor switch protein
C695_03020  -1      12       -1.053525  endonuclease III
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C695_02990  TYPE3IMSPROT  30     0.006    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein

family signature.
Length = 354

Score = 29.7 bits (67), Expect = 0.006
Identities = 19/64 (29%), Positives = 30/64 (46%), Gaps = 4/64 (6%)

Query: 88 LQSYSVMLFFNLLLLTDVLGFLPFSIYHHFMASLIFSALFCSSLFLSSPLLGVIALVALS 147
L Y F L+L+ +LPFS S + + +L PLL V AL+A++
Sbjct: 45 LSDYYFEHFSKLMLIPAEQSYLPFSQ----ALSYVVDNVLLEFFYLCFPLLTVAALMAIA 100

Query: 148 SSLL 151
S ++
Sbjct: 101 SHVV 104


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C695_03005  TONBPROTEIN  53     3e-10    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 52.7 bits (126), Expect = 3e-10
Identities = 24/52 (46%), Positives = 27/52 (51%)

Query: 91 PQKPPTPPTPPTPPTPPTPPKPIEKPKPEPKPKPKPEPKKPNHKHKALKKVE 142
P P P P P P P IEKPKP+PKPKPKP K + +K VE
Sbjct: 67 PVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVE 118



Score = 46.5 bits (110), Expect = 4e-08
Identities = 27/74 (36%), Positives = 32/74 (43%), Gaps = 8/74 (10%)

Query: 83 APKPTLAGPQKPPTPPTPPTPPTPPTPPKPIEKPKPEPKPKPKPEPKKPNHKHKALKKVE 142
A Q PP P P P P P P E P KPKPKP+PK K V+
Sbjct: 53 ADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKP--------KPVK 104

Query: 143 KVEEKKVVEEKKEE 156
KV+E+ + K E
Sbjct: 105 KVQEQPKRDVKPVE 118



Score = 45.4 bits (107), Expect = 9e-08
Identities = 21/65 (32%), Positives = 27/65 (41%), Gaps = 1/65 (1%)

Query: 83 APKPTLAGPQKPPTPPTPPTPPTPPTPPKPIEKPKPEPKPKPKPEPK-KPNHKHKALKKV 141
P P P P P P PPK +PKPKPKP+PK + + + V
Sbjct: 55 LEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDV 114

Query: 142 EKVEE 146
+ VE
Sbjct: 115 KPVES 119



Score = 38.8 bits (90), Expect = 1e-05
Identities = 43/218 (19%), Positives = 80/218 (36%), Gaps = 38/218 (17%)

Query: 101 PTPPTPPTPPKPIEKPKPEPKPKPKPEPKKPNHKHKALKKVEKVEEKKVVEEKKEEKKIV 160
P PP +P +P EP+P+P+P P+ P +E VV EK + K
Sbjct: 52 PADLEPPQAVQPPPEPVVEPEPEPEPIPEPP-------------KEAPVVIEKPKPKPKP 98

Query: 161 EQKVEQKVEQKKIEEKKPVKKEFDPNQLSFLPKEVAPPRQENNKGLDNQTRRDIDELYGE 220
+ K +KV+++ + KPV E P N T +
Sbjct: 99 KPKPVKKVQEQPKRDVKPV--------------ESRPASPFENTAPARLTSSTATAATSK 144

Query: 221 EFGDLGTAEKDFIRNNLRDIGRITQKYLEYPQVAAYLGQDGTNAVEFYLHPNGDITDLKI 280
+ + + RN + YP A L +G V+F + P+G + +++I
Sbjct: 145 PVTSVASGPRALSRNQPQ-----------YPARAQALRIEGQVKVKFDVTPDGRVDNVQI 193

Query: 281 IIGSEYKMLDDNTLKTIQIAYKDYPRPKTKTLIRIRVR 318
+ M + ++ + +P + ++ I +
Sbjct: 194 LSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFK 231



Score = 35.0 bits (80), Expect = 3e-04
Identities = 12/42 (28%), Positives = 17/42 (40%)

Query: 91 PQKPPTPPTPPTPPTPPTPPKPIEKPKPEPKPKPKPEPKKPN 132
P+ P P P P PKP K + +PK +P +
Sbjct: 79 PEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESR 120



Score = 29.6 bits (66), Expect = 0.013
Identities = 13/48 (27%), Positives = 19/48 (39%)

Query: 84 PKPTLAGPQKPPTPPTPPTPPTPPTPPKPIEKPKPEPKPKPKPEPKKP 131
+ P K P P PKP++K + +PK KP +P
Sbjct: 74 EPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESRP 121


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C695_03015  FLGMOTORFLIN  99     2e-30    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 99 bits (249), Expect = 2e-30
Identities = 25/77 (32%), Positives = 47/77 (61%)

Query: 34 LICDYKNLLDMEIVFSAELGSTQIPLLQILRFEKGSVIDLQKPAGESVDTFVNGRVIGKG 93
+ D ++D+ + + ELG T++ + ++LR +GSV+ L AGE +D +NG +I +G
Sbjct: 50 AMQDIDLIMDIPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQG 109

Query: 94 EVMVFERNLAIRLNEIL 110
EV+V +R+ +I+
Sbjct: 110 EVVVVADKYGVRITDII 126


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
C695_03020  OMS28PORIN  28     0.031    OMS28 porin signature.
		>OMS28PORIN#OMS28 porin signature.

Length = 257

Score = 27.8 bits (61), Expect = 0.031
Identities = 28/112 (25%), Positives = 53/112 (47%), Gaps = 11/112 (9%)

Query: 25 NQTTELRHKNPYELLVATILSAQCTDARVNQITPKLFEKYPSVNDLAL-----ASLEEVK 79
N+ E+ K E A ++ + T QI + K P+ +L L A +E+VK
Sbjct: 132 NKVVEMSKKAVQETQKAVSVAGEATFLIEKQI---MLNKSPNNKELELTKEEFAKVEQVK 188

Query: 80 EIIKSVSYFNNKSKHLISMAQKVVRDFKGVIPSTQKELMSLDGVGQKTANVV 131
E + + +++ + AQKV+ G+ PS + ++++ V + +NVV
Sbjct: 189 ETLMASERALDET---VQEAQKVLNMVNGLNPSNKDQVLAKKDVAKAISNVV 237


23  C695_04640  C695_04680  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias   Product
C695_04640  1       15       2.047414  acetate kinase
C695_04645  1       14       2.522278  acetate kinase A/propionate kinase 2
C695_04650  1       15       1.692701  phosphotransacetylase
C695_04665  2       14       0.861018  phosphotransacetylase (pta)
C695_04670  0       14       0.732563  hypothetical protein
C695_04675  -2      13       1.458710  flagellar basal body rod modification protein
C695_04680  -1      14       1.873795  flagellar hook protein FlgE
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C695_04640  ACETATEKNASE  123    5e-37    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 123 bits (310), Expect = 5e-37
Identities = 48/117 (41%), Positives = 72/117 (61%), Gaps = 2/117 (1%)

Query: 1 MRNIEARK-EKGDKEAKLAFEMCAYRIKKHIGAYMVVLKKVDAIIFTGGLGENYSALRES 59
R++E + GDK A+LA + AYR+KK IG+Y + VD I+FT G+GEN +RE
Sbjct: 283 FRDLEDAAFKNGDKRAQLALNVFAYRVKKTIGSYAAAMGGVDVIVFTAGIGENGPEIREF 342

Query: 60 VCEGLENLGIALCKPTNDNPGSGLVNLSQPDAKIQILRIPTDEELEIALQTKKVLEK 116
+ +GLE LG L K N G + +S D+K+ ++ +PT+EE IA T+K++E
Sbjct: 343 ILDGLEFLGFKLDKEKNKVRGEEAI-ISTADSKVNVMVVPTNEEYMIAKDTEKIVES 398


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C695_04645  ACETATEKNASE  353    e-123    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 353 bits (907), Expect = e-123
Identities = 142/282 (50%), Positives = 192/282 (68%), Gaps = 6/282 (2%)

Query: 1 MEILVLNLGSSSIKFKLFDMKENKPLASGLAEKIGEEIGQLKIKSHLHHNDQELKEKFVI 60
M+ILV+N GSSS+K++L + K+ LA GLAE+IG L N +++K K +
Sbjct: 1 MKILVINCGSSSLKYQLIESKDGNVLAKGLAERIGINDSLLTHN----ANGEKIKIKKDM 56

Query: 61 KDHASGLLMIRENLT--KMGIIKDFNQIDAIGHRVVQGGDKFHAPVLVNEKVMQEIGNLS 118
KDH + ++ + L G+IKD ++IDA+GHRVV GG+ F + VL+ + V++ I +
Sbjct: 57 KDHKDAIKLVLDALVNSDYGVIKDMSEIDAVGHRVVHGGEYFTSSVLITDDVLKAITDCI 116

Query: 119 ILAPLHNPANLAGIEFVQKAHPHIPQIAVFDTAFHATMPSYAYMYALPYELYEKYQIRHY 178
LAPLHNPAN+ GI+ + P +P +AVFDTAFH TMP YAY+Y +PYE Y KY+IR Y
Sbjct: 117 ELAPLHNPANIEGIKACTQIMPDVPMVAVFDTAFHQTMPDYAYLYPIPYEYYTKYKIRKY 176

Query: 179 GFHRTSHHYVAKEAAKFLNTAYEEFNAISLHLGNGSSAAAIQKGKSVDTSMGLTPLEGLI 238
GFH TSH YV++ AA+ LN E I+ HLGNGSS AA++ GKS+DTSMG TPLEGL
Sbjct: 177 GFHGTSHKYVSQRAAEILNKPIESLKIITCHLGNGSSIAAVKNGKSIDTSMGFTPLEGLA 236

Query: 239 MGTRCGDIDPTVVEYTAQCANKSLEEVMKMLNHESGLKGICG 280
MGTR G IDP+++ Y + N S EEV+ +LN +SG+ GI G
Sbjct: 237 MGTRSGSIDPSIISYLMEKENISAEEVVNILNKKSGVYGISG 278


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C695_04670  IGASERPTASE  36     5e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 35.8 bits (82), Expect = 5e-04
Identities = 40/227 (17%), Positives = 70/227 (30%), Gaps = 11/227 (4%)

Query: 294 KKSEKTPIHAKTQTTAPSATPENAPKIPLKTPPLMPLIGANPPPNDNIPTPLEKEEKAKE 353
+ Q PS N + P+ P A TP E E E
Sbjct: 992 VDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPA---------TPSETTETVAE 1042

Query: 354 ASDNKEKTKETSNSAQNAQNTQASDKTSDNKSTAPKETIKHFTQQLKQEIQEYKPPMSRI 413
S + KT E + Q + + KS T + Q E +E + ++
Sbjct: 1043 NSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKE 1102

Query: 414 SMDLFPKELGKVEVIIQKVGKNLKVSVISHNNSLQTFLDNQQDLKNSLNALGFEGVDLSF 473
+ + +E KVE +K + KV+ Q+ Q N +
Sbjct: 1103 TATVEKEEKAKVET--EKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQS 1160

Query: 474 SQDSSKEQQAPKDQPKEPFKEQELTPLKENALKSYQENTDNENQETS 520
+++ + + P + ++ N S EN +N T+
Sbjct: 1161 QTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATT 1207



Score = 30.4 bits (68), Expect = 0.025
Identities = 48/240 (20%), Positives = 82/240 (34%), Gaps = 18/240 (7%)

Query: 184 PKTLKDIQTLSQKHDLNASNIQAATTPENKN------PLNASDQLALKTTQTPTNHTLAK 237
P+ K QT+ + +NIQA N A T + T T+A+
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAE 1042

Query: 238 NDAKNTANLSSVLQSLEKKEPQNKEHANPLNNEKKTPPLK--------EALEMNAIKRDK 289
N + + + Q + QN+E A + K E E + +
Sbjct: 1043 NSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKE 1102

Query: 290 TLSKKKSEKTPIHAKTQTTAPSATPENAPKIPLKTPPLMPLIGANP-PPNDNIPTPLEKE 348
T + +K EK + + P T + +PK + A P ND E +
Sbjct: 1103 TATVEKEEKAKVETEKTQEVPKVTSQVSPK---QEQSETVQPQAEPARENDPTVNIKEPQ 1159

Query: 349 EKAKEASDNKEKTKETSNSAQNAQNTQASDKTSDNKSTAPKETIKHFTQQLKQEIQEYKP 408
+ +D ++ KETS++ + + T ++ P+ T TQ KP
Sbjct: 1160 SQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKP 1219


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
C695_04680  FLGHOOKAP1  35     7e-04    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 35.3 bits (81), Expect = 7e-04
Identities = 12/33 (36%), Positives = 20/33 (60%)

Query: 2 NDTLLNAYSGIKTHQFGIDSLSNNIANVNTLGY 34
+ + NA SG+ Q +++ SNNI++ N GY
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGY 33



Score = 33.0 bits (75), Expect = 0.004
Identities = 10/48 (20%), Positives = 20/48 (41%)

Query: 557 IRHKYLETSNVNAGNALTNLILMQRGYSMNARAFGAGDDMIKEAISLK 604
+ ++ S VN NL Q+ Y NA+ + + I+++
Sbjct: 499 LSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546


24  C695_06135  C695_06185  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
C695_06135  -1      15       0.046356   arabinose transporter
C695_06140  -1      16       -1.359813  hypothetical protein
C695_06145  -1      14       -0.648182  alpha-carbonic anhydrase
C695_06150  0       17       -1.332237  hypothetical protein
C695_06155  -1      15       0.170877   hypothetical protein
C695_06160  -3      10       2.185107   aspartate-semialdehyde dehydrogenase
C695_06165  -3      11       2.254826   histidyl-tRNA ligase
C695_06170  -2      12       3.132413   ADP-heptose--LPS heptosyltransferase II
C695_06175  0       11       4.176620   flagellar motility protein
C695_06180  0       12       3.988990   aldo/keto reductase
C695_06185  1       13       4.045579   elongation factor G
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
C695_06135  TCRTETB  49     2e-08    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 49.1 bits (117), Expect = 2e-08
Identities = 32/132 (24%), Positives = 63/132 (47%), Gaps = 1/132 (0%)

Query: 37 LSDIAKSFEMESATVGLMITAYAWVVSLGSLPLMLLSAKIERKRLLLFLFALFILSHILS 96
L DIA F A+ + TA+ S+G+ LS ++ KRLLLF + ++
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 97 ALAWNFW-VLLLSRMGIAFAHSIFWSITASLVIRVAPRNKKQQALGLLALGSSLAMILGL 155
+ +F+ +L+++R + F ++ +V R P+ + +A GL+ ++ +G
Sbjct: 97 FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGP 156

Query: 156 PLGRIIGQILDW 167
+G +I + W
Sbjct: 157 AIGGMIAHYIHW 168


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C695_06150  IGASERPTASE  31     0.011    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.8 bits (69), Expect = 0.011
Identities = 20/86 (23%), Positives = 34/86 (39%), Gaps = 11/86 (12%)

Query: 124 SEQIEL--EQEKQKTSNIETNNQIKVEQEKQKT-------SNIETNNQI--KVEQEQQKT 172
SE E E KQ++ +E N Q E Q SN++ N Q + +
Sbjct: 1034 SETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETK 1093

Query: 173 EQERQKTEQERQKTEQEKQKTIKTQK 198
E + +T++ ++EK K +
Sbjct: 1094 ETQTTETKETATVEKEEKAKVETEKT 1119


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C695_06155  IGASERPTASE  28     0.026    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 28.5 bits (63), Expect = 0.026
Identities = 40/222 (18%), Positives = 66/222 (29%), Gaps = 25/222 (11%)

Query: 3 DKVQDKSKQAEKENQINWWKYSGLTIATSLLL--AACSVGDIDKQIELEQEKKEAENARD 60
+ V + SKQ K + N + T + A +V + E+ Q E + +
Sbjct: 1038 ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQT 1097

Query: 61 RANKSGIELE-QEKQKTIKEQKDLVKKAEQNCQENHGQFFMKKLGIKGGIAIEVEAECKT 119
K +E +EK K E+ V K Q E
Sbjct: 1098 TETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQ---------------SETVQPQ 1142

Query: 120 PKPAKTNQTPIQPKHLPNSKQPHSQRGSKA-QELIAYLQKELESLPYSQKAIAKQVNFYR 178
+PA+ N + N K+P SQ + A E A P ++ N
Sbjct: 1143 AEPARENDPTV------NIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVV 1196

Query: 179 PSSVAYLELDPRDFKVTEEWQKENLKIRSKAQAKMLGNEKPT 220
+ + +E K + R ++ E T
Sbjct: 1197 ENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPAT 1238


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C695_06165  ANTHRAXTOXNA  31     0.012    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 30.9 bits (69), Expect = 0.012
Identities = 21/58 (36%), Positives = 31/58 (53%), Gaps = 3/58 (5%)

Query: 180 EALRIVDKLEKIGLNGVEEELKKECGLNSNTIKELLELIQIKQNDL--SHAEFFEKIA 235
+ ++KLEK G + E LKKE G+ + I L +K + L HA+ F+KIA
Sbjct: 263 DMFEYMNKLEKGGFEKISESLKKE-GVEKDRIDVLKGEKALKASGLVPEHADAFKKIA 319


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
C695_06185  TCRTETOQM  642    0.0      Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family

signature.
Length = 639

Score = 642 bits (1658), Expect = 0.0
Identities = 179/671 (26%), Positives = 304/671 (45%), Gaps = 66/671 (9%)

Query: 9 RIRNIGIAAHIDAGKTTTSERILFYTGVSHKIGEVHDGAATMDWMEQEKERGITITSAAT 68
+I NIG+ AH+DAGKTT +E +L+ +G ++G V G D E++RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 69 TCFWKDHQINLIDTPGHVDFTIEVERSMRVLDGAVSVFCSVGGVQPQSETVWRQANKYGV 128
+ W++ ++N+IDTPGH+DF EV RS+ VLDGA+ + + GVQ Q+ ++ K G+
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 129 PRIVFVNKMDRIGANFYNVENQIKLRLKANPVPINIPIGAEDTFIGVIDLVQMKAIVWNN 188
P I F+NK+D+ G + V IK +L A V
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVI--------------------------- 154

Query: 189 ETMGAKYDVEEIPSDLLEKAKEYREKLVEAVAEQDEALMEKYLGGEELSIEEIKKGIKAG 248
K VE P+ + E + + V E ++ L+EKY+ G+ L E+++
Sbjct: 155 -----KQKVELYPNMCVTNFTESEQ--WDTVIEGNDDLLEKYMSGKSLEALELEQEESIR 207

Query: 249 CLNMSLVPMLCGSSFKNKGVQTLLDAVIDYLPAPTEVVDIKGIDPKTEEEVFVKSSDDGE 308
N SL P+ GS+ N G+ L++ + + + T E
Sbjct: 208 FHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSSTH-------------------RGQSE 248

Query: 309 FAGLAFKIMTDPFVGQLTFVRVYRGKLESGSYVYNSTKDKKERVGRLLKMHSNKREDIKE 368
G FKI +L ++R+Y G L V S K+K ++ + + + I +
Sbjct: 249 LCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI-KITEMYTSINGELCKIDK 307

Query: 369 VYAGEICAFVG----LKDTLTGDTLCDEKNAVVLERMEFPEPVIHIAVEPKTKADQEKMG 424
Y+GEI L L GDT + ER+E P P++ VEP +E +
Sbjct: 308 AYSGEIVILQNEFLKLNSVL-GDTKLLPQR----ERIENPLPLLQTTVEPSKPQQREMLL 362

Query: 425 VALGKLAEEDPSFRVMTQEETGQTLIGGMGELHLEIIVDRLKREFKVEAEIGQPQVAFRE 484
AL ++++ DP R T + ++ +G++ +E+ L+ ++ VE EI +P V + E
Sbjct: 363 DALLEISDSDPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEKYHVEIEIKEPTVIYME 422

Query: 485 TIRSSVSKEHKYAKQSGGRGQYGHVFIKLEPKEPGSGYEFVNEISGGVIPKEYIPAVDKG 544
R E+ + + + + + P GSG ++ + +S G + + + AV +G
Sbjct: 423 --RPLKKAEYTIHIEVPPNPFWASIGLSVSPLPLGSGMQYESSVSLGYLNQSFQNAVMEG 480

Query: 545 IQEAMQNGVLAGYPVVDFKVTLYDGSYHDVDSSEMAFKIAGSMAFKEASRAANPVLLEPM 604
I+ + G L G+ V D K+ G Y+ S+ F++ + ++ + A LLEP
Sbjct: 481 IRYGCEQG-LYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQVLKKAGTELLEPY 539

Query: 605 MKVEVEVPEEYMGDVIGDLNRRRGQINSMDDRLGLKIVNAFVPLVEMFGYSTDLRSATQG 664
+ ++ P+EY+ D + I + I++ +P + Y +DL T G
Sbjct: 540 LSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKNNEVILSGEIPARCIQEYRSDLTFFTNG 599

Query: 665 RGTYSMEFDHY 675
R E Y
Sbjct: 600 RSVCLTELKGY 610


25  C695_07525  C695_07560  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias   Product
C695_07525  0       13       0.367534  membrane protein insertase
C695_07530  0       11       0.352398  hypothetical protein
C695_07535  1       11       0.693475  tRNA modification GTPase TrmE
C695_07540  2       12       1.579088  hypothetical protein
C695_07545  -1      15       0.930477  hypothetical protein
C695_07550  -2      14       1.400485  hypothetical protein
C695_07555  -2      13       1.956965  hypothetical protein
C695_07560  -2      13       2.254749  membrane-associated lipoprotein (lpp20)
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C695_07525  60KDINNERMP  431    e-148    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 431 bits (1110), Expect = e-148
Identities = 162/581 (27%), Positives = 275/581 (47%), Gaps = 81/581 (13%)

Query: 9 RLILAIALSFLFIALYSYFFQKPNKTTTQTTKQETTNNHTATSPNAPNAQHFSTTQTTPQ 68
R +L IAL F+ ++ Q T T+ A + + Q
Sbjct: 5 RNLLVIALLFVSFMIW-------QAWEQDKNPQPQAQQTTQTTTTAAGSAADQGVPASGQ 57

Query: 69 ENLLSTISFEHARIEIDSLG-RIKQVYLKDKKYLTPKQKGFLEHVG--HLFSSKEN---- 121
L+ ++ + + I++ G ++Q L P L L +
Sbjct: 58 GKLI-SVKTDVLDLTINTRGGDVEQALL-------PAYPKELNSTQPFQLLETSPQFIYQ 109

Query: 122 AQPPL--KELPLLAADKLKPLEVRFLDPTLNNKAFNTPYSASKTTLGPNEQLV--LTQDL 177
AQ L ++ P A+ +PL +N A G NE V D
Sbjct: 110 AQSGLTGRDGPDNPANGPRPL-------------YNVEKDAYVLAEGQNELQVPMTYTDA 156

Query: 178 GTLSIIKTLTFYDDLHYDLKIAFKSPNN------------------LIPSYVITNGYRPV 219
+ KT Y + + + N L P + +
Sbjct: 157 AGNTFTKTFVLKRG-DYAVNVNYNVQNAGEKPLEISSFGQLKQSITLPPHLDTGSSNFAL 215

Query: 220 ADLDSYTFSGVLLENSDKKIEKIE---DKDAKEIKRFSNTLFLSSVDRYFTTLLFTKDPQ 276
+TF G D+K EK + D + + S +++ + +YF T +
Sbjct: 216 -----HTFRGAAYSTPDEKYEKYKFDTIADNENLNISSKGGWVAMLQQYFATAWIPHN-D 269

Query: 277 GFEALIDSEIGTKNPLGFISLKNEA-----------NLHGYIGPKDYRSLKAISPMLTDV 325
G + +G N + I K++ N ++GP+ + A++P L
Sbjct: 270 GTNNFYTANLG--NGIAAIGYKSQPVLVQPGQTGAMNSTLWVGPEIQDKMAAVAPHLDLT 327

Query: 326 IEYGLITFFAKGVFVLLDYLYQFVGNWGWAIILLTIIVRIILYPLSYKGMVSMQKLKELA 385
++YG + F ++ +F LL +++ FVGNWG++II++T IVR I+YPL+ SM K++ L
Sbjct: 328 VDYGWLWFISQPLFKLLKWIHSFVGNWGFSIIIITFIVRGIMYPLTKAQYTSMAKMRMLQ 387

Query: 386 PKMKELQEKYKGEPQKLQAHMMQLYKKHGANPLGGCLPLILQIPVFFAIYRVLYNAVELK 445
PK++ ++E+ + Q++ MM LYK NPLGGC PL++Q+P+F A+Y +L +VEL+
Sbjct: 388 PKIQAMRERLGDDKQRISQEMMALYKAEKVNPLGGCFPLLIQMPIFLALYYMLMGSVELR 447

Query: 446 SSEWILWIHDLSIMDPYFILPLLMGASMYWHQSVTPNTMTDPMQAKIFKLLPLLFTIFLI 505
+ + LWIHDLS DPY+ILP+LMG +M++ Q ++P T+TDPMQ KI +P++FT+F +
Sbjct: 448 QAPFALWIHDLSAQDPYYILPILMGVTMFFIQKMSPTTVTDPMQQKIMTFMPVIFTVFFL 507

Query: 506 TFPAGLVLYWTTNNILSVLQQLIINKVLENKKRMHAQNKKE 546
FP+GLVLY+ +N+++++QQ +I + LE K+ +H++ KK+
Sbjct: 508 WFPSGLVLYYIVSNLVTIIQQQLIYRGLE-KRGLHSREKKK 547


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C695_07530  IGASERPTASE  30     0.009    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.4 bits (68), Expect = 0.009
Identities = 16/46 (34%), Positives = 25/46 (54%), Gaps = 2/46 (4%)

Query: 64 KEESVKETNTKEIHQSAEEKKQKLETETPQEEIITPKPSKKNPKEE 109
+ + + T TKE +E+K K+ETE QE S+ +PK+E
Sbjct: 1091 ETKETQTTETKETATVEKEEKAKVETEKTQEV--PKVTSQVSPKQE 1134


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
C695_07535  TCRTETOQM  31     0.008    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family

signature.
Length = 639

Score = 31.4 bits (71), Expect = 0.008
Identities = 32/134 (23%), Positives = 53/134 (39%), Gaps = 25/134 (18%)

Query: 216 LSIVGKPNAGKSSLLNAMLLEERA---LVSDIKGTTR-DTIEE-------------VIEL 258
+ ++ +AGK++L ++L A L S KGTTR D +
Sbjct: 6 IGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQW 65

Query: 259 KGHKVRLIDTAGIRESADKIERLGIEKSLKSLENCDIILGVFDLSKPLEKEDFNLIDTLN 318
+ KV +IDT G + ++ R SL L D + + ++ + L L
Sbjct: 66 ENTKVNIIDTPGHMDFLAEVYR-----SLSVL---DGAILLISAKDGVQAQTRILFHALR 117

Query: 319 RAKKPCIVVLNKND 332
+ P I +NK D
Sbjct: 118 KMGIPTIFFINKID 131


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
C695_07560  LIPOLPP20  293    e-105    LPP20 lipoprotein precursor signature.
		>LIPOLPP20#LPP20 lipoprotein precursor signature.

Length = 175

Score = 293 bits (751), Expect = e-105
Identities = 175/175 (100%), Positives = 175/175 (100%)

Query: 1 MKNQVKKILGMSVVAAMVIVGCSHAPKSGISKSNKAYKEATKGAPDWVVGDLEKVAKYEK 60
MKNQVKKILGMSVVAAMVIVGCSHAPKSGISKSNKAYKEATKGAPDWVVGDLEKVAKYEK
Sbjct: 1 MKNQVKKILGMSVVAAMVIVGCSHAPKSGISKSNKAYKEATKGAPDWVVGDLEKVAKYEK 60

Query: 61 YSGVFLGRAEDLITNNDVDYSTNQATAKARANLAANLKSTLQKDLENEKTRTVDASGKRS 120
YSGVFLGRAEDLITNNDVDYSTNQATAKARANLAANLKSTLQKDLENEKTRTVDASGKRS
Sbjct: 61 YSGVFLGRAEDLITNNDVDYSTNQATAKARANLAANLKSTLQKDLENEKTRTVDASGKRS 120

Query: 121 ISGTDTEKISQLVDKELIASKMLARYVGKDRVFVLVGLDKQIVDKVREELGMVKK 175
ISGTDTEKISQLVDKELIASKMLARYVGKDRVFVLVGLDKQIVDKVREELGMVKK
Sbjct: 121 ISGTDTEKISQLVDKELIASKMLARYVGKDRVFVLVGLDKQIVDKVREELGMVKK 175


26  C695_08085  C695_08110  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
C695_08085  -1      13       2.406580   flagellar hook-basal body protein FliE
C695_08090  -1      13       2.127934   flagellar basal body rod protein FlgC
C695_08095  0       15       1.717916   flagellar basal body rod protein FlgB
C695_08100  1       13       1.732642   cell division protein FtsW
C695_08105  0       12       0.230607   iron(III) ABC transporter periplasmic
C695_08110  1       13       -0.099049  iron(III) ABC transporter periplasmic
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C695_08085  FLGHOOKFLIE   77     6e-22    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE

signature.
Length = 103

Score = 77.0 bits (189), Expect = 6e-22
Identities = 19/77 (24%), Positives = 40/77 (51%), Gaps = 1/77 (1%)

Query: 34 EQKGGEFSKLLKQSINELNNTQEQSDKALADMATGQIK-DLHQAAIAIGKAETSMKLMLE 92
Q F+ L +++ +++TQ + G+ L+ + KA SM++ ++
Sbjct: 27 PQPTISFAGQLHAALDRISDTQTAARTQAEKFTLGEPGVALNDVMTDMQKASVSMQMGIQ 86

Query: 93 VRNKAISAYKELLRTQI 109
VRNK ++AY+E++ Q+
Sbjct: 87 VRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
C695_08090  FLGHOOKAP1  29     0.011    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 28.8 bits (64), Expect = 0.011
Identities = 10/38 (26%), Positives = 15/38 (39%)

Query: 121 NVNAVVEMADLVEATRAYQANVAAFQSAKNMAQNAIGM 158
VN E +L + Y AN Q+A + I +
Sbjct: 508 GVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C695_08105  FERRIBNDNGPP  34     8e-04    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 33.8 bits (77), Expect = 8e-04
Identities = 28/183 (15%), Positives = 77/183 (42%), Gaps = 10/183 (5%)

Query: 108 NVELLKKLSPDLVVTFVG-NPKAVEHAKKFGISFLSFQETT--IAEAMQAMQ--AQATVL 162
N+ELL ++ P +V G P A+ +F + +A A +++ A L
Sbjct: 88 NLELLTEMKPSFMVWSAGYGPSPEMLARIAPGRGFNFSDGKQPLAMARKSLTEMADLLNL 147

Query: 163 EIDASKKFAKMQETLDFIAERL-KNVKKKKGVELFHKAN--KISGHQAISSDILEKGGID 219
+ A A+ ++ + + R K + + + G ++ +IL++ GI
Sbjct: 148 QSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVFGPNSLFQEILDEYGIP 207

Query: 220 N-FGLKYVKFGRADISVEKIVK-ENPEIIFIWWISPLTPEDVLNNPKFATIKAIKNKQVY 277
N + + +G +S++++ ++ +++ + + ++ P + + ++ +
Sbjct: 208 NAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNSKDMDALMATPLWQAMPFVRAGRFQ 267

Query: 278 KLP 280
++P
Sbjct: 268 RVP 270


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C695_08110  FERRIBNDNGPP  33     0.001    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 33.0 bits (75), Expect = 0.001
Identities = 29/184 (15%), Positives = 75/184 (40%), Gaps = 12/184 (6%)

Query: 106 NVELLKKLGPDLVVTFVGNPKAVEHAKKF--GILFLSFQEKTIAEVMEDID---AQAKAL 160
N+ELL ++ P +V G + E + G F K + A L
Sbjct: 88 NLELLTEMKPSFMVWSAGYGPSPEMLARIAPGRGFNFSDGKQPLAMARKSLTEMADLLNL 147

Query: 161 EIDASKKLAKMQETLDFIAERLKGVKKKKGVELFHKAN----KISGHQALDSDILEKGGI 216
+ A LA+ ++ + + R + + + L + + G +L +IL++ GI
Sbjct: 148 QSAAETHLAQYEDFIRSMKPRFVK-RGARPLLLTTLIDPRHMLVFGPNSLFQEILDEYGI 206

Query: 217 DN-FGLKYVKFGRADISVEKIVK-ENPEIIFIWWISPLTPEDVLNNPKFATIKAIKNKQV 274
N + + +G +S++++ ++ +++ + + ++ P + + ++ +
Sbjct: 207 PNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNSKDMDALMATPLWQAMPFVRAGRF 266

Query: 275 YKLP 278
++P
Sbjct: 267 QRVP 270



 
Contact Sachin Pundhir for Bugs/Comments.