PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: Rif2.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
 
B) Compare a potential GI or PAI in related non-pathogenic sp. (phylogenetic tree)
 
C) Potential GIs and PAIs in CP003906
S.No  Start       End         Bias  Virulence  Insertion elements  Prediction
1     C730_00225  C730_00390  Y     Y          Y                   Pathogenicity Island (biased-composition)

LocusTag    DNBias  CDNBias  %GCBias    Product
C730_00225  -2      16       -3.369469  hypothetical protein
C730_00230  -1      16       -5.323031  adenine-specific DNA methyltransferase
C730_00235  -2      16       -4.049378  cytosine specific DNA methyltransferase
C730_00240  -2      9        -2.427813  hypothetical protein
C730_00245  -2      9        -2.292084  hypothetical protein
C730_00250  -1      9        -2.330093  hypothetical protein
C730_00255  0       9        -1.286847  adenine/cytosine DNA methyltransferase
C730_00260  1       10       0.373618   sodium/proline symporter
C730_00265  1       12       -0.056581  delta-1-pyrroline-5-carboxylate dehydrogenase
C730_00270  3       15       -1.024828  hypothetical protein
C730_00275  3       13       -0.990422  Myosin-3
C730_00280  3       12       -0.616635  hypothetical protein
C730_00285  2       12       -0.576790  hypothetical protein
C730_00290  1       14       -0.018208  hypothetical protein
C730_00295  1       10       0.465170   hypothetical protein
C730_00300  0       11       0.961832   hypothetical protein
C730_00305  0       10       1.490057   hypothetical protein
C730_00310  0       11       1.578106   hypothetical protein
C730_00315  -1      11       2.152520   ATP-binding protein
C730_00320  0       14       2.806425   urease accessory protein UreH
C730_00325  4       23       3.260418   urease accessory protein UreG
C730_00330  4       24       2.825599   urease accessory protein UreF
C730_00335  4       24       2.470118   urease accessory protein UreE
C730_00340  3       21       2.390763   urease accessory protein UreI
C730_00345  1       20       2.010771   hypothetical protein
C730_00350  0       17       2.481742   urease subunit beta
C730_00355  -3      10       1.673604   urease subunit alpha
C730_00360  0       13       2.216760   *lipoprotein signal peptidase
C730_00365  2       13       2.667843   phosphoglucosamine mutase
C730_00370  3       15       2.863142   30S ribosomal protein S20
C730_00375  2       13       2.058185   peptide chain release factor 1
C730_00380  3       13       1.925539   hypothetical protein
C730_00385  3       13       1.883748   hypothetical protein
C730_00390  2       14       1.133666   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_00265  ANTHRAXTOXNA  31     0.034    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 30.9 bits (69), Expect = 0.034
Identities = 36/173 (20%), Positives = 71/173 (41%), Gaps = 19/173 (10%)

Query: 121 QEESQLKERILKRKNEKIILNVNFIGEEVLGEEEANARFEKY---SQALKSNYIQYISIK 177
Q+ S+ ++ + + EK+ F+ E+ + + Y S+ K Y +
Sbjct: 118 QDLSEEEKNSMNSRGEKVPFASRFVFEKKRETPKLIINIKDYAINSEQSKEVYYEIGKGI 177

Query: 178 ITTIFSQINILDFEY-----SKKEIVKRLDALYALALEEEKKQGMPKFINLDMEEFRDLE 232
I S+ LD E+ S + D L++ +E K + K I+++ ++
Sbjct: 178 SLDIISKDKSLDPEFLNLIKSLSDDSDSSDLLFSQKFKE-KLELNNKSIDINF-----IK 231

Query: 233 LTVESFMESIAK-----FDLNAGIVLQAYIPDSYEYLKKLHAFSKERVLKGLK 280
+ F + + F + VL+ Y PD +EY+ KL E++ + LK
Sbjct: 232 ENLTEFQHAFSLAFSYYFAPDHRTVLELYAPDMFEYMNKLEKGGFEKISESLK 284
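
The statistics line of each alignment (Identities, Positives, Gaps) can be read mechanically. The following is a minimal Python sketch, not part of PredictBias; the function names are hypothetical. It parses one such line and recomputes each percentage with integer truncation, which matches the values reported for the alignment above (36/173 is about 20.8%, shown as 20%).

```python
import re

# Matches e.g. "Identities = 36/173 (20%)"; the Gaps field is optional in the report.
STAT_RE = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)")


def parse_alignment_stats(line):
    """Return {name: (count, alignment_length, reported_percent)} for one statistics line."""
    return {
        name: (int(count), int(length), int(percent))
        for name, count, length, percent in STAT_RE.findall(line)
    }


def truncated_percent(count, length):
    """Integer percentage with the fractional part dropped (assumed display rule)."""
    return 100 * count // length


line = "Identities = 36/173 (20%), Positives = 71/173 (41%), Gaps = 19/173 (10%)"
stats = parse_alignment_stats(line)
for name, (count, length, reported) in stats.items():
    print(name, count, length, reported, truncated_percent(count, length))
```

Lines without a Gaps field (for example "Identities = 40/217 (18%), Positives = 74/217 (34%)") simply yield two entries.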


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
C730_00280  GPOSANCHOR  46     1e-07    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 45.8 bits (108), Expect = 1e-07
Identities = 40/217 (18%), Positives = 74/217 (34%)

Query: 13 QVRKELEARISGLEDENAELFAENEKLALGTSELKDANNQLRQKNDKLFTTKENLTQEKT 72
+ +LE + G + + A+ + L + L L + + + +
Sbjct: 120 ARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIK 179

Query: 73 ELTEKNKVLTTEKGNLDNQLNASQKQVQALEQSQQVLENEKVELTNKITDLSKEKENLTK 132
L + L + L+ L + A + LE EK L + DL K E
Sbjct: 180 TLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMN 239

Query: 133 ANTELKTENDKLNHQVIALTKEQDSLKQERAQLQDAHGFLEELCANLEKDNQHLTDKLKK 192
+T + L + AL Q L++ + LE + L +
Sbjct: 240 FSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKAD 299

Query: 193 LESAQKNLENSNDQLLQAIENIAEEKTELEREIARLK 229
LE + L + L + ++ E K +LE E +L+
Sbjct: 300 LEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLE 336



Score = 45.1 bits (106), Expect = 2e-07
Identities = 48/262 (18%), Positives = 86/262 (32%), Gaps = 11/262 (4%)

Query: 16 KELEARISGLEDENAELFAENEKLALGTSELKDANNQLRQKNDKLFTTKENLTQEKTELT 75
EL +S +++ + + A EL+ L + + + + L
Sbjct: 88 DELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLE 147

Query: 76 EKNKVLTTEKGNLDNQLNASQKQVQALEQSQQVLENEKVELTNKITDLSKEKENLTKANT 135
+ L K +L+ L + A + LE EK L + +L K E +T
Sbjct: 148 AEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFST 207

Query: 136 ELKTENDKLNHQVIALTKEQDSLKQERAQLQDAHGFLEELCANLEKDNQHLTDKLKKLES 195
+ L + AL + L++ + LE + L + +LE
Sbjct: 208 ADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEK 267

Query: 196 AQKNLENSNDQLLQAIENIAEEKTELEREIARLKSLEATDKSELDLQNCRFKSAIEDLKR 255
A + N + I+ + EK LE E A L+ + + L+R
Sbjct: 268 ALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANR-----------QSLRR 316

Query: 256 QNRKLEEENIALKERAYGLKEQ 277
E L+ L+EQ
Sbjct: 317 DLDASREAKKQLEAEHQKLEEQ 338


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
C730_00350  UREASE  1045   0.0      Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 1045 bits (2703), Expect = 0.0
Identities = 354/569 (62%), Positives = 443/569 (77%), Gaps = 4/569 (0%)

Query: 3 KISRKEYVSMYGPTTGDKVRLGDTDLIAEVEHDYTIYGEELKFGGGKTLREGMSQSN-NP 61
++SR Y +M+GPT GDKVRL DT+L EVE D+T +GEE+KFGGGK +R+GM QS
Sbjct: 4 RMSRAAYANMFGPTVGDKVRLADTELFIEVEKDFTTHGEEVKFGGGKVIRDGMGQSQVTR 63

Query: 62 SKEELDLIITNALIVDYTGIYKADIGIKDGKIAGIGKGGNKDMQDGVKNNLSVGPATEAL 121
+D +ITNALI+D+ GI KADIG+KDG+IA IGK GN DMQ GV + VGP TE +
Sbjct: 64 EGGAVDTVITNALILDHWGIVKADIGLKDGRIAAIGKAGNPDMQPGV--TIIVGPGTEVI 121

Query: 122 AGEGLIVTAGGIDTHIHFISPQQIPTAFASGVTTMIGGGTGPADGTNATTITPGRRNLKW 181
AGEG IVTAGG+D+HIHFI PQQI A SG+T M+GGGTGPA GT ATT TPG ++
Sbjct: 122 AGEGKIVTAGGMDSHIHFICPQQIEEALMSGLTCMLGGGTGPAHGTLATTCTPGPWHIAR 181

Query: 182 MLRAAEEYSMNLGFLAKGNASNDASLADQIEAGAIGFKIHEDWGTTPSAINHALDVADKY 241
M+ AA+ + MNL F KGNAS +L + + GA K+HEDWGTTP+AI+ L VAD+Y
Sbjct: 182 MIEAADAFPMNLAFAGKGNASLPGALVEMVLGGATSLKLHEDWGTTPAAIDCCLSVADEY 241

Query: 242 DVQVAIHTDTLNEAGCVEDTMAAIAGRTMHTFHTEGAGGGHAPDIIKVAGEHNILPASTN 301
DVQV IHTDTLNE+G VEDT+AAI GRT+H +HTEGAGGGHAPDII++ G+ N++P+STN
Sbjct: 242 DVQVMIHTDTLNESGFVEDTIAAIKGRTIHAYHTEGAGGGHAPDIIRICGQPNVIPSSTN 301

Query: 302 PTIPFTVNTEAEHMDMLMVCHHLDKSIKEDVQFADSRIRPQTIAAEDTLHDMGIFSITSS 361
PT P+TVNT AEH+DMLMVCHHL +I ED+ FA+SRIR +TIAAED LHD+G FSI SS
Sbjct: 302 PTRPYTVNTLAEHLDMLMVCHHLSPTIPEDIAFAESRIRKETIAAEDILHDIGAFSIISS 361

Query: 362 DSQAMGRVGEVITRTWQTADKNKKEFGRLKEEKGDNDNFRIKRYLSKYTINPAIAHGISE 421
DSQAMGRVGEV RTWQTADK K++ GRLKEE GDNDNFR+KRY++KYTINPAIAHG+S
Sbjct: 362 DSQAMGRVGEVAIRTWQTADKMKRQRGRLKEETGDNDNFRVKRYIAKYTINPAIAHGLSH 421

Query: 422 YVGSVEVGKVADLVLWSPAFFGVKPNMIIKGGFIALSQMGDANASIPTPQPVYYREMFAH 481
+GS+EVGK ADLVLW+PAFFGVKP+M++ GG IA + MGD NASIPTPQPV+YR MF
Sbjct: 422 EIGSLEVGKRADLVLWNPAFFGVKPDMVLLGGTIAAAPMGDPNASIPTPQPVHYRPMFGA 481

Query: 482 HGKAKYDANITFVSQAAYDKGIKEELGLERQVLPVKNCR-NITKKDMQFNDTTAHIEVNP 540
+G+++ ++++TFVSQA+ D G+ LG+ ++++ V+N R I K M N T HIEV+P
Sbjct: 482 YGRSRTNSSVTFVSQASLDAGLAGRLGVAKELVAVQNTRGGIGKASMIHNSLTPHIEVDP 541

Query: 541 ETYHVFVDGKEVTSKPANKVSLAQLFSIF 569
ETY V DG+ +T +PA + +AQ + +F
Sbjct: 542 ETYEVRADGELLTCEPATVLPMAQRYFLF 570
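
Every hit listed on this page has an Expect value below the E-value (RPSBlast) input of 0.05. The Python sketch below shows one way to extract the score and Expect value from a "Score = ..." line and compare it with that cutoff. The helper names are hypothetical, and whether the server uses a strict or inclusive comparison is an assumption.

```python
import re

# Matches e.g. "Score = 30.9 bits (69), Expect = 0.034"
SCORE_RE = re.compile(r"Score = ([\d.]+) bits \((\d+)\), Expect = (\S+)")


def parse_score_line(line):
    """Return (bit_score, raw_score, e_value) from a BLAST-style score line."""
    match = SCORE_RE.search(line)
    if match is None:
        raise ValueError(f"not a score line: {line!r}")
    bits, raw, expect = match.groups()
    return float(bits), int(raw), float(expect)


def passes_cutoff(e_value, cutoff=0.05):
    """Assumed rule: keep a hit when its E-value does not exceed the cutoff."""
    return e_value <= cutoff


for line in (
    "Score = 1045 bits (2703), Expect = 0.0",
    "Score = 30.9 bits (69), Expect = 0.034",
):
    bits, raw, e_value = parse_score_line(line)
    print(bits, raw, e_value, passes_cutoff(e_value))
```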


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C730_00385  IGASERPTASE  42     8e-06    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 41.6 bits (97), Expect = 8e-06
Identities = 49/253 (19%), Positives = 83/253 (32%), Gaps = 10/253 (3%)

Query: 52 KLTSDSPTQQQDQKVAQNTASNDSQEATTLENTASTDNTTATTDETYTKSTDTTVAGAAQ 111
K D+ + A ++ + T A + + T T T TK T T
Sbjct: 1053 KNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKA 1112

Query: 112 KVETDNTA----VQSAEQTLKTDVAKVQADASAKDFDETTFQADQAAEQTAEKALQQAES 167
KVET+ T V S + VQ A ++ T + QT A + +
Sbjct: 1113 KVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPA 1172

Query: 168 KLNTDQQTLNTALQDQTKTPTPSTPPTKEEPKHTASSGTPPAPESPPAKKDETSGTPSAS 227
K + N T + E P++T + T P S + K + S
Sbjct: 1173 KETSS----NVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSV- 1227

Query: 228 GSSVASQLTKDTTMVNNLKSVSVSAMNTTLSGVETMSQQTATIGNLLNSSTDLSSVIPNA 287
SV + TT N+ +V++ + +T + + LN +S I
Sbjct: 1228 -RSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAVSQHISQL 1286

Query: 288 QGLNSAFSTLESA 300
+ N + +
Sbjct: 1287 EMNNEGQYNVWVS 1299


2     C730_00700  C730_00735  Y     N          N                   Genomic Island

LocusTag    DNBias  CDNBias  %GCBias    Product
C730_00700  2       13       1.621314   DNA glycosylase MutY
C730_00705  3       14       2.505148   citrate:succinate antiporter
C730_00710  4       13       2.410080   citrate:succinate antiporter
C730_00715  4       14       1.644335   cbb3-type cytochrome c oxidase subunit I
C730_00720  0       16       0.024290   cbb3-type cytochrome c oxidase subunit II
C730_00725  2       18       -1.657053  cytochrome c oxidase subunit Q
C730_00730  4       17       -1.246707  cytochrome c oxidase, cbb3-type, subunit III
C730_00735  4       19       -1.859899  hypothetical protein
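
The Prediction column on this page follows a simple pattern across the eight summary rows: regions flagged for bias and virulence are labelled "Pathogenicity Island (biased-composition)", and regions flagged for bias alone are labelled "Genomic Island". The Python sketch below encodes that pattern. It is a rule inferred from these rows only, not the server's documented logic, and `classify_island` is a hypothetical name.

```python
def classify_island(bias, virulence, insertion_elements):
    """Label a region from its Y/N flags, using the pattern seen on this page.

    insertion_elements is accepted for completeness; in the rows shown here it
    does not change the label. Returns None for combinations the page does not show.
    """
    if bias == "Y" and virulence == "Y":
        return "Pathogenicity Island (biased-composition)"
    if bias == "Y" and virulence == "N":
        return "Genomic Island"
    return None


# (S.No, Bias, Virulence, Insertion elements, Prediction) as listed in the report
rows = [
    (1, "Y", "Y", "Y", "Pathogenicity Island (biased-composition)"),
    (2, "Y", "N", "N", "Genomic Island"),
    (3, "Y", "Y", "N", "Pathogenicity Island (biased-composition)"),
    (4, "Y", "N", "N", "Genomic Island"),
    (5, "Y", "Y", "Y", "Pathogenicity Island (biased-composition)"),
    (6, "Y", "Y", "Y", "Pathogenicity Island (biased-composition)"),
    (7, "Y", "Y", "N", "Pathogenicity Island (biased-composition)"),
    (8, "Y", "Y", "N", "Pathogenicity Island (biased-composition)"),
]

for number, bias, virulence, insertion, reported in rows:
    print(number, classify_island(bias, virulence, insertion) == reported)
```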
3     C730_00910  C730_01020  Y     Y          N                   Pathogenicity Island (biased-composition)

LocusTag    DNBias  CDNBias  %GCBias    Product
C730_00910  3       18       1.076705   serine hydroxymethyltransferase
C730_00915  3       21       0.948970   hypothetical protein
C730_00920  1       18       0.067439   hypothetical protein
C730_00925  2       18       1.090250   hypothetical protein
C730_00930  -1      12       3.177673   hypothetical protein
C730_00935  -1      10       2.863024   hypothetical protein
C730_00940  -1      10       2.113975   hypothetical protein
C730_00945  -2      10       2.254713   hypothetical protein
C730_00950  -1      10       3.051114   fumarate reductase iron-sulfur subunit
C730_00955  -1      10       3.095697   fumarate reductase flavoprotein subunit
C730_00960  -2      14       1.595226   fumarate reductase cytochrome b-556 subunit
C730_00965  -2      15       1.689801   triosephosphate isomerase
C730_00970  -2      16       2.846524   enoyl-(acyl carrier protein) reductase
C730_00975  -2      16       3.116863   UDP-3-O-[3-hydroxymyristoyl] glucosamine
C730_00980  -2      16       3.469397   S-adenosylmethionine synthetase
C730_00985  -2      17       2.666942   mulitfunctional nucleoside diphosphate
C730_00990  0       18       1.870392   hypothetical protein
C730_00995  -2      17       1.765163   50S ribosomal protein L32
C730_01000  -1      11       -2.374269  phosphate acyltransferase
C730_01005  1       14       -4.129265  3-oxoacyl-(acyl carrier protein) synthase III
C730_01010  2       14       -3.923604  hypothetical protein
C730_01015  2       13       -3.931381  hypothetical protein
C730_01020  2       13       -3.529417  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C730_00920  IGASERPTASE  32     0.003    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 32.0 bits (72), Expect = 0.003
Identities = 28/148 (18%), Positives = 54/148 (36%), Gaps = 6/148 (4%)

Query: 50 PKETFLQTDSGMQKIGNTKDEKKDDEFESLNLDPSKQEDKLDKVADNVKKQENDAFNMPT 109
P ET ++ T ++ + D E+ + ++ V N Q N+ +
Sbjct: 1033 PSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKAN--TQTNEVAQSGS 1090

Query: 110 QTDQTQTEMKTTEETQEAQKGLKVVEHTSTQKESQAVAKKEISHKKPKATPKDKEAHKDK 169
+T +TQT T E ++ K VE TQ+ + ++ ++ + E ++
Sbjct: 1091 ETKETQTTETKETATVEKEEKAK-VETEKTQEVPKVTSQVSPKQEQSETVQPQAEPAREN 1149

Query: 170 D---KHAVKELKVKKEAHKEVPKKANSK 194
D + + A E P K S
Sbjct: 1150 DPTVNIKEPQSQTNTTADTEQPAKETSS 1177


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_00970  DHBDHDRGNASE  60     8e-13    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 60.4 bits (146), Expect = 8e-13
Identities = 60/263 (22%), Positives = 109/263 (41%), Gaps = 29/263 (11%)

Query: 4 LKGKKGLIVGVANNKSIAYGIAQSCFNQGATL-AFTYLNESLEKRVRPIAQELNSPYVYE 62
++GK I G A + I +A++ +QGA + A Y E LEK V + E +
Sbjct: 6 IEGKIAFITGAA--QGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 63 LDVSKEEHFKSLYNSVKKDLGSLDFIVHSVAF--------APKEALEGSLLETSKSAFNT 114
DV + +++++G +D +V+ E E + S FN
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 115 AMEISVYSLIELTNTLKPLLNNGASVLTLSYLGSTKYMAHYNVMGLAKAALESAVRYLAV 174
+ +S Y + + ++ + +N A V S MA Y +KAA + L +
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTS-------MAAY---ASSKAAAVMFTKCLGL 173

Query: 175 DLGKHHIRVNALSAGPIRT-----LASSGIADFRMILKWNE---INAPLRKNVSLEEVGN 226
+L +++IR N +S G T L + ++I E PL+K ++ +
Sbjct: 174 ELAEYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIAD 233

Query: 227 AGMYLLSSLSSGVSGEVHFVDAG 249
A ++L+S + ++ VD G
Sbjct: 234 AVLFLVSGQAGHITMHNLCVDGG 256


4     C730_01495  C730_01550  Y     N          N                   Genomic Island

LocusTag    DNBias  CDNBias  %GCBias    Product
C730_01495  2       16       3.271341   50S ribosomal protein L21
C730_01500  2       16       3.347863   50S ribosomal protein L27
C730_01505  1       15       3.566601   dipeptide ABC transporter periplasmic
C730_01510  2       15       3.936992   dipeptide ABC transporter permease (dppB)
C730_01515  0       15       3.293724   dipeptide ABC transporter permease (dppC)
C730_01520  -2      14       3.052292   dipeptide ABC transporter ATP-binding protein
C730_01525  -2      14       2.807781   dipeptide ABC transporter ATP-binding protein
C730_01530  -2      12       2.206452   GTPase CgtA
C730_01535  -1      13       1.550831   hypothetical protein
C730_01540  -1      16       2.261942   hypothetical protein
C730_01545  0       17       2.746451   glutamate-1-semialdehyde aminotransferase
C730_01550  2       17       2.279190   hypothetical protein
5     C730_01610  C730_01720  Y     Y          Y                   Pathogenicity Island (biased-composition)

LocusTag    DNBias  CDNBias  %GCBias    Product
C730_01610  2       16       -1.137355  heme iron utilization protein
C730_01615  1       15       -1.176710  arginyl-tRNA ligase
C730_01620  2       13       -0.691151  sec-independent protein translocase protein
C730_01625  3       11       -1.293641  guanylate kinase
C730_01630  3       11       -2.128224  poly E-rich protein
C730_01635  1       12       -2.108776  nuclease NucT
C730_01640  1       14       -1.889408  hypothetical protein
C730_01645  3       14       -1.416985  flagellar basal body L-ring protein
C730_01650  2       13       -1.244682  CMP-N-acetylneuraminic acid synthetase
C730_01655  2       13       -0.733797  flagellar protein FlaG
C730_01660  1       12       0.819346   tetraacyldisaccharide 4'-kinase
C730_01665  0       15       1.881556   NAD synthetase
C730_01670  -1      17       2.424160   *ketol-acid reductoisomerase
C730_01675  0       18       1.500837   cell division inhibitor
C730_01680  1       17       0.528023   cell division topological specificity factor
C730_01685  0       19       0.633483   DNA processing chain A (dprA)
C730_01690  1       20       -0.382045  Holliday junction resolvase-like protein
C730_01695  3       22       -1.442436  hypothetical protein
C730_01710  2       20       -1.499232  hypothetical protein
C730_01715  2       27       -1.321334  hypothetical protein
C730_01720  2       28       -0.374981  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
C730_01625  PF05272  29     0.011    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.011
Identities = 9/18 (50%), Positives = 11/18 (61%)

Query: 8 LILSGPSGAGKSTLTKYL 25
++L G G GKSTL L
Sbjct: 599 VVLEGTGGIGKSTLINTL 616
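
The "Identities" figure can be checked directly against the aligned residues. As a minimal illustration (a Python sketch with a hypothetical helper name, not part of the server), the short ungapped alignment above gives 9 identical positions out of 18, in agreement with the reported "Identities = 9/18 (50%)".

```python
def count_identities(query, subject):
    """Count positions where two aligned sequences carry the same residue.

    Both strings must already be aligned (equal length); gap characters
    ('-') never count as identities.
    """
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for q, s in zip(query, subject) if q == s and q != "-")


# Aligned segments copied from the C730_01625 / PF05272 hit above
query = "LILSGPSGAGKSTLTKYL"
subject = "VVLEGTGGIGKSTLINTL"

identical = count_identities(query, subject)
print(f"{identical}/{len(query)}")
```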


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C730_01630  IGASERPTASE  60     2e-11    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 60.1 bits (145), Expect = 2e-11
Identities = 51/266 (19%), Positives = 87/266 (32%), Gaps = 19/266 (7%)

Query: 140 ELENLGDLEALVQEEPNNEEQLLPTLNDQEEKEEVKEEE------KEEVKEEEKEEVKEE 193
E+E N Q +E + +E E E V E
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAEN 1043

Query: 194 EKEEVKETPQEEKKPKDDETQEGETLKDKEVSKELEA-PQELEIPKEETQEQDPIKEETQ 252
K+E K + E+ + Q E KE ++A Q E+ + + + +T
Sbjct: 1044 SKQESKTVEKNEQDATETTAQNRE--VAKEAKSNVKANTQTNEVAQ---SGSETKETQTT 1098

Query: 253 ENKEEKQEKTQDSPSAQELEAMQELVKEIQENSNGQENKEKTQESAEIPQDKEIQEVVTE 312
E KE + ++ E E QE+ K + S QE E Q AE ++ + + E
Sbjct: 1099 ETKETATVEKEEKAKV-ETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKE 1157

Query: 313 KTQAQ----ELEVPKEKTQESAEALQETQAHELEKQEIAETPQDVEIPQSQDKEVQELE- 367
+ E P ++T + E + E P++ +Q E
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSN 1217

Query: 368 IPKEETQENTET-PQDVETPQEKETQ 392
PK + + + P +VE
Sbjct: 1218 KPKNRHRRSVRSVPHNVEPATTSSND 1243



Score = 54.3 bits (130), Expect = 1e-09
Identities = 39/252 (15%), Positives = 87/252 (34%), Gaps = 15/252 (5%)

Query: 257 EKQEKTQDSPSAQELEAMQELVKEIQENSNGQENKEKTQESAEIPQDKEIQEVVTEKTQA 316
EK+ +T D+ + +Q V + N+ ++ P +
Sbjct: 986 EKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSK 1045

Query: 317 QELEVPKEKTQESAEAL-QETQAHELEKQEIAETPQDVEIPQSQDKE------------- 362
QE + ++ Q++ E Q + + K + Q E+ QS +
Sbjct: 1046 QESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETAT 1105

Query: 363 VQELEIPKEETQENTETPQ-DVETPQEKETQEDHYESIEDIPEPVMAKAMGEELPFLNEA 421
V++ E K ET++ E P+ + ++E E E E + E N
Sbjct: 1106 VEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTT 1165

Query: 422 VAKIPNNENDTETPKESVTETSKNENNTETPQEKEESDKTSSPLELRLNLQDLLKSLNQE 481
+ + ++ VTE++ + E + ++ + + K+ ++
Sbjct: 1166 ADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRR 1225

Query: 482 SLKSLLENKTLS 493
S++S+ N +
Sbjct: 1226 SVRSVPHNVEPA 1237



Score = 50.1 bits (119), Expect = 2e-08
Identities = 27/165 (16%), Positives = 57/165 (34%), Gaps = 3/165 (1%)

Query: 168 QEEKEEVKEEEKEEVKEEEKEEVKEEEKEEVKETPQEEKKPKDDETQEGETLKDKEVSKE 227
E +E + E +E EKEE + E E+ +E P+ + + Q E ++E
Sbjct: 1089 GSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARE 1148

Query: 228 LEAPQELEIPKEETQEQDPIKEETQENKEEKQEKTQDSPSAQELEAMQELVKEIQENSNG 287
+ ++ P+ +T + Q KE Q + + +V+ + +
Sbjct: 1149 NDPTVNIKEPQSQTNTT---ADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPA 1205

Query: 288 QENKEKTQESAEIPQDKEIQEVVTEKTQAQELEVPKEKTQESAEA 332
ES+ P+++ + V + + A
Sbjct: 1206 TTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALC 1250


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_01645  FLGLRINGFLGH  195    9e-65    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 195 bits (496), Expect = 9e-65
Identities = 52/172 (30%), Positives = 84/172 (48%), Gaps = 18/172 (10%)

Query: 56 GERPLFADRRAMKPNDLITIIVSEKASANYSSS----KDYKSASGGNSTPPRLTYNGLDE 111
G +PLF DRR D +TI++ E SA+ SSS +D K+ G ++ P L GL
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYL--QGLFG 118

Query: 112 RKKKEAEYLDDKNNYNFTKSSNNTNFKGGGSQKKSEDLEIVLSARIIKVLENGNYFIYGN 171
+ + E S F G G S L+ + +VL NGN + G
Sbjct: 119 NARADVEA------------SGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGE 166

Query: 172 KEVLVDGEKQILKVSGVIRPYDIERNNTIQSKFLADAKIEYTNLGHLSDSNK 223
K++ ++ + ++ SGV+ P I +NT+ S +ADA+IEY G+++++
Sbjct: 167 KQIAINQGTEFIRFSGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQN 218


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_01655  SACTRNSFRASE  27     0.029    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 27.2 bits (60), Expect = 0.029
Identities = 14/48 (29%), Positives = 20/48 (41%), Gaps = 3/48 (6%)

Query: 103 GETILKALEFIAFE---EFQLHSLHLEVMENNFKAIAFYEKNHYELEG 147
+ + AL A E E L LE + N A FY K+H+ +
Sbjct: 103 KKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGA 150


6     C730_02165  C730_02365  Y     Y          Y                   Pathogenicity Island (biased-composition)

LocusTag    DNBias  CDNBias  %GCBias    Product
C730_02165  0       15       -3.375788  hypothetical protein
C730_02170  2       17       -3.913373  hypothetical protein
C730_02175  2       27       -5.521318  phage/colicin/tellurite resistance cluster terY
C730_02180  4       27       -6.246707  hypothetical protein
C730_02185  4       21       -4.985036  hypothetical protein
C730_02190  5       20       -5.334033  protein phosphatase 2C
C730_02195  6       19       -5.792751  protein kinase C-like protein
C730_02200  7       20       -6.532384  hypothetical protein
C730_02220  7       20       -6.392615  IS605 transposase (tnpA)
C730_02225  7       22       -6.408584  IS605 transposase (tnpB)
C730_02230  7       23       -7.082980  hypothetical protein
C730_02235  7       23       -7.152345  DNA topoisomerase I (topA)
C730_02240  6       25       -7.730970  VirB4-like protein
C730_02245  5       24       -6.061264  hypothetical protein
C730_02250  2       19       -5.909296  hypothetical protein
C730_02255  3       19       -5.778730  hypothetical protein
C730_02260  2       20       -5.777906  hypothetical protein
C730_02265  4       21       -5.978779  hypothetical protein
C730_02270  3       20       -5.562946  hypothetical protein
C730_02300  2       20       -6.468124  hypothetical protein
C730_02305  10      27       -7.882985  hypothetical protein
C730_02320  8       26       -7.902254  hypothetical protein
C730_02325  8       25       -7.560834  hypothetical protein
C730_02330  7       24       -7.599073  hypothetical protein
C730_02335  7       23       -7.797706  hypothetical protein
C730_02340  4       23       -7.475685  hypothetical protein
C730_02345  4       21       -7.224484  protein VirB4
C730_02350  0       19       -5.213344  VirB7 type IV secretion protein
C730_02355  -2      18       -4.126345  hypothetical protein
C730_02360  -1      17       -3.651131  hypothetical protein
C730_02365  -1      16       -3.503271  type I restriction enzyme S protein (hsdS)
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
C730_02230  PF04335  93     3e-24    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 93.0 bits (231), Expect = 3e-24
Identities = 35/213 (16%), Positives = 72/213 (33%), Gaps = 13/213 (6%)

Query: 144 FEEVRD-ASVIYHLEKKLGDYIFYVACFFFGTTALLIILLTILLPLKQKEPYLVQFSNNK 202
FEE ++ + VA ++ + L PLK EPY++ N
Sbjct: 14 FEEAASWERDKLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNT 73

Query: 203 ENFALVQ--KADSSITANKALIRSLVGAYVLNRESITHIEQHEKMRQNTIKEQSSNEVWY 260
++ D++IT ++A+ + + YV RE + + + + S+
Sbjct: 74 GEASIAAKLHGDATITYDEAVRKYFLATYVRYREGWIAAAR--EEYFDAVMVMSARPEQD 131

Query: 261 EFEKLIA-----HYDSIYTNPLLTRKVKIANI-YLDKDLAYIDIEVSLYHSGELESLKRY 314
+ + +I N V+I + +L ++A + +G +
Sbjct: 132 RWSRFYKTDNPQSPQNILAN-RTDVFVEIKRVSFLGGNVAQVYFTKESV-TGSNSTKTDA 189

Query: 315 KVVMSFEFKKQEINFDSMSLNPTGFMVTSYDVT 347
+ ++ NP G+ V SY
Sbjct: 190 VATIKYKVDGTPSKEVDRFKNPLGYQVESYRAD 222


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits      Score  E-value  Comments
C730_02270  adhesinb  29     0.032    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 29.0 bits (65), Expect = 0.032
Identities = 30/123 (24%), Positives = 48/123 (39%), Gaps = 12/123 (9%)

Query: 53 ERTPWDLESLLRDYNFVFSNTTGQHNQALERKETPYFDSVIVDEAAKANPLELLMVMALA 112
E P D++ + +F N G + LE +F ++ E AK + ++
Sbjct: 71 EPLPEDVKKT-SQADLIFYN--GIN---LETGGNAWFTKLV--ENAKKKENKDYYAVSEG 122

Query: 113 KERIILVGDDRQL---PHY-LDDEIGKKLEDESQDAQDEIEKALKDSMFKKLKERAQKLK 168
+ I L G + PH L+ E G E + A K++ K LK +KL
Sbjct: 123 VDVIYLEGQSEKGKEDPHAWLNLENGIIYAQNIAKRLSEKDPANKETYEKNLKAYVEKLS 182

Query: 169 ELD 171
LD
Sbjct: 183 ALD 185


7     C730_02425  C730_02510  Y     Y          N                   Pathogenicity Island (biased-composition)

LocusTag    DNBias  CDNBias  %GCBias    Product
C730_02425  0       14       -3.458427  molybdenum ABC transporter periplasmic
C730_02430  0       12       -3.659966  molybdenum ABC transporter ModB
C730_02435  -1      9        -2.078391  molybdenum ABC transporter ATP-binding protein
C730_02440  -1      10       -2.108683  glutamyl-tRNA ligase
C730_02445  -1      11       -2.684366  hypothetical protein
C730_02450  -2      11       -2.771193  adenine-specific DNA methyltransferase
C730_02455  0       13       -1.274068  hypothetical protein
C730_02460  0       17       -0.308980  GTP-binding protein TypA
C730_02465  2       24       -4.308722  adenine-specific DNA methyltransferase
C730_02470  0       18       -3.163006  type II adenine specific DNA methyltransferase
C730_02475  4       16       -0.095787  hypothetical protein
C730_02480  6       18       0.466256   Type II DNA modification enzyme
C730_02485  1       16       -0.363649  type II DNA modification
C730_02490  2       17       -0.243154  hypothetical protein
C730_02495  2       16       0.481122   catalase-like protein
C730_02500  3       17       0.516578   hypothetical protein
C730_02505  3       14       -1.166821  hypothetical protein
C730_02510  3       14       -1.251716  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
C730_02435  PF05272  30     0.007    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.007
Identities = 11/23 (47%), Positives = 14/23 (60%)

Query: 30 VVALLGESGAGKSTILRILAGLE 52
V L G G GKST++ L GL+
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
C730_02460  TCRTETOQM  196    3e-57    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 196 bits (501), Expect = 3e-57
Identities = 115/461 (24%), Positives = 190/461 (41%), Gaps = 67/461 (14%)

Query: 3 NIRNIAVIAHVDHGKTTLVDGLLSQSGTFSEREKVDE--RVMDSNDLERERGITILSKNT 60
I NI V+AHVD GKTTL + LL SG +E VD+ D+ LER+RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 61 AIYYKDTKINIIDTPGHADFGGEVERVLKMVDGVLLLVDAQEGVMPQTKFVVKKALSFGI 120
+ +++TK+NIIDTPGH DF EV R L ++DG +LL+ A++GV QT+ + GI
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 121 CPIVVVNKIDKPAAEPDRVVDEVFDLF---------VAMGASDKQLDFPV-----VYAAA 166
I +NKID+ + V ++ + V + + +F
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEG 181

Query: 167 RDGYAMKSLDDE----------------------------KKNL--EPLFETILEHVPSP 196
D K + + K N+ + L E I S
Sbjct: 182 NDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSS 241

Query: 197 SGSVDEPLQMQIFTLDYDNYVGKIGIARVFNGSVKKNESVLLMKSDGSKENGRITKLIGF 256
+ L ++F ++Y ++ R+++G + +SV + KE +IT++
Sbjct: 242 THRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRI----SEKEKIKITEMYTS 297

Query: 257 LGLARTEIENAYAGDIVAIAG--FNAMDV-GDSVVDPANPMPLDPMHLEEPTMSVYFAVN 313
+ +I+ AY+G+IV + V GD+ + P +P P + +
Sbjct: 298 INGELCKIDKAYSGEIVILQNEFLKLNSVLGDTKLLPQRERIENP----LPLLQTTVEPS 353

Query: 314 DSPLAGLEGKHVTANKLKDRLLKEMQTNIAMKCEEMGEGKFKVSGRGELQITILAENLRR 373
+ + D LL+ + + +S G++Q+ + L+
Sbjct: 354 KPQQREMLLDALLEISDSDPLLRYYVDSAT--------HEIILSFLGKVQMEVTCALLQE 405

Query: 374 E-GFEFSISRPEVIIKEENGVKCEPFEHLVIDTPQDFSGAI 413
+ E I P VI E K E H+ + P F +I
Sbjct: 406 KYHVEIEIKEPTVIYMERPLKKAEYTIHIEVP-PNPFWASI 445



Score = 41.8 bits (98), Expect = 8e-06
Identities = 20/80 (25%), Positives = 30/80 (37%), Gaps = 1/80 (1%)

Query: 396 EPFEHLVIDTPQDFSGAIIERLGKRKAEMKAMNPMSDGYTRLEFEIPARGLIGYRSEFLT 455
EP+ I PQ++ K A + + + L EIPAR + YRS+
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCIQEYRSDLTF 595

Query: 456 DTKGEGVMNHSFLEFRPFSG 475
T G V + +G
Sbjct: 596 FTNGRSVCLTELKGYHVTTG 615


8     C730_02660  C730_02810  Y     Y          N                   Pathogenicity Island (biased-composition)

LocusTag    DNBias  CDNBias  %GCBias    Product
C730_02660  0       19       -3.192605  GTPase Era
C730_02665  3       17       -2.953595  hypothetical protein
C730_02670  6       19       -3.127966  hypothetical protein
C730_02675  7       20       -2.316754  cag pathogenicity island protein (cag1)
C730_02680  8       20       -2.107038  cag pathogenicity island protein Epsilon
C730_02685  8       19       -2.221677  cag island protein
C730_02690  8       19       -2.088424  cag pathogenicity island protein (cag3)
C730_02695  9       16       -2.393086  cag pathogenicity island protein (cag4)
C730_02700  9       17       -2.752296  cag pathogenicity island protein Beta
C730_02705  9       20       -3.131616  cag pathogenicity island protein alpha
C730_02710  8       20       -3.376760  cag pathogenicity island protein (cag6)
C730_02715  8       21       -3.382790  hypothetical protein
C730_02720  9       20       -3.341466  cag pathogenicity island protein (cag7)
C730_02725  9       26       -4.247817  cag pathogenicity island protein (cag8)
C730_02730  10      30       -4.371000  cag pathogenicity island protein (cag9)
C730_02735  14      31       -5.096179  cag pathogenicity island protein (cag10)
C730_02740  14      35       -5.266315  cag pathogenicity island protein (cag11)
C730_02745  12      28       -5.254643  cag pathogenicity island protein T
C730_02750  10      24       -5.513434  cag pathogenicity island protein T
C730_02755  10      23       -5.415322  cag pathogenicity island protein S
C730_02760  6       20       -4.044164  cag pathogenicity island protein R
C730_02765  6       19       -2.846130  hypothetical protein
C730_02770  5       19       -2.818030  cag pathogenicity island protein (cag16)
C730_02775  6       20       -3.127798  cag pathogenicity island protein (cag17)
C730_02780  5       19       -2.937113  cag pathogenicity island protein (cag18)
C730_02785  5       20       -3.163692  cag pathogenicity island protein (cag19)
C730_02790  5       20       -3.201912  cag pathogenicity island protein (cag20)
C730_02795  6       22       -4.130894  cag pathogenicity island protein (cag21)
C730_02800  6       22       -3.418068  cag pathogenicity island protein (cag22)
C730_02805  4       20       -2.485358  cag pathogenicity island protein (cag23)
C730_02810  2       18       -1.017376  cag pathogenicity island protein (cag24)
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
C730_02660  PF03944  31     0.005    delta endotoxin
		>PF03944#delta endotoxin

Length = 633

Score = 31.2 bits (70), Expect = 0.005
Identities = 25/94 (26%), Positives = 47/94 (50%), Gaps = 3/94 (3%)

Query: 68 LHHQEKLLNQCMLSQALKAMGDAELCVFLASVHDDLKGYEEFLSLCQKPHILALSKIDTA 127
L E+ LNQ + + + A +AEL A+V + + + FL+ + L+++
Sbjct: 94 LRETERFLNQRLNTDTV-ARVNAELTGLQANVEEFNRQVDNFLNPNRNAVPLSITSSVNT 152

Query: 128 THKQVLQKLQEYQQYDSQFLALVPLSAKKSQNLN 161
+ L +L ++Q Q L L+PL A+ + NL+
Sbjct: 153 MQQLFLNRLPQFQMQGYQLL-LLPLFAQAA-NLH 184


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
C730_02690  PF07201  30     0.019    Hypersensitivity response secretion protein HrpJ
		>PF07201#Hypersensitivity response secretion protein HrpJ

Length = 293

Score = 30.2 bits (68), Expect = 0.019
Identities = 14/76 (18%), Positives = 26/76 (34%), Gaps = 15/76 (19%)

Query: 277 APENSKEKLIEELIANSQLVANEEEREKKLLAEKEKQ--------EAELAKY--KLKDLE 326
S + EE+ E +E L K E ++ +Y K+ +LE
Sbjct: 44 GTLQSIADMAEEVTF-----VFSERKELSLDKRKLSDSQARVSDVEEQVNQYLSKVPELE 98

Query: 327 NQKKLKALEAELKKKN 342
++ + L + L
Sbjct: 99 QKQNVSELLSLLSNSP 114


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C730_02720  IGASERPTASE  35     0.003    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 35.4 bits (81), Expect = 0.003
Identities = 33/173 (19%), Positives = 71/173 (41%), Gaps = 23/173 (13%)

Query: 970 KARNEKEKKECEKLLTPEAKKLLEEAKKSVKAYLDCVSQAKTEAEKKECEKLLTPEARKL 1029
+ NE+ + E + P A ++ + + ++KT + ++ T + R++
Sbjct: 1011 PSNNEEIARVDEAPVPPPAPAT--PSETTETVAENSKQESKTVEKNEQDATETTAQNREV 1068

Query: 1030 LEEAKESVKAYKDCVSKARNEKEKKECEKLLTPEAKKLLEQQVLDCLKNAKTEADKKRCV 1089
+EAK +VKA A++ E KE + T E + +++ AK E +K + V
Sbjct: 1069 AKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEE------KAKVETEKTQEV 1122

Query: 1090 KDLPKDLQKKVLAKESVKAYLDCVSRARNEKEKKECEKLLTPEAKKLLEEAKE 1142
KV ++ S K + ++E + + E + ++E +
Sbjct: 1123 --------PKVTSQVSPK-------QEQSETVQPQAEPARENDPTVNIKEPQS 1160



Score = 35.4 bits (81), Expect = 0.003
Identities = 30/179 (16%), Positives = 65/179 (36%), Gaps = 12/179 (6%)

Query: 713 TPEAKKLLEEEAKESVKAYLDCVSQAKTEAEKKECEKLLTPEAKKKLEEAKKSVKAYLDC 772
P ++ + + ++KT + ++ T + ++ +EAK +VKA
Sbjct: 1023 APVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQT 1082

Query: 773 VSRARNEKEKKECEKLLTPEAKKLLEQQALDCLKNAKTDKERKKCLKDLPKDLQKKVLAK 832
A++ E KE + T E + +++ KT + K + PK Q + +
Sbjct: 1083 NEVAQSGSETKETQTTETKETATVEKEEKAKVET-EKTQEVPKVTSQVSPKQEQSETVQP 1141

Query: 833 ESVKAYLDCVSQAKTEAEKKECEKYLDCVSQAKNEAEKKECEKLLTLESKKKLEEAKKS 891
QA+ E + SQ A+ ++ K + ++ + E+
Sbjct: 1142 -----------QAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTV 1189



Score = 34.7 bits (79), Expect = 0.004
Identities = 25/191 (13%), Positives = 62/191 (32%), Gaps = 10/191 (5%)

Query: 842 VSQAKTEAEKKECEKYLDCVSQAKNEAEKKECEKLLTLESKKKLEEAKKSVKAYLDCVSQ 901
+ +E + + A E + + SK++ + +K+ Q
Sbjct: 1005 ADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKN--------EQ 1056

Query: 902 AKTEAEKKECEKLLTPEAKKLLEQQALDCLKNAKTEADKKRCVKDLPKDLQKKVLAKESL 961
TE + E ++ Q + ++ + + ++K+ AK
Sbjct: 1057 DATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVET 1116

Query: 962 KAYKDCVSKARNEKEKKECEKLLTPEAKKLLEEAKKSVKAYLDCVSQAKTEAEKKECEKL 1021
+ ++ K+E + + P+A+ E + SQ T A+ ++ K
Sbjct: 1117 EKTQEVPKVTSQVSPKQEQSETVQPQAEPAREND--PTVNIKEPQSQTNTTADTEQPAKE 1174

Query: 1022 LTPEARKLLEE 1032
+ + + E
Sbjct: 1175 TSSNVEQPVTE 1185



Score = 34.7 bits (79), Expect = 0.005
Identities = 33/185 (17%), Positives = 75/185 (40%), Gaps = 8/185 (4%)

Query: 968 VSKARNEKEKKECEKLLTPEAKKLLEEAKKSVKAYLDC--VSQAKTEAEKKECEKLLTPE 1025
SK + E+ E T + +++ +EAK +VKA V+Q+ +E ++ + +
Sbjct: 1047 ESKTVEKNEQDATET--TAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETA 1104

Query: 1026 ARKLLEEAKESVKAYKDCVSKARNEKEKKECEKLLTPEAKKLLEQQVLDCLKNAK----T 1081
+ E+AK + ++ K+E + + P+A+ E +K + T
Sbjct: 1105 TVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNT 1164

Query: 1082 EADKKRCVKDLPKDLQKKVLAKESVKAYLDCVSRARNEKEKKECEKLLTPEAKKLLEEAK 1141
AD ++ K+ ++++ V +V V N + + + K +
Sbjct: 1165 TADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHR 1224

Query: 1142 ESLKA 1146
S+++
Sbjct: 1225 RSVRS 1229


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_02725  TYPE4SSCAGX   875    0.0      Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 875 bits (2262), Expect = 0.0
Identities = 514/522 (98%), Positives = 518/522 (99%)

Query: 1 MGQAFFKKIVGCFCLGYLFLSSAIEAAALDIKNFNRGRVKVVNKKIAYLGDEKPITIWTS 60
MGQAFFKKIVGCFCLGYLFLSSAIEA ALDIKNFNRGRVKVVNKKIAYLGDEKPITIWTS
Sbjct: 1 MGQAFFKKIVGCFCLGYLFLSSAIEAVALDIKNFNRGRVKVVNKKIAYLGDEKPITIWTS 60

Query: 61 LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR 120
LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR
Sbjct: 61 LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR 120

Query: 121 DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANL 180
DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANL
Sbjct: 121 DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANL 180

Query: 181 ENLTNAMSNPQNLSNNKNLSEFIKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA 240
ENLTNAMSNPQNLSNNKNLSE IKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA
Sbjct: 181 ENLTNAMSNPQNLSNNKNLSELIKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA 240

Query: 241 EETIKQRAKDKINIKTDKPQKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD 300
EE ++QRAKDKI+IKTDK QKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD
Sbjct: 241 EEAVRQRAKDKISIKTDKSQKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD 300

Query: 301 NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQKELIKQENLNTTAYINRVMMASNE 360
NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQ+ELIKQENLNTTAYINRVMMASNE
Sbjct: 301 NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELIKQENLNTTAYINRVMMASNE 360

Query: 361 QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF 420
QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF
Sbjct: 361 QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF 420

Query: 421 DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTNSGLRWYRVNEIAEKFKLIK 480
DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTNSGLRWYRVNEIAEKFKLIK
Sbjct: 421 DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTNSGLRWYRVNEIAEKFKLIK 480

Query: 481 DKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVRDK 522
DKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVRDK
Sbjct: 481 DKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVRDK 522
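
The percentages on the statistics lines of these reports can be recomputed from the raw counts. The following is a minimal Python sketch, not part of PredictBias: the names `parse_stats` and `floor_percent` are hypothetical, and truncating rather than rounding is an assumption inferred from the values on this page (for example, 30/179 is reported as 16%).

```python
import re

# Matches one "Label = n/m (p%)" field of a BLAST statistics line.
FIELD = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)")

def parse_stats(line):
    """Return {label: (count, length, reported_percent)} for one statistics line."""
    return {
        m.group(1): (int(m.group(2)), int(m.group(3)), int(m.group(4)))
        for m in FIELD.finditer(line)
    }

def floor_percent(count, length):
    """Integer percentage, truncated (assumption: the report does not round)."""
    return 100 * count // length

stats = parse_stats("Identities = 514/522 (98%), Positives = 518/522 (99%)")
for label, (count, length, reported) in stats.items():
    # Print the recomputed percentage next to the one reported on the page.
    print(label, floor_percent(count, length), reported)
```

A line with no Gaps field, as in the CagX hit above, simply yields a dictionary without that key.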


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_02735  PF04335       118    8e-35    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 118 bits (297), Expect = 8e-35
Identities = 44/205 (21%), Positives = 74/205 (36%), Gaps = 10/205 (4%)

Query: 27 KLNKANRTFKRAFYL---SMALNVAAVTSIVMMMPLKKTDIFVYGIDRYTGEFKIVKRSD 83
KL A R+ K A+ + + AL A V ++ + PLK + +V +DR TGE I +
Sbjct: 24 KLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNTGEASIAAKLH 83

Query: 84 A-RQIVNSEAVVDSATSKFVSLLFGYSKNSLRDRKDQLMQYCDVSFQTQAMRMFNENIRQ 142
I EAV + +V G+ + + D +M Q + R + + Q
Sbjct: 84 GDATITYDEAVRKYFLATYVRYREGWIAAAREEYFDAVMVMSARPEQDRWSRFYKTDNPQ 143

Query: 143 FVDKVRA-EAIISSNIQREKVKNSPLTRLTFFITIKITPDTMENYEYITKKQVTIYYDFA 201
+ A + I + +F +T T TI Y
Sbjct: 144 SPQNILANRTDVFVEI-KRVSFLGGNVAQVYFTKESVTGSNS----TKTDAVATIKYKVD 198

Query: 202 RGNSSQENLIINPFGFKVFDIQITD 226
S + + NP G++V +
Sbjct: 199 GTPSKEVDRFKNPLGYQVESYRADV 223


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_02775  TYPE4SSCAGX   28     0.042    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 28.2 bits (62), Expect = 0.042
Identities = 28/119 (23%), Positives = 55/119 (46%), Gaps = 16/119 (13%)

Query: 24 AINTALLPSEYKELVALGFKKIKTLHQRHDDEEVTKEEKEFATNALREKLRNDRARAEQI 83
A+N AL+ +Y+E + K K + D +E+ +++K EK + + +A++
Sbjct: 112 AVNFALMTRDYQEFL----KTKKLIVDAPDPKELEEQKKAL------EKEKEAKEQAQKA 161

Query: 84 QKNIEAFEKKNNSSIQKKAAKHKGLQELNEINATPLNDNPNSNSSTETKSNKDDNFDEM 142
QK+ K +++A L+ L + P N + N N S K +++ D+M
Sbjct: 162 QKD------KREKRKEERAKNRANLENLTNAMSNPQNLSNNKNLSELIKQQRENELDQM 214


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_02805  ACRIFLAVINRP  32     0.012    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 32.1 bits (73), Expect = 0.012
Identities = 20/88 (22%), Positives = 32/88 (36%), Gaps = 18/88 (20%)

Query: 19 EVQKRQFKKIEELKADMQKGVNPFFKVLFDGGNRLFGFPETFIYSSI-------FILFVT 71
+ K K+ EL+ +G+ +D F+ SI F +
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMK--VLYPYD--------TTPFVQLSIHEVVKTLFEAIML 350

Query: 72 IVLSVILF-QAYEPVLIVAIVIVLVALG 98
+ L + LF Q LI I + +V LG
Sbjct: 351 VFLVMYLFLQNMRATLIPTIAVPVVLLG 378


9  C730_03470  C730_03600  Y        Y        Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
C730_03470  0       15       -3.670287  hypothetical protein
C730_03475  0       14       -2.207266  hypothetical protein
C730_03480  0       14       -2.603549  aspartate aminotransferase
C730_03485  0       11       -2.255634  hypothetical protein
C730_03490  -1      12       0.154845   integrase-recombinase protein
C730_03495  2       11       0.637753   methylated-DNA--protein-cysteine
C730_03500  1       11       1.221505   hypothetical protein
C730_03505  1       12       1.535886   putative lipopolysaccharide biosynthesis
C730_03510  1       12       1.842128   ribonucleotide-diphosphate reductase subunit
C730_03515  2       16       1.592421   hypothetical protein
C730_03520  5       13       0.855386   hypothetical protein
C730_03525  2       10       0.473596   bifunctional N-acetylglucosamine-1-phosphate
C730_03530  1       12       -0.332177  hypothetical protein
C730_03535  1       10       0.921907   flagellar biosynthesis protein
C730_03540  1       11       1.054360   flagellar biosynthesis protein FliP
C730_03545  1       11       1.612803   iron(III) dicitrate transport protein (fecA)
C730_03550  -2      11       2.154747   ferrous iron transport protein B
C730_03555  -1      14       3.054666   hypothetical protein
C730_03560  1       14       4.333003   acetyl coenzyme A acetyltransferase
C730_03565  2       16       3.555188   succinyl-CoA-transferase subunit A
C730_03570  3       15       3.995718   succinyl-CoA-transferase subunit B
C730_03575  4       15       3.618972   short-chain fatty acids transporter
C730_03585  4       16       3.018885   hypothetical protein
C730_03595  4       15       2.908891   hydantoin utilization protein A (hyuA)
C730_03600  2       13       1.974115   Hydantoin utilization protein B
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_03535  FLGBIOSNFLIP  75     4e-20    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 75.3 bits (185), Expect = 4e-20
Identities = 33/97 (34%), Positives = 49/97 (50%), Gaps = 3/97 (3%)

Query: 1 MRFFIFLILICPLICPLMSADSALPSVNLSLNAPNDPKQLVTTLNVIALLTLLVLAPSLI 60
MR + + + L A + LP + S P + + + +T L P+++
Sbjct: 1 MRRLLSVAPVL-LWLITPLAFAQLPGIT-SQPLPGGGQSWSLPVQTLVFITSLTFIPAIL 58

Query: 61 LVMTSFTRLIVVFSFLRTALGTQQTPPHSNF-SLALF 96
L+MTSFTR+I+VF LR ALGT PP+ LALF
Sbjct: 59 LMMTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALF 95


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_03540  FLGBIOSNFLIP  174    4e-58    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 174 bits (442), Expect = 4e-58
Identities = 70/127 (55%), Positives = 98/127 (77%)

Query: 1 MDKKISYTEAFEKSALPFKEFMLKNTREKDLALFFRIRNLPNPKTPDEVSLSVLIPAFMI 60
++KIS EA EK A P +EFML+ TRE DL LF R+ N + P+ V + +L+PA++
Sbjct: 117 SEEKISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVT 176

Query: 61 SELKTAFQIGFLLYLPFLVIDMVISSILMAMGMMMLPPVMISLPFKILVFILVDGFNLLT 120
SELKTAFQIGF +++PFL+ID+VI+S+LMA+GMMM+PP I+LPFK+++F+LVDG+ LL
Sbjct: 177 SELKTAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLV 236

Query: 121 ENLVASF 127
+L SF
Sbjct: 237 GSLAQSF 243


10  C730_03685  C730_03765  Y        Y        N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
C730_03685  3       12       -0.674508  RNA polymerase factor sigma-54
C730_03690  2       14       0.328217   putative ABC transporter ATP-binding protein
C730_03695  2       12       0.119414   hypothetical protein
C730_03700  1       13       0.940551   DNA polymerase III subunits gamma and tau
C730_03705  1       14       2.767826   hypothetical protein
C730_03710  2       14       3.884589   hypothetical protein
C730_03715  3       16       3.660001   hypothetical protein
C730_03720  3       16       3.541547   hypothetical protein
C730_03725  2       15       3.234443   hypothetical protein
C730_03730  1       13       2.491428   L-asparaginase II
C730_03735  0       12       0.950420   anaerobic C4-dicarboxylate transporter
C730_03740  0       13       -0.210975  hypothetical protein
C730_03745  1       13       -1.643849  hypothetical protein
C730_03750  2       15       -3.204773  transcriptional regulator
C730_03755  2       13       -2.826511  tRNA(Ile)-lysidine synthetase
C730_03760  2       13       -2.606607  hypothetical protein
C730_03765  2       11       -1.432702  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_03700  IGASERPTASE   34     0.002    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 33.9 bits (77), Expect = 0.002
Identities = 19/63 (30%), Positives = 27/63 (42%), Gaps = 1/63 (1%)

Query: 510 SKPKPTTETTAETKEKETKEKEIQENDTKEIQEVQPKQAPTALQEFMANHSEL-IEEIKS 568
+ P TTET AE ++E+K E E D E + A A AN + + S
Sbjct: 1031 ATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGS 1090

Query: 569 EFE 571
E +
Sbjct: 1091 ETK 1093


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_03715  SECA          24     0.038    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 23.7 bits (51), Expect = 0.038
Identities = 8/17 (47%), Positives = 12/17 (70%)

Query: 8 ELEEKTKGLSDEEIKAK 24
+E + + LSDEE+K K
Sbjct: 30 AMEPEMEKLSDEELKGK 46


11  C730_04475  C730_04700  Y        Y        Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
C730_04475  2       14       2.561583   alkylphosphonate uptake protein
C730_04480  2       15       2.681231   hypothetical protein
C730_04485  3       14       2.738065   hypothetical protein
C730_04490  3       14       2.329434   catalase
C730_04495  2       15       1.906949   iron-regulated outer membrane protein
C730_04500  3       21       -2.052446  Holliday junction resolvase
C730_04505  6       25       -3.724553  hypothetical protein
C730_04510  3       18       -1.802262  hypothetical protein
C730_04515  0       15       -2.525631  hypothetical protein
C730_04520  3       13       -1.832165  hypothetical protein
C730_04525  3       12       -1.262958  hypothetical protein
C730_04530  3       12       -1.094748  hypothetical protein
C730_04535  2       12       0.120174   Holliday junction DNA helicase RuvA
C730_04540  2       12       0.258360   hypothetical protein
C730_04545  1       14       1.342728   virulence factor MviN protein
C730_04550  0       15       1.971900   cysteinyl-tRNA ligase
C730_04565  1       17       2.249493   iron compounds ABC transporter ATP-binding
C730_04570  0       19       2.235857   iron(III) dicitrate ABC transporter permease
C730_04575  0       13       -1.014149  short-chain oxidoreductase
C730_04580  2       15       0.855419   hypothetical protein
C730_04585  2       17       0.842534   hypothetical protein
C730_04590  0       16       2.159874   hypothetical protein
C730_04595  1       16       2.719755   hypothetical protein
C730_04600  0       15       2.814181   hypothetical protein
C730_04605  1       16       3.289544   hypothetical protein
C730_04610  0       15       2.393930   **hypothetical protein
C730_04615  0       15       1.798064   hydrogenase isoenzymes formation protein HypD
C730_04620  0       17       0.458055   hydrogenase expression/formation protein
C730_04625  -1      17       0.718467   hydrogenase expression/formation protein HypB
C730_04630  -3      18       0.873794   hypothetical protein
C730_04635  2       17       1.825042   hypothetical protein
C730_04640  1       15       2.064982   acetate kinase
C730_04645  1       15       2.534744   acetate kinase A/propionate kinase 2
C730_04650  1       15       1.707839   phosphotransacetylase
C730_04665  2       14       0.862228   phosphotransacetylase (pta)
C730_04670  0       14       0.733773   hypothetical protein
C730_04675  -2      13       1.459920   flagellar basal body rod modification protein
C730_04680  -1      14       1.875005   flagellar hook protein FlgE
C730_04685  -1      12       1.475575   hypothetical protein
C730_04690  -1      12       2.212255   adenine-specific DNA methyltransferase
C730_04695  0       12       2.791545   rep helicase, single-stranded DNA-dependent
C730_04700  2       14       3.525326   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_04550  OMS28PORIN    30     0.015    OMS28 porin signature.
		>OMS28PORIN#OMS28 porin signature.

Length = 257

Score = 30.1 bits (67), Expect = 0.015
Identities = 17/51 (33%), Positives = 32/51 (62%), Gaps = 4/51 (7%)

Query: 309 EEDLLVSKKRLDKIYRLKQRVLGTLGGINPNFKKEILECMQDDLNVSKALS 359
+E L+ S++ LD+ + Q+VL + G+NP+ K ++L +V+KA+S
Sbjct: 188 KETLMASERALDETVQEAQKVLNMVNGLNPSNKDQVLA----KKDVAKAIS 234


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_04575  DHBDHDRGNASE  93     2e-24    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 92.8 bits (230), Expect = 2e-24
Identities = 60/243 (24%), Positives = 110/243 (45%), Gaps = 12/243 (4%)

Query: 1 MGEKKESQKVAVITGASSGIGLECALMLLDQGYKVYALSRHATLCVALNHALC------E 54
M K K+A ITGA+ GIG A L QG + A+ + + +L E
Sbjct: 1 MNAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAE 60

Query: 55 SVDIDVSDSNALKEVFSNISAKEKYCDVLINSAGYGVFGSVEDTPIEEVKKQFSVNFFAL 114
+ DV DS A+ E+ + I + D+L+N AG G + EE + FSVN +
Sbjct: 61 AFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGV 120

Query: 115 CEVVQFCLPLLKNKPHSKIFNLSSIAGRVSMLFLGHYSASKHALEAYSDALRLELKPFNV 174
+ + ++ I + S V + Y++SK A ++ L LEL +N+
Sbjct: 121 FNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNI 180

Query: 175 QVCLIEPGPVKSNWEKTAFSVENFESEDSLYALEVNAAKSFYSGVYQNALS-PKAVAQKI 233
+ ++ PG +++ + + ++ EN + + + ++F +G+ L+ P +A +
Sbjct: 181 RCNIVSPGSTETDMQWSLWADENGAEQ-----VIKGSLETFKTGIPLKKLAKPSDIADAV 235

Query: 234 VFL 236
+FL
Sbjct: 236 LFL 238


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_04640  ACETATEKNASE  123    5e-37    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 123 bits (310), Expect = 5e-37
Identities = 48/117 (41%), Positives = 72/117 (61%), Gaps = 2/117 (1%)

Query: 1 MRNIEARK-EKGDKEAKLAFEMCAYRIKKHIGAYMVVLKKVDAIIFTGGLGENYSALRES 59
R++E + GDK A+LA + AYR+KK IG+Y + VD I+FT G+GEN +RE
Sbjct: 283 FRDLEDAAFKNGDKRAQLALNVFAYRVKKTIGSYAAAMGGVDVIVFTAGIGENGPEIREF 342

Query: 60 VCEGLENLGIALCKPTNDNPGSGLVNLSQPDAKIQILRIPTDEELEIALQTKKVLEK 116
+ +GLE LG L K N G + +S D+K+ ++ +PT+EE IA T+K++E
Sbjct: 343 ILDGLEFLGFKLDKEKNKVRGEEAI-ISTADSKVNVMVVPTNEEYMIAKDTEKIVES 398


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_04645  ACETATEKNASE  353    e-123    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 353 bits (907), Expect = e-123
Identities = 142/282 (50%), Positives = 192/282 (68%), Gaps = 6/282 (2%)

Query: 1 MEILVLNLGSSSIKFKLFDMKENKPLASGLAEKIGEEIGQLKIKSHLHHNDQELKEKFVI 60
M+ILV+N GSSS+K++L + K+ LA GLAE+IG L N +++K K +
Sbjct: 1 MKILVINCGSSSLKYQLIESKDGNVLAKGLAERIGINDSLLTHN----ANGEKIKIKKDM 56

Query: 61 KDHASGLLMIRENLT--KMGIIKDFNQIDAIGHRVVQGGDKFHAPVLVNEKVMQEIGNLS 118
KDH + ++ + L G+IKD ++IDA+GHRVV GG+ F + VL+ + V++ I +
Sbjct: 57 KDHKDAIKLVLDALVNSDYGVIKDMSEIDAVGHRVVHGGEYFTSSVLITDDVLKAITDCI 116

Query: 119 ILAPLHNPANLAGIEFVQKAHPHIPQIAVFDTAFHATMPSYAYMYALPYELYEKYQIRHY 178
LAPLHNPAN+ GI+ + P +P +AVFDTAFH TMP YAY+Y +PYE Y KY+IR Y
Sbjct: 117 ELAPLHNPANIEGIKACTQIMPDVPMVAVFDTAFHQTMPDYAYLYPIPYEYYTKYKIRKY 176

Query: 179 GFHRTSHHYVAKEAAKFLNTAYEEFNAISLHLGNGSSAAAIQKGKSVDTSMGLTPLEGLI 238
GFH TSH YV++ AA+ LN E I+ HLGNGSS AA++ GKS+DTSMG TPLEGL
Sbjct: 177 GFHGTSHKYVSQRAAEILNKPIESLKIITCHLGNGSSIAAVKNGKSIDTSMGFTPLEGLA 236

Query: 239 MGTRCGDIDPTVVEYTAQCANKSLEEVMKMLNHESGLKGICG 280
MGTR G IDP+++ Y + N S EEV+ +LN +SG+ GI G
Sbjct: 237 MGTRSGSIDPSIISYLMEKENISAEEVVNILNKKSGVYGISG 278
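
The Expect values on this page appear in three notations: plain decimals (0.004), scientific notation (8e-35), and a bare exponent (e-123). A minimal Python sketch for converting them to floats and comparing them with the 0.05 E-value setting listed under the input parameters is given here. The helper names are hypothetical, reading a bare exponent as 1e-123 is an assumption about the report format, and whether PredictBias treats the cutoff as inclusive is not stated on the page.

```python
import re

# "E-value (RPSBlast)" input parameter shown on this page.
E_VALUE_CUTOFF = 0.05

# Captures the token that follows "Expect = " on a statistics line.
EXPECT = re.compile(r"Expect = ([0-9.eE+-]+)")

def parse_expect(line):
    """Return the Expect value of a statistics line as a float, or None."""
    match = EXPECT.search(line)
    if match is None:
        return None
    text = match.group(1)
    if text.startswith("e"):
        # Bare exponent such as "e-123"; assumed here to mean 1e-123.
        text = "1" + text
    return float(text)

def passes_cutoff(e_value, cutoff=E_VALUE_CUTOFF):
    """True if the hit is at or below the cutoff (inclusive test is an assumption)."""
    return e_value <= cutoff

for line in (
    "Score = 353 bits (907), Expect = e-123",
    "Score = 118 bits (297), Expect = 8e-35",
    "Score = 28.2 bits (62), Expect = 0.042",
):
    e = parse_expect(line)
    print(e, passes_cutoff(e))
```

Normalising the bare-exponent form before calling `float` avoids a `ValueError`, since Python does not accept `"e-123"` as a number on its own.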


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_04670  IGASERPTASE   36     5e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 35.8 bits (82), Expect = 5e-04
Identities = 40/227 (17%), Positives = 70/227 (30%), Gaps = 11/227 (4%)

Query: 294 KKSEKTPIHAKTQTTAPSATPENAPKIPLKTPPLMPLIGANPPPNDNIPTPLEKEEKAKE 353
+ Q PS N + P+ P A TP E E E
Sbjct: 992 VDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPA---------TPSETTETVAE 1042

Query: 354 ASDNKEKTKETSNSAQNAQNTQASDKTSDNKSTAPKETIKHFTQQLKQEIQEYKPPMSRI 413
S + KT E + Q + + KS T + Q E +E + ++
Sbjct: 1043 NSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKE 1102

Query: 414 SMDLFPKELGKVEVIIQKVGKNLKVSVISHNNSLQTFLDNQQDLKNSLNALGFEGVDLSF 473
+ + +E KVE +K + KV+ Q+ Q N +
Sbjct: 1103 TATVEKEEKAKVET--EKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQS 1160

Query: 474 SQDSSKEQQAPKDQPKEPFKEQELTPLKENALKSYQENTDNENQETS 520
+++ + + P + ++ N S EN +N T+
Sbjct: 1161 QTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATT 1207



Score = 30.4 bits (68), Expect = 0.025
Identities = 48/240 (20%), Positives = 82/240 (34%), Gaps = 18/240 (7%)

Query: 184 PKTLKDIQTLSQKHDLNASNIQAATTPENKN------PLNASDQLALKTTQTPTNHTLAK 237
P+ K QT+ + +NIQA N A T + T T+A+
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAE 1042

Query: 238 NDAKNTANLSSVLQSLEKKEPQNKEHANPLNNEKKTPPLK--------EALEMNAIKRDK 289
N + + + Q + QN+E A + K E E + +
Sbjct: 1043 NSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKE 1102

Query: 290 TLSKKKSEKTPIHAKTQTTAPSATPENAPKIPLKTPPLMPLIGANP-PPNDNIPTPLEKE 348
T + +K EK + + P T + +PK + A P ND E +
Sbjct: 1103 TATVEKEEKAKVETEKTQEVPKVTSQVSPK---QEQSETVQPQAEPARENDPTVNIKEPQ 1159

Query: 349 EKAKEASDNKEKTKETSNSAQNAQNTQASDKTSDNKSTAPKETIKHFTQQLKQEIQEYKP 408
+ +D ++ KETS++ + + T ++ P+ T TQ KP
Sbjct: 1160 SQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKP 1219


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_04680  FLGHOOKAP1    35     7e-04    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 35.3 bits (81), Expect = 7e-04
Identities = 12/33 (36%), Positives = 20/33 (60%)

Query: 2 NDTLLNAYSGIKTHQFGIDSLSNNIANVNTLGY 34
+ + NA SG+ Q +++ SNNI++ N GY
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGY 33



Score = 33.0 bits (75), Expect = 0.004
Identities = 10/48 (20%), Positives = 20/48 (41%)

Query: 557 IRHKYLETSNVNAGNALTNLILMQRGYSMNARAFGAGDDMIKEAISLK 604
+ ++ S VN NL Q+ Y NA+ + + I+++
Sbjct: 499 LSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546


12  C730_05035  C730_05195  Y        Y        Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
C730_05035  3       15       1.193770   cell division protein FtsA
C730_05040  4       18       0.303362   cell division protein FtsZ
C730_05045  3       18       -2.026807  hypothetical protein
C730_05050  4       18       -2.485202  hypothetical protein
C730_05055  4       21       -2.971416  hypothetical protein
C730_05060  3       19       -3.216404  hypothetical protein
C730_05065  1       20       -4.333441  hypothetical protein
C730_05070  1       22       -5.122100  hypothetical protein
C730_05075  4       28       -7.012990  hypothetical protein
C730_05080  4       22       -5.081206  hypothetical protein
C730_05085  3       22       -5.317555  hypothetical protein
C730_05090  4       22       -5.233193  hypothetical protein
C730_05095  6       24       -5.891515  hypothetical protein
C730_05100  7       23       -6.402470  IS605 transposase (tnpA)
C730_05105  10      24       -6.382484  IS605 transposase (tnpB)
C730_05110  10      24       -6.365512  hypothetical protein
C730_05125  10      23       -6.317328  hypothetical protein
C730_05130  10      22       -6.087386  hypothetical protein
C730_05135  10      20       -5.065318  integrase/recombinase (xerD)
C730_05140  9       18       -4.424287  relaxase
C730_05145  4       17       -3.622228  IS605 transposase (tnpB)
C730_05150  3       15       -4.284030  IS605 transposase (tnpA)
C730_05155  4       16       -4.222122  adenine specific DNA methyltransferase
C730_05160  3       15       -4.418434  PARA protein
C730_05165  4       17       -4.510965  hypothetical protein
C730_05170  2       19       -5.976680  hypothetical protein
C730_05185  3       24       -7.004282  hypothetical protein
C730_05190  3       22       -6.932981  hypothetical protein
C730_05195  0       12       -3.659417  conjugal transfer protein (traG)
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_05035  SHAPEPROTEIN  42     3e-06    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 42.1 bits (99), Expect = 3e-06
Identities = 38/176 (21%), Positives = 67/176 (38%), Gaps = 12/176 (6%)

Query: 210 AASIATLSNDERELGVACVDMGGETCNLTIYSGNSIRYNKYLPVGSHHLTTDL------S 263
AA+I G VD+GG T + + S N + Y+ + +G + +
Sbjct: 146 AAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNGVVYSSSVRIGGDRFDEAIINYVRRN 205

Query: 264 HMLNTPFPYAEEVKIKYGDLSFEGGEETPSQNVQIPTTGSDGHESHIVPLSEIQTIMRER 323
+ AE +K + G S G+E V+ + +EI ++E
Sbjct: 206 YGSLIGEATAERIKHEIG--SAYPGDEVREIEVRGRNLAEGVPRGFTLNSNEILEALQEP 263

Query: 324 ALETFKIIHRSIQDSGLE---EHLGGGVVLTGGMALMKGIKELARTHFTNYPVRLA 376
+ +++ E + G+VLTGG AL++ + L T PV +A
Sbjct: 264 LTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNLDRLLM-EETGIPVVVA 318


13  C730_05690  C730_05810  Y        Y        N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
C730_05690  2       9        1.738343   glucokinase
C730_05695  3       10       0.909709   cinnamyl-alcohol dehydrogenase ELI3-2 (cad)
C730_05700  1       11       1.815736   LPS biosynthesis protein
C730_05705  2       12       2.567585   hypothetical protein
C730_05710  0       14       2.991892   hypothetical protein
C730_05715  0       12       2.579051   pyruvate flavodoxin oxidoreductase subunit
C730_05720  -1      11       2.325575   pyruvate flavodoxin oxidoreductase subunit
C730_05725  -1      11       1.793351   pyruvate flavodoxin oxidoreductase subunit
C730_05730  0       11       -0.066151  ferrodoxin oxidoreductase beta subunit
C730_05735  2       12       -0.600593  adenylosuccinate lyase
C730_05740  3       15       -1.131264  hypothetical protein
C730_05745  1       15       -0.014447  excinuclease ABC subunit B
C730_05750  2       15       -0.319942  hypothetical protein
C730_05755  2       14       -0.333776  hypothetical protein
C730_05760  -1      15       0.038016   hypothetical protein
C730_05765  -1      14       0.223542   hypothetical protein
C730_05770  0       14       -0.069682  gamma-glutamyltranspeptidase
C730_05775  -1      13       -1.131433  flagellar hook-associated protein FlgK
C730_05780  0       16       -1.784293  hypothetical protein
C730_05785  1       19       -1.223478  cytosine specific DNA methyltransferase
C730_05790  2       16       -0.846334  hypothetical protein
C730_05795  2       14       -1.840314  hypothetical protein
C730_05800  3       15       -1.743974  peptidyl-prolyl cis-trans isomerase
C730_05805  3       17       -2.453378  hypothetical protein
C730_05810  3       15       -2.108428  peptidoglycan-associated lipoprotein precursor
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_05715  YERSSTKINASE  29     0.010    Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 29.3 bits (65), Expect = 0.010
Identities = 18/63 (28%), Positives = 33/63 (52%), Gaps = 9/63 (14%)

Query: 50 YNRVDDEPILNHERFMQPDYVLVIDPGLVFIENIFANEKEDTTYIITSYLNKEELFEKKP 109
++R ++P E F P+ + + N+ A+EK D ++++ L+ E FEK P
Sbjct: 293 HSRSGEQPKGFTESFKAPE---------LGVGNLGASEKSDVFLVVSTLLHCIEGFEKNP 343

Query: 110 ELK 112
E+K
Sbjct: 344 EIK 346


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_05775  FLGHOOKAP1    570    0.0      Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 570 bits (1471), Expect = 0.0
Identities = 129/610 (21%), Positives = 229/610 (37%), Gaps = 75/610 (12%)

Query: 6 SSLNTSYTGLQAHQSMVDVTGNNISNASDEFYSRQRVIAKPQAAYMYGTKNVNMGVDVEA 65
S +N + +GL A Q+ ++ NNIS+ + Y+RQ I + + V GV V
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSG 61

Query: 66 IERVHDEFVFARYTKANYENTYYDTEFSHLKEASAYFPDIDEASLFTDLQDYFNSWKELS 125
++R +D F+ + A +++ + + + +SL T +QD+F S + L
Sbjct: 62 VQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNML-STSTSSLATQMQDFFTSLQTLV 120

Query: 126 KNAKDSAQKQALAQKTEALTHNIKDTRERLTTLQHKASEELKSVIKEVNSLGSQIAEINK 185
NA+D A +QAL K+E L + K T + L + + + + + ++N+ QIA +N
Sbjct: 121 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 180

Query: 186 RIKEVENNKSLKHANELRDKRDELEFHLRELLGGNVFKSSIKTHSLTDKDSADFDESYNL 245
+I + + N L D+RD+L L +++G V S +YN+
Sbjct: 181 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEV--------------SVQDGGTYNI 226

Query: 246 NIGHGFNIIDGSIFHPLVVKESENKGGLNQVYFQSDDFKVTNITDK-LNQGRVGALLNVY 304
+ +G++++ GS L S V + I +K LN G +G +L
Sbjct: 227 TMANGYSLVQGSTARQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFR 286

Query: 305 NDGSNGTLKGKLQDYIDLLDSFAKGLIESTNAIYAQSASHYIEGEPVEFNSDEAFKDTNY 364
+ L + L A E+ N + +A D N
Sbjct: 287 SQ--------DLDQTRNTLGQLALAFAEAFNTQH------------------KAGFDANG 320

Query: 365 NIKNGSFDL----IAYNTDGKEIARKTIAITPITTMNDIIQAINANTDDNQ-----DNNT 415
+ F + + NT K +T + + I+ + + Q N T
Sbjct: 321 DAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATDYKISFDNNQWQVTRLASNTT 380

Query: 416 ENDFDDYFTAGFNNETKKFVIQPKNASQGLFVSMKDNGTNFMGALKLNPFFQGDDASNIS 475
D + + + + + M L D + I+
Sbjct: 381 FTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIVNMDVLI-------TDEAKIA 433

Query: 476 LNKEYKKEPTTIRPWLAPINGNFDVANMMQQLQYDSVDFYNDKFDIKPMKISEFYQFLTG 535
+ E E G+ D N L S N K ++ Y L
Sbjct: 434 MASE---EDA----------GDSDNRNGQALLDLQS----NSKTVGGAKSFNDAYASLVS 476

Query: 536 KINTDAEKSGRILDTKKSMLETIKKEQLSISQVSVDEEMVNLIKFQSGYAANAKVITAID 595
I T+ +++ + +Q SIS V++DEE NL +FQ Y ANA+V+ +
Sbjct: 477 DIGNKTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTAN 536

Query: 596 RMIDTLLGIK 605
+ D L+ I+
Sbjct: 537 AIFDALINIR 546


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_05805  GPOSANCHOR    33     0.002    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 32.7 bits (74), Expect = 0.002
Identities = 25/145 (17%), Positives = 49/145 (33%)

Query: 27 GATKKELKQLQINSKNFSNILTKIHSQVEANTQAQEGLRSVYEGQANKIKDLNNAILSQE 86
K + ++ +A EG + + KIK L + E
Sbjct: 200 EGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALE 259

Query: 87 ESLRALKASQEVQANTLKQQSQTLEDLRNEIHANQQAIQQLDKQNKEMSELLTKLSQDLV 146
L+ + E N S ++ L E A + L+ Q++ ++ L +DL
Sbjct: 260 ARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLD 319

Query: 147 SQIALIQKALKEQEEKAEKPLKSNA 171
+ ++ E ++ E+ S A
Sbjct: 320 ASREAKKQLEAEHQKLEEQNKISEA 344


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_05810  OMPADOMAIN    147    6e-46    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 147 bits (373), Expect = 6e-46
Identities = 48/169 (28%), Positives = 75/169 (44%), Gaps = 24/169 (14%)

Query: 22 KMDNKTVAGDVSAKTVQTAPV-TTEPAPEKEEPKQEPAPVVEEKPAVESGTIIASIYFDF 80
+ DN ++ VS + Q PAP PAP V+ K T+ + + F+F
Sbjct: 177 RPDNGMLSLGVSYRFGQGEAAPVVAPAPA-------PAPEVQTK----HFTLKSDVLFNF 225

Query: 81 DKYEIKESDQETLDEIVQKAKE---NHMQVLLEGNTDEFGSSEYNQALGVKRTLSVKNAL 137
+K +K Q LD++ + V++ G TD GS YNQ L +R SV + L
Sbjct: 226 NKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQSVVDYL 285

Query: 138 VIKGVEKDMIKTISFGETKPKC-----AQKTR----ECYKENRRVDVKL 177
+ KG+ D I GE+ P K R +C +RRV++++
Sbjct: 286 ISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334


14  C730_05865  C730_05890  Y        Y        N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
C730_05865  0       18       -3.681306  F0F1 ATP synthase subunit B'
C730_05870  1       18       -3.287343  plasmid replication-partition related protein
C730_05875  2       18       -3.740321  SpoOJ regulator
C730_05880  2       20       -4.954582  biotin--protein ligase
C730_05885  2       21       -5.557448  methionyl-tRNA formyltransferase
C730_05890  2       22       -5.839680  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_05875  PF07675       31     0.004    Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 31.2 bits (70), Expect = 0.004
Identities = 30/105 (28%), Positives = 40/105 (38%), Gaps = 7/105 (6%)

Query: 69 QISQVILKTQMPFLDLVPSNLGLAGFEKTFYDSQDENKRGELMLKNALESVV---GLYDY 125
VI T F SNL A FE + D + ++ VV G+YDY
Sbjct: 414 TFGSVIPATGPLFTGTASSNLYSANFEYLTPANADPVVTTQNIIVTGQGEVVIPGGVYDY 473

Query: 126 IIIDSPPALGPLTINSLSAAHSVIIPIQCEFFALEGTKLLLNTIR 170
I + PA G + I A P + + FA E K T+R
Sbjct: 474 CITNPEPASGKMWI----AGDGGNQPARYDDFAFEAGKKYTFTMR 514


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_05885  FERRIBNDNGPP  30     0.008    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 30.3 bits (68), Expect = 0.008
Identities = 12/33 (36%), Positives = 21/33 (63%)

Query: 70 EPEVQILKDLKPNFIVVVAYGKILPKEVLTIAP 102
EP +++L ++KP+F+V A P+ + IAP
Sbjct: 86 EPNLELLTEMKPSFMVWSAGYGPSPEMLARIAP 118


15  C730_07165  C730_07220  Y        Y        N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
C730_07165  2       17       -2.832073  ribulose-phosphate 3-epimerase
C730_07170  2       19       -3.671284  DNA polymerase III subunit epsilon
C730_07175  4       17       -6.344078  hypothetical protein
C730_07180  6       13       -3.979591  hypothetical protein
C730_07185  4       11       -2.193209  hypothetical protein
C730_07190  5       12       -2.352826  hypothetical protein
C730_07195  1       11       -1.717943  hypothetical protein
C730_07200  0       13       -1.988666  fibronectin/fibrinogen-binding protein
C730_07205  -2      11       -1.086064  DNA repair protein (recN)
C730_07210  1       13       -0.002204  inorganic polyphosphate/ATP-NAD kinase
C730_07215  0       11       -0.283601  hypothetical protein
C730_07220  2       12       2.505066   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_07200  FbpA_PF05833  112    5e-29    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 112 bits (282), Expect = 5e-29
Identities = 74/361 (20%), Positives = 143/361 (39%), Gaps = 31/361 (8%)

Query: 97 AKDLAYKSENFILRLEMIPKKANLMILDKEKCVIEA--FRFNDRVAKNDILGALPPN-IY 153
+ ++ ++ +N + L + K + + I++ F FN N +G N +
Sbjct: 209 SSEICFRLKNNSIDLSLSNLKEIVEVCKDLFKEIQSNKFEFNCYTKNNSFVGFYCLNLMS 268

Query: 154 EHQEEDLDFKGLLDILEKDFLFYQHKE----LEHKKNQIIKRLNAQKERLKEKLEKLEDP 209
+ + + + +LE FY K+ L+ K + + K + R +K + L +
Sbjct: 269 KEDYKKIQYDSSSKLLEN---FYYAKDKSDRLKSKSSDLQKIVMNNINRCTKKDKILNNT 325

Query: 210 KNLQLEAKELQTQASLLLTYQHLIHKHESRVVLKDFED---KERAIEIDKSMPLNAFINK 266
+ + LL + + K S + L ++ I +D++ + +
Sbjct: 326 LKKCEDKDIFKLYGELLTANIYALKKGLSHIELANYYSENYDTVKITLDENKTPSQNVQS 385

Query: 267 KFTLSKKKKQKSQFLYLEEENLKEKIAFKENQINYVKGAQEESVLE------------MF 314
+ K K+ + + +E++ + + + + A +E F
Sbjct: 386 YYKKYNKLKKSEEAANEQLLQNEEELNYLYSVLTNINNADNYDEIEEIKKELIETGYIKF 445

Query: 315 MPSKNSKIKRPMSGYEVLYYKDFKIGLGKNQKENIKL-LQDARANDLWMHVRNIPGSHLI 373
SK + + I +GKN +N L L+ A +D+W H +NIPGSH+I
Sbjct: 446 KKIYKSKKSKTSKPMHFISKDGIDIYVGKNNIQNDYLTLKFANKHDIWFHTKNIPGSHVI 505

Query: 374 VFCQKNAPKDDVIMELAKMLIKMQKDAFNS-YEIDYTQRKFVKIIKGAN---VIYSKYRT 429
V + P + ++E A + K +S +DYT+ K VK GA VIYS +T
Sbjct: 506 VKNIMDIP-ESTLLEAANLAAYYSKSQNSSNVPVDYTEVKNVKKPNGAKPGMVIYSTNQT 564

Query: 430 I 430
I
Sbjct: 565 I 565



Score = 35.2 bits (81), Expect = 5e-04
Identities = 20/92 (21%), Positives = 48/92 (52%), Gaps = 5/92 (5%)

Query: 46 SAPYIGLSKKPPESVLKNTLALDFCLNKFTRNAKILQANIIDNDRI--LEINGAKDLAYK 103
+ P I L+ + +K + L K+ NAKI+ + I+ DRI ++ +L +
Sbjct: 55 NYPRIHLTDLTKPNPIKAPMFCMV-LRKYISNAKIVDIHQINQDRIVVIDFESTDELGFN 113

Query: 104 SENFILRLEMIPKKANLMILDK-EKCVIEAFR 134
S + L +E++ + +N+ ++ K + ++++ +
Sbjct: 114 SI-YSLIIEIMGRHSNMTLIRKRDNIIMDSIK 144


16  C730_07595  C730_07655  Y        Y        N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
C730_07595  2       10       0.764511   hypothetical protein
C730_07600  1       8        0.365159   hypothetical protein
C730_07605  1       9        0.400622   branched-chain amino acid aminotransferase
C730_07610  1       11       -0.318509  hypothetical protein
C730_07615  1       12       -0.501406  DNA polymerase I
C730_07620  -1      15       0.148001   type IIS restriction enzyme R protein (BCGIB)
C730_07625  0       15       0.489164   type IIS restriction enzyme M protein (mod)
C730_07630  4       17       0.663862   hypothetical protein
C730_07635  3       13       0.424483   thymidylate kinase
C730_07640  3       12       0.162947   phosphopantetheine adenylyltransferase
C730_07645  2       12       0.405984   3-octaprenyl-4-hydroxybenzoate carboxy-lyase
C730_07650  3       12       0.154701   hypothetical protein
C730_07655  2       12       0.154884   flagellar basal body P-ring biosynthesis protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_07635  BLACTAMASEA   27     0.030    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 27.4 bits (61), Expect = 0.030
Identities = 11/45 (24%), Positives = 21/45 (46%), Gaps = 7/45 (15%)

Query: 22 DRFKNALFTKEPGGTR-------MGESLRRIALNENISELARAFL 59
DR++ L PG R M +LR++ ++ +S ++ L
Sbjct: 159 DRWETELNEALPGDARDTTTPASMAATLRKLLTSQRLSARSQRQL 203


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_07640  LPSBIOSNTHSS  224    1e-78    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 224 bits (573), Expect = 1e-78
Identities = 64/147 (43%), Positives = 93/147 (63%)

Query: 4 IGIYPGTFDPVTNGHIDIIHRSSELFEKLIVAVAHSSAKNPMFSLDERLKMIQLATKSFK 63
IYPG+FDP+T GH+DII R LF+++ VAV + K PMFS+ ERL+ I A
Sbjct: 2 NAIYPGSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAHLP 61

Query: 64 NVECVAFEGLLANLAKEYHCKVLVRGLRVVSDFEYELQMGYANKSLNHELETLYFMPTLQ 123
N + +FEGL N A++ ++RGLRV+SDFE ELQM NK+L +LET++ + +
Sbjct: 62 NAQVDSFEGLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDLETVFLTTSTE 121

Query: 124 NAFISSSIVRSIIAHKGDASHLVPKEI 150
+F+SSS+V+ + G+ H VP +
Sbjct: 122 YSFLSSSLVKEVARFGGNVEHFVPSHV 148


17  C730_07810  C730_07960  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
C730_07810   2      13        2.937122  saccharopine dehydrogenase
C730_07815   2      14        2.960689  ferrodoxin-like protein
C730_07820   1      12        2.043745  glycerol-3-phosphate acyltransferase PlsY
C730_07825  -1      12        1.844076  hypothetical protein
C730_07830  -1      11        0.701934  hypothetical protein
C730_07835  -1      11       -0.830172  iron-regulated outer membrane protein
C730_07840   0      15       -3.594075  hypothetical protein
C730_07845   0      11       -3.535286  selenocysteine synthase
C730_07850   0      10       -3.411876  transcription elongation factor NusA
C730_07855   1      10       -4.606845  hypothetical protein
C730_07860   1      10       -3.637209  type IIS restriction enzyme R and M protein
C730_07865   1      13       -3.948325  hypothetical protein
C730_07880   2      12       -3.467847  type III restriction enzyme R protein (res)
C730_07885   2      13       -3.273734  type III R-M system modification enzyme
C730_07890   1      12       -2.935241  type III DNA modification enzyme
C730_07895   0      13       -1.780157  ATP-dependent DNA helicase RecG
C730_07900   0      16       -1.105207  hypothetical protein
C730_07905  -1      14       -0.856989  hypothetical protein
C730_07910  -1      12       -1.120239  exodeoxyribonuclease III
C730_07915   1      12       -0.176127  *hypothetical protein
C730_07920   3      16       -0.039662  hypothetical protein
C730_07925   2      14        0.301667  chromosomal replication initiation protein
C730_07930   1      16       -0.669618  purine nucleoside phosphorylase (punB)
C730_07935   1      15       -1.081144  hypothetical protein
C730_07940   1      15       -1.111632  glucosamine--fructose-6-phosphate
C730_07945   2      16       -3.429388  FAD-dependent thymidylate synthase
C730_07950   1      15       -2.194528  hypothetical protein
C730_07955   0      13       -0.847646  IS605 transposase (tnpB)
C730_07960   2      11        0.716939  IS605 transposase (tnpA)
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
C730_07825  PF08280  26     0.032    M protein trans-acting positive regulator
		>PF08280#M protein trans-acting positive regulator

Length = 530

Score = 26.3 bits (58), Expect = 0.032
Identities = 13/38 (34%), Positives = 21/38 (55%)

Query: 76 LSHALKTRYKEITELYLKISKLEISPNSQVGASVKIRY 113
LS++ R +E L+ +L++S N VG +IRY
Sbjct: 147 LSNSSAYRMREALIPLLRNFELKLSKNKIVGEEYRIRY 184


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
C730_07925  HTHFIS  35     5e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 35.2 bits (81), Expect = 5e-04
Identities = 9/51 (17%), Positives = 24/51 (47%), Gaps = 4/51 (7%)

Query: 127 TVYEIAKKVAQSDTPPYNPVLFYGGTGLGKTHILNAIGNHALEKHKKVVLV 177
+Y + ++ Q+D ++ G +G GK + A+ ++ ++ V +
Sbjct: 148 EIYRVLARLMQTDLT----LMITGESGTGKELVARALHDYGKRRNGPFVAI 194


18  C730_00180  C730_00205  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
C730_00180  -2      13        0.404200  hypothetical protein
C730_00185  -2      13        0.357231  conjugal plasmid transfer system protein
C730_00190  -2      14        1.149127  ComB10 competence protein
C730_00195  -1      12        1.344916  mannose-6-phosphate isomerase
C730_00200  -2      12        1.621943  GDP-D-mannose dehydratase
C730_00205  -1      13        1.491686  nodulation protein (nolK)
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
C730_00180  PF04335  132    2e-40    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 132 bits (334), Expect = 2e-40
Identities = 37/202 (18%), Positives = 73/202 (36%), Gaps = 4/202 (1%)

Query: 40 QSVFRLERNRLKIAYKLLGLMSFIALVLAIVLISVLPLQKTEHHF--VDFLNQDKHYAII 97
+ K+A+ + G+ +A + + ++ PL+ E + VD + A
Sbjct: 22 RDKLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNTGEASIAAK 81

Query: 98 QRADKSISSNEALARSLIGAYVLNRESINRIDDKSRYELVRLQSSSKVWQRFEDLIKTQN 157
D +I+ +EA+ + + YV RE + ++ V + S+ R+ KT N
Sbjct: 82 LHGDATITYDEAVRKYFLATYVRYREGWIAAAREEYFDAVMVMSARPEQDRWSRFYKTDN 141

Query: 158 SIYVQSHLEREVHI-VNIAIYQQDNNPIASVSIAAKLLNENKLVYEKRYKIVLSYLFDTP 216
Q+ L + V I +A V + + + + + Y D
Sbjct: 142 PQSPQNILANRTDVFVEIKRVSFLGGNVAQVYFTKESVTGSNST-KTDAVATIKYKVDGT 200

Query: 217 DFDYASMPKNPTGFKITRYSIT 238
KNP G+++ Y
Sbjct: 201 PSKEVDRFKNPLGYQVESYRAD 222


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C730_00185  TYPE4SSCAGX  30     0.017    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 29.8 bits (66), Expect = 0.017
Identities = 27/110 (24%), Positives = 51/110 (46%), Gaps = 18/110 (16%)

Query: 155 FIEDKNYYSNAFLKPQKENMAENAPKDAPTNNKPLKEEKEETKEKEEETITIGDNTNAMK 214
I+ +N + A++ N A + N + ++EEK++ + + + NA+K
Sbjct: 339 LIKQENLNTTAYI-----NRVMMASNEQIINKEKIREEKQKIILDQAKALETQYVHNALK 393

Query: 215 IVKKDIQKGYKALKSSQRKWYCLGICSKKSKLSLMPKEIFNDKQFTYFKF 264
+ + + Y ++ + K+SK +MP EIF+D FTYF F
Sbjct: 394 --RNPVPRNYNYYQAPE----------KRSK-HIMPSEIFDDGTFTYFGF 430


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_00195  FLGMRINGFLIF  31     0.015    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 30.7 bits (69), Expect = 0.015
Identities = 17/88 (19%), Positives = 32/88 (36%), Gaps = 3/88 (3%)

Query: 272 ALFEEAANEPKENVSLNQTPVFAKESENNLVFSHKVSAL---LGVENLAVIDTKDALLIA 328
+LF P +V++ P A + H VS+ L N+ ++D LL
Sbjct: 162 SLFVREQKSPSASVTVTLEPGRALDEGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQ 221

Query: 329 HKDKAKDLKALVNEVETNNQELLQTHTK 356
+DL + + + +Q +
Sbjct: 222 SNTSGRDLNDAQLKFANDVESRIQRRIE 249


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_00200  NUCEPIMERASE  88     2e-21    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 87.5 bits (217), Expect = 2e-21
Identities = 46/180 (25%), Positives = 72/180 (40%), Gaps = 19/180 (10%)

Query: 7 LITGVTGQDGSYLAEYLLNLGYEVHGLKRRSSSINTSRIDHLYEDLHSDHKRRFFLHYGD 66
L+TG G G ++++ LL G++V G+ + + S E L F H D
Sbjct: 4 LVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQP---GFQFHKID 60

Query: 67 MTDSSNLIHLIATTKPTEIYNLAAQSHVKVSFETPEYTANADGIGTLRILEAMRILGLEK 126
+ D + L A+ ++ + V+ S E P A+++ G L ILE R ++
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQ- 119

Query: 127 KTRFYQASTSELYGEVLETPQNENTPF-------NPRSPYAVAKMYAFYITKNYREAYNL 179
AS+S +YG N PF +P S YA K + Y Y L
Sbjct: 120 --HLLYASSSSVYGL------NRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGL 171


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_00205  NUCEPIMERASE  53     4e-10    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 52.9 bits (127), Expect = 4e-10
Identities = 51/346 (14%), Positives = 106/346 (30%), Gaps = 54/346 (15%)

Query: 5 ILITGAYGMVGQNTALYFKKNKPDV-----------TLLTPKKSELC-----------LL 42
L+TGA G +G + + + V L + EL L
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDLA 62

Query: 43 DKDNVQAYLKEYKPTGIIHCAGRVGGIVANMNDLSTYMVENLLMGLYLFSSALDSGVKKA 102
D++ + + R + ++ + Y NL L + + ++
Sbjct: 63 DREGMTDLFASGHFERVFISPHR-LAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHL 121

Query: 103 INLASSCAYPKFAPNPLKESDLLNGSLEPTNEGYALAKLSVMKYCEYVSAEKGVFYKTLV 162
+ +SS Y P D ++ + YA K + S G+ L
Sbjct: 122 LYASSSSVYGLNRKMPFSTDDSVDHPVSL----YAATKKANELMAHTYSHLYGLPATGLR 177

Query: 163 PCNLYGEFDKFEEKIAHMIPGLIARMHTAKLKNEKEFAMWGDGTARREYLNAKDLARFIS 222
+YG + + P + T + K ++ G +R++ D+A I
Sbjct: 178 FFTVYGPWGR---------PDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAII 228

Query: 223 LAYENIASIPS-----------------VMNVGSGVDYSIEEYYEKVAQVLDYKGVFVKD 265
+ I + V N+G+ + +Y + + L +
Sbjct: 229 RLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNML 288

Query: 266 LSKPVGMQQKLMDISK-QRALKWELEIPLEQGIKEAYEYYLKLLEV 310
+P + + D + + E ++ G+K +Y +V
Sbjct: 289 PLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDFYKV 334


19  C730_00565  C730_00595  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
C730_00565   0      17       -1.396931  flagellin B
C730_00570   0      13       -1.689582  DNA topoisomerase I
C730_00575   1      17       -2.205885  hypothetical protein
C730_00580   1      15       -1.481302  hypothetical protein
C730_00585   0      12       -0.724102  hypothetical protein
C730_00590  -1      10        0.648228  hypothetical protein
C730_00595   0      13        1.879419  phosphoenolpyruvate synthase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
C730_00565  FLAGELLIN  284    3e-92    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 284 bits (728), Expect = 3e-92
Identities = 130/519 (25%), Positives = 221/519 (42%), Gaps = 18/519 (3%)

Query: 2 SFRINTNIAALTSHAVGVQNNRDLSSSLEKLSSGLRINKAADDSSGMAIADSLRSQSANL 61
+ INTN +L + ++ LSS++E+LSSGLRIN A DD++G AIA+ S L
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 GQAIRNANDAIGMVQTADKAMDEQIKILDTIKTKAVQAAQDGQTLESRRALQSDIQRLLE 121
QA RNAND I + QT + A++E L ++ +VQA + +++Q +IQ+ LE
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 ELDNIANTTSFNGQQMLSGSFSNKEFQIGAYSNTTVKASIGSTSSDKIGHVRMETSSFSG 181
E+D ++N T FNG ++LS + Q+GA T+ + +G +
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDN-QMKIQVGANDGETITIDLQKIDVKSLGLDGFNVN---- 175

Query: 182 AGMLASAAAQNLTEVGLNFKQVNGVNDYKIETVRISTSAGTGIGALSEIINRFSNTLGVR 241
+ ++ +FK V G + Y + + +G + + V
Sbjct: 176 -----GPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVN 230

Query: 242 ASYNVMATG----GTPVQSGTVRELTINGVEIGTVNDVHKNDADGRLTNAINSVKDRTGV 297
A+ + T T V + T E + K +G T V
Sbjct: 231 AANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGD-TFDYKGVTFTIDT 289

Query: 298 EASLDIQGRINLHSIDGRAISVHAASASGQVFGGGNFAGISGTQHAVIGRLTLTRTDARD 357
+ D G+++ +I+G +++ A + S D +
Sbjct: 290 KTGNDGNGKVST-TINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKT 348

Query: 358 IIVSGVNFSHVGFHSAQGVAEYTVNLRAVRGIFDANVASAAGANANGAQAETNSQGIGAG 417
S ++ +G ++ TVN + + AG + + +
Sbjct: 349 KNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINE 408

Query: 418 --VTSLKGAMIVMDMADSARTQLDKIRSDMGSVQMELVTTINNISVTQVNVKAAESQIRD 475
+ K + DSA +++D +RS +G++Q + I N+ T N+ +A S+I D
Sbjct: 409 DAAAAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIED 468

Query: 476 VDFAEESANFSKYNILAQSGSFAMAQANAVQQNVLRLLQ 514
D+A E +N SK IL Q+G+ +AQAN V QNVL LL+
Sbjct: 469 ADYATEVSNMSKAQILQQAGTSVLAQANQVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C730_00585  IGASERPTASE  55     3e-10    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 55.5 bits (133), Expect = 3e-10
Identities = 29/184 (15%), Positives = 59/184 (32%), Gaps = 10/184 (5%)

Query: 96 DDQSKKEVAETQKEAENARDRANKSGIELEQEQQKTEQEKQKTEQEKQKTEQEKQKTSNI 155
+ EVA++ E + + K + ++EK K E E KT++ + TS +
Sbjct: 1078 ANTQTNEVAQSGSETKETQTTETK------ETATVEKEEKAKVETE--KTQEVPKVTSQV 1129

Query: 156 E-TNNQIKVEQKQQKTEQEKQKTEQEKQKTEQE-KQKTEQEKQKTSNIETNNQIKVEQKQ 213
Q + Q Q + +E T K+ Q ++ K ++ +
Sbjct: 1130 SPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTV 1189

Query: 214 QKTEQEKQKTEQEKQKTEQEKQKTEQEKQKTSNIETNNQIKVEQEQQKTEQEKQKTNNTQ 273
+ E T Q +E + + + + + T ++
Sbjct: 1190 NTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVAL 1249

Query: 274 KDLV 277
DL
Sbjct: 1250 CDLT 1253



Score = 54.7 bits (131), Expect = 6e-10
Identities = 38/231 (16%), Positives = 79/231 (34%), Gaps = 14/231 (6%)

Query: 102 EVAETQKEAENARDRANKSGIELEQEQQKTEQEKQKTEQEKQKTEQEKQKTSNIETNNQI 161
T+ AEN++ + + + T Q ++ ++ K + Q T+ + +
Sbjct: 1033 PSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQ-TNEVAQSGSE 1091

Query: 162 KVEQKQQKTEQEKQKTEQEKQKTEQEKQKTEQEKQKTSNIE-TNNQIKVEQKQQKTEQEK 220
E + +T++ ++EK K E E KT++ + TS + Q + Q Q + +E
Sbjct: 1092 TKETQTTETKETATVEKEEKAKVETE--KTQEVPKVTSQVSPKQEQSETVQPQAEPAREN 1149

Query: 221 QKTEQEKQKTEQE-KQKTEQEKQKTSNIETNNQIKVEQEQQKTEQEKQKTNNTQKDLVKY 279
T K+ Q ++ K ++ + + NT +
Sbjct: 1150 DPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQ- 1208

Query: 280 AEQNCQENHNQFFIKKLGIKGGIAIEVEAECKTPKPAKTNQTPIQPKHLPN 330
E+ N+ K V + +PA T+ L +
Sbjct: 1209 -PTVNSESSNK-------PKNRHRRSVRSVPHNVEPATTSSNDRSTVALCD 1251



Score = 52.4 bits (125), Expect = 3e-09
Identities = 27/188 (14%), Positives = 66/188 (35%), Gaps = 4/188 (2%)

Query: 98 QSKKEVAETQKEAENARDRANKSGIELEQEQQKTEQEKQKTEQEKQKTEQEKQKTSNIET 157
Q+++ E + + + E ++ +T + K+ EK++ + + + +
Sbjct: 1064 QNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVP 1123

Query: 158 NNQIKVEQKQQKTEQEKQKTEQEKQKTEQEKQKTEQEKQKTSNIETNNQIKVEQKQQKTE 217
+V KQ+++E + + E ++ K Q + T+ + ++
Sbjct: 1124 KVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPV 1183

Query: 218 QEKQKTEQEKQKTEQEKQKTEQEKQKTSNIETNNQIKVEQEQ----QKTEQEKQKTNNTQ 273
E E + T Q T N E++N+ K + E T++
Sbjct: 1184 TESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSND 1243

Query: 274 KDLVKYAE 281
+ V +
Sbjct: 1244 RSTVALCD 1251



Score = 50.1 bits (119), Expect = 2e-08
Identities = 44/232 (18%), Positives = 79/232 (34%), Gaps = 12/232 (5%)

Query: 109 EAENARDRANKSGIE-LEQEQQKTEQEKQKTEQEKQKTEQEKQKTSNI---ETNNQIKVE 164
E E + + I Q E+ + E + ET +
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAEN 1043

Query: 165 QKQQKTEQEKQKTEQEKQKTEQEKQKTEQEKQKTSNIETNNQI--KVEQKQQKTEQEKQK 222
KQ+ EK E+ TE Q E K+ SN++ N Q + + E + +
Sbjct: 1044 SKQESKTVEK----NEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTE 1099

Query: 223 TEQEKQKTEQEKQKTEQEKQKTSNIETNNQIKVEQEQQKTEQEKQKTNNTQKDLVKYAEQ 282
T++ ++EK K E EK + T +Q+ +QEQ +T Q Q + D ++
Sbjct: 1100 TKETATVEKEEKAKVETEKTQEVPKVT-SQVSPKQEQSETVQ-PQAEPARENDPTVNIKE 1157

Query: 283 NCQENHNQFFIKKLGIKGGIAIEVEAECKTPKPAKTNQTPIQPKHLPNSKQP 334
+ + ++ + +E T + P + QP
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQP 1209


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C730_00590  IGASERPTASE  29     0.045    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 28.9 bits (64), Expect = 0.045
Identities = 27/142 (19%), Positives = 45/142 (31%), Gaps = 29/142 (20%)

Query: 163 ELANSQIKAEQERQKTEQEKQ-----------------------KANKSAIELEQQKQKT 199
++ AE +Q+++ ++ KAN E+ Q +T
Sbjct: 1033 PSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSET 1092

Query: 200 INTQRDLIKEQKDFIKETEQNCQENHNQFFIKKLGIKGGIAIEVEAECKTPKPAKTNQTP 259
TQ KE KE + + Q K + E +PA+ N
Sbjct: 1093 KETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPT 1152

Query: 260 IQPKHLPNSKQPHSQRGSKAQE 281
+ N K+P SQ + A
Sbjct: 1153 V------NIKEPQSQTNTTADT 1168


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_00595  PHPHTRNFRASE  294    4e-92    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 294 bits (755), Expect = 4e-92
Identities = 106/446 (23%), Positives = 187/446 (41%), Gaps = 68/446 (15%)

Query: 388 DLEHMNSFKEGEILVTDN-TDPDWEPCMKK-ASAVITNRGGRTCHAAIVAREIGVPAIVG 445
+ + + E +++ ++ T D K+ T+ GGRT H+AI++R + +PA+VG
Sbjct: 146 ETGSLATIAEETVIIAEDLTPSDTAQLNKQFVKGFATDIGGRTSHSAIMSRSLEIPAVVG 205

Query: 446 VSGATDSLYTGMEITVSCAEGE---------EGYVYAGIYEHEIERVELSNMQETQT--- 493
T+ + G + V EG E ++ E + + +
Sbjct: 206 TKEVTEKIQHGDMVIVDGIEGIVIVNPTEEEVKAYEEKRAAFEKQKQEWAKLVGEPSTTK 265

Query: 494 -----KIYINIGNPEKAFGFSQLPNHGVGLARMEMIILNQIKAHPLALVDLHHKKSVKEK 548
++ NIG P+ G G+GL R E + +++ + P
Sbjct: 266 DGAHVELAANIGTPKDVDGVLANGGEGIGLYRTEFLYMDRDQ-LPTE------------- 311

Query: 549 NEIENLMAGYANPKDFFVKKIAEGIGMISAAFYPKPVIVRTSDFKSNEYMRMLGGSSYEP 608
E Y K++ + KPV++RT D ++ + L P
Sbjct: 312 ---EEQFEAY--------KEVVQ-------RMDGKPVVIRTLDIGGDKELSYL----QLP 349

Query: 609 NEENPMLGYRGASRYYSESYNEAFSWECEALALVREEMGLTNMKVMIPFLRTIEEGKKVL 668
E NP LG+R + F + AL N+KVM P + T+EE ++
Sbjct: 350 KELNPFLGFRAIRLCLE--KQDIFRTQLRALL---RASTYGNLKVMFPMIATLEELRQAK 404

Query: 669 EILRKNNLESGKNG------LEIYIMCELPVNVILADDFLSLFDGFSIGSNDLTQLTLGV 722
I+++ + G +E+ IM E+P + A+ F D FSIG+NDL Q T+
Sbjct: 405 AIMQEEKDKLLSEGVDVSDSIEVGIMVEIPSTAVAANLFAKEVDFFSIGTNDLIQYTMAA 464

Query: 723 DRDSELVSHVFDERNEAMLKMFKKAIEACKRHNKYCGICGQAPSDYPEVTEFLVKEGITS 782
DR +E VS+++ + A+L++ I+A K+ G+CG+ D L+ G+
Sbjct: 465 DRMNERVSYLYQPYHPAILRLVDMVIKAAHSEGKWVGMCGEMAGD-EVAIPLLLGLGLDE 523

Query: 783 ISLNPDSVIPTWNAVAKLE-KELKEH 807
S++ S++P + + KL +ELK
Sbjct: 524 FSMSATSILPARSQLLKLSKEELKPF 549


20  C730_01625  C730_01655  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
C730_01625   3      11       -1.293641  guanylate kinase
C730_01630   3      11       -2.128224  poly E-rich protein
C730_01635   1      12       -2.108776  nuclease NucT
C730_01640   1      14       -1.889408  hypothetical protein
C730_01645   3      14       -1.416985  flagellar basal body L-ring protein
C730_01650   2      13       -1.244682  CMP-N-acetylneuraminic acid synthetase
C730_01655   2      13       -0.733797  flagellar protein FlaG
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
C730_01625  PF05272  29     0.011    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.011
Identities = 9/18 (50%), Positives = 11/18 (61%)

Query: 8 LILSGPSGAGKSTLTKYL 25
++L G G GKSTL L
Sbjct: 599 VVLEGTGGIGKSTLINTL 616


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C730_01630  IGASERPTASE  60     2e-11    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 60.1 bits (145), Expect = 2e-11
Identities = 51/266 (19%), Positives = 87/266 (32%), Gaps = 19/266 (7%)

Query: 140 ELENLGDLEALVQEEPNNEEQLLPTLNDQEEKEEVKEEE------KEEVKEEEKEEVKEE 193
E+E N Q +E + +E E E V E
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAEN 1043

Query: 194 EKEEVKETPQEEKKPKDDETQEGETLKDKEVSKELEA-PQELEIPKEETQEQDPIKEETQ 252
K+E K + E+ + Q E KE ++A Q E+ + + + +T
Sbjct: 1044 SKQESKTVEKNEQDATETTAQNRE--VAKEAKSNVKANTQTNEVAQ---SGSETKETQTT 1098

Query: 253 ENKEEKQEKTQDSPSAQELEAMQELVKEIQENSNGQENKEKTQESAEIPQDKEIQEVVTE 312
E KE + ++ E E QE+ K + S QE E Q AE ++ + + E
Sbjct: 1099 ETKETATVEKEEKAKV-ETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKE 1157

Query: 313 KTQAQ----ELEVPKEKTQESAEALQETQAHELEKQEIAETPQDVEIPQSQDKEVQELE- 367
+ E P ++T + E + E P++ +Q E
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSN 1217

Query: 368 IPKEETQENTET-PQDVETPQEKETQ 392
PK + + + P +VE
Sbjct: 1218 KPKNRHRRSVRSVPHNVEPATTSSND 1243



Score = 54.3 bits (130), Expect = 1e-09
Identities = 39/252 (15%), Positives = 87/252 (34%), Gaps = 15/252 (5%)

Query: 257 EKQEKTQDSPSAQELEAMQELVKEIQENSNGQENKEKTQESAEIPQDKEIQEVVTEKTQA 316
EK+ +T D+ + +Q V + N+ ++ P +
Sbjct: 986 EKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSK 1045

Query: 317 QELEVPKEKTQESAEAL-QETQAHELEKQEIAETPQDVEIPQSQDKE------------- 362
QE + ++ Q++ E Q + + K + Q E+ QS +
Sbjct: 1046 QESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETAT 1105

Query: 363 VQELEIPKEETQENTETPQ-DVETPQEKETQEDHYESIEDIPEPVMAKAMGEELPFLNEA 421
V++ E K ET++ E P+ + ++E E E E + E N
Sbjct: 1106 VEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTT 1165

Query: 422 VAKIPNNENDTETPKESVTETSKNENNTETPQEKEESDKTSSPLELRLNLQDLLKSLNQE 481
+ + ++ VTE++ + E + ++ + + K+ ++
Sbjct: 1166 ADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRR 1225

Query: 482 SLKSLLENKTLS 493
S++S+ N +
Sbjct: 1226 SVRSVPHNVEPA 1237



Score = 50.1 bits (119), Expect = 2e-08
Identities = 27/165 (16%), Positives = 57/165 (34%), Gaps = 3/165 (1%)

Query: 168 QEEKEEVKEEEKEEVKEEEKEEVKEEEKEEVKETPQEEKKPKDDETQEGETLKDKEVSKE 227
E +E + E +E EKEE + E E+ +E P+ + + Q E ++E
Sbjct: 1089 GSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARE 1148

Query: 228 LEAPQELEIPKEETQEQDPIKEETQENKEEKQEKTQDSPSAQELEAMQELVKEIQENSNG 287
+ ++ P+ +T + Q KE Q + + +V+ + +
Sbjct: 1149 NDPTVNIKEPQSQTNTT---ADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPA 1205

Query: 288 QENKEKTQESAEIPQDKEIQEVVTEKTQAQELEVPKEKTQESAEA 332
ES+ P+++ + V + + A
Sbjct: 1206 TTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALC 1250


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_01645  FLGLRINGFLGH  195    9e-65    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 195 bits (496), Expect = 9e-65
Identities = 52/172 (30%), Positives = 84/172 (48%), Gaps = 18/172 (10%)

Query: 56 GERPLFADRRAMKPNDLITIIVSEKASANYSSS----KDYKSASGGNSTPPRLTYNGLDE 111
G +PLF DRR D +TI++ E SA+ SSS +D K+ G ++ P L GL
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYL--QGLFG 118

Query: 112 RKKKEAEYLDDKNNYNFTKSSNNTNFKGGGSQKKSEDLEIVLSARIIKVLENGNYFIYGN 171
+ + E S F G G S L+ + +VL NGN + G
Sbjct: 119 NARADVEA------------SGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGE 166

Query: 172 KEVLVDGEKQILKVSGVIRPYDIERNNTIQSKFLADAKIEYTNLGHLSDSNK 223
K++ ++ + ++ SGV+ P I +NT+ S +ADA+IEY G+++++
Sbjct: 167 KQIAINQGTEFIRFSGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQN 218


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_01655  SACTRNSFRASE  27     0.029    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 27.2 bits (60), Expect = 0.029
Identities = 14/48 (29%), Positives = 20/48 (41%), Gaps = 3/48 (6%)

Query: 103 GETILKALEFIAFE---EFQLHSLHLEVMENNFKAIAFYEKNHYELEG 147
+ + AL A E E L LE + N A FY K+H+ +
Sbjct: 103 KKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGA 150


21  C730_01780  C730_01835  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
C730_01780  -3      10        0.841997  flagellar MS-ring protein
C730_01785  -3      10        1.436763  flagellar motor switch protein G
C730_01790  -2       9        0.556561  flagellar assembly protein H
C730_01795  -2       9        0.911869  1-deoxy-D-xylulose-5-phosphate synthase
C730_01800  -2      11        0.280629  GTP-binding protein LepA
C730_01805  -2      12       -0.844997  hypothetical protein
C730_01810  -1      12        0.063379  short chain alcohol dehydrogenase
C730_01815  -1      11       -0.357484  hypothetical protein
C730_01825   1      11        0.180075  **UDP-glucose 4-epimerase
C730_01830   1      12       -0.200195  tRNA pseudouridine synthase A
C730_01835   0      12       -1.357277  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_01780  FLGMRINGFLIF  559    0.0      Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 559 bits (1443), Expect = 0.0
Identities = 178/582 (30%), Positives = 294/582 (50%), Gaps = 66/582 (11%)

Query: 11 VDFFIKLNKKQKIALIAAGVLITALLVFLLLYPFKEKDYTQGGYGVLFEGLDSSDNALIL 70
+++ +L +I LI AG A++V ++L+ K DY LF L D I+
Sbjct: 13 LEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWA-KTPDYR-----TLFSNLSDQDGGAIV 66

Query: 71 QHLQQNQIPYKVSKDD-TILIPKDKVYEERITLASQGIPKTSKVGFEIFDTKDFGATDFD 129
L Q IPY+ + I +P DKV+E R+ LA QG+PK VGFE+ D + FG + F
Sbjct: 67 AQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQFS 126

Query: 130 QNIKLIRAIEGELSRTIESLNPILKANVHIAIPKDSVFVAKEVPPSASVMLKLKPDMKLS 189
+ + RA+EGEL+RTIE+L P+ A VH+A+PK S+FV ++ PSASV + L+P L
Sbjct: 127 EQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRALD 186

Query: 190 PTQILGIKNLIAAAVPKLTIENVKIVNENGESIGEGDILENSKELALEQLHYKQNFENIL 249
QI + +L+++AV L NV +V+++G + + + + ++L QL + + E+ +
Sbjct: 187 EGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNT--SGRDLNDAQLKFANDVESRI 244

Query: 250 ENKIVNILAPIVGGKNKVVARVNAEFDFSQKKSTKETFDPNN-----VVRSEQNLEEKKE 304
+ +I IL+PIVG N V A+V A+ DF+ K+ T+E + PN +RS Q ++
Sbjct: 245 QRRIEAILSPIVGNGN-VHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQV 303

Query: 305 GASKKQVGGVPGVVSN-IGPVQGLKDNKEPEKYEKSQN---------------------- 341
GA GGVPG +SN P P + +QN
Sbjct: 304 GAGYP--GGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNE 361

Query: 342 TTNYEVGKTISEIKGEFGTLVRLNAAVVVDGKYKIALKDGVNTLEYEPLSDESLQKINAL 401
T+NYEV +TI K G + RL+ AVVV+ K L DG + PL+ + +++I L
Sbjct: 362 TSNYEVDRTIRHTKMNVGDIERLSVAVVVNYK---TLADG----KPLPLTADQMKQIEDL 414

Query: 402 VKQAIGYNQNRGDDVAVSNFEFNPMAPVIDNATLSEKIMHKTQKILGSFTPLIKYILVFI 461
++A+G++ RGD + V N F+ + T E + Q + +++LV +
Sbjct: 415 TREAMGFSDKRGDTLNVVNSPFSAVDN-----TGGELPFWQQQSFIDQLLAAGRWLLVLV 469

Query: 462 VLFIFYKKVIVPFSERMLEVVPDEYKEVKSMFEEMDEEEDELNKLGDLRKKVEDQLGLNA 521
V +I ++K + P R +E ++ + E + E L+K L+++ +Q
Sbjct: 470 VAWILWRKAVRPQLTRRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQ----- 524

Query: 522 SFSEEEVRYEIILEKIRGTLKERPDEIAMLFKLLIKDEISSD 563
+ E++ ++IR E D + L+I+ +S+D
Sbjct: 525 -----RLGAEVMSQRIR----EMSDNDPRVVALVIRQWMSND 557


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_01785  FLGMOTORFLIG  351    e-123    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 351 bits (902), Expect = e-123
Identities = 122/338 (36%), Positives = 209/338 (61%), Gaps = 4/338 (1%)

Query: 8 KQKAQLDELSMSEKIAILLIQVGEDTTGEILRHLDIDSITEISKQIVQLNGTDKQIGAAV 67
K+ + L+ +K AILL+ +G + + ++ ++L + I ++ +I +L ++ V
Sbjct: 7 KEILDVSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNV 66

Query: 68 LEEFFAIFQSNQYINTGGLEYARELLTRTLGSEEAKKVMDKLTKSLQTQKNFAYLGKIKP 127
L EF + + ++I GG++YARELL ++LG+++A +++ L +LQ+ + F ++ + P
Sbjct: 67 LLEFKELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQS-RPFEFVRRADP 125

Query: 128 QQLADFIINEHPQTIALILAHMEAPNAAETLSYFPDEMKAEISIRMANLGEISPQVVKRV 187
+ +FI EHPQTIALIL++++ A+ LS P E++ ++ R+A + SP+VV+ V
Sbjct: 126 ANILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREV 185

Query: 188 STVLENKLESLTSYK-IEVGGLRAVAEIFNRLGQKSAKTTLARIESVDNKLAGAIKEMMF 246
VLE KL SL+S GG+ V EI N +K+ K + +E D +LA IK+ MF
Sbjct: 186 ERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMF 245

Query: 247 TFEDIVKLDNFAIREILKVADKKDLSLALKTSTKDLTDKFLNNMSSRAAEQFVEEMQYLG 306
FEDIV LD+ +I+ +L+ D ++L+ ALK+ + +K NMS RAA E+M++LG
Sbjct: 246 VFEDIVLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLG 305

Query: 307 AVKIKDVDVAQRKIIEIVQSLQEKG--VIQTGEEEDVI 342
+ KDV+ +Q+KI+ +++ L+E+G VI G EEDV+
Sbjct: 306 PTRRKDVEESQQKIVSLIRKLEEQGEIVISRGGEEDVL 343



Score = 30.2 bits (68), Expect = 0.010
Identities = 20/102 (19%), Positives = 41/102 (40%), Gaps = 3/102 (2%)

Query: 4 KLTPKQKAQLDELSMSEKIAILLIQVGEDTTGEILRHLDIDSITEISKQIVQLNGTDKQI 63
+ P + + IA++L + IL L + T ++++I ++ T ++
Sbjct: 122 RADPANILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEV 181

Query: 64 GAA---VLEEFFAIFQSNQYINTGGLEYARELLTRTLGSEEA 102
VLE+ A S Y + GG++ E++ E
Sbjct: 182 VREVERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEK 223


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
C730_01800  TCRTETOQM  141    8e-38    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 141 bits (358), Expect = 8e-38
Identities = 100/437 (22%), Positives = 176/437 (40%), Gaps = 85/437 (19%)

Query: 3 NIRNFSIIAHIDHGKSTLADCLISECNAIS---NREMKSQVMDTMDIEKERGITIKAQSV 59
I N ++AH+D GK+TL + L+ AI+ + + + D +E++RGITI+
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 60 RLNYTFKGEDYVLNLIDTPGHVDFSYEVSRSLCSCEGALLVVDATQGVEAQTIANTYIAL 119
+F+ E+ +N+IDTPGH+DF EV RSL +GA+L++ A GV+AQT +
Sbjct: 62 ----SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALR 117

Query: 120 DNHLEILPVINKIDLPNANVLEVKQDIEDTIGIDCSNTNEVSAKARLGIKD--------- 170
+ + INKID ++ V QDI++ + + +V + + +
Sbjct: 118 KMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDT 177

Query: 171 -------LLEKIITTIPAPSGDFNAPLKALIYD-------------------------SW 198
LLEK ++ + + ++ +
Sbjct: 178 VIEGNDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNK 237

Query: 199 F--------------------DNYLGALALVRIMDGSINTEQEILVMGTGKKHGVLGLYY 238
F LA +R+ G ++ + + K + +Y
Sbjct: 238 FYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI-KITEMYT 296

Query: 239 PNPLKKIPTKSLECGEIGIV---SLGLKSVTDIAVGDTLTDAKNPTSKPIEGFMPAKPFV 295
+ GEI I+ L L SV +GDT P + IE P +
Sbjct: 297 SINGELCKIDKAYSGEIVILQNEFLKLNSV----LGDTKLL---PQRERIEN---PLPLL 346

Query: 296 FAGLYPIETDRFEDLREALLKLQLNDCALNFEPESSVALGFGFRVGFLGLLHMEVIKERL 355
+ P + + E L +ALL++ +D L + +S+ + FLG + MEV L
Sbjct: 347 QTTVEPSKPQQREMLLDALLEISDSDPLLRYYVDSATH---EIILSFLGKVQMEVTCALL 403

Query: 356 EREFGLNLIATAPTVVY 372
+ ++ + + PTV+Y
Sbjct: 404 QEKYHVEIEIKEPTVIY 420



Score = 31.4 bits (71), Expect = 0.011
Identities = 15/82 (18%), Positives = 28/82 (34%), Gaps = 2/82 (2%)

Query: 399 IKEPFVRATIITPSEFLGNLMQLLNNKRGIQEKMEYLNQSRVMLTYSLPSNEIVMDFYDK 458
+ EP++ I P E+L + L + V+L+ +P+ I ++
Sbjct: 535 LLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCI-QEYRSD 592

Query: 459 LKSCTEGYASFDYEPIENREAN 480
L T G + E
Sbjct: 593 LTFFTNGRSVCLTELKGYHVTT 614


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_01810  DHBDHDRGNASE  83     5e-21    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 82.8 bits (204), Expect = 5e-21
Identities = 55/233 (23%), Positives = 96/233 (41%), Gaps = 21/233 (9%)

Query: 4 ILVSGATSGFGLEIAKAFLQKNHVVFGTGRRKENLQKL------QLAYPKHFIPLCFDLQ 57
++GA G G +A+ + + E L+K+ + + + F P D++
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAF-PA--DVR 67

Query: 58 NKLETKRALEAIFSMTDRIDALINNAGLALGLNKAYECELDDWEIMIDTNIKGLLHLTRL 117
+ I ID L+N AG+ L + ++WE N G+ + +R
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGV-LRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 118 ILPSMIEHDQGTIINLGSIAGTYAYPGGNVYGASKAFVKQFSLNLRADLAGTNIRVSNVE 177
+ M++ G+I+ +GS Y +SKA F+ L +LA NIR + V
Sbjct: 127 VSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVS 186

Query: 178 PGLCGETEFSMVRFKGDKIKAQSV------YENTIYL----KPQDIANIVLWI 220
PG ET+ + + Q + ++ I L KP DIA+ VL++
Sbjct: 187 PG-STETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFL 238


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_01825  NUCEPIMERASE  120    2e-33    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 120 bits (302), Expect = 2e-33
Identities = 62/341 (18%), Positives = 131/341 (38%), Gaps = 48/341 (14%)

Query: 1 MALLFTGACGYIGSHTARAFLEKTKENIIIVDDLSTGF---LEHLKALEHYYPNRVVFIQ 57
M L TGA G+IG H ++ LE + ++ +D+L+ + L+ + P F +
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQ-VVGIDNLNDYYDVSLKQARLELLAQPG-FQFHK 58

Query: 58 ANLNETHKLDAFLNKQQLKDPIEAILHFGAKISVEESTHLPLEYYTNNTLNTLELVKLCL 117
+L + + E + +++V S P Y +N L +++ C
Sbjct: 59 IDLADREGMTDLFASGH----FERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCR 114

Query: 118 KHAIKRFIFSSTAVVYGKSSS-SLNEESPLN-PINPYGASKMMSERILLDTSKIADFKCV 175
+ I+ +++S++ VYG + + + ++ P++ Y A+K +E + S +
Sbjct: 115 HNKIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPAT 174

Query: 176 ILRYFNVAGACMHNDYTTPYTLGQRTLNATHLIKIACECAVGKRKKMGIFGTNYPTRDGT 235
LR+F V G D + + L GK ++ G
Sbjct: 175 GLRFFTVYGPWGRPD-MALFKFTKAMLE-------------GKSID--VYN------YGK 212

Query: 236 CIRDYIHVDDLANAHLASYQTLLEKNKS---------------EIYNVGYNQGHSVKEVI 280
RD+ ++DD+A A + + + +YN+G + + + I
Sbjct: 213 MKRDFTYIDDIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYI 272

Query: 281 EKVKEISNNDFLVEILDKRQGDPASLIANNAKILQNTSFKP 321
+ +++ + +L + GD A+ + + F P
Sbjct: 273 QALEDALGIEAKKNMLPLQPGDVLETSADTKALYEVIGFTP 313


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
C730_01835  RTXTOXINA  29     0.033    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 29.2 bits (65), Expect = 0.033
Identities = 31/133 (23%), Positives = 53/133 (39%), Gaps = 24/133 (18%)

Query: 100 VFAPLSLLVSAILLVFSLILIPTSKSTYYGFLRQKKDKIDINIRAGEFGQKLGDWLV--- 156
V AP+S LV A+ + S IL + ++ + + D I E+ +K G
Sbjct: 391 VGAPVSALVGAVTGIISGILEASKQAMFEHVASKMADVIA------EWEKKHGKNYFENG 444

Query: 157 YVDKTKNNSYDNLVLFS--NKSLSQESFILAQKGNINNQNGVFEL--------NLYNGHA 206
Y + DN + S NK S E +L + + + G EL +G +
Sbjct: 445 YDARHAAFLEDNFKILSQYNKEYSVERSVLITQQHWDTLIG--ELAGVTRNGDKTLSGKS 502

Query: 207 Y---FTQGDKMRK 216
Y + +G ++ K
Sbjct: 503 YIDYYEEGKRLEK 515


22  C730_02990  C730_03020  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
C730_02990   2      13       -0.946932  hypothetical protein
C730_02995   2      14       -0.246305  hypothetical protein
C730_03000   1      16       -0.337061  dihydroorotase
C730_03005   0      16       -2.568890  hypothetical protein
C730_03010  -2      14       -2.893453  hypothetical protein
C730_03015  -2      14       -2.297431  flagellar motor switch protein
C730_03020  -1      12       -1.052315  endonuclease III
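
The ORF rows above carry five fields: locus tag, dinucleotide bias (DNBias), codon bias (CDNBias), %GC bias and product. Below is a minimal Python sketch for splitting one such row into typed fields, whether or not the fields are separated by whitespace. It assumes, based only on the rows shown on this page, that DNBias is a single signed digit, CDNBias has one or two digits, and %GCBias has one integer digit followed by six decimals. `OrfRow` and `parse_orf_row` are hypothetical names for illustration and are not part of PredictBias.

```python
import re
from typing import NamedTuple


class OrfRow(NamedTuple):
    locus_tag: str
    dn_bias: int
    cdn_bias: int
    gc_bias: float
    product: str


# Assumed layout (inferred from this page, not from PredictBias documentation):
# locus tag, signed single-digit DNBias, 1-2 digit CDNBias,
# %GCBias written as d.dddddd, then the free-text product name.
# Whitespace between fields is optional so run-together rows also parse.
ROW_RE = re.compile(
    r"^(C730_\d{5})\s*(-?\d)\s*(\d{1,2})\s*(-?\d\.\d{6})\s*(.*)$"
)


def parse_orf_row(line: str) -> OrfRow:
    """Split one ORF row into its five fields."""
    match = ROW_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised ORF row: {line!r}")
    tag, dn, cdn, gc, product = match.groups()
    return OrfRow(tag, int(dn), int(cdn), float(gc), product)


if __name__ == "__main__":
    print(parse_orf_row("C730_03010-214-2.893453hypothetical protein"))
```

The regular expression relies on backtracking to separate CDNBias from the integer digit of %GCBias, so a row whose values fall outside the assumed digit widths would be split incorrectly or rejected.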
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_02990  TYPE3IMSPROT  30     0.006    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 29.7 bits (67), Expect = 0.006
Identities = 19/64 (29%), Positives = 30/64 (46%), Gaps = 4/64 (6%)

Query: 88 LQSYSVMLFFNLLLLTDVLGFLPFSIYHHFMASLIFSALFCSSLFLSSPLLGVIALVALS 147
L Y F L+L+ +LPFS S + + +L PLL V AL+A++
Sbjct: 45 LSDYYFEHFSKLMLIPAEQSYLPFSQ----ALSYVVDNVLLEFFYLCFPLLTVAALMAIA 100

Query: 148 SSLL 151
S ++
Sbjct: 101 SHVV 104


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C730_03005  TONBPROTEIN  53     3e-10    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 52.7 bits (126), Expect = 3e-10
Identities = 24/52 (46%), Positives = 27/52 (51%)

Query: 91 PQKPPTPPTPPTPPTPPTPPKPIEKPKPEPKPKPKPEPKKPNHKHKALKKVE 142
P P P P P P P IEKPKP+PKPKPKP K + +K VE
Sbjct: 67 PVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVE 118



Score = 46.5 bits (110), Expect = 4e-08
Identities = 27/74 (36%), Positives = 32/74 (43%), Gaps = 8/74 (10%)

Query: 83 APKPTLAGPQKPPTPPTPPTPPTPPTPPKPIEKPKPEPKPKPKPEPKKPNHKHKALKKVE 142
A Q PP P P P P P P E P KPKPKP+PK K V+
Sbjct: 53 ADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKP--------KPVK 104

Query: 143 KVEEKKVVEEKKEE 156
KV+E+ + K E
Sbjct: 105 KVQEQPKRDVKPVE 118



Score = 45.4 bits (107), Expect = 9e-08
Identities = 21/65 (32%), Positives = 27/65 (41%), Gaps = 1/65 (1%)

Query: 83 APKPTLAGPQKPPTPPTPPTPPTPPTPPKPIEKPKPEPKPKPKPEPK-KPNHKHKALKKV 141
P P P P P P PPK +PKPKPKP+PK + + + V
Sbjct: 55 LEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDV 114

Query: 142 EKVEE 146
+ VE
Sbjct: 115 KPVES 119



Score = 38.8 bits (90), Expect = 1e-05
Identities = 43/218 (19%), Positives = 80/218 (36%), Gaps = 38/218 (17%)

Query: 101 PTPPTPPTPPKPIEKPKPEPKPKPKPEPKKPNHKHKALKKVEKVEEKKVVEEKKEEKKIV 160
P PP +P +P EP+P+P+P P+ P +E VV EK + K
Sbjct: 52 PADLEPPQAVQPPPEPVVEPEPEPEPIPEPP-------------KEAPVVIEKPKPKPKP 98

Query: 161 EQKVEQKVEQKKIEEKKPVKKEFDPNQLSFLPKEVAPPRQENNKGLDNQTRRDIDELYGE 220
+ K +KV+++ + KPV E P N T +
Sbjct: 99 KPKPVKKVQEQPKRDVKPV--------------ESRPASPFENTAPARLTSSTATAATSK 144

Query: 221 EFGDLGTAEKDFIRNNLRDIGRITQKYLEYPQVAAYLGQDGTNAVEFYLHPNGDITDLKI 280
+ + + RN + YP A L +G V+F + P+G + +++I
Sbjct: 145 PVTSVASGPRALSRNQPQ-----------YPARAQALRIEGQVKVKFDVTPDGRVDNVQI 193

Query: 281 IIGSEYKMLDDNTLKTIQIAYKDYPRPKTKTLIRIRVR 318
+ M + ++ + +P + ++ I +
Sbjct: 194 LSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFK 231



Score = 35.0 bits (80), Expect = 3e-04
Identities = 12/42 (28%), Positives = 17/42 (40%)

Query: 91 PQKPPTPPTPPTPPTPPTPPKPIEKPKPEPKPKPKPEPKKPN 132
P+ P P P P PKP K + +PK +P +
Sbjct: 79 PEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESR 120



Score = 29.6 bits (66), Expect = 0.013
Identities = 13/48 (27%), Positives = 19/48 (39%)

Query: 84 PKPTLAGPQKPPTPPTPPTPPTPPTPPKPIEKPKPEPKPKPKPEPKKP 131
+ P K P P PKP++K + +PK KP +P
Sbjct: 74 EPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESRP 121


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_03015  FLGMOTORFLIN  99     2e-30    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 99 bits (249), Expect = 2e-30
Identities = 25/77 (32%), Positives = 47/77 (61%)

Query: 34 LICDYKNLLDMEIVFSAELGSTQIPLLQILRFEKGSVIDLQKPAGESVDTFVNGRVIGKG 93
+ D ++D+ + + ELG T++ + ++LR +GSV+ L AGE +D +NG +I +G
Sbjct: 50 AMQDIDLIMDIPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQG 109

Query: 94 EVMVFERNLAIRLNEIL 110
EV+V +R+ +I+
Sbjct: 110 EVVVVADKYGVRITDII 126


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
C730_03020  OMS28PORIN  28     0.031    OMS28 porin signature.
		>OMS28PORIN#OMS28 porin signature.

Length = 257

Score = 27.8 bits (61), Expect = 0.031
Identities = 28/112 (25%), Positives = 53/112 (47%), Gaps = 11/112 (9%)

Query: 25 NQTTELRHKNPYELLVATILSAQCTDARVNQITPKLFEKYPSVNDLAL-----ASLEEVK 79
N+ E+ K E A ++ + T QI + K P+ +L L A +E+VK
Sbjct: 132 NKVVEMSKKAVQETQKAVSVAGEATFLIEKQI---MLNKSPNNKELELTKEEFAKVEQVK 188

Query: 80 EIIKSVSYFNNKSKHLISMAQKVVRDFKGVIPSTQKELMSLDGVGQKTANVV 131
E + + +++ + AQKV+ G+ PS + ++++ V + +NVV
Sbjct: 189 ETLMASERALDET---VQEAQKVLNMVNGLNPSNKDQVLAKKDVAKAISNVV 237


23  C730_04640  C730_04680  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias   Product
C730_04640   1      15       2.064982  acetate kinase
C730_04645   1      15       2.534744  acetate kinase A/propionate kinase 2
C730_04650   1      15       1.707839  phosphotransacetylase
C730_04665   2      14       0.862228  phosphotransacetylase (pta)
C730_04670   0      14       0.733773  hypothetical protein
C730_04675  -2      13       1.459920  flagellar basal body rod modification protein
C730_04680  -1      14       1.875005  flagellar hook protein FlgE
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_04640  ACETATEKNASE  123    5e-37    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 123 bits (310), Expect = 5e-37
Identities = 48/117 (41%), Positives = 72/117 (61%), Gaps = 2/117 (1%)

Query: 1 MRNIEARK-EKGDKEAKLAFEMCAYRIKKHIGAYMVVLKKVDAIIFTGGLGENYSALRES 59
R++E + GDK A+LA + AYR+KK IG+Y + VD I+FT G+GEN +RE
Sbjct: 283 FRDLEDAAFKNGDKRAQLALNVFAYRVKKTIGSYAAAMGGVDVIVFTAGIGENGPEIREF 342

Query: 60 VCEGLENLGIALCKPTNDNPGSGLVNLSQPDAKIQILRIPTDEELEIALQTKKVLEK 116
+ +GLE LG L K N G + +S D+K+ ++ +PT+EE IA T+K++E
Sbjct: 343 ILDGLEFLGFKLDKEKNKVRGEEAI-ISTADSKVNVMVVPTNEEYMIAKDTEKIVES 398


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_04645  ACETATEKNASE  353    e-123    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 353 bits (907), Expect = e-123
Identities = 142/282 (50%), Positives = 192/282 (68%), Gaps = 6/282 (2%)

Query: 1 MEILVLNLGSSSIKFKLFDMKENKPLASGLAEKIGEEIGQLKIKSHLHHNDQELKEKFVI 60
M+ILV+N GSSS+K++L + K+ LA GLAE+IG L N +++K K +
Sbjct: 1 MKILVINCGSSSLKYQLIESKDGNVLAKGLAERIGINDSLLTHN----ANGEKIKIKKDM 56

Query: 61 KDHASGLLMIRENLT--KMGIIKDFNQIDAIGHRVVQGGDKFHAPVLVNEKVMQEIGNLS 118
KDH + ++ + L G+IKD ++IDA+GHRVV GG+ F + VL+ + V++ I +
Sbjct: 57 KDHKDAIKLVLDALVNSDYGVIKDMSEIDAVGHRVVHGGEYFTSSVLITDDVLKAITDCI 116

Query: 119 ILAPLHNPANLAGIEFVQKAHPHIPQIAVFDTAFHATMPSYAYMYALPYELYEKYQIRHY 178
LAPLHNPAN+ GI+ + P +P +AVFDTAFH TMP YAY+Y +PYE Y KY+IR Y
Sbjct: 117 ELAPLHNPANIEGIKACTQIMPDVPMVAVFDTAFHQTMPDYAYLYPIPYEYYTKYKIRKY 176

Query: 179 GFHRTSHHYVAKEAAKFLNTAYEEFNAISLHLGNGSSAAAIQKGKSVDTSMGLTPLEGLI 238
GFH TSH YV++ AA+ LN E I+ HLGNGSS AA++ GKS+DTSMG TPLEGL
Sbjct: 177 GFHGTSHKYVSQRAAEILNKPIESLKIITCHLGNGSSIAAVKNGKSIDTSMGFTPLEGLA 236

Query: 239 MGTRCGDIDPTVVEYTAQCANKSLEEVMKMLNHESGLKGICG 280
MGTR G IDP+++ Y + N S EEV+ +LN +SG+ GI G
Sbjct: 237 MGTRSGSIDPSIISYLMEKENISAEEVVNILNKKSGVYGISG 278
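
Each alignment above is introduced by a `Score = ..., Expect = ...` line and an `Identities = ..., Positives = ...` line. The following Python sketch shows one way these two summary lines could be read into numbers. The helper names `parse_expect` and `parse_hsp_summary` are hypothetical. Reading an Expect value printed as `e-123` as 1e-123 is an assumption about the report's number formatting, not something the page states.

```python
import re

# "Score = 353 bits (907), Expect = e-123"
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)
# "Identities = 142/282 (50%), Positives = 192/282 (68%), Gaps = 6/282 (2%)"
FRACTION_RE = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)")


def parse_expect(text: str) -> float:
    """Convert an Expect string to float; 'e-123' is read as 1e-123 (assumption)."""
    return float("1" + text if text.startswith("e") else text)


def parse_hsp_summary(score_line: str, stats_line: str) -> dict:
    """Return bit score, raw score, Expect and the count fractions of one alignment."""
    match = SCORE_RE.search(score_line)
    if match is None:
        raise ValueError(f"unrecognised score line: {score_line!r}")
    summary = {
        "bits": float(match.group(1)),
        "raw_score": int(match.group(2)),
        "expect": parse_expect(match.group(3)),
    }
    # Gaps is absent when the alignment has no gaps, so only found keys are set.
    for name, numerator, denominator in FRACTION_RE.findall(stats_line):
        summary[name.lower()] = (int(numerator), int(denominator))
    return summary


if __name__ == "__main__":
    print(parse_hsp_summary(
        "Score = 353 bits (907), Expect = e-123",
        "Identities = 142/282 (50%), Positives = 192/282 (68%), Gaps = 6/282 (2%)",
    ))
```

Keeping the raw numerator and denominator, rather than the rounded percentage, lets the percentages printed in the report be recomputed and checked.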


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C730_04670  IGASERPTASE  36     5e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 35.8 bits (82), Expect = 5e-04
Identities = 40/227 (17%), Positives = 70/227 (30%), Gaps = 11/227 (4%)

Query: 294 KKSEKTPIHAKTQTTAPSATPENAPKIPLKTPPLMPLIGANPPPNDNIPTPLEKEEKAKE 353
+ Q PS N + P+ P A TP E E E
Sbjct: 992 VDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPA---------TPSETTETVAE 1042

Query: 354 ASDNKEKTKETSNSAQNAQNTQASDKTSDNKSTAPKETIKHFTQQLKQEIQEYKPPMSRI 413
S + KT E + Q + + KS T + Q E +E + ++
Sbjct: 1043 NSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKE 1102

Query: 414 SMDLFPKELGKVEVIIQKVGKNLKVSVISHNNSLQTFLDNQQDLKNSLNALGFEGVDLSF 473
+ + +E KVE +K + KV+ Q+ Q N +
Sbjct: 1103 TATVEKEEKAKVET--EKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQS 1160

Query: 474 SQDSSKEQQAPKDQPKEPFKEQELTPLKENALKSYQENTDNENQETS 520
+++ + + P + ++ N S EN +N T+
Sbjct: 1161 QTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATT 1207



Score = 30.4 bits (68), Expect = 0.025
Identities = 48/240 (20%), Positives = 82/240 (34%), Gaps = 18/240 (7%)

Query: 184 PKTLKDIQTLSQKHDLNASNIQAATTPENKN------PLNASDQLALKTTQTPTNHTLAK 237
P+ K QT+ + +NIQA N A T + T T+A+
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAE 1042

Query: 238 NDAKNTANLSSVLQSLEKKEPQNKEHANPLNNEKKTPPLK--------EALEMNAIKRDK 289
N + + + Q + QN+E A + K E E + +
Sbjct: 1043 NSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKE 1102

Query: 290 TLSKKKSEKTPIHAKTQTTAPSATPENAPKIPLKTPPLMPLIGANP-PPNDNIPTPLEKE 348
T + +K EK + + P T + +PK + A P ND E +
Sbjct: 1103 TATVEKEEKAKVETEKTQEVPKVTSQVSPK---QEQSETVQPQAEPARENDPTVNIKEPQ 1159

Query: 349 EKAKEASDNKEKTKETSNSAQNAQNTQASDKTSDNKSTAPKETIKHFTQQLKQEIQEYKP 408
+ +D ++ KETS++ + + T ++ P+ T TQ KP
Sbjct: 1160 SQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKP 1219


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
C730_04680  FLGHOOKAP1  35     7e-04    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 35.3 bits (81), Expect = 7e-04
Identities = 12/33 (36%), Positives = 20/33 (60%)

Query: 2 NDTLLNAYSGIKTHQFGIDSLSNNIANVNTLGY 34
+ + NA SG+ Q +++ SNNI++ N GY
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGY 33



Score = 33.0 bits (75), Expect = 0.004
Identities = 10/48 (20%), Positives = 20/48 (41%)

Query: 557 IRHKYLETSNVNAGNALTNLILMQRGYSMNARAFGAGDDMIKEAISLK 604
+ ++ S VN NL Q+ Y NA+ + + I+++
Sbjct: 499 LSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546


24  C730_06135  C730_06185  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
C730_06135  -1      15        0.047566  arabinose transporter
C730_06140  -1      16       -1.358603  hypothetical protein
C730_06145  -1      14       -0.646972  alpha-carbonic anhydrase
C730_06150   0      17       -1.331027  hypothetical protein
C730_06155  -1      15        0.172087  hypothetical protein
C730_06160  -3      10        2.186317  aspartate-semialdehyde dehydrogenase
C730_06165  -3      11        2.256036  histidyl-tRNA ligase
C730_06170  -2      12        3.133623  ADP-heptose--LPS heptosyltransferase II
C730_06175   0      11        3.823061  flagellar motility protein
C730_06180   0      12        3.870389  aldo/keto reductase
C730_06185   1      12        4.050063  elongation factor G
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
C730_06135  TCRTETB  49     2e-08    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 49.1 bits (117), Expect = 2e-08
Identities = 32/132 (24%), Positives = 63/132 (47%), Gaps = 1/132 (0%)

Query: 37 LSDIAKSFEMESATVGLMITAYAWVVSLGSLPLMLLSAKIERKRLLLFLFALFILSHILS 96
L DIA F A+ + TA+ S+G+ LS ++ KRLLLF + ++
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 97 ALAWNFW-VLLLSRMGIAFAHSIFWSITASLVIRVAPRNKKQQALGLLALGSSLAMILGL 155
+ +F+ +L+++R + F ++ +V R P+ + +A GL+ ++ +G
Sbjct: 97 FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGP 156

Query: 156 PLGRIIGQILDW 167
+G +I + W
Sbjct: 157 AIGGMIAHYIHW 168


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C730_06150  IGASERPTASE  31     0.011    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.8 bits (69), Expect = 0.011
Identities = 20/86 (23%), Positives = 34/86 (39%), Gaps = 11/86 (12%)

Query: 124 SEQIEL--EQEKQKTSNIETNNQIKVEQEKQKT-------SNIETNNQI--KVEQEQQKT 172
SE E E KQ++ +E N Q E Q SN++ N Q + +
Sbjct: 1034 SETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETK 1093

Query: 173 EQERQKTEQERQKTEQEKQKTIKTQK 198
E + +T++ ++EK K +
Sbjct: 1094 ETQTTETKETATVEKEEKAKVETEKT 1119


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C730_06155  IGASERPTASE  28     0.026    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 28.5 bits (63), Expect = 0.026
Identities = 40/222 (18%), Positives = 66/222 (29%), Gaps = 25/222 (11%)

Query: 3 DKVQDKSKQAEKENQINWWKYSGLTIATSLLL--AACSVGDIDKQIELEQEKKEAENARD 60
+ V + SKQ K + N + T + A +V + E+ Q E + +
Sbjct: 1038 ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQT 1097

Query: 61 RANKSGIELE-QEKQKTIKEQKDLVKKAEQNCQENHGQFFMKKLGIKGGIAIEVEAECKT 119
K +E +EK K E+ V K Q E
Sbjct: 1098 TETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQ---------------SETVQPQ 1142

Query: 120 PKPAKTNQTPIQPKHLPNSKQPHSQRGSKA-QELIAYLQKELESLPYSQKAIAKQVNFYR 178
+PA+ N + N K+P SQ + A E A P ++ N
Sbjct: 1143 AEPARENDPTV------NIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVV 1196

Query: 179 PSSVAYLELDPRDFKVTEEWQKENLKIRSKAQAKMLGNEKPT 220
+ + +E K + R ++ E T
Sbjct: 1197 ENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPAT 1238


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_06165  ANTHRAXTOXNA  31     0.012    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 30.9 bits (69), Expect = 0.012
Identities = 21/58 (36%), Positives = 31/58 (53%), Gaps = 3/58 (5%)

Query: 180 EALRIVDKLEKIGLNGVEEELKKECGLNSNTIKELLELIQIKQNDL--SHAEFFEKIA 235
+ ++KLEK G + E LKKE G+ + I L +K + L HA+ F+KIA
Sbjct: 263 DMFEYMNKLEKGGFEKISESLKKE-GVEKDRIDVLKGEKALKASGLVPEHADAFKKIA 319


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
C730_06185  TCRTETOQM  642    0.0      Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 642 bits (1658), Expect = 0.0
Identities = 179/671 (26%), Positives = 304/671 (45%), Gaps = 66/671 (9%)

Query: 9 RIRNIGIAAHIDAGKTTTSERILFYTGVSHKIGEVHDGAATMDWMEQEKERGITITSAAT 68
+I NIG+ AH+DAGKTT +E +L+ +G ++G V G D E++RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 69 TCFWKDHQINLIDTPGHVDFTIEVERSMRVLDGAVSVFCSVGGVQPQSETVWRQANKYGV 128
+ W++ ++N+IDTPGH+DF EV RS+ VLDGA+ + + GVQ Q+ ++ K G+
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 129 PRIVFVNKMDRIGANFYNVENQIKLRLKANPVPINIPIGAEDTFIGVIDLVQMKAIVWNN 188
P I F+NK+D+ G + V IK +L A V
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVI--------------------------- 154

Query: 189 ETMGAKYDVEEIPSDLLEKAKEYREKLVEAVAEQDEALMEKYLGGEELSIEEIKKGIKAG 248
K VE P+ + E + + V E ++ L+EKY+ G+ L E+++
Sbjct: 155 -----KQKVELYPNMCVTNFTESEQ--WDTVIEGNDDLLEKYMSGKSLEALELEQEESIR 207

Query: 249 CLNMSLVPMLCGSSFKNKGVQTLLDAVIDYLPAPTEVVDIKGIDPKTEEEVFVKSSDDGE 308
N SL P+ GS+ N G+ L++ + + + T E
Sbjct: 208 FHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSSTH-------------------RGQSE 248

Query: 309 FAGLAFKIMTDPFVGQLTFVRVYRGKLESGSYVYNSTKDKKERVGRLLKMHSNKREDIKE 368
G FKI +L ++R+Y G L V S K+K ++ + + + I +
Sbjct: 249 LCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI-KITEMYTSINGELCKIDK 307

Query: 369 VYAGEICAFVG----LKDTLTGDTLCDEKNAVVLERMEFPEPVIHIAVEPKTKADQEKMG 424
Y+GEI L L GDT + ER+E P P++ VEP +E +
Sbjct: 308 AYSGEIVILQNEFLKLNSVL-GDTKLLPQR----ERIENPLPLLQTTVEPSKPQQREMLL 362

Query: 425 VALGKLAEEDPSFRVMTQEETGQTLIGGMGELHLEIIVDRLKREFKVEAEIGQPQVAFRE 484
AL ++++ DP R T + ++ +G++ +E+ L+ ++ VE EI +P V + E
Sbjct: 363 DALLEISDSDPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEKYHVEIEIKEPTVIYME 422

Query: 485 TIRSSVSKEHKYAKQSGGRGQYGHVFIKLEPKEPGSGYEFVNEISGGVIPKEYIPAVDKG 544
R E+ + + + + + P GSG ++ + +S G + + + AV +G
Sbjct: 423 --RPLKKAEYTIHIEVPPNPFWASIGLSVSPLPLGSGMQYESSVSLGYLNQSFQNAVMEG 480

Query: 545 IQEAMQNGVLAGYPVVDFKVTLYDGSYHDVDSSEMAFKIAGSMAFKEASRAANPVLLEPM 604
I+ + G L G+ V D K+ G Y+ S+ F++ + ++ + A LLEP
Sbjct: 481 IRYGCEQG-LYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQVLKKAGTELLEPY 539

Query: 605 MKVEVEVPEEYMGDVIGDLNRRRGQINSMDDRLGLKIVNAFVPLVEMFGYSTDLRSATQG 664
+ ++ P+EY+ D + I + I++ +P + Y +DL T G
Sbjct: 540 LSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKNNEVILSGEIPARCIQEYRSDLTFFTNG 599

Query: 665 RGTYSMEFDHY 675
R E Y
Sbjct: 600 RSVCLTELKGY 610


25  C730_07515  C730_07545  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias   Product
C730_07515   0      12       0.450151  membrane protein insertase
C730_07520   0      10       0.364864  hypothetical protein
C730_07525   0       9       0.992435  tRNA modification GTPase TrmE
C730_07530   1      12       1.759731  hypothetical protein
C730_07535  -2      14       1.401695  hypothetical protein
C730_07540  -2      13       1.958175  hypothetical protein
C730_07545  -2      13       2.255959  membrane-associated lipoprotein (lpp20)
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C730_07515  60KDINNERMP  431    e-148    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 431 bits (1110), Expect = e-148
Identities = 162/581 (27%), Positives = 275/581 (47%), Gaps = 81/581 (13%)

Query: 9 RLILAIALSFLFIALYSYFFQKPNKTTTQTTKQETTNNHTATSPNAPNAQHFSTTQTTPQ 68
R +L IAL F+ ++ Q T T+ A + + Q
Sbjct: 5 RNLLVIALLFVSFMIW-------QAWEQDKNPQPQAQQTTQTTTTAAGSAADQGVPASGQ 57

Query: 69 ENLLSTISFEHARIEIDSLG-RIKQVYLKDKKYLTPKQKGFLEHVG--HLFSSKEN---- 121
L+ ++ + + I++ G ++Q L P L L +
Sbjct: 58 GKLI-SVKTDVLDLTINTRGGDVEQALL-------PAYPKELNSTQPFQLLETSPQFIYQ 109

Query: 122 AQPPL--KELPLLAADKLKPLEVRFLDPTLNNKAFNTPYSASKTTLGPNEQLV--LTQDL 177
AQ L ++ P A+ +PL +N A G NE V D
Sbjct: 110 AQSGLTGRDGPDNPANGPRPL-------------YNVEKDAYVLAEGQNELQVPMTYTDA 156

Query: 178 GTLSIIKTLTFYDDLHYDLKIAFKSPNN------------------LIPSYVITNGYRPV 219
+ KT Y + + + N L P + +
Sbjct: 157 AGNTFTKTFVLKRG-DYAVNVNYNVQNAGEKPLEISSFGQLKQSITLPPHLDTGSSNFAL 215

Query: 220 ADLDSYTFSGVLLENSDKKIEKIE---DKDAKEIKRFSNTLFLSSVDRYFTTLLFTKDPQ 276
+TF G D+K EK + D + + S +++ + +YF T +
Sbjct: 216 -----HTFRGAAYSTPDEKYEKYKFDTIADNENLNISSKGGWVAMLQQYFATAWIPHN-D 269

Query: 277 GFEALIDSEIGTKNPLGFISLKNEA-----------NLHGYIGPKDYRSLKAISPMLTDV 325
G + +G N + I K++ N ++GP+ + A++P L
Sbjct: 270 GTNNFYTANLG--NGIAAIGYKSQPVLVQPGQTGAMNSTLWVGPEIQDKMAAVAPHLDLT 327

Query: 326 IEYGLITFFAKGVFVLLDYLYQFVGNWGWAIILLTIIVRIILYPLSYKGMVSMQKLKELA 385
++YG + F ++ +F LL +++ FVGNWG++II++T IVR I+YPL+ SM K++ L
Sbjct: 328 VDYGWLWFISQPLFKLLKWIHSFVGNWGFSIIIITFIVRGIMYPLTKAQYTSMAKMRMLQ 387

Query: 386 PKMKELQEKYKGEPQKLQAHMMQLYKKHGANPLGGCLPLILQIPVFFAIYRVLYNAVELK 445
PK++ ++E+ + Q++ MM LYK NPLGGC PL++Q+P+F A+Y +L +VEL+
Sbjct: 388 PKIQAMRERLGDDKQRISQEMMALYKAEKVNPLGGCFPLLIQMPIFLALYYMLMGSVELR 447

Query: 446 SSEWILWIHDLSIMDPYFILPLLMGASMYWHQSVTPNTMTDPMQAKIFKLLPLLFTIFLI 505
+ + LWIHDLS DPY+ILP+LMG +M++ Q ++P T+TDPMQ KI +P++FT+F +
Sbjct: 448 QAPFALWIHDLSAQDPYYILPILMGVTMFFIQKMSPTTVTDPMQQKIMTFMPVIFTVFFL 507

Query: 506 TFPAGLVLYWTTNNILSVLQQLIINKVLENKKRMHAQNKKE 546
FP+GLVLY+ +N+++++QQ +I + LE K+ +H++ KK+
Sbjct: 508 WFPSGLVLYYIVSNLVTIIQQQLIYRGLE-KRGLHSREKKK 547


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C730_07520  IGASERPTASE  30     0.009    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.4 bits (68), Expect = 0.009
Identities = 16/46 (34%), Positives = 25/46 (54%), Gaps = 2/46 (4%)

Query: 64 KEESVKETNTKEIHQSAEEKKQKLETETPQEEIITPKPSKKNPKEE 109
+ + + T TKE +E+K K+ETE QE S+ +PK+E
Sbjct: 1091 ETKETQTTETKETATVEKEEKAKVETEKTQEV--PKVTSQVSPKQE 1134


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
C730_07525  TCRTETOQM  31     0.008    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 31.4 bits (71), Expect = 0.008
Identities = 32/134 (23%), Positives = 53/134 (39%), Gaps = 25/134 (18%)

Query: 216 LSIVGKPNAGKSSLLNAMLLEERA---LVSDIKGTTR-DTIEE-------------VIEL 258
+ ++ +AGK++L ++L A L S KGTTR D +
Sbjct: 6 IGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQW 65

Query: 259 KGHKVRLIDTAGIRESADKIERLGIEKSLKSLENCDIILGVFDLSKPLEKEDFNLIDTLN 318
+ KV +IDT G + ++ R SL L D + + ++ + L L
Sbjct: 66 ENTKVNIIDTPGHMDFLAEVYR-----SLSVL---DGAILLISAKDGVQAQTRILFHALR 117

Query: 319 RAKKPCIVVLNKND 332
+ P I +NK D
Sbjct: 118 KMGIPTIFFINKID 131


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
C730_07545  LIPOLPP20  293    e-105    LPP20 lipoprotein precursor signature.
		>LIPOLPP20#LPP20 lipoprotein precursor signature.

Length = 175

Score = 293 bits (751), Expect = e-105
Identities = 175/175 (100%), Positives = 175/175 (100%)

Query: 1 MKNQVKKILGMSVVAAMVIVGCSHAPKSGISKSNKAYKEATKGAPDWVVGDLEKVAKYEK 60
MKNQVKKILGMSVVAAMVIVGCSHAPKSGISKSNKAYKEATKGAPDWVVGDLEKVAKYEK
Sbjct: 1 MKNQVKKILGMSVVAAMVIVGCSHAPKSGISKSNKAYKEATKGAPDWVVGDLEKVAKYEK 60

Query: 61 YSGVFLGRAEDLITNNDVDYSTNQATAKARANLAANLKSTLQKDLENEKTRTVDASGKRS 120
YSGVFLGRAEDLITNNDVDYSTNQATAKARANLAANLKSTLQKDLENEKTRTVDASGKRS
Sbjct: 61 YSGVFLGRAEDLITNNDVDYSTNQATAKARANLAANLKSTLQKDLENEKTRTVDASGKRS 120

Query: 121 ISGTDTEKISQLVDKELIASKMLARYVGKDRVFVLVGLDKQIVDKVREELGMVKK 175
ISGTDTEKISQLVDKELIASKMLARYVGKDRVFVLVGLDKQIVDKVREELGMVKK
Sbjct: 121 ISGTDTEKISQLVDKELIASKMLARYVGKDRVFVLVGLDKQIVDKVREELGMVKK 175


26  C730_08070  C730_08095  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
C730_08070  -1      13        2.407790  flagellar hook-basal body protein FliE
C730_08075  -1      13        2.129144  flagellar basal body rod protein FlgC
C730_08080   0      15        1.719126  flagellar basal body rod protein FlgB
C730_08085   1      13        1.733852  cell division protein FtsW
C730_08090   0      12        0.231817  iron(III) ABC transporter periplasmic
C730_08095   1      13       -0.097839  iron(III) ABC transporter periplasmic
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
C730_08070  FLGHOOKFLIE  77     6e-22    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 77.0 bits (189), Expect = 6e-22
Identities = 19/77 (24%), Positives = 40/77 (51%), Gaps = 1/77 (1%)

Query: 34 EQKGGEFSKLLKQSINELNNTQEQSDKALADMATGQIK-DLHQAAIAIGKAETSMKLMLE 92
Q F+ L +++ +++TQ + G+ L+ + KA SM++ ++
Sbjct: 27 PQPTISFAGQLHAALDRISDTQTAARTQAEKFTLGEPGVALNDVMTDMQKASVSMQMGIQ 86

Query: 93 VRNKAISAYKELLRTQI 109
VRNK ++AY+E++ Q+
Sbjct: 87 VRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
C730_08075  FLGHOOKAP1  29     0.011    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 28.8 bits (64), Expect = 0.011
Identities = 10/38 (26%), Positives = 15/38 (39%)

Query: 121 NVNAVVEMADLVEATRAYQANVAAFQSAKNMAQNAIGM 158
VN E +L + Y AN Q+A + I +
Sbjct: 508 GVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_08090  FERRIBNDNGPP  34     8e-04    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 33.8 bits (77), Expect = 8e-04
Identities = 28/183 (15%), Positives = 77/183 (42%), Gaps = 10/183 (5%)

Query: 108 NVELLKKLSPDLVVTFVG-NPKAVEHAKKFGISFLSFQETT--IAEAMQAMQ--AQATVL 162
N+ELL ++ P +V G P A+ +F + +A A +++ A L
Sbjct: 88 NLELLTEMKPSFMVWSAGYGPSPEMLARIAPGRGFNFSDGKQPLAMARKSLTEMADLLNL 147

Query: 163 EIDASKKFAKMQETLDFIAERL-KNVKKKKGVELFHKAN--KISGHQAISSDILEKGGID 219
+ A A+ ++ + + R K + + + G ++ +IL++ GI
Sbjct: 148 QSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVFGPNSLFQEILDEYGIP 207

Query: 220 N-FGLKYVKFGRADISVEKIVK-ENPEIIFIWWISPLTPEDVLNNPKFATIKAIKNKQVY 277
N + + +G +S++++ ++ +++ + + ++ P + + ++ +
Sbjct: 208 NAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNSKDMDALMATPLWQAMPFVRAGRFQ 267

Query: 278 KLP 280
++P
Sbjct: 268 RVP 270


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
C730_08095  FERRIBNDNGPP  33     0.001    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 33.0 bits (75), Expect = 0.001
Identities = 29/184 (15%), Positives = 75/184 (40%), Gaps = 12/184 (6%)

Query: 106 NVELLKKLGPDLVVTFVGNPKAVEHAKKF--GILFLSFQEKTIAEVMEDID---AQAKAL 160
N+ELL ++ P +V G + E + G F K + A L
Sbjct: 88 NLELLTEMKPSFMVWSAGYGPSPEMLARIAPGRGFNFSDGKQPLAMARKSLTEMADLLNL 147

Query: 161 EIDASKKLAKMQETLDFIAERLKGVKKKKGVELFHKAN----KISGHQALDSDILEKGGI 216
+ A LA+ ++ + + R + + + L + + G +L +IL++ GI
Sbjct: 148 QSAAETHLAQYEDFIRSMKPRFVK-RGARPLLLTTLIDPRHMLVFGPNSLFQEILDEYGI 206

Query: 217 DN-FGLKYVKFGRADISVEKIVK-ENPEIIFIWWISPLTPEDVLNNPKFATIKAIKNKQV 274
N + + +G +S++++ ++ +++ + + ++ P + + ++ +
Sbjct: 207 PNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNSKDMDALMATPLWQAMPFVRAGRF 266

Query: 275 YKLP 278
++P
Sbjct: 267 QRVP 270
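
The hits reported for island 26 range from E-values of 6e-22 down to marginal matches near 0.01. The page's input parameters list an RPSBlast E-value of 0.05. The Python sketch below shows how such hits could be filtered and ranked against a cutoff. The `Hit` type and `filter_hits` helper are hypothetical. Treating the cutoff as inclusive is an assumption, and the stricter 1e-5 cutoff in the usage lines is illustrative and does not come from the page.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    locus_tag: str
    signature: str
    score: float
    evalue: float


# Values copied from the island 26 hit tables above.
ISLAND_26_HITS = [
    Hit("C730_08070", "FLGHOOKFLIE", 77, 6e-22),
    Hit("C730_08075", "FLGHOOKAP1", 29, 0.011),
    Hit("C730_08090", "FERRIBNDNGPP", 34, 8e-04),
    Hit("C730_08095", "FERRIBNDNGPP", 33, 0.001),
]


def filter_hits(hits: List[Hit], max_evalue: float) -> List[Hit]:
    """Keep hits with E-value <= max_evalue (inclusive by assumption), best first."""
    kept = [hit for hit in hits if hit.evalue <= max_evalue]
    return sorted(kept, key=lambda hit: hit.evalue)


if __name__ == "__main__":
    # 0.05 is the RPSBlast E-value listed in the page's input parameters.
    for hit in filter_hits(ISLAND_26_HITS, 0.05):
        print(hit.locus_tag, hit.signature, hit.evalue)
    # A stricter, purely illustrative cutoff.
    print([hit.locus_tag for hit in filter_hits(ISLAND_26_HITS, 1e-5)])
```

Ranking by E-value makes it easier to tell well-supported signature matches from borderline ones that pass only because the cutoff is permissive.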



 
Contact Sachin Pundhir for Bugs/Comments.