PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: Xcc_8004_2.gb
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
 
 
C) Potential GIs and PAIs in NC_007086
S.No  Start       End         Bias  Virulence  Insertion elements  Prediction
1     XC_RS00185  XC_RS00245  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS00185  0       12       3.185363    hypothetical protein
XC_RS00190  0       12       2.904746    pirin
XC_RS00195  0       13       1.664076    acyl-ACP phosphodiesterase
XC_RS00200  0       12       1.656638    LysR family transcriptional regulator
XC_RS00205  2       12       1.982791    antibiotic biosynthesis monooxygenase
XC_RS00210  1       11       2.067981    glutamine amidotransferase
XC_RS00215  2       14       2.122038    transcriptional regulator
XC_RS00220  2       14       1.985405    transcriptional regulator
XC_RS00225  1       9        2.714018    methicillin resistance protein
XC_RS00230  0       9        3.091889    ankyrin
XC_RS00235  -2      10       2.952712    hypothetical protein
XC_RS00240  -1      11       3.304518    saccharopine dehydrogenase
XC_RS00245  -1      12       3.141090    Tat pathway signal protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS00230  PF02370       44     4e-07    M protein repeat
		>PF02370#M protein repeat

Length = 168

Score = 43.6 bits (102), Expect = 4e-07
Identities = 24/153 (15%), Positives = 56/153 (36%), Gaps = 18/153 (11%)

Query: 428 ATRGAQAAVHDALAANNQDLEMQQERMQDAQDALQEAREQLASLGPELA-----QAKQDA 482
++ G ++ L N L+ Q E D+ D+ +E Q +L E + +
Sbjct: 10 SSNGKLITEYNKLVEENSKLQKQLEEYLDSSDSKRENDPQYRALMGENQDLRKREGQYQD 69

Query: 483 QQQAREAEQQIREVAQQHRQAQYAYAAAVRQAAALSRQQAEMGKQAARQARIEAARGQQQ 542
+ + E E++ ++ + R+ + Q E + A + ++ + Q
Sbjct: 70 KIEELEKERKEKQERPERREKFERQHQDKHYQEQQKKHQQEQQQLEAEKQKLAKEK---Q 126

Query: 543 AAQAQRE----------AAQAQADAEQAQAEAE 565
+ A R+ AA+ + + + + E
Sbjct: 127 ISDASRQGLNRDLEASRAAKKELEPKHQKLGTE 159



Score = 40.9 bits (95), Expect = 3e-06
Identities = 29/120 (24%), Positives = 57/120 (47%), Gaps = 5/120 (4%)

Query: 438 DALAANNQDLEMQQERMQDAQDALQEAREQLASLGPELAQAKQDAQQQAREAEQQIREVA 497
AL NQDL ++ + QD + L++ R++ + K + Q Q + ++Q ++
Sbjct: 51 RALMGENQDLRKREGQYQDKIEELEKERKEKQE--RPERREKFERQHQDKHYQEQQKKHQ 108

Query: 498 QQHRQ--AQYAYAAAVRQAAALSRQQAEMGKQAARQARIE-AARGQQQAAQAQREAAQAQ 554
Q+ +Q A+ A +Q + SRQ +A+R A+ E + Q+ + Q+ + Q
Sbjct: 109 QEQQQLEAEKQKLAKEKQISDASRQGLNRDLEASRAAKKELEPKHQKLGTEHQKLKEEKQ 168


2     XC_RS00290  XC_RS00455  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS00290  4       32       -6.800747   hypothetical protein
XC_RS00295  4       33       -7.204244   hypothetical protein
XC_RS00300  5       43       -8.439758   NAD(P)H oxidoreductase
XC_RS00305  6       46       -8.757866   hypothetical protein
XC_RS00310  4       42       -8.477705   HxlR family transcriptional regulator
XC_RS00315  4       45       -9.039085   hypothetical protein
XC_RS00320  3       35       -6.047537   hypothetical protein
XC_RS00325  2       33       -5.969985   hypothetical protein
XC_RS00330  1       25       -4.090200   hypothetical protein
XC_RS00335  0       22       -4.433716   hypothetical protein
XC_RS00340  0       19       -2.352372   hypothetical protein
XC_RS00345  1       24       0.502302    hypothetical protein
XC_RS00350  -1      19       -0.511456   hypothetical protein
XC_RS00355  -1      15       2.350411    dihydrofolate reductase
XC_RS00360  0       13       2.423721    hypothetical protein
XC_RS00365  1       13       2.814948    LuxR family transcriptional regulator
XC_RS00370  1       14       3.139732    hypothetical protein
XC_RS00375  2       13       3.339101    hypothetical protein
XC_RS00380  3       15       4.390888    histidine kinase
XC_RS00385  4       12       2.791861    ATPase
XC_RS00390  3       11       2.940914    hypothetical protein
XC_RS00395  3       12       2.504654    peptidase M4
XC_RS00400  2       15       2.025884    NAD-dependent dehydratase
XC_RS00405  1       14       1.001329    DNA mismatch repair protein MutT
XC_RS00410  0       12       1.208704    permease
XC_RS00415  -2      13       1.712786    esterase
XC_RS00420  0       15       1.413957    attachment protein
XC_RS00425  2       15       4.399632    thioredoxin
XC_RS00430  3       16       5.981932    proline/betaine transporter
XC_RS00435  7       16       7.996602    ATPase
XC_RS00440  7       16       8.248792    peptidase
XC_RS00445  7       16       7.911680    hypothetical protein
XC_RS00450  5       16       7.204980    membrane protein
XC_RS00455  0       15       4.836845    membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS00395  THERMOLYSIN   283    2e-93    Thermolysin metalloprotease (M4) family signature.
		>THERMOLYSIN#Thermolysin metalloprotease (M4) family signature.

Length = 544

Score = 283 bits (724), Expect = 2e-93
Identities = 118/288 (40%), Positives = 165/288 (57%), Gaps = 23/288 (7%)

Query: 74 YDARQGTALPGVLVRE--EGAPPTDDVAVTEAYDYLGATHAFFQQVYARNSIDDAGMPLL 131
YD R T LPG L + + D A +A+ Y G + +++ V+ R S D + +
Sbjct: 270 YDGRNRTVLPGSLWADGDNQFFASYDAAAVDAHYYAGVVYDYYKNVHGRLSYDGSNAAIR 329

Query: 132 GTVHYERNYDNAFWTGEQMVFGDGDGEIFTRFTIAIDVVAHELTHGVIERTANLIYQGQS 191
TVHY R Y+NAFW G QMV+GDGDG+ F F+ IDVV HELTH V + TA L+YQ +S
Sbjct: 330 STVHYGRGYNNAFWNGSQMVYGDGDGQTFLPFSGGIDVVGHELTHAVTDYTAGLVYQNES 389

Query: 192 GALNESVSDVFGVLVKQYALRQDAAQADWLVGAGMFLPGVQGVALRSMQAPGTAYDDPAL 251
GA+NE++SD+FG LV+ YA R DW +G ++ PGV G ALRSM P
Sbjct: 390 GAINEAMSDIFGTLVEFYANRNP----DWEIGEDIYTPGVAGDALRSMSDP--------- 436

Query: 252 GKDPQPAHMDAYVDTQEDDGGVHYNSGIPNRAFQRAA-------VAIGGYAWEKAGRIWY 304
K P H +D+GGVH NSGI N+A + V++ G +K G+I+Y
Sbjct: 437 AKYGDPDHYSKRYTGTQDNGGVHTNSGIINKAAYLLSQGGVHYGVSVTGIGRDKMGKIFY 496

Query: 305 RALTGGALSASADFATFAALTVRVASTDYGAGSAEASAVEQAWRDVGV 352
RAL L+ +++F+ A V+ A+ YG+ S E ++V+QA+ VGV
Sbjct: 497 RALV-YYLTPTSNFSQLRAACVQAAADLYGSTSQEVNSVKQAFNAVGV 543


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS00430  TCRTETA       41     7e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 41.3 bits (97), Expect = 7e-06
Identities = 55/307 (17%), Positives = 103/307 (33%), Gaps = 39/307 (12%)

Query: 71 PTAQLIATFATFTVAF-LVRPIGGMVFGPLGDRYGRQKVLAATMILMALGTFSIGLIPSY 129
+ + A + + L++ V G L DR+GR+ VL ++ A+ + P
Sbjct: 37 HSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPF- 95

Query: 130 AQIGLWAPALLLLARLLQGFSTGGEYGGAATFIAEYATDRNR----GLMGSWLEFGTLGG 185
LW +L + R++ G TG A +IA+ R G M + FG + G
Sbjct: 96 ----LW---VLYIGRIVAGI-TGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAG 147

Query: 186 YIAGAATVTALHMALSQAQMLDWGWRVPFLVAGPLGLLGLYMRMKLEETPAFRAYTEQSE 245
+ G M + PF A L L L
Sbjct: 148 PVLGGL-------------MGGFSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRR 194

Query: 246 QRERETAGQGLMTLLRLHWPQLLKCVGLVLVFNVTDYMLLTYMPS---YLSVTMGYAESK 302
+ A + + + + LV V + + + + + T+G + +
Sbjct: 195 EALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAA 254

Query: 303 GLLLIILVMLVMMPLNVVGGMFSDKLGRRPMIIGACAALFALAIPCLLLIGSGSDVLIFT 362
+L L ++ G + +LG R ++ + A +LL + + F
Sbjct: 255 FGILHSLAQAMIT------GPVAARLGERRALM---LGMIADGTGYILLAFATRGWMAFP 305

Query: 363 GLMLLGL 369
++LL
Sbjct: 306 IMVLLAS 312



Score = 29.8 bits (67), Expect = 0.027
Identities = 22/103 (21%), Positives = 44/103 (42%), Gaps = 9/103 (8%)

Query: 267 LLKCVGLVLVFNVTDYMLLTYMPSYLSVTMGYAESKGLLLIILVMLVMMPLNVVGGMFSD 326
L VG+ L+ V +L + S + +L+ L L+ V G SD
Sbjct: 15 ALDAVGIGLIMPVLPGLLRDLVHS------NDVTAHYGILLALYALMQFACAPVLGALSD 68

Query: 327 KLGRRPMIIGACAALFALAIPCLLLIGSGSDVLIFTGLMLLGL 369
+ GRRP++ +L A+ ++ + +++ G ++ G+
Sbjct: 69 RFGRRPVL---LVSLAGAAVDYAIMATAPFLWVLYIGRIVAGI 108


3     XC_RS00605  XC_RS00725  Y     N          Y                   Genomic Island
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS00605  2       15       -0.411380   hypothetical protein
XC_RS00610  2       21       -0.341318   2-keto-3-deoxygluconate kinase
XC_RS00615  2       23       -0.689521   TonB-dependent receptor
XC_RS00620  1       22       -1.006821   TonB-dependent receptor
XC_RS00625  0       24       -0.493859   pectin methylesterase
XC_RS00630  0       23       -2.057176   pectate lyase
XC_RS00635  -1      20       -2.466174   hypothetical protein
XC_RS00640  0       37       -7.742181   hypothetical protein
XC_RS00645  4       37       -8.366053   hypothetical protein
XC_RS00650  3       34       -7.270530   hypothetical protein
XC_RS00655  3       32       -6.681575   deoxycytidylate deaminase
XC_RS00660  2       28       -5.752487   transposase
XC_RS00670  2       20       -3.651044   hypothetical protein
XC_RS00675  0       12       -2.498530   type IV secretion protein Rhs
XC_RS00680  -2      10       0.820975    transposase
XC_RS00690  -3      11       0.654339    hypothetical protein
XC_RS00695  -3      12       0.831186    hypothetical protein
XC_RS00700  -2      12       0.889981    alpha-amylase
XC_RS00705  -1      13       0.631812    trehalose synthase
XC_RS00710  2       13       0.883054    glycogen branching protein
XC_RS00715  2       15       0.047292    transposase
XC_RS00725  2       14       1.634099    hypothetical protein
4     XC_RS00925  XC_RS01030  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS00925  -1      15       3.048566    cardiolipin synthase
XC_RS00930  1       16       3.694701    esterase
XC_RS00935  1       16       3.247479    acetyltransferase
XC_RS00940  -1      16       1.937871    alpha/beta hydrolase
XC_RS00945  -3      14       1.543336    hypothetical protein
XC_RS00950  -1      13       0.058348    hypothetical protein
XC_RS00955  0       14       0.805702    hypothetical protein
XC_RS00960  1       16       0.976718    undecaprenyl-diphosphatase
XC_RS00965  0       15       1.656100    glutamine synthetase
XC_RS00970  0       16       3.266273    nitrogen regulatory protein P-II 1
XC_RS00975  0       14       3.371303    ammonia channel protein
XC_RS00980  -2      13       3.912959    two-component system sensor protein
XC_RS00985  -2      13       3.448160    nitrogen regulation protein NR(I)
XC_RS00995  0       10       3.095495    superoxide dismutase
XC_RS01000  0       12       2.900057    superoxide dismutase
XC_RS01005  2       11       2.904014    glyoxalase
XC_RS01010  4       11       3.583609    hypothetical protein
XC_RS01015  2       9        2.857466    acetyl-CoA acetyltransferase
XC_RS01020  4       10       2.777255    hypothetical protein
XC_RS01025  3       9        2.924485    porphyrin biosynthesis protein
XC_RS01030  3       9        2.455407    uroporphyrin-III C-methyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS00945  SYCDCHAPRONE  37     9e-05    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD

chaperone signature.
Length = 168

Score = 36.8 bits (85), Expect = 9e-05
Identities = 20/102 (19%), Positives = 32/102 (31%), Gaps = 3/102 (2%)

Query: 96 DPNQFNAYVMQAHLAVARGDLDEAERLSRTAARLAPEHPQLLAVDGVVEMRRGQSDRALS 155
Q + + + G ++A ++ + L + G GQ D A+
Sbjct: 35 TLEQLYSLAFNQYQS---GKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIH 91

Query: 156 LLTRAAEQLPDDARVMFALGFAYLQKEHFAFAERAFERVVEL 197
+ A + R F LQK A AE EL
Sbjct: 92 SYSYGAIMDIKEPRFPFHAAECLLQKGELAEAESGLFLAQEL 133


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS00985  HTHFIS        495    e-175    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 495 bits (1277), Expect = e-175
Identities = 206/470 (43%), Positives = 279/470 (59%), Gaps = 14/470 (2%)

Query: 10 RIWVVDDDRSVRFVLSTALRDAGYAVDGFDSAAAALQALAMRPTPDLLFTDVRMPGEDGL 69
I V DDD ++R VL+ AL AGY V +AA + +A DL+ TDV MP E+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIA-AGDGDLVVTDVVMPDENAF 63

Query: 70 TLLDKLKSKHPQLPVIVMSAYTDVASTAGAFRGGAHEFLSKPFDLDDAVALAARALPDAD 129
LL ++K P LPV+VMSA + A GA+++L KPFDL + + + RAL +
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 130 AAVEDTLGTPVAEGSAALIGDTPAMQALFRAIGRLAQAPLSVLINGETGTGKELVARALH 189
++ L+G + AMQ ++R + RL Q L+++I GE+GTGKELVARALH
Sbjct: 124 RRPSKLEDD--SQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALH 181

Query: 190 NESPRSRKPFVALNTAAIPAELLESELFGHETGAFTGATKRHIGRFEQADGGTLFLDEIG 249
+ R PFVA+N AAIP +L+ESELFGHE GAFTGA R GRFEQA+GGTLFLDEIG
Sbjct: 182 DYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIG 241

Query: 250 DMPLPLQTRLLRVLAENEFFRVGGRELIRVDVRVIAATHQDLEALVEQGRFRADLLHRLD 309
DMP+ QTRLLRVL + E+ VGGR IR DVR++AAT++DL+ + QG FR DL +RL+
Sbjct: 242 DMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLN 301

Query: 310 VVRLQLPPLRERRGDIAQLAENFLAMAGRKLDMLPKRLSSAALEALRQYDWPGNVRELEN 369
VV L+LPPLR+R DI L +F+ A K + KR ALE ++ + WPGNVRELEN
Sbjct: 302 VVPLRLPPLRDRAEDIPDLVRHFVQQA-EKEGLDVKRFDQEALELMKAHPWPGNVRELEN 360

Query: 370 VCWRLAALATADTIDVVDV---------DAALARGGRRVRAGRSDGQWDEMLSSWAAQRL 420
+ RL AL D I + D+ + + R + +E + + A
Sbjct: 361 LVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFG 420

Query: 421 SE-GAQGLHAEARERLDKTLLEAALQLTQGRRAEAAARLGLGRNTVTRKL 469
GL+ ++ L+ AAL T+G + +AA LGL RNT+ +K+
Sbjct: 421 DALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKI 470


5     XC_RS01075  XC_RS01160  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS01075  2       12       1.174719    pyruvate dehydrogenase
XC_RS01080  0       13       0.412185    hypothetical protein
XC_RS01085  -1      14       -2.542252   MFS transporter
XC_RS01090  0       24       -6.151123   tRNA/rRNA methyltransferase
XC_RS01095  0       29       -5.787753   hypothetical protein
XC_RS01100  0       27       -5.777288   hypothetical protein
XC_RS01105  -1      28       -5.640980   3-oxoacyl-ACP synthase
XC_RS01110  -1      32       -5.969070   addiction module protein
XC_RS01115  -1      27       -3.712419   methyltransferase
XC_RS01120  -2      19       -1.083444   hypothetical protein
XC_RS01125  2       15       0.507876    haloalkane dehalogenase
XC_RS01130  3       16       0.086904    ferredoxin
XC_RS01135  2       14       0.250421    hypothetical protein
XC_RS01140  1       12       1.304873    peptide synthase
XC_RS01145  0       11       1.166397    3-beta hydroxysteroid dehydrogenase
XC_RS01150  1       13       1.353234    membrane protein
XC_RS01155  2       12       1.586815    acetyltransferase
XC_RS01160  2       14       2.093568    D-alanine--D-alanine ligase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS01145  NUCEPIMERASE  143    2e-42    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 143 bits (363), Expect = 2e-42
Identities = 80/362 (22%), Positives = 135/362 (37%), Gaps = 78/362 (21%)

Query: 1 MKILVTGGGGFLGQALCRGLVARGHEVV-----------SFQRGDYPVLHTLGVGQIRGD 49
MK LVTG GF+G + + L+ GH+VV S ++ +L G + D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 50 LADPQAVRHALA--GIDAVFHNAAKAG---AWGSYDSYHQANVVGTQNVLDACRANGVPR 104
LAD + + A + VF + + + + +Y +N+ G N+L+ CR N +
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 105 LIYTSTPSVTHRATNPVEGLGADE-VPYGEDLRA-----PYAATKAIAERAVLAANDA-Q 157
L+Y S+ SV G + +P+ D YAATK E +
Sbjct: 121 LLYASSSSV----------YGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYG 170

Query: 158 LATVALRPRLIWGP-GD-NHLLPRLAARARAGR-LRMVGDGSNLVDSTYIDNAAQAHFDA 214
L LR ++GP G + L + G+ + + G D TYID+ A+A
Sbjct: 171 LPATGLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRL 230

Query: 215 ------------FAHLAPGAACA-GKAYFISNGEPLPMRELLNRLLAAVDAPAVTRSLSF 261
P A+ A + Y I N P+ + + + L A+ A L
Sbjct: 231 QDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLPL 290

Query: 262 KTAYRIGAVCETLWPLLRLPGEVPLTRFLVEQLCTPHWYSMEPARRDFGYVPQISIEEGL 321
+ PG+V T + G+ P+ ++++G+
Sbjct: 291 Q------------------PGDVLET-----------SADTKALYEVIGFTPETTVKDGV 321

Query: 322 QR 323
+
Sbjct: 322 KN 323


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS01150  PF04335       24     0.029    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 24.0 bits (52), Expect = 0.029
Identities = 5/20 (25%), Positives = 9/20 (45%)

Query: 28 TNIAWILFVVFLILAVISMF 47
+AW++ V LA +
Sbjct: 32 KKLAWVVAGVAGALATAGVV 51


6     XC_RS01410  XC_RS01535  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS01410  -2      12       3.398322    2-alkenal reductase
XC_RS01415  -3      11       3.041869    oxidoreductase
XC_RS01420  -2      11       3.191344    methyltransferase
XC_RS01425  -2      11       3.700032    ribosomal large subunit pseudouridine synthase
XC_RS01430  -1      13       3.962255    membrane protein
XC_RS01435  -1      15       3.898376    RNA helicase
XC_RS01440  -2      15       2.388350    chemotaxis protein
XC_RS01445  -1      15       2.593813    hypothetical protein
XC_RS01450  1       13       3.004357    5-hydroxyisourate hydrolase
XC_RS01455  2       14       3.607441    monooxygenase
XC_RS01460  2       13       2.992152    OHCU decarboxylase
XC_RS01465  2       14       3.073258    hypothetical protein
XC_RS01470  1       14       2.766941    chitin deacetylase
XC_RS01475  2       11       2.488239    aminotransferase V
XC_RS01480  2       13       2.081287    allantoate amidohydrolase
XC_RS01485  0       15       0.072853    LysR family transcriptional regulator
XC_RS01490  1       13       0.994327    3-oxoacyl-ACP reductase
XC_RS01495  2       11       1.985477    hypothetical protein
XC_RS01500  1       9        2.002488    transcriptional regulator
XC_RS01505  2       11       2.266708    hypothetical protein
XC_RS01510  2       13       2.527889    cardiolipin synthase 2
XC_RS01515  3       12       3.271195    gamma-glutamyltranspeptidase
XC_RS01520  2       12       2.928386    amidase
XC_RS01525  2       15       2.522460    nucleoside hydrolase
XC_RS01530  2       16       2.717467    adenosine deaminase
XC_RS01535  2       16       2.702380    xanthine/uracil permease
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS01490  DHBDHDRGNASE  28     0.004    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 28.1 bits (62), Expect = 0.004
Identities = 24/97 (24%), Positives = 37/97 (38%), Gaps = 7/97 (7%)

Query: 1 AVEILSVYLARELEPRSITANTVARGAIATGFP-------GGAVRDTPAYTKAFADMTAL 53
A + + L EL +I N V+ G+ T GA + + F L
Sbjct: 163 AAVMFTKCLGLELAEYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPL 222

Query: 54 GRVGVPDDIGPMVASLLSEDNRWVTGQRIEVSGGQTI 90
++ P DI V L+S +T + V GG T+
Sbjct: 223 KKLAKPSDIADAVLFLVSGQAGHITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS01500  TCRTETB       39     4e-05    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 38.7 bits (90), Expect = 4e-05
Identities = 28/143 (19%), Positives = 53/143 (37%), Gaps = 1/143 (0%)

Query: 46 LTPIAADLHATAGMAGQAISISGLFAVLASLLIAPLTSRFN-RRHVLISLSAVMLLSLLL 104
L IA D + + L + + + L+ + +R +L + S++
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 105 IANAHSFGMLMAARALLGITIGGFWALATATVMRMMPEEAVPKALGIVYIGNAVATAFAA 164
F +L+ AR + G F AL V R +P+E KA G++ A+
Sbjct: 97 FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGP 156

Query: 165 PLGSYLGAVIGWRGVFWGMVPLV 187
+G + I W + + +
Sbjct: 157 AIGGMIAHYIHWSYLLLIPMITI 179


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS01515  SALSPVBPROT   32     0.006    Salmonella virulence plasmid 65kDa B protein signature.
		>SALSPVBPROT#Salmonella virulence plasmid 65kDa B protein signature.

Length = 591

Score = 32.0 bits (72), Expect = 0.006
Identities = 12/30 (40%), Positives = 18/30 (60%), Gaps = 3/30 (10%)

Query: 56 GDGFWLIHEPDGRVHAIDACGRAAQAATLD 85
GD FWL+H+ +G +H + G+ A A D
Sbjct: 155 GDDFWLLHDSNGILHLL---GKTAAARLSD 181


7     XC_RS01600  XC_RS01770  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS01600  2       11       1.528652    dehydrogenase
XC_RS01605  1       13       1.025180    hypothetical protein
XC_RS01610  1       11       0.732233    formyltetrahydrofolate deformylase
XC_RS01615  1       12       0.844897    transcriptional regulator
XC_RS01620  -2      10       -0.749105   dehydrogenase
XC_RS01625  -3      10       -1.519328   conditioned medium factor
XC_RS01630  -2      19       -4.194931   5,10-methylenetetrahydrofolate reductase
XC_RS01635  -1      14       -3.732173   LysR family transcriptional regulator
XC_RS01640  0       12       -2.806064   NADH-dependent FMN reductase
XC_RS01645  0       12       -2.762542   hypothetical protein
XC_RS01650  -1      13       -0.465267   5-methyltetrahydropteroyltriglutamate--
XC_RS01655  2       11       0.797002    abortive phage infection protein
XC_RS01660  3       10       2.244866    porin
XC_RS01665  4       12       2.093794    oxidoreductase
XC_RS01670  3       15       0.825312    HxlR family transcriptional regulator
XC_RS01675  3       20       -0.612140   hypothetical protein
XC_RS01680  3       21       -2.381195   methyl-accepting chemotaxis protein
XC_RS01685  2       40       -5.897445   hypothetical protein
XC_RS01690  4       39       -7.412791   hypothetical protein
XC_RS01695  5       45       -7.717212   hypothetical protein
XC_RS01700  6       49       -9.119574   hypothetical protein
XC_RS01705  7       53       -9.420503   AttT protein
XC_RS01710  9       57       -10.629505  hypothetical protein
XC_RS01715  9       54       -10.355373  hypothetical protein
XC_RS01720  9       57       -11.025130  hypothetical protein
XC_RS01725  7       56       -10.930587  hypothetical protein
XC_RS01730  7       57       -9.657890   hypothetical protein
XC_RS01735  6       57       -9.131070   hypothetical protein
XC_RS01740  5       52       -6.378685   hypothetical protein
XC_RS01745  6       51       -8.012990   hypothetical protein
XC_RS01750  4       51       -7.156396   xanthomonadin biosynthesis protein
XC_RS01755  6       53       -8.194610   hypothetical protein
XC_RS01760  4       51       -7.271228   xanthomonadin biosynthesis protein
XC_RS01765  4       51       -7.764600   hypothetical protein
XC_RS01770  2       42       -5.506242   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS01620  DHBDHDRGNASE  99     5e-27    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 99.0 bits (246), Expect = 5e-27
Identities = 65/240 (27%), Positives = 99/240 (41%), Gaps = 14/240 (5%)

Query: 2 QTVLITGCSSGFGLATAHYFLERDWNVVATMRTPREDLFPVSPRL------KVLQLDVTD 55
+ ITG + G G A A + ++ A P + VS + DV D
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 56 AASI----QAAIAAAGTVDTLVNNAGFGAPAPLELTSLQAVRDLFETNTFGTLAVTQAVL 111
+A+I G +D LVN AG P + S + F N+ G +++V
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 112 PQMRARRAGVIVNVSSSATLKPLPLIGAYRAAKAAVNALSESLAAELEAFGIRVRVISPG 171
M RR+G IV V S+ P + AY ++KAA ++ L EL + IR ++SPG
Sbjct: 129 KYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPG 188

Query: 172 SCGETDFRATARAGLRGADDEIYGAFMQQTLARMSASTGPGTRSIDVAQAVWRAATDPAA 231
S ETD + + A GA+ I G+ + + D+A AV + A
Sbjct: 189 ST-ETDMQWSLWADENGAEQVIKGSLET---FKTGIPLKKLAKPSDIADAVLFLVSGQAG 244


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS01665  NUCEPIMERASE  39     2e-05    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 38.6 bits (90), Expect = 2e-05
Identities = 27/130 (20%), Positives = 41/130 (31%), Gaps = 35/130 (26%)

Query: 8 ILVTGASGHLGALIVDALLERVPAGRIVA---------TARETASLSAFAKRDISVRRAD 58
LVTGA+G +G + LLE ++V + + A L A+ + D
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEA--GHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 59 YADPASLD--------------AAFAGVGTVL-----LVSSNAVGARVEQHRNVIEAAKR 99
AD + V L SN G N++E +
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTG-----FLNILEGCRH 115

Query: 100 AGVGLLAYTS 109
+ L Y S
Sbjct: 116 NKIQHLLYAS 125


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS01705  SACTRNSFRASE  31     0.002    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 30.7 bits (69), Expect = 0.002
Identities = 12/58 (20%), Positives = 24/58 (41%), Gaps = 3/58 (5%)

Query: 71 VVDIAVLPAHQGRGLGKTVMGEIASYIEREVPASAYVSLIADG--AAYKLYQQFGFAL 126
+ DIAV ++ +G+G ++ A +E + D +A Y + F +
Sbjct: 92 IEDIAVAKDYRKKGVGTALLH-KAIEWAKENHFCGLMLETQDINISACHFYAKHHFII 148


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS01720  V8PROTEASE    36     1e-04    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 35.8 bits (82), Expect = 1e-04
Identities = 29/184 (15%), Positives = 53/184 (28%), Gaps = 24/184 (13%)

Query: 37 GGVQLLGTAFALDKPGLFATASHVAGTEGQNLV-LAFKPLAALHDYQDTGDTSIRSFPVQ 95
+ + + T HV + L P A D G + Q
Sbjct: 98 PTGTFIASGVVVG-KDTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAE----Q 152

Query: 96 IHALDPFHDLAVLKADVISTSS--------VVVGGADDATVGTQVASFGFPHADHCRMVL 147
I DLA++K + + + V + G+P V
Sbjct: 153 ITKYSGEGDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGDKP---VA 209

Query: 148 TQQDAEIGARVLIASGGIKAKHLVLNTQARPGQSGSPIFRKTDGRLVGVLVGSYAPGGGG 207
T +++ L + + G SGSP+F + ++G+ G G
Sbjct: 210 TMWESKGKITYLKGEA------MQYDLSTTGGNSGSPVFNE-KNEVIGIHWGGVPNEFNG 262

Query: 208 GISL 211
+ +
Sbjct: 263 AVFI 266


8     XC_RS01890  XC_RS02080  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS01890  2       11       2.667486    diguanylate cyclase
XC_RS01895  3       10       2.846803    Rieske (2Fe-2S) protein
XC_RS01900  2       13       3.941119    3-oxoadipate:succinyl-CoA transferase subunit A
XC_RS01905  2       10       3.706560    3-oxoadipate:succinyl-CoA transferase subunit B
XC_RS01910  2       11       3.663553    acetyl-CoA acetyltransferase
XC_RS01915  4       10       3.791544    protocatechuate 3,4-dioxygenase subunit beta
XC_RS01920  5       11       4.396248    protocatechuate 3,4-dioxygenase subunit alpha
XC_RS01925  2       10       3.749368    3-carboxy-cis,cis-muconate cycloisomerase
XC_RS01930  3       11       2.916435    3-oxoadipate enol-lactonase
XC_RS01935  1       11       1.982183    4-carboxymuconolactone decarboxylase
XC_RS01940  -1      11       2.005981    hydrolase
XC_RS01945  0       13       1.584383    transcriptional regulator
XC_RS01950  -1      12       1.783431    hypothetical protein
XC_RS01955  0       14       2.457065    lipase
XC_RS01960  0       15       2.142327    hypothetical protein
XC_RS01965  0       15       2.870193    tRNA uridine 5-carboxymethylaminomethyl
XC_RS01970  -1      14       3.681919    hypothetical protein
XC_RS01975  -3      15       3.946358    threonine dehydratase
XC_RS01980  -2      15       3.798742    hypothetical protein
XC_RS01985  -1      14       3.645495    hypothetical protein
XC_RS01990  0       11       4.162193    aspartyl/asparaginyl beta-hydroxylase
XC_RS01995  1       10       4.362908    malonyl-CoA O-methyltransferase
XC_RS02000  2       8        4.303739    3-oxoacyl-ACP reductase
XC_RS02005  2       10       3.686200    pimeloyl-[acyl-carrier protein] methyl ester
XC_RS02010  0       8        2.448458    hypothetical protein
XC_RS02015  -1      11       -0.030492   8-amino-7-oxononanoate synthase
XC_RS02020  0       17       -0.939970   biotin synthase
XC_RS02025  1       22       -2.359072   competence protein ComF
XC_RS02030  3       28       -3.820851   4-hydroxybenzoate octaprenyltransferase
XC_RS02040  4       38       -5.463100   *integrase
XC_RS02050  3       31       -5.152754   TonB-dependent receptor
XC_RS02055  0       28       -4.474914   hypothetical protein
XC_RS02060  1       33       -5.772694   superoxide dismutase
XC_RS02065  1       33       -5.615858   hypothetical protein
XC_RS02070  0       28       -5.209810   hypothetical protein
XC_RS02075  -1      25       -4.604043   TonB-dependent receptor
XC_RS02080  -2      19       -3.119347   transposase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS01895  FLGPRINGFLGI  33     0.002    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 33.0 bits (75), Expect = 0.002
Identities = 16/40 (40%), Positives = 19/40 (47%), Gaps = 1/40 (2%)

Query: 172 IGQDEVAEAPFEVVHGDRTVTVSRWMHNIDPPPFWAGQIA 211
IG D V + V +G TV V+ I P PF GQ A
Sbjct: 274 IGAD-VRISRVAVSYGTLTVQVTESPQVIQPAPFSRGQTA 312


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS01970  GPOSANCHOR    37     2e-05    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 37.0 bits (85), Expect = 2e-05
Identities = 12/55 (21%), Positives = 22/55 (40%), Gaps = 7/55 (12%)

Query: 102 PNGRPNGPRPQRPNPPPATGAPPRPAQPPRIGAPPRVIREIQRQTAPRNTRQQIP 156
G+ + Q P+ P A P + AP + Q + + T++Q+P
Sbjct: 459 RAGKASDS--QTPDAKPG-----NKAVPGKGQAPQAGTKPNQNKAPMKETKRQLP 506


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS01995  DPTHRIATOXIN  29     0.017    Diphtheria toxin signature.
		>DPTHRIATOXIN#Diphtheria toxin signature.

Length = 567

Score = 29.3 bits (65), Expect = 0.017
Identities = 18/69 (26%), Positives = 31/69 (44%), Gaps = 1/69 (1%)

Query: 37 ESLDYLGDTAPKVVLDVGAGPGHASATIKKRWPKAQVIALDQALPM-LRQARKTAGWWKP 95
E + GD A +VVL + G +S W +A+ ++++ + R R ++
Sbjct: 154 EFIKRFGDGASRVVLSLPFAEGSSSVEYINNWEQAKALSVELEINFETRGKRGQDAMYEY 213

Query: 96 FAQVCADAR 104
AQ CA R
Sbjct: 214 MAQACAGNR 222


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS02000  DHBDHDRGNASE  90     1e-23    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 90.1 bits (223), Expect = 1e-23
Identities = 73/258 (28%), Positives = 106/258 (41%), Gaps = 5/258 (1%)

Query: 4 GIAGRWALVCAASKGLGLGCAQALAREGANVVIVARGREALEQSASNLRALPGAGEVRSV 63
GI G+ A + A++G+G A+ LA +GA++ V E LE+ S+L+A E
Sbjct: 5 GIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPA 64

Query: 64 -VADITTSEGRAAALAA-CPQVDILINNAGGPPPGDFRQWERADWLRALDANMLAPIELI 121
V D + A + +DIL+N AG PG +W N
Sbjct: 65 DVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNAS 124

Query: 122 RATVDAMRARRFGRIVNITSSAVKAPIDILGLSNGARAGLTGFVAGLARSTVADNVTLNN 181
R+ M RR G IV + S+ P + ++A F L N+ N
Sbjct: 125 RSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNI 184

Query: 182 LLPGQFATDRLRGNFAA-IAEQQGTSAEAVAAHKRAGIPAGRFGEPDEFGAACAFLCSAQ 240
+ PG TD +A +Q + GIP + +P + A FL S Q
Sbjct: 185 VSPGSTETDMQWSLWADENGAEQVIKGS--LETFKTGIPLKKLAKPSDIADAVLFLVSGQ 242

Query: 241 AGYITGQNLLIDGGAYPG 258
AG+IT NL +DGGA G
Sbjct: 243 AGHITMHNLCVDGGATLG 260


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS02070  PYOCINKILLER  26     0.009    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 26.3 bits (57), Expect = 0.009
Identities = 15/54 (27%), Positives = 26/54 (48%)

Query: 3 ADDHGRQESNRQSVLGELDFRYFPQQQRRERRLIRPQYEERTFALADYFQSPLE 56
+ ++G QE + +G RY P Q + +RR I Q+ + L Q+ L+
Sbjct: 45 STENGWQEFESYADVGVDPRRYVPLQVKEKRREIELQFRDAEKKLEASVQAELD 98


9     XC_RS02125  XC_RS02180  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS02125  -1      10       3.096860    hypothetical protein
XC_RS02130  0       14       3.937247    hypothetical protein
XC_RS02135  0       15       4.179653    glycogen synthase
XC_RS02140  -1      13       3.321818    glycogen branching protein
XC_RS02145  -1      13       3.266355    malto-oligosyltrehalose trehalohydrolase
XC_RS02150  -1      13       3.268468    4-alpha-glucanotransferase
XC_RS02155  -1      11       2.166495    malto-oligosyltrehalose synthase
XC_RS02160  -1      8        1.389200    hypothetical protein
XC_RS02165  0       8        1.111019    glycogen debranching protein
XC_RS02170  0       10       3.771860    NAD-dependent dehydratase
XC_RS02175  -1      12       3.126704    3-oxoacyl-ACP reductase
XC_RS02180  -2      12       4.095718    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS02175  DHBDHDRGNASE  119    7e-35    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 119 bits (300), Expect = 7e-35
Identities = 79/251 (31%), Positives = 113/251 (45%), Gaps = 9/251 (3%)

Query: 11 VLIAGGSRGIGLAIAQAFVQSGAQVSLCARNPEGLANAAAQLAADGAPVHTFACDLADAA 70
I G ++GIG A+A+ GA ++ NPE L + L A+ F D+ D+A
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDSA 70

Query: 71 QIERYVHAAAQAFGGLDVVVNNAS----GYGHGNDDESWQAGLDVDLMAAVRCNRAAVPY 126
I+ + G +D++VN A G H DE W+A V+ +R+ Y
Sbjct: 71 AIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSKY 130

Query: 127 LRQSGNAVILNISSINGQRPTPRAIAYSTAKAALNYYTTTLAAELARERIRVNAIAPGSI 186
+ + I+ + S P AY+++KAA +T L ELA IR N ++PGS
Sbjct: 131 MMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPGST 190

Query: 187 E--FPGGLWEQRSRDEPALY---ARIRDSIPFGGFGQVQHIADAALFLASPQARWITGQV 241
E LW + E + + IP + IADA LFL S QA IT
Sbjct: 191 ETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHITMHN 250

Query: 242 LAVDGGQSLGV 252
L VDGG +LGV
Sbjct: 251 LCVDGGATLGV 261


10    XC_RS02395  XC_RS02525  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS02395  1       14       -4.776208   lipid kinase
XC_RS02400  3       21       -7.741377   anthranilate synthase component I
XC_RS02405  4       24       -8.997664   threonine aldolase
XC_RS02410  6       25       -10.044024  hypothetical protein
XC_RS02415  6       23       -9.521822   DEAD/DEAH box helicase
XC_RS02420  7       25       -9.841422   restriction endonuclease subunit S
XC_RS02425  4       18       -7.399872   anticodon nuclease
XC_RS02430  0       12       -4.032996   DNA-binding protein
XC_RS02435  -1      12       -2.959573   type I restriction-modification protein subunit
XC_RS02440  0       14       -0.782409   anthranilate synthase component II
XC_RS02445  1       12       -2.319274   Asp/Glu/hydantoin racemase
XC_RS02450  2       11       -2.641171   anthranilate phosphoribosyltransferase
XC_RS02455  3       9        -2.226908   indole-3-glycerol phosphate synthase
XC_RS02460  3       11       -2.849962   hypothetical protein
XC_RS02465  6       16       -3.458355   CRP-like protein Clp
XC_RS02470  4       15       -3.066775   S-adenosylmethionine decarboxylase proenzyme
XC_RS02475  0       16       -0.495140   membrane protein
XC_RS02480  2       17       -1.952391   2-nonaprenyl-3-methyl-6-methoxy-1,4-benzoquinol
XC_RS02485  -1      18       -0.254422   50S ribosomal protein L13
XC_RS02490  0       18       1.247565    30S ribosomal protein S9
XC_RS02505  1       18       1.939203    **RNA pyrophosphohydrolase
XC_RS02510  1       19       2.336617    bacterioferritin-associated ferredoxin
XC_RS02515  1       19       2.475764    bacterioferritin
XC_RS02520  1       19       3.081093    histidine kinase
XC_RS02525  -2      11       3.024920    GGDEF signaling protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS02515  HELNAPAPROT   30     0.003    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family

signature.
Length = 153

Score = 29.8 bits (67), Expect = 0.003
Identities = 20/103 (19%), Positives = 42/103 (40%), Gaps = 10/103 (9%)

Query: 44 EYKESIDEMKHADKLSDRILFLEGLPNF---QALGKLRIGENP-----TEMFRCDLALER 95
E + E D +++R+L + G P + I + +EM + + +
Sbjct: 52 ELYDHAAE--TVDTIAERLLAIGGQPVATVKEYTEHASITDGGNETSASEMVQALVNDYK 109

Query: 96 EAVVVLREAVAYAETVKDYVSRQLFVDILESEEEHIDWLETQL 138
+ + + AE +D + LFV ++E E+ + L + L
Sbjct: 110 QISSESKFVIGLAEENQDNATADLFVGLIEEVEKQVWMLSSYL 152


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS02520  HTHFIS        59     3e-11    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 59.5 bits (144), Expect = 3e-11
Identities = 25/134 (18%), Positives = 47/134 (35%), Gaps = 5/134 (3%)

Query: 498 RILLVEDNPVNLLVAQKLLGVLGFEADTATDGEAALSSMESTRYDMVFMDCQMPVLDGYA 557
IL+ +D+ V + L G++ ++ + + D+V D MP + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 558 ATRRWRAMETESGGRPIPIVAMTANAMAGDRERCLAAGMDDYLSKPVAREQLNACLQRWL 617
R + + +P++ M+A + G DYL KP +L + R L
Sbjct: 65 LLPRIKKARPD-----LPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119

Query: 618 PRQALLPGPSTGAP 631
P
Sbjct: 120 AEPKRRPSKLEDDS 133


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS02525  HTHFIS        68     6e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 67.5 bits (165), Expect = 6e-14
Identities = 28/133 (21%), Positives = 56/133 (42%), Gaps = 4/133 (3%)

Query: 107 RVLIVEDDRSQALFAQSVLHGAGMHAQVEMTAASVPQAIQDYHPDLILMDLHMPELDGIR 166
+L+ +DD + L AG ++ AA++ + I DL++ D+ MP+ +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 167 LTTLIRQQPGQQLLPIVFLTGDPDPERQFEVLDSGADDFLTKPIRPRHLIAAVSN--RIR 224
L I++ + LP++ ++ + + GA D+L KP LI +
Sbjct: 65 LLPRIKKA--RPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEP 122

Query: 225 RARQQALQQAGEQ 237
+ R L+ +
Sbjct: 123 KRRPSKLEDDSQD 135


11  XC_RS02575  XC_RS02735  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS02575  0       17       3.145939    LysR family transcriptional regulator
XC_RS02580  -1      15       2.658711    transporter
XC_RS02585  0       14       2.588579    YccS/YhfK family integral membrane protein
XC_RS02590  1       13       2.766486    phosphoribosylamine--glycine ligase
XC_RS02595  2       11       2.898190    bifunctional purine biosynthesis protein PurH
XC_RS02600  5       16       3.458663    CDP-alcohol phosphatidyltransferase
XC_RS02605  5       16       3.199447    hypothetical protein
XC_RS02610  6       18       2.445409    ser/threonine protein phosphatase
XC_RS02615  3       12       0.890953    hypothetical protein
XC_RS02620  3       12       0.649195    hypothetical protein
XC_RS02625  2       13       0.419901    CDP-diacylglycerol--glycerol-3-phosphate
XC_RS02630  2       14       0.183317    acyltransferase
XC_RS02635  2       14       0.632258    phosphatidate cytidylyltransferase
XC_RS02640  2       14       1.360285    ice nucleation protein
XC_RS02645  0       14       4.860006    Fis family transcriptional regulator
XC_RS02650  1       14       5.295853    hypothetical protein
XC_RS02655  -2      12       3.012295    XRE family transcriptional regulator
XC_RS02660  -2      11       2.290718    hypothetical protein
XC_RS02665  -1      12       1.416613    ribosomal protein L11 methyltransferase
XC_RS02670  0       14       1.322941    hypothetical protein
XC_RS02675  -2      19       2.341183    hypothetical protein
XC_RS02680  -1      19       2.123233    acetyl-CoA carboxylase biotin carboxylase
XC_RS02685  2       22       3.651927    hypothetical protein
XC_RS02690  2       25       3.890556    acetyl-CoA carboxylase biotin carboxyl carrier
XC_RS02695  0       27       2.703928    3-dehydroquinate dehydratase
XC_RS02700  1       15       -0.655969   cytochrome C biogenesis protein
XC_RS02705  2       17       -2.986722   divalent cation tolerance protein
XC_RS02710  2       18       -4.579379   ribonuclease
XC_RS02715  3       24       -6.102462   molecular chaperone GroES
XC_RS02720  3       24       -6.018796   molecular chaperone GroEL
XC_RS02725  2       29       -5.848059   helicase
XC_RS02730  0       26       -4.938125   ATPase AAA
XC_RS02735  0       16       -3.751637   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS02580  TCRTETB       41     5e-06    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 41.4 bits (97), Expect = 5e-06
Identities = 32/137 (23%), Positives = 60/137 (43%), Gaps = 7/137 (5%)

Query: 37 LETLAQAFQIQVRTAGAVVTAAQLAYAAGLLLLVPLGDRLERRGLIVGLFVLSALGLLVS 96
L +A F + V TA L ++ G + L D+L + L++ +++ G ++
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 97 AASHS-FGMLLAGTIVTGASSVAAQILVPFA-ATLAAPQERGRVIGTVMSGLLLGILLAR 154
HS F +L+ + GA + A LV A + RG+ G + S + +G +
Sbjct: 97 FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGP 156

Query: 155 TAAGLLAGVGGWHTVYW 171
G++A H ++W
Sbjct: 157 AIGGMIA-----HYIHW 168


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS02610  PHPHTRNFRASE  29     0.048    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 29.0 bits (65), Expect = 0.048
Identities = 16/65 (24%), Positives = 24/65 (36%), Gaps = 10/65 (15%)

Query: 354 PASLRAAAQAIEAARAHG-PVLVCCALGYSRSAASVVTWLV------LSQRAASISEAMA 406
PA LR I+AA + G V +C + + L+ S A SI A +
Sbjct: 480 PAILRLVDMVIKAAHSEGKWVGMCGEMA---GDEVAIPLLLGLGLDEFSMSATSILPARS 536

Query: 407 CVRAA 411
+
Sbjct: 537 QLLKL 541


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS02620  cloacin       35     8e-04    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 34.7 bits (79), Expect = 8e-04
Identities = 17/49 (34%), Positives = 21/49 (42%), Gaps = 3/49 (6%)

Query: 448 GTPWASYHEIRAPNTHHHDSTGSSCGGGGDGGDGGGGDSGGDGGGGCGG 496
G+ W+S + P S GG G G GG G+SGG G G
Sbjct: 36 GSGWSSENN---PWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGNL 81



Score = 32.8 bits (74), Expect = 0.003
Identities = 14/41 (34%), Positives = 20/41 (48%)

Query: 461 NTHHHDSTGSSCGGGGDGGDGGGGDSGGDGGGGCGGCGGGG 501
++ ++ G S G GG G G+ GG+G G G GG
Sbjct: 40 SSENNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGN 80



Score = 31.2 bits (70), Expect = 0.009
Identities = 14/38 (36%), Positives = 18/38 (47%)

Query: 465 HDSTGSSCGGGGDGGDGGGGDSGGDGGGGCGGCGGGGD 502
++ G G G G G G +GG G GG G GG+
Sbjct: 43 NNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGN 80



Score = 30.8 bits (69), Expect = 0.015
Identities = 19/62 (30%), Positives = 22/62 (35%), Gaps = 18/62 (29%)

Query: 458 RAPNTHHHDSTGSSCGGGGDGGDGGGGDSG------------------GDGGGGCGGCGG 499
R NT H ++G+ GG G GGG G GGG G GG
Sbjct: 7 RGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGG 66

Query: 500 GG 501
G
Sbjct: 67 GN 68


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS02640  ICENUCLEATIN  994    0.0      Ice nucleation protein signature.
		>ICENUCLEATIN#Ice nucleation protein signature.

Length = 1258

Score = 994 bits (2571), Expect = 0.0
Identities = 942/1306 (72%), Positives = 1056/1306 (80%), Gaps = 48/1306 (3%)

Query: 1 MNREKVLALRTCTNNMSDHCGLIWPLSGIVECRHWQPSIKQENGLTGLLWGQGTNAHLNM 60
M +KVL LRTC NNM+DH G+IWPLSGIVEC++W+P ENGLTGL+WG+G+++ L++
Sbjct: 1 MKEDKVLILRTCANNMADHGGIIWPLSGIVECKYWKPVKGFENGLTGLIWGKGSDSPLSL 60

Query: 61 HADAHWVVCMVDTADIIWLGEEGMIKFPRAEVVYAGNRAGAMSCIAAGIEQHSPPKPEPP 120
HADA WVV VD + I + G IKFPRAEV++ G + AM I +
Sbjct: 61 HADARWVVAEVDADECIAIETHGWIKFPRAEVLHVGTKTSAMQFILHHRADYV------- 113

Query: 121 ADSVIAAEFTPKAAHAQFTAPIVESGAHSTAPLPSPPNGIGPQAAQPSNAILRTREIATY 180
+ + P + + T + + Q Q T EIATY
Sbjct: 114 --ACTEMQAGPGSPDVTSEVKVGNRSLPVTDDIDATIESGSTQPTQ-------TIEIATY 164

Query: 181 GSTLTGADQSQLIAGYGSTETAGNGSELIAGYGSTGVAGSDSTIVAGYGSSQTAGGGSTL 240
GSTL+G QSQLIAGYGSTETAG+ S LIAGYGSTG AG+DST+VAGYGS+QTAG S+
Sbjct: 165 GSTLSGTHQSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQTAGEESSQ 224

Query: 241 TAGYGSTQTARNGSELTAGYGSTETAGADSSLIAGYGSTQTSGGDSSLTAGYGSTQTAQN 300
AGYGSTQT GS+LTAGYGST TAG DSSLIAGYGSTQT+G DSSLTAGYGSTQTAQ
Sbjct: 225 MAGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQK 284

Query: 301 GSDLTAGYGSTSTAGTDSSLIAGYGSTQTSGGESSLTAGYGSTQTAQDGSDLTAGYGSTG 360
GSDLTAGYGST TAG DSSLIAGYGSTQT+G ES+ TAGYGSTQTAQ GSDLTAGYGSTG
Sbjct: 285 GSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTG 344

Query: 361 TAGADSSLIAGYGSTQTSGNDSSLTAGYGSTQTARTGSDLTAGYGSTSTAGADSTLIAGY 420
TAG DSSLIAGYGSTQT+G DSSLTAGYGSTQTA+ GSDLTAGYGST TAGADS+LIAGY
Sbjct: 345 TAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGY 404

Query: 421 GSTQTSGGDSSLTAGYGSTQTARKGSDLTTGYGSTSTAGADSTLIAGYGSTQTSGSESSL 480
GSTQT+G +S+ TAGYGSTQTA+KGSDLT GYGST TAG DS+LIAGYGSTQT+G +SSL
Sbjct: 405 GSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSL 464

Query: 481 TAGYGSTQTARKGSDLTAGYGSTSTAGADSTLIAGYGSTQTSGGESSLTAGYGSTQTARK 540
TAGYGSTQTA+KGSDLTAGYGSTSTAG +S+LIAGYGSTQT+G S+LTAGYGSTQTA+
Sbjct: 465 TAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQN 524

Query: 541 GSDLTAGYGSTSTAGGDSTLVAGYGSTQTSGGDSSLTAGYGSTQTARSGSDLTTGYGSTS 600
SDL GYGSTSTAG +S+L+AGYGSTQT+ +S LTAGYGSTQTAR GSDLT GYGST
Sbjct: 525 ESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGSTG 584

Query: 601 TAGGESTLIAGYGSTQTSGNASSLTAGYGSTQTARSGSDLTTGYGSTSTAGADSTLIAGY 660
TAG +S++IAGYGSTQT+ SSLTAGYGSTQTAR S LTTGYGSTSTAGADS+LIAGY
Sbjct: 585 TAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGY 644

Query: 661 GSTQTSGGESSLTAGYGSTQTARKGSDLTTGYGSTSTAGADSTLIAGYGSTQTAGGESSL 720
GSTQT+G S LTAGYGSTQTA++GSDLT GYGSTSTAGADS+LIAGYGSTQTAG S L
Sbjct: 645 GSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYNSIL 704

Query: 721 TAGYGSTQTARKGSDLTAGYGSTSTAGSDSSLIAGYGSTQTAGFKSILTTGYGSTQTAQE 780
TAGYGSTQTA++GSDLT+GYGSTSTAG+DSSLIAGYGSTQTA + S LT GYGSTQTA+E
Sbjct: 705 TAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYGSTQTARE 764

Query: 781 GSLLTAGYGSSSTAGSDSSLIAGYGSTQTAGFKSILTAGYGSTQTAQERSTLTTGYGSTS 840
S+LT GYGS+STAG+DSSLIAGYGSTQTAG+ SILTAGYGSTQTAQERS LTTGYGSTS
Sbjct: 765 QSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDLTTGYGSTS 824

Query: 841 TAGHDSTLIAGYGSTQTAGYKSILTTGYGSTQTAQESSSLIAGYGSSSMAGPDSSLIAGY 900
TAG DS+LIAGYGSTQTAGY SILT GYGSTQTAQE+S L GYGS+S AG DSSLIAGY
Sbjct: 825 TAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYDSSLIAGY 884

Query: 901 GSTQTAGYDSFLTAGYGSTQTAQSSSWLITGYGSTSTASFQSSLIAGYGSTQTAGYESTL 960
GSTQTAGY+S LTAGYGSTQTAQ +S L TGYGSTSTA ++SSLIAGYGSTQTA ++STL
Sbjct: 885 GSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSLIAGYGSTQTASFKSTL 944

Query: 961 TAGYGSTQTAQEISWLTTGYGSTQTAGHGSILTAGYGSNSTAGYESTLIAGYGSTQTAGY 1020
AGYGS+QTA+E S LT GYGST AG+ S L AGYGS TAGY+STL AGYGSTQTA +
Sbjct: 945 MAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQTAGYQSTLTAGYGSTQTAEH 1004

Query: 1021 ESTLTAGYGSTLTALENSSLTAGYGSTEIAGFSSTLIAGYGSSQTAGGDSTLTAGYGSTL 1080
STLTAGYGST TAG DS+L AGYGS+L
Sbjct: 1005 SSTLTAGYGST--------------------------------ATAGADSSLIAGYGSSL 1032

Query: 1081 MALDNSTLTAGYGSTETAGQDSSLIAGYGSNLTSGVRSYLTAGYGSNQIASYGSSLIAGH 1140
+ S LTAGYGST +G S L AGYGS+L SG RS LTAGYGSNQIAS+ SSLIAG
Sbjct: 1033 TSGIRSFLTAGYGSTLISGLRSVLTAGYGSSLISGRRSSLTAGYGSNQIASHRSSLIAGP 1092

Query: 1141 ESTQIAGHRSMLIAGKLSSQTAGSRSTLIAGRGSIQTAGDRSKLIAGADSTQIAGDRSKL 1200
ESTQI G+RSMLIAGK SSQTAG RSTLI+G S+Q AG+R KLIAGADSTQ AGDRSKL
Sbjct: 1093 ESTQITGNRSMLIAGKGSSQTAGYRSTLISGADSVQMAGERGKLIAGADSTQTAGDRSKL 1152

Query: 1201 LAGSNSFLTAGDRSKLTAGDDCTLMAGDRSKLTAGKNSILTAGANSRLIGSLGSTLTGGE 1260
LAG+NS+LTAGDRSKLTAG+DC LMAGDRSKLTAG NSILTAG S+LIGS GSTLT GE
Sbjct: 1153 LAGNNSYLTAGDRSKLTAGNDCILMAGDRSKLTAGINSILTAGCRSKLIGSNGSTLTAGE 1212

Query: 1261 DSVLIFRCWDGKRYTNIIAKTGEEGVEADIAYQVDDDKNVVEKFDD 1306
+SVLIFRCWDGKRYTN++AKTG+ G+EAD+ YQ+D+D N+V K ++
Sbjct: 1213 NSVLIFRCWDGKRYTNVVAKTGKGGIEADMPYQMDEDNNIVNKPEE 1258


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS02645  DNABINDNGFIS  114    3e-37    DNA-binding protein FIS signature.
		>DNABINDNGFIS#DNA-binding protein FIS signature.

Length = 98

Score = 114 bits (287), Expect = 3e-37
Identities = 38/74 (51%), Positives = 55/74 (74%)

Query: 16 KSPLREHVAQSVRRYLRDLDGSDADDVYEIVLREMEIPLFVEVLNHCEGNQSRAAAMLGI 75
+ PLR+ V Q+++ Y L+G D +D+YE+VL E+E PL V+ + GNQ+RAA M+GI
Sbjct: 24 QKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQPLLDMVMQYTRGNQTRAALMMGI 83

Query: 76 HRATLRKKLKEYGL 89
+R TLRKKLK+YG+
Sbjct: 84 NRGTLRKKLKKYGM 97


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS02735  RTXTOXIND     41     2e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 41.0 bits (96), Expect = 2e-05
Identities = 29/201 (14%), Positives = 66/201 (32%), Gaps = 19/201 (9%)

Query: 467 QRARRQLSAQQNLLAQWQSDLESMRPVVEDSDLFRRIEFEGFDQAT-GDEAEFIGLVEQV 525
+ + L + ++Q S+ + F + + L+++
Sbjct: 137 LKTQSSLLQARLEQTRYQILSRSIELNKLPELKLP--DEPYFQNVSEEEVLRLTSLIKEQ 194

Query: 526 ESELRGHYEQIRTAVSQIEATLATTPARFAALRLQVAITQAQ-ADYQALLER---LKSQG 581
S + Q + + A T AR + +++ D+ +LL + K
Sbjct: 195 FSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAV 254

Query: 582 VESPDQYAALVARRDELQKELKRLDAGLHEQAGYIQQAEQ-----VLEELTQLRHEISES 636
+E ++Y V + +L+++++ + Q Q +L++L Q I
Sbjct: 255 LEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLL 314

Query: 637 RKDFITEINQRAGNRIRLSLS 657
+ A N R S
Sbjct: 315 T-------LELAKNEERQQAS 328



Score = 36.0 bits (83), Expect = 7e-04
Identities = 30/187 (16%), Positives = 57/187 (30%), Gaps = 13/187 (6%)

Query: 371 RNRFPVTVLSQKQIYALAEQRGYLLELLDRQPQVDKAQWQRRFDAMRQRFLGLRAQARQL 430
N+ P L + + + L + Q + WQ + L + +
Sbjct: 162 LNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQ--FSTWQNQKYQKELN---LDKKRAER 216

Query: 431 APDLSQRPALEAELRDTERKLKALE--LSHHASSLQAYQRARRQLSAQQNLLAQWQSDLE 488
L++ E R + +L L A + A + N L ++S LE
Sbjct: 217 LTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLE 276

Query: 489 SMRPVVEDSDLFRRIEFEGFDQATGDE-AEFIGLVEQVESEL-----RGHYEQIRTAVSQ 542
+ + + ++ + F D+ + + + EL R IR VS
Sbjct: 277 QIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSV 336

Query: 543 IEATLAT 549
L
Sbjct: 337 KVQQLKV 343


12  XC_RS02780  XC_RS02820  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS02780  2       12       0.968898    phospho-2-dehydro-3-deoxyheptonate aldolase
XC_RS02785  2       13       1.311661    hypothetical protein
XC_RS02790  1       10       2.457778    acetyl-CoA hydrolase
XC_RS02795  2       13       3.740178    hypothetical protein
XC_RS02800  2       12       3.160385    hypothetical protein
XC_RS02805  4       15       3.352902    gluconolactonase
XC_RS02810  5       14       3.015722    hypothetical protein
XC_RS02815  6       13       3.031110    RNA polymerase sigma factor
XC_RS02820  3       11       2.817435    iron dicitrate transporter FecR
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS02800  FLGHOOKFLIK   34     0.002    Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 34.0 bits (77), Expect = 0.002
Identities = 25/122 (20%), Positives = 42/122 (34%), Gaps = 1/122 (0%)

Query: 125 LPGAVVVSP-APASSSATATASSATATATTTNGVSAPRTSQADGMPVDPAATNAANASNA 183
LPG A S+ T T T+ ++ + A G P P A A +
Sbjct: 138 LPGFDNTPKVTDAPSTVLPTEKPTLFTKLTSEQLTTAQPDDAPGTPAQPLTPLVAEAQSK 197

Query: 184 AAGEAPPVATAATAPNAAATATTQPAARSTRSQTHSPAWTLQFDRIVAEQVQALTLRQLQ 243
A + P A A TQP +P + ++ + +++ + T + Q
Sbjct: 198 AEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEWQQSLSQHISLFTRQGQQ 257

Query: 244 IA 245
A
Sbjct: 258 SA 259


13  XC_RS02895  XC_RS03110  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS02895  2       7        2.666345    malonate decarboxylase subunit alpha
XC_RS02900  6       11       4.003192    malonate decarboxylase acyl carrier protein
XC_RS02905  5       9        3.135925    malonate decarboxylase subunit beta
XC_RS02910  4       11       2.477726    malonate decarboxylase subunit gamma
XC_RS02915  2       11       2.185312    phosphoribosyl-dephospho-CoA transferase
XC_RS02920  1       9        1.802311    CitG protein
XC_RS02925  1       10       1.325559    ACP S-malonyltransferase
XC_RS02930  0       10       1.182339    DeoR family transcriptional regulator
XC_RS02935  -3      10       0.714342    GntR family transcriptional regulator
XC_RS02940  -2      8        -0.284222   HAD family hydrolase
XC_RS02945  -2      8        -0.603164   anti-sigma F factor antagonist
XC_RS02950  -2      12       -0.718181   hypothetical protein
XC_RS02955  1       14       -3.992181   IcfG protein
XC_RS02960  2       18       -5.268430   hypothetical protein
XC_RS02965  1       18       -4.984645   lipase
XC_RS02970  1       18       -4.897871   arabinogalactan endo-1,4-beta-galactosidase
XC_RS02975  2       24       -5.858897   pyruvate dehydrogenase
XC_RS02980  3       47       -8.527622   RNA-directed DNA polymerase
XC_RS02985  -1      18       -2.257048   transposase
XC_RS02990  -2      24       -3.119640   aldehyde-activating protein
XC_RS02995  -3      20       -3.571627   dimethyladenosine transferase
XC_RS03000  -2      20       -3.609695   hypothetical protein
XC_RS03005  -3      16       -1.931724   hypothetical protein
XC_RS03010  -2      13       -0.723123   oxidoreductase
XC_RS03015  0       21       -1.476023   hypothetical protein
XC_RS03020  -2      17       -0.855493   glutathione S-transferase
XC_RS03025  -3      20       -0.203447   glyoxalase
XC_RS03030  -1      16       0.137206    membrane protein
XC_RS03035  0       15       0.485263    membrane protein
XC_RS03040  2       17       0.487524    MarR family transcriptional regulator
XC_RS03045  4       14       1.065288    aminopeptidase
XC_RS03050  6       14       1.885806    esterase
XC_RS03055  7       13       2.419132    fatty acid desaturase
XC_RS03060  5       16       2.357481    hypothetical protein
XC_RS03065  4       12       1.491734    hypothetical protein
XC_RS03070  1       13       1.901425    glycosyl transferase
XC_RS03075  0       13       2.468225    hypothetical protein
XC_RS03080  0       12       2.516353    membrane protein
XC_RS03085  0       12       2.497844    hypothetical protein
XC_RS03090  0       11       2.416141    peptidase S9
XC_RS03095  2       10       2.464759    hypothetical protein
XC_RS03100  3       11       1.631808    histidine kinase
XC_RS03105  2       17       -0.427832   hypothetical protein
XC_RS03110  2       18       -0.102400   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS03010  DHBDHDRGNASE  38     6e-05    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 38.1 bits (88), Expect = 6e-05
Identities = 15/91 (16%), Positives = 33/91 (36%), Gaps = 3/91 (3%)

Query: 171 LRGRVALLTGGRVKIGYQAGLKLLRAGAELIVTTRFPRDSAARYAEEPDFAEWGHRLQVF 230
+ G++A +TG IG L GA + + + F
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHI---AAVDYNPEKLEKVVSSLKAEARHAEAF 62

Query: 231 GLDLRHTPSVEAFCSQLLATRTRLDFIINNA 261
D+R + +++ +++ +D ++N A
Sbjct: 63 PADVRDSAAIDEITARIEREMGPIDILVNVA 93


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS03110  SUBTILISIN    39     1e-06    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 38.7 bits (90), Expect = 1e-06
Identities = 9/35 (25%), Positives = 15/35 (42%)

Query: 20 PGTQLYGFKVLGDAGNGRDSWMIKAVQHVANLHAR 54
P L KVL G+G+ W+I+ + +
Sbjct: 108 PEADLLIIKVLNKQGSGQYDWIIQGIYYAIEQKVD 142


14  XC_RS03255  XC_RS03310  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS03255  1       16       3.563584    sulfur deprivation response regulator
XC_RS03260  0       17       3.156512    glucans biosynthesis glucosyltransferase H
XC_RS03265  0       17       3.050360    carboxylesterase
XC_RS03270  0       17       2.592539    histidine kinase
XC_RS03275  0       17       2.540436    transcriptional regulator
XC_RS03280  -1      15       0.581137    porphobilinogen deaminase
XC_RS03285  0       10       -2.357080   salt-induced outer membrane protein
XC_RS03290  1       14       -3.659074   transcriptional regulator
XC_RS03295  1       14       -3.458784   outer membrane lipoprotein
XC_RS03300  1       9        -3.319658   membrane protein
XC_RS03305  0       10       -3.493444   prolyl oligopeptidase
XC_RS03310  1       14       -3.597228   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS03265  TCRTETOQM     30     0.008    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 30.2 bits (68), Expect = 0.008
Identities = 9/16 (56%), Positives = 9/16 (56%)

Query: 6 IERETGPNPQWAVIWL 21
I E PNP WA I L
Sbjct: 432 IHIEVPPNPFWASIGL 447


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS03270  PF06580       157    4e-47    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 157 bits (398), Expect = 4e-47
Identities = 70/303 (23%), Positives = 128/303 (42%), Gaps = 17/303 (5%)

Query: 59 LWLALAVSVLLCVLRPSLSRLPPRLGGLAALSIAAVVAMLGAGIVHGLYAVLDQAPLGPL 118
+ L L + + R +L L L V+ M+ ++ +L P+
Sbjct: 51 MGLVLTHAYRSFIKRQGWLKLNMGQIILRVLPACVVIGMVWFVANTSIWRLLAFINTKPV 110

Query: 119 VGFWRFTLGSAATVVLIT-ALALRYFYVS----------DRWEAQVQANARAEADALQAR 167
L VV++T +L YF D+W+ A A+ AL+A+
Sbjct: 111 AFTLPLALSIIFNVVVVTFMWSLLYFGWHFFKNYKQAEIDQWKMASMAQ-EAQLMALKAQ 169

Query: 168 IRPHFLFNSMNLIASLLRRDPVVAEQAVLDLSDLFRAALGAGEG-VSTLRAECELAERYL 226
I PHF+FN++N I +L+ DP A + + LS+L R +L +L E + + YL
Sbjct: 170 INPHFMFNALNNIRALILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYL 229

Query: 227 AIESLRLGDRLQVRWHKQEPLPWALPMPRLVLQPLVENAVLHGVSRMPEGGTLYLSLRQR 286
+ S++ DRLQ + + +P +++Q LVEN + HG++++P+GG + L +
Sbjct: 230 QLASIQFEDRLQFENQINPAI-MDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKD 288

Query: 287 GSQLQIRIVNPAPQPGTQAPLVVAGAGHAQASISHRLAFQFGAGARMAAGWSEGYYACEI 346
+ + + N G ++ RL +G A++ +G +
Sbjct: 289 NGTVTLEVENTGSLALKNTKE---STGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345

Query: 347 TLP 349
+P
Sbjct: 346 LIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS03275  HTHFIS        65     3e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 64.9 bits (158), Expect = 3e-14
Identities = 30/137 (21%), Positives = 51/137 (37%), Gaps = 6/137 (4%)

Query: 1 MTATQVRVLIADDEPLARERLRLLLGEHLHVQVVGEAENGEQVLQLCEQQQPDLVLLDIA 60
MT +L+ADD+ R L L + + N + + DLV+ D+
Sbjct: 1 MTGA--TILVADDDAAIRTVLNQALSRAGYDVRI--TSNAATLWRWIAAGDGDLVVTDVV 56

Query: 61 MPGVDGLETARLLRQRPQPPAVVFCTAYD--QHALSAFDAAALDYLMKPVRPERLAAALQ 118
MP + + +++ V+ +A + A+ A + A DYL KP L +
Sbjct: 57 MPDENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116

Query: 119 RVATYLAGRTPSLAPAA 135
R R L +
Sbjct: 117 RALAEPKRRPSKLEDDS 133


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS03295  BCTLIPOCALIN  50     3e-10    Bacterial lipocalin signature.
		>BCTLIPOCALIN#Bacterial lipocalin signature.

Length = 171

Score = 50.4 bits (120), Expect = 3e-10
Identities = 29/117 (24%), Positives = 52/117 (44%), Gaps = 1/117 (0%)

Query: 33 DLSKIMGTWYVIARMPNPVERGHVASRDEYTLVEDGKVAVRYRYREGFEEPEKEVNARAS 92
+L+ +G WY +AR+ + ERG EY + DG ++V R + KE +A
Sbjct: 30 ELNNYLGKWYEVARLDHSFERGLSQVTAEYRVRNDGGISVLNRGYSEEKGEWKEAEGKAY 89

Query: 93 VDADSGNRDWRVWFYKVIPAKQRILEIAPDG-SWMLISYPGRDLAWIFARKPDMSRD 148
S + +V F+ + E+ + S+ +S P + W+ +R P + R
Sbjct: 90 FVNGSTDGYLKVSFFGPFYGSYVVFELDRENYSYAFVSGPNTEYLWLLSRTPTVERG 146


15  XC_RS03505  XC_RS03530  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS03505  1       14       -5.285278   rod shape-determining protein RodA
XC_RS03510  2       26       -5.623148   NTPase
XC_RS03515  1       27       -5.846064   transposase
XC_RS03520  0       26       -4.992524   transposase
XC_RS03525  1       28       -5.147327   type VI secretion protein
XC_RS03530  -2      27       -3.754385   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS03525  BICOMPNTOXIN  31     0.005    Staphylococcal bi-component toxin signature.
		>BICOMPNTOXIN#Staphylococcal bi-component toxin signature.

Length = 315

Score = 31.0 bits (70), Expect = 0.005
Identities = 11/44 (25%), Positives = 21/44 (47%)

Query: 231 QIVYAPREQQDANDYSDMLGYTTVRKKNKSQTSGKQSSVSYSET 274
I Y P+ + ++ + S LGY + + G S +YS++
Sbjct: 122 LINYLPKNKIESTNVSQTLGYNIGGNFQSAPSLGGNGSFNYSKS 165


16  XC_RS03685  XC_RS03765  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS03685  2       12       1.879732    2-amino-4-hydroxy-6-
XC_RS03690  0       9        0.587515    pteridine reductase
XC_RS03695  1       9        0.539174    hypothetical protein
XC_RS03700  0       9        1.002042    membrane protein
XC_RS03705  1       9        1.125388    hypothetical protein
XC_RS03710  1       10       1.208493    hypothetical protein
XC_RS03715  0       10       1.273232    TonB-dependent receptor
XC_RS03720  4       13       2.989177    type II secretion system protein C
XC_RS03725  3       13       2.892426    type II secretion system protein D
XC_RS03730  4       15       2.928542    general secretion pathway protein E
XC_RS03735  4       17       2.978209    general secretion pathway protein F
XC_RS03740  6       18       3.518648    general secretion pathway protein GspG
XC_RS03745  8       19       3.671960    general secretion pathway protein H
XC_RS03750  9       18       3.169073    general secretion pathway protein I
XC_RS03755  7       15       3.043372    type II secretion system protein J
XC_RS03760  4       12       1.526791    type II secretion system protein K
XC_RS03765  2       10       1.409027    type II secretion system protein L
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS03690  DHBDHDRGNASE  115    4e-33    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 115 bits (288), Expect = 4e-33
Identities = 77/253 (30%), Positives = 122/253 (48%), Gaps = 16/253 (6%)

Query: 6 KVVLITGAARRIGAQIATTLHGAGYRVALHAHRSGDALAARVAALCAQRAGSACALQADL 65
K+ ITGAA+ IG +A TL G +A + + L V++L A+ A A A AD+
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIA-AVDYNPEKLEKVVSSLKAE-ARHAEAFPADV 66

Query: 66 RTPEAPAQLVDACVAAFGRLDAVVNNASAFYPTVLGEATPAQWDELFAVNARAPFFIAQA 125
R A ++ G +D +VN A P ++ + +W+ F+VN+ F +++
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 126 AAAQLRAHH-GAIVNLTDLHAEQPMRQHPLYGASKSALEMLTRSLALELAPQ-VRVNAVA 183
+ + G+IV + A P Y +SK+A M T+ L LELA +R N V+
Sbjct: 127 VSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVS 186

Query: 184 PGAI-------LWPEDGKADAAKQALLAR----TPLARIGTPEEVAEAVRWLLDD-ASFI 231
PG+ LW ++ A+ + L PL ++ P ++A+AV +L+ A I
Sbjct: 187 PGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHI 246

Query: 232 TGHTLRVDGGRRL 244
T H L VDGG L
Sbjct: 247 TMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS03710  GPOSANCHOR    33     0.002    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 33.5 bits (76), Expect = 0.002
Identities = 9/35 (25%), Positives = 12/35 (34%)

Query: 27 ADRVASPGEGAFPPAPTPAPTPAPTPAPTPAPAPT 61
A+ +A G + TP P P AP
Sbjct: 452 AEELAKLRAGKASDSQTPDAKPGNKAVPGKGQAPQ 486


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS03720  BCTERIALGSPC  45     1e-07    Bacterial general secretion pathway protein C signa...
		>BCTERIALGSPC#Bacterial general secretion pathway protein C signature.

Length = 272

Score = 45.0 bits (106), Expect = 1e-07
Identities = 54/273 (19%), Positives = 89/273 (32%), Gaps = 44/273 (16%)

Query: 15 MSARAVRTAAVCVLLVLLAVQGVRLVWLLVTPLG----------------PLGTAQG--- 55
+S +R +L++L Q + W + P P+
Sbjct: 9 LSPSVIRRILFYLLMLLFCQQLAMIFWRIGLPDNAPVSSVQITPAQARQQPVTLNDFTLF 68

Query: 56 --ATTAAPLPALQRDVFFRAPADSGDLGLVLHGVRVGG--ADSAAYLSTGDGRQGAYRIG 111
+ AL P + L L L GV G + S A +S D Q + +
Sbjct: 69 GVSPEKNKAGALDASQMSNLPPST--LNLSLTGVMAGDDDSRSIAIISK-DNEQFSRGVN 125

Query: 112 DAV-GPGLTLQAIATDHVMVRAGSALRRLPLIEHAAASAAITAPLPASGAPAAAAAVASN 170
+ V G + +I D V+++ L L SG+ A
Sbjct: 126 EEVPGYNAKIVSIRPDRVVLQYQGRYEVLGLYSQ-----------EDSGSDGVPGA---Q 171

Query: 171 VGARTAAAGTTAVDPQQLLRTTGLRANADGGGFTVMPRGDDALLRQAGLAPGDVLTQLNG 230
V + +T + + + + + G+ + P + GL D+ LNG
Sbjct: 172 VNEQLQQRASTTM--SDYVSFSPIMNDNKLQGYRLNPGPKSDSFYRVGLQDNDMAVALNG 229

Query: 231 RTL-DAEHLHELQDELRDGQTATLTYRRDGQTH 262
L DAE + + + D TLT RDGQ
Sbjct: 230 LDLRDAEQAKKAMERMADVHNFTLTVERDGQRQ 262


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS03725  BCTERIALGSPD  376    e-123    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 376 bits (967), Expect = e-123
Identities = 208/684 (30%), Positives = 329/684 (48%), Gaps = 58/684 (8%)

Query: 4 VRRPWLLSATLLLALPSLTMLPLHAADAPAVRMQDVDLRAFIQDVSRATGITFIVDTRVQ 63
V R + L+ LL +L P A + + D++ FI VS+ T I+D V+
Sbjct: 6 VIRSFSLT---LLIFAALLFRPAAAEEFS-ASFKGTDIQEFINTVSKNLNKTVIIDPSVR 61

Query: 64 GSVNVARAQAMSEADLLGMLLAVLRANGLIAVSSGPSTYRIIPDDTAAQQPG-----SAA 118
G++ V ++E L+VL G ++ +++ A +A
Sbjct: 62 GTITVRSYDMLNEEQYYQFFLSVLDVYGFAVINMNNGVLKVVRSKDAKTAAVPVASDAAP 121

Query: 119 SGNLGFATQVFTLQRVDARSAAEILKPLVGRGGVIMAM--PQGNSLLIADYADNLRRIRG 176
T+V L V AR A +L+ L GV + N LL+ A ++R+
Sbjct: 122 GIGDEVVTRVVPLTNVAARDLAPLLRQLNDNAGVGSVVHYEPSNVLLMTGRAAVIKRLLT 181

Query: 177 LVAQIDTDR-AAIDTVTLRNSSAQELARTLTTLF----GQAGERSAVLSVLPVESSNSLI 231
+V ++D ++ TV L +SA ++ + +T L A S V +V+ E +N+++
Sbjct: 182 IVERVDNAGDRSVVTVPLSWASAADVVKLVTELNKDTSKSALPGSMVANVVADERTNAVL 241

Query: 232 VRGDPALVQRVVRTALDLDGRAERRGDVSVVRLQHASAEQLLPVLQQLVGQTPGNEAEPG 291
V G+P QR++ LD + +G+ V+ L++A A L+ VL + T +E +
Sbjct: 242 VSGEPNSRQRIIAMIKQLDRQQATQGNTKVIYLKYAKASDLVEVLTG-ISSTMQSEKQAA 300

Query: 292 QETRPTAVDVAASAGAAQAQVIAPAAGKRPVIVRYPGSNALIINADPETQRALMDVIRQL 351
+ ++ A +NALI+ A P+ L VI QL
Sbjct: 301 KPVAALDKNIIIKAH--------------------GQTNALIVTAAPDVMNDLERVIAQL 340

Query: 352 DVHREQVLVEAIVVEISDTAAKRLGVQLLLAGRNGTVPLLATQYSGAAPGIVPLAAAAAG 411
D+ R QVLVEAI+ E+ D LG+Q A +N + TQ++ + I A A
Sbjct: 341 DIRRPQVLVEAIIAEVQDADGLNLGIQW--ANKNAGM----TQFTNSGLPISTAIAGANQ 394

Query: 412 TRSNNGEDDSVLEQARNVAAQSLLGLSGGLIGLAGQSNDAVFGMIIDAVKSDTGSNLLST 471
+ S+ S L G+ Q N + M++ A+ S T +++L+T
Sbjct: 395 YNKDGTVSSSLA---------SALSSFNGIAAGFYQGN---WAMLLTALSSSTKNDILAT 442

Query: 472 PSIMTLDNEQARILVGQEVPITTGEVLGAANDNPFRTIQRQDVGVELEVRPQINTAGGIT 531
PSI+TLDN +A VGQEVP+ TG + DN F T++R+ VG++L+V+PQIN +
Sbjct: 443 PSIVTLDNMEATFNVGQEVPVLTGSQTTS-GDNIFNTVERKTVGIKLKVKPQINEGDSVL 501

Query: 532 LAIKQEVSAIAGPVSTQSSEL--VFNKRQIETRVVVENGAIVALGGLLDQNDRQTVEKVP 589
L I+QEVS++A S+ SS+L FN R + V+V +G V +GGLLD++ T +KVP
Sbjct: 502 LEIEQEVSSVADAASSTSSDLGATFNTRTVNNAVLVGSGETVVVGGLLDKSVSDTADKVP 561

Query: 590 LLGDVPGLGALFRHKSRNRDKTNLMVFIRPTIIRDAADAQRMTAPRYNYLRDRQLADGDP 649
LLGD+P +GALFR S+ K NLM+FIRPT+IRD + ++ ++ +Y D Q
Sbjct: 562 LLGDIPVIGALFRSTSKKVSKRNLMLFIRPTVIRDRDEYRQASSGQYTAFNDAQSKQRGK 621

Query: 650 EAALDALVRDYLRAQPPQLPASAP 673
E L +D L P Q A+
Sbjct: 622 ENNDAMLNQDLLEIYPRQDTAAFR 645


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS03735  BCTERIALGSPF  342    e-117    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 342 bits (878), Expect = e-117
Identities = 170/405 (41%), Positives = 238/405 (58%), Gaps = 10/405 (2%)

Query: 1 MAKFDYTVLDLHGRNRHGVISADSVRSARSLLEQRQWVPLRVEPAAATTP---------M 51
MA++ Y LD G+ G ADS R AR LL +R VPL V+
Sbjct: 1 MAQYHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLSLR 60

Query: 52 RAARFSGKDLVLFTRQLATLVDTA-PLEEALRTIGTQSERRGVRAVTGQTHALVVEGFRL 110
R R S DL L TRQLATLV + PLEEAL + QSE+ + + + V+EG L
Sbjct: 61 RKIRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSL 120

Query: 111 SDAMARQGTAFPPLYRAMVAAGESAGALPQVLERLADLLERQAQVRSKLQSALVYPAALA 170
+DAM +F LY AMVAAGE++G L VL RLAD E++ Q+RS++Q A++YP L
Sbjct: 121 ADAMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLT 180

Query: 171 MTAGVVVIVLMTFVVPKVVDQFDSMGRALPWLTRAVIALSQFLLHAGIPLLVAGVIAAVV 230
+ A VV +L++ VVPKVV+QF M +ALP TR ++ +S + G +L+A + +
Sbjct: 181 VVAIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMA 240

Query: 231 AVQVRKRPPVRLAIDRAILRAPLLGRLLRDLHAARMARTLAIMVNSGLPLMEGLMIAART 290
+ ++ R++ R +L PL+GR+ R L+ AR ARTL+I+ S +PL++ + I+
Sbjct: 241 FRVMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGDV 300

Query: 291 VDNHALRLATDSMVTAIREGGSLGAAMKRAGVFPPTLLYMASSGENSGRLAPMLERAADY 350
+ N R A+REG SL A+++ +FPP + +M +SGE SG L MLERAAD
Sbjct: 301 MSNDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAADN 360

Query: 351 LEREFESFTTAAMSLLEPLIIVLLGGVVAVIVLSILLPILQFNTL 395
+REF S T A+ L EPL++V + VV IVL+IL PILQ NTL
Sbjct: 361 QDREFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTL 405


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS03740  BCTERIALGSPG  182    3e-62    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 182 bits (462), Expect = 3e-62
Identities = 65/134 (48%), Positives = 91/134 (67%), Gaps = 3/134 (2%)

Query: 17 RGFTLVELMVVIVIIGLLATVVMINVMPSQDRAMVEKARADVAVLEQALETYRLDNLTYP 76
RGFTL+E+MVVIVIIG+LA++V+ N+M ++++A +KA +D+ LE AL+ Y+LDN YP
Sbjct: 8 RGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKLDNHHYP 67

Query: 77 STEQGLQALLSAPGGLSRPERYRQGGYIRRLPEDPWGHAYQYRRPGRSGGFDVYSFGADG 136
+T QGL++L+ AP Y + GYI+RLP DPWG+ Y PG G +D+ S G DG
Sbjct: 68 TTNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLVNPGEHGAYDLLSAGPDG 127

Query: 137 AEGGDADNADIGNW 150
G + DI NW
Sbjct: 128 EMGTE---DDITNW 138


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS03745  BCTERIALGSPH  41     5e-07    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.
Length = 170

Score = 40.7 bits (95), Expect = 5e-07
Identities = 14/74 (18%), Positives = 31/74 (41%), Gaps = 1/74 (1%)

Query: 13 GFTLLEVLAVLVITALASTVVLLTLPDT-QRTLPDQADALATALSHARDEAILSLRMTEV 71
GFTLLE++ +L++ +++ +VLL P + + L + + + + V
Sbjct: 5 GFTLLEMMLILLLMGVSAGMVLLAFPASRDDSAAQTLARFEAQLRFVQQRGLQTGQFFGV 64

Query: 72 VLSAGGYAFRRQAR 85
+ + F
Sbjct: 65 SVHPDRWQFLVLEA 78


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS03750  BCTERIALGSPG  31     5e-04    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.
Length = 145

Score = 31.4 bits (71), Expect = 5e-04
Identities = 19/59 (32%), Positives = 34/59 (57%), Gaps = 4/59 (6%)

Query: 3 RRADTQAGFSLLELLVALAIFG-MAVVGLLNLSGESTRTAVILEERALAAVVAENQAIE 60
R D Q GF+LLE++V + I G +A + + NL G + ++A++ +VA A++
Sbjct: 2 RATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADK---QKAVSDIVALENALD 57


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS03755  BCTERIALGSPG  32     5e-04    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.
Length = 145

Score = 32.2 bits (73), Expect = 5e-04
Identities = 12/48 (25%), Positives = 24/48 (50%)

Query: 1 MMRLRRAAGFTLIELLVALAVFALVAAAAVGVLRQSIEQRAAVQARLQ 48
M + GFTL+E++V + + ++A+ V L + E+ +A
Sbjct: 1 MRATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSD 48


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS03765  SUBTILISIN    31     0.009    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 30.6 bits (69), Expect = 0.009
Identities = 22/94 (23%), Positives = 30/94 (31%), Gaps = 13/94 (13%)

Query: 22 DAQGH-VHVHGTAASTPPAARTVLVVPGTQVH-LRWLALPGRSPAQSLAAARLQLAEHLA 79
D GH HV GT A+T V V P + ++ L G + E
Sbjct: 82 DYNGHGTHVAGTIAATENENGVVGVAPEADLLIIKVLNKQGSGQYDWIIQGIYYAIEQKV 141

Query: 80 ----------SDVSTLHVVIAANAQADGTRLVAA 103
DV LH + A A ++ A
Sbjct: 142 DIISMSLGGPEDVPELHEAV-KKAVASQILVMCA 174


17  XC_RS03915  XC_RS04005  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS03915  3       14       0.330030    glutathione-dependent formaldehyde-activating
XC_RS03920  3       11       1.214642    S-formylglutathione hydrolase
XC_RS03925  3       12       1.588209    aspartate aminotransferase
XC_RS03930  0       10       0.410715    hypothetical protein
XC_RS03935  1       10       -1.394009   oxidoreductase
XC_RS03940  1       10       -1.386148   permease
XC_RS03945  -1      12       -1.008918   hypothetical protein
XC_RS03950  -1      12       -1.092675   DegV domain-containing protein XCC3382
XC_RS03955  2       14       -1.334045   cellulase
XC_RS03960  3       14       0.743888    cellulase
XC_RS03970  2       13       2.172685    hypothetical protein
XC_RS03975  0       12       1.607924    hypothetical protein
XC_RS03980  -1      14       1.578334    hypothetical protein
XC_RS03985  -2      13       1.503596    hypothetical protein
XC_RS03990  -1      13       2.230299    amidohydrolase
XC_RS03995  -1      12       1.761501    amidohydrolase
XC_RS04000  0       12       2.207416    hypothetical protein
XC_RS04005  2       15       2.721970    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS03955  BONTOXILYSIN  29     0.027    Bontoxilysin signature.
		>BONTOXILYSIN#Bontoxilysin signature.

Length = 1196

Score = 28.7 bits (64), Expect = 0.027
Identities = 17/73 (23%), Positives = 29/73 (39%), Gaps = 8/73 (10%)

Query: 163 DHNSWPIGTLTASKV----ITAGGRTFDLWEGFNSGAGYYVYTFIPTGTAGQATLKTSGS 218
H + I L A T + D + +S VY+F+ + T+K G
Sbjct: 495 THTALSINYLQAQITNNENFTL---SSDFSKVVSSKDKSLVYSFLDNLMSYLETIKNDGP 551

Query: 219 LNVDAKPFLNWLQ 231
++ D K + WL+
Sbjct: 552 IDTDKK-YYLWLK 563


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS03990  UREASE        40     2e-05    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 40.1 bits (94), Expect = 2e-05
Identities = 17/33 (51%), Positives = 22/33 (66%)

Query: 348 LAGLTRVPAEIFGVGDRIGSIAVGKQADLVLWD 380
+A T PA G+ IGS+ VGK+ADLVLW+
Sbjct: 406 IAKYTINPAIAHGLSHEIGSLEVGKRADLVLWN 438


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS03995  UREASE        36     2e-04    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 36.3 bits (84), Expect = 2e-04
Identities = 23/72 (31%), Positives = 34/72 (47%), Gaps = 12/72 (16%)

Query: 41 VLIQHATVLTGTGQRLDDADVLLRDGKVAAVGRG----------LEVPADARRIDGTGKW 90
+I +A +L G AD+ L+DG++AA+G+ + V I G GK
Sbjct: 70 TVITNALILDHWGIV--KADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEVIAGEGKI 127

Query: 91 VTPGIIDVHSHL 102
VT G +D H H
Sbjct: 128 VTAGGMDSHIHF 139



Score = 33.6 bits (77), Expect = 0.002
Identities = 16/35 (45%), Positives = 21/35 (60%)

Query: 389 RAIRWLTSNAAKALGIEQQTGALEPGKMGDMVVWN 423
R I T N A A G+ + G+LE GK D+V+WN
Sbjct: 404 RYIAKYTINPAIAHGLSHEIGSLEVGKRADLVLWN 438


18  XC_RS04295  XC_RS04400  Y  N  Y  Genomic Island
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS04295  -1      23       -3.962504   K(+)-insensitive pyrophosphate-energized proton
XC_RS04300  -2      28       -4.159114   hypothetical protein
XC_RS04305  -1      27       -4.569063   hypothetical protein
XC_RS04310  0       29       -7.108611   hypothetical protein
XC_RS04315  -1      32       -7.863033   hypothetical protein
XC_RS04320  3       39       -10.229503  hypothetical protein
XC_RS04325  1       28       -8.153769   transposase
XC_RS04330  2       33       -8.417090   transposase
XC_RS04335  2       37       -9.580762   hypothetical protein
XC_RS04340  2       31       -8.273995   hypothetical protein
XC_RS04350  0       27       -6.572073   hypothetical protein
XC_RS04355  0       23       -5.997333   hypothetical protein
XC_RS04360  0       26       -6.089628   hypothetical protein
XC_RS04365  0       16       -4.395791   hypothetical protein
XC_RS04370  0       13       -2.825760   hypothetical protein
XC_RS04375  -2      11       -0.995344   transposase
XC_RS04380  -1      11       -0.484686   hypothetical protein
XC_RS04385  -2      11       0.414554    hypothetical protein
XC_RS04390  0       13       1.443929    6-phosphofructokinase
XC_RS04395  1       16       1.587021    adenylate kinase
XC_RS04400  2       16       1.851595    UDP-N-acetylmuramate:L-alanyl-gamma-D-glutamyl-
19  XC_RS04915  XC_RS04945  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS04915  -1      14       3.282917    protein-S-isoprenylcysteine methyltransferase
XC_RS04920  1       13       3.572121    cysteine synthase
XC_RS04925  2       11       2.900171    uroporphyrin-III methyltransferase
XC_RS04930  4       12       2.521437    LysR family transcriptional regulator
XC_RS04935  4       10       2.207972    hypothetical protein
XC_RS04940  3       9        1.866226    histidine kinase
XC_RS04945  3       9        2.231699    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS04940  PF06580       32     0.005    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.8 bits (72), Expect = 0.005
Identities = 20/101 (19%), Positives = 36/101 (35%), Gaps = 23/101 (22%)

Query: 304 LVGNAVKY-----TERGQVLVGCRRRPGHAVVEVIDSGIGLNLEHPEDVFQAFRQADPGS 358
LV N +K+ + G++L+ + G +EV ++G
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSL--------------ALKNTK 308

Query: 359 DGLGIGLWIVHRTAETL---GCQVEVRPRPSGGTRFSVTIP 396
+ G GL V + L Q+++ + G V IP
Sbjct: 309 ESTGTGLQNVRERLQMLYGTEAQIKLSEKQ-GKVNAMVLIP 348


20  XC_RS05085  XC_RS05245  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS05085  2       22       -4.112990   thiazole synthase
XC_RS05090  3       28       -5.587299   sulfur carrier protein ThiS
XC_RS05095  3       29       -5.931031   lipase/esterase
XC_RS05105  5       48       -10.347545  *integrase
XC_RS05110  10      58       -12.543520  hypothetical protein
XC_RS05115  9       57       -12.133112  hypothetical protein
XC_RS05120  7       53       -11.351492  hypothetical protein
XC_RS05125  8       49       -11.219868  transcriptional regulator
XC_RS05130  7       48       -11.063965  D-alanyl-D-alanine endopeptidase
XC_RS05135  8       48       -11.238813  addiction module antitoxin
XC_RS05140  1       40       -9.716004   transposase
XC_RS05145  3       45       -11.323695  transposase
XC_RS05150  7       47       -12.478370  transposase
XC_RS05155  5       43       -10.870460  Type IV secretion system protein virB6
XC_RS05160  5       42       -10.550823  hypothetical protein
XC_RS05165  3       34       -8.995280   hypothetical protein
XC_RS05170  3       28       -7.607871   hypothetical protein
XC_RS05175  2       29       -5.611373   transposase
XC_RS05180  2       33       -6.990393   transposase
XC_RS05185  3       38       -8.752621   transposase ISxcC1
XC_RS05190  4       40       -8.913707   transposase
XC_RS05195  5       42       -8.646267   transposase
XC_RS05200  7       45       -8.931346   relaxase
XC_RS05205  4       44       -9.350530   hypothetical protein
XC_RS05210  4       45       -8.998220   hypothetical protein
XC_RS05215  3       44       -8.422760   hypothetical protein
XC_RS05220  3       44       -8.307429   hypothetical protein
XC_RS05225  2       42       -8.338387   hypothetical protein
XC_RS05230  1       41       -7.954098   membrane protein
XC_RS05235  1       38       -6.587780   hypothetical protein
XC_RS05240  2       33       -5.975799   conjugal transfer protein
XC_RS05245  2       15       -3.594977   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS05085  HTHFIS        31     0.003    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.3 bits (71), Expect = 0.003
Identities = 26/162 (16%), Positives = 50/162 (30%), Gaps = 26/162 (16%)

Query: 105 TKLEVLGDERTLYPDVVQTLKAAEQLVADGFEVMVYTSDDPILAKRLEEIGCVAVMPLAA 164
+ V D+ + + Q L A G++V + ++ + G + V +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRA------GYDVRITSNAATLWRWIAAGDGDLVVTDVVM 57

Query: 165 PIGSGLGIQNKYNLLEII--ENAKVPIIVDAGVGTASDAAIAMELGCDGVLMNTAIAGAR 222
P + LL I +P++V + T A A E GA
Sbjct: 58 PDENAFD------LLPRIKKARPDLPVLVMSAQNTFMTAIKASE------------KGAY 99

Query: 223 DPILMASAMRKAIEAGREAFLAGRIPRKRYASASSPVDGVIG 264
D + + + I A + + S ++G
Sbjct: 100 DYLPKPFDLTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVG 141


21  XC_RS05295  XC_RS05360  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS05295  1       27       -5.532443   CopG family transcriptional regulator
XC_RS05300  1       31       -6.359346   hypothetical protein
XC_RS05305  2       36       -7.813004   transposase
XC_RS05310  3       32       -8.395776   dephospho-CoA kinase
XC_RS05315  3       22       -6.770367   type 4 prepilin-like proteins leader
XC_RS05320  1       17       -5.066009   type II secretion system protein F
XC_RS05325  0       12       -2.478403   pilin
XC_RS05330  0       14       -1.747628   pilin
XC_RS05335  -1      16       -0.974633   type II secretory protein GspE
XC_RS05340  -1      21       -0.005775   chemotaxis protein CheY
XC_RS05345  -1      18       0.231609    ATPase
XC_RS05350  1       15       -0.748308   succinyl-CoA ligase [ADP-forming] subunit beta
XC_RS05355  2       12       0.084371    succinyl-CoA synthetase subunit alpha
XC_RS05360  2       13       1.068414    CopG family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS05315  PREPILNPTASE  330    e-116    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.
Length = 290

Score = 330 bits (849), Expect = e-116
Identities = 130/282 (46%), Positives = 175/282 (62%), Gaps = 1/282 (0%)

Query: 1 MAFLDQHPGLGFPAAAGLGLLIGSFLNVVILRLPKRMEWQWRRDAREILELPDI-YEPPP 59
+ P L F L+IGSFLNVVI RLP +E +W+ + R D + PP
Sbjct: 5 LELAHGLPWLYFSLVFLFSLMIGSFLNVVIHRLPIMLEREWQAEYRSYFNPDDEGVDEPP 64

Query: 60 PGIVVEPSHDPVTGDKLKWWENIPLFSWLMLRGKSRYSGKPISIQYPLVELLTSILCVAS 119
++V S P + ENIPL SWL LRG+ R PIS +YPLVELLT++L VA
Sbjct: 65 YNLMVPRSCCPHCNHPITALENIPLLSWLWLRGRCRGCQAPISARYPLVELLTALLSVAV 124

Query: 120 VWRFGFGWQGFGAIVLSCFLVAMSGIDLRHKLLPDQLTLPLMWLGLVGSMDNLYMPAKPA 179
GW A++L+ LVA++ IDL LLPDQLTLPL+W GL+ ++ ++ A
Sbjct: 125 AMTLAPGWGTLAALLLTWVLVALTFIDLDKMLLPDQLTLPLLWGGLLFNLLGGFVSLGDA 184

Query: 180 LLGAAVGYVSLWTVWWLFKQLTGKEGMGHGDFKLLAALGAWCGLKGILPIILISSLVGAV 239
++GA GY+ LW+++W FK LTGKEGMG+GDFKLLAALGAW G + + ++L+SSLVGA
Sbjct: 185 VIGAMAGYLVLWSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALPIVLLLSSLVGAF 244

Query: 240 LGSIWLFAKGRDRATPIPFGPYLAIAGWVVFFWGNDLVDGYL 281
+G + + ++ PIPFGPYLAIAGW+ WG+ + YL
Sbjct: 245 MGIGLILLRNHHQSKPIPFGPYLAIAGWIALLWGDSITRWYL 286


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS05320  BCTERIALGSPF  380    e-132    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.
Length = 408

Score = 380 bits (978), Expect = e-132
Identities = 118/407 (28%), Positives = 219/407 (53%), Gaps = 9/407 (2%)

Query: 20 MVPFVWEGTDKRGIKMKGEQPARNANMLRAELRRQGITPLVV-----KTKPKPLFGAA-- 72
M + ++ D +G K +G Q A +A R LR +G+ PL V + G +
Sbjct: 1 MAQYHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLSLR 60

Query: 73 -GKKISSKDIAFFSRQMATMMKSGVPIVGSLEIIGEGHKNPRMKKMVGQIRTDIEGGSSL 131
++S+ D+A +RQ+AT++ + +P+ +L+ + + + P + +++ +R+ + G SL
Sbjct: 61 RKIRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSL 120

Query: 132 HEAISRHPVQFDDLYRNLVRAGEGAGVLETVLDTVANYKENIEALKGKIKKALFYPAMVM 191
+A+ P F+ LY +V AGE +G L+ VL+ +A+Y E + ++ +I++A+ YP ++
Sbjct: 121 ADAMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLT 180

Query: 192 AVAIIVSGILLVFVVPQFEDVFKGFGAELPAFTQMIVAASRFMVSYWWLMLLGSIAAIAG 251
VAI V ILL VVP+ + F LP T++++ S + ++ MLL +A
Sbjct: 181 VVAIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMA 240

Query: 252 FIFAYKRSPRMRHGMDRLVLKVPVIGQIMHNSSIARFARTTAVTFKAGVPLVEALSIVAG 311
F R + R R +L +P+IG+I + AR+ART ++ + VPL++A+ I
Sbjct: 241 FRVML-RQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGD 299

Query: 312 ATGNKVYEEAVLRMRDDVSVGYPVNMAMKQVNLFPHMVVQMTSIGEEAGALDAMLFKVAE 371
N + D V G ++ A++Q LFP M+ M + GE +G LD+ML + A+
Sbjct: 300 VMSNDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAAD 359

Query: 372 YFEQEVNNAVDALSSLLEPLIMVFIGTIVGGMVIGMYLPIFKLGSVV 418
++E ++ + L EPL++V + +V +V+ + PI +L +++
Sbjct: 360 NQDREFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLM 406


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS05325  BCTERIALGSPG  46     3e-09    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.
Length = 145

Score = 46.0 bits (109), Expect = 3e-09
Identities = 22/74 (29%), Positives = 39/74 (52%), Gaps = 2/74 (2%)

Query: 1 MKKQNGFTLIELMIVVAIIAILAAIALPAYQDYLARSQVSEGLSLASGAKTAVAETYANT 60
KQ GFTL+E+M+V+ II +LA++ +P ++ + +S + A+ +
Sbjct: 4 TDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKLDN 63

Query: 61 GAFPATNAAAGLEA 74
+P TN GLE+
Sbjct: 64 HHYPTTN--QGLES 75


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS05330  BCTERIALGSPG  44     1e-08    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.
Length = 145

Score = 44.5 bits (105), Expect = 1e-08
Identities = 20/72 (27%), Positives = 38/72 (52%), Gaps = 2/72 (2%)

Query: 2 RVRGFTLIELMIVVAVIAILAAIALPAYQDYLVRAQVSEGLSLASGAKVAVEEFHWAKSA 61
+ RGFTL+E+M+V+ +I +LA++ +P +A + +S + A++ +
Sbjct: 6 KQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKLDNHH 65

Query: 62 SPSTNIEAGLGS 73
P+T GL S
Sbjct: 66 YPTT--NQGLES 75


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS05340  HTHFIS        507    e-180    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 507 bits (1308), Expect = e-180
Identities = 166/474 (35%), Positives = 257/474 (54%), Gaps = 17/474 (3%)

Query: 6 SALVVDDERDIRELLVLTLGRMGLRISTAANLAEARELLASNPYDLCLTDMRLPDGNGIE 65
+ LV DD+ IR +L L R G + +N A +A+ DL +TD+ +PD N +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 66 LVTEIARQYPQTPVAMITAFGSMDLAVEALKAGAFDFVSKPVDISVLRGLVKHALELNNR 125
L+ I + P PV +++A + A++A + GA+D++ KP D++ L G++ AL R
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 126 DRPAPPPPPLEQASRLLGDSTAMESLRSTIGKVARSQAPVYIVGESGVGKELVARTIHEQ 185
RP+ + L+G S AM+ + + ++ ++ + I GESG GKELVAR +H+
Sbjct: 125 -RPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDY 183

Query: 186 GARAAGPFIPVNCGAIPAELMESEFFGHKKGSFTGAHADKPGLFQAAHGGTLFLDEVAEL 245
G R GPF+ +N AIP +L+ESE FGH+KG+FTGA G F+ A GGTLFLDE+ ++
Sbjct: 184 GKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDM 243

Query: 246 PLQMQVKLLRAIQEKSVRPVGASGETLVDVRILSATHKDLGDLVSDGRFRHDLYYRINVI 305
P+ Q +LLR +Q+ VG DVRI++AT+KDL ++ G FR DLYYR+NV+
Sbjct: 244 PMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVV 303

Query: 306 ELRVPPLRERSGDLPQLAAAIIARLARSHGRPIPLLTQSALDALDTYGFPGNVRELENIL 365
LR+PPLR+R+ D+P L + + + G + Q AL+ + + +PGNVRELEN++
Sbjct: 304 PLRLPPLRDRAEDIPDLVRHFVQQAEK-EGLDVKRFDQEALELMKAHPWPGNVRELENLV 362

Query: 366 ERALALAEDDQISASDLRLPAH---------------GGHRLAASPGSAAIEPREAVVDI 410
R AL D I+ + G ++ + + + D
Sbjct: 363 RRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDA 422

Query: 411 DPASSALPSYIEQLERAAIQKALEENRWNKTKTAAQLGITFRALRYKLKKLGME 464
P S + ++E I AL R N+ K A LG+ LR K+++LG+
Sbjct: 423 LPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVS 476


22  XC_RS05485  XC_RS05595  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS05485  2       13       1.387092    O-acetylhomoserine
XC_RS05490  3       13       2.749436    ligand-gated channel protein
XC_RS05495  4       16       3.251778    membrane protein
XC_RS05500  3       15       3.576684    hypothetical protein
XC_RS05505  3       15       3.975610    cob(I)alamin adenolsyltransferase/cobinamide
XC_RS05510  5       16       4.665417    cobyrinic acid a,c-diamide synthase
XC_RS05515  6       16       5.327627    cobalamin biosynthesis protein CobD
XC_RS05520  6       13       5.214504    cobyric acid synthase
XC_RS05525  5       13       5.113101    adenosylcobinamide kinase
XC_RS05530  5       12       4.429774    nicotinate-nucleotide--dimethylbenzimidazole
XC_RS05535  6       11       4.424639    fructose-2,6-bisphosphatase
XC_RS05540  7       13       4.173483    cobalamin synthase
XC_RS05545  6       14       3.603232    AraC family transcriptional regulator
XC_RS05550  4       13       2.200684    diaminopimelate decarboxylase
XC_RS05555  3       14       2.082341    iron transporter
XC_RS05560  1       12       1.908158    transporter
XC_RS05565  1       11       0.772303    hypothetical protein
XC_RS05570  -2      8        -0.011511   carboxylate-amine ligase
XC_RS05575  -2      10       -0.574821   citrate-dependent iron transporter
XC_RS05580  -1      11       -0.409449   4-hydroxy-2-oxovalerate aldolase
XC_RS05585  1       14       -0.293033   cation:proton antiport protein
XC_RS05590  1       13       -0.310283   cytochrome P450 hydroxylase
XC_RS05595  2       15       0.008301    ferric enterobactin receptor
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS05485  BICOMPNTOXIN  32     0.004    Staphylococcal bi-component toxin signature.
		>BICOMPNTOXIN#Staphylococcal bi-component toxin signature.

Length = 315

Score = 32.2 bits (73), Expect = 0.004
Identities = 11/36 (30%), Positives = 17/36 (47%), Gaps = 1/36 (2%)

Query: 30 IYQTVAYAF-DDTQHGADLFDLKVQGNIYSRIMNPT 64
+ Q + + F D ++ D LK+QG I SR
Sbjct: 58 VTQNIQFDFVKDKKYNKDALILKMQGFISSRTTYYN 93


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS05550  ALARACEMASE   31     0.007    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 31.3 bits (71), Expect = 0.007
Identities = 47/230 (20%), Positives = 76/230 (33%), Gaps = 40/230 (17%)

Query: 31 DLAALDAHAAWMRAQLPPGCTLFYAAKANA----EPQILQTLAPYVDGFEAASGGE-LAW 85
DL AL + + +R Q ++ KANA +I + DGF + E +
Sbjct: 10 DLQALKQNLSIVR-QAATHARVWSVVKANAYGHGIERIWSAI-GATDGFALLNLEEAITL 67

Query: 86 LHQQQPDAALLFGGPGKLES-ELAQAVRLPDCTVHVESLGELQRLAAIARQIAQPDRRRI 144
+ L+ G + E+ RL C VH + AR + +
Sbjct: 68 RERGWKGPILMLEGFFHAQDLEIYDQHRLTTC-VHSN---WQLKALQNARL-----KAPL 118

Query: 145 PVFLRMNIAVPGAQTTRLMMGGQPSPFGLDPDDLDAALLQLRASPALELRGFHFHLMSHQ 204
++L++N + RL G PD + QLRA + H
Sbjct: 119 DIYLKVNSGM-----NRL---------GFQPDRVLTVWQQLRAMANVGEMTLMSHFAE-- 162

Query: 205 RDAAAQLHLIAAYLRTVQHWRQRHGLGPLLVNAGGGFGVDYLTPEASFDW 254
A I+ + ++ + L N+ PEA FDW
Sbjct: 163 ---AEHPDGISGAMARIEQAAEGLECRRSLSNSAATL----WHPEAHFDW 205


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS05555  PF04183       278    4e-88    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 278 bits (712), Expect = 4e-88
Identities = 109/523 (20%), Positives = 190/523 (36%), Gaps = 51/523 (9%)

Query: 100 DGHALARSLLQALGSTHAVNPELLAQSDNSVAIT----AALL--RQAQAATPTGDTLIDA 153
D LA++LL L +++ +A+ + T LL R+ +A+ + D
Sbjct: 69 DEPVLAQTLLMQLKQVLSMSDATVAEHMQDLYATLLGDLQLLKARRGLSASDLINLNADR 128

Query: 154 EQALLWGHALHPTPKSREGVALAQVLACAPEARAAFALYWFRIDRRLLRVQ---GRDVRA 210
Q LL GH K R G + APE F L+W + R + + D+
Sbjct: 129 LQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRCDNEMDIHQ 188

Query: 211 TLQ---------------QLSGHDDFY---PCHPWEVQRLRDDPLLQELQARGAITPVGV 252
L Q +G D + P HPW+ Q+ + + A G + +G
Sbjct: 189 LLTAAMDPQEFARFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADF-AEGRMVSLGE 247

Query: 253 LGEPLRPTSSVRTLYHP--ALAYFLKCSVHVRLTNCVRKNAWYELESAVALTHLLAPSWQ 310
G+ S+RTL + +K + + T+C R + + + L +
Sbjct: 248 FGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASRWLQQVFA 307

Query: 311 ALAAQV-PGFDVMLEPAATSLDVTQFDPALAAADPLATRALAESFGILYRQAPTAAQRAR 369
A V G ++ EPAA + + AA A E G+++R+ P +
Sbjct: 308 TDATLVQSGAVILGEPAAGYVSHEGY-----AALARAPYRYQEMLGVIWRENPCRWLKPD 362

Query: 370 WRPQVAAALFTCDAQGHSVAAAALRAHSVRRLDRRTATVLWFRAYAGLLLDGVWSALFQH 429
P + A L CD +A A + R T W +++ ++ L ++
Sbjct: 363 ESPVLMATLMECDENNQPLAGA-----YIDRSGLDAET--WLTQLFRVVVVPLYHLLCRY 415

Query: 430 GIALEPHLQNTVIGFSDGWPTRVWIRDLEGT-KLLAQRWPAARLHGVGERARQSLYYTPE 488
G+AL H QN + +G P RV ++D +G +L+ + +P + + + R
Sbjct: 416 GVALIAHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPE--MDSLPQEVRDVTSRLSA 473

Query: 489 QAWNRVAYCALVNNLAEAIFHLSDGDAVLEARLWQCVAEIAARWQQRNGAQPALQGLID- 547
+ I L V E R +Q +A + + + +++ L
Sbjct: 474 DYLIHDLQTGHFVTVLRFISPLMVRLGVPERRFYQLLAAVLSDYMKKHPQMSERFALFSL 533

Query: 548 GAPLPGKNNLGTRLLQRADRQSDYTALPNPIA----PMHAVQQ 586
P + L L D LPN + P+ V Q
Sbjct: 534 FRPQIIRVVLNPVKLTWPDLDGGSRMLPNYLEDLQNPLWLVTQ 576


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS05560  TCRTETA       61     3e-12    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 60.6 bits (147), Expect = 3e-12
Identities = 52/156 (33%), Positives = 70/156 (44%), Gaps = 3/156 (1%)

Query: 20 LGMPLFLPQVLAELAPSAAI-GWSGVLYVLPTLCTALTASTWGRLADRYGRKRSLLRAQL 78
L MP+ LP +L +L S + G+L L L A G L+DR+GR+ LL +
Sbjct: 23 LIMPV-LPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLA 81

Query: 79 GLALGFAIAGFAPTLPWLVVGLIVQGTCGGSLAAANAYLASQPQAGPLARALDWTQYSAR 138
G A+ +AI AP L L +G IV G G + A A AY+A AR +
Sbjct: 82 GAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDGDERARHFGFMSACFG 141

Query: 139 LAMVSAPALLGLAVALGPAQALYRALALLPLVAFAL 174
MV+ P L GL P + A A L + F
Sbjct: 142 FGMVAGPVLGGLMGGFSPHAPFFAA-AALNGLNFLT 176


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS05565  PF04183       151    2e-41    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 151 bits (384), Expect = 2e-41
Identities = 90/384 (23%), Positives = 139/384 (36%), Gaps = 49/384 (12%)

Query: 121 DEAACAAAHRGLARDAYA-AQAPHLAQALRAADAAERAYRCDQLASYRD-HPFYPTARAK 178
+A A + L Q + L A+D D+L HP + + +
Sbjct: 88 SDATVAEHMQDLYATLLGDLQLLKARRGLSASDLINLNA--DRLQCLLSGHPKFVFNKGR 145

Query: 179 AGLAPAELRAYAPEFAPTFALQWLAIPQAQVSCTSTPPAELWPDLARVGLPAELAQTHQL 238
G L YAPE+A TF L WLA+ + + ++ L P E A+ Q+
Sbjct: 146 RGWGKEALERYAPEYANTFRLHWLAVKREHMIWRCDNEMDIHQLLTAAMDPQEFARFSQV 205

Query: 239 W------------PVHPLVWARLEQDGFALPAGSVRAPLTYLP-----VRPTLSVRTLVP 281
W PVHP W + F A + L S+RTL
Sbjct: 206 WQENGLDHNWLPLPVHPWQWQQKIATDFI--ADFAEGRMVSLGEFGDQWLAQQSLRTLTN 263

Query: 282 LQHPH-LHLKLPIPMRTLGALNLRLIKPSTLYDGHWLEQALRRIDAHDAALRGRCVFV-D 339
L +KLP+ + R I + G + L+++ A DA L +
Sbjct: 264 ASRRGGLDIKLPLTIYNTSC--YRGIPGRYIAAGPLASRWLQQVFATDATLVQSGAVILG 321

Query: 340 ESHGGHV-------------GQTRHLAYLLRRYPPLDTA---TLVPVAALCALMPDGRPM 383
E G+V L + R P + V +A L + +P+
Sbjct: 322 EPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLKPDESPVLMATLMECDENNQPL 381

Query: 384 AIHLAETFSGGDLLAWWRDYTELLLAVHLRLWLRYGIALEANQQNSVLVYAPGQPTRLLM 443
A + SG D W +++ L RYG+AL A+ QN L G P R+L+
Sbjct: 382 AGAYIDR-SGLDAETWLTQLFRVVVVPLYHLLCRYGVALIAHGQNITLAMKEGVPQRVLL 440

Query: 444 KDN-DAARIALPQLRAQLPEIDTL 466
KD R+ ++ + PE+D+L
Sbjct: 441 KDFQGDMRL----VKEEFPEMDSL 460


23  XC_RS05880  XC_RS05955  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS05880  -1      9        3.550982    coenzyme PQQ synthesis protein E
XC_RS05885  0       10       3.066293    coenzyme PQQ synthesis protein D
XC_RS05890  -1      9        2.938931    pyrroloquinoline-quinone synthase
XC_RS05895  -1      9        3.261136    coenzyme PQQ synthesis protein B
XC_RS05900  -2      11       3.350334    (p)ppGpp synthetase
XC_RS05905  -1      10       3.347459    hypothetical protein
XC_RS05910  -2      10       2.407597    glycosyl transferase
XC_RS05915  -2      10       2.720671    penicillin-binding protein 1B
XC_RS05920  -1      10       2.912491    hypothetical protein
XC_RS05925  -2      9        2.082082    helicase
XC_RS05930  -2      10       1.477208    hypothetical protein
XC_RS05935  -3      9        0.874901    ADP-ribosylglycohydrolase
XC_RS05940  0       11       0.468413    cell envelope biogenesis protein TonB
XC_RS05945  2       13       1.324154    glutathione synthetase
XC_RS05950  2       13       1.590109    pilus response regulator PilG
XC_RS05955  2       13       2.001851    chemotaxis protein CheY
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS05905  RTXTOXIND     31     0.016    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 30.6 bits (69), Expect = 0.016
Identities = 33/210 (15%), Positives = 61/210 (29%), Gaps = 15/210 (7%)

Query: 144 LRLRQLQARRAGLDALVAQAVAAQQQGRLDGTPDSA--LPLYQRVLSLAPDRTDALEGRE 201
L+L L A L + A +Q R S L + L P + E
Sbjct: 125 LKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEV 184

Query: 202 DALTDLLAQARHALARDALAEAAALLAAAKRYDAGHADVPSTEGQYHRLMDQRRQRADTL 261
LT L+ + + L A + E R+ R +L
Sbjct: 185 LRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLS-RVEKSRLDDFSSL 243

Query: 262 LRRGRLA-PAVRDFTAVLAAEPGDAQAQRG----VERVAAEYAGQATRQAGDFQFDAATQ 316
L + +A AV + + + + +E + F+ + +
Sbjct: 244 LHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDK 303

Query: 317 SLQQARALVPSGPSIAAAEQAIARARDAQR 346
L+Q +I +A+ + Q+
Sbjct: 304 -LRQTTD------NIGLLTLELAKNEERQQ 326


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS05915  PF05272       30     0.036    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.036
Identities = 19/95 (20%), Positives = 27/95 (28%), Gaps = 8/95 (8%)

Query: 470 EAQRQVGSLLKPFVY--LLALASPDRWALSSWVDDSPVTVQLARGKTWSPGNSDNRSHGT 527
+ + LLKP + AL S A D+ R W
Sbjct: 439 RLRLRGRWLLKPRRAALIEALRSAPALAGCVAFDELREQPVAVRAFPWRKAPGPLEDADV 498

Query: 528 VRLIDALAHSYNQATVRVGMQVGPERVAQLIQVLA 562
+RL D + +Y + Q I V A
Sbjct: 499 LRLADYVETTYGTGEAS------AQTTEQAINVAA 527


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS05920  BACINVASINC   29     0.008    Salmonella/Shigella invasin protein C signature.
		>BACINVASINC#Salmonella/Shigella invasin protein C signature.

Length = 409

Score = 29.1 bits (64), Expect = 0.008
Identities = 24/97 (24%), Positives = 39/97 (40%), Gaps = 6/97 (6%)

Query: 69 RETAKAKRQIGDLAAAAAALDQALGLVSGDPAILQERAEVAVLQGDWNASERFAKQAIEL 128
R A+ + GDL + + S A QER+E + Q + + + +A E
Sbjct: 315 RIDARKMQMTGDLIMKNSVTVGGIAGASRQYAATQERSEQQISQVNNRVASTASDEARES 374

Query: 129 GSKTGPLCRRHWATIEQARLARGEKENAASAKAQIVG 165
K+ L + T+E ++ ASA A I G
Sbjct: 375 SRKSTSLIQEMLKTMESI------NQSKASALAAIAG 405


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS05940  PF03544       118    2e-34    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 118 bits (298), Expect = 2e-34
Identities = 41/262 (15%), Positives = 86/262 (32%), Gaps = 37/262 (14%)

Query: 11 MDERQRLTATLVISLLLHGLLILGVGFAVSEDAPLVPTLDVIFSQTSTPLTPKQADFLAQ 70
+D +R ++S+ +HG ++ G+ + +P P P +A
Sbjct: 8 LDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPA----------PAQPISVTMVAP 57

Query: 71 ANQQGGGNHDTAQRPRDSQPGVVPQDRNGLAPQAQRATTVQAPLPTQTRVVSSRRGEQAV 130
A D P P+ P+ + P +
Sbjct: 58 A--------DLEPPQAVQPP---PEPVVEPEPEPEPIPEPPKEAPVV------------I 94

Query: 131 PTPQPNPQTDPLSPADAQRVQRDAEMARLAAEVHLRSEQYAKRPNRKFVSASTREYAYAN 190
P+P P+ P ++ +RD + + A+ + +A+++
Sbjct: 95 EKPKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVA 154

Query: 191 YLRAWVDRAERVGNLNYPDEARRRRLGGKVVITVGVRRDGSVESSRVLVSSGTPVLDAAA 250
+ R YP A+ R+ G+V + V DG V++ ++L + + +
Sbjct: 155 SGPRALSRN----QPQYPARAQALRIEGQVKVKFDVTPDGRVDNVQILSAKPANMFEREV 210

Query: 251 LRVVQLAQPFPPLPRSKDDVDI 272
++ + P P S V+I
Sbjct: 211 KNAMRRWRYEPGKPGSGIVVNI 232


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS05950  HTHFIS        73     2e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 73.3 bits (180), Expect = 2e-18
Identities = 28/115 (24%), Positives = 49/115 (42%), Gaps = 2/115 (1%)

Query: 15 KVMVIDDSKTIRRTAETLLKREGCEVVTATDGFEALAKIADQQPQIIFVDIMMPRLDGYQ 74
++V DD IR L R G +V ++ IA ++ D++MP + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 75 TCALIKGNQLFKSTPVIMLSSKDGLFDKARGRIVGSEQYLTKPFTREELLSAIRT 129
IK + PV+++S+++ + G+ YL KPF EL+ I
Sbjct: 65 LLPRIK--KARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS05955  HTHFIS        88     6e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 88.3 bits (219), Expect = 6e-24
Identities = 36/116 (31%), Positives = 57/116 (49%), Gaps = 2/116 (1%)

Query: 2 ARIILIEDSPTDRAVFSQWLEKAGHTVVATDNAEEGLELIRSQAPDLVLMDVVLPGMSGF 61
A I++ +D R V +Q L +AG+ V T NA I + DLV+ DVV+P + F
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 QATRALARDQATKDIPVLLVSTKGMETDKAWGLRQGASDYIVKPPREDDLIARIKQ 117
+ + + D+PVL++S + +GA DY+ KP +LI I +
Sbjct: 64 DLLPRIKKAR--PDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


24  XC_RS06030  XC_RS06075  Y  N  N  Genomic Island
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS06030  -1      16       -4.060719   hypothetical protein
XC_RS06035  1       16       -4.855388   RebB protein
XC_RS06040  1       15       -4.624968   hypothetical protein
XC_RS06045  1       16       -4.473589   metal-dependent hydrolase
XC_RS06050  1       16       -4.555047   DEAD/DEAH box helicase
XC_RS06055  1       26       -5.648558   type I restriction endonuclease subunit S
XC_RS06060  0       19       -3.250020   DNA-damage-inducible protein D
XC_RS06065  -1      21       -3.369311   type I restriction-modification protein subunit
XC_RS06070  0       32       -3.951745   hypothetical protein
XC_RS06075  0       24       -3.448510   hypothetical protein
25  XC_RS07275  XC_RS07335  Y  N  N  Genomic Island
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS07275  0       14       3.083880    transcriptional regulator
XC_RS07280  0       15       3.088279    ligand-gated channel
XC_RS07285  2       11       3.905674    membrane protein
XC_RS07290  0       10       4.384689    hypothetical protein
XC_RS07295  1       11       4.630280    chloride channel protein
XC_RS07300  2       11       4.650150    phosphodiesterase-nucleotide pyrophosphatase
XC_RS07305  1       13       4.591706    hypothetical protein
XC_RS07310  1       14       4.368854    methylated-DNA--protein-cysteine
XC_RS07315  2       15       4.140957    DNA methylase
XC_RS07320  1       15       3.708778    hypothetical protein
XC_RS07325  0       14       2.882450    membrane protein
XC_RS07330  1       14       2.511049    hypothetical protein
XC_RS07335  3       15       1.618406    membrane protein
26  XC_RS07485  XC_RS07555  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS07485  -1      13       3.643213    lipoprotein
XC_RS07490  -2      11       3.588637    DNA polymerase III subunit delta
XC_RS07495  -3      11       2.811267    nicotinic acid mononucleotide
XC_RS07500  -2      12       1.870853    hypothetical protein
XC_RS07505  -1      13       1.619521    phosphatase
XC_RS07510  -2      13       3.222083    ribosomal RNA large subunit methyltransferase H
XC_RS07515  -3      12       3.210843    cell envelope biogenesis protein TonB
XC_RS07520  -3      14       2.999807    membrane protein
XC_RS07525  -3      13       3.101440    septum formation protein Maf
XC_RS07530  -3      13       3.221527    ribonuclease G
XC_RS07535  -3      12       3.625066    membrane protein
XC_RS07540  -2      12       1.958815    hypothetical protein
XC_RS07545  -1      13       1.694820    TldD protein
XC_RS07550  0       10       2.695149    hypothetical protein
XC_RS07555  -1      11       3.467168    PmbA protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS07495  LPSBIOSNTHSS  46     2e-08    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.
Length = 166

Score = 46.3 bits (110), Expect = 2e-08
Identities = 22/68 (32%), Positives = 33/68 (48%), Gaps = 3/68 (4%)

Query: 84 YGGTFDPIHRGHLAIACAARDALGAQVHLVPAADPPHRPAPGATAAQRTRMLELALADLP 143
Y G+FDPI GHL I L QV++ +P +P + +R + A+A LP
Sbjct: 5 YPGSFDPITFGHLDIIERGC-RLFDQVYVAVLRNPNKQPMF--SVQERLEQIAKAIAHLP 61

Query: 144 GLLLDTRE 151
+D+ E
Sbjct: 62 NAQVDSFE 69


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
XC_RS07515	PF03544	66	3e-15	Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 66.2 bits (161), Expect = 3e-15
Identities = 48/212 (22%), Positives = 74/212 (34%), Gaps = 30/212 (14%)

Query: 39 LIPLSHHQVMQPPTPKERWLMPITVPATPPPPPLV-----------FPVEVTFKAPSTHT 87
L+ S HQV++ P P + + + PA PP V E + P
Sbjct: 32 LLYTSVHQVIELPAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAP 91

Query: 88 PVI----ADPVVKQPSVTQTAVVDTAHVALEAVSDTAPTISAPAAPPSSG---------P 134
VI P K V + +E+ + +APA P SS
Sbjct: 92 VVIEKPKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVT 151

Query: 135 VDAGQLHYLSAPAPSYPVAALRAGQQGTVMLRVLVGTDGRPAEVSVQTSSGHRALDLAAR 194
A LS P YP A +G V ++ V DGR V + ++ + +
Sbjct: 152 SVASGPRALSRNQPQYPARAQALRIEGQVKVKFDVTPDGRVDNVQILSAKPANMFEREVK 211

Query: 195 SQVLRSWRFQPAMQNGQAVQAYGLVPVSFSLN 226
+ + R WR++P V V + F +N
Sbjct: 212 NAM-RRWRYEPGKPGSGIV-----VNILFKIN 237


27	XC_RS07605	XC_RS07640	Y	Y	Y	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
XC_RS07605	-2	25	-3.844587	acyl-CoA hydrolase
XC_RS07610	-1	24	-4.027127	transposase
XC_RS07615	-1	27	-4.735645	acyl-CoA hydrolase
XC_RS07620	-1	26	-4.880646	histidine kinase
XC_RS07625	-1	28	-5.392071	hypothetical protein
XC_RS07630	-1	30	-5.160122	chemotaxis protein CheY
XC_RS07635	-1	26	-3.855368	porin
XC_RS07640	-1	29	-3.419178	lipoprotein
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
XC_RS07620	PF06580	36	4e-04	Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.0 bits (83), Expect = 4e-04
Identities = 22/102 (21%), Positives = 45/102 (44%), Gaps = 22/102 (21%)

Query: 553 LIENAVQHA----PAGSRVVITARTLDENGRASVECRVQDAGSGFAPDDLPRIFDPFFTR 608
L+EN ++H P G ++++ +NG ++E V++ GS +
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGT--KDNGTVTLE--VENTGSLALKNT----------- 307

Query: 609 RRKGTGLGLAIVQRIVEEHKGTIAG--HNSPEGGAVMVMRLP 648
++ TG GL V+ ++ GT A + +G ++ +P
Sbjct: 308 -KESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
XC_RS07625	HTHFIS	47	8e-10	FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 46.7 bits (111), Expect = 8e-10
Identities = 17/60 (28%), Positives = 28/60 (46%)

Query: 15 FESTSAQDAHAGDLNLTLQQIERQHIKRVLHDVGGKVEQASLRLGVPRSTLYQKIKLHGI 74
F S +G + L ++E I L G +A+ LG+ R+TL +KI+ G+
Sbjct: 416 FASFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGV 475


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
XC_RS07630	HTHFIS	374	e-129	FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 374 bits (962), Expect = e-129
Identities = 127/358 (35%), Positives = 203/358 (56%), Gaps = 4/358 (1%)

Query: 7 RTRILIVDDEDSIRFGMRDFLESRGYGVVDADSCQRARELFQASPPDVAVIDYRLHDGSA 66
IL+ DD+ +IR + L GY V + A D+ V D + D +A
Sbjct: 3 GATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 67 IDLLRDFRQIDADVPMIVLTAYGSIDLAVQAVKEGAEQFLTKPIEMPALHVILKRLLATR 126
DLL ++ D+P++V++A + A++A ++GA +L KP ++ L I+ R LA
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEP 122

Query: 127 RLQRQQQVVATRDRRERVNPLSGGSPAIRDLAEQASKVMHTDSPILILGETGTGKSLLAK 186
+ + + D + PL G S A++++ +++M TD ++I GE+GTGK L+A+
Sbjct: 123 KRRPSK----LEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVAR 178

Query: 187 WLHHHGMRADEAFVDLNCAGLSPDFLETELFGHEKGAFTGATASKQGMLEIADGGTVFLD 246
LH +G R + FV +N A + D +E+ELFGHEKGAFTGA G E A+GGT+FLD
Sbjct: 179 ALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLD 238

Query: 247 EIGDVDPRVQPKLLKVLEEKRFRRLGAVREREVDIRLIAATHHDLASKVKEGTFRSDLYF 306
EIGD+ Q +LL+VL++ + +G D+R++AAT+ DL + +G FR DLY+
Sbjct: 239 EIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYY 298

Query: 307 RISSIPLTMPALRDRKEDIVPLARTLLARSLADPSRTPQLTDDAALALREYAWPGNIR 364
R++ +PL +P LRDR EDI L R + ++ + + +A ++ + WPGN+R
Sbjct: 299 RLNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKEGLDVKRFDQEALELMKAHPWPGNVR 356


28	XC_RS07835	XC_RS08025	Y	Y	Y	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
XC_RS07835	0	13	3.584021	aspartate-semialdehyde dehydrogenase
XC_RS07840	1	12	3.860872	ferrous iron transporter B
XC_RS07845	0	15	3.120687	glyoxalase
XC_RS07850	0	15	3.597688	tRNA pseudouridine synthase A
XC_RS07855	1	16	3.122756	N-(5'-phosphoribosyl)anthranilate isomerase
XC_RS07860	0	12	2.921440	transcriptional regulator
XC_RS07865	2	13	1.321136	tryptophan synthase beta chain
XC_RS07870	0	12	1.146708	hypothetical protein
XC_RS07875	-2	8	0.693386	tryptophan synthase alpha chain
XC_RS07880	-2	8	0.219220	acetyl-coenzyme A carboxylase carboxyl
XC_RS07885	-2	8	-0.040548	phosphoglucosamine mutase
XC_RS07890	-1	17	-4.036492	oxidoreductase
XC_RS07895	-1	19	-3.480815	hypothetical protein
XC_RS07900	-1	19	-3.396012	phosphodiesterase
XC_RS07905	0	28	-4.656770	hypothetical protein
XC_RS07910	1	29	-6.234581	cyanoglobin
XC_RS07915	1	27	-5.971704	hypothetical protein
XC_RS07920	-1	15	-0.517275	dehydrogenase
XC_RS07925	1	16	-2.101277	triosephosphate isomerase
XC_RS07930	2	17	-2.587628	preprotein translocase subunit SecG
XC_RS07940	2	18	-2.926907	*NADH:ubiquinone oxidoreductase subunit A
XC_RS07945	1	17	-1.385315	NADH-quinone oxidoreductase subunit B
XC_RS07950	0	16	-1.818401	NADH-quinone oxidoreductase subunit C
XC_RS07955	0	16	-2.188057	NADH-quinone oxidoreductase subunit D
XC_RS07960	-1	16	-1.991104	NADH dehydrogenase subunit E
XC_RS07965	0	16	-2.055907	NADH dehydrogenase
XC_RS07970	0	16	-2.655847	NADH dehydrogenase subunit G
XC_RS07975	0	16	-4.096198	NADH-quinone oxidoreductase subunit H
XC_RS07980	0	17	-3.303792	NADH-quinone oxidoreductase subunit I
XC_RS07985	0	16	-3.221211	NADH:ubiquinone oxidoreductase subunit J
XC_RS07990	0	15	-2.836354	NADH-quinone oxidoreductase subunit K
XC_RS07995	1	16	-2.156505	NADH:ubiquinone oxidoreductase subunit L
XC_RS08000	2	13	-1.558642	NADH:ubiquinone oxidoreductase subunit M
XC_RS08005	1	12	-0.578566	NADH:ubiquinone oxidoreductase subunit N
XC_RS08015	3	13	-0.667777	*ribosome maturation factor RimP
XC_RS08020	3	16	-0.899522	transcription termination factor NusA
XC_RS08025	2	15	-0.192518	translation initiation factor IF-2
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
XC_RS07905	IGASERPTASE	29	0.023	IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 29.3 bits (65), Expect = 0.023
Identities = 13/64 (20%), Positives = 21/64 (32%), Gaps = 8/64 (12%)

Query: 143 GVQYKRLRDGTLPLAIGARDDHGTDVYLSAARLLLQGAGGYQLLLNGTVRATRANQTGLL 202
G Q+ + L + + G ++ L Y + GT GL+
Sbjct: 207 GSQFIYKKGDNYSLILNNHEVGGNNLKLVG--------DAYTYGIAGTPYKVNHENNGLI 258

Query: 203 GFGG 206
GFG
Sbjct: 259 GFGN 262


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
XC_RS07920	DHBDHDRGNASE	66	7e-15	2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 66.2 bits (161), Expect = 7e-15
Identities = 52/179 (29%), Positives = 82/179 (45%), Gaps = 2/179 (1%)

Query: 8 ALITGASSGIGREIARAYAARGMPLILTARRVDRLEALAAELGAQVRV-EILPEDLGDPA 66
A ITGA+ GIG +AR A++G + ++LE + + L A+ R E P D+ D A
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDSA 70

Query: 67 APARLVAEIQRQGWTVGTLVNNAGYGMPGRYLQNDWQTHARFLQVMVTAVCELTWRLLPM 126
A + A I+R+ + LVN AG PG + V T V + +
Sbjct: 71 AIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSKY 130

Query: 127 IRASGQGRILNVASFAALTPSADGQTLYAAAKSFMLRFSESLALENADCAVKVCALCPG 185
+ G I+ V S A P YA++K+ + F++ L LE A+ ++ + PG
Sbjct: 131 MMDRRSGSIVTVGSNPAGVPRT-SMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPG 188


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
XC_RS07930	SECGEXPORT	96	2e-28	Protein-export SecG membrane protein signature.
		>SECGEXPORT#Protein-export SecG membrane protein signature.

Length = 110

Score = 95.8 bits (238), Expect = 2e-28
Identities = 41/111 (36%), Positives = 63/111 (56%), Gaps = 9/111 (8%)

Query: 5 ILNVVYVLVALAMIALILMQRGAGAAAGSGFGAGASGTVFGSQGASNFLSKSTKWLAVVF 64
L VV+++VA+ ++ LI++Q+G GA G+ FGAGAS T+FGS G+ NF+++ T LA +F
Sbjct: 4 ALLVVFLIVAIGLVGLIMLQQGKGADMGASFGAGASATLFGSSGSGNFMTRMTALLATLF 63

Query: 65 FSISLFMAWYATHGARPTDQNLGVMSQSATPAPAAAGELTQPLPQAPAAGA 115
F ISL + ++ + + APA + Q P APA
Sbjct: 64 FIISLVLGNINSNKTNKGSEWENL------SAPA---KTEQTQPAAPAKPT 105


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
XC_RS08025	TCRTETOQM	78	1e-16	Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 77.6 bits (191), Expect = 1e-16
Identities = 47/181 (25%), Positives = 76/181 (41%), Gaps = 29/181 (16%)

Query: 408 IMGHVDHGKTSLLDYI-----RRTKIASGEAG-------------GITQHIGAYHVETGR 449
++ HVD GKT+L + + T++ S + G GIT G +
Sbjct: 8 VLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQWEN 67

Query: 450 GVISFLDTPGHAAFTSMRARGAKITDIVVLVVAADDGVMPQTKEAVAHAKAAGVPLIVAV 509
++ +DTPGH F + R + D +L+++A DGV QT+ + G+P I +
Sbjct: 68 TKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIPTIFFI 127

Query: 510 NKIDKAGADPLRVKNEL---LAENVVA--------EDFGGDTQFIEVSAKVGTGVDTLLD 558
NKID+ G D V ++ L+ +V + E V G D LL+
Sbjct: 128 NKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEGNDDLLE 187

Query: 559 A 559

Sbjct: 188 K 188


29	XC_RS08070	XC_RS08200	Y	Y	Y	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
XC_RS08070	-1	19	-4.017400	phosphoribosylaminoimidazole carboxylase
XC_RS08075	0	17	-4.343488	superoxide dismutase
XC_RS08080	0	20	-4.995186	glutaredoxin
XC_RS08085	0	22	-5.519959	short-chain dehydrogenase
XC_RS08090	2	29	-7.122341	membrane protein
XC_RS08095	3	32	-7.706604	Oar protein
XC_RS08100	1	39	-6.999932	LOG family protein
XC_RS08105	2	53	-9.006709	pre-pilin like leader sequence
XC_RS08110	3	31	-6.454942	pilus modification protein PilV
XC_RS08115	4	29	-5.890329	pilus assembly protein PilW
XC_RS08120	5	19	-5.341487	pilus assembly protein
XC_RS08125	5	20	-5.303681	type IV pilin
XC_RS08135	3	20	-5.901008	*fimbrial protein
XC_RS08140	4	23	-7.556725	UvrABC system protein B
XC_RS08150	4	39	-9.650353	*hypothetical protein
XC_RS08155	4	38	-10.008083	hypothetical protein
XC_RS08160	4	52	-11.265247	hypothetical protein
XC_RS08165	4	51	-11.412850	conjugative transfer protein
XC_RS08170	5	51	-11.833425	hypothetical protein
XC_RS08175	5	44	-11.498080	hypothetical protein
XC_RS08180	4	41	-12.140276	hypothetical protein
XC_RS08185	2	38	-9.806971	hypothetical protein
XC_RS08190	1	33	-8.583451	hypothetical protein
XC_RS08195	1	30	-7.593839	hypothetical protein
XC_RS08200	1	15	-5.382723	hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
XC_RS08085	DHBDHDRGNASE	74	6e-18	2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 74.3 bits (182), Expect = 6e-18
Identities = 43/203 (21%), Positives = 86/203 (42%), Gaps = 3/203 (1%)

Query: 9 AAALAGRVVLVTGAAGGLGAAAAQACAQAGATVVLLGRKLRPLERVYDALAGQGAAPLLY 68
A + G++ +TGAA G+G A A+ A GA + + LE+V +L + +
Sbjct: 3 AKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAF 62

Query: 69 PLDLAGATPDDYAALAQRLQSELGGLSGVLHCAAEFAGLTPAELAAPAEFARSIHVNLTA 128
P D+ + R++ E+G + +L A + E+ + VN T
Sbjct: 63 PADV--RDSAAIDEITARIEREMGPID-ILVNVAGVLRPGLIHSLSDEEWEATFSVNSTG 119

Query: 129 RAWLTQACLPLLRQQQDAALVFVVDDPARVGQAYWGAYGAAQHAQRGLIASLHHETAAGS 188
+++ + ++ ++V V +PA V + AY +++ A L E A +
Sbjct: 120 VFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYN 179

Query: 189 VRVSGLQPGPMRTALRARAFTHQ 211
+R + + PG T ++ + +
Sbjct: 180 IRCNIVSPGSTETDMQWSLWADE 202


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
XC_RS08105	BCTERIALGSPG	34	1e-04	Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 33.7 bits (77), Expect = 1e-04
Identities = 14/49 (28%), Positives = 28/49 (57%)

Query: 5 RARGFTLVELMTTVAVVAIVAAIGYPSFQGVIRSNRAVTANNEVVGLLN 53
+ RGFTL+E+M + ++ ++A++ P+ G A +++V L N
Sbjct: 6 KQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALEN 54


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
XC_RS08110	BCTERIALGSPG	28	0.008	Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 28.3 bits (63), Expect = 0.008
Identities = 8/23 (34%), Positives = 17/23 (73%), Gaps = 2/23 (8%)

Query: 8 REASGFTLIEVLIAIIVLAFGLL 30
+ GFTL+E+++ I+++ G+L
Sbjct: 5 DKQRGFTLLEIMVVIVII--GVL 25


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
XC_RS08115	BCTERIALGSPH	29	0.019	Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 29.1 bits (65), Expect = 0.019
Identities = 23/67 (34%), Positives = 31/67 (46%), Gaps = 9/67 (13%)

Query: 6 RAAGLSLIEMMIALVIGLVLLLGVIQVFSASRTAFQLSEGASRAQENARF--ALDFLARD 63
R G +L+EMM+ L++ V V+ F ASR S AQ ARF L F+ +
Sbjct: 2 RQRGFTLLEMMLILLLMGVSAGMVLLAFPASR-------DDSAAQTLARFEAQLRFVQQR 54

Query: 64 IRMAGHF 70
G F
Sbjct: 55 GLQTGQF 61


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
XC_RS08125	BCTERIALGSPG	34	4e-05	Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 34.5 bits (79), Expect = 4e-05
Identities = 10/39 (25%), Positives = 26/39 (66%)

Query: 1 MIELMIVVAVIAILSAIAYPSYAEYVRKSRRAQAKADLV 39
++E+M+V+ +I +L+++ P+ K+ + +A +D+V
Sbjct: 12 LLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIV 50


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
XC_RS08135	BCTERIALGSPG	34	1e-04	Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 34.1 bits (78), Expect = 1e-04
Identities = 10/38 (26%), Positives = 24/38 (63%)

Query: 8 PAKGYTATELLIVMAVLGLLAAIALPSFSSLIERQRLQ 45
+G+T E+++V+ ++G+LA++ +P+ E+ Q
Sbjct: 6 KQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQ 43


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
XC_RS08165	PF04335	219	6e-73	VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 219 bits (560), Expect = 6e-73
Identities = 53/230 (23%), Positives = 104/230 (45%), Gaps = 12/230 (5%)

Query: 14 QVGAAVQKAVNYEVSIADLARRSERRAWLVATVSMLITVITAGGYYYMLPLKEKVPYLVM 73
++ A ++A ++E A RS++ AW+VA V+ + + PLK PY++
Sbjct: 9 ELKAYFEEAASWERDKLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVIT 68

Query: 74 ADAYSGTSTIAKLEANYGGRTISTSEALARSNIARFILARESFDVSTIGDRDWNTVAAMA 133
D +G ++I +G TI+ EA+ + +A ++ RE + + + ++ V M+
Sbjct: 69 VDRNTGEASI--AAKLHGDATITYDEAVRKYFLATYVRYREGWIAAAR-EEYFDAVMVMS 125

Query: 134 ATNVLAEYRTLHAGNNPLRPFNTYGRSRAIRINILSITLVGGKGKAYTGATVRFQRNVYD 193
A + + +NP P N + + I ++ +GG A V F +
Sbjct: 126 ARPEQDRWSRFYKTDNPQSPQNILANRTDVFVEIKRVSFLGGN-----VAQVYFTKESVT 180

Query: 194 KTSTVTTLLDNKIATMGFAYQDNLQMSDSLRVENPLGFRVTDYRVDSDYS 243
+++ T + +AT+ + D + R +NPLG++V YR D +
Sbjct: 181 GSNSTKT---DAVATIKYKV-DGTPSKEVDRFKNPLGYQVESYRADVEVP 226


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
XC_RS08170	TYPE4SSCAGX	36	8e-05	Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 36.3 bits (83), Expect = 8e-05
Identities = 27/89 (30%), Positives = 42/89 (47%), Gaps = 10/89 (11%)

Query: 44 TGLGITTQIELSPNEKILDYSTGFTGGWELTRRENVFYLKPKNVDVD-------TNMMIR 96
T L T I+L +E I +TGF GW + N +++PK+V + N +
Sbjct: 59 TSLDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALM 118

Query: 97 TATHSYILELK---VVATDWQRLEQAKQA 122
T + L+ K V A D + LE+ K+A
Sbjct: 119 TRDYQEFLKTKKLIVDAPDPKELEEQKKA 147



Score = 29.8 bits (66), Expect = 0.011
Identities = 11/27 (40%), Positives = 17/27 (62%)

Query: 165 YDYDYATRTKKSWLVPSRVYDDGKFTY 191
Y+Y A + ++PS ++DDG FTY
Sbjct: 401 YNYYQAPEKRSKHIMPSEIFDDGTFTY 427


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
XC_RS08180	MYCMG045	34	0.001	Hypothetical mycoplasma lipoprotein (MG045) signature.
		>MYCMG045#Hypothetical mycoplasma lipoprotein (MG045) signature.

Length = 483

Score = 33.9 bits (77), Expect = 0.001
Identities = 31/105 (29%), Positives = 49/105 (46%), Gaps = 19/105 (18%)

Query: 119 LPSKHTKTLEQYTSDGFFDEVLEQAADVSEQDRELLELRRSKQYSEFFKKSVLYKKNVVV 178
L K + Q T D +E+L A Q+ L+ + R ++ SE +++V +
Sbjct: 119 LFIDSIKEISQQTKDSKNNELLHWAVPYFLQN--LVFVYRGEKISELEQENVSW------ 170

Query: 179 AGATGSGKTTFMKALVNHIP--NEERLVTIEDARELFISQPNSVH 221
T +KA+V H N+ RLV I+DAR +F S N V+
Sbjct: 171 --------TDVIKAIVKHKDRFNDNRLVFIDDARTIF-SLANIVN 206


30	XC_RS08280	XC_RS08380	Y	Y	Y	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
XC_RS08280	2	16	-1.612137	phenylalanine--tRNA ligase alpha subunit
XC_RS08285	1	17	-2.274425	phenylalanine--tRNA ligase beta subunit
XC_RS08290	2	18	-5.487333	integration host factor subunit alpha
XC_RS08295	0	18	-5.237956	MerR family transcriptional regulator
XC_RS08305	0	20	-5.229609	*polysaccharide biosynthesis protein GumB
XC_RS08310	0	22	-5.118353	GumC protein
XC_RS08315	-1	22	-4.387724	GumD protein
XC_RS08320	-2	23	-3.976099	polysaccharide biosynthesis protein GumE
XC_RS08325	-2	24	-3.525924	GumF protein
XC_RS08330	-2	21	-3.748708	GumG protein
XC_RS08335	-2	18	-2.985140	glycosyl transferase family 1
XC_RS08340	-3	15	-2.205536	GDP-mannose:glycolipid
XC_RS08345	-2	15	-2.029540	polysaccharide biosynthesis protein GumJ
XC_RS08350	-2	11	-0.798672	UDP-glucuronate:glycolipid
XC_RS08355	-1	13	0.946387	GumL protein
XC_RS08360	1	13	1.864179	GumM protein
XC_RS08365	3	14	1.944676	hypothetical protein
XC_RS08370	4	16	1.945855	polysaccharide biosynthesis protein GumN
XC_RS08375	6	17	2.102013	3-oxoacyl-ACP synthase
XC_RS08380	2	12	1.517243	lactamase
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
XC_RS08290	DNABINDINGHU	117	3e-38	Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 117 bits (295), Expect = 3e-38
Identities = 35/89 (39%), Positives = 55/89 (61%)

Query: 4 TKAEMAERLFDEVGLNKREAKEFVDAFFDVLRDALEQGRQVKLSGFGNFDLRRKNQRPGR 63
K ++ ++ + L K+++ VDA F + L +G +V+L GFGNF++R + R GR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 64 NPKTGEEIPISARTVVTFRPGQKLKERVE 92
NP+TGEEI I A V F+ G+ LK+ V+
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAVK 91


31	XC_RS08485	XC_RS08515	Y	Y	N	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
XC_RS08485	-2	11	3.684431	Fe(2+)-trafficking protein
XC_RS08490	0	11	3.992026	A/G-specific adenine glycosylase
XC_RS08495	1	11	4.956703	cell division protein FtsY
XC_RS08500	1	14	4.053359	AraC family transcriptional regulator
XC_RS08505	2	13	4.807779	4-hydroxyproline 2-epimerase
XC_RS08510	3	15	4.354197	D-amino acid oxidase
XC_RS08515	2	12	2.538525	(2Fe-2S)-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
XC_RS08495	PF03544	40	1e-05	Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 40.3 bits (94), Expect = 1e-05
Identities = 19/111 (17%), Positives = 30/111 (27%)

Query: 99 AAVPSAPTPAPAAAPSALQALPDPVAPSAPATATPVVVVPTPQPLPQAAPAAPLQAPTPA 158
A + P P P P A V+ P P+P P+ P ++ P
Sbjct: 58 ADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRD 117

Query: 159 VTPAAQVATPQVAVPPVVPAPAAPAVPAATPPAAYAPAPAAPVVTTPVAAP 209
V P ++ A A + P + + P
Sbjct: 118 VKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASGPRALSRNQPQYP 168



Score = 36.1 bits (83), Expect = 2e-04
Identities = 23/124 (18%), Positives = 35/124 (28%), Gaps = 6/124 (4%)

Query: 97 SIAAVPSAPTP------APAAAPSALQALPDPVAPSAPATATPVVVVPTPQPLPQAAPAA 150
+ +P+ P APA P P P + P +
Sbjct: 39 QVIELPAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPK 98

Query: 151 PLQAPTPAVTPAAQVATPQVAVPPVVPAPAAPAVPAATPPAAYAPAPAAPVVTTPVAAPT 210
P P P + V PA A P ++ A A + VT+ + P
Sbjct: 99 PKPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASGPR 158

Query: 211 HLVQ 214
L +
Sbjct: 159 ALSR 162



Score = 34.6 bits (79), Expect = 6e-04
Identities = 25/144 (17%), Positives = 38/144 (26%), Gaps = 5/144 (3%)

Query: 21 YSLEELAAAFPTAPSAATPAAPPAPTQTPPASAPTTPVAPVAAPSTPAAETPTPTAPAAP 80
Y+ P + AP P A P PV P P P A
Sbjct: 34 YTSVHQVIELPAPAQPIS-VTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPV 92

Query: 81 AEQLAQDIAARTGQAQSIAAVPSAPTPAPAAAPSALQALPDPVAPSAPATATPVVVVPTP 140
+ + + P + P+ + + AP+ P ++T P
Sbjct: 93 VIEKPKPKPKPKPKPVKKVEQPKRDVKPVESRPA---SPFENTAPARPTSSTATAATSKP 149

Query: 141 QPLPQAAPAAPLQAPTPAVTPAAQ 164
A+ L P AQ
Sbjct: 150 VTSV-ASGPRALSRNQPQYPARAQ 172



Score = 31.1 bits (70), Expect = 0.009
Identities = 17/123 (13%), Positives = 26/123 (21%), Gaps = 1/123 (0%)

Query: 120 PDPVAPSAPATATPVVVVPTPQPLPQAAPAAPLQAPTPAVTPAA-QVATPQVAVPPVVPA 178
+ AP+ P + T V P P P+ P P P V + P
Sbjct: 41 IELPAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPK 100

Query: 179 PAAPAVPAATPPAAYAPAPAAPVVTTPVAAPTHLVQDSGIDTHDSLPAAPAGKPGWRERL 238
P P T + + + L
Sbjct: 101 PKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASGPRAL 160

Query: 239 RNS 241
+
Sbjct: 161 SRN 163


32	XC_RS09070	XC_RS09095	Y	Y	N	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
XC_RS09070	1	8	3.969686	ribonuclease D
XC_RS09075	0	8	4.376373	hypothetical protein
XC_RS09080	0	8	3.940384	virulence factor
XC_RS09085	-1	7	3.858572	exodeoxyribonuclease 7 large subunit
XC_RS09090	-1	7	3.432726	epoxyqueuosine reductase
XC_RS09095	0	8	3.288987	carbohydrate kinase
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
XC_RS09080	PF06057	284	4e-97	Type IV secretory pathway VirJ component
		>PF06057#Type IV secretory pathway VirJ component

Length = 243

Score = 284 bits (729), Expect = 4e-97
Identities = 74/207 (35%), Positives = 113/207 (54%), Gaps = 7/207 (3%)

Query: 209 PLVELHAPGS---DRLVVLLSGDGGWREMDKGIAARLQQQGVSVVGFNSLRYFWGSRTPQ 265
P +++A S LV+ LSGDGGW +DK + LQQQG VVG++SL+Y+W + P+
Sbjct: 38 PSTQVNAASSHTKPPLVIFLSGDGGWATLDKAVGGILQQQGWPVVGWSSLKYYWKQKDPK 97

Query: 266 QVGVDLDRIIATYQQRWHAHHVALVGYSFGADVLPFAFPELPAARRDAVQFVGLLGLAHQ 325
V D II YQ + V L+GYSFGA+V+PF E+PA R V LL +
Sbjct: 98 DVTQDTLAIIDKYQAEFGTQKVILIGYSFGAEVIPFVLNEMPARYRKNVLGAVLLSPSQS 157

Query: 326 ADFKVRVGGWLGWHNEAER-PIAPALQALDAHRLQCIYGEQEKDT--LCPELRARGVEVI 382
+DF++ V + N++ R P + + C+YG+++ LCPE++ V V+
Sbjct: 158 SDFEIHVSEMVTSDNQSARYLTLPEVNKQTTVPMLCLYGKEDDAPLHLCPEVKQPNVTVM 217

Query: 383 ARPGGHHFDRDPGALADILLQGWQRAA 409
GGH FD D + + ++GW + +
Sbjct: 218 ELSGGHSFDDDYDKVVKL-IKGWLKPS 243


33	XC_RS10000	XC_RS10330	Y	Y	Y	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
XC_RS10000	2	12	2.043820	transporter
XC_RS10005	2	10	0.801086	biopolymer transporter ExbB
XC_RS10010	1	9	1.314549	biopolymer transporter
XC_RS10015	-1	8	1.485599	lipid A export ATP-binding/permease protein
XC_RS10020	-1	9	1.550064	tetraacyldisaccharide 4'-kinase
XC_RS10025	-2	10	-0.336972	3-deoxy-manno-octulosonate cytidylyltransferase
XC_RS10030	-1	12	-0.982922	phosphotyrosine protein phosphatase
XC_RS10035	1	20	-3.602971	nhl repeat protein
XC_RS10040	3	27	-5.888549	UvrABC system protein C
XC_RS10045	5	39	-8.489540	CDP-diacylglycerol--glycerol-3-phosphate
XC_RS10065	3	36	-7.195753	***integrase
XC_RS10070	1	40	-7.898876	transposase
XC_RS10080	1	42	-8.209598	avirulence protein
XC_RS10085	-2	31	-5.305839	hypothetical protein
XC_RS10090	-1	22	-3.603430	hypothetical protein
XC_RS10095	-1	20	-3.499053	transposase
XC_RS10100	0	23	-4.288075	transposase
XC_RS10105	0	33	-5.537904	transcriptional regulator
XC_RS10110	1	32	-5.897547	transposase
XC_RS10115	1	35	-6.789800	transposase
XC_RS10120	1	41	-8.010401	transposase
XC_RS10125	3	46	-9.912844	hypothetical protein
XC_RS10130	3	43	-9.286897	histidine kinase
XC_RS10135	3	44	-10.475673	hypothetical protein
XC_RS10140	2	40	-8.802531	hypothetical protein
XC_RS10145	2	36	-7.659373	hypothetical protein
XC_RS10150	2	31	-6.567994	hypothetical protein
XC_RS10155	1	29	-5.100889	hypothetical protein
XC_RS10160	2	25	-4.770895	hypothetical protein
XC_RS10165	1	21	-3.554620	hypothetical protein
XC_RS10170	1	16	-1.695828	integrase
XC_RS10175	2	17	-1.415924	hypothetical protein
XC_RS10180	1	18	-1.252658	hypothetical protein
XC_RS10185	1	18	-1.579510	membrane protein
XC_RS10190	1	19	-0.506372	hypothetical protein
XC_RS10195	1	20	-0.858196	hypothetical protein
XC_RS10200	1	21	-0.959611	hypothetical protein
XC_RS10205	2	20	-0.962801	hypothetical protein
XC_RS10210	1	20	-0.397973	DNA repair protein RadC
XC_RS10215	1	19	0.065551	DSBA oxidoreductase
XC_RS10220	2	18	0.097653	conjugal transfer protein
XC_RS10225	2	18	0.654097	conjugal transfer protein
XC_RS10230	2	19	0.307823	hypothetical protein
XC_RS10235	2	19	1.180008	hypothetical protein
XC_RS10240	4	21	-1.011820	hypothetical protein
XC_RS10245	2	21	-2.320699	hypothetical protein
XC_RS10250	2	21	-2.011894	hypothetical protein
XC_RS10255	2	22	-0.870277	hypothetical protein
XC_RS10260	2	20	-0.507457	hypothetical protein
XC_RS10265	2	21	-0.609604	membrane protein
XC_RS10270	2	20	0.260083	conjugal transfer protein TraG
XC_RS10275	3	22	0.534576	hypothetical protein
XC_RS10280	3	22	-0.135344	lytic transglycosylase
XC_RS10285	3	22	-1.171056	hypothetical protein
XC_RS10290	4	23	-1.469780	hypothetical protein
XC_RS10295	4	23	-1.721634	hypothetical protein
XC_RS10300	5	25	-2.243692	DEAD/DEAH box helicase
XC_RS10305	4	24	-2.469284	hypothetical protein
XC_RS10315	3	23	-3.396607	O-methyl transferase
XC_RS10320	2	22	-3.893384	hypothetical protein
XC_RS10325	2	23	-4.693187	hypothetical protein
XC_RS10330	1	17	-4.803505	hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
XC_RS10155	FLGFLIJ	28	0.037	Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 28.2 bits (62), Expect = 0.037
Identities = 22/78 (28%), Positives = 34/78 (43%), Gaps = 7/78 (8%)

Query: 250 QQLAQQYSPMID---KYRDDVASMRLGATLGTRSMQ----AMTLADGIDQLRGTLAPGAG 302
QQ +Q +ID +YR+++ S R + TL I Q R L
Sbjct: 33 QQAEEQLKMLIDYQNEYRNNLNSDMSAGITSNRWINYQQFIQTLEKAITQHRQQLNQWTQ 92

Query: 303 RADMAAARWEEVQQRMQS 320
+ D+A W E +QR+Q+
Sbjct: 93 KVDIALNSWREKKQRLQA 110


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
XC_RS10165	IGASERPTASE	30	0.029	IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.4 bits (68), Expect = 0.029
Identities = 18/168 (10%), Positives = 48/168 (28%), Gaps = 13/168 (7%)

Query: 368 AGERPAAYAGTVLVEAAQGAETTG-------ESAPDLLDSAMQDTADDRANPPLESKPSL 420
E + TV ETT E+ ++ + + + E++ +
Sbjct: 1040 VAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTE 1099

Query: 421 STQPVAKNAPDMMDELL------AMIGAPDPPKADDQGSSSSQPMPEAPGKPHQVSESVP 474
+ + + + + PK + + Q P P +
Sbjct: 1100 TKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQ 1159

Query: 475 AKKNPMSTSTQPSRDESNGTSTQMSADHFVTWLQQAIASRRLVINDAK 522
++ N + + QP+++ S+ ++ V + +
Sbjct: 1160 SQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATT 1207


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
XC_RS10180	PF06057	26	0.045	Type IV secretory pathway VirJ component
		>PF06057#Type IV secretory pathway VirJ component

Length = 243

Score = 26.0 bits (57), Expect = 0.045
Identities = 8/25 (32%), Positives = 11/25 (44%), Gaps = 3/25 (12%)

Query: 20 GWRAYARGERRLSSWLASKGVPVVG 44
GW + + L +G PVVG
Sbjct: 62 GWATLDKA---VGGILQQQGWPVVG 83


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
XC_RS10230	PF02370	31	0.006	M protein repeat
		>PF02370#M protein repeat

Length = 168

Score = 31.2 bits (70), Expect = 0.006
Identities = 13/64 (20%), Positives = 25/64 (39%), Gaps = 4/64 (6%)

Query: 84 KSQREENQRLRQRENSIDQRINSAL----ETERSNLRRDQQQAASERQQTEGLLADLQQR 139
++ ENQ LR+RE +I E + RR++ + + + + QQ
Sbjct: 51 RALMGENQDLRKREGQYQDKIEELEKERKEKQERPERREKFERQHQDKHYQEQQKKHQQE 110

Query: 140 LDSI 143
+
Sbjct: 111 QQQL 114


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
XC_RS10325	FLGHOOKFLIK	27	0.048	Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 27.5 bits (60), Expect = 0.048
Identities = 15/41 (36%), Positives = 21/41 (51%), Gaps = 4/41 (9%)

Query: 124 ATPARASRPAKPAPVQASADPLVDTTPFGVDAQPLDTSAAP 164
++A + P+PV A+A PL+ QPL T AAP
Sbjct: 193 EAQSKAEVISTPSPVTAAASPLITPH----QTQPLPTVAAP 229


34	XC_RS10385	XC_RS10485	Y	Y	Y	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
XC_RS10385	2	24	-3.625527	DNA topoisomerase III
XC_RS10390	3	25	-3.714807	single-stranded DNA-binding protein
XC_RS10395	2	25	-3.509122	integrase
XC_RS10400	1	27	-3.191321	hypothetical protein
XC_RS10405	1	27	-3.223849	hypothetical protein
XC_RS10410	1	27	-2.480510	hypothetical protein
XC_RS10415	1	26	-3.039460	hypothetical protein
XC_RS10420	3	38	-6.908597	hypothetical protein
XC_RS10425	6	36	-8.280315	cobyrinic acid a,c-diamide synthase
XC_RS10430	8	40	-9.598906	Phage-related regulatory protein
XC_RS10435	8	37	-8.716746	transposase
XC_RS10440	10	45	-11.535488	hypothetical protein
XC_RS10445	10	43	-11.431158	XRE family transcriptional regulator
XC_RS10455	11	44	-12.698289	DEAD/DEAH box helicase
XC_RS10460	7	33	-7.076740	hypothetical protein
XC_RS10465	6	28	-5.679493	hypothetical protein
XC_RS10470	6	30	-5.890596	avirulence protein
XC_RS10475	5	25	-3.845873	transposase Tn5041
XC_RS10485	3	24	-3.215207	hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
XC_RS10415	ARGREPRESSOR	31	0.004	Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 31.4 bits (71), Expect = 0.004
Identities = 16/46 (34%), Positives = 20/46 (43%), Gaps = 12/46 (26%)

Query: 168 SQSELARRLVADGYPVQQSHISRMAD---AVR---------YLLPA 201
+Q EL L DGY V Q+ +SR V+ Y LPA
Sbjct: 21 TQDELVDILKKDGYNVTQATVSRDIKELHLVKVPTNNGSYKYSLPA 66


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
XC_RS10485	BONTOXILYSIN	33	0.004	Bontoxilysin signature.
		>BONTOXILYSIN#Bontoxilysin signature.

Length = 1196

Score = 32.6 bits (74), Expect = 0.004
Identities = 16/74 (21%), Positives = 36/74 (48%), Gaps = 1/74 (1%)

Query: 303 VYDCFVYEVIYKKKTYVLFSGEWYC-IESKFFDAVEKDYQALLGVSFHLKTKAKNEQELI 361
+Y+ + ++Y KK Y F +W+ S++F+ + Q++L +K +N+ +
Sbjct: 650 LYEIYSKNIVYFKKIYFSFLDQWWTEYYSQYFELICMAKQSILAQESLVKQIVQNKFTDL 709

Query: 362 SELDKNSNLLNLDK 375
S+ + L L +
Sbjct: 710 SKASIPPDTLKLIR 723


35	XC_RS10590	XC_RS10685	Y	Y	N	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
XC_RS10590	1	32	-3.034160	transmembrane repetitive protein
XC_RS10595	1	40	-7.515573	conjugal transfer protein TrbP
XC_RS10600	1	35	-6.196527	hypothetical protein
XC_RS10605	1	34	-5.674575	replication initiation protein
XC_RS10610	2	38	-5.443913	Phi-Lf prophage-derived helix-destabilizing
XC_RS10615	0	32	-4.844076	hypothetical protein
XC_RS10620	1	32	-4.576072	Phi-Lf prophage-derived major coat protein
XC_RS10625	2	35	-4.460769	hypothetical protein
XC_RS10630	2	27	-4.456184	hypothetical protein
XC_RS10635	2	26	-4.594314	minor coat protein
XC_RS10640	3	25	-4.249626	hypothetical protein
XC_RS10645	2	26	-3.756291	hypothetical protein
XC_RS10650	1	26	-4.377102	hypothetical protein
XC_RS10655	0	25	-3.990547	hypothetical protein
XC_RS10660	1	26	-4.627571	minor coat protein
XC_RS10665	-1	24	-4.862804	hypothetical protein
XC_RS10670	-1	23	-4.660030	hypothetical protein
XC_RS10675	-1	27	-5.225841	Phi-Lf prophage-derived major coat protein
XC_RS10680	0	29	-5.464204	hypothetical protein
XC_RS10685	-1	31	-5.771531	single-stranded DNA-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
XC_RS10590  IGASERPTASE  50     3e-08    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 50.1 bits (119), Expect = 3e-08
Identities = 43/273 (15%), Positives = 81/273 (29%), Gaps = 21/273 (7%)

Query: 156 DAAPATAASGAQQAAASAVAASSAAASPASTSSTVAEPAEPAADVPDVAPEPAVAAAAEP 215
D T + Q S + + A PA P+ VA +
Sbjct: 993 DTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTV- 1051

Query: 216 STPDIPQVRVQVPQVTLESPLQVTETPVPTTDFVVPPPPTITVAPRAIEPAAPQVRQREI 275
+ Q T +V + + E A +E
Sbjct: 1052 ------EKNEQDATETTAQNREVAKEAKSNVKA----------NTQTNEVAQSGSETKET 1095

Query: 276 QT-VTERPQVRELQQPTAAVAVRAAAAPTVREREIVVPEQARVTAPTVR-AREVVPTVRM 333
QT T+ E ++ + P V + EQ+ P ARE PTV +
Sbjct: 1096 QTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNI 1155

Query: 334 PELAVRAAELPNVPDPAPQPTPAQTPPAQPAVVQTPSAAAVSISPAAPAPSSA-PSPANQ 392
E + + PA + T + + +V +P P++ P+ ++
Sbjct: 1156 KEPQSQTNTTADTEQPA-KETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSE 1214

Query: 393 TNQQTAAQSAKPASAAAASSKPAASSGAGPKPV 425
++ + + + + + +PA +S V
Sbjct: 1215 SSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTV 1247



Score = 37.0 bits (85), Expect = 3e-04
Identities = 19/141 (13%), Positives = 39/141 (27%), Gaps = 9/141 (6%)

Query: 144 AEQGGGEGEPAADAAPATAASGAQQAAASAV---AASSAAASPASTSSTVAEPAEPAADV 200
E+ + + +P S Q A + P S ++T A+ +PA +
Sbjct: 1116 TEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKET 1175

Query: 201 PDVAPEPAVAAAAEPSTPDIPQVRVQVPQVTLESPLQVTETPVPTTDFVVPPPPTITVAP 260
+P + + + + P+ T + Q T + +V
Sbjct: 1176 SSNVEQPVTESTTVNTGNSVV----ENPENTTPATTQPTVN--SESSNKPKNRHRRSVRS 1229

Query: 261 RAIEPAAPQVRQREIQTVTER 281
+ TV
Sbjct: 1230 VPHNVEPATTSSNDRSTVALC 1250



Score = 32.7 bits (74), Expect = 0.005
Identities = 19/98 (19%), Positives = 36/98 (36%), Gaps = 2/98 (2%)

Query: 339 RAAELPNVPDPAPQPTPAQTPPAQPAVVQTPSAAAVSISPAAPAPSSAPSPANQTNQQTA 398
+ D TP P+V + ++ AP P AP+ ++T + A
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVPS--NNEEIARVDEAPVPPPAPATPSETTETVA 1041

Query: 399 AQSAKPASAAAASSKPAASSGAGPKPVDRSGGWDVAAN 436
S + + + + A + A + V + +V AN
Sbjct: 1042 ENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKAN 1079


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
XC_RS10645  AEROLYSIN  27     0.020    Aerolysin signature.
		>AEROLYSIN#Aerolysin signature.

Length = 493

Score = 26.9 bits (59), Expect = 0.020
Identities = 14/66 (21%), Positives = 24/66 (36%), Gaps = 3/66 (4%)

Query: 5 TYDHVDLTGPWAGFGFQGYRFFTPEGREVRPEEMRWWSLTCTIAREWALMMAEERERVWH 64
++H + GP+ + + P E++WW TI + M RV
Sbjct: 361 NWNHTFVIGPYKD---KASSIRYQWDKRYIPGEVKWWDWNWTIQQNGLSTMQNNLARVLR 417

Query: 65 PRPAEV 70
P A +
Sbjct: 418 PVRAGI 423


36  XC_RS10850  XC_RS10980  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS10850  4       14       1.751688    hypothetical protein
XC_RS10855  4       14       1.494623    hypothetical protein
XC_RS10860  4       14       1.510979    beta-N-acetylglucosaminidase
XC_RS10865  4       14       1.444700    alpha/beta hydrolase
XC_RS10875  4       14       1.194086    serine protease
XC_RS10880  4       15       0.720761    hypothetical protein
XC_RS10885  -1      12       -0.356278   diguanylate phosphodiesterase
XC_RS10890  1       16       0.366012    peptidase
XC_RS10895  0       16       -0.389698   chemotaxis protein-glutamate methylesterase
XC_RS10900  -1      14       -1.369912   hypothetical protein
XC_RS10905  1       15       -0.642013   stress-induced protein
XC_RS10910  1       15       -0.527839   hypothetical protein
XC_RS10915  2       12       -0.106177   dehydrogenase
XC_RS10920  2       11       0.075757    hypothetical protein
XC_RS10925  3       11       1.164667    glucose-1-phosphate cytidylyltransferase
XC_RS10930  4       10       2.182152    NAD-dependent epimerase
XC_RS10935  4       10       2.549601    dTDP-4-dehydrorhamnose 3,5-epimerase
XC_RS10940  2       9        1.777803    NAD(P)-dependent oxidoreductase
XC_RS10945  1       10       1.853954    exopolysaccharide biosynthesis protein
XC_RS10950  1       10       2.168775    uroporphyrin-III C-methyltransferase
XC_RS10955  1       10       1.682040    nitrate reductase subunit alpha
XC_RS10960  0       10       1.476741    nitrite reductase
XC_RS10965  0       10       1.311650    nitrite reductase
XC_RS10970  3       11       2.439991    MFS transporter
XC_RS10975  4       12       3.028428    nitrate ABC transporter substrate-binding
XC_RS10980  3       13       1.814859    histidine kinase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS10855  PF06580  32     5e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 32.1 bits (73), Expect = 5e-04
Identities = 12/54 (22%), Positives = 24/54 (44%), Gaps = 9/54 (16%)

Query: 2 FVLLQALGWALF----LAIAYVARPSEDSVPELLQLAGVVAMSICGLLGSLALR 51
+ Q +GW ++ A + P+L + +A+S+ GL+ + A R
Sbjct: 12 YWYCQGIGWGVYTLTGFGFASLYGS-----PKLHSMIFNIAISLMGLVLTHAYR 60


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
XC_RS10875  SUBTILISIN  111    1e-28    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 111 bits (278), Expect = 1e-28
Identities = 62/318 (19%), Positives = 111/318 (34%), Gaps = 39/318 (12%)

Query: 106 NADLAQQAGARGQGVKIGVMDDNLVQTYAPVAGKVDAFTDYTAVPGAAESTSNRLRGHGS 165
A G+GVK+ V+D + + ++ ++T GHG+
Sbjct: 30 QAPAVWNQTR-GRGVKVAVLDTGCDADHPDLKARIIGGRNFTDDDEGDPEIFKDYNGHGT 88

Query: 166 VVSALVLGSAQDGFAGGVAPDADLIYGRICAENSCGTQQARRAAVDMAAA-GVRIANLSI 224
V+ + + + GVAP+ADL+ ++ + G + A V I ++S+
Sbjct: 89 HVAGTIAATENENGVVGVAPEADLLIIKVLNKQGSGQYDWIIQGIYYAIEQKVDIISMSL 148

Query: 225 GASYADAASSANAALAWRFALTPLVQADALIVASTGNDGAAEAS-----YPAAAPVQEAS 279
G A V + L++ + GN+G + YP
Sbjct: 149 GGPEDVPELHEAVKKA--------VASQILVMCAAGNEGDGDDRTDELGYPGCYN----- 195

Query: 280 LRNNWLAVGAVEIDSAGNPAGLSSYSNHCGSAAQWCLVAPGMYAAPALAGTELQGQIAGT 339
++VGA+ D S +SN LVAPG + G + +GT
Sbjct: 196 ---EVISVGAINFDR-----HASEFSNSNNE---VDLVAPGEDILSTVPGGKYA-TFSGT 243

Query: 340 SFSTAAVSGVAAQVLGVYPW-----MSASNLQQTLLTTATDLGDPGVDALYGWGMVNAAK 394
S +T V+G A + + ++ L L+ LG + G G++
Sbjct: 244 SMATPHVAGALALIKQLANASFERDLTEPELYAQLIKRTIPLG--NSPKMEGNGLLYLTA 301

Query: 395 AIKGPGQFASNWAANVTS 412
+ F + A + S
Sbjct: 302 VEELSRIFDTQRVAGILS 319


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
XC_RS10880  IGASERPTASE  35     0.006    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 35.4 bits (81), Expect = 0.006
Identities = 46/213 (21%), Positives = 70/213 (32%), Gaps = 38/213 (17%)

Query: 2096 FGGSLSDAGNLE-----KVGTGVFQLTGTSSIGGTTLVSAGTLDVDGTLASAGGLTVANG 2150
+ GNL K F LTG +++ G V GTL G +
Sbjct: 657 GEEEGKNNGNLNVTFKGKSEQNRFLLTGGTNLNGDLTVEKGTL-------FLSGRPTPHA 709

Query: 2151 GALSGSGVVDAAVTVADGGRVVVSS---GALLTTGTLSLAPNATIDAFLGIPSQTGVLAV 2207
++G A+ VVV T+++ NA + G V +
Sbjct: 710 RDIAGISSTKKDPHFAENNEVVVEDDWINRNFKATTMNVTGNA--SLYSGRN----VANI 763

Query: 2208 NGDLTLDGTLNITDIGGFGNGVYRLIDYTG-------GLSDNGLAFGTIPGSVDPTQLAL 2260
++T + G+ V DYTG LSD L S +PT L
Sbjct: 764 TSNITASNKAQVHIGYKTGDTVCVRSDYTGYVTCTTDKLSDKALN------SFNPTNLRG 817

Query: 2261 QTALAQQVNVVVSAPGSNVQFWDGAQTVGNAQI 2293
L + N V+ + Q+ GN+Q+
Sbjct: 818 NVNLTESANFVL----GKANLFGTIQSRGNSQV 846


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS10930  NUCEPIMERASE  89     3e-22    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 89.1 bits (221), Expect = 3e-22
Identities = 58/260 (22%), Positives = 101/260 (38%), Gaps = 29/260 (11%)

Query: 1 MRILITGNMGYIGPVVARHLRQQYPDAWLVGLD--RGWFAHCLSDPRQLPEVVLDQQWF- 57
M+ L+TG G+IG V++ L + +VG+D ++ L R E++ +
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQ--VVGIDNLNDYYDVSLKQARL--ELLAQPGFQF 56

Query: 58 --GDVRDLTA----QQLSGFDGVVHLAAISNDAMGKRFE----QVTDAINCRASVAVAQA 107
D+ D F+ V + R+ N + + +
Sbjct: 57 HKIDLADREGMTDLFASGHFERVFISPH----RLAVRYSLENPHAYADSNLTGFLNILEG 112

Query: 108 AAQAGVKHFVFASSCSVYGAAADGTPRSESSAVA-PLTAYAHSKIDTERALAQLET---D 163
++H ++ASS SVYG P S +V P++ YA +K E +A +
Sbjct: 113 CRHNKIQHLLYASSSSVYGLNRK-MPFSTDDSVDHPVSLYAATKKANEL-MAHTYSHLYG 170

Query: 164 MVITCLRFATACGASPRLRLDLVLNDFVASAVANGTVEVLSDGSPWRPLIHVRDMARAID 223
+ T LRF T G P R D+ L F + + +++V + G R ++ D+A AI
Sbjct: 171 LPATGLRFFTVYG--PWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAII 228

Query: 224 WALGRTPLAGGRFLAVNAGA 243
P A ++
Sbjct: 229 RLQDVIPHADTQWTVETGTP 248


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS10970  TCRTETB  34     0.001    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 34.1 bits (78), Expect = 0.001
Identities = 29/130 (22%), Positives = 57/130 (43%), Gaps = 10/130 (7%)

Query: 61 AVLRLVLGMLADRIGAKRAGVVA-QLAVIASLFGAWQLGVHSMGQVLLLG-VLLGIAGAS 118
++ V G L+D++G KR + + S+ G HS +L++ + G A+
Sbjct: 63 SIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVG---HSFFSLLIMARFIQGAGAAA 119

Query: 119 F-AVALPLASRWYPPEHQCTAMG-IAGAGNSGTVLAALFAPMLALAFGYQNVFGLACIPL 176
F A+ + + +R+ P E++ A G I G + M+A + + IP+
Sbjct: 120 FPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLL---LIPM 176

Query: 177 VLTLLVFVLI 186
+ + V L+
Sbjct: 177 ITIITVPFLM 186


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
XC_RS10980  HTHFIS  59     1e-12    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 59.1 bits (143), Expect = 1e-12
Identities = 35/142 (24%), Positives = 57/142 (40%), Gaps = 8/142 (5%)

Query: 2 RVLLVNDTEKPIGELRQALSRAGYTVLDDVASVSALLHAVQSQQPDVVVIDVDSPSRDTL 61
+L+ +D L QALSRAGY V ++ + L + + D+VV DV P +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 EQLSMLHAHAPR-PVVMFSGDGDDALIHAAVGAGVTAYVVDGLAPARLAPIVQVALARFA 120
+ L + P PV++ S A G Y+ L I+ ALA
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE-- 121

Query: 121 HENSMRKRLDEVQQALQDRKQI 142
++R +++ QD +
Sbjct: 122 ----PKRRPSKLEDDSQDGMPL 139


37  XC_RS11370  XC_RS11540  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS11370  2       26       2.537101    flagellar hook-basal body protein
XC_RS11375  2       26       3.535416    flagellar M-ring protein FliF
XC_RS11380  2       29       3.530384    flagellar motor switch protein FliG
XC_RS11385  3       28       3.911785    flagellar assembly protein FliH
XC_RS11390  3       27       3.612461    flagellar protein FliI
XC_RS11395  3       20       2.290753    flagellar export protein FliJ
XC_RS11400  3       15       1.970081    flagellar protein
XC_RS11405  3       15       -0.959320   flagellar biosynthesis protein
XC_RS11415  3       18       0.188663    flagellar motor switch protein FliN
XC_RS11420  2       11       2.220840    flagellar protein
XC_RS11425  2       10       2.695677    flagellar biosynthesis protein FliP
XC_RS11430  2       11       3.263782    hypothetical protein
XC_RS11435  1       13       3.067711    flagellar biosynthesis
XC_RS11440  1       16       3.150532    flagellar biosynthesis protein FliR
XC_RS11445  0       18       3.572500    diguanylate cyclase
XC_RS11450  -1      22       3.304715    diguanylate cyclase
XC_RS11455  0       26       3.022359    diguanylate cyclase
XC_RS11460  -1      27       2.227555    flagellar biosynthesis protein FlhB
XC_RS11465  -1      28       2.560849    flagellar biosynthesis protein FlhA
XC_RS11470  0       25       2.614108    flagellar biosynthesis regulator FlhF
XC_RS11475  0       15       -0.879286   cobyrinic acid a,c-diamide synthase
XC_RS11480  1       13       -2.264240   RNA polymerase sigma factor
XC_RS11485  2       17       -3.488775   chemotaxis protein CheY
XC_RS11490  2       21       -4.297222   chemotaxis protein
XC_RS11495  2       27       -5.086355   chemotaxis protein
XC_RS11500  4       42       -7.936728   hypothetical protein
XC_RS11505  3       39       -5.679776   hypothetical protein
XC_RS11510  0       31       -4.315881   hypothetical protein
XC_RS11515  0       35       -5.040161   hypothetical protein
XC_RS11520  2       39       -5.650547   hypothetical protein
XC_RS11525  2       38       -5.722302   hypothetical protein
XC_RS11530  2       23       -3.632795   hypothetical protein
XC_RS11535  1       21       -4.017945   transposase
XC_RS11540  2       20       -3.909821   arsenic resistance protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
XC_RS11370  FLGHOOKFLIE  61     1e-15    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 61.2 bits (148), Expect = 1e-15
Identities = 28/84 (33%), Positives = 48/84 (57%)

Query: 40 AGAQGTPATQAPSFSETLRGAIGGVNEAQQKSGALAKAFEMGDPSADLARVMVASQQSQV 99
A AQ + SF+ L A+ +++ Q + A+ F +G+P L VM Q++ V
Sbjct: 20 ARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTLGEPGVALNDVMTDMQKASV 79

Query: 100 AFRATVEVRNRLVQAYQDVMNMPL 123
+ + ++VRN+LV AYQ+VM+M +
Sbjct: 80 SMQMGIQVRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS11375  FLGMRINGFLIF  347    e-115    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 347 bits (891), Expect = e-115
Identities = 184/576 (31%), Positives = 295/576 (51%), Gaps = 48/576 (8%)

Query: 16 KAGQWFDRIRSMQITRKLTMMAMIALAVAAGLAVFFWSQKPGYQALYTGLDDKGNAEAAD 75
K +W +R+R+ ++ ++ + AVA +A+ W++ P Y+ L++ L D+
Sbjct: 11 KPLEWLNRLRANP---RIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQDGGAIVA 67

Query: 76 LLRTAQIPFKIDQSTGAISVPQDRLYDARLKLAGSGLTGQQTGGGFELMEKDPGFGVSQF 135
L IP++ +GAI VP D++++ RL+LA GL + GFEL++++ FG+SQF
Sbjct: 68 QLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLP-KGGAVGFELLDQEK-FGISQF 125

Query: 136 VENARYQHALETELSRTIGTLRPVREARVHLAIPKPSAFTRQRDVASASVVLELRGGQGL 195
E YQ ALE EL+RTI TL PV+ ARVHLA+PKPS F R++ SASV + L G+ L
Sbjct: 126 SEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRAL 185

Query: 196 ERNQVDAIVNLVASSIPDMTPERVTVVDQSGRMLSIADPNSDAAQHAAQFEQVRRQESSY 255
+ Q+ A+V+LV+S++ + P VT+VDQSG +L+ ++ + AQ + ES
Sbjct: 186 DEGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDLN-DAQLKFANDVESRI 244

Query: 256 NQRIRELLEPMTGAGRVNPEVSVDMDFSVVEEARELYN----GEPAKLRSEQVSDSS-TS 310
+RI +L P+ G G V+ +V+ +DF+ E+ E Y+ A LRS Q++ S
Sbjct: 245 QRRIEAILSPIVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVG 304

Query: 311 ATGPQGPPGATSNSPGQPPAPAATAGAPGAPAAGQAATPAAPTES----------SKSAT 360
A P G PGA SN P P A P TP T + ++ T
Sbjct: 305 AGYPGGVPGALSNQP--APPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNET 362

Query: 361 RNYELDRTLQHTRQPAGRIKRVSVAVLLDNVPRPGAKGKIVEQPLTAAELTRIEGLVKQA 420
NYE+DRT++HT+ G I+R+SVAV+++ K PLTA ++ +IE L ++A
Sbjct: 363 SNYEVDRTIRHTKMNVGDIERLSVAVVVNYKTLADGKP----LPLTADQMKQIEDLTREA 418

Query: 421 VGFDAARGDTVSVMNAPFVREAVAGEEGPKWWEDPRVQNGLRLLVGAVVVLALLF----G 476
+GF RGDT++V+N+PF G E P W + + L ++VL + +
Sbjct: 419 MGFSDKRGDTLNVVNSPFSAVDNTGGELPFWQQQSFIDQLLAAG-RWLLVLVVAWILWRK 477

Query: 477 VVRPTLRQLTGTAAVKEKHKKGGNDGTPQSADVRMVDEDDDLMPRLEEDTAQIGQDKKLP 536
VRP L + A ++ + + + L +E Q +++L
Sbjct: 478 AVRPQLTRRVEEAKAAQEQAQVRQET--------EEAVEVRLSK--DEQLQQRRANQRLG 527

Query: 537 IALPDAYEERVRLAREAVKADSKRVAQVVKGWVASE 572
E + RE D + VA V++ W++++
Sbjct: 528 ------AEVMSQRIREMSDNDPRVVALVIRQWMSND 557


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS11380  FLGMOTORFLIG  307    e-106    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 307 bits (789), Expect = e-106
Identities = 105/329 (31%), Positives = 199/329 (60%)

Query: 1 MSGVQRAAVLLLSLGETDAAEVLKHMDPKEVQKIGIAMATMSGISRDQVEKVMDDFNGEL 60
++G Q+AA+LL+S+G +++V K++ +E++ + +A + I+ + + V+ +F +
Sbjct: 15 LTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFKELM 74

Query: 61 AGKTSLGVGADDYIRNVLIQALGADKAGGLIDRILLGRNTTGLDTLKWMDPRAVADLVRN 120
+ + G DY R +L ++LG KA +I+ + + + ++ DP + + ++
Sbjct: 75 MAQEFIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQSRPFEFVRRADPANILNFIQQ 134

Query: 121 EHPQIIAIVMAHLDSDQAAEALKLLPERTRADVLLRIATLDGIPPNALSELNDIMERQFA 180
EHPQ IA+++++LD +A+ L LP + +V RIA +D P + E+ ++E++ A
Sbjct: 135 EHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLEKKLA 194

Query: 181 GNQNLKSSNVGGIKVAANILNFLDTGPEQGVLGEIGKIDADLASKIQDLMFVFDNLVDLD 240
+ ++ GG+ I+N D E+ ++ + + D +LA +I+ MFVF+++V LD
Sbjct: 195 SLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDIVLLD 254

Query: 241 DRGLQTLLREVSGERLGLALRGADVKVREKITRNMSQRAAEILLEDMEARGPVRLADVEA 300
DR +Q +LRE+ G+ L AL+ D+ V+EKI +NMS+RAA +L EDME GP R DVE
Sbjct: 255 DRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRKDVEE 314

Query: 301 AQKEILTIVRRLADEGAISLGGAGAEAMV 329
+Q++I++++R+L ++G I + G E ++
Sbjct: 315 SQQKIVSLIRKLEEQGEIVISRGGEEDVL 343


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS11385  FLGFLIH  44     7e-08    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 44.4 bits (104), Expect = 7e-08
Identities = 36/159 (22%), Positives = 76/159 (47%), Gaps = 7/159 (4%)

Query: 51 QEGFARGHAEGFAQGQSEVRRLTAQIDGILDNFTRPLARLENEVVGALGELTVRIAGSLV 110
QEG A+G +G A+ +S+ + A++ ++ F L L++ + L ++ + A ++
Sbjct: 73 QEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRLMQMALEAARQVI 132

Query: 111 GRAYQAEPQLLADLVQEAIDAVGSAGREVEVRLHPDDITALLPHLATSSTT---RVAPDL 167
G+ + L +Q+ + + ++R+HPDD+ + L + + R+ D
Sbjct: 133 GQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGATLSLHGWRLRGDP 192

Query: 168 TLSRGDLRVHAESVRVDGTLDARLRAALETVMRKSGAGL 206
TL G +V A+ +G LDA + + + R + G+
Sbjct: 193 TLHPGGCKVSAD----EGDLDASVATRWQELCRLAAPGV 227


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS11395  FLGFLIJ  27     0.017    Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 27.5 bits (60), Expect = 0.017
Identities = 34/140 (24%), Positives = 57/140 (40%), Gaps = 4/140 (2%)

Query: 1 MMQSKRIDPLLRRAQEQEDKVARDLAERQRALDTHQSRLEELRRYAEEYASSQMSGTSAV 60
M + + L A+++ + AR L E +R + +L+ L Y EY ++ S SA
Sbjct: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60

Query: 61 ALSNR----RAFLDRLDSAVLQQAQTVESNRAKVEAERTRLLLASREKQVLEQLAASYRA 116
SNR + F+ L+ A+ Q Q + KV+ + Q + L
Sbjct: 61 ITSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120

Query: 117 QENKVIERRDQREMDDLGAR 136
R DQ++MD+ R
Sbjct: 121 AALLAENRLDQKKMDEFAQR 140


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
XC_RS11400  FLGHOOKFLIK  46     1e-07    Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 46.4 bits (109), Expect = 1e-07
Identities = 55/241 (22%), Positives = 96/241 (39%), Gaps = 17/241 (7%)

Query: 178 ASMSAATTATTATPTLPTDAAAPATTATAATALPSLGALAPAAATAKPAAATALSGEPQA 237
AS+SA P AP+T + TA+P A +P
Sbjct: 129 ASLSALFAMLPGFDNTPKVTDAPSTVLPTEKPTLFTKLTSEQLTTAQPDDAPGTPAQP-- 186

Query: 238 AALMSLATAALDSPADDLKTDAPDAPAFVLPTTTASALNRLQDAPAVFSASPTPTPEMGS 297
++ A S A+ + T +P A T P A+P + +GS
Sbjct: 187 ---LTPLVAEAQSKAEVISTPSPVTAAASPLITPHQT------QPLPTVAAPVLSAPLGS 237

Query: 298 DTFDDAIGARLSWLADQKIGHAHIKVTPNEMGPVEVRLHLEGDKVNASFTAANADVRQAL 357
+ ++ +S Q A +++ P ++G V++ L ++ ++ + + VR AL
Sbjct: 238 HEWQQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAAL 297

Query: 358 EQSLPRLRDMLGQQGFQLGQADVS------QQQQNPSGNRNGAGTDGNGLSLEDSPPVGI 411
E +LP LR L + G QLGQ+++S QQQ ++ + L+ ED + +
Sbjct: 298 EAALPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPV 357

Query: 412 P 412
P
Sbjct: 358 P 358


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS11415  FLGMOTORFLIN  115    1e-36    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 115 bits (289), Expect = 1e-36
Identities = 54/103 (52%), Positives = 78/103 (75%), Gaps = 1/103 (0%)

Query: 9 AAPATFESLHAERDQDATDLNLDVILDVPVTLSLEVGRARIPIRNLLQLNQGSVVELERG 68
AA A F+ L D ++D+I+D+PV L++E+GR R+ I+ LL+L QGSVV L+
Sbjct: 34 AADAVFQQL-GGGDVSGAMQDIDLIMDIPVKLTVELGRTRMTIKELLRLTQGSVVALDGL 92

Query: 69 AGEPLDVYVNGTLIAHGEVVVINDRFGIRLTDVVSPSERIRRL 111
AGEPLD+ +NG LIA GEVVV+ D++G+R+TD+++PSER+RRL
Sbjct: 93 AGEPLDILINGYLIAQGEVVVVADKYGVRITDIITPSERMRRL 135


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS11425  FLGBIOSNFLIP  241    4e-82    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 241 bits (616), Expect = 4e-82
Identities = 123/229 (53%), Positives = 162/229 (70%), Gaps = 1/229 (0%)

Query: 49 APATGNQIPTLPNVSVGRIGDQPVSLPLQTLLLMTAITLLPSMLLVLTAFTRITIVLGLL 108
P Q+P + + + G Q SLP+QTL+ +T++T +P++LL++T+FTRI IV GLL
Sbjct: 16 TPLAFAQLPGITSQPL-PGGGQSWSLPVQTLVFITSLTFIPAILLMMTSFTRIIIVFGLL 74

Query: 109 RQALGTGQTPSNQVLLGLAMFLTALVMMPVWQKMWGAGLQPYLNNQIDFSTAWTLTTQPL 168
R ALGT P NQVLLGLA+FLT +M PV K++ QP+ +I A QPL
Sbjct: 75 RNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEKISMQEALEKGAQPL 134

Query: 169 RAFMLAQIRETDLMTFAGMAGDGKYAGPDAVPFPVLVASFVTSELKTAFEIGFLIFIPFV 228
R FML Q RE DL FA +A G GP+AVP +L+ ++VTSELKTAF+IGF IFIPF+
Sbjct: 135 REFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELKTAFQIGFTIFIPFL 194

Query: 229 IIDLVVASVLMSMGMMMLSPMLISAPFKILLFILVDGWVLVVGTLAASF 277
IIDLV+ASVLM++GMMM+ P I+ PFK++LF+LVDGW L+VG+LA SF
Sbjct: 195 IIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLAQSF 243


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS11435  TYPE3IMQPROT  46     2e-10    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 45.9 bits (109), Expect = 2e-10
Identities = 17/69 (24%), Positives = 33/69 (47%)

Query: 13 GLVTVLWIAGPMLLAVLVVGVVIGVVQAATQLNEPTIGFVAKAVALTATLFATGSMLLGH 72
L VL ++G + ++G+++G+ Q TQL E T+ F K + + LF
Sbjct: 11 ALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLFLLSGWYGEV 70

Query: 73 LVEFTIELF 81
L+ + ++
Sbjct: 71 LLSYGRQVI 79


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS11440  TYPE3IMRPROT  121    2e-35    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 121 bits (305), Expect = 2e-35
Identities = 80/239 (33%), Positives = 130/239 (54%), Gaps = 2/239 (0%)

Query: 23 WTMLRTGALLTAMPLIGTRAVPGRVRVMLTGTLAMALAPILPPVPEWDGFNATAVLSIAR 82
W +LR AL++ P++ R+VP RV++ L + A+AP LP F+ A+ +
Sbjct: 18 WPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPANDVPV-FSFFALWLAVQ 76

Query: 83 ELAVGASMGFMLRLIFEAGALAGELVSQATGLSFAQMSDPLRGVTSGVIAQWFYIGFGLL 142
++ +G ++GF ++ F A AGE++ GLSFA DP + V+A+ + LL
Sbjct: 77 QILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHLNMPVLARIMDMLALLL 136

Query: 143 FFAANGHLAVIALLVDSYKALPIGTAVPDAAAFAEVAPTLLLQVLRGGLTLALPMMVAML 202
F NGHL +I+LLVD++ LPIG ++ AF + + GL LALP++ +L
Sbjct: 137 FLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALT-KAGSLIFLNGLMLALPLITLLL 195

Query: 203 AVNLAFGALAKAAPALNPMQLGLPLTVLLGLFLLSSFASEFAPPVQRLFDSAFDAARAL 261
+NLA G L + AP L+ +G PLT+ +G+ L+++ AP + LF F+ +
Sbjct: 196 TLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFCEHLFSEIFNLLADI 254


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
XC_RS11445  GPOSANCHOR  35     0.002    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 35.0 bits (80), Expect = 0.002
Identities = 21/59 (35%), Positives = 29/59 (49%), Gaps = 6/59 (10%)

Query: 783 KLLARKQELEHL--IAERTAELEQDKRDLEAARAALTHKATHDELTGLLNRAGILEALR 839
+L A Q+LE I+E A + +RDL+A+R A K E L + I EA R
Sbjct: 327 QLEAEHQKLEEQNKISE--ASRQSLRRDLDASREA--KKQLEAEHQKLEEQNKISEASR 381



Score = 33.9 bits (77), Expect = 0.003
Identities = 31/112 (27%), Positives = 43/112 (38%), Gaps = 20/112 (17%)

Query: 783 KLLARKQELEHLIAERTAELEQDKRDLEAARAALTHKATHDELTGLLNRAGILEALRGML 842
L A K +LEH A + +RDL+A+R A K E L + I EA R L
Sbjct: 292 ALEAEKADLEHQSQVLNANRQSLRRDLDASREAK--KQLEAEHQKLEEQNKISEASRQSL 349

Query: 843 --DSAPLREQPLAVVLIDLDYFKQVNDQHGHL-----AGDAVLAGVGKRLNA 887
D RE KQ+ +H L +A + + L+A
Sbjct: 350 RRDLDASREA-----------KKQLEAEHQKLEEQNKISEASRQSLRRDLDA 390


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS11460  TYPE3IMSPROT  340    e-118    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 340 bits (875), Expect = e-118
Identities = 103/344 (29%), Positives = 181/344 (52%), Gaps = 2/344 (0%)

Query: 8 GERTELPTEKRLREAREQGNIPQSRELSTAAVFGTGVFALMLMARGIGDGASVWMKTALS 67
GE+TE PT K++R+AR++G + +S+E+ + A+ LM ++ + S M +
Sbjct: 3 GEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLM--LIP 60

Query: 68 PDPKMRENPMALFGHFGDLLLQLLWVMVPLIGICLAAGLVGPLLMSGLHFSGKAIMPDLN 127
+ AL ++LL+ ++ PL+ + + ++ G SG+AI PD+
Sbjct: 61 AEQSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIK 120

Query: 128 KLNPMNGIKRMWGSNSLAELVKSILRLLFVGLAASLCISKGLHGLRSLVNRPLEQAVGNG 187
K+NP+ G KR++ SL E +KSIL+++ + + + I L L L +E
Sbjct: 121 KINPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLL 180

Query: 188 LDFTKSLLFYTAGALVLLAAFDAPYQKWNWLRKLKMTREEIKREMKESEGSPEVKGRIRQ 247
+ L+ V+++ D ++ + ++++LKM+++EIKRE KE EGSPE+K + RQ
Sbjct: 181 GQILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQ 240

Query: 248 MQMQMSQRRMMEALPTADVVLMNPTHYAVALKYEGGKMRAPVVVAKGVDEMAFRIREACE 307
++ R M E + + VV+ NPTH A+ + Y+ G+ P+V K D +R+ E
Sbjct: 241 FHQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIAE 300

Query: 308 QHRVAIVTAPPLARALYREAQLGKEIPVRLYSVVAQVLSYVYQL 351
+ V I+ PLARALY +A + IP A+VL ++ +
Sbjct: 301 EEGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQ 344


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
XC_RS11485  HTHFIS  93     3e-25    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 92.6 bits (230), Expect = 3e-25
Identities = 31/105 (29%), Positives = 51/105 (48%), Gaps = 3/105 (2%)

Query: 6 RILIVDDFSTMRRIVKNLLADLGFTNTAEAEDGNSALAALRAGPFDFVVTDWNMPGMTGI 65
IL+ DD + +R ++ L+ G+ + + + AG D VVTD MP
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 66 DLLRNIRADAKLKHLPVMMVTAEAKREQIIEAAQCGVNGYIIKPF 110
DLL I+ LPV++++A+ I+A++ G Y+ KPF
Sbjct: 64 DLLPRIK--KARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPF 106


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS11495  PF06580  44     1e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 44.1 bits (104), Expect = 1e-06
Identities = 24/136 (17%), Positives = 44/136 (32%), Gaps = 53/136 (38%)

Query: 283 LVRNAIDHGIESPALREATGKPRSGHVRLSAQQEGDYVSIEIQDDGAGIDPERLREIARN 342
LV N I HGI P+ G + L ++ V++E+++ G+
Sbjct: 263 LVENGIKHGIA--------QLPQGGKILLKGTKDNGTVTLEVENTGSLALKN-------- 306

Query: 343 KGLIDAEAAARLSTDECLHLIFMPGFSTKAEVTDISGRGVGMDVVQSRIRELSG---QIQ 399
G G+ V+ R++ L G QI+
Sbjct: 307 ---------------------------------TKESTGTGLQNVRERLQMLYGTEAQIK 333

Query: 400 IQSELGRGSRFMIRVP 415
+ + G+ M+ +P
Sbjct: 334 LSEKQGKV-NAMVLIP 348


38  XC_RS12020  XC_RS12245  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS12020  -1      30       -3.017266   transposase
XC_RS12025  -1      32       -3.388182   transposase
XC_RS12030  0       32       -3.399229   filamentous hemagglutinin
XC_RS12035  6       40       -6.368810   transporter
XC_RS12045  8       43       -7.404874   *integrase
XC_RS12050  7       45       -7.968395   DNA helicase
XC_RS12055  7       47       -8.678040   hypothetical protein
XC_RS12060  6       43       -8.564356   plasmid mobilization protein
XC_RS12065  6       42       -8.974177   MchC protein
XC_RS12070  7       43       -9.493207   hemin transporter
XC_RS12075  7       43       -9.321871   ABC transporter permease
XC_RS12080  6       41       -9.001482   hydrogenase expression protein HypA
XC_RS12085  5       37       -7.857936   ABC transporter
XC_RS12090  6       40       -6.535455   multidrug transporter
XC_RS12095  4       49       -5.818466   hypothetical protein
XC_RS12100  4       54       -7.214684   hypothetical protein
XC_RS12105  5       53       -7.939039   hypothetical protein
XC_RS12110  5       32       -8.447845   hypothetical protein
XC_RS12115  6       32       -8.917395   hypothetical protein
XC_RS12120  7       31       -9.220118   hypothetical protein
XC_RS12125  8       28       -9.177776   hypothetical protein
XC_RS12130  8       27       -8.723842   hypothetical protein
XC_RS12135  6       28       -9.096217   hypothetical protein
XC_RS12140  6       43       -10.382704  hypothetical protein
XC_RS12145  6       46       -11.030795  hypothetical protein
XC_RS12150  8       47       -8.838389   hypothetical protein
XC_RS12155  8       46       -8.753943   hypothetical protein
XC_RS12160  8       47       -9.065931   integrase
XC_RS12165  9       49       -9.314366   hypothetical protein
XC_RS12170  9       47       -8.818928   hypothetical protein
XC_RS12175  9       45       -8.796884   hypothetical protein
XC_RS12180  7       44       -10.441918  hypothetical protein
XC_RS12185  7       44       -10.205949  ankyrin
XC_RS12190  6       41       -9.690731   hypothetical protein
XC_RS12195  6       41       -9.458367   transcriptional regulator
XC_RS12200  6       43       -10.514309  nucleotidyltransferase
XC_RS12205  5       49       -9.816979   hypothetical protein
XC_RS12210  7       51       -10.064769  ATPase
XC_RS12220  7       52       -9.786014   exonuclease
XC_RS12225  5       36       -7.311485   hypothetical protein
XC_RS12235  5       34       -6.362064   TatD family hydrolase
XC_RS12240  3       25       -4.638272   hypothetical protein
XC_RS12245  2       17       -2.988568   ATPase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS12030  PF05860  77     2e-18    haemagglutination activity domain.
		>PF05860#haemagglutination activity domain.

Length = 117

Score = 76.8 bits (189), Expect = 2e-18
Identities = 23/134 (17%), Positives = 40/134 (29%), Gaps = 23/134 (17%)

Query: 42 TPNPGGQQPGQQVAANGVPVVDIVAPNARGISHNRYSNFNVGPNGLILNNSAQISKTELG 101
TP+ +++ + H+ + F+V +G N+
Sbjct: 4 TPDTTLPINSNITTEGNTRIIERGTQAGSNLFHS-FQEFSVPTSGTAFFNNP-------- 54

Query: 102 GYVAGNNNLQRSGAASLILNEVTSAS-SRLQGYTEIAGAKAQLVIANPNGISCDGCGFLN 160
I++ VT S S + G A L + NPNGI L+
Sbjct: 55 ------------TNIQNIISRVTGGSVSNIDGLIRANA-TANLFLINPNGIIFGQNARLD 101

Query: 161 TSRVTLTTGAPNLG 174
+ + A L
Sbjct: 102 IGGSFVGSTANRLK 115


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
XC_RS12070  RTXTOXINC  55     2e-12    Gram-negative bacterial RTX toxin-activating protein C...
		>RTXTOXINC#Gram-negative bacterial RTX toxin-activating protein C signature.

Length = 170

Score = 54.9 bits (132), Expect = 2e-12
Identities = 30/124 (24%), Positives = 44/124 (35%), Gaps = 21/124 (16%)

Query: 24 KKFSIAAAYVWLW-------------------PAIRLGQLVTIEDEDGVWTGYALWAYLT 64
K I WLW PAI+ Q V + D Y WA L+
Sbjct: 5 KPLEILGHVSWLWASSPLHRNWPVSLFAINVLPAIQANQYVLLTR-DDYPVAYCSWANLS 63

Query: 65 PETASHLVLQDPPFLPISDWNEGDQLWILDFVAMPGHHRRLARALRHRLRPHFKQAHRLV 124
E + D L DW GD+ W +D++A G + L + +R + +A R+
Sbjct: 64 LENEIKYL-NDVTSLVAEDWTSGDRKWFIDWIAPFGDNGALYKYMRKKFPDELFRAIRVD 122

Query: 125 RDKT 128

Sbjct: 123 PKTH 126


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
XC_RS12080  PERTACTIN  38     3e-05    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 37.8 bits (87), Expect = 3e-05
Identities = 33/100 (33%), Positives = 41/100 (41%), Gaps = 3/100 (3%)

Query: 21 VDPSEPPTEIPPIVVNPPPSPPPTIPPSPPVEQPGPITPPGGGGGVPTPLPPLVSGLGAG 80
V PP P P P P P PP PP + P P PP P P PP L A
Sbjct: 563 VGAKAPPAPKPAPQPGPQPGPQPPQPPQPP-QPPQPPQPPQRQPEAPAPQPPAGRELSAA 621

Query: 81 ADAYLNKSDEVKSLLSKYLAGGADVEFKDLGTVKGEVDAG 120
A+A +N V + + A + K LG ++ DAG
Sbjct: 622 ANAAVNTGG-VGLASTLWYAESNALS-KRLGELRLNPDAG 659


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS12085  PF05272  32     0.008    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.4 bits (73), Expect = 0.008
Identities = 19/65 (29%), Positives = 26/65 (40%), Gaps = 11/65 (16%)

Query: 515 KWIMSALQLRA-----PAGQVIAIVGNSGVGKTTLIRVLAGLEDLQVGDFLVNGEDLRKV 569
K+I+ R + + G G+GK+TLI L GL DF +
Sbjct: 578 KYILMGHVARVMEPGCKFDYSVVLEGTGGIGKSTLINTLVGL------DFFSDTHFDIGT 631

Query: 570 GKSSY 574
GK SY
Sbjct: 632 GKDSY 636


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
XC_RS12090  RTXTOXIND  127    2e-34    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 127 bits (320), Expect = 2e-34
Identities = 67/423 (15%), Positives = 152/423 (35%), Gaps = 52/423 (12%)

Query: 48 ALFLLLATFVLTASYSKREHVSGQIISTHGRVDIRSGTPGLILSTTLKPNALVKKGQVLA 107
+ +L+ +G++ + +I+ ++ +K V+KG VL
Sbjct: 69 VIAFILSVL---GQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLL 125

Query: 108 ELSADITD---------------EAGR----------------SLSDETIKRALTRSEEL 136
+L+A + E R L DE + ++ E L
Sbjct: 126 KLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVL 185

Query: 137 TKEQLQTHDFS--GQRERELTRQVEETTGAMQEVARKISILEKKYAKNKELLKTIEPLLA 194
L FS ++ + +++ V +I+ E K L LL
Sbjct: 186 RLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLH 245

Query: 195 EKYVSKYTYLTYENALLDAEAEIQDARAQQSTLRNQ----RAALLGEITEIKTTASRQAS 250
++ ++K+ L EN ++A E++ ++Q + ++ + K +
Sbjct: 246 KQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLR 305

Query: 251 EIEREKSTIEDQVARAKSD-RLQTITSPLSGTVAAIYA-SQGQRIGTDSIIASITPSESV 308
+ + ++A+ + + I +P+S V + ++G + T + I P +
Sbjct: 306 QTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDT 365

Query: 309 FEAEILIPSRAIGHVNVGTEVLLNIAAFPKAKYGAIQGRIASLSTQTSPIGELERRYGRQ 368
E L+ ++ IG +NVG ++ + AFP +YG + G++ +++
Sbjct: 366 LEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIE----------D 415

Query: 369 SPTEPVYTAKVALPSQTIGVAQEAKSFLPGMEVDAELILEGRKIWEWMFDPFQTMGSRLT 428
V+ +++ + + GM V AE+ R + ++ P + +
Sbjct: 416 QRLGLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGMRSVISYLLSPLEESVTESL 475

Query: 429 GEK 431
E+
Sbjct: 476 RER 478


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS12135  BICOMPNTOXIN  32     0.006    Staphylococcal bi-component toxin signature.
		>BICOMPNTOXIN#Staphylococcal bi-component toxin signature.

Length = 315

Score = 31.8 bits (72), Expect = 0.006
Identities = 11/44 (25%), Positives = 21/44 (47%)

Query: 460 QIVYAPREQQDANDYSDMLGYTTVRKKNKSHTSGKQSSVSYSET 503
I Y P+ + ++ + S LGY + + G S +YS++
Sbjct: 122 LINYLPKNKIESTNVSQTLGYNIGGNFQSAPSLGGNGSFNYSKS 165


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
XC_RS12175  GPOSANCHOR  41     3e-05    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 40.8 bits (95), Expect = 3e-05
Identities = 36/167 (21%), Positives = 66/167 (39%), Gaps = 12/167 (7%)

Query: 585 QQTKQNFQTRSAQVSKRLAELEALRQALSALPSLRQRE-EHASRQRADAEQSASTTMETW 643
+ + A + R AELE + + + + ++A E +
Sbjct: 245 SAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQS 304

Query: 644 QTAQSISTNARSQLEGSREALSQHERDRPGLFALKRLFGSISVKQWEAAHTPLAREHDQA 703
Q + + R L+ SREA Q E + L + IS EA+ L R+ D +
Sbjct: 305 QVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNK----IS----EASRQSLRRDLDAS 356

Query: 704 QVAYRKLKSEEDAA---LQASEQARKAAHDTAASHVTAKDALDKALA 747
+ A ++L++E + SE +R++ + AK ++KAL
Sbjct: 357 REAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQVEKALE 403


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
XC_RS12240  PERTACTIN  28     0.040    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 28.1 bits (62), Expect = 0.040
Identities = 27/104 (25%), Positives = 36/104 (34%), Gaps = 6/104 (5%)

Query: 14 GPMIPPWADAPPAGGGDTPAPPPGTPPEPDAPSPQAQPQRWTGVRRNLGDFARNGDGRSL 73
G PP P G PP P P P P PQ R+ + GR L
Sbjct: 564 GAKAPPAPKPAPQPGPQPGPQPPQPPQPPQPPQPPQPPQ-----RQPEAPAPQPPAGREL 618

Query: 74 RRAVKHYVRSGYGGAATASRRMSSTGSTANALGNALAGLGAGGF 117
A V +G G A+ + + + + + LG AGG
Sbjct: 619 SAAANAAVNTGGVGLAS-TLWYAESNALSKRLGELRLNPDAGGA 661


39  XC_RS12770  XC_RS12805  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS12770  3       9        1.935588    alkane 1-monooxygenase
XC_RS12775  3       11       2.184129    MFS transporter
XC_RS12780  4       11       2.770541    ABC transporter ATP-binding protein
XC_RS12785  4       11       2.789177    serine kinase
XC_RS12790  4       13       3.004159    Mg-protoporphyrin IX monomethyl ester oxidative
XC_RS12795  5       14       3.606910    hypothetical protein
XC_RS12800  4       14       3.621533    hexosyltransferase
XC_RS12805  3       13       2.404569    glycosyl transferase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS12775  TCRTETA  34     0.001    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 33.6 bits (77), Expect = 0.001
Identities = 31/130 (23%), Positives = 44/130 (33%), Gaps = 9/130 (6%)

Query: 72 FGMSDQAAFAAATFLGLF-----VGAALLSPFADRFGRRLVFTVALVWYTAATVAMGLQS 126
S+ L L+ A +L +DRFGRR V V+L M
Sbjct: 35 LVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAP 94

Query: 127 TATAVIVLRFVVGIGLGIELVTIDTYLSELVPRHMRGAAFAF---AFFLQFLAVPSVALS 183
+ + R V GI G Y++++ R F F F +A P +
Sbjct: 95 FLWVLYIGRIVAGIT-GATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGL 153

Query: 184 AWLLVPHAPL 193
PHAP
Sbjct: 154 MGGFSPHAPF 163


40  XC_RS12950  XC_RS13260  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS12950  2       25       0.433150    hypothetical protein
XC_RS12955  1       22       0.422726    hypothetical protein
XC_RS12960  2       18       0.828743    hypothetical protein
XC_RS12965  2       18       0.699104    hypothetical protein
XC_RS12970  0       18       0.530911    dehydrogenase
XC_RS12975  0       20       -1.170847   hypothetical protein
XC_RS12980  -1      23       -3.135135   methyltransferase
XC_RS12985  0       25       -3.524205   glycosyl transferase
XC_RS12990  2       28       -4.721837   hypothetical protein
XC_RS12995  2       28       -4.838383   integrase
XC_RS13000  0       25       -4.710780   transposase
XC_RS13005  5       34       -5.971043   type IV secretion protein Rhs
XC_RS13010  8       39       -6.157819   hypothetical protein
XC_RS13015  6       37       -5.907386   transposase
XC_RS13020  6       37       -5.748631   transposase
XC_RS13025  6       38       -5.622198   transposase
XC_RS13035  8       41       -5.294784   hypothetical protein
XC_RS13040  2       36       -3.309464   transposase
XC_RS13045  1       35       -3.530163   integrase
XC_RS13050  2       31       -2.873823   DNA-binding protein
XC_RS13055  2       30       -2.760930   hypothetical protein
XC_RS13060  1       30       -2.748364   invertase
XC_RS13065  4       28       -2.507777   avirulence protein
XC_RS13070  5       27       -2.858574   hypothetical protein
XC_RS13075  5       28       -3.295749   transposase
XC_RS13080  6       32       -4.429443   transposase
XC_RS13085  6       34       -5.233679   hypothetical protein
XC_RS13090  8       36       -6.276327   hypothetical protein
XC_RS13095  12      49       -12.418063  hypothetical protein
XC_RS13100  11      48       -11.681679  hypothetical protein
XC_RS13105  11      46       -10.495189  hypothetical protein
XC_RS13110  13      44       -9.389269   hypothetical protein
XC_RS13115  11      43       -8.431836   hypothetical protein
XC_RS13120  10      39       -6.058609   hypothetical protein
XC_RS13125  8       37       -5.049079   RadC family protein
XC_RS13130  9       34       -5.753982   hypothetical protein
XC_RS13135  10      37       -6.729015   hypothetical protein
XC_RS13140  9       36       -6.345647   hypothetical protein
XC_RS13145  9       37       -5.822226   hypothetical protein
XC_RS13150  10      41       -7.295039   hypothetical protein
XC_RS13155  12      47       -9.478120   integrase
XC_RS13160  8       45       -9.343671   integrase
XC_RS13165  5       36       -7.611462   hypothetical protein
XC_RS13170  5       34       -7.783616   hypothetical protein
XC_RS13175  6       33       -7.794841   hypothetical protein
XC_RS13180  3       24       -6.072324   hypothetical protein
XC_RS13185  1       23       -4.695989   transposase
XC_RS13190  2       26       -5.627486   transposase
XC_RS13195  2       30       -6.655386   transposase
XC_RS13200  4       34       -7.136121   transposase
XC_RS13205  3       35       -6.802772   transposase
XC_RS13210  7       38       -6.271839   hypothetical protein
XC_RS13215  7       35       -5.439776   transposase
XC_RS13220  9       35       -4.525029   transposase
XC_RS13225  10      41       -5.797737   hypothetical protein
XC_RS13230  8       38       -6.038921   transcriptional regulator
XC_RS13235  9       43       -8.240362   hypothetical protein
XC_RS13240  9       42       -8.469838   hypothetical protein
XC_RS13245  7       35       -7.326481   hypothetical protein
XC_RS13250  6       29       -6.489656   hypothetical protein
XC_RS13255  5       24       -4.861876   hypothetical protein
XC_RS13260  3       20       -3.325825   hydroxylase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS12970  TRNSINTIMINR  29     0.022    Translocated intimin receptor (Tir) signature.
		>TRNSINTIMINR#Translocated intimin receptor (Tir) signature.

Length = 549

Score = 29.3 bits (65), Expect = 0.022
Identities = 16/61 (26%), Positives = 28/61 (45%)

Query: 85 SGAAHVTHTLVTAPHQGAPGLFAVALDQPGIRIDASAWQALGMRAIETANVTFSATPAQL 144
SG+ VT L+ P QG +A+ + G+R+ + G A+ + N + P +
Sbjct: 489 SGSGPVTGRLIGTPGQGIQSTYALLANSGGLRLGMGGLTSGGETAVSSVNAAPTPGPVRF 548

Query: 145 V 145
V
Sbjct: 549 V 549


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS13005  LPSBIOSNTHSS  30     0.018    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 30.2 bits (68), Expect = 0.018
Identities = 18/67 (26%), Positives = 29/67 (43%), Gaps = 5/67 (7%)

Query: 25 VVTDYSYDARGRLTASIVRGQDSGSETDDRIT--RYSYWPTGLVKQVALPGGDVSDFTYD 82
V++D+ + + A+ + S ET T YS+ + LVK+VA GG+V F
Sbjct: 90 VLSDFELELQM---ANTNKTLASDLETVFLTTSTEYSFLSSSLVKEVARFGGNVEHFVPS 146

Query: 83 SAFRLTK 89

Sbjct: 147 HVAAALY 153


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
XC_RS13035  GPOSANCHOR  32     0.021    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 31.6 bits (71), Expect = 0.021
Identities = 19/66 (28%), Positives = 33/66 (50%), Gaps = 3/66 (4%)

Query: 519 KEAQDRWERARKHFKQLSSACSTRLSQLENLRRALIQL---PKLREEEKAKRQQHLEASS 575
+ + + +R+ KQ+ A S+L L + +L KL E+EKA+ Q LEA +
Sbjct: 382 QSLRRDLDASREAKKQVEKALEEANSKLAALEKLNKELEESKKLTEKEKAELQAKLEAEA 441

Query: 576 HANQAQ 581
A + +
Sbjct: 442 KALKEK 447


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
XC_RS13090  IGASERPTASE  55     2e-09    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 55.5 bits (133), Expect = 2e-09
Identities = 35/263 (13%), Positives = 76/263 (28%), Gaps = 15/263 (5%)

Query: 729 QASVAAQARQEREQQDRLAQEQHAAQVREHLHQAQPEREDQSQSEQAVQAHAMLEGQRQA 788
QA + + A P ++ +E + Q + +
Sbjct: 998 TTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQ-------ESKT 1050

Query: 789 EQQREQ---EDHQQQERQAHTNQQRELQEREGREVQQRQAQERQSEDNQQREQQDRQAQE 845
++ EQ E Q A + + EV Q ++ ++++ + +E + +E
Sbjct: 1051 VEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEE 1110

Query: 846 TRQVEVQEAQAQQTHDQQQQTQGLEPRQASPQLDTQPHAPDAALVQQTPRPESQQED--- 902
+VE ++ Q Q + + PQ + +++ + D
Sbjct: 1111 KAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQ 1170

Query: 903 -ARQQPQMRNQQAHEHLAPDAPDPLKQTHSPGDARPHQAQEAERALEMSAVESRAAARLQ 961
A++ Q E + + + + Q + R + R
Sbjct: 1171 PAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSV 1230

Query: 962 VSSSEGP-ESGNQFSQGAEADAV 983
+ E S N S A D
Sbjct: 1231 PHNVEPATTSSNDRSTVALCDLT 1253



Score = 53.5 bits (128), Expect = 6e-09
Identities = 45/296 (15%), Positives = 95/296 (32%), Gaps = 22/296 (7%)

Query: 647 RNESGQYGYDSPIVHLQRGSDGVARVVAVTSSDDIRQALNEAQGNRRELPPLARAPEATG 706
RN +G+Y +P V + +T+ ++I+ + N E+ + AP
Sbjct: 972 RNVNGRYDLYNPEVEKRNQ---TVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPP 1028

Query: 707 ETDTSTSDSSNPQQVLYIQARMQASVAAQARQEREQQDRLAQEQHAAQVREHLHQAQPER 766
T + A + Q + E+ ++ A E AQ RE +A+
Sbjct: 1029 APATPSE-----------TTETVAENSKQESKTVEKNEQDATET-TAQNREVAKEAKSNV 1076

Query: 767 EDQ-SQSEQAVQAHAMLEGQRQAEQQREQEDHQQQERQAHTNQQRELQEREGREVQQRQA 825
+ +E A E Q ++ + +++ + Q + +Q Q+
Sbjct: 1077 KANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQS 1136

Query: 826 QERQSEDNQQREQ------QDRQAQETRQVEVQEAQAQQTHDQQQQTQGLEPRQASPQLD 879
+ Q + RE ++ Q+Q + ++ + + + +Q +
Sbjct: 1137 ETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVV 1196

Query: 880 TQPHAPDAALVQQTPRPESQQEDARQQPQMRNQQAHEHLAPDAPDPLKQTHSPGDA 935
P A Q T ES + + + H + T + D
Sbjct: 1197 ENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDL 1252



Score = 44.7 bits (105), Expect = 3e-06
Identities = 44/280 (15%), Positives = 85/280 (30%), Gaps = 34/280 (12%)

Query: 571 QMRQPSLERSQEALTAIQALPAPTATQMEHNELLHRYRAAGVDLNVNPQTQQAVELASQR 630
Q PS+ + E + + P P +E V N Q + VE Q
Sbjct: 1004 QADVPSVPSNNEEIARVDEAPVPPPAPATPSE-----TTETVAENS-KQESKTVEKNEQD 1057

Query: 631 TRDASGIAGPTMQQLQRNESGQYGYDSPIVHLQRGSDGVARVVAVTSSDDIRQALNEAQG 690
+ + ++ + N V A T ++++ Q+ +E +
Sbjct: 1058 ATETTAQNREVAKEAKSN-----------------------VKANTQTNEVAQSGSETKE 1094

Query: 691 NRR----ELPPLARAPEATGETDTSTSDSSNPQQVLYIQARMQASVAAQARQEREQQDRL 746
+ E + + +A ET+ + QV Q + + V QA RE +
Sbjct: 1095 TQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSET-VQPQAEPARENDPTV 1153

Query: 747 AQEQHAAQVREHLHQAQPEREDQSQSEQAVQAHAMLEGQRQAEQQREQEDHQQQERQAHT 806
++ +Q QP +E S EQ V + + E + ++
Sbjct: 1154 NIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNS 1213

Query: 807 NQQRELQEREGREVQQRQAQERQSEDNQQREQQDRQAQET 846
+ + R R V+ + + T
Sbjct: 1214 ESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLT 1253



Score = 43.1 bits (101), Expect = 1e-05
Identities = 25/220 (11%), Positives = 64/220 (29%), Gaps = 7/220 (3%)

Query: 682 RQALNEAQGNRRELPPLARAPEATGETDTSTSDSSNPQQVLYIQARMQASVAAQARQERE 741
Q E RE+ A+ ++ + +T T++ + + A +E +
Sbjct: 1055 EQDATETTAQNREV---AKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEK 1111

Query: 742 QQDRLAQEQHAAQVREHL--HQAQPEREDQSQSEQAVQAHAMLEGQRQAEQQREQEDHQQ 799
+ + Q +V + Q Q E + + Q++ D +Q
Sbjct: 1112 AKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNT-TADTEQ 1170

Query: 800 QERQAHTNQQRELQEREGREVQQRQAQERQSEDNQQREQQDRQA-QETRQVEVQEAQAQQ 858
++ +N ++ + E + ++ + + + +
Sbjct: 1171 PAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSV 1230

Query: 859 THDQQQQTQGLEPRQASPQLDTQPHAPDAALVQQTPRPES 898
H+ + T R D +A L + +
Sbjct: 1231 PHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQF 1270



Score = 35.8 bits (82), Expect = 0.001
Identities = 39/302 (12%), Positives = 91/302 (30%), Gaps = 10/302 (3%)

Query: 981 DAVPPALPPSVQTQQAEMEQASVRQQDVARKQEAEQVRVMAQTAPATAEPLIASQPPAPT 1040
V + QA++ +++AR EA T T E + +
Sbjct: 990 QTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESK 1049

Query: 1041 TLERAAEARQASPSDVSPIDHGAAVLPAINLAQAHGQSLGGSAENRFPT---EPADAESQ 1097
T+E+ + + + + A N G + T E A E +
Sbjct: 1050 TVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKE 1109

Query: 1098 QAQSTPAQDRSAADPGGALFLEETMRSLRQLQQEIEAADREDERFHQEWKEYRERGEPYP 1157
+ +++ P + +Q + E A D + KE + +
Sbjct: 1110 EKAKVET-EKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNI--KEPQSQTNTTA 1166

Query: 1158 FVRDGARDQESGSIDAFSAHQPSRRSPSEAAEQTDAFPSLAQRFAGPGGGDAPNDSSKDE 1217
A++ S + S + P+ Q + P + +
Sbjct: 1167 DTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRS 1226

Query: 1218 RKSITGDPDVDEVLYALDSKNELAIEQALNRVANSAATQALLKKGNDFLEAQAQQEAQEH 1277
+S+ + + + + ++ +A+ + N+ + A K F+ + +H
Sbjct: 1227 VRSVPHNVEPATT--SSNDRSTVALCDLTSTNTNAVLSDARAKA--QFVALNVGKAVSQH 1282

Query: 1278 VA 1279
++
Sbjct: 1283 IS 1284


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS13225  cloacin  34     7e-04    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 34.3 bits (78), Expect = 7e-04
Identities = 25/129 (19%), Positives = 57/129 (44%), Gaps = 20/129 (15%)

Query: 4 DQRRRDVERHQKEIARLQTEKSREETKAVGEKKKAFDASAAATRTKSVSTQQSKLREAQR 63
++ R ++ + +++AR Q E+ + + +K DA+ K+++ +++++ R
Sbjct: 324 ERARAELNQANEDVARNQ-ERQAKAVQVYNSRKSELDAA-----NKTLADAIAEIKQFNR 377

Query: 64 YEGNAVA--------IQKKIADLETKIAREHERLGNANRQLSAAQV-----QEQKKQVQE 110
+ + +A K +T + + A ++ S A E +K+ +E
Sbjct: 378 FAHDPMAGGHRMWQMAGLKAQRAQTDVNNKQAAFDAAAKEKSDADAALSSAMESRKK-KE 436

Query: 111 DKKRDAERK 119
DKKR AE
Sbjct: 437 DKKRSAENN 445


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits      Score  E-value  Comments
XC_RS13240  adhesinb  28     0.008    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 27.9 bits (62), Expect = 0.008
Identities = 10/32 (31%), Positives = 15/32 (46%)

Query: 1 MKHARIGLVALTMALGLTACGGKPSSDNAKEA 32
MK R ++ L +GL AC + SS +
Sbjct: 1 MKKCRFLVLLLLAFVGLAACSSQKSSTETGSS 32


41  XC_RS13405  XC_RS13460  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS13405  3       11       1.617166    aminotransferase
XC_RS13410  3       12       1.024327    Cell division protein ZipA homolog
XC_RS13415  3       12       1.244412    chromosome segregation protein
XC_RS13420  3       18       0.402660    hypothetical protein
XC_RS13425  4       22       -1.526400   50S ribosomal protein L9
XC_RS13430  3       20       -1.251131   30S ribosomal protein S18
XC_RS13435  2       18       -0.679776   30S ribosomal protein S6
XC_RS13440  0       17       -0.015372   HesB protein family
XC_RS13445  0       15       -0.634038   asparagine--tRNA ligase
XC_RS13450  2       14       -0.648164   hypothetical protein
XC_RS13455  2       13       1.993613    membrane protein
XC_RS13460  2       12       2.367314    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
XC_RS13415  GPOSANCHOR  63     5e-12    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 62.8 bits (152), Expect = 5e-12
Identities = 58/355 (16%), Positives = 124/355 (34%), Gaps = 16/355 (4%)

Query: 150 SQIIEARPEDLRVYLEEAAG-ISKYKERRKETETRIRHTRENLDRLGDLREEITKQLAHL 208
+ + + L+ + +E +S KE+ ++ + + + L + ++ K L
Sbjct: 73 NSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGA 132

Query: 209 QRQARQAE-QYQALQEERRIKDAEWKALEY--RGLDGRLQGLREKLNQEETRLQQLIAEQ 265
+ + + L+ E+ A LE G K+ E L A Q
Sbjct: 133 MNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQ 192

Query: 266 RDAEARIETGRVRREEAAEAVAKAQADVYQVGGALARIEQQIQHQRELSHRLHKARDEAQ 325
+ E +E + + +A+ + A +E+ ++ S +
Sbjct: 193 AELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLE 252

Query: 326 SQLQELTQHISGDSAKLAVLREAVADAEPQLEQLREDHEFRQDALREAEARLADWQQRWE 385
++ L + L +++ L + + + E + +
Sbjct: 253 AEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQ 312

Query: 386 THNRDTGEASRAGEVERTRVDYLDRQSLEADRRREALVNERAGLDLDALAEAFEQIELRH 445
+ RD + A + L+ Q+ ++ R++L DLDA EA +Q+E
Sbjct: 313 SLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRR-----DLDASREAKKQLE--- 364

Query: 446 ETQKASLDGLTEQVEARKHALGGLQEQQRASQGELAEVRKQAQAARGRLSSLETL 500
A L EQ + + + L+ AS+ +V K + A +L++LE L
Sbjct: 365 ----AEHQKLEEQNKISEASRQSLRRDLDASREAKKQVEKALEEANSKLAALEKL 415



Score = 40.4 bits (94), Expect = 4e-05
Identities = 36/190 (18%), Positives = 76/190 (40%), Gaps = 13/190 (6%)

Query: 657 EREIQELRSQIDTLQEREGDLEEQLASFREQLLAAEQQREDAQRQLYMAHRSVSELAGQL 716
E E L ++ L++ + ++ E ++ + + L
Sbjct: 252 EAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANR 311

Query: 717 QSQQGKVDAARGRIERIEAELGQLLETLDTSR----------EQAREARAKLEDAVVRMG 766
QS + +DA+R +++EAE +L E S + +REA+ +LE +
Sbjct: 312 QSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQK-- 369

Query: 767 DLQGTREALESERRQLTDARDQARDAARGVRDAMHALALTLESQRTQITSLSQTLERMDS 826
L+ + E+ R+ L D +R+A + V A+ L + L ++ + +
Sbjct: 370 -LEEQNKISEASRQSLRRDLDASREAKKQVEKALEEANSKLAALEKLNKELEESKKLTEK 428

Query: 827 QRGQLDTRLE 836
++ +L +LE
Sbjct: 429 EKAELQAKLE 438



Score = 36.6 bits (84), Expect = 5e-04
Identities = 33/197 (16%), Positives = 74/197 (37%), Gaps = 7/197 (3%)

Query: 657 EREIQELRSQIDTLQEREGDLEEQLASFREQLLAAEQQREDAQRQLYMAHRSVSELAGQL 716
E+ ++ + + LE + A+ + E+ E A + L +
Sbjct: 231 EKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEK 290

Query: 717 QSQQGKVDAARGRIERIEAELGQLLETLDTSREQAREARAKLEDAVVRMGDLQGTREALE 776
+ + + + + + A L LD SRE ++ A+ + L+ + E
Sbjct: 291 AALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQ-------KLEEQNKISE 343

Query: 777 SERRQLTDARDQARDAARGVRDAMHALALTLESQRTQITSLSQTLERMDSQRGQLDTRLE 836
+ R+ L D +R+A + + L + SL + L+ + Q++ LE
Sbjct: 344 ASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQVEKALE 403

Query: 837 DLVAQLSEGDSPVETLE 853
+ ++L+ + + LE
Sbjct: 404 EANSKLAALEKLNKELE 420



Score = 36.6 bits (84), Expect = 7e-04
Identities = 43/294 (14%), Positives = 99/294 (33%), Gaps = 13/294 (4%)

Query: 715 QLQSQQGKVDAARGRIERIEAELGQLLETLDTSREQAREARAKLEDAVVRMGDLQGTREA 774
L+ Q + D ++ + L ++ E +L +A ++ +
Sbjct: 51 TLEKVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSE 110

Query: 775 LESERRQLTDARDQARDAARGVRDAMHALALTLESQRTQITSLSQTLERMDSQRGQLDTR 834
S+ ++L + A + + ++ + + L +++ L+
Sbjct: 111 KASKIQELEARKADLEKA----LEGAMNFSTADSAKIKTLEAEKAALA---ARKADLEKA 163

Query: 835 LEDLVAQLSEGDSPVETLEHEHQAALSERVRTERVLGEARTMLDSIDGELRSFEQTRQQR 894
LE + + + ++TLE E A + + E+ L A + + +T +
Sbjct: 164 LEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGA----MNFSTADSAKIKTLEAE 219

Query: 895 DEQALAQRERISQRKLDQQALVLNAEQLAAAVVKAGFVLEDVVNGLEEAANPAEWDAAVG 954
A++ + + + LE LE+A A
Sbjct: 220 KAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAM--NFST 277

Query: 955 QIDARMRRLEPVNLAAIQEYGEAAQRSEYLDAQNVDLTTALETLEEAIRKIDRE 1008
A+++ LE A E + +S+ L+A L L+ EA ++++ E
Sbjct: 278 ADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAE 331


42  XC_RS13975  XC_RS14040  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS13975  1       32       -3.821241   hypothetical protein
XC_RS13980  2       38       -5.186135   hypothetical protein
XC_RS13985  4       45       -8.356488   integrase
XC_RS13990  3       36       -7.429130   transposase
XC_RS13995  3       36       -7.429130   integrase
XC_RS14000  5       39       -8.130586   hypothetical protein
XC_RS14005  6       40       -8.000369   hypothetical protein
XC_RS14010  6       39       -7.763432   transposase
XC_RS14015  6       43       -8.364923   transposase
XC_RS14020  6       47       -8.893004   transposase
XC_RS14025  7       50       -9.277075   helicase
XC_RS14030  5       49       -8.885813   hypothetical protein
XC_RS14035  4       44       -7.293398   hypothetical protein
XC_RS14040  2       27       -4.112855   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
XC_RS14035  OMPADOMAIN  45     7e-08    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 45.3 bits (107), Expect = 7e-08
Identities = 26/77 (33%), Positives = 35/77 (45%), Gaps = 6/77 (7%)

Query: 83 ADQRISFPTVGTFGFNSYRLPAEADGALAQLVPAVLNAADGELGKKWLKQVVVEGYTDTK 142
+ + + F FN L E AL QL + L+ D + G VVV GYTD
Sbjct: 211 QTKHFTLKSDVLFNFNKATLKPEGQAALDQLY-SQLSNLDPKDGS-----VVVLGYTDRI 264

Query: 143 GGYLYNLHLSLQRSEWV 159
G YN LS +R++ V
Sbjct: 265 GSDAYNQGLSERRAQSV 281


43  XC_RS14605  XC_RS14695  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS14605  2       13       2.917365    phosphotransferase
XC_RS14610  1       13       2.052505    ferredoxin
XC_RS14615  0       10       1.636278    hypothetical protein
XC_RS14620  1       12       1.501318    membrane protein
XC_RS14625  1       12       1.566215    RNA polymerase sigma factor
XC_RS14630  2       12       1.970835    hypothetical protein
XC_RS14635  2       12       1.517864    delta 9 acyl-lipid fatty acid desaturase
XC_RS14640  1       12       1.609671    dehydrogenase
XC_RS14645  2       14       1.471317    chromosome partitioning protein ParA
XC_RS14650  1       13       1.252060    cyclopropane-fatty-acyl-phospholipid synthase
XC_RS14655  2       14       2.194089    membrane protein
XC_RS14660  0       14       2.023516    membrane protein
XC_RS14665  1       14       2.187874    cyclopropane-fatty-acyl-phospholipid synthase
XC_RS14670  1       15       2.635885    aminoacrylate peracid reductase
XC_RS14675  1       14       2.722164    membrane protein
XC_RS14680  1       13       2.572443    membrane protein
XC_RS14685  0       13       1.321533    virulence protein
XC_RS14690  2       16       0.609795    hypothetical protein
XC_RS14695  2       15       0.809388    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS14685  PF06057  277    8e-94    Type IV secretory pathway VirJ component
		>PF06057#Type IV secretory pathway VirJ component

Length = 243

Score = 277 bits (711), Expect = 8e-94
Identities = 72/225 (32%), Positives = 113/225 (50%), Gaps = 8/225 (3%)

Query: 227 AAARSLGEQKGVALPPPPGGLEDLPVVELPARGDGDNDDDTFAIFVSGDGGWAGLDEEVA 286
+ A + ++ L +E V + IF+SGDGGWA LD+ V
Sbjct: 16 STANAFADEFADNLGLTLLPVEPSTQVNAASSHTKP----PLVIFLSGDGGWATLDKAVG 71

Query: 287 DALVAQGIPVVGLDSLRYFWSERTPQGFATDLDRIARFYAQRWQRRRVVLIGFSQGADVL 346
L QG PVVG SL+Y+W ++ P+ D I Y + ++V+LIG+S GA+V+
Sbjct: 72 GILQQQGWPVVGWSSLKYYWKQKDPKDVTQDTLAIIDKYQAEFGTQKVILIGYSFGAEVI 131

Query: 347 PAAINKLPAPTKQNLRMTALLSVGKLADYEFHVSNWLGSDDDG--LPIAPEVQRLPPGIT 404
P +N++PA ++N+ LLS + +D+E HVS + SD+ PEV +
Sbjct: 132 PFVLNEMPARYRKNVLGAVLLSPSQSSDFEIHVSEMVTSDNQSARYLTLPEVNKQTTVPM 191

Query: 405 VCIYGQDDEDA--LCPSLPAETAKRVVLPGDHHFKGDYATLAKVI 447
+C+YG++D+ LCP + + L G H F DY + K+I
Sbjct: 192 LCLYGKEDDAPLHLCPEVKQPNVTVMELSGGHSFDDDYDKVVKLI 236


44  XC_RS15055  XC_RS15255  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS15055  -1      25       -3.539706   hypothetical protein
XC_RS15060  0       30       -4.994660   alpha-L-fucosidase
XC_RS15065  0       36       -6.416720   beta-glucosidase
XC_RS15070  1       36       -6.894622   hypothetical protein
XC_RS15075  2       36       -7.264507   transposase
XC_RS15080  2       40       -7.637438   hypothetical protein
XC_RS15085  2       39       -7.212831   type III effector
XC_RS15090  2       28       -4.674371   hypothetical protein
XC_RS15095  0       22       -3.151595   hypothetical protein
XC_RS15100  0       15       -2.341664   hypothetical protein
XC_RS15105  1       15       -1.557971   hypothetical protein
XC_RS15110  2       12       -0.940183   lytic transglycosylase
XC_RS15115  3       12       0.390768    hypothetical protein
XC_RS15120  3       13       1.163890    hydrogenase
XC_RS15125  5       15       1.356483    type III secretion system protein
XC_RS15130  6       15       1.516308    HPr kinase
XC_RS15135  4       14       0.959573    ATP synthase
XC_RS15140  -1      15       -0.171700   type III secretion system protein
XC_RS15145  1       13       -1.677181   HPr kinase
XC_RS15150  0       13       -2.093239   hypothetical protein
XC_RS15155  1       12       -1.447111   HPr kinase
XC_RS15160  1       13       -1.648651   HPr kinase
XC_RS15165  2       13       -1.820705   type III secretion system protein
XC_RS15170  5       17       -0.897227   hypersensitivity response secretion protein
XC_RS15175  3       19       -0.332166   type III secretion protein
XC_RS15180  4       19       -0.741061   aldolase
XC_RS15185  2       21       -2.776707   flagellar biosynthesis protein FliP
XC_RS15190  1       25       -3.119165   aldolase
XC_RS15195  -1      23       -2.716868   HpaA protein
XC_RS15200  -2      23       -3.558865   HPr kinase
XC_RS15205  2       28       -6.537391   HPr kinase
XC_RS15210  2       28       -6.262517   HPr kinase
XC_RS15215  3       28       -5.532328   HpaB protein
XC_RS15220  1       29       -4.915177   DNA-binding protein
XC_RS15225  2       30       -5.161163   hypothetical protein
XC_RS15230  0       32       -5.023593   HPr kinase
XC_RS15235  0       31       -2.196656   3-hydroxyisobutyrate dehydrogenase
XC_RS15240  0       31       -2.087616   transcriptional regulator
XC_RS15245  1       34       -2.796318   transaminase
XC_RS15250  1       33       -3.230592   alcohol dehydrogenase
XC_RS15255  0       33       -3.465216   isopenicillin N epimerase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS15120  TYPE3OMGPROT  324    e-105    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 324 bits (831), Expect = e-105
Identities = 103/309 (33%), Positives = 159/309 (51%), Gaps = 16/309 (5%)

Query: 300 ASVWPEMSQARRDAPLAVDAGS---GGELASDAPVIEADPRTNGILIRDRPERMAAYGTL 356
A++ + + VD AS +EADP N I++RD PERM Y L
Sbjct: 212 ATILQRVLSDATIQQVTVDNQRIPQAATRASAQARVEADPSLNAIIVRDSPERMPMYQRL 271

Query: 357 IQQLDNRPKLLQIDATIIEIRDGALQDLGVDWRFHSRRVDVQTGDGRGGQLGYDGSLSGA 416
I LD +++ +I++I L +LGVDWR V ++TG+ + G S
Sbjct: 272 IHALDKPSARIEVALSIVDINADQLTELGVDWR-----VGIRTGNNHQVVIKTTGDQSNI 326

Query: 417 AAAGAAAPLGGTLTAVLGDAGRYLMTRVSALEQTNKAKIVSTPQVATLDNVEAVMDHKQQ 476
A+ GA G+L G YL+ RV+ LE A++VS P + T +N +AV+DH +
Sbjct: 327 ASNGAL----GSLVDARGL--DYLLARVNLLENEGSAQVVSRPTLLTQENAQAVIDHSET 380

Query: 477 AFVRVSGYASADLYNLSAGVSLRVLPSVVPGSPNGQMRLDVRIEDGQLGANT--VDGIPV 534
+V+V+G A+L ++ G LR+ P V+ ++ L++ IEDG N+ ++GIP
Sbjct: 381 YYVKVTGKEVAELKGITYGTMLRMTPRVLTQGDKSEISLNLHIEDGNQKPNSSGIEGIPT 440

Query: 535 ITSSEITTQAFVNEGQSLLIAGYASDTDQTDLNNVPGLSRIPLVGNLFKHRQQSGSRLQR 594
I+ + + T A V GQSL+I G D L+ VP L IP +G LF+ + + R R
Sbjct: 441 ISRTVVDTVARVGHGQSLIIGGIYRDELSVALSKVPLLGDIPYIGALFRRKSELTRRTVR 500

Query: 595 LFLLTPHIV 603
LF++ P I+
Sbjct: 501 LFIIEPRII 509



Score = 236 bits (604), Expect = 5e-72
Identities = 68/230 (29%), Positives = 113/230 (49%), Gaps = 5/230 (2%)

Query: 16 LAAALLLGLLPLLPPHANAASVPWHSRSFKYVADRKDLKEVLRDLSASQSITTWISPEVT 75
+L G L LL ++ A + W + YVA + L+++L D A+ T +S ++
Sbjct: 8 FFKRVLTGTLLLLSSYSWAQELDWLPIPYVYVAKGESLRDLLTDFGANYDATVVVSDKIN 67

Query: 76 GTLSGKFEA-TPQKFLDDLSGTFGFVWYYDGSVLRIWGANETKNATLSLGAASTSALRDA 134
+SG+FE PQ FL ++ + VWYYDG+VL I+ +E + + L + + L+ A
Sbjct: 68 DKVSGQFEHDNPQDFLQHIASLYNLVWYYDGNVLYIFKNSEVASRLIRLQESEAAELKQA 127

Query: 135 LARMRLDDPRFPVRYDETAHLAVVSGPPGYVDTVAAIAKQVEQVARQR----DATEVQVF 190
L R + +PRF R D + L VSGPP Y++ V A +EQ + R A +++F
Sbjct: 128 LQRSGIWEPRFGWRPDASNRLVYVSGPPRYLELVEQTAAALEQQTQIRSEKTGALAIEIF 187

Query: 191 QLHYAQAADHTTRIGGQDIQVPGMASLLRNIYGVRGAPTAALPGPGANFG 240
L YA A+D T ++ PG+A++L+ + +
Sbjct: 188 PLKYASASDRTIHYRDDEVAAPGVATILQRVLSDATIQQVTVDNQRIPQA 237


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS15125  TYPE3IMRPROT  169    6e-54    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 169 bits (431), Expect = 6e-54
Identities = 51/240 (21%), Positives = 104/240 (43%), Gaps = 3/240 (1%)

Query: 8 LLALSSQGVSLLTLLALCGVRVFVLFFVLPATAQDSLPGMTRNGVIYVLSSFIAYGQPAD 67
L S Q +S L L +RV L P ++ S+P + G+ +++ IA PA+
Sbjct: 2 LQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPAN 61

Query: 68 ALARIEAAGLVGLVFKEAFIGLLIGFAASTVFWVAESVGLLIDDVSGYNNVQMINPLSGE 127
+ L L ++ IG+ +GF F + G +I G + ++P S
Sbjct: 62 DVPVFSFFAL-WLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120

Query: 128 QSTPVSTVLMQLAIVSFYALGGMLMLLGALFESFRWWPLSQLMPDMGAIGESFVIQQTDG 187
++ ++ LA++ F G L L+ L ++F P+ + + + +
Sbjct: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGG--EPLNSNAFLALTKAGSL 178

Query: 188 MMAAIVKLSAPVMLVLVLVDLAIGFVARAADKLDPSNLSQPIRGVLALLLLALLTSVFIA 247
+ + L+ P++ +L+ ++LA+G + R A +L + P+ + + L+A L +
Sbjct: 179 IFLNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAP 238


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS15140  FLGFLIH  30     0.008    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 29.8 bits (66), Expect = 0.008
Identities = 31/109 (28%), Positives = 43/109 (39%), Gaps = 9/109 (8%)

Query: 114 RLAEIVAHACEQVLHGHDPA----ALYARAAQALDGALDEANALQVSVHPDALDDARRAF 169
RL ++ A QV+ G P AL + Q L + Q+ VHPD L
Sbjct: 119 RLMQMALEAARQVI-GQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDML 177

Query: 170 DAAAAAGGWSMPVELCGDTTLALGACVCEWDTGVFETDLRDQLRSLRRV 218
A + GW L GD TL G C D G + + + + L R+
Sbjct: 178 GATLSLHGW----RLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRL 222


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS15150  FLGMRINGFLIF  81     1e-19    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 81.2 bits (200), Expect = 1e-19
Identities = 37/165 (22%), Positives = 71/165 (43%), Gaps = 8/165 (4%)

Query: 23 LYSGLTENDANDMLAVLLTAGVDAEKLTPDDGKTWAVNAPHDQVAYALNVLRTHGMPHER 82
L+S L++ D ++A L + G A+ P D+V L G+P +
Sbjct: 53 LFSNLSDQDGGAIVAQLTQMNIPYR-FANGSG---AIEVPADKVHELRLRLAQQGLP--K 106

Query: 83 HANLG-EMFKKDGLISTPTEERVRFIYGVSQQLSQTLSNIDGVIAADVQIVLPNNDPLSA 141
+G E+ ++ + E+V + + +L++T+ + V +A V + +P
Sbjct: 107 GGAVGFELLDQEKFGISQFSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVR 166

Query: 142 SVKPSSAAVFIKFRVGSDLT-SLVPSIKTLVMHSVEGLTYENVSV 185
K SA+V + G L + ++ LV +V GL NV++
Sbjct: 167 EQKSPSASVTVTLEPGRALDEGQISAVVHLVSSAVAGLPPGNVTL 211


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS15165  TYPE3IMSPROT  332    e-115    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 332 bits (854), Expect = e-115
Identities = 110/349 (31%), Positives = 190/349 (54%), Gaps = 2/349 (0%)

Query: 1 MSDEKTEKPTEKKLQDARRDGEVPISPDVTAAAVLLAALLVMKLAGSYFVEHLRALMSIG 60
MS EKTE+PT KK++DAR+ G+V S +V + A+++A ++ Y+ EH LM I
Sbjct: 1 MSGEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLIP 60

Query: 61 FDFTTNTRDATALHRALGRIGIQGVLLTLPFVTACLAAGLIGTFVQTGLNASLKPVTPKF 120
+ + + AL + + ++ L P +T + VQ G S + + P
Sbjct: 61 AE-QSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDI 119

Query: 121 DSLNPVNGVKKLFSLRSLINLLKLGIKAAVIGVVLWYGLRALMPTIIGLAYQPPADIAQI 180
+NP+ G K++FS++SL+ LK +K ++ +++W ++ + T++ L I +
Sbjct: 120 KKINPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPL 179

Query: 181 GWRALGILCALAVLVFVLVGAADWSVQHWLFIRDKRMSKDEQKREHKESEGDPEVKGKRK 240
+ L L + + FV++ AD++ +++ +I++ +MSKDE KRE+KE EG PE+K KR+
Sbjct: 180 LGQILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRR 239

Query: 241 EFAKELVFGDPRERVAKAKVMVVNPTHYAVALAYEPDGFGLPQVVAKGVDEGALELRAYA 300
+F +E+ + RE V ++ V+V NPTH A+ + Y+ LP V K D +R A
Sbjct: 240 QFHQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIA 299

Query: 301 HNQGIPIVANPPLARAL-HEVELGEAVPESLFETVAVVLRWVDELGRDN 348
+G+PI+ PLARAL + + +P E A VLRW++ +
Sbjct: 300 EEEGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQNIEK 348


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS15180  TYPE3OMOPROT  53     3e-10    Type III secretion system outer membrane O protein ...
		>TYPE3OMOPROT#Type III secretion system outer membrane O protein family signature.

Length = 303

Score = 53.5 bits (128), Expect = 3e-10
Identities = 41/177 (23%), Positives = 77/177 (43%), Gaps = 15/177 (8%)

Query: 144 PSPLPAWLSALRVTTRLRIGQRTATAALLQSLRPGDVLLHALATAPVRSGELLWGIPGGA 203
P+ LR R IG +LL + GDVLL + A V G
Sbjct: 138 PAVGGGRPKMLRWPLRFVIGSSDTQRSLLGRIGIGDVLLIRTSRAEVYCYAKKLG----- 192

Query: 204 VLRAPVRLTLQQMILETAPTMQHDMPASDSSSSATDVAAL-ELPVQLEVD--QLALSLSV 260
++ I+ +QH ++++ +A + L +LPV+LE + ++L+
Sbjct: 193 -----HFNRVEGGIIVETLDIQHIEEENNTTETAETLPGLNQLPVKLEFVLYRKNVTLAE 247

Query: 261 LSGLQPGQILELSVPVDQADIRLVVYGQTIGIGRLLAVGEHLGVQILS-MSETAHAD 316
L + Q+L L + ++ ++ G +G G L+ + + LGV+I +SE+ + +
Sbjct: 248 LEAMGQQQLLSLPTNAEL-NVEIMANGVLLGNGELVQMNDTLGVEIHEWLSESGNGE 303


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS15185  TYPE3IMPPROT  245    7e-85    Type III secretion system inner membrane P protein ...
		>TYPE3IMPPROT#Type III secretion system inner membrane P protein

family signature.
Length = 224

Score = 245 bits (626), Expect = 7e-85
Identities = 80/219 (36%), Positives = 132/219 (60%), Gaps = 8/219 (3%)

Query: 3 MPDVGSLLLVVIMLGLLPFAAMVVTSYTKIVVVLGLLRNAIGVQQVPPNMVLNGVALLVS 62
M + SL+ ++ LLPF T + K +V ++RNA+G+QQ+P NM LNGVALL+S
Sbjct: 1 MGNDISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLS 60

Query: 63 CFVMAPVGMEAFKAAQNYSPGADN-SRVVVLLDACREPFRQFLLKHTREREKAFFIRSAQ 121
FVM P+ +A+ ++ ++ S + +D + +R +L+K++ FF +
Sbjct: 61 MFVMWPIMHDAYVYFEDEDVTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFFENAQL 120

Query: 122 QIWPKDKA-------DTLKPDDLLVLAPAFTLSELTEAFRIGFLLYLVFIVIDLVVANAL 174
+ ++ D ++ + L PA+ LSE+ AF+IGF LYL F+V+DLVV++ L
Sbjct: 121 KRQYGEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLVVSSVL 180

Query: 175 MAMGLSQVTPTNVAIPFKLLLFVALDGWSMLIHGLVLSY 213
+A+G+ ++P ++ P KL+LFVALDGW++L GL+L Y
Sbjct: 181 LALGMMMMSPVTISTPIKLVLFVALDGWTLLSKGLILQY 219


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS15190  TYPE3IMQPROT  62     9e-17    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein

family signature.
Length = 86

Score = 62.5 bits (152), Expect = 9e-17
Identities = 24/78 (30%), Positives = 43/78 (55%)

Query: 4 DDLVRFTSEALLLCLKVSLPVVGVAAVAGLLIAFIQAVMSLQDASISFALKLVVVVAAIA 63
DDLV ++AL L L +S VA + GLL+ Q V LQ+ ++ F +KL+ V +
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 VTAPWGASAIMQFGQALM 81
+ + W ++ +G+ ++
Sbjct: 62 LLSGWYGEVLLSYGRQVI 79


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS15220  PF05616  30     0.018    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 29.7 bits (66), Expect = 0.018
Identities = 17/48 (35%), Positives = 21/48 (43%), Gaps = 8/48 (16%)

Query: 14 LQPQDQPFSDGQPPTNPNPTPPMPPSRPPCSTNPGGQHGSGTGDGSGG 61
L P P +DGQP T P+ P P R P G+H +G G
Sbjct: 359 LNPDANPDTDGQPGTRPD--SPAVPDR------PNGRHRKERKEGEDG 398


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS15240  HTHTETR  69     8e-17    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 69.3 bits (169), Expect = 8e-17
Identities = 30/94 (31%), Positives = 49/94 (52%), Gaps = 3/94 (3%)

Query: 21 PRPPRADAQRNKTHILEVASDAFAEEGI-NVSMDSIAKRAGVGPGTLYRHFPSREALLAA 79
R + +AQ + HIL+VA F+++G+ + S+ IAK AGV G +Y HF + L +
Sbjct: 2 ARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE 61

Query: 80 LLETHHANIEQVRMAITAEEHDPGCALQRWIEAL 113
+ E +NI + + + + PG L E L
Sbjct: 62 IWELSESNIGE--LELEYQAKFPGDPLSVLREIL 93


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS15250  DHBDHDRGNASE  84     3e-21    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 83.9 bits (207), Expect = 3e-21
Identities = 69/269 (25%), Positives = 105/269 (39%), Gaps = 20/269 (7%)

Query: 1 MKQPGFEDKHVLITGGAGGIGIELIRVFANASCRVT-FTHRPGEDSRRRALDLVAAFGDD 59
M G E K ITG A GIG + R A+ + + P + + + A +
Sbjct: 1 MNAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAE 60

Query: 60 RVHAHALDVGDRRSHAAFMQALTMPVDIVIHNAGVGTKTVERAAATY---KEQDEAFFRV 116
A D A ++ P+DI+++ AGV R + E+ EA F V
Sbjct: 61 AFPADVRDSAAIDEITARIEREMGPIDILVNVAGV-----LRPGLIHSLSDEEWEATFSV 115

Query: 117 NTIGPLWLSEDLIPGMQARGQGKILFMSSVDGGITHFPRFR-AADGMSKAAVAFLGRQLA 175
N+ G S + M R G I+ + S PR AA SKAA + L
Sbjct: 116 NSTGVFNASRSVSKYMMDRRSGSIVTVGS---NPAGVPRTSMAAYASSKAAAVMFTKCLG 172

Query: 176 ATLACTGIDVFTVCPGATDTPMFQASTLNALSPQQR-----QELEASLPGGRLIEPREIA 230
LA I V PG+T+T M + + +Q + + +P +L +P +IA
Sbjct: 173 LELAEYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIA 232

Query: 231 DLCFYLCRDEAR--ILRGAVIDASLGLGV 257
D +L +A + +D LGV
Sbjct: 233 DAVLFLVSGQAGHITMHNLCVDGGATLGV 261


45  XC_RS15985  XC_RS16050  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
XC_RS15985   2      21       -0.567702  hypothetical protein
XC_RS15990   0      20        0.014229  ADP-dependent (S)-NAD(P)H-hydrate dehydratase
XC_RS15995  -1      20       -0.947278  phosphoglycerate mutase
XC_RS16000   0      15       -3.593886  TetR family transcriptional regulator
XC_RS16005   2      24       -5.073530  hypothetical protein
XC_RS16010   2      21       -5.265182  hypothetical protein
XC_RS16015   4      23       -7.097703  hypothetical protein
XC_RS16020   4      22       -6.883435  hypothetical protein
XC_RS16025   4      21       -6.989683  restriction endonuclease subunit R
XC_RS16030   5      27       -6.869126  restriction endonuclease S
XC_RS16035   4      22       -5.774036  2-hydroxyacid dehydrogenase
XC_RS16040   4      26       -5.880893  DNA methyltransferase
XC_RS16045   1      28       -2.559836  restriction endonuclease
XC_RS16050   1      33       -3.037050  plasmid stabilization protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
XC_RS15990  PERTACTIN  29     0.039    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 28.5 bits (63), Expect = 0.039
Identities = 40/148 (27%), Positives = 54/148 (36%), Gaps = 10/148 (6%)

Query: 30 GRVLIVGGSARVP-GAVMLAGEAALRAGAGKLQLA---TAASVAPGMAL-----AMPEAL 80
RV + GGS P G V+ G A R L+ A + A G AL P L
Sbjct: 322 ARVTVSGGSLSAPHGNVIETGGGARRFPPPASPLSITLQAGARAQGRALLYRVLPEPVKL 381

Query: 81 VLGLGENGQGEIVR-GHRALDAAMSACDAAVIGPGMASTNTTAALVKRAIDQAVCTLVLD 139
L G GQG+IV + A S + T T A+ +ID A + +
Sbjct: 382 TLAGGAQGQGDIVATELPPIPGASSGPLDVALASQARWTGATRAVDSLSIDNATWVMTDN 441

Query: 140 AGALSPRLRAPLGRPFVLTPHAGEMAAL 167
+ + RL + F AG L
Sbjct: 442 SNVGALRLASDGSVDFQQPAEAGRFKVL 469


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS16000  HTHTETR  44     9e-08    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 44.2 bits (104), Expect = 9e-08
Identities = 20/68 (29%), Positives = 32/68 (47%)

Query: 8 RTAKQREELLVDEVLDAAERCIDEIGLSATSFELIAERARLSCGTLLQRFEDKDALVRAL 67
R KQ + +LD A R + G+S+TS IA+ A ++ G + F+DK L +
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 68 LERNYMRA 75
E +
Sbjct: 63 WELSESNI 70


46  XC_RS16205  XC_RS16265  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
XC_RS16205   2      25       -4.035187  2-methylisocitrate lyase
XC_RS16210   3      30       -4.635849  propanediol utilization protein
XC_RS16215   5      36       -6.050583  hypothetical protein
XC_RS16220   2      30       -3.722206  hypothetical protein
XC_RS16225   1      22       -2.110742  hypothetical protein
XC_RS16230  -3      12       -0.208585  hypothetical protein
XC_RS16240  -1      11        1.415680  *type IV fimbriae assembly protein
XC_RS16245   1      11        1.432677  DNA polymerase III subunit delta'
XC_RS16250   1      13        1.469288  aminodeoxychorismate lyase
XC_RS16255   1      15        0.437488  hypothetical protein
XC_RS16260   3      18       -0.674408  3-oxoacyl-ACP synthase
XC_RS16265   2      19       -0.597478  acyl carrier protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
XC_RS16210  HTHFIS  322    e-107    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 322 bits (828), Expect = e-107
Identities = 139/415 (33%), Positives = 202/415 (48%), Gaps = 34/415 (8%)

Query: 142 EDARDCIADLRA--SGIQVIVGTGMA-IDFAEQAGLPGVLLYSADSVRLALDQAIALATA 198
E+A D + ++ + V+V + A +A G Y L + I +
Sbjct: 60 ENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKP--FDLTELIGIIGR 117

Query: 199 PSSAAPALRNATRRTRNQPAALLGEDTAMVQVRALVQLYAPHASTVLISGETGTGKELVA 258
+ + L+G AM ++ ++ T++I+GE+GTGKELVA
Sbjct: 118 ALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVA 177

Query: 259 RQLHAGSRRRGR-FVAINCGAITESLLEAELFGYSDGAFTGARRGGHTGLIEAAQDGTLF 317
R LH +RR FVAIN AI L+E+ELFG+ GAFTGA+ TG E A+ GTLF
Sbjct: 178 RALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTR-STGRFEQAEGGTLF 236

Query: 318 LDEIGELPLPLQTRLLRVLEEREVLRVGATVPVPVDVRVVAASLQPLQTLVDQGRFRRDL 377
LDEIG++P+ QTRLLRVL++ E VG P+ DVR+VAA+ + L+ ++QG FR DL
Sbjct: 237 LDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDL 296

Query: 378 FYRLAALRIDLPPLRNRPDDIALLLTHYAQHGSETALP---LSPAALQRLQQHAWPGNVR 434
+YRL + + LPPLR+R +DI L+ H+ Q + L AL+ ++ H WPGNVR
Sbjct: 297 YYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKEGLDVKRFDQEALELMKAHPWPGNVR 356

Query: 435 ELRNLVERLCI---------------------HWKAQAQGQVSLAQLEAWAPELADASTP 473
EL NLV RL + S + + A E
Sbjct: 357 ELENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYF 416

Query: 474 AGYDSTSADAASSRAHLTAT---LVRAALERAQGNRASAAKALGVSRTTLWRWMQ 525
A + + L L+ AAL +GN+ AA LG++R TL + ++
Sbjct: 417 ASFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIR 471


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
XC_RS16250  FLGHOOKAP1  30     0.019    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 29.5 bits (66), Expect = 0.019
Identities = 13/34 (38%), Positives = 20/34 (58%)

Query: 122 FTLVEGWNFRQLRAALGTATPLQQTIAGLDDAAL 155
++LV+G RQL A +A P + T+A +D A
Sbjct: 232 YSLVQGSTARQLAAVPSSADPSRTTVAYVDGTAG 265


47  XC_RS16445  XC_RS16490  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
XC_RS16445  3       13       -2.471370  lytic transglycosylase
XC_RS16450  4       13       -3.350201  hypothetical protein
XC_RS16455  4       14       -2.930031  peptidylprolyl isomerase
XC_RS16475  3       16       -3.038577  ***transcriptional regulator
XC_RS16480  3       17       -2.464515  peptidase
XC_RS16485  4       20       -1.886979  ATP-dependent Clp protease ATP-binding subunit
XC_RS16490  3       22       -0.948010  ATP-dependent Clp protease proteolytic subunit
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS16475  DNABINDINGHU  117    2e-38    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 117 bits (294), Expect = 2e-38
Identities = 54/88 (61%), Positives = 67/88 (76%)

Query: 2 NKTELIDGVAAAADISKAEAGRAVDAVVSEITKALKKGDAVTLVGFGTFQVRERAERTGR 61
NK +LI VA A +++K ++ AVDAV S ++ L KG+ V L+GFG F+VRERA R GR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 62 NPKTGDSIKIAASKNPAFKAGKALKDAV 89
NP+TG+ IKI ASK PAFKAGKALKDAV
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAV 90


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
XC_RS16480  HTHFIS  32     0.009    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 32.1 bits (73), Expect = 0.009
Identities = 31/124 (25%), Positives = 51/124 (41%), Gaps = 21/124 (16%)

Query: 333 ERILEYLAVQSRVKQMKGPILCLVGPPGVGKTSLGQSIAKATNRK---FVRMSLGGVRD- 388
+ E V +R+ Q ++ + G G GK + +++ R+ FV +++ +
Sbjct: 144 AAMQEIYRVLARLMQTDLTLM-ITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRD 202

Query: 389 --EAEIRGHRRTY----VGSMPGRLVQNLNKVGSKNPLFLLDEIDKMSMDFRGDPSSALL 442
E+E+ GH + GR Q LFL DEI M MD + + LL
Sbjct: 203 LIESELFGHEKGAFTGAQTRSTGRFEQ-----AEGGTLFL-DEIGDMPMDAQ----TRLL 252

Query: 443 EVLD 446
VL
Sbjct: 253 RVLQ 256


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
XC_RS16485  HTHFIS  32     0.005    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 32.1 bits (73), Expect = 0.005
Identities = 15/85 (17%), Positives = 32/85 (37%), Gaps = 10/85 (11%)

Query: 60 QSARSSLPKPREILEVLDQY----VIGQLRAKRTLAVAVYNHYKRIESRSKNDEVELAK- 114
+ A LPKP ++ E++ + R + + S + + +
Sbjct: 96 KGAYDYLPKPFDLTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLAR 155

Query: 115 -----SNILLVGPTGSGKTLLAETL 134
+++ G +G+GK L+A L
Sbjct: 156 LMQTDLTLMITGESGTGKELVARAL 180


48  XC_RS17060  XC_RS17170  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias   Product
XC_RS17060   0      14       3.458312  hypothetical protein
XC_RS17065   0      10       1.285597  acylphosphatase
XC_RS17070  -1      12       0.794207  thioredoxin
XC_RS17075  -2      12       0.646480  ribonuclease BN
XC_RS17080   1      10       1.229730  NAD(P)H quinone oxidoreductase
XC_RS17085   2      11       1.127671  membrane protein
XC_RS17090   2      11       0.846069  xylanase
XC_RS17095   3      13       2.104464  asparaginase
XC_RS17100   2      14       2.527195  transcriptional regulator
XC_RS17105   2      14       2.127100  peptidase S8
XC_RS17110   1      17       2.365755  peptidase S8
XC_RS17115   0      15       2.045913  peptidase
XC_RS17120   0      15       2.163226  extracellular protease
XC_RS17125  -1      15       2.216381  branched-chain amino acid aminotransferase
XC_RS17130   1      11       2.729299  hypothetical protein
XC_RS17135   0      12       3.959395  membrane protein
XC_RS17140   1       9       2.873135  RNA polymerase sigma70 factor
XC_RS17145   0      10       3.646346  membrane protein
XC_RS17150   0       9       3.610658  hypothetical protein
XC_RS17155   0       9       3.443758  transcriptional regulator
XC_RS17160  -1       9       3.039744  hydrolase
XC_RS17165  -1      11       3.029720  hypothetical protein
XC_RS17170  -2      12       3.152957  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
XC_RS17105  SUBTILISIN  200    9e-62    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 200 bits (511), Expect = 9e-62
Identities = 101/339 (29%), Positives = 139/339 (41%), Gaps = 57/339 (16%)

Query: 134 DPGLPQQWAMGTTTASL---NVRPAWDRTTGKGIVVAVIDTGI-TAHPDLAANVLPGYDF 189
+ Q+ + + W++T G+G+ VAV+DTG HPDL A ++ G +F
Sbjct: 10 YQVIKQEQQVNEIPRGVEMIQAPAVWNQTRGRGVKVAVLDTGCDADHPDLKARIIGGRNF 69

Query: 190 ITDPTVAGDGNGRDNNAADQGDWSAANACGAGASASNSSWHGTHVAGIVAAVGNNAAGVV 249
D G + + HGTHVAG +AA N GVV
Sbjct: 70 TDDDE------------------------GDPEIFKDYNGHGTHVAGTIAATENEN-GVV 104

Query: 250 GTAFNAKLLPLRVLGKCG-GYMSDIADAIVWASGGKVTGVPANPNPATVINLSLGGYGSC 308
G A A LL ++VL K G G I I +A +I++SLGG
Sbjct: 105 GVAPEADLLIIKVLNKQGSGQYDWIIQGIYYA----------IEQKVDIISMSLGGPEDV 154

Query: 309 STIIGNAITGAVTRGTAVVVAAGNSNMDVAT----SMPANCANVIAVAATTSAGAKASFS 364
+ A+ AV V+ AAGN P VI+V A + FS
Sbjct: 155 PEL-HEAVKKAVASQILVMCAAGNEGDGDDRTDELGYPGCYNEVISVGAINFDRHASEFS 213

Query: 365 NFGKGVDIAAPGQAIISTLNSGTTVPANPAYAVYSGTSMAAPHVAGVVALMQSVALN--- 421
N VD+ APG+ I+ST+ G YA +SGTSMA PHVAG +AL++ +A
Sbjct: 214 NSNNEVDLVAPGEDILSTVPGGK-------YATFSGTSMATPHVAGALALIKQLANASFE 266

Query: 422 -PLTPATVEALLKSSARPLPVACAPGCGAGLVNADGAVA 459
LT + A L PL + G GL+
Sbjct: 267 RDLTEPELYAQLIKRTIPLGNS-PKMEGNGLLYLTAVEE 304


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
XC_RS17110  SUBTILISIN  192    4e-61    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 192 bits (490), Expect = 4e-61
Identities = 103/348 (29%), Positives = 139/348 (39%), Gaps = 58/348 (16%)

Query: 2 EIDQIMYPTLTPNDTRLSEQWGFGTTASGINVRPAWDTATGTGVVVAVIDTGI-TSHPDL 60
++ I Y + G I W+ G GV VAV+DTG HPDL
Sbjct: 4 KVHIIPYQVIKQEQQVNEIPRGVEM----IQAPAVWNQTRGRGVKVAVLDTGCDADHPDL 59

Query: 61 NANVLPGYDFISDAARARDNNGRDSNAADQGDWRTANQCGTGVAAANSSWHGTHVAGTIA 120
A ++ G +F D++ D + HGTHVAGTIA
Sbjct: 60 KARIIGGRNFT-------DDDEGDPEIFKDYNG-----------------HGTHVAGTIA 95

Query: 121 AVTNNSTGVAGTAFNARIVPIRALGLCG-GSSSDIADAIVWASGGTVSGVPANANPAEVI 179
A T N GV G A A ++ I+ L G G I I +A ++I
Sbjct: 96 A-TENENGVVGVAPEADLLIIKVLNKQGSGQYDWIIQGIYYA----------IEQKVDII 144

Query: 180 NMSLGGNGTCSNTYQNAINGAVSRGTTVVVAAGNSNANVAN----FTPASCANVISVASI 235
+MSLGG A+ AV+ V+ AAGN P VISV +I
Sbjct: 145 SMSLGGPED-VPELHEAVKKAVASQILVMCAAGNEGDGDDRTDELGYPGCYNEVISVGAI 203

Query: 236 TSAGARSSFSNFGTTIDISGPGSAILSTLNSGTTTPGSASYASYNGTSMAAPHVAGVVAL 295
S FSN +D+ PG ILST+ G YA+++GTSMA PHVAG +AL
Sbjct: 204 NFDRHASEFSNSNNEVDLVAPGEDILSTVPGGK-------YATFSGTSMATPHVAGALAL 256

Query: 296 VQSAAS----RPLTPAAVETLLKNTARPLPGACSGGCGAGIVNAAGAV 339
++ A+ R LT + L PL + G G++
Sbjct: 257 IKQLANASFERDLTEPELYAQLIKRTIPLGNS-PKMEGNGLLYLTAVE 303


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
XC_RS17120  SUBTILISIN  194    4e-59    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 194 bits (494), Expect = 4e-59
Identities = 99/335 (29%), Positives = 141/335 (42%), Gaps = 57/335 (17%)

Query: 147 QWAFGTTNAGL---NIRPAWDKATGSGTVVAVIDTGI-TSHADLNANILAGYDFISDATT 202
+ G+ W++ G G VAV+DTG H DL A I+ G +F D
Sbjct: 16 EQQVNEIPRGVEMIQAPAVWNQTRGRGVKVAVLDTGCDADHPDLKARIIGGRNFTDD--- 72

Query: 203 ARDGNGRDSNAADEGDWYAANECGAGIPAASSSWHGTHVAGTVAAVTNNTTGVAGTAYGA 262
DEGD + HGTHVAGT+AA T N GV G A A
Sbjct: 73 ------------DEGDPE---------IFKDYNGHGTHVAGTIAA-TENENGVVGVAPEA 110

Query: 263 KVVPVRVLGKCG-GSLSDIADAIVWASGGTVSGIPANANPAEVINMSLGGGGSCSTTMQN 321
++ ++VL K G G I I +A ++I+MSLGG +
Sbjct: 111 DLLIIKVLNKQGSGQYDWIIQGIYYA----------IEQKVDIISMSLGGPED-VPELHE 159

Query: 322 AINGAVSRGTTVVVAAGNDASNVSG----SLPANCANVIAVAATTSAGAKASYSNFGTGI 377
A+ AV+ V+ AAGN+ P VI+V A + +SN +
Sbjct: 160 AVKKAVASQILVMCAAGNEGDGDDRTDELGYPGCYNEVISVGAINFDRHASEFSNSNNEV 219

Query: 378 DVSAPGSSILSTLNSGTTTPGSASYASYNGTSMASPHVAGVVALVQSVAPTA----LTPA 433
D+ APG ILST+ G YA+++GTSMA+PHVAG +AL++ +A + LT
Sbjct: 220 DLVAPGEDILSTVPGGK-------YATFSGTSMATPHVAGALALIKQLANASFERDLTEP 272

Query: 434 AVETLLKNTARALPGACSGGCGAGIVNADAAVTAA 468
+ L L + G G++ A +
Sbjct: 273 ELYAQLIKRTIPLGNS-PKMEGNGLLYLTAVEELS 306


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS17155  HTHTETR  48     5e-09    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 47.7 bits (113), Expect = 5e-09
Identities = 19/120 (15%), Positives = 41/120 (34%), Gaps = 3/120 (2%)

Query: 37 AALRRAAWEIVAESGARALTLRACARRAGVSHAAPAHHFGSLNGLVAEMVADGYERMVAR 96
+ A + ++ G + +L A+ AGV+ A HF + L +E+ +
Sbjct: 14 QHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGEL 73

Query: 97 IAATQQDVSD---PVMGCGLGYIRFAIEFPQHFRLMLSLDLRAHPLPRLAQASEAARACL 153
Q V+ L ++ + + RL++ + + A+ L
Sbjct: 74 ELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQRNL 133


49  XC_RS17795  XC_RS17825  Y  N  N  Genomic Island
LocusTag    DNBias  CDNBias  %GCBias   Product
XC_RS17795  2       17       1.855928  16S rRNA methyltransferase
XC_RS17800  1       20       3.820751  transcriptional regulator MraZ
XC_RS17805  2       23       4.738257  hypothetical protein
XC_RS17810  2       20       4.197303  plasmid maintenance protein CcdB
XC_RS17815  0       14       4.103229  CcdB cytotoxin-like protein
XC_RS17820  0       13       3.513134  16S rRNA methyltransferase
XC_RS17825  1       12       3.239468  hypothetical protein
50  XC_RS17995  XC_RS18195  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
XC_RS17995   2      10        2.574599  glycosyl transferase
XC_RS18000   0      12       -0.480569  aminoglycoside phosphotransferase
XC_RS18005   0      13       -0.318494  TonB-dependent receptor
XC_RS18010   1      16       -0.393333  aminotransferase
XC_RS18015   1      15        0.364235  transcriptional regulator
XC_RS18020   1      16        0.208823  hypothetical protein
XC_RS18025   0      18       -0.215776  type II secretion system protein D
XC_RS18030   3      17        3.439358  general secretion pathway protein N
XC_RS18035   3      17        3.255502  general secretion pathway protein M
XC_RS18040   1      16        2.812113  type II secretion system protein L
XC_RS18045   3      18        1.990590  general secretion pathway protein K
XC_RS18050   2      13        1.522865  general secretion pathway protein J
XC_RS18055   1      12        0.442637  type II secretion system protein I
XC_RS18060  -2      15        0.913724  type II secretion system protein H
XC_RS18065   0      16        1.919308  type II secretion system protein G
XC_RS18070   0      16        2.029493  hypothetical protein
XC_RS18075   0      16        2.068780  type II secretion system protein F
XC_RS18080  -1      13        2.326766  general secretion pathway protein E
XC_RS18085   0      14        2.772034  protease
XC_RS18090  -1      13        3.019537  membrane protein
XC_RS18095  -1       9        2.890020  hypothetical protein
XC_RS18100  -2      10        2.479462  hypothetical protein
XC_RS18105  -3      11        2.127479  phosphoribosylformylglycinamidine synthase
XC_RS18110   2      13        1.690072  chitinase
XC_RS18115   1      14        2.042176  tyrosine recombinase XerD
XC_RS18120   0      15        1.471515  membrane protein
XC_RS18125   0      12       -1.822893  hypothetical protein
XC_RS18130  -2      11       -1.461669  membrane protein
XC_RS18135  -1      12       -1.574942  membrane protein
XC_RS18140   0      12       -2.520641  cytosol aminopeptidase
XC_RS18145   1      14       -4.277652  DNA polymerase III subunit chi
XC_RS18150   1      14       -4.347330  ATPase
XC_RS18155  -1      14       -1.704065  valine--tRNA ligase
XC_RS18160  -1      14       -1.224084  hypothetical protein
XC_RS18165  -1      15       -0.580173  pectate lyase
XC_RS18170  -1      15        0.682343  pectate lyase
XC_RS18175   1      17        2.126414  alanine acetyltransferase
XC_RS18180   1      16        1.956344  alanine acetyltransferase
XC_RS18185   1      15        0.989062  CDP-diacylglycerol--serine
XC_RS18190   2      16        0.747818  hypothetical protein
XC_RS18195   2      15        0.161643  proline--tRNA ligase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS18025  BCTERIALGSPD  408    e-135    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D

signature.
Length = 660

Score = 408 bits (1050), Expect = e-135
Identities = 175/703 (24%), Positives = 298/703 (42%), Gaps = 115/703 (16%)

Query: 65 VIRRGSGTMINQSAAAAPSPTLGMASSGSATFNFEGESVQAVVKAILGDMLGQNYVIAPG 124
VIR S T++ +A A++ + +F+G +Q + + + L + +I P
Sbjct: 6 VIRSFSLTLLIFAALLFRP-----AAAEEFSASFKGTDIQEFINTVSKN-LNKTVIIDPS 59

Query: 125 VQGTVTLATPNPVSPAQALNLLEMVLG-WNNARMVFSGGRYNIVPA-DQALAGTVAPSTA 182
V+GT+T+ + + ++ Q VL + A + + G +V + D A S A
Sbjct: 60 VRGTITVRSYDMLNEEQYYQFFLSVLDVYGFAVINMNNGVLKVVRSKDAKTAAVPVASDA 119

Query: 183 SPSAARGFEVRVVPLKYISASEMKKVLEPYARPNAIVGTDA---SRNVITLGGTRAELEN 239
+P RVVPL ++A ++ +L NA VG+ NV+ + G A ++
Sbjct: 120 APGIGDEVVTRVVPLTNVAARDLAPLLRQL-NDNAGVGSVVHYEPSNVLLMTGRAAVIKR 178

Query: 240 YLRTVQIFDVDWLSGMSVGVFPIQSGKAEKISADLEKVFGEQSKT--PSAGMFRFMPLEN 297
L V+ VD SV P+ A + + ++ + SK+ P + + + E
Sbjct: 179 LLTIVE--RVDNAGDRSVVTVPLSWASAADVVKLVTELNKDTSKSALPGSMVANVVADER 236

Query: 298 ANAVLVI---TPQPRYLDQIQQWLDRIDSAGGGVRLFSYELKYIKAKDLADRLSEVFGGR 354
NAVLV + R + I+Q LDR + G ++ LKY KA DL + L+ +
Sbjct: 237 TNAVLVSGEPNSRQRIIAMIKQ-LDRQQATQGNTKVIY--LKYAKASDLVEVLTGI---- 289

Query: 355 GNGGNSGPSLVPGGVVNMLGNNSGGADRDESLGSSSGATGGDIGGTSNGSSQSGTSGSFG 414
+S S+ +
Sbjct: 290 ---------------------------------------------SSTMQSEKQAAKPVA 304

Query: 415 GSSGSGMLQLPPSTNQNGSVTLEVEGDKVGVSAVAETNTLLVRTSAQAWKSIRDVIEKLD 474
+ +++ TN L V ++ + VI +LD
Sbjct: 305 ALDKNIIIKAHGQTN-----ALIVTAAPDVMNDLER------------------VIAQLD 341

Query: 475 VMPMQVHIEAQIAEVTLTGRLQYGVNWYFENAVTTPSNADGSGGPNLPSAAGRGIW---G 531
+ QV +EA IAEV L G+ W +NA T SG P + AG + G
Sbjct: 342 IRRPQVLVEAIIAEVQDADGLNLGIQWANKNAGMT--QFTNSGLPISTAIAGANQYNKDG 399

Query: 532 DVSGSVTS-----NGVAWTFLGKNAAAIISALDQVTNLRLLQTPSVFVRNNAEATLNVGS 586
VS S+ S NG+A F N A +++AL T +L TPS+ +N EAT NVG
Sbjct: 400 TVSSSLASALSSFNGIAAGFYQGNWAMLLTALSSSTKNDILATPSIVTLDNMEATFNVGQ 459

Query: 587 RIPINSTSINTGLGSDSSFSSVQYIDTGVILKVRPRVTKDGMVFLDIVQEVSTPGARPAA 646
+P+ + S T D+ F++V+ G+ LKV+P++ + V L+I QEVS+
Sbjct: 460 EVPVLTGSQTT--SGDNIFNTVERKTVGIKLKVKPQINEGDSVLLEIEQEVSSV------ 511

Query: 647 CTAAATTTVNSAACNVDINTRRVKTEAAVQNGDTIMLAGLIDDSTTDGSNGIPFLSKLPV 706
A + S+ NTR V V +G+T+++ GL+D S +D ++ +P L +PV
Sbjct: 512 ---ADAASSTSSDLGATFNTRTVNNAVLVGSGETVVVGGLLDKSVSDTADKVPLLGDIPV 568

Query: 707 VGALFGRKTQNSDRREVIVLITPSIVRNPQDARDLTDEYGSKF 749
+GALF ++ +R +++ I P+++R+ + R + + F
Sbjct: 569 IGALFRSTSKKVSKRNLMLFIRPTVIRDRDEYRQASSGQYTAF 611


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
XC_RS18030  TONBPROTEIN  34     5e-04    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 33.8 bits (77), Expect = 5e-04
Identities = 20/84 (23%), Positives = 27/84 (32%), Gaps = 7/84 (8%)

Query: 145 IEGPGGTQTLELQVFNGQGGQPPTAIGGRPQAPGAVPPLPPNVPPAPATPAPPPAEVPQQ 204
IE P Q + + + +PP QA P P P PP E P
Sbjct: 36 IELPAPAQPISVTMVTPADLEPP-------QAVQPPPEPVVEPEPEPEPIPEPPKEAPVV 88

Query: 205 QPGGQAPPTVPPQRSDGAQEAPRP 228
+ P P+ QE P+
Sbjct: 89 IEKPKPKPKPKPKPVKKVQEQPKR 112


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS18055  PilS_PF08805  34     9e-05    PilS N terminal
		>PilS_PF08805#PilS N terminal

Length = 185

Score = 34.1 bits (78), Expect = 9e-05
Identities = 7/55 (12%), Positives = 22/55 (40%), Gaps = 4/55 (7%)

Query: 1 MKHQRGYSLIEVIVAFALLALALTLLLGSLSGAARQVRGADDSTRATLHAQSLLA 55
+ +G +L+EV++ ++ + S + S+ + +++A
Sbjct: 22 KEQDKGATLMEVLLVVGVIVVLAASAYKLYSMV----QSNIQSSNEQNNVLTVIA 72


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS18060  BCTERIALGSPH  31     0.002    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H

signature.
Length = 170

Score = 30.7 bits (69), Expect = 0.002
Identities = 24/108 (22%), Positives = 49/108 (45%), Gaps = 1/108 (0%)

Query: 21 QLRGSSLLEMLLVIALIALAGVLAAAALTGGIDGMRLRSAGKAIAAQLRYTRTQAIATGT 80
+ RG +LLEM+L++ L+ ++ + A D ++ + AQLR+ + + + TG
Sbjct: 2 RQRGFTLLEMMLILLLMGVSAGMVLLAFPASRDDSAAQTLAR-FEAQLRFVQQRGLQTGQ 60

Query: 81 PQRFLIDPQQRRWEAPGGHHGDLPAALEVRFTGARQVQSRQDQGAIQF 128
+ P + ++ G PA + ++G R + R + A
Sbjct: 61 FFGVSVHPDRWQFLVLEARDGADPAPADDGWSGYRWLPLRAGRVATSG 108


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS18065  BCTERIALGSPG  141    2e-46    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G

signature.
Length = 145

Score = 141 bits (358), Expect = 2e-46
Identities = 41/132 (31%), Positives = 61/132 (46%), Gaps = 18/132 (13%)

Query: 15 QAGMSLLEIIIVIVLIGAVLTLVGSRVLGGADRGKANLAKSQIQTLAGKIENFQLDTGKL 74
Q G +LLEI++VIV+IG + +LV ++G ++ A S I L ++ ++LD
Sbjct: 7 QRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKLDNHHY 66

Query: 75 PSKLDDLVTQPGGSSGWLGPYAKPVELN------------DPWGHTIEYRVPGDGQAFDL 122
P+ T G S P P+ N DPWG+ PG+ A+DL
Sbjct: 67 PT------TNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLVNPGEHGAYDL 120

Query: 123 ISLGKDGRPGGS 134
+S G DG G
Sbjct: 121 LSAGPDGEMGTE 132


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS18075  BCTERIALGSPF  436    e-154    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F

signature.
Length = 408

Score = 436 bits (1122), Expect = e-154
Identities = 135/411 (32%), Positives = 215/411 (52%), Gaps = 12/411 (2%)

Query: 1 MPLYRYKALDAHGEMLDGQMEAANDAEVALRLQEQGHLPV---ETRLATGENGSPSLRML 57
M Y Y+ALDA G+ G EA + + L+E+G +P+ E R ++GS L L
Sbjct: 1 MAQYHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLS-L 59

Query: 58 LRKKPFDNAALVQFTQQLATLIGAGQPLDRALSILMDLPEDDKSRRVIADIRDTVRGGAP 117
RK + L T+QLATL+ A PL+ AL + E +++A +R V G
Sbjct: 60 RRKIRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHS 119

Query: 118 LSVALERQHGLFSKLYINMVRAGEAGGSMQDTLQRLADYLERSRALKGKVINALIYPAIL 177
L+ A++ G F +LY MV AGE G + L RLADY E+ + ++ ++ A+IYP +L
Sbjct: 120 LADAMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVL 179

Query: 178 LAVVGCALLFLLGYVVPQFAQMYESLDVALPWFTQAVLSVGLLVRDW--WLVLVVIPGVL 235
V + LL VVP+ + + + ALP T+ ++ + VR + W++L ++ G +
Sbjct: 180 TVVAIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFM 239

Query: 236 G--LWLDRKRRNAAFRAALDAWLLRQKVIGSLIARLETARLTRTLGTLLRNGVPLLAAIG 293
+ L +++R R + LL +IG + L TAR RTL L + VPLL A+
Sbjct: 240 AFRVMLRQEKR----RVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMR 295

Query: 294 IARNVMSNTALVEDVAAAADDVKNGHGLSMSLARGKRFPRLALQMIQVGEESGALDTMLL 353
I+ +VMSN ++ A D V+ G L +L + FP + MI GE SG LD+ML
Sbjct: 296 ISGDVMSNDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLE 355

Query: 354 KTADTFELETAQAIDRALAALVPLITLVLASVVGLVIISVLVPLYDLTNAI 404
+ AD + E + + AL PL+ + +A+VV +++++L P+ L +
Sbjct: 356 RAADNQDREFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLM 406


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
XC_RS18085  SUBTILISIN  204    1e-62    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 204 bits (521), Expect = 1e-62
Identities = 103/362 (28%), Positives = 152/362 (41%), Gaps = 68/362 (18%)

Query: 156 PQLVPNDPLYAQYQWHLSNPNGGINAPAAWDLSQGAGVVVAVLDTGILPGHPDFAGNLLQ 215
Q++ + + + I APA W+ ++G GV VAVLDTG HPD ++
Sbjct: 10 YQVIKQEQQVNEIPRGV----EMIQAPAVWNQTRGRGVKVAVLDTGCDADHPDLKARIIG 65

Query: 216 GYDFITDAEVSRRPTDARVPGALDYGDWEEADNVCYDGSVAQESSWHGTHVSGTVAEATH 275
G +F D+ D + ++ + HGTHV+GT+A AT
Sbjct: 66 GRNF--------------------------TDDDEGDPEIFKDYNGHGTHVAGTIA-ATE 98

Query: 276 NGVGMAGVAPKATILPVRVLGRCG-GYTSDIADAIVWASGGTVDGVPANTNPAEVINMSL 334
N G+ GVAP+A +L ++VL + G G I I +A VD +I+MSL
Sbjct: 99 NENGVVGVAPEADLLIIKVLNKQGSGQYDWIIQGIYYAIEQKVD----------IISMSL 148

Query: 335 GGGEPCDPATQVAINGAVSRGTTVVVAAGNSGEDAAN----HSPASCNNTITVGATRITG 390
GG E A+ AV+ V+ AAGN G+ P N I+VGA
Sbjct: 149 GGPEDVP-ELHEAVKKAVASQILVMCAAGNEGDGDDRTDELGYPGCYNEVISVGAINFDR 207

Query: 391 GIIYYSNYGSQVDLSGPGGGSVDGNPGGYIWQAGYDGATTPTSGSYSYMGIGGTSMASPH 450
+SN ++VDL PG I G Y GTSMA+PH
Sbjct: 208 HASEFSNSNNEVDLVAPGED---------ILSTVPGG---------KYATFSGTSMATPH 249

Query: 451 VAGVVALVQSASIGLGDGPLTPAAMEALLKQTSRRFPVTPPTSTPIGSGIVDAKAALEAV 510
VAG +AL++ + + LT + A L + + +P G+G++ A E
Sbjct: 250 VAGALALIKQLANASFERDLTEPELYAQLIKRTIPLGNSPKME---GNGLLYLTAVEELS 306

Query: 511 LV 512
+
Sbjct: 307 RI 308


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
XC_RS18090  OMADHESIN  56     2e-09    Yersinia outer membrane adhesin signature.
		>OMADHESIN#Yersinia outer membrane adhesin signature.

Length = 455

Score = 55.7 bits (133), Expect = 2e-09
Identities = 71/241 (29%), Positives = 108/241 (44%), Gaps = 24/241 (9%)

Query: 1270 GAESVADGTSAAAFGFGAEATSNYSTALGGYSTASGFNSTALGNFSTASGSSSVAVGGDA 1329
G + A G + A G AEA + A+G S A+G NS A+G S A G S+V G +
Sbjct: 62 GLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAAS 121

Query: 1330 TASGAYSIAAGQASVASGYNSVAVGGALLGLLPTEASGDFSTALGGAAWAPGLNSTALGN 1389
TA + VA+G ++ D A+G + A NS A+G+
Sbjct: 122 TAQK---------------DGVAIGA-------RASTSDTGVAVGFNSKADAKNSVAIGH 159

Query: 1390 FAESTGES--SVALGADSVADRDFAVSVGSAGNERQITNVAAGTQGTDAVNLDQLTAVAE 1447
+ S+A+G S DR+ +VS+G RQ+T++AAGT+ TDAVN+ QL E
Sbjct: 160 SSHVAANHGYSIAIGDRSKTDRENSVSIGHESLNRQLTHLAAGTKDTDAVNVAQLKKEIE 219

Query: 1448 TAQGTSKYFKASGSDDSDAGAYIEGDNALAAGEGANASSDNSTAVGAGAQAVAENATAVG 1507
Q + A +++A A + + L S T A +A A++ +
Sbjct: 220 KTQENTNKRSAELLANANAYADNKSSSVLGIANNYTDSKSAETLENARKEAFAQSKDVLN 279

Query: 1508 M 1508
M
Sbjct: 280 M 280



Score = 55.7 bits (133), Expect = 2e-09
Identities = 65/193 (33%), Positives = 94/193 (48%), Gaps = 22/193 (11%)

Query: 72 GRGASAPAAHATAIGAGSNASATGAVATGADSSASGVNSSAIGRQTNAIGENAVAIGYNS 131
G ASA H+ AIGA + A+ AVA GA S A+GVNS AIG + A+G++AV G S
Sbjct: 62 GLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAAS 121

Query: 132 FVRQAG----------ENGVALGANAGVTGANSVALGAGSRTHEDDVVSVGSGNGRGG-- 179
++ G + GVA+G N+ NSVA+G S + S+ G+
Sbjct: 122 TAQKDGVAIGARASTSDTGVAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRSKTDR 181

Query: 180 ---------PATRRITNVSAGVNANDAVNVAQLQAVSEVAEDTATFFKAQPGDDSIGAYA 230
R++T+++AG DAVNVAQL+ E ++ A+ ++ AYA
Sbjct: 182 ENSVSIGHESLNRQLTHLAAGTKDTDAVNVAQLKKEIEKTQENTNKRSAELLANA-NAYA 240

Query: 231 DGAGAVAAGDAAN 243
D + G A N
Sbjct: 241 DNKSSSVLGIANN 253



Score = 50.3 bits (119), Expect = 8e-08
Identities = 54/164 (32%), Positives = 81/164 (49%), Gaps = 5/164 (3%)

Query: 2104 AATSTAVGNAAVANHITGTAIGGSAYAHGPNDTAIGSNARVNADGSTAVGANTQIAAVAT 2163
AT+ A AAVA A G ++ A GP A+G +A STA I A A+
Sbjct: 76 GATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAASTAQKDGVAIGARAS 135

Query: 2164 NA---VAMGEGAQVSAASGTAIGQGARASAQG--AVALGQGSVADRANTVSVGSVGGERQ 2218
+ VA+G ++ A + AIG + +A ++A+G S DR N+VS+G RQ
Sbjct: 136 TSDTGVAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRSKTDRENSVSIGHESLNRQ 195

Query: 2219 VANVAAGTRATDAVNKGQLDSGVAAANSYTDSRYSAMADSFETY 2262
+ ++AAGT+ TDAVN QL + T+ R + + + Y
Sbjct: 196 LTHLAAGTKDTDAVNVAQLKKEIEKTQENTNKRSAELLANANAY 239



Score = 49.5 bits (117), Expect = 1e-07
Identities = 51/144 (35%), Positives = 72/144 (50%), Gaps = 3/144 (2%)

Query: 834 GANAYAADTGSIAVGTYANAYGPRAISLGGQSNAAGDESIALGWEAQAEGDQGIALGAGS 893
G NA A SIA+G A A A+++G S A G S+A+G ++A GD + GA S
Sbjct: 62 GLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAAS 121

Query: 894 QADAYSTAIGGYATASGASATAVGNNSRAVDGYATALGSDSMASGN--FSTTVGGASVAS 951
A AIG A+ S + AVG NS+A + A+G S + N +S +G S
Sbjct: 122 TAQKDGVAIGARASTSD-TGVAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRSKTD 180

Query: 952 GRGATAIGAESVARMDRDTAIGTE 975
+ +IG ES+ R A GT+
Sbjct: 181 RENSVSIGHESLNRQLTHLAAGTK 204



Score = 49.5 bits (117), Expect = 1e-07
Identities = 59/165 (35%), Positives = 82/165 (49%), Gaps = 26/165 (15%)

Query: 371 GTQTSASGTSSTAVGGPVDLIPGLGFFVQTQASGEAASALGAGAIASGTYTTAVGTLSEA 430
G SA G S A+G +A+ AA A+GAG+IA+G + A+G LS+A
Sbjct: 62 GLNASAKGIHSIAIGA------------TAEAAKGAAVAVGAGSIATGVNSVAIGPLSKA 109

Query: 431 SGTEATAVGYFAYAPGEG------------ATAVGPESWASGELSTALGYYS--TARGAN 476
G A G + A +G AVG S A + S A+G+ S A
Sbjct: 110 LGDSAVTYGAASTAQKDGVAIGARASTSDTGVAVGFNSKADAKNSVAIGHSSHVAANHGY 169

Query: 477 SVALGANSVATRADTVSVGAAGAERQITSVAAGTEGTDAVNLNQL 521
S+A+G S R ++VS+G RQ+T +AAGT+ TDAVN+ QL
Sbjct: 170 SIAIGDRSKTDRENSVSIGHESLNRQLTHLAAGTKDTDAVNVAQL 214



Score = 46.8 bits (110), Expect = 9e-07
Identities = 56/159 (35%), Positives = 79/159 (49%), Gaps = 4/159 (2%)

Query: 650 ALGVGSVAFGGTSTAVGGASVAFGTDSAAFGANAAAGGTASTAIGANSNAFGERTVALGG 709
ALG+ A G + A G S A GA A A A+ A+GA S A G +VA+G
Sbjct: 46 ALGLEYPVRPPVPGAGGLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGP 105

Query: 710 ASNASGDESIALGVSSLASALGTTAVGSNANASIANATAVGFNSSAGDDYATALGGDSN- 768
S A GD ++ G +S A G A+G+ A+ S AVGFNS A + A+G S+
Sbjct: 106 LSKALGDSAVTYGAASTAQKDG-VAIGARASTS-DTGVAVGFNSKADAKNSVAIGHSSHV 163

Query: 769 -ASGYFSTAVGGTSIANGRGATAIGYESIGNGAASTALG 806
A+ +S A+G S + + +IG+ES+ A G
Sbjct: 164 AANHGYSIAIGDRSKTDRENSVSIGHESLNRQLTHLAAG 202



Score = 45.3 bits (106), Expect = 2e-06
Identities = 64/211 (30%), Positives = 95/211 (45%), Gaps = 60/211 (28%)

Query: 903 GGYATASGASATAVGNNSRAVDGYATALGSDSMASGNFSTTVGGASVASGRGATAIGAES 962
G A+A G + A+G + A G A A+G+ S+A+G S +G S A G A GA S
Sbjct: 62 GLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAAS 121

Query: 963 VARMDRDTAIGTESVADGGDSTALGANARASYDSSVALGANANSSNYYSVALGTYAVATG 1022
A+ D VA+GA A++S
Sbjct: 122 TAQKD-----------------------------GVAIGARASTS--------------- 137

Query: 1023 GSATSIGGQSYAPGNESVALGWQSNASGTRSVSLGSGAYTPADDG--VALGAGSIADRDN 1080
+ VA+G+ S A SV++G ++ A+ G +A+G S DR+N
Sbjct: 138 --------------DTGVAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRSKTDREN 183

Query: 1081 TVSVGSVGSERQITNVAAGTEGTDAVNLDQL 1111
+VS+G RQ+T++AAGT+ TDAVN+ QL
Sbjct: 184 SVSIGHESLNRQLTHLAAGTKDTDAVNVAQL 214



Score = 44.9 bits (105), Expect = 3e-06
Identities = 58/199 (29%), Positives = 94/199 (47%), Gaps = 8/199 (4%)

Query: 820 GTESVAYGDDSTALGANAYAADTGSIAVGTYANAYGPRAISLGGQSNAAGDESIALGWEA 879
G + A G S A+GA A AA ++AVG + A G ++++G S A GD ++ G +
Sbjct: 62 GLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAAS 121

Query: 880 QAEGDQGIALGAGSQADAYSTAIGGYATASGASATAVGNNSR--AVDGYATALGSDSMAS 937
A+ D G+A+GA + A+G + A ++ A+G++S A GY+ A+G S
Sbjct: 122 TAQKD-GVAIGARASTSDTGVAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRSKTD 180

Query: 938 GNFSTTVGGASVASGRGATAIG-----AESVARMDRDTAIGTESVADGGDSTALGANARA 992
S ++G S+ A G A +VA++ ++ E+ ANA A
Sbjct: 181 RENSVSIGHESLNRQLTHLAAGTKDTDAVNVAQLKKEIEKTQENTNKRSAELLANANAYA 240

Query: 993 SYDSSVALGANANSSNYYS 1011
SS LG N ++ S
Sbjct: 241 DNKSSSVLGIANNYTDSKS 259



Score = 44.5 bits (104), Expect = 4e-06
Identities = 54/159 (33%), Positives = 80/159 (50%), Gaps = 4/159 (2%)

Query: 1170 AIGSGASATAQYANASGYNAAASGYGSVSTGAFSQASGDYGVALGGESEASGAQSTAVGA 1229
A+G A G NA+A G S++ GA ++A+ VA+G S A+G S A+G
Sbjct: 46 ALGLEYPVRPPVPGAGGLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGP 105

Query: 1230 AAGASGDGAFAGGALSVAEGTESTALGYFASATGESATAVGAESVADGTSAAAFGFGAEA 1289
+ A GD A GA S A+ + A+G AS T ++ AVG S AD ++ A G +
Sbjct: 106 LSKALGDSAVTYGAASTAQ-KDGVAIGARAS-TSDTGVAVGFNSKADAKNSVAIGHSSHV 163

Query: 1290 TSN--YSTALGGYSTASGFNSTALGNFSTASGSSSVAVG 1326
+N YS A+G S NS ++G+ S + +A G
Sbjct: 164 AANHGYSIAIGDRSKTDRENSVSIGHESLNRQLTHLAAG 202



Score = 42.6 bits (99), Expect = 2e-05
Identities = 44/132 (33%), Positives = 70/132 (53%), Gaps = 4/132 (3%)

Query: 764 GGDSNASGYFSTAVGGTSIANGRGATAIGYESIGNGAASTALGFASVAWGEGGTAIGTES 823
G +++A G S A+G T+ A A A+G SI G S A+G S A G+ G S
Sbjct: 62 GLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAAS 121

Query: 824 VAYGDDSTALGANAYAADTGSIAVGTYANAYGPRAISLGGQSNAAGDE--SIALGWEAQA 881
A D A+GA A +DTG +AVG + A ++++G S+ A + SIA+G ++
Sbjct: 122 TAQ-KDGVAIGARASTSDTG-VAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRSKT 179

Query: 882 EGDQGIALGAGS 893
+ + +++G S
Sbjct: 180 DRENSVSIGHES 191



Score = 42.6 bits (99), Expect = 2e-05
Identities = 57/210 (27%), Positives = 89/210 (42%), Gaps = 21/210 (10%)

Query: 1653 GFIPARASGTGAAAFGGGAWATADYTTAIGWNSYADGVNASALGQSAAALADNALAIGGN 1712
G + A A G + A G A A A+G S A GVN+ A+G + AL D+A+ G
Sbjct: 61 GGLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAA 120

Query: 1713 SRADAIGASVVGVDASATGINSTGVGRQVNVIGENAVSVGYNSFVRESAVNGVALGANAG 1772
S A G + +G AS + + V+VG+NS + ++
Sbjct: 121 STAQKDGVA-IGARASTS---------------DTGVAVGFNSKADAKNSVAIGHSSHVA 164

Query: 1773 ATGADSVALGSGSRTYEANTVSVGSGNGRGGPATRRIVNVSDGEVATDAVNKGQLDALAA 1832
A S+A+G S+T N+VS+G + R++ +++ G TDAVN QL
Sbjct: 165 ANHGYSIAIGDRSKTDRENSVSIGHES-----LNRQLTHLAAGTKDTDAVNVAQLKKEIE 219

Query: 1833 DVQTTSGMVQTTGEGVARATGDRATAAGAG 1862
Q + A A D +++ G
Sbjct: 220 KTQENTNKRSAELLANANAYADNKSSSVLG 249



Score = 40.3 bits (93), Expect = 9e-05
Identities = 38/103 (36%), Positives = 58/103 (56%), Gaps = 4/103 (3%)

Query: 1858 AAGAGATASGARSVAVAAGSTASATGASAMGVDSSASGVNSTAMGRQTNSIGENGVALGY 1917
A G A+A G S+A+ A + A+ A A+G S A+GVNS A+G + ++G++ V G
Sbjct: 60 AGGLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGA 119

Query: 1918 NSFVRQSGANAVALGANAGASGADSVALGSGSRTYEANVVSVG 1960
S ++ G VA+GA A S VA+G S+ N V++G
Sbjct: 120 ASTAQKDG---VAIGARASTSDT-GVAVGFNSKADAKNSVAIG 158



Score = 40.3 bits (93), Expect = 9e-05
Identities = 45/131 (34%), Positives = 70/131 (53%), Gaps = 4/131 (3%)

Query: 552 AAGSNAAAFNDYSTALGSSSVASAQGATAVGSGANATTDNATAVGFNSTAVAENTTALGG 611
A G NA+A +S A+G+++ A+ A AVG+G+ AT N+ A+G S A+ ++ G
Sbjct: 60 AGGLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGA 119

Query: 612 NSSASGDGSTAVGGATRATASGATALGYESIANGADSTALGVGS--VAFGGTSTAVGGAS 669
S+A DG GA +T+ A+G+ S A+ +S A+G S A G S A+G S
Sbjct: 120 ASTAQKDGVAI--GARASTSDTGVAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRS 177

Query: 670 VAFGTDSAAFG 680
+S + G
Sbjct: 178 KTDRENSVSIG 188



Score = 39.5 bits (91), Expect = 2e-04
Identities = 42/123 (34%), Positives = 61/123 (49%), Gaps = 12/123 (9%)

Query: 610 GGNSSASGDGSTAVGGATRATASGATALGYESIANGADSTALGVGSVAFGGTSTAVGGAS 669
G N+SA G S A+G A A A+G A S A GV SVA G S A+G ++
Sbjct: 62 GLNASAKGIHSIAIGATAEAAKGAAVAVG-------AGSIATGVNSVAIGPLSKALGDSA 114

Query: 670 VAFGTDSAAFGANAAAGGTAST-----AIGANSNAFGERTVALGGASNASGDESIALGVS 724
V +G S A A G AST A+G NS A + +VA+G +S+ + + ++ +
Sbjct: 115 VTYGAASTAQKDGVAIGARASTSDTGVAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIG 174

Query: 725 SLA 727
+
Sbjct: 175 DRS 177



Score = 38.0 bits (87), Expect = 5e-04
Identities = 37/130 (28%), Positives = 67/130 (51%)

Query: 1477 AAGEGANASSDNSTAVGAGAQAVAENATAVGMDALASGIGAAALGNNAQALGENSSAVGS 1536
A G A+A +S A+GA A+A A AVG ++A+G+ + A+G ++ALG+++ G+
Sbjct: 60 AGGLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGA 119

Query: 1537 NALASDIGATANGAGAQAISTYATALGSEAVASDNQATAVGFRSAASNVGSAAFGGYSES 1596
+ A G + + + A S+A A ++ A AA++ S A G S++
Sbjct: 120 ASTAQKDGVAIGARASTSDTGVAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRSKT 179

Query: 1597 SGRLSSALGY 1606
S ++G+
Sbjct: 180 DRENSVSIGH 189



Score = 37.2 bits (85), Expect = 8e-04
Identities = 38/141 (26%), Positives = 67/141 (47%)

Query: 1200 GAFSQASGDYGVALGGESEASGAQSTAVGAAAGASGDGAFAGGALSVAEGTESTALGYFA 1259
G + A G + +A+G +EA+ + AVGA + A+G + A G LS A G + G +
Sbjct: 62 GLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAAS 121

Query: 1260 SATGESATAVGAESVADGTSAAAFGFGAEATSNYSTALGGYSTASGFNSTALGNFSTASG 1319
+A + S +D A F A+A ++ + + A+ S A+G+ S
Sbjct: 122 TAQKDGVAIGARASTSDTGVAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRSKTDR 181

Query: 1320 SSSVAVGGDATASGAYSIAAG 1340
+SV++G ++ +AAG
Sbjct: 182 ENSVSIGHESLNRQLTHLAAG 202



Score = 35.6 bits (81), Expect = 0.002
Identities = 59/189 (31%), Positives = 85/189 (44%), Gaps = 25/189 (13%)

Query: 1538 ALASDIGATANGAGAQAISTYATALGSEAVASDNQATAVGFRSAASNVGSAAFGGYSESS 1597
A A D N Q ALG E A G ++A + S A G +E++
Sbjct: 23 AFADDYDGIPNLTAVQISPNADPALGLEYPVRPPVPGAGGLNASAKGIHSIAIGATAEAA 82

Query: 1598 GRLSSALGYGAVASSDYSTAVGAASLASGASAVAVGEFSEATGDESVAVGGSTFFGFIPA 1657
+ A+G G++A+ S A+G S A G SAV G S A D VA+G A
Sbjct: 83 KGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAASTAQKD-GVAIG---------A 132

Query: 1658 RASGTGAAAFGGGAWATADYTTAIGWNSYADGVNASALGQSAAALADN--ALAIGGNSRA 1715
RAS T+D A+G+NS AD N+ A+G S+ A++ ++AIG S+
Sbjct: 133 RAS-------------TSDTGVAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRSKT 179

Query: 1716 DAIGASVVG 1724
D + +G
Sbjct: 180 DRENSVSIG 188



Score = 35.6 bits (81), Expect = 0.003
Identities = 47/142 (33%), Positives = 71/142 (50%), Gaps = 4/142 (2%)

Query: 1135 AQGEDATAAGSNATADADYSSAFGASSQATAIGAVAIGSGASATAQYANASGYNAAASGY 1194
+ A G NA+A +S A GA+++A AVA+G+G+ AT + A G + A G
Sbjct: 53 VRPPVPGAGGLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGD 112

Query: 1195 GSVSTGAFSQASGDYGVALGGESEASGAQSTAVGAAAGASGDGAFAGGALS--VAEGTES 1252
+V+ GA S A D GVA+G + S AVG + A + A G S A S
Sbjct: 113 SAVTYGAASTAQKD-GVAIGARASTSDT-GVAVGFNSKADAKNSVAIGHSSHVAANHGYS 170

Query: 1253 TALGYFASATGESATAVGAESV 1274
A+G + E++ ++G ES+
Sbjct: 171 IAIGDRSKTDRENSVSIGHESL 192



Score = 34.9 bits (79), Expect = 0.004
Identities = 45/147 (30%), Positives = 73/147 (49%), Gaps = 4/147 (2%)

Query: 1518 AALGNNAQALGENSSAVGSNALASDIGATANGAGAQAISTYATALGSEAVASDNQATAVG 1577
A G NA A G +S A+G+ A A+ A A GAG+ A + A+G + A + A G
Sbjct: 59 GAGGLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYG 118

Query: 1578 FRSAASNVGSAAFGGYSESSGRLSSALGYGAVASSDYSTAVGAAS--LASGASAVAVGEF 1635
S A G A G S+ A+G+ + A + S A+G +S A+ ++A+G+
Sbjct: 119 AASTAQKDGVAI--GARASTSDTGVAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDR 176

Query: 1636 SEATGDESVAVGGSTFFGFIPARASGT 1662
S+ + SV++G + + A+GT
Sbjct: 177 SKTDRENSVSIGHESLNRQLTHLAAGT 203



Score = 34.1 bits (77), Expect = 0.007
Identities = 43/137 (31%), Positives = 57/137 (41%), Gaps = 1/137 (0%)

Query: 393 GLGFFVQTQASGEAASALGAGAIASGTYTTAVGTLSEASGTEATAVGYFAYAPGEGATAV 452
G+ Q S A ALG A G + A G + A+G A A A AV
Sbjct: 30 GIPNLTAVQISPNADPALGLEYPVRPPVPGAGGLNASAKGIHSIAIGATAEAAKGAAVAV 89

Query: 453 GPESWASGELSTALGYYSTARGANSVALGANSVATRADTVSVGAAGAERQITSVAAGTEG 512
G S A+G S A+G S A G ++V GA S A + D V++GA +
Sbjct: 90 GAGSIATGVNSVAIGPLSKALGDSAVTYGAASTAQK-DGVAIGARASTSDTGVAVGFNSK 148

Query: 513 TDAVNLNQLTAVSDVAS 529
DA N + S VA+
Sbjct: 149 ADAKNSVAIGHSSHVAA 165



Score = 33.7 bits (76), Expect = 0.010
Identities = 38/114 (33%), Positives = 57/114 (50%), Gaps = 2/114 (1%)

Query: 1127 GTGDGTAFAQGEDATAAGSNATADADYSSAFGASSQATAIGAVAIGSGASATAQYANASG 1186
G G A A+G + A G+ A A + A GA S AT + +VAIG + A A G
Sbjct: 59 GAGGLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYG 118

Query: 1187 YNAAASGYGSVSTGAFSQASGDYGVALGGESEASGAQSTAVGAAAGASGDGAFA 1240
+ A G V+ GA + S D GVA+G S+A S A+G ++ + + ++
Sbjct: 119 AASTAQKDG-VAIGARASTS-DTGVAVGFNSKADAKNSVAIGHSSHVAANHGYS 170



Score = 33.3 bits (75), Expect = 0.013
Identities = 38/130 (29%), Positives = 61/130 (46%), Gaps = 3/130 (2%)

Query: 1043 GWQSNASGTRSVSLGSGAYTPADDGVALGAGSIADRDNTVSVGSVGSERQITNVAAGTEG 1102
G ++A G S+++G+ A VA+GAGSIA N+V++G + + V G
Sbjct: 62 GLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAAS 121

Query: 1103 TDAVNLDQLNAVAGTAETTARLYAGTGDGTAFAQGEDATAAGSNATADADYSSAFGASSQ 1162
T + + A A T++T A + A A+ A S+ A+ YS A G S+
Sbjct: 122 TAQKDGVAIGARASTSDTGV---AVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRSK 178

Query: 1163 ATAIGAVAIG 1172
+V+IG
Sbjct: 179 TDRENSVSIG 188


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS18150  BACYPHPHTASE  31     0.012    Salmonella/Yersinia modular tyrosine phosphatase si...
		>BACYPHPHTASE#Salmonella/Yersinia modular tyrosine phosphatase signature.

Length = 468

Score = 30.5 bits (68), Expect = 0.012
Identities = 23/72 (31%), Positives = 28/72 (38%), Gaps = 13/72 (18%)

Query: 135 RPYPFLRLHHGVGKVWAGEEAVETSSGEEDRAQEDVELTDLRQLGIATLGTLKEHPR--- 191
R P HHG G+ A + + G E RA+ LT LR TL PR
Sbjct: 164 RERPHTSGHHGAGEARATAPSTVSPYGPEARAELSSRLTTLRN----TLAPATNDPRYLQ 219

Query: 192 ------IKRFRD 197
+ RFRD
Sbjct: 220 ACGGEKLNRFRD 231


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS18175  SACTRNSFRASE  37     1e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 36.8 bits (85), Expect = 1e-05
Identities = 15/59 (25%), Positives = 25/59 (42%)

Query: 70 DEAHVLNVCIAPEAQSQGHGRVLLRALIKAACDRGARRAFLEVRPSNPSAIALYHSEGF 128
A + ++ +A + + +G G LL I+ A + LE + N SA Y F
Sbjct: 88 GYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS18180  PF05616  29     0.013    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 28.6 bits (63), Expect = 0.013
Identities = 23/71 (32%), Positives = 30/71 (42%), Gaps = 11/71 (15%)

Query: 34 PEPAPAVAEAP--APVREPTRPQHAA--PAPSAAPAARRNAPAAPSEPP-------ADVS 82
P+ P AEAP P+ E + ++ A PAP+ P R N P P
Sbjct: 313 PDLTPGSAEAPNAQPLPEVSPAENPANNPAPNENPGTRPNPEPDPDLNPDANPDTDGQPG 372

Query: 83 VRPSAPAAPRR 93
RP +PA P R
Sbjct: 373 TRPDSPAVPDR 383


51  XC_RS18270  XC_RS18395  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
XC_RS18270  2       15       -2.948405  dTDP-4-dehydrorhamnose 3,5-epimerase
XC_RS18275  0       14       -1.880022  glucose-1-phosphate thymidylyltransferase
XC_RS18280  0       11       -1.734189  dTDP-glucose 4,6-dehydratase
XC_RS18285  2       15       -2.123209  electron transfer flavoprotein subunit beta
XC_RS18290  3       21       -3.906988  electron transfer flavoprotein subunit beta
XC_RS18295  3       38       -7.857795  WxcH protein
XC_RS18300  2       36       -7.744435  aminotransferase
XC_RS18305  3       40       -9.002167  glycosyl transferase
XC_RS18310  3       43       -9.603343  isomerase
XC_RS18315  3       42       -9.522407  sugar transferase
XC_RS18320  3       37       -8.968109  hypothetical protein
XC_RS18330  3       29       -6.894005  transposase
XC_RS18335  2       33       -6.844413  transposase
XC_RS18340  3       38       -7.619751  transposase
XC_RS18345  3       43       -7.945381  UDP-glucose 4-epimerase
XC_RS18350  5       43       -8.272503  GDP-D-mannose dehydratase
XC_RS18355  4       46       -8.342343  glycosyltransferase
XC_RS18360  4       45       -8.340382  glycosyltransferase
XC_RS18365  4       37       -7.448602  glycosyl transferase
XC_RS18370  3       29       -6.192632  kinase
XC_RS18375  3       22       -5.606373  ABC transporter ATP-binding protein
XC_RS18380  2       17       -3.483397  ABC transporter
XC_RS18385  2       15       -2.063712  glycosyl transferase
XC_RS18390  3       16       -0.699801  cystathionine gamma-synthase
XC_RS18395  3       16       -0.243964  cystathionine beta-synthase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS18280  NUCEPIMERASE  187    5e-59    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 187 bits (477), Expect = 5e-59
Identities = 91/349 (26%), Positives = 140/349 (40%), Gaps = 48/349 (13%)

Query: 5 LVTGGAGFIGGNFVLEAVSRGIRVVNLDALT--YAGNLNTL-ASLEGNADHIFVKGDIGD 61
LVTG AGFIG + + G +VV +D L Y +L L F K D+ D
Sbjct: 4 LVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDLAD 63

Query: 62 GALVTRLLQEHQPDAVLNFAAESHVDRSIEGPGAFIQTNVVGTLALLEAVRDYWKALPDT 121
+T L + V V S+E P A+ +N+ G L +LE R
Sbjct: 64 REGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHN------- 116

Query: 122 RRDAFRFLHVSTDEVYGTLGETGKFTETTPYA-PNSPYSASKAASDHLVRAFHHTYGLPV 180
L+ S+ VYG L F+ P S Y+A+K A++ + + H YGLP
Sbjct: 117 --KIQHLLYASSSSVYG-LNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPA 173

Query: 181 LTTNCSNNYGPYHFPEKLIPLVIAKALAGEPLPVYGDGKQVRDWLFVSDHCEAI------ 234
YGP+ P+ + L G+ + VY GK RD+ ++ D EAI
Sbjct: 174 TGLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDV 233

Query: 235 ---------------RTVLAKGRVGETYNVGGNSERQNIEVVQAICALLDQHRPREDGKP 279
+A RV YN+G +S + ++ +QA+ L
Sbjct: 234 IPHADTQWTVETGTPAASIAPYRV---YNIGNSSPVELMDYIQALEDALG---------- 280

Query: 280 RESQIAYVTDRPGHDRRYAIDASKLKDELGWEPAYTFEQGIAQTVDWYL 328
E++ + +PG + D L + +G+ P T + G+ V+WY
Sbjct: 281 IEAKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYR 329


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS18345  NUCEPIMERASE  112    7e-31    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 112 bits (282), Expect = 7e-31
Identities = 68/326 (20%), Positives = 122/326 (37%), Gaps = 39/326 (11%)

Query: 11 RVLVSGASGFTGRYMSQQLKEQGCTVIGVGTRCDGAGTSATDE-----------FWPMDL 59
+ LV+GA+GF G ++S++L E G V+G+ D S F +DL
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 60 RDAADVARVVAQAQADYVVHLAAVAFVGH--GDADDFYRVNLLGTRNLLQALAASQHRPE 117
D + + A + V V + + + NL G N+L+ ++ +
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILE--GCRHNKIQ 119

Query: 118 RVLIASSANVYG-NATEGVIDESVTPSPANDYAVSKLAMEYVASLWHE--RLPLVIARPF 174
+L ASS++VYG N + P + YA +K A E +A + LP R F
Sbjct: 120 HLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGLRFF 179

Query: 175 NYTGVGQATNFLIPKIVSHFASRASSIEL-GNTEVWRDFGDVRSVVLAYRKLLQAPA--- 230
G + + K SI++ ++ RDF + + A +L
Sbjct: 180 TVYGPWGRPDMALFKFTKAMLE-GKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVIPHAD 238

Query: 231 --------------AEGQIVNVCSGVASSLSDIIRICSKITGHDIDVQVNPAFVRQNEVR 276
A ++ N+ + L D I+ G + + P ++ +V
Sbjct: 239 TQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLP--LQPGDVL 296

Query: 277 KLLGSNARLQELIGDWQSLALEDTLR 302
+ L E+IG ++D ++
Sbjct: 297 ETSADTKALYEVIGFTPETTVKDGVK 322


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS18350  NUCEPIMERASE  86     5e-21    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 85.6 bits (212), Expect = 5e-21
Identities = 66/343 (19%), Positives = 115/343 (33%), Gaps = 35/343 (10%)

Query: 5 LITGISGQDGAYLAQLLLDKGYRVFG-----TYRRTSSVNFWRIEELGIAAHPNLNLIEY 59
L+TG +G G ++++ LL+ G++V G Y S + L + A P +
Sbjct: 4 LVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVS----LKQARLELLAQPGFQFHKI 59

Query: 60 DLTDLGSSIRMLESTGATEVYNLAAQSFVGVSFDQPTTTAQITGIGPVHLLEAIRQVNRD 119
DL D + S V+ + V S + P A G +++LE R
Sbjct: 60 DLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQ 119

Query: 120 IRFYQASTSEMFGKVQAVPQIESTPF-YPRSPYGVAKLYAHWMTVNYRESYDIFGCSGIL 178
AS+S ++G + +P +P S Y K M Y Y +
Sbjct: 120 -HLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGLRF 178

Query: 179 FNHESPLRGRE-----FVTRKITDSVAKIKLGLLDCMELGNLDAKRDWGFAREYVEGMWR 233
F P GR T+ + + + I + KRD+ + + E + R
Sbjct: 179 FTVYGP-WGRPDMALFKFTKAMLEGKS-IDV-------YNYGKMKRDFTYIDDIAEAIIR 229

Query: 234 MLQA-DEPDTFVLATNRTETVRDFVSMAFK-GAGIDVEFRGKDAEETAVDTATGKVVMRI 291
+ DT T + G VE + + +
Sbjct: 230 LQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVEL------MDYIQALEDALGIEA 283

Query: 292 NPKF--HRPAEVELLIGDPEKATRILGWKPETTLEQLCQMMVE 332
+P +V D + ++G+ PETT++ + V
Sbjct: 284 KKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVN 326


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
XC_RS18370  GPOSANCHOR  62     4e-12    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 62.0 bits (150), Expect = 4e-12
Identities = 47/235 (20%), Positives = 83/235 (35%), Gaps = 6/235 (2%)

Query: 459 YQRLQKALLGYNARLSAVHIETEHHRVELAARGAAIEHLRDSTLQDQERTQAFEQGVAAA 518
+ L + + + L+ + + I+ L ++ + A
Sbjct: 83 LKDHNDELTEELSNAKE---KLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTAD 139

Query: 519 EERYKRLEEESEKLAAWAKGLEAQTIESNRDKEALAALNAELESDKAALATRIASLGQEL 578
+ K LE E LAA LE + A +A LE++KAAL R A L + L
Sbjct: 140 SAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKAL 199

Query: 579 EERQRARELAELLAADVGRLTEERDAARSDLLDTQSVVEQHQATITALEARVAVQQQQIS 638
E A + +A + L E+ A + D + +E TA A++ + + +
Sbjct: 200 EG---AMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKA 256

Query: 639 GLESSRDQERNRLRELQVDLSRSMDGTASAREYIRELEMAVDALEGQINSLHGSR 693
LE+ + + L + + LE LE Q L+ +R
Sbjct: 257 ALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANR 311



Score = 50.8 bits (121), Expect = 1e-08
Identities = 39/239 (16%), Positives = 78/239 (32%), Gaps = 4/239 (1%)

Query: 459 YQRLQKALLGYNARLSAVHIETEHHRVELAARGAAIEHLRDSTLQDQERTQAFEQGVAAA 518
+ AR + + E A A I+ L R E+ + A
Sbjct: 108 LSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGA 167

Query: 519 EERYKRLEEESEKLAAWAKGLEAQTIESNRDKEALAALNAELESDKAALATRIASLGQEL 578
+ + L A LEA+ E + E + + L A+L
Sbjct: 168 MNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARK 227

Query: 579 EERQRARELAELLAADVGRLTEERDAARSDLLDTQSVVEQHQATITALEARVAVQQQQIS 638
+ ++A E A + + +A ++ L Q+ +E+ + + + +
Sbjct: 228 ADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLE 287

Query: 639 GLESSRDQERNRLRELQVDLSRSMDGT----ASAREYIRELEMAVDALEGQINSLHGSR 693
+++ + E+ L L+ + ++RE ++LE LE Q SR
Sbjct: 288 AEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASR 346



Score = 38.9 bits (90), Expect = 8e-05
Identities = 45/221 (20%), Positives = 77/221 (34%), Gaps = 7/221 (3%)

Query: 442 LQQADATPHSAPEWVTIYQRLQKALLGYNARLSAVHIETEHHRVELAARGAAIEHLRDST 501
A + + A L A E + EL A+E + +
Sbjct: 220 KAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEK---ALEGAMNFS 276

Query: 502 LQDQERTQAFEQGVAAAEERYKRLEEESEKLAAWAKGLEAQTIESNRDKEALAALNAELE 561
D + + E AA E LE +S+ L A + L S K+ L A + +LE
Sbjct: 277 TADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLE 336

Query: 562 SDKAALATRIASLGQELEERQRARELAEL----LAADVGRLTEERDAARSDLLDTQSVVE 617
SL ++L+ + A++ E L R + R DL ++ +
Sbjct: 337 EQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKK 396

Query: 618 QHQATITALEARVAVQQQQISGLESSRDQERNRLRELQVDL 658
Q + + +++A ++ LE S+ ELQ L
Sbjct: 397 QVEKALEEANSKLAALEKLNKELEESKKLTEKEKAELQAKL 437



Score = 30.0 bits (67), Expect = 0.043
Identities = 26/91 (28%), Positives = 36/91 (39%), Gaps = 16/91 (17%)

Query: 516 AAAEERYKRLEEESEKLAAWAKGL--------------EAQTIESNRDKEALAALNAELE 561
E +++LEE+++ A + L E E+N AL LN ELE
Sbjct: 361 KQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQVEKALEEANSKLAALEKLNKELE 420

Query: 562 SDKAALATRIASLGQELEERQRARELAELLA 592
K A L +LE A+ L E LA
Sbjct: 421 ESKKLTEKEKAELQAKLE--AEAKALKEKLA 449


52  XC_RS18590  XC_RS18685  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
XC_RS18590  3       20       -0.585394  membrane protein
XC_RS18595  3       20       -0.501232  chorismate mutase
XC_RS18600  5       22       -1.238333  ATP synthase epsilon chain
XC_RS18605  4       24       -1.372332  ATP synthase subunit beta
XC_RS18610  2       20       -1.931919  ATP synthase gamma chain
XC_RS18615  2       18       -1.639437  ATP synthase subunit alpha
XC_RS18620  2       16       -2.700630  ATP synthase subunit delta
XC_RS18625  1       14       -1.904055  ATP synthase subunit b
XC_RS18630  -1      13       -0.721108  F0F1 ATP synthase subunit C
XC_RS18635  -1      14       0.027901   F0F1 ATP synthase subunit A
XC_RS18640  -1      13       1.200385   membrane protein
XC_RS18645  1       16       1.096152   hypothetical protein
XC_RS18650  -1      12       -0.791480  Xaa-Pro dipeptidase
XC_RS18660  2       20       -3.995242  hypothetical protein
XC_RS18665  1       22       -4.816502  dihydrolipoamide acetyltransferase
XC_RS18670  0       31       -6.380174  hypothetical protein
XC_RS18680  0       29       -7.137702  membrane protein
XC_RS18685  1       26       -6.385760  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS18645  INTIMIN  32     0.002    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 32.3 bits (73), Expect = 0.002
Identities = 32/113 (28%), Positives = 45/113 (39%), Gaps = 20/113 (17%)

Query: 23 AEDTDDRFQIRLGAMNIDNDNTLRGNTVVAGNEISLDQDFD-------FGGKEW----EP 71
A D RF LGA + G + +DQDF GG+ W +
Sbjct: 248 ARYIDSRFTANLGA----GQRFFLPEN-MLGYNVFIDQDFSGDNTRLGIGGEYWRDYFKS 302

Query: 72 RIDGVFRMSTRQRLIFDYFKYDKDRRETLGEDVSAGGFTVPSGSFIKGELKYQ 124
++G FRMS Y K D D R G D+ G+ +PS + +L Y+
Sbjct: 303 SVNGYFRMSGWHE---SYNKKDYDERPANGFDIRFNGY-LPSYPALGAKLMYE 351


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
XC_RS18650  UREASE  29     0.028    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 29.3 bits (66), Expect = 0.028
Identities = 23/86 (26%), Positives = 36/86 (41%), Gaps = 12/86 (13%)

Query: 14 GCACTSLSAHAQTTVITAARLLDVQAGRYVERPQIDVVAGRITRVGRQGDPLPPQAQRID 73
G + + A TVIT A +LD + + I + GRI +G+ G+P I
Sbjct: 57 GQSQVTREGGAVDTVITNALILDHWG---IVKADIGLKDGRIAAIGKAGNPDMQPGVTII 113

Query: 74 LGERT---------LLPGLIDMHVHL 90
+G T + G +D H+H
Sbjct: 114 VGPGTEVIAGEGKIVTAGGMDSHIHF 139


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
XC_RS18665  RTXTOXIND  30     0.031    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.8 bits (67), Expect = 0.031
Identities = 12/37 (32%), Positives = 21/37 (56%)

Query: 48 EVPSSVAGVVKEIKVKVGDSLSQGALVALIEVADADA 84
E+ +VKEI VK G+S+ +G ++ + A+A
Sbjct: 98 EIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA 134


53  XC_RS19065  XC_RS19260  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
XC_RS19065  2       11       -0.092450  ABC transporter ATP-binding protein
XC_RS19070  1       15       -1.289346  hypothetical protein
XC_RS19080  1       14       2.147375   alcohol dehydrogenase
XC_RS19085  0       14       3.580336   hypothetical protein
XC_RS19090  2       14       3.633578   hypothetical protein
XC_RS19095  2       14       3.569104   hypothetical protein
XC_RS19100  3       14       4.001066   hypothetical protein
XC_RS19105  1       12       3.485311   oxidoreductase
XC_RS19110  4       12       2.776011   hypothetical protein
XC_RS19115  5       12       2.342421   hypothetical protein
XC_RS19120  2       14       2.096621   membrane protein
XC_RS19125  2       12       2.571419   hypothetical protein
XC_RS19130  2       13       2.247055   ligand-gated channel
XC_RS19135  2       12       2.580237   ABC transporter permease
XC_RS19140  2       10       2.105021   ABC transporter permease
XC_RS19145  2       10       2.050728   ABC transporter ATP-binding protein
XC_RS19150  3       10       2.042230   hypothetical protein
XC_RS19155  5       10       0.864620   transcriptional regulator
XC_RS19160  3       10       1.029005   endoribonuclease L-PSP
XC_RS19165  4       10       0.802829   oxidoreductase
XC_RS19170  3       9        1.222927   hypothetical protein
XC_RS19175  4       10       1.605547   ketosteroid isomerase
XC_RS19180  -2      15       -2.621596  aldo/keto reductase
XC_RS19185  -3      21       -4.671966  LysR family transcriptional regulator
XC_RS19190  0       25       -6.402654  hypothetical protein
XC_RS19195  2       29       -7.008010  hypothetical protein
XC_RS19200  2       33       -7.372562  transcriptional regulator
XC_RS19205  2       33       -7.533618  response regulator
XC_RS19210  3       16       -5.251010  histidine kinase
XC_RS19215  3       16       -4.022633  hypothetical protein
XC_RS19220  1       15       -1.936897  transposase
XC_RS19225  0       14       -1.655105  integrase
XC_RS19235  -1      13       0.503809   *hypothetical protein
XC_RS19240  -1      14       1.384542   RNA polymerase sigma factor RpoD
XC_RS19245  0       13       4.012825   D-aminoacyl-tRNA deacylase
XC_RS19250  -2      12       3.809170   lipid A biosynthesis lauroyl acyltransferase
XC_RS19255  -1      11       3.211132   phosphinothricin acetyltransferase
XC_RS19260  0       12       3.291690   GTP cyclohydrolase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS19115  CHANLCOLICIN  34     0.003    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 33.9 bits (77), Expect = 0.003
Identities = 58/310 (18%), Positives = 108/310 (34%), Gaps = 19/310 (6%)

Query: 319 EGISTGFAHSASAASAHWSTAMTAQERAQQ-AQTEQLQHALAQMAQQTTALQDSVGQAVQ 377
+G S + +A A+A WSTA + +A+Q A+ + A A+ AL + V
Sbjct: 40 KGGSKSESSAAIHATAKWSTAQLKKTQAEQAARAKAAAEAQAKAKANRDALTQRLKDIVN 99

Query: 378 QQLGVLTDGFERSTSAAAANWQAALTAQEQAQQALTQQLQSTLAQLAAQTTALQD--NLG 435
+ L +T A AN A E + L + + + A A Q+
Sbjct: 100 EALRHNASRTPSATELAHANNAA--MQAEDERLRLAKAEEKARKEAEAAEKAFQEAEQRR 157

Query: 436 QAVQQQLAALNDGVAHSTAAAAAGWAAVLAEHQQST---QALSAQLQRTVEQIAEHSTTL 492
+ ++++ A + A A A L+E ++ Q + Q V ++ TL
Sbjct: 158 KEIEREKAETERQL--KLAEAEEKRLAALSEEAKAVEIAQKKLSAAQSEVVKMDGEIKTL 215

Query: 493 QDGVGQAVQQQ------LDGLSTGFAASTATAAQTWTAAASEQQRANQSLTDALHTTLTH 546
+ ++ + L G A ++A + RAN L + T
Sbjct: 216 NSRLSSSIHARDAEMKTLAGKRNELAQASAKYKELDELVKKLSPRANDPLQNRPFFEATR 275

Query: 547 -FASTFEARSEAL--VDAVSARLEQSSSQTADAWNAALAQQQQASAELAAQHQGALSAAT 603
+ R E V A R+ + ++ A +A +A H+ +
Sbjct: 276 RRVGAGKIREEKQKQVTASETRINRINADITQIQKAISQVSNNRNAGIARVHEAEENLKK 335

Query: 604 ASSEAHAAAL 613
A + + +
Sbjct: 336 AQNNLLNSQI 345


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
XC_RS19120  OMPADOMAIN  77     1e-18    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 76.5 bits (188), Expect = 1e-18
Identities = 41/143 (28%), Positives = 62/143 (43%), Gaps = 16/143 (11%)

Query: 68 ALAAPLAAGRVTLVNGRIGIRGSVLFALNSDQLQPEGREVLKSLAAPLSEYLVSREEILM 127
+ AP A + ++ VLF N L+PEG+ L L + LS + +
Sbjct: 198 PVVAPAPAPAPEVQTKHFTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSV-V 256

Query: 128 ISGFTDDRPVLDSNRRYADNWELSAQRALTVTRALIAEGVPASSVFAAAFGSQQPVDSNA 187
+ G+TD + Y N LS +RA +V LI++G+PA + A G PV N
Sbjct: 257 VLGYTDRI----GSDAY--NQGLSERRAQSVVDYLISKGIPADKISARGMGESNPVTGNT 310

Query: 188 DETRR---------AKNRRVEIA 201
+ + A +RRVEI
Sbjct: 311 CDNVKQRAALIDCLAPDRRVEIE 333


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS19165  DHBDHDRGNASE  70     2e-16    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 70.5 bits (172), Expect = 2e-16
Identities = 41/186 (22%), Positives = 77/186 (41%), Gaps = 5/186 (2%)

Query: 2 KTVLISGSGSGMGLLTAQTLMRAGYVVYAGVRDPDGRSKERRHALEDYARAHAAQVRVVD 61
K I+G+ G+G A+TL G + A +P E+ + +A A
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNP-----EKLEKVVSSLKAEARHAEAFP 63

Query: 62 MDIHSEASCQRAVEQVVAEHGHLDVVIHNAAHLYIGMAEGFTAEQLADSFNTNAVGAHRL 121
D+ A+ ++ E G +D++++ A L G+ + E+ +F+ N+ G
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 122 NRAALPQLRKQGHGTLLYVGSTITRIVAPFMMPYVAGKYALDAVAETTAYEVQPLGIETV 181
+R+ + + G+++ VGS + M Y + K A + E+ I
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCN 183

Query: 182 IVMPGT 187
IV PG+
Sbjct: 184 IVSPGS 189


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS19175  PF05844  27     0.033    YopD protein
		>PF05844#YopD protein

Length = 295

Score = 27.3 bits (60), Expect = 0.033
Identities = 16/49 (32%), Positives = 24/49 (48%), Gaps = 7/49 (14%)

Query: 9 STHTALPAVLLAAVLWLPGAQAASVAAPAAAAQHDTPHMAAHNKQLVDI 57
+T A+P+ +A PGA SV P AAA + P + A V++
Sbjct: 10 ATQAAIPSEPIA-----PGAAGRSVGTPQAAA--ELPQVPAARADRVEL 51


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS19255  SACTRNSFRASE  41     5e-07    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 41.1 bits (96), Expect = 5e-07
Identities = 15/61 (24%), Positives = 26/61 (42%), Gaps = 1/61 (1%)

Query: 82 SVEHSIYVHRDHRGKGLGRTLLQLLIDAAQARGVHVLVGGIDASNAASIALHEQFGFTHA 141
+E I V +D+R KG+G LL I+ A+ L+ N ++ + + F
Sbjct: 91 LIED-IAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIG 149

Query: 142 G 142

Sbjct: 150 A 150


54  XC_RS19340  XC_RS19390  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
XC_RS19340  1       11       4.514171   threonylcarbamoyl-AMP synthase
XC_RS19345  -2      11       3.257295   hypothetical protein
XC_RS19350  -2      10       2.654009   hypothetical protein
XC_RS19355  -1      11       2.553377   diguanylate cyclase
XC_RS19360  -1      11       3.404881   tropinone reductase
XC_RS19365  -1      12       4.659162   hypothetical protein
XC_RS19370  -2      11       4.305444   endopeptidase IV
XC_RS19375  -1      10       4.441191   multidrug resistance protein NorM
XC_RS19380  -2      10       4.576504   hypothetical protein
XC_RS19385  -2      11       4.043874   membrane protein
XC_RS19390  -3      13       3.236464   primosome assembly protein PriA
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
XC_RS19350  IGASERPTASE  35     2e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 35.0 bits (80), Expect = 2e-04
Identities = 26/155 (16%), Positives = 43/155 (27%), Gaps = 9/155 (5%)

Query: 43 VQQQKKLMTAPAAMPLPSSAAAPVSARPAPARAATQAPPVAAPAPQAPAAVPTATTA--- 99
V+ +K + + +P A P V PQ+ T
Sbjct: 1114 VETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAK 1173

Query: 100 -TSSLPPPPLFECTAHDNGRYFTE--ESEPATRCLPMQVTNLAGGPAQGGGSACEVVTDR 156
TSS P+ E T + G E E+ P + + P + V
Sbjct: 1174 ETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRS---VRSV 1230

Query: 157 CAPVPDQSLCAAWRQRAEQAEAAWRFSDEAQSAAR 191
V + + R + ++ S AR
Sbjct: 1231 PHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDAR 1265



Score = 29.6 bits (66), Expect = 0.009
Identities = 30/176 (17%), Positives = 52/176 (29%), Gaps = 11/176 (6%)

Query: 40 PKGVQQQKKLMTAPAAMPLPSSAAAPVSARPAPARAATQAPPVAAPAPQAPAAVPTATTA 99
P+ ++ + + T P A P A PV PAP A P+ TT
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAP----ATPSETTE 1038

Query: 100 TSSLPPPPLFECTAHDNGRYFTEESEPATRCLPMQVTNLAGGPAQGGGSACEVVTDRCAP 159
T + + + E+ R +V A + EV
Sbjct: 1039 TVAENSKQESKTVEKNEQD--ATETTAQNR----EVAKEAKSNVKANTQTNEVAQSGSET 1092

Query: 160 VPDQSLCAAWRQRAEQAEAAWRFSDEAQSAARKQRYDQARRVMDES-RCAGTPATP 214
Q+ E+ E A +++ Q + ++ E+ + PA
Sbjct: 1093 KETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARE 1148


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS19360  DHBDHDRGNASE  112    3e-32    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 112 bits (282), Expect = 3e-32
Identities = 75/253 (29%), Positives = 112/253 (44%), Gaps = 10/253 (3%)

Query: 8 LDGQTALITGASAGIGLAIARELLGFGADLLMVARDADALAQARDELAEEFPERELHGLA 67
++G+ A ITGA+ GIG A+AR L GA + A D + + + + R
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIA--AVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 68 ADVSDDEERRAILDWVEDHADGLHLLINNAGGNITRAAIDYTEDEWRGIFETNVFSAFEL 127
ADV D I +E + +L+N AG +++EW F N F
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 128 SRYAHPLLTRHAASAIVNVGSVSGITHVRSGAPYGMTKAALQQMTRNLAVEWAEDGIRVN 187
SR + + +IV VGS S A Y +KAA T+ L +E AE IR N
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCN 183

Query: 188 AVAPWYIRTRRTSGPLSDPDYYEQVIERT--------PMRRIGEPEEVAAAVGFLCLPAG 239
V+P T +D + EQVI+ + P++++ +P ++A AV FL
Sbjct: 184 IVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQA 243

Query: 240 SYITGECIAVDGG 252
+IT + VDGG
Sbjct: 244 GHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
XC_RS19380  IGASERPTASE  30     0.004    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.4 bits (68), Expect = 0.004
Identities = 21/98 (21%), Positives = 35/98 (35%), Gaps = 17/98 (17%)

Query: 31 PASPVATPAAPAAPARSIPSAAQQRARWDAL--------TPAQRTELRARYAAWKALTA- 81
+ + TP A S+PS ++ AR D TP++ TE A + ++ T
Sbjct: 993 DTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVE 1052

Query: 82 --------TDKVVLRQARERLQALPDDQQRALRTQFGA 111
T A+E + + Q Q G+
Sbjct: 1053 KNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGS 1090


55  XC_RS19505  XC_RS19605  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS19505  2       14       0.603783    membrane protein
XC_RS19510  1       11       0.414231    acriflavine resistance protein B
XC_RS19515  0       9        0.403107    acriflavin resistance protein
XC_RS19520  2       14       1.411960    dehydrogenase
XC_RS19525  3       14       1.360795    glyoxalase
XC_RS19530  2       14       1.266972    hypothetical protein
XC_RS19535  1       14       1.857187    RNA polymerase subunit sigma-24
XC_RS19540  1       16       1.822546    hypothetical protein
XC_RS19545  1       15       1.432150    hypothetical protein
XC_RS19550  0       12       0.362833    hypothetical protein
XC_RS19555  0       12       0.685224    haloacid dehalogenase
XC_RS19560  1       22       -2.939690   hypothetical protein
XC_RS19565  2       25       -4.707099   hypothetical protein
XC_RS19570  2       23       -4.516597   membrane protein
XC_RS19575  1       20       -3.165647   hypothetical protein
XC_RS19580  -2      17       -1.525048   D-alanyl-D-alanine dipeptidase
XC_RS19585  -3      14       -1.450034   peptidoglycan-binding protein
XC_RS19590  1       10       1.316295    hypothetical protein
XC_RS19595  1       11       1.908250    hypothetical protein
XC_RS19600  2       10       2.557552    N-acetylmuramoyl-L-alanine amidase
XC_RS19605  2       12       1.883356    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
XC_RS19505  RTXTOXIND  50     8e-09    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 50.2 bits (120), Expect = 8e-09
Identities = 30/158 (18%), Positives = 60/158 (37%), Gaps = 22/158 (13%)

Query: 55 AATQDVPVYATALGTVTAL-NTVTVNPQVSGQLMSLNFQEGQEVKKGALLAQIDPRT--- 110
+ V + ATA G +T + + P + + + +EG+ V+KG +L ++
Sbjct: 75 SVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA 134

Query: 111 ----LQASYDQALAAKRQNQALLATA---RVNYQRSNDPAYKQYVS-----------RTD 152
Q+S QA + + Q L + ++ + D Y Q VS +
Sbjct: 135 DTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQ 194

Query: 153 LDTQRNQVAQYEAAVAANDAQMRSAQVQLQFTRVTAPI 190
T +NQ Q E + A+ + ++ + +
Sbjct: 195 FSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRV 232



Score = 34.4 bits (79), Expect = 9e-04
Identities = 22/178 (12%), Positives = 62/178 (34%), Gaps = 29/178 (16%)

Query: 93 EGQEVKKGALLAQIDPRTLQASYDQ-------ALAAKR----------QNQALLATARVN 135
+ + ++ +LA+I+ + ++ +L K+ +N+ + A +
Sbjct: 210 DKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELR 269

Query: 136 YQRSNDPAYKQYVSRTDLD-TQRNQVAQYEAAVAANDAQMRSAQV---------QLQFTR 185
+S + + + Q+ + E + + Q +
Sbjct: 270 VYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASV 329

Query: 186 VTAPIDGIAGIRGV-DVGNIVSTTSTIVTLT-QIRPIYVSFSLPERELPAVRSGQAAT 241
+ AP+ V G +V+T T++ + + + V+ + +++ + GQ A
Sbjct: 330 IRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAI 387



Score = 31.7 bits (72), Expect = 0.005
Identities = 11/104 (10%), Positives = 35/104 (33%), Gaps = 1/104 (0%)

Query: 79 NPQVSGQLMSLNFQEGQEVKKGALLAQIDPRTLQASYDQALAAKRQNQALLATARVNYQR 138
+V + Q + +++ +A LA + + L +
Sbjct: 181 EEEVLRLTSLIKEQFSTW-QNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDD 239

Query: 139 SNDPAYKQYVSRTDLDTQRNQVAQYEAAVAANDAQMRSAQVQLQ 182
+ +KQ +++ + Q N+ + + +Q+ + ++
Sbjct: 240 FSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEIL 283


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS19510  ACRIFLAVINRP  731    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 731 bits (1888), Expect = 0.0
Identities = 300/1072 (27%), Positives = 503/1072 (46%), Gaps = 65/1072 (6%)

Query: 4 STIFIRRPIATSLLMAGILLLGILGYRQLPVSALPEIDAPSLVVSTQYPGANATTMASLV 63
+ FIRRPI +L +++ G L QLPV+ P I P++ VS YPGA+A T+ V
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 64 TTPLERQLGQISGLQMMTSDS-SAGLSTIILQFSMDRDIDIAAQDVQAAIRQAT--LPSS 120
T +E+ + I L M+S S SAG TI L F D DIA VQ ++ AT LP
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE 121

Query: 121 LPYQPVYNRVNPADAAILTLKLTSDT--LPLREVNRYADAILAQRLSQVPGVGLVSIAGN 178
+ Q + + + ++ SD +++ Y + + LS++ GVG V + G
Sbjct: 122 VQQQGIS-VEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 179 VRPAVRIQVNPAQLSNMGLTMESLRSALTQTNVSAPKGSLN------GKTQSYSIGTNDQ 232
A+RI ++ L+ LT + + L N G L G+ + SI +
Sbjct: 181 QY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 233 LTDAAQYRETIISYN-NGAPVRLADVAKVVDGVENDQLAAWADGKPAVLLEIRRQPGANI 291
+ ++ + + N +G+ VRL DVA+V G EN + A +GKPA L I+ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 292 VQTVEQIRSILPQLQGVLPADVHLDVFSDRTETIRASVHEVKFTLVLTIFLVVAVIFVFL 351
+ T + I++ L +LQ P + + D T ++ S+HEV TL I LV V+++FL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 352 RRLWATIIPSVAVPLSLAGTFGVMAFAGMSLDNLSLMALVVATGFVVDDAIVMIENIVRY 411
+ + AT+IP++AVP+ L GTF ++A G S++ L++ +V+A G +VDDAIV++EN+ R
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 412 IEQGKSGP-EAAEIGARQIGFTVLSLTVSLVAVFLPLLLMPGVTGRLFHEFAWVLSIAVV 470
+ + K P EA E QI ++ + + L AVF+P+ G TG ++ +F+ + A+
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 471 ISMLVSLTLTPMMCAYLLKPDALPEGEDAHERAAAEGKTNLWTRTVGVYERSLDWVLDHQ 530
+S+LV+L LTP +CA LLKP + E ++ + +V Y S+ +L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHE--NKGGFFGWFNTTFDHSVNHYTNSVGKILGST 537

Query: 531 PLTLAVAIGAVALTVVLYVAIPKGLLPEQDTGLITGVVQVDQNVAFPQMEQRTQAVAAAL 590
L + VA VVL++ +P LPE+D G+ ++Q+ + ++ V
Sbjct: 538 GRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYY 597

Query: 591 RKDPA--VTGVAAFIGAGSMNPTLNQGQLSIVLKTRGDRD----DLETVVARLQKAVSGI 644
K+ V V G N G + LK +R+ E V+ R + + I
Sbjct: 598 LKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKI 657

Query: 645 PGVALFLKPVQDV-TLDTRVAATEYQYSMADVDSTELASWA-TRMTEAMRKLPELADVDN 702
+ + + L T A + L + A + L V
Sbjct: 658 RDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRP 717

Query: 703 NLANQGRALELSIDRDKASMLGVPMQTIDDTLYDAFGQRQISTIFTELNQYRVVLEVAPE 762
N +L +D++KA LGV + I+ T+ A G ++ ++ ++ +
Sbjct: 718 NGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAK 777

Query: 763 FRSSTALMNQLAVASNGSGALTGTNATSFGQVTSSNSSTATGVGNQNTGIVVGAGSIIPL 822
FR +++L V S G ++P
Sbjct: 778 FRMLPEDVDKLYVRSA-------------------------------------NGEMVPF 800

Query: 823 AALAEAKVTNTPLVVSHQQQLPAVTISFNLAPGHSLSQAVAAIEQARADLKIPTQVHAEF 882
+A + + LP++ I APG S A+A +E + K+P + ++
Sbjct: 801 SAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAMALMENLAS--KLPAGIGYDW 858

Query: 883 VGKAAEFTGSQTDIVWLLLASIVVIYIVLGVLYESYIHPLTIISTLPPAGVGALLALMVC 942
G + + S L+ S VV+++ L LYES+ P++++ +P VG LLA +
Sbjct: 859 TGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLF 918

Query: 943 GLSLSVDGIVGIVLLIGIVKKNAIMMIDFAIEA-RRTGVNAHEAIRRACLLRFRPIMMTT 1001
V +VG++ IG+ KNAI++++FA + + G EA A +R RPI+MT+
Sbjct: 919 NQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTS 978

Query: 1002 AAAMLGALPLALGTGIGSELRRPLGIAIVGGLLLSQLVTLYTTPVIYLYMER 1053
A +LG LPLA+ G GS + +GI ++GG++ + L+ ++ PV ++ + R
Sbjct: 979 LAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVVIRR 1030



Score = 73.3 bits (180), Expect = 3e-15
Identities = 72/454 (15%), Positives = 156/454 (34%), Gaps = 48/454 (10%)

Query: 614 QGQLSIVLKTRGDRDD-LETVVARLQKAVSGIPGVALFLKPVQDVTLDTRVAATEYQYSM 672
+++ ++ D D V +LQ A +P + + + + +
Sbjct: 87 SVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQQQGISVEKSSSSYLMVAGFVSDN 146

Query: 673 ADVDSTELASWATR-MTEAMRKLPELADVDNNLANQGRALELSIDRDKASMLGVPMQTID 731
+++ + + + + +L + DV L A+ + +D D
Sbjct: 147 PGTTQDDISDYVASNVKDTLSRLNGVGDV--QLFGAQYAMRIWLDADL------------ 192

Query: 732 DTLYDAFGQRQISTIFTELNQYRVVLEVAPEFRSSTALMNQLAVASNGS-GALTGTNATS 790
LN+Y++ L Q + G G
Sbjct: 193 ------------------LNKYKLTPV-----DVINQLKVQNDQIAAGQLGGTPALPGQQ 229

Query: 791 FGQVTSSNSSTATGVGNQNTGIVVGA-GSIIPLAALAEAKVT--NTPLVVSHQQQLPAVT 847
+ + + V + GS++ L +A ++ N ++ + PA
Sbjct: 230 LNASIIAQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGK-PAAG 288

Query: 848 ISFNLAPGHSLSQAVAAIEQARADLK--IPTQVHAEFVGKAAEF-TGSQTDIVWLLLASI 904
+ LA G + AI+ A+L+ P + + F S ++V L +I
Sbjct: 289 LGIKLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAI 348

Query: 905 VVIYIVLGVLYESYIHPLTIISTLPPAGVGALLALMVCGLSLSVDGIVGIVLLIGIVKKN 964
+++++V+ + ++ L +P +G L G S++ + G+VL IG++ +
Sbjct: 349 MLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDD 408

Query: 965 AIMMIDFAIEARR-TGVNAHEAIRRACLLRFRPIMMTTAAAMLGALPLALGTGIGSELRR 1023
AI++++ + EA ++ ++ +P+A G + R
Sbjct: 409 AIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYR 468

Query: 1024 PLGIAIVGGLLLSQLVTLYTTPVIYLYMERAGER 1057
I IV + LS LV L TP + + +
Sbjct: 469 QFSITIVSAMALSVLVALILTPALCATLLKPVSA 502


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS19515  ACRIFLAVINRP  752    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 752 bits (1943), Expect = 0.0
Identities = 287/1034 (27%), Positives = 488/1034 (47%), Gaps = 26/1034 (2%)

Query: 3 ISAPFIKRPIGTSLLAIGLFVIGLMCYLRLGVAALPNIQIPVIFVHATQSGADASTMAST 62
++ FI+RPI +LAI L + G + L+L VA P I P + V A GADA T+ T
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 63 VTAPLERHLGQLPGIDRMRSSS-SESSSMVFMIFQSNRDIDSAAQDVQTAINSAQSDLPS 121
VT +E+++ + + M S+S S S + + FQS D D A VQ + A LP
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 122 GLGTPMYQKANPNDDPVIAIALTSDT--QSADELYNVADSLLAQRLRQITGISSVDIAGA 179
+ + ++ SD + D++ + S + L ++ G+ V + GA
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 180 STPAVRVDVDLRALNALGLTPDDLRNAVRAANVTSPTGFL------SDGNTTMAIIANDS 233
A+R+ +D LN LTP D+ N ++ N G L +IIA
Sbjct: 181 QY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 234 VAKAADFAQLAISTQSNGRIVRLGDVATVYDGQQDAYQAAWFNGKPAVVMYAFTRAGANI 293
+F ++ + S+G +VRL DVA V G ++ A NGKPA + GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 294 VETVDQVKAQIPELRAYLQPGTTLTPYFDRTPTIRASLHEVQATLMISLAMVVLTMALFL 353
++T +KA++ EL+ + G + +D TP ++ S+HEV TL ++ +V L M LFL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 354 RRLAPTLIAAVTVPLSLAGSALVMYVLGFTLNNLSLLALVIAIGFVVDDAIVVIENIMRH 413
+ + TLI + VP+ L G+ ++ G+++N L++ +V+AIG +VDDAIVV+EN+ R
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 414 L-DEGMPRLQAALTGAREIGFTIVSITASLVAVFIPMLFASGMVGAFFREFTVTLVAAIV 472
+ ++ +P +A +I +V I L AVFIPM F G GA +R+F++T+V+A+
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 473 VSMLVSLTLTPALCSRFLSAHTAP--ETPSRFGAWLDRMHDRMLAVYTVALDFSLRHALL 530
+S+LV+L LTPALC+ L +A E F W + D + YT ++ L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 531 LSLTPLVLIAATIFLGGAVKKGSFPPQDTGLIWGRANSSATVSFADMVSRQRRITDMLMA 590
L +++A + L + P +D G+ A + ++TD +
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 591 DP-----AVKTVGARLGSGRQGSSASLNIELKKRDE--GRRETTAQVVARLSAKADRYPD 643
+ +V TV SG+ ++ + LK +E G + V+ R + +
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIR- 658

Query: 644 LDLRLRAIQDLPSDGGGGTSQGAQYRVSLQGNDLAQLQEWLPKVQAALKKNP-RLRDVGT 702
D + G + + G L + ++ ++P L V
Sbjct: 659 -DGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRP 717

Query: 703 DVDTSGLRQNIVIDRAKAARLGISVGAIDGALYGAFGQRSISTIYSDLNQYSVVVNALPS 762
+ + + +D+ KA LG+S+ I+ + A G ++ + V A
Sbjct: 718 NGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAK 777

Query: 763 QTATPKALDEVFVPNRAGQMAPITAVATQVPGLAPPQITHDNQYSTMDLSYNLAPGVSTG 822
P+ +D+++V + G+M P +A T P++ N +M++ APG S+G
Sbjct: 778 FRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSG 837

Query: 823 EADLIIKTTVEGLRMPGDIRISDGGGF-NVQLNPNSMGVLLLAAVLTVYIVLGMLYESLI 881
+A +++ ++P I G +L+ N L+ + + V++ L LYES
Sbjct: 838 DAMALMENLAS--KLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWS 895

Query: 882 HPVTILSTLPAAGVGALLALFITNTELSVISMIALVLLIGIVKKNAIMMIDFALVAQREH 941
PV+++ +P VG LLA + N + V M+ L+ IG+ KNAI++++FA +
Sbjct: 896 IPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKE 955

Query: 942 GKDARAAAREASIVRFRPIMMTTMVAILAAVPLAVGLGEGSELRRPLGIAMIGGLVFSQG 1001
GK A A +R RPI+MT++ IL +PLA+ G GS + +GI ++GG+V +
Sbjct: 956 GKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATL 1015

Query: 1002 LTLLSTPALYVIFS 1015
L + P +V+
Sbjct: 1016 LAIFFVPVFFVVIR 1029



Score = 112 bits (281), Expect = 4e-27
Identities = 80/506 (15%), Positives = 170/506 (33%), Gaps = 31/506 (6%)

Query: 2 NISAPFIKRPIGTSLLAIGLFVIGLMCYLRLGVAALPNIQIPVIFVHA-TQSGADASTMA 60
N + L+ + ++ +LRL + LP V +GA
Sbjct: 528 NSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQ 587

Query: 61 STVTAPLERHLGQLPGID--------RMRSSSSESSSMVFMIFQSNRDIDSAAQDVQTAI 112
+ + +L S ++++ M F+ + + + + I
Sbjct: 588 KVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVI 647

Query: 113 NSAQSDL---PSGLGTPMYQKANPNDDPVIAIALTSDTQSA-----DELYNVADSLLAQR 164
+ A+ +L G P + A + D L + LL
Sbjct: 648 HRAKMELGKIRDGFVIPF--NMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMA 705

Query: 165 LRQITGISSVDIAG-ASTPAVRVDVDLRALNALGLTPDDLRNAVRAANVTSPTGFLSDGN 223
+ + SV G T +++VD ALG++ D+ + A + D
Sbjct: 706 AQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRG 765

Query: 224 TTMAIIA---NDSVAKAADFAQLAISTQSNGRIVRLGDVATVYDGQQDAYQAAWFNGKPA 280
+ D +L + + +NG +V T + + + +NG P+
Sbjct: 766 RVKKLYVQADAKFRMLPEDVDKLYVRS-ANGEMVPFSAFTTSHWVYG-SPRLERYNGLPS 823

Query: 281 VVMYAFTRAGANIVETVDQVKAQIPELRAYLQPGTTLTPYFDRTPTIRASLHEVQATLMI 340
+ + G + A + L + L G + + R S ++ A + I
Sbjct: 824 MEIQGEAAPGTS----SGDAMALMENLASKLPAGIGYD-WTGMSYQERLSGNQAPALVAI 878

Query: 341 SLAMVVLTMALFLRRLAPTLIAAVTVPLSLAGSALVMYVLGFTLNNLSLLALVIAIGFVV 400
S +V L +A + + + VPL + G L + + ++ L+ IG
Sbjct: 879 SFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSA 938

Query: 401 DDAIVVIENIM-RHLDEGMPRLQAALTGAREIGFTIVSITASLVAVFIPMLFASGMVGAF 459
+AI+++E EG ++A L R I+ + + + +P+ ++G
Sbjct: 939 KNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGA 998

Query: 460 FREFTVTLVAAIVVSMLVSLTLTPAL 485
+ ++ +V + L+++ P
Sbjct: 999 QNAVGIGVMGGMVSATLLAIFFVPVF 1024


56  XC_RS19695  XC_RS19820  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS19695  2       12       0.204581    protoheme IX farnesyltransferase
XC_RS19700  2       15       0.256857    cytochrome oxidase assembly protein
XC_RS19705  2       15       -0.620969   hypothetical protein
XC_RS19710  1       14       -3.213550   membrane protein
XC_RS19715  0       15       -3.769117   membrane protein
XC_RS19720  0       18       -3.627322   cytochrome C oxidase subunit III
XC_RS19725  -1      11       -1.922346   cysteine ABC transporter substrate-binding
XC_RS19730  -2      11       -1.811418   membrane protein
XC_RS19735  -2      14       -2.737789   cytochrome C oxidase subunit I
XC_RS19740  -3      20       -2.725346   cytochrome C oxidase subunit II
XC_RS19745  -3      19       -2.384188   membrane protein
XC_RS19750  -3      19       -2.199607   pyrroline-5-carboxylate dehydrogenase
XC_RS19755  0       38       -5.316488   transposase
XC_RS19760  0       40       -6.223588   hypothetical protein
XC_RS19765  1       35       -5.514499   hypothetical protein
XC_RS19770  1       29       -4.092210   hypothetical protein
XC_RS19775  2       31       -5.827570   hypothetical protein
XC_RS19780  4       30       -6.555816   hypothetical protein
XC_RS19785  1       25       -5.080729   transposase
XC_RS19790  0       29       -5.183925   transposase
XC_RS19795  0       30       -4.867655   baseplate assembly protein
XC_RS19800  1       44       -8.308299   transposase
XC_RS19805  0       39       -7.269368   transposase
XC_RS19810  -1      36       -5.775071   transposase
XC_RS19815  -1      28       -5.235898   integrase
XC_RS19820  -1      21       -4.039187   transposase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS19795  VACCYTOTOXIN  29     0.005    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 29.2 bits (65), Expect = 0.005
Identities = 22/92 (23%), Positives = 27/92 (29%), Gaps = 11/92 (11%)

Query: 46 FTDGAQIHYDTEAHALQATLPSGGTATITADGGITLNGPLTVN----GETMLNGDATITG 101
T+ A +H L G IT++GPL VN G + A
Sbjct: 416 TTNAAHLHIGKGGINLSNQASGRSLLVENLTGNITVDGPLRVNNQVGGYALAGSSANFEF 475

Query: 102 TATATTD----VLGGGISLKH---HKTTGVTA 126
A T ISL K TA
Sbjct: 476 KAGTDTKNGTATFNNDISLGRFVNLKVDAHTA 507


57  XC_RS20110  XC_RS20185  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS20110  1       9        3.233261    hypothetical protein
XC_RS20115  1       11       3.654912    hypothetical protein
XC_RS20120  1       10       3.124434    hypothetical protein
XC_RS20125  4       12       3.533489    NAD-dependent dehydratase
XC_RS20130  3       12       2.451800    mercuric reductase
XC_RS20135  0       16       1.367784    membrane protein
XC_RS20140  -1      13       0.650971    transcriptional regulator
XC_RS20145  0       15       0.492699    histidine kinase
XC_RS20150  1       14       0.626262    hypothetical protein
XC_RS20155  1       14       0.507957    hypothetical protein
XC_RS20160  2       11       0.290360    histidine biosynthesis protein HisIE
XC_RS20165  0       10       0.346362    heat-shock protein
XC_RS20170  1       13       1.344206    hypothetical protein
XC_RS20175  1       12       1.922341    membrane protein
XC_RS20180  2       13       1.860323    hypothetical protein
XC_RS20185  2       12       0.833176    transporter
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS20125  NUCEPIMERASE  39     9e-06    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 39.4 bits (92), Expect = 9e-06
Identities = 16/27 (59%), Positives = 18/27 (66%)

Query: 1 MHLLITGGTGFIGQALCPALLEAGHQV 27
M L+TG GFIG + LLEAGHQV
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQV 27


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
XC_RS20140  HTHFIS  82     2e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 82.2 bits (203), Expect = 2e-20
Identities = 26/155 (16%), Positives = 61/155 (39%), Gaps = 5/155 (3%)

Query: 2 HLLLVEDDTMLADAIYDGVRQQSWTIDHVGTAAAARTALVEHHYTAVLLDIGLPGDSGLT 61
+L+ +DD + + + + + + AA + V+ D+ +P ++
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 VIRYMRGHYDATPVIALTARGQLTDRIRGLDAGADDYLVKPFQFDELMARVRAIARRSQG 121
++ ++ PV+ ++A+ I+ + GA DYL KPF EL+ + +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 122 RVVPLLTQGE-----VSVDPSSRKVTRNGKWVALS 151
R L + V + +++ R + +
Sbjct: 125 RPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQT 159


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS20145  PF06580  36     2e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 35.6 bits (82), Expect = 2e-04
Identities = 36/206 (17%), Positives = 74/206 (35%), Gaps = 49/206 (23%)

Query: 203 TLESKDAALQRLLETARRSNRLAEQLLDLARLDAGISSAAYHQVEMAELITHVLDEFSVQ 262
L + A +LE ++ + L +L R S+A QV +A+ +T V +
Sbjct: 178 ALNNIRA---LILEDPTKAREMLTSLSELMRYSLRYSNA--RQVSLADELTVVDSYLQLA 232

Query: 263 AEARH---INLQVEAAPCVLRCDVDAVGILIRNLVDNAIRYG----RLHGTVEVSCGYCL 315
+ + + + P ++ V +L++ LV+N I++G G + +
Sbjct: 233 -SIQFEDRLQFENQINPAIMDVQVPP--MLVQTLVENGIKHGIAQLPQGGKILLK----G 285

Query: 316 RADQLHPFVQVSDDGPGVPEDARAAIFERFYRVSGNAVQGSGIGLSLVAGIARLHEARIE 375
D ++V + G ++ + + +G GL V E R++
Sbjct: 286 TKDNGTVTLEVENTGSLALKNTK---------------ESTGTGLQNV------RE-RLQ 323

Query: 376 TYEGGDGR--------GLCVRVLFPA 393
G + + + VL P
Sbjct: 324 MLYGTEAQIKLSEKQGKVNAMVLIPG 349


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
XC_RS20165  V8PROTEASE  83     2e-19    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 82.7 bits (204), Expect = 2e-19
Identities = 31/193 (16%), Positives = 70/193 (36%), Gaps = 40/193 (20%)

Query: 110 LGSGVIIDAQKGYVLTNHHVIENADDVQVTL------------GDGRTVKADFIGSDADT 157
+ SGV++ K +LTN HV++ L +G +
Sbjct: 103 IASGVVVG--KDTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEG 160

Query: 158 DIALIRIKAD--------NLTDIKLADSNALRVGDFVVAIGNPFG---FTQTVTSGIVSA 206
D+A+++ + + ++++ +V + G P T + G ++
Sbjct: 161 DLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGDKPVATMWESKGKITY 220

Query: 207 VGRSGIRGLGYQNFIQTDASINPGNSGGALVNLQGQLVGINTASFNPQGSMAGNIGLGLA 266
+ +Q D S GNSG + N + +++GI+ + N + +
Sbjct: 221 L---------KGEAMQYDLSTTGGNSGSPVFNEKNEVIGIHWGGVPNE----FNGAVFIN 267

Query: 267 --IPSNLARNVVE 277
+ + L +N+ +
Sbjct: 268 ENVRNFLKQNIED 280


58  XC_RS20585  XC_RS20660  Y  N  N  Genomic Island
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS20585  2       13       -1.653508   2-dehydropantoate 2-reductase
XC_RS20590  3       18       -2.922507   membrane protein
XC_RS20595  2       20       -2.741317   thioesterase
XC_RS20600  1       18       -0.570614   thioredoxin
XC_RS20605  1       18       -0.429647   flavodoxin
XC_RS20610  0       18       -0.330553   ribonucleotide-diphosphate reductase subunit
XC_RS20615  -1      17       0.619621    ribonucleotide-diphosphate reductase subunit
XC_RS20620  -2      14       1.356417    membrane protein
XC_RS20625  -2      12       1.447883    GCN5 family N-acetyltransferase
XC_RS20630  -2      12       0.427011    magnesium transporter
XC_RS20635  -1      13       3.009810    carbonic anhydrase
XC_RS20640  -1      13       3.177001    potassium transporter
XC_RS20645  -1      14       3.907232    glucose-6-phosphate 1-dehydrogenase
XC_RS20650  0       13       5.098866    glycosyl hydrolase
XC_RS20655  1       15       6.091599    phosphotyrosine protein phosphatase
XC_RS20660  0       16       5.082240    ankyrin
59  XC_RS20815  XC_RS20900  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS20815  1       15       3.711768    DNA helicase II
XC_RS20820  4       14       3.884439    pyridine nucleotide-disulfide oxidoreductase
XC_RS20825  1       14       2.865651    cardiolipin synthase
XC_RS20830  1       14       2.940028    LysR family transcriptional regulator
XC_RS20835  -1      15       1.464168    hypothetical protein
XC_RS20840  0       14       1.431473    FldA protein
XC_RS20845  0       16       0.454540    4-oxalomesaconate hydratase
XC_RS20850  1       17       1.635108    50S ribosomal protein L33
XC_RS20855  0       15       3.035157    50S ribosomal protein L28
XC_RS20860  1       16       3.510266    hypothetical protein
XC_RS20865  -2      13       3.177289    cation transporter
XC_RS20870  -1      13       3.824801    hypothetical protein
XC_RS20875  -2      12       3.370843    cation transporter
XC_RS20880  -2      12       3.557110    cation efflux system protein
XC_RS20885  -2      12       2.318287    hypothetical protein
XC_RS20890  -2      11       1.802138    hypothetical protein
XC_RS20895  1       13       1.704588    ribosomal RNA small subunit methyltransferase G
XC_RS20900  2       14       1.596518    alkaline phosphatase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS20865  ACRIFLAVINRP  758    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 758 bits (1960), Expect = 0.0
Identities = 245/1075 (22%), Positives = 420/1075 (39%), Gaps = 72/1075 (6%)

Query: 5 IIRFAIAQRWLMLALTGVLIAIGAWSFSRLPIDATPDITNVQVQVNTAAPGYSPLESEQR 64
+ F I + L +L+ GA + +LP+ P I V V+ PG +
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 65 VTFPLETVLAGLPGLESTRSLS-RYGLSQVTAVFADGTDLYFARQQVAERLQQVKSQLPP 123
VT +E + G+ L S S G +T F GTD A+ QV +LQ LP
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 124 ELEPQLGPIATGLGEIFMYTVEAKPNARKADGSAWTATDLRTLQDWVVRPQLRNVPGVTE 183
E++ Q + M +D T D+ V+ L + GV +
Sbjct: 121 EVQQQGISVEKSSSSYLMVA------GFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGD 174

Query: 184 VNTIGGHARQIHITPDPARLVALGFTLDDVAQAVEANNRNVGAGYIER----SGQQF--L 237
V G + I D L T DV ++ N + AG + GQQ
Sbjct: 175 VQLFGAQ-YAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNAS 233

Query: 238 VRVPGQVDDIAQIGAIVLD-RREGVPIRVRDVAQVGEGRELRNGAATQDGTEVVLGTVFM 296
+ + + + G + L +G +R++DVA+V G E N A +G + +
Sbjct: 234 IIAQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKL 293

Query: 297 LVGANSRTVAQAAAQRLELANASLPAGVRAVPVYDRTALVDRTIVTVAKNLIEGALLVIV 356
GAN+ A+A +L P G++ + YD T V +I V K L E +LV +
Sbjct: 294 ATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFL 353

Query: 357 VLFLMLGNVRAALITAAVIPLAMLFTLTGMVRGGVSGNLMSLG--ALDFGLIVDGAVIIV 414
V++L L N+RA LI +P+ +L T + G S N +++ L GL+VD A+++V
Sbjct: 354 VMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVV 413

Query: 415 ENCLRRFGEAQARLGRVLERDERFALTAEATAEVIRPSLFGVGIIAAVYLPVFALTGIEG 474
EN R E ++ T ++ +++ + +++AV++P+ G G
Sbjct: 414 ENVERVMME---------DKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTG 464

Query: 475 KMFHPMAVTVVLALTGAMLLSLTFVPAAIALLLGGKVAEHE----------NRAMRWARG 524
++ ++T+V A+ ++L++L PA A LL AEH N +
Sbjct: 465 AIYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVN 524

Query: 525 VYAPLLDRVLRHARWVGIGALTTVVLCAALATRLGSEFIPNLDEGDIALHAMRIPGTSLE 584
Y + ++L + V L RL S F+P D+G G + E
Sbjct: 525 HYTNSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQE 584

Query: 585 --QAISMQATLERRIKQFPEVAHVFGKLGTAEVATDPMPPSVADTFLIMHPREQWPDPRK 642
Q + Q T + V VF G + + F+ + P E+
Sbjct: 585 RTQKVLDQVTDYYLKNEKANVESVFTVNG---FSFSGQAQNAGMAFVSLKPWEERNGDEN 641

Query: 643 PKAQLVAEIEAAVRQLPGNNYEFTQPIQM-RMNELISGVRADVA-IKLYGDDLDTLVTVG 700
++ + + ++ F P M + EL + D I G D L
Sbjct: 642 SAEAVIHRAKMELGKIRDG---FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQAR 698

Query: 701 RRIEAIARSVPGA-ADVGLEQSTGLPMLAVVPDRDALAGYGLNPGVVQDTVAAAVGGQPA 759
++ +A P + V + D++ G++ + T++ A+GG
Sbjct: 699 NQLLGMAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYV 758

Query: 760 GQLFDGDRRFDIVVRLPEALRQDPTALADLPIPLQGDGERNDADESSRAAGWRSGAPTTV 819
D R + V+ R P + L + +G V
Sbjct: 759 NDFIDRGRVKKLYVQADAKFRMLPEDVDKLYVRS------------------ANG--EMV 798

Query: 820 PLREVARIESVLGPNQINREDGKRRIVITANVRDRDLGSFVAEVQQRVQ-AEVALPTGYW 878
P V G ++ R +G + I G+ + ++ LP G
Sbjct: 799 PFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAP---GTSSGDAMALMENLASKLPAGIG 855

Query: 879 IGYGGTFEQLISAGQRLAWVVPATLLLIFALLYWSFGSLRDAVVVFSGVPLALTGGVVAL 938
+ G Q +G + +V + +++F L + S V V VPL + G ++A
Sbjct: 856 YDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAA 915

Query: 939 ALRGLSLSISAGVGFIALSGVAVLNGLVMIAFIRGLRDA-GMPLEQALREGALARLRPVL 997
L + VG + G++ N ++++ F + L + G + +A RLRP+L
Sbjct: 916 TLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPIL 975

Query: 998 MTALVAALGFVPMAFNVGAGAEVQRPLATVVIGGIVSSTLLTLLVLPVLYRWLHR 1052
MT+L LG +P+A + GAG+ Q + V+GG+VS+TLL + +PV + + R
Sbjct: 976 MTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVVIRR 1030



Score = 81.0 bits (200), Expect = 2e-17
Identities = 86/436 (19%), Positives = 164/436 (37%), Gaps = 48/436 (11%)

Query: 639 DPRKPKAQLVAEIEAAVRQLPGNNYEFTQPIQMRMNELISGVRADVAIKLYGDDLDTLVT 698
DP + Q+ +++ A LP E Q S + + D+ T
Sbjct: 99 DPDIAQVQVQNKLQLATPLLPQ---EVQQQGISVEKSSSSYL---MVAGFVSDNPGTTQD 152

Query: 699 -----VGRRIEAIARSVPGAADVGLEQSTGLPMLAVVPDRDALAGYGLNPGVVQDTVAAA 753
V ++ + G DV L + + D D L Y L P V + +
Sbjct: 153 DISDYVASNVKDTLSRLNGVGDVQL--FGAQYAMRIWLDADLLNKYKLTPVDVINQLKVQ 210

Query: 754 VGGQPAGQLFDGDRRFDIVVRLPEALRQDPTALADLPIPLQGDGERNDADESSRAAGWRS 813
AGQL P Q A + + +E + +
Sbjct: 211 NDQIAAGQL----------GGTPALPGQQLNAS------IIAQTRFKNPEEFGKVTLRVN 254

Query: 814 GAPTTVPLREVARIESVLGP---NQINREDGKRRIVITANVRDRDLGSFVAEVQQRVQAE 870
+ V L++VAR+E LG N I R +GK + + G+ + + ++A+
Sbjct: 255 SDGSVVRLKDVARVE--LGGENYNVIARINGKPAAGLGIKLAT---GANALDTAKAIKAK 309

Query: 871 VA-----LPTGYWIGYGGTFEQLISAGQRLAWVVPATL---LLIFALLYWSFGSLRDAVV 922
+A P G + Y ++ + VV +L+F ++Y ++R ++
Sbjct: 310 LAELQPFFPQGMKVLY--PYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQNMRATLI 367

Query: 923 VFSGVPLALTGGVVALALRGLSLSISAGVGFIALSGVAVLNGLVMIAFI-RGLRDAGMPL 981
VP+ L G LA G S++ G + G+ V + +V++ + R + + +P
Sbjct: 368 PTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMMEDKLPP 427

Query: 982 EQALREGALARLRPVLMTALVAALGFVPMAFNVGAGAEVQRPLATVVIGGIVSSTLLTLL 1041
++A + ++ A+V + F+PMAF G+ + R + ++ + S L+ L+
Sbjct: 428 KEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSVLVALI 487

Query: 1042 VLPVLYRWLHRERAPR 1057
+ P L L + +
Sbjct: 488 LTPALCATLLKPVSAE 503


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
XC_RS20875  RTXTOXIND  37     1e-04    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 36.7 bits (85), Expect = 1e-04
Identities = 29/179 (16%), Positives = 51/179 (28%), Gaps = 21/179 (11%)

Query: 180 EVQGLLTPAEGGQAQATARFPGPVRSLRANVGDQVRA-GQVLATVESNLSLTTYSVSAPI 238
+++ + A+ T F + D + LA E + + AP+
Sbjct: 277 QIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASV--IRAPV 334

Query: 239 SGTVLARNA-SVGSNAGEGQALFEIA-DLSTLWVDLHIFGADAGHITAGAPVTVTRIS-- 294
S V + G + L I + TL V + D G I G + ++
Sbjct: 335 SVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAII-KVEAF 393

Query: 295 --------DGVVAQTTLERVLPGT----ATASQSTVARAVLRNTDGLW-RPGSAVKARV 340
G V L+ + S + + G AV A +
Sbjct: 394 PYTRYGYLVGKVKNINLDAIEDQRLGLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEI 452


60  XC_RS21510  XC_RS21850  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS21510  2       9        2.308958    hypothetical protein
XC_RS21515  2       9        2.230970    hypothetical protein
XC_RS21520  1       9        2.293994    membrane protein
XC_RS21525  1       10       2.490360    ligand-gated channel
XC_RS21530  3       11       2.166568    ATP-binding protein
XC_RS21535  0       16       2.775522    hypothetical protein
XC_RS21540  0       18       3.968577    hypothetical protein
XC_RS21545  1       16       4.348595    GTP cyclohydrolase 1
XC_RS21550  -1      13       3.298209    MarR family transcriptional regulator
XC_RS21555  -1      12       3.236480    membrane protein
XC_RS21560  -1      13       3.510490    membrane protein
XC_RS21565  -2      12       3.523801    RND transporter
XC_RS21570  -1      14       2.365882    multidrug transporter
XC_RS21575  -1      16       2.067010    DeoR family transcriptional regulator
XC_RS21580  0       16       3.647876    hypothetical protein
XC_RS21585  1       15       3.972615    LysR family transcriptional regulator
XC_RS21590  0       16       3.536645    short-chain dehydrogenase
XC_RS21595  0       16       3.077661    cardiolipin synthetase
XC_RS21600  1       15       4.019073    membrane protein
XC_RS21605  0       14       2.087482    hypothetical protein
XC_RS21610  0       13       1.722523    plasmid stabilization protein ParE
XC_RS21615  1       15       0.274384    enterochelin esterase
XC_RS21620  2       27       -4.368687   hypothetical protein
XC_RS21625  3       30       -5.739183   hypothetical protein
XC_RS21630  4       35       -6.877425   hypothetical protein
XC_RS21635  3       34       -6.870627   transposase
XC_RS21640  4       36       -7.472749   hypothetical protein
XC_RS21645  3       32       -5.801326   hypothetical protein
XC_RS21650  0       26       -4.129857   integrase
XC_RS21655  6       38       -8.393230   hypothetical protein
XC_RS21660  5       39       -8.682352   hypothetical protein
XC_RS21665  6       43       -9.713113   DNA-binding protein
XC_RS21670  6       39       -9.203643   pseudouridine synthase
XC_RS21675  8       42       -11.123962  restriction endonuclease
XC_RS21685  10      44       -11.479970  plasmid partitioning protein ParA
XC_RS21690  4       25       -1.461631   peptidoglycan-binding protein
XC_RS21695  1       22       3.686640    hypothetical protein
XC_RS21700  -1      18       5.276417    hypothetical protein
XC_RS21705  0       14       4.581524    hypothetical protein
XC_RS21710  0       13       4.452858    XRE family transcriptional regulator
XC_RS21715  0       13       4.360460    exodeoxyribonuclease V subunit alpha
XC_RS21720  0       11       3.883858    exodeoxyribonuclease V subunit beta
XC_RS21725  1       11       2.876618    exodeoxyribonuclease V subunit gamma
XC_RS21730  3       13       1.277683    hemagglutinin
XC_RS21735  -2      16       -0.192856   microcystin dependent protein
XC_RS21740  -2      14       0.355847    microcystin dependent protein
XC_RS21745  -3      14       0.388469    microcystin dependent protein
XC_RS21750  -3      14       0.752896    acetyltransferase
XC_RS21755  -3      15       1.266612    hypothetical protein
XC_RS21760  -3      20       3.067447    ABC transporter ATP-binding protein
XC_RS21765  -1      22       2.337184    ABC transporter permease
XC_RS21770  -2      22       2.483454    mammalian cell entry protein
XC_RS21775  -1      22       2.464231    organic solvent ABC transporter
XC_RS21780  -1      20       3.250466    hypothetical protein
XC_RS21785  1       18       3.772092    lipoprotein
XC_RS21790  1       17       2.950230    glutathione peroxidase
XC_RS21795  1       15       2.729821    hypothetical protein
XC_RS21800  -1      14       2.513860    hypothetical protein
XC_RS21805  1       15       3.238758    NADPH:quinone reductase
XC_RS21810  1       16       2.555495    transcriptional regulator
XC_RS21815  1       17       1.758562    hypothetical protein
XC_RS21820  1       18       2.131824    membrane protein
XC_RS21825  1       18       2.508057    amino acid transporter
XC_RS21830  0       20       3.308324    alpha-1 2-mannosidase
XC_RS21835  0       20       3.184253    membrane protein
XC_RS21840  1       18       3.288552    membrane protein
XC_RS21845  1       17       3.713515    diguanylate cyclase
XC_RS21850  1       16       3.611149    CdaR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
XC_RS21510  SECA  33  2e-04  SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 33.3 bits (76), Expect = 2e-04
Identities = 10/17 (58%), Positives = 11/17 (64%)

Query: 8 DPCPCGRAAGYAQCCGQ 24
DPCPCG Y QC G+
Sbjct: 883 DPCPCGSGKKYKQCHGR 899


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
XC_RS21530  RTXTOXIND  38  2e-04  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 37.9 bits (88), Expect = 2e-04
Identities = 20/147 (13%), Positives = 44/147 (29%), Gaps = 10/147 (6%)

Query: 314 REQQRLALLEARLQEVDSQDRGLAGEEGQRRESLDNHEQKLAGLER--EQRAAGGEQIEE 371
Q + E L + ++ + + + +L ++A + E
Sbjct: 197 TWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLE 256

Query: 372 LERERVRVERERDERLRRRAQIEQACRQLGTALAAGASGFAAQVAHAQGVLDGGKQHAST 431
E + V E + QIE F + +LD +Q
Sbjct: 257 QENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNE------ILDKLRQTTDN 310

Query: 432 LDEAIAERMGERRDDERRFAEIRAELD 458
+ E + ++ ++ + IRA +
Sbjct: 311 IGLLTLEL--AKNEERQQASVIRAPVS 335



Score = 32.1 bits (73), Expect = 0.015
Identities = 35/240 (14%), Positives = 82/240 (34%), Gaps = 40/240 (16%)

Query: 596 AALRNADRAITREGQVKHPGDRYEKDDRHAVNDRKRWLLGHDNRDKLKVFEREAQALAQR 655
+ L + T G++ H G K+ + N + ++ + + ++ + L +
Sbjct: 75 SVLGQVEIVATANGKLTHSGRS--KEIKPIENSIVKEIIVKEG-ESVR----KGDVLLKL 127

Query: 656 IAS-CDADVAALRKQREQ---DQEKRLAAHTLVERDWDEIDVGPKLQRLSDIDEQ-LQQL 710
A +AD + Q +Q + +E + KL L DE Q +
Sbjct: 128 TALGAEADTLKTQSSLLQARLEQTRYQILSRSIELN--------KLPELKLPDEPYFQNV 179

Query: 711 REGDSGLRALGQAIDTARTLRDQAKRTYEDVRLERAQLARERVRLEQQQAACAGRAGTAA 770
E + + +L + T+ + Q ++ + L++++A
Sbjct: 180 SEEE---------VLRLTSLIKEQFSTW------QNQKYQKELNLDKKRAERLTVLARIN 224

Query: 771 LTPTQLQGLQERLAALAPLSLDTLEAHFRVV--ERGLAE---QLAESQGRDSRLSAQLLE 825
+ + RL + L A V+ E E +L + + ++ +++L
Sbjct: 225 RYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILS 284


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
XC_RS21560  RTXTOXIND  50  5e-09  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 49.8 bits (119), Expect = 5e-09
Identities = 19/142 (13%), Positives = 50/142 (35%), Gaps = 2/142 (1%)

Query: 86 ALEQARAALAERQATLTQLRREIARDRSLQDLVAAEDAEVRRSNVQKAQAAVATAQSAVD 145
+A L ++ L Q+ EI + LV +++ + +
Sbjct: 260 KYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELA 319

Query: 146 LAQLNLDRTEVRSPADGHISDRTVR-VGDYVSAGRPVVAVL-DTGSFRVDGYFEETRLQG 203
+ + +R+P + V G V+ ++ ++ + + V + +
Sbjct: 320 KNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGF 379

Query: 204 VHPGQRVDVHLMGEPVTLHGHV 225
++ GQ + + P T +G++
Sbjct: 380 INVGQNAIIKVEAFPYTRYGYL 401



Score = 42.5 bits (100), Expect = 1e-06
Identities = 21/168 (12%), Positives = 58/168 (34%), Gaps = 19/168 (11%)

Query: 10 PALLTLAMVVVAAVVLQHLWRYYMDAPWTRDAHVGADVV------QVAPDVSGLVEQVAV 63
+A ++ +V+ + A + ++ P + +V+++ V
Sbjct: 55 RRPRLVAYFIMGFLVIAFILSVL--GQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIV 112

Query: 64 ADNQAVTRGQLLFVVDRARYAIALEQARAALAERQATLTQLRREIARD----RSLQDLVA 119
+ ++V +G +L + + +++L QA L Q R +I L +L
Sbjct: 113 KEGESVRKGDVLLKLTALGAEADTLKTQSSLL--QARLEQTRYQILSRSIELNKLPELKL 170

Query: 120 AEDAEVRRSNVQKAQAAVATAQSAVD-----LAQLNLDRTEVRSPADG 162
++ + + ++ + + Q L+ + R+
Sbjct: 171 PDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLT 218


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
XC_RS21565  RTXTOXIND  30  0.021  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.2 bits (68), Expect = 0.021
Identities = 14/117 (11%), Positives = 36/117 (30%), Gaps = 5/117 (4%)

Query: 357 TLPSSGAHARVRATEAGADAAVAQFDHTVLQA-LREVQTTLSRYAQDLDRLRLLEQA-QQ 414
LP V E ++ + + Q + + L + + + +
Sbjct: 169 KLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYEN 228

Query: 415 QAELASSQN---RRLYQGGRTPYLSSLDAERTLATADMTLADAQAQVSKDQIQLFLA 468
+ + S+ L + L+ E A L ++Q+ + + ++ A
Sbjct: 229 LSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSA 285


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
XC_RS21575  TCRTETA  36  3e-04  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 35.6 bits (82), Expect = 3e-04
Identities = 22/86 (25%), Positives = 42/86 (48%), Gaps = 14/86 (16%)

Query: 69 AIFA-MTFLMRPIGAWYFGRFADRYGRRLALTISVSMMALCSFVIAITPTVSTIGIAAPI 127
A++A M F P+ G +DR+GRR L +S++ A+ ++A P +
Sbjct: 50 ALYALMQFACAPVL----GALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLW-------- 97

Query: 128 ILLLARLLQGFATGGEYGTSATYMSE 153
+L + R++ G TG + Y+++
Sbjct: 98 VLYIGRIVAGI-TGATGAVAGAYIAD 122


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
XC_RS21590  DHBDHDRGNASE  77  1e-18  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 77.0 bits (189), Expect = 1e-18
Identities = 43/187 (22%), Positives = 83/187 (44%), Gaps = 3/187 (1%)

Query: 6 LITGASSGFGRGLTQTLLARGDRVAA---TVRRADALADLQAAHGNALTVLQLDVRDTAA 62
ITGA+ G G + +TL ++G +AA + + + A DVRD+AA
Sbjct: 12 FITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDSAA 71

Query: 63 VQAVVAQAFAALGRIDVVISNAGYGTLGAAEAATDAQVRALIDTNLIGSISVIQAALPHL 122
+ + A+ +G ID++++ AG G + +D + A N G + ++ ++
Sbjct: 72 IDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSKYM 131

Query: 123 RRQGGGHVVQVSSEGGQIAYPGFSLYHASKWGIEGYVEAVRQEVAGFGIQFTLAEPGPAR 182
+ G +V V S + + Y +SK + + + E+A + I+ + PG
Sbjct: 132 MDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPGSTE 191

Query: 183 TNFGAAL 189
T+ +L
Sbjct: 192 TDMQWSL 198


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
XC_RS21600  VACCYTOTOXIN  30  0.004  Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 30.4 bits (68), Expect = 0.004
Identities = 30/138 (21%), Positives = 49/138 (35%), Gaps = 18/138 (13%)

Query: 32 RQPWKPVAATLRTALLLVATAALGYAAYG----LPGVAPGVALGAVVGAGLGVVSLRYTH 87
R+ +P+ + L+ T +AA+ +P + G+A GA VG G++
Sbjct: 8 RKINRPLVSLALVGALVSITPQQSHAAFFTTVIIPAIVGGIATGAAVGTVSGLLGWGLKQ 67

Query: 88 AEWVDGRGWYTPNPWIGGALMLV---------LLGRLAWRWTDGAFSGGAA-----VAGS 133
AE + W A L L DG + G A V
Sbjct: 68 AEEANKTPDKPDKVWRIQAGKGFNEFPNKEYDLYKSLLSSKIDGGWDWGNAARHYWVKDG 127

Query: 134 QASPLTLGIAAALVLYSL 151
Q + L + + A+ Y+L
Sbjct: 128 QWNKLEVDMQNAVGTYNL 145


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
XC_RS21610  PF05616  35  3e-04  Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 35.5 bits (81), Expect = 3e-04
Identities = 33/117 (28%), Positives = 44/117 (37%), Gaps = 6/117 (5%)

Query: 132 PPQGSASGGRTKVDFVGDTSQPDQPVPSPTPVPPSPTPTPVQPPPAASPVQSTLVQQAKN 191
P Q A+ GR D G+T+ Q +P P P S QP P SP ++ A N
Sbjct: 288 PVQVVATFGR---DSQGNTTVDVQVIPRPDLTPGSAEAPNAQPLPEVSPAENPANNPAPN 344

Query: 192 PVP---PQGDTAPGSLAERRRQTRRQTRPTPPQPPAPPAASTQRRPETWTGRPPGML 245
P P + P + T Q P P P + + R E G G+L
Sbjct: 345 ENPGTRPNPEPDPDLNPDANPDTDGQPGTRPDSPAVPDRPNGRHRKERKEGEDGGLL 401


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
XC_RS21690  TYPE4SSCAGX  32  0.007  Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 31.7 bits (71), Expect = 0.007
Identities = 18/64 (28%), Positives = 31/64 (48%)

Query: 382 NVFVVQGALDNPAHLMAHMKTSDAIAQPVEQSLSQLQTLSETQRQQQAQQQSQQQDQQQL 441
N+ + A+ NP +L + S+ I Q E L Q++ L + Q Q QA Q ++ +
Sbjct: 179 NLENLTNAMSNPQNLSNNKNLSELIKQQRENELDQMERLEDMQEQAQANALKQIEELNKK 238

Query: 442 SAPQ 445
A +
Sbjct: 239 QAEE 242


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
XC_RS21730  INTIMIN  36  0.002  Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 35.8 bits (82), Expect = 0.002
Identities = 73/389 (18%), Positives = 118/389 (30%), Gaps = 52/389 (13%)

Query: 236 QGQGTIVNDDALPSLSIDDVSVNEGNSGTTTATFTVTL--SAASGQTVSVNYASADGTAT 293
G +V+ + + D S + T T TV A + VS N S +
Sbjct: 549 LSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQANVPVSFNIVSGTAVLS 608

Query: 294 AGSDYVARSGTLTFAPGTTAQGVAI----TVNGDTALEPNETFSVGLSGASNASI----- 344
A S SG T + G + T +AL N V + AS I
Sbjct: 609 ANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASITEIKADKT 668

Query: 345 -ARATGAGTILNDDVVVTVGPASLPAATAGSAYSQNLSASGGTAPYSFAVTAGALPAGLT 403
A A G I V V P + ++ L + + G LT
Sbjct: 669 TAVANGQDAI---TYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDT--NGYAKVTLT 723

Query: 404 LSAAGVLSGTPTATG-SFNFTATATDSGGSPTSGNRAYTLTVAGATVTLPATSLPAGTAG 462
+ G + + + + A + + T + + G LP L G
Sbjct: 724 STTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVN 783

Query: 463 QAYSGALNPATGGIPPYTYAVTAGALPAGITLNGSSGALTGTPGSVGSFAFSVTATDSTS 522
A+GG YT+ A+ +++ SSG + T G+ SV ++D+ +
Sbjct: 784 L-------KASGGNGKYTWRSANPAIA---SVDASSGQV--TLKEKGTTTISVISSDNQT 831

Query: 523 GTPSQGTRGYTLNIAAPPIVVAPSTLPAATRGTAYSQTLSAS---------------GGT 567
T YT+ IV S + G
Sbjct: 832 AT-------YTIATPNSLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGAA 884

Query: 568 APYTYALASGALPAGLTLASNGTLSGTAT 596
Y Y +S + + + + SG A+
Sbjct: 885 NKYEYYKSSQTIISWVQQTAQDAKSGVAS 913



Score = 33.5 bits (76), Expect = 0.011
Identities = 70/402 (17%), Positives = 116/402 (28%), Gaps = 54/402 (13%)

Query: 547 TLPAATRGTAYSQTLSASG----GTAPYTYALASGALPAGLTLASNGTLSGTATVEGS-- 600
LPA +G + ++A G + L L G + G TA +
Sbjct: 513 ILPAYVQGGSNVYKVTARAYDRNGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKA 572

Query: 601 -----FNFTVTATDAGSFTANQAYSLTVAGPNLVLPASSLPAGTAGQAYSASITPATGGT 655
+T T G AN S + +G A ++ + T G+
Sbjct: 573 DGTEAITYTATVKKNGVAQANVPVSFNI---------------VSGTAVLSANSANTNGS 617

Query: 656 APYSYALTAGALPTGVVVDVATGGLSGTPTVAGTFNFTLTVSDSTPSPAAQASRSYTLTI 715
+ L + VV S A F S + + +
Sbjct: 618 GKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASITEIKADKTTAVANGQDA 677

Query: 716 AAPVIVVAPTALPAATRGTVYSQTLSASGGTAPYTYAVSAGNVPAGLTLASNGTLSGTAT 775
+ V P + + ++ TL + T G LT + G +A
Sbjct: 678 ITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDT--NGYAKVTLTSTTPGKSLVSAR 735

Query: 776 VEGSFNFTVTATDANTFTAS--QAYAVTVAGPNL--ALPASSLPAGTAGQAYAATIAPAT 831
V V A + FT + + G + LP L G A+
Sbjct: 736 VSDV-AVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNL-------KAS 787

Query: 832 GGTAPYSYALSAGVLPNGVVLDTATGGLSGMPTLSGTFNFTLTVTDSTPSPAAQASQSYT 891
GG Y++ + + + SG TL T++V S +Q+ T
Sbjct: 788 GGNGKYTWRSANPAI-------ASVDASSGQVTLKEKGTTTISVISSD-------NQTAT 833

Query: 892 LSIAAPVIVVAPTALPAATRGTAYSQVLTASGGTAPYTYEVN 933
+IA P ++ P T A + G E+
Sbjct: 834 YTIATPNSLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQNELE 875


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
XC_RS21750  SACTRNSFRASE  45  3e-08  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 44.6 bits (105), Expect = 3e-08
Identities = 28/124 (22%), Positives = 51/124 (41%), Gaps = 23/124 (18%)

Query: 59 YRQHYRDADFLI------------VQTPDERIGRLYLQRAAVQHVLV-DISLLPHWRGRG 105
Y + Y D D + + IGR+ ++ + L+ DI++ +R +G
Sbjct: 46 YFKQYEDDDMDVSYVEEEGKAAFLYYLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKG 105

Query: 106 VGTALIMHAQALAQAAG-CGVALHVFHANPAARRLYTRLGFL---------AGDASSTHL 155
VGTAL+ A A+ CG+ L N +A Y + F+ + ++ +
Sbjct: 106 VGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGAVDTMLYSNFPTANEI 165

Query: 156 AMHW 159
A+ W
Sbjct: 166 AIFW 169


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
XC_RS21760  PF00577  28  0.047  Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 27.9 bits (62), Expect = 0.047
Identities = 22/117 (18%), Positives = 40/117 (34%), Gaps = 21/117 (17%)

Query: 7 PVVQLSGVRIDRGGRAILRDVS-----------------LDVPRGSITAVLGPSGSGKST 49
V +GVR D G A+L + +D+ V G+
Sbjct: 730 KVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPT-RGAIVRA 788

Query: 50 LLAALTGELRPVAGRVTLFGQEIPRGSRALLEMRRNVGVLLQ-GNGLLTDLSVADNV 105
A G + +T + +P G+ E ++ G++ G L+ + +A V
Sbjct: 789 EFKARVG--IKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKV 843


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
XC_RS21785  VACJLIPOPROT  235  8e-79  VacJ lipoprotein signature.
		>VACJLIPOPROT#VacJ lipoprotein signature.

Length = 251

Score = 235 bits (602), Expect = 8e-79
Identities = 76/216 (35%), Positives = 104/216 (48%), Gaps = 13/216 (6%)

Query: 108 GGAAAGPALPGGAPAYDPWERYNRGMHRFNLAV-DRGVARPLATGYTKVVPRPARLGVTN 166
G A+ G DP E +NR M+ FN V D + RP+A + VP+PAR G++N
Sbjct: 16 VGCASSGTDQQGR--SDPLEGFNRTMYNFNFNVLDPYIVRPVAVAWRDYVPQPARNGLSN 73

Query: 167 FFDNLGSPLTMVNQLLQGHPVYAVQTLGRFVMNSTLGVAGLFDPASAAGIPRR---SEDF 223
F NL P MVN LQG P + RF +N+ LG+ G D A A + F
Sbjct: 74 FTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAGMANPKLQRTEPHRF 133

Query: 224 GQTLGVWGWRNSRYFELPLFGPRTVRDTLGLGGDI---PLSWVRQVNDGGARFALQGLQL 280
G TLG +G Y +LP +G T+RD G D LSW L+
Sbjct: 134 GSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMADALYPVLSW----LTWPMSVGKWTLEG 189

Query: 281 VDTRAQLMSLDSLRDQAPDEYALTRDAWMQRRNYQI 316
++TRAQL+ D L Q+ D Y + R+A+ QR ++
Sbjct: 190 IETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIA 225


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
XC_RS21850  HTHFIS  30  0.018  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.8 bits (67), Expect = 0.018
Identities = 11/49 (22%), Positives = 21/49 (42%), Gaps = 3/49 (6%)

Query: 308 PLQALLTQDRRSQLLQTLSCWFANGMRMTPTAKALGIHRNTLDYRMQRI 356
+L + +L L+ A A LG++RNTL +++ +
Sbjct: 428 LYDRVLAEMEYPLILAALT---ATRGNQIKAADLLGLNRNTLRKKIREL 473


61  XC_RS00540  XC_RS00565  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias     %GCBias  Product
XC_RS00540      -2       11    2.313971  hypothetical protein
XC_RS00545      -3       10    2.288346  ATP-dependent DNA ligase
XC_RS00550      -3       14    2.006809  l-lactate dehydrogenase
XC_RS00555      -1       14    2.386545  protein-S-isoprenylcysteine methyltransferase
XC_RS00560      -2       13    0.956714  sensor histidine kinase
XC_RS00565       0       10    1.380318  LuxR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
XC_RS00540  IGASERPTASE  28  0.040  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 28.5 bits (63), Expect = 0.040
Identities = 13/47 (27%), Positives = 23/47 (48%), Gaps = 2/47 (4%)

Query: 268 TPAKKTTAAADTAPAKKTATKKAAKKATKKTA--TKATKKAAPRRKA 312
TP++ T A+ + + +K + AT+ TA + K+A KA
Sbjct: 1032 TPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKA 1078


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
XC_RS00545  IGASERPTASE  33  0.006  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 33.5 bits (76), Expect = 0.006
Identities = 36/225 (16%), Positives = 64/225 (28%), Gaps = 10/225 (4%)

Query: 530 RQAAFKRLREDKPMSDLGGDR--ATPGKSRGARTRTAAAAAGKASRAAATRTAAVSAGGS 587
+++ E +R A KS A S T+T +
Sbjct: 1046 QESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETAT 1105

Query: 588 AAKPGKAGKSSTAADVSTPSRVA----KQRVTPAASSAAKPGKPGKSSAAATGTASPRAA 643
K KA K T P + KQ + A+P + + S
Sbjct: 1106 VEKEEKA-KVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQT-- 1162

Query: 644 KRGAVSTASASSTPKSGKRSVSSGSAASGKPAAPSKAASSKTARTPATSSARKTSTATAA 703
A + A T + ++ V+ + + + ++ A T T ++ ++S
Sbjct: 1163 NTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNS-ESSNKPKN 1221

Query: 704 SSIRKARASASNAPDGVAITHPERVVFPAAGISKGDVAAYYRAVA 748
R R+ N ++ V S A A A
Sbjct: 1222 RHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARA 1266


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
XC_RS00560  PF06580  42  3e-06  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 41.8 bits (98), Expect = 3e-06
Identities = 68/370 (18%), Positives = 133/370 (35%), Gaps = 79/370 (21%)

Query: 79 WRWATVIGFTALFLFRALLPPRPLPQHAAL-------LLQALLAVALIWLEPRTGTSPVL 131
W T+ GF L+ + P+ ++ L+ +L A R
Sbjct: 20 WGVYTLTGFGFASLYGS-------PKLHSMIFNIAISLMGLVLTHAYRSFIKR----QGW 68

Query: 132 LVLVAAQAAMRWPPPQVL--ALMLVLNAAMYAVFVLAGV-PRPFLM-------VAIYTSF 181
L L Q +R P V+ + V N +++ + P F + +
Sbjct: 69 LKLNMGQIILRVLPACVVIGMVWFVANTSIWRLLAFINTKPVAFTLPLALSIIFNVVVVT 128

Query: 182 QAFAALT--ANYARSAERARDALVYVNADLLATRALLADSARDAERLRLARELHDVAGH- 238
++ L ++ ++ ++A +A A++A+ + L +++ H
Sbjct: 129 FMWSLLYFGWHFFKNYKQAEIDQWK-----------MASMAQEAQLMALKAQINP---HF 174

Query: 239 ---KLTAMRIHLRLLRAEPALAPREDLRMLEQLSGEL---LGDIRSVVQSLRDDAGLDLH 292
L +R L+ +P A ML LS + L + SL D+ L +
Sbjct: 175 MFNALNNIR---ALILEDPTKA----REMLTSLSELMRYSLRYSNARQVSLADE--LTVV 225

Query: 293 TA-LHALAAPFPRPALRLQIDAQVRVSDPQVAEALVR--LVQEALTNAARHG-----DAD 344
+ L + F RLQ + Q+ +P + + V LVQ + N +HG
Sbjct: 226 DSYLQLASIQFED---RLQFENQI---NPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGG 279

Query: 345 TVTVAVRCDAGALCVDIQDDGR-CAEQIHEGNGI--AGMRERLAALSG-QLDLR-RAAHG 399
+ + D G + +++++ G + E G +RERL L G + ++ G
Sbjct: 280 KILLKGTKDNGTVTLEVENTGSLALKNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQG 339

Query: 400 GLHLTARLPA 409
++ +P
Sbjct: 340 KVNAMVLIPG 349


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
XC_RS00565  HTHFIS  85  1e-21  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 85.3 bits (211), Expect = 1e-21
Identities = 30/118 (25%), Positives = 53/118 (44%), Gaps = 1/118 (0%)

Query: 1 MSALRIALADDQVLVRAALRALLQQQGITVVCEADDGQALLDALATHNVDVVLSDIRMPG 60
M+ I +ADD +R L L + G V + L +A + D+V++D+ MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPD 59

Query: 61 LDGIAALEQLRARGDRTPVLLLTTFDDSELLLRACEAGAQGFLLKDAAPDDLHEAIAR 118
+ L +++ PVL+++ + ++A E GA +L K +L I R
Sbjct: 60 ENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


62  XC_RS02175  XC_RS02225  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias     %GCBias  Product
XC_RS02175      -1       12    3.126704  3-oxoacyl-ACP reductase
XC_RS02180      -2       12    4.095718  hypothetical protein
XC_RS02185      -2        9    1.981268  pyridine nucleotide-disulfide oxidoreductase
XC_RS02190      -2        9    1.597337  hypothetical protein
XC_RS02195      -2       10    1.549446  transporter
XC_RS02200      -2        9    0.816538  AcrR family transcriptional regulator
XC_RS02205      -1        9    1.771708  hemolysin secretion protein D
XC_RS02210      -1       10    1.106320  cation efflux system protein
XC_RS02215      -2       13    2.813374  short-chain dehydrogenase
XC_RS02220      -2       14    2.871452  hypothetical protein
XC_RS02225      -2       16    3.409252  RNA helicase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
XC_RS02175  DHBDHDRGNASE  119  7e-35  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 119 bits (300), Expect = 7e-35
Identities = 79/251 (31%), Positives = 113/251 (45%), Gaps = 9/251 (3%)

Query: 11 VLIAGGSRGIGLAIAQAFVQSGAQVSLCARNPEGLANAAAQLAADGAPVHTFACDLADAA 70
I G ++GIG A+A+ GA ++ NPE L + L A+ F D+ D+A
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDSA 70

Query: 71 QIERYVHAAAQAFGGLDVVVNNAS----GYGHGNDDESWQAGLDVDLMAAVRCNRAAVPY 126
I+ + G +D++VN A G H DE W+A V+ +R+ Y
Sbjct: 71 AIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSKY 130

Query: 127 LRQSGNAVILNISSINGQRPTPRAIAYSTAKAALNYYTTTLAAELARERIRVNAIAPGSI 186
+ + I+ + S P AY+++KAA +T L ELA IR N ++PGS
Sbjct: 131 MMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPGST 190

Query: 187 E--FPGGLWEQRSRDEPALY---ARIRDSIPFGGFGQVQHIADAALFLASPQARWITGQV 241
E LW + E + + IP + IADA LFL S QA IT
Sbjct: 191 ETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHITMHN 250

Query: 242 LAVDGGQSLGV 252
L VDGG +LGV
Sbjct: 251 LCVDGGATLGV 261


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
XC_RS02190  SURFACELAYER  27  0.024  Lactobacillus surface layer protein signature.
		>SURFACELAYER#Lactobacillus surface layer protein signature.

Length = 439

Score = 27.3 bits (60), Expect = 0.024
Identities = 15/45 (33%), Positives = 26/45 (57%)

Query: 5 ASIAVTAALCACALPALASETLNSRTDVLAALEAGYDVSVSTDLS 49
A++ A + A A+P A+ T+N+ + + A A YDV V+ +S
Sbjct: 13 AALLAVAPIAATAMPVNAATTINADSAINANTNAKYDVDVTPSIS 57


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
XC_RS02200  HTHTETR  61  1e-13  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 60.8 bits (147), Expect = 1e-13
Identities = 28/194 (14%), Positives = 60/194 (30%), Gaps = 6/194 (3%)

Query: 18 DVRDQIVNAATEHFRRYGYEKTAVSDLAKSIGFSKAYIYKFFESKQAIGEMICTNCLRQI 77
+ R I++ A F + G T++ ++AK+ G ++ IY F+ K + I I
Sbjct: 11 ETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNI 70

Query: 78 -EDEVRAAVDETDSPPEKFRRMFKVIVDASL-----RLFFEDRKLYEIAASAATERWQSV 131
E E+ P R + ++++++ RL E Q+
Sbjct: 71 GELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQ 130

Query: 132 LAYEERVLALLQEILQQGRQGGDFERKTPLDEATRALYVLIRPYTNPVLMQHSLDVIDEV 191
+++ L+ + A + I L + +
Sbjct: 131 RNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFAPQSFDLKKE 190

Query: 192 PGLLSGLVLRSLSP 205
++L
Sbjct: 191 ARDYVAILLEMYLL 204


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
XC_RS02205  RTXTOXIND  44  8e-07  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 43.7 bits (103), Expect = 8e-07
Identities = 18/104 (17%), Positives = 35/104 (33%), Gaps = 7/104 (6%)

Query: 68 VSGKVSERLVDAGQRVTRGQPLLRIDPVD-----LKLAAQAQQEAVTAARARA--QQAGE 120
+ V E +V G+ V +G LL++ + LK + Q + R + +
Sbjct: 103 ENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIEL 162

Query: 121 DEARYRDLRGTGAISASAYDQIKAAADAAKAQLSAAQAQADVAR 164
++ L + +++ K Q S Q Q
Sbjct: 163 NKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKE 206



Score = 33.6 bits (77), Expect = 0.001
Identities = 11/112 (9%), Positives = 31/112 (27%), Gaps = 5/112 (4%)

Query: 88 PLLRIDPVDLKLAAQAQQEAVTAARARAQQAGEDEARYRDLRGTGAISASAYDQIKAAAD 147
+ + K + V ++ ++ A+ T D+++
Sbjct: 250 AKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQT-- 307

Query: 148 AAKAQLSAAQAQADVARNANRYTDLLADADGVVMETLV-EPGQVVAAGQPVV 198
+ + + + + A V + V G VV + ++
Sbjct: 308 --TDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLM 357



Score = 30.6 bits (69), Expect = 0.012
Identities = 15/84 (17%), Positives = 29/84 (34%), Gaps = 2/84 (2%)

Query: 178 GVVMETLVEPGQVVAAGQPVVRLAHAGRREAIIQLPETLQPVVGSTAQATLFGNAGVTEP 237
+V E +V+ G+ V G +++L G ++ +L Q + E
Sbjct: 105 SIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLL--QARLEQTRYQILSRSIEL 162

Query: 238 ATLRQLSDSADRLTRTFEARYVLG 261
L +L + + VL
Sbjct: 163 NKLPELKLPDEPYFQNVSEEEVLR 186


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
XC_RS02210  ACRIFLAVINRP  438  e-139  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 438 bits (1127), Expect = e-139
Identities = 230/1047 (21%), Positives = 434/1047 (41%), Gaps = 63/1047 (6%)

Query: 8 LSALAVRERSITLFLIVLISLAGLVAFLKLGRAEDPAFTVKVMTIITAWPGATPQEMQEQ 67
++ +R L +++ +AG +A L+L A+ P +++ +PGA Q +Q+
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 68 VAEKLEKRMQELRWYDRTETYT-RPGLAFTTLTLLDSTPP----GEVQEQFYQARKKVGD 122
V + +E+ M + + + G TLT T P +VQ + A
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPL--- 117

Query: 123 EAGNLPAGVIGPMVNDEYADVTFAL---FALKAKGEPQRLLARDAE-TMRQRLLHVPGVK 178
LP V ++ E + ++ + F G Q ++ ++ L + GV
Sbjct: 118 ----LPQEVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVG 173

Query: 179 KVNIIGEQPERIFVEFSHDRLATLGVSPQDVFAALNAQNALTAAGSVETRGP------DV 232
V + G Q + + D L ++P DV L QN AAG + +
Sbjct: 174 DVQLFGAQ-YAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNA 232

Query: 233 FIRLDGALDNLQKIRDTPLVVQ--GRSLKLSDIATVKRGYEDPSTFMIRSGGEPALLLGI 290
I N ++ L V G ++L D+A V+ G E+ + R G+PA LGI
Sbjct: 233 SIIAQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVI-ARINGKPAAGLGI 291

Query: 291 IMRDGWNGLDLGKSLDAEVGAINAELPLGMQLSKVTDQAVNIDASVGEFMTKFFVALLVV 350
+ G N LD K++ A++ + P GM++ D + S+ E + F A+++V
Sbjct: 292 KLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLV 351

Query: 351 MLVCFVSMG-WRVGIVVAAAVPLTLAAVFVVMLATGKNFDRITLGSLILALGLLVDDAII 409
LV ++ + R ++ AVP+ L F ++ A G + + +T+ ++LA+GLLVDDAI+
Sbjct: 352 FLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIV 411

Query: 410 AIEMMV-VKMEEGYGRVAASAYAWSHTAAPMLSGTLVTAVGFMPNGFAASTAGEYTSNMF 468
+E + V ME+ A+ + S ++ +V + F+P F + G
Sbjct: 412 VVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFS 471

Query: 469 WIVGIALIVSWAVAVVFTPYLGVKML----PDMKKIEGGHAAIYNT---PRYNRFRQALG 521
+ A+ +S VA++ TP L +L + + +GG +NT N + ++G
Sbjct: 472 ITIVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVG 531

Query: 522 RVIARKWLVAGSVVGLFVLAVVGMGIVKKQFFPISDRPEVLVEVQLPYGTSINQTSTAAA 581
+++ + VV + F P D+ L +QLP G + +T
Sbjct: 532 KILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLD 591

Query: 582 KVEAWLSKQKEAKIVTAYIGQGAPRFFLAMGPELPDPSFAKIVV-----RTDDQHERDAL 636
+V + K ++A + + + G + + + A + + R D++ +A+
Sbjct: 592 QVTDYYLKNEKANVESVFTVNG-----FSFSGQAQNAGMAFVSLKPWEERNGDENSAEAV 646

Query: 637 KLRLREAIAQGLAPEARVRA----TQLTFGPYSKFPVA-YRVSGPDPTVLRGIAAQVMQV 691
R + + + + V + G + F +G L Q++ +
Sbjct: 647 IHRAKMELGK--IRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGM 704

Query: 692 MQDSP-MLRTVNTDWGVRTPTLHFSLDQDRLQAVGLTSAAVAQQLQFLLSGVPITLVRED 750
P L +V + T +DQ++ QA+G++ + + Q + L G + +
Sbjct: 705 AAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDR 764

Query: 751 IRSVQVVARSAGDTRLDPARIADFTLAGGNGQRVPLSQVGKVDVRMEEPVMRRRDRVPTI 810
R ++ ++ R+ P + + NG+ VP S P + R + +P++
Sbjct: 765 GRVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSM 824

Query: 811 TVGGDVDDTLQPPDVSAAITRQLQPIIDKLPSGYQIKEAGSIEESGKATAAMLPLFPIML 870
+ G+ P S ++ + KLP+G G + + L I
Sbjct: 825 EIQGEA----APGTSSGDAMALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISF 880

Query: 871 AATLLIIILQVRSISAMVMVFLTSPLGLIGVVPTLILFQQPFGINALVGLIALSGILMRN 930
L + S S V V L PLG++GV+ LF Q + +VGL+ G+ +N
Sbjct: 881 VVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKN 940

Query: 931 TLILIGQIHH-NEAEGLDPFHALVGATVQRARPVILTALAAILAFIPLTHSVFWGT---- 985
++++ E EG A + A R RP+++T+LA IL +PL S G+
Sbjct: 941 AILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQN 1000

Query: 986 -LAYTLIGGTLAGTVLTLVFLPAMYSI 1011
+ ++GG ++ T+L + F+P + +
Sbjct: 1001 AVGIGVMGGMVSATLLAIFFVPVFFVV 1027



Score = 71.4 bits (175), Expect = 1e-14
Identities = 59/330 (17%), Positives = 122/330 (36%), Gaps = 24/330 (7%)

Query: 712 LHFSLDQDRLQAVGLTSA----AVAQQLQFLLSGVPITLVREDIRSVQVVARSAGDTRLD 767
+ LD D L LT + Q + +G + + + + +
Sbjct: 184 MRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK-N 242

Query: 768 PARIADFTL-AGGNGQRVPLSQVGKVDVRMEE-PVMRRRDRVPTITVGGDVDDTLQPPDV 825
P TL +G V L V +V++ E V+ R + P +G + D
Sbjct: 243 PEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALDT 302

Query: 826 SAAITRQLQPIIDKLPSGYQIKEA----GSIEESGKATAAMLPLFPIMLAATLLIIILQV 881
+ AI +L + P G ++ ++ S L IML L++ L +
Sbjct: 303 AKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTL-FEAIMLVF--LVMYLFL 359

Query: 882 RSISAMVMVFLTSPLGLIGVVPTLILFQQPFGINA--LVGLIALSGILMRNTLILIGQIH 939
+++ A ++ + P+ L+G IL + IN + G++ G+L+ + ++++ +
Sbjct: 360 QNMRATLIPTIAVPVVLLGTF--AILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVE 417

Query: 940 -HNEAEGLDPFHALVGATVQRARPVILTALAAILAFIPL-----THSVFWGTLAYTLIGG 993
+ L P A + Q ++ A+ FIP+ + + + T++
Sbjct: 418 RVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSA 477

Query: 994 TLAGTVLTLVFLPAMYSIWFKIRPDPGKGN 1023
++ L+ PA+ + K N
Sbjct: 478 MALSVLVALILTPALCATLLKPVSAEHHEN 507


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
XC_RS02215  DHBDHDRGNASE  102  3e-28  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 102 bits (256), Expect = 3e-28
Identities = 56/185 (30%), Positives = 82/185 (44%), Gaps = 8/185 (4%)

Query: 6 VVLITGVSSGIGRAAAEHFARAGCLVYGSVRHLAGATPLTAVELVE--------MDIRDA 57
+ ITG + GIG A A A G + + + + E D+RD+
Sbjct: 10 IAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDS 69

Query: 58 ASVQRAVDGIIARAGRIDVLVNNAGTNLVGAIEETSVDEAAALFDINLLGILRTVQAVLP 117
A++ I G ID+LVN AG G I S +E A F +N G+ ++V
Sbjct: 70 AAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSK 129

Query: 118 HMRARGQGRIVNVSSVLGFLPAPYMGVYAASKHAVEGLSETLDHELRQFGIRVTLVEPAY 177
+M R G IV V S +P M YA+SK A ++ L EL ++ IR +V P
Sbjct: 130 YMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPGS 189

Query: 178 TKTSL 182
T+T +
Sbjct: 190 TETDM 194


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
XC_RS02225  SECA  30  0.019  SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 30.2 bits (68), Expect = 0.019
Identities = 21/66 (31%), Positives = 31/66 (46%), Gaps = 6/66 (9%)

Query: 252 LLVFVTSRHGADKVAEKLSKTGIAALPLHGELSQGRRERTLRAFKQADVQ--VLVATDLA 309
+LV S ++ V+ +L+K GI H L+ QA V +AT++A
Sbjct: 452 VLVGTISIEKSELVSNELTKAGIK----HNVLNAKFHANEAAIVAQAGYPAAVTIATNMA 507

Query: 310 GRGIDI 315
GRG DI
Sbjct: 508 GRGTDI 513


63  XC_RS02515  XC_RS02535  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias     %GCBias  Product
XC_RS02515       1       19    2.475764  bacterioferritin
XC_RS02520       1       19    3.081093  histidine kinase
XC_RS02525      -2       11    3.024920  GGDEF signaling protein
XC_RS02530      -1       13    2.697095  membrane protein
XC_RS02535      -1       15    1.992597  membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
XC_RS02515  HELNAPAPROT  30  0.003  Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 29.8 bits (67), Expect = 0.003
Identities = 20/103 (19%), Positives = 42/103 (40%), Gaps = 10/103 (9%)

Query: 44 EYKESIDEMKHADKLSDRILFLEGLPNF---QALGKLRIGENP-----TEMFRCDLALER 95
E + E D +++R+L + G P + I + +EM + + +
Sbjct: 52 ELYDHAAE--TVDTIAERLLAIGGQPVATVKEYTEHASITDGGNETSASEMVQALVNDYK 109

Query: 96 EAVVVLREAVAYAETVKDYVSRQLFVDILESEEEHIDWLETQL 138
+ + + AE +D + LFV ++E E+ + L + L
Sbjct: 110 QISSESKFVIGLAEENQDNATADLFVGLIEEVEKQVWMLSSYL 152


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
XC_RS02520    HTHFIS    59    3e-11    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 59.5 bits (144), Expect = 3e-11
Identities = 25/134 (18%), Positives = 47/134 (35%), Gaps = 5/134 (3%)

Query: 498 RILLVEDNPVNLLVAQKLLGVLGFEADTATDGEAALSSMESTRYDMVFMDCQMPVLDGYA 557
IL+ +D+ V + L G++ ++ + + D+V D MP + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 558 ATRRWRAMETESGGRPIPIVAMTANAMAGDRERCLAAGMDDYLSKPVAREQLNACLQRWL 617
R + + +P++ M+A + G DYL KP +L + R L
Sbjct: 65 LLPRIKKARPD-----LPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119

Query: 618 PRQALLPGPSTGAP 631
P
Sbjct: 120 AEPKRRPSKLEDDS 133


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
XC_RS02525    HTHFIS    68    6e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 67.5 bits (165), Expect = 6e-14
Identities = 28/133 (21%), Positives = 56/133 (42%), Gaps = 4/133 (3%)

Query: 107 RVLIVEDDRSQALFAQSVLHGAGMHAQVEMTAASVPQAIQDYHPDLILMDLHMPELDGIR 166
+L+ +DD + L AG ++ AA++ + I DL++ D+ MP+ +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 167 LTTLIRQQPGQQLLPIVFLTGDPDPERQFEVLDSGADDFLTKPIRPRHLIAAVSN--RIR 224
L I++ + LP++ ++ + + GA D+L KP LI +
Sbjct: 65 LLPRIKKA--RPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEP 122

Query: 225 RARQQALQQAGEQ 237
+ R L+ +
Sbjct: 123 KRRPSKLEDDSQD 135


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
XC_RS02535    GPOSANCHOR    40    5e-06    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 40.0 bits (93), Expect = 5e-06
Identities = 21/79 (26%), Positives = 30/79 (37%), Gaps = 1/79 (1%)

Query: 55 EAGLQQAQRAQAAQRRQIEQLQQRQVNLAMSDKISRAANTEVQASLAERDEQIAALRADV 114
A Q +R A R +QL+ L +KIS A+ ++ L E L A+
Sbjct: 308 NANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEH 367

Query: 115 AFYERLVG-STAQRKGLNA 132
E S A R+ L
Sbjct: 368 QKLEEQNKISEASRQSLRR 386


64    XC_RS02610    XC_RS02645    N        Y        N    Pathogenicity Island (unbiased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
XC_RS026106182.445409ser/threonine protein phosphatase
XC_RS026153120.890953hypothetical protein
XC_RS026203120.649195hypothetical protein
XC_RS026252130.419901CDP-diacylglycerol--glycerol-3-phosphate
XC_RS026302140.183317acyltransferase
XC_RS026352140.632258phosphatidate cytidylyltransferase
XC_RS026402141.360285ice nucleation protein
XC_RS026450144.860006Fis family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
XC_RS02610    PHPHTRNFRASE    29    0.048    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 29.0 bits (65), Expect = 0.048
Identities = 16/65 (24%), Positives = 24/65 (36%), Gaps = 10/65 (15%)

Query: 354 PASLRAAAQAIEAARAHG-PVLVCCALGYSRSAASVVTWLV------LSQRAASISEAMA 406
PA LR I+AA + G V +C + + L+ S A SI A +
Sbjct: 480 PAILRLVDMVIKAAHSEGKWVGMCGEMA---GDEVAIPLLLGLGLDEFSMSATSILPARS 536

Query: 407 CVRAA 411
+
Sbjct: 537 QLLKL 541


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
XC_RS02620    cloacin    35    8e-04    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 34.7 bits (79), Expect = 8e-04
Identities = 17/49 (34%), Positives = 21/49 (42%), Gaps = 3/49 (6%)

Query: 448 GTPWASYHEIRAPNTHHHDSTGSSCGGGGDGGDGGGGDSGGDGGGGCGG 496
G+ W+S + P S GG G G GG G+SGG G G
Sbjct: 36 GSGWSSENN---PWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGNL 81



Score = 32.8 bits (74), Expect = 0.003
Identities = 14/41 (34%), Positives = 20/41 (48%)

Query: 461 NTHHHDSTGSSCGGGGDGGDGGGGDSGGDGGGGCGGCGGGG 501
++ ++ G S G GG G G+ GG+G G G GG
Sbjct: 40 SSENNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGN 80



Score = 31.2 bits (70), Expect = 0.009
Identities = 14/38 (36%), Positives = 18/38 (47%)

Query: 465 HDSTGSSCGGGGDGGDGGGGDSGGDGGGGCGGCGGGGD 502
++ G G G G G G +GG G GG G GG+
Sbjct: 43 NNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGN 80



Score = 30.8 bits (69), Expect = 0.015
Identities = 19/62 (30%), Positives = 22/62 (35%), Gaps = 18/62 (29%)

Query: 458 RAPNTHHHDSTGSSCGGGGDGGDGGGGDSG------------------GDGGGGCGGCGG 499
R NT H ++G+ GG G GGG G GGG G GG
Sbjct: 7 RGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGG 66

Query: 500 GG 501
G
Sbjct: 67 GN 68


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
XC_RS02640    ICENUCLEATIN    994    0.0    Ice nucleation protein signature.
		>ICENUCLEATIN#Ice nucleation protein signature.

Length = 1258

Score = 994 bits (2571), Expect = 0.0
Identities = 942/1306 (72%), Positives = 1056/1306 (80%), Gaps = 48/1306 (3%)

Query: 1 MNREKVLALRTCTNNMSDHCGLIWPLSGIVECRHWQPSIKQENGLTGLLWGQGTNAHLNM 60
M +KVL LRTC NNM+DH G+IWPLSGIVEC++W+P ENGLTGL+WG+G+++ L++
Sbjct: 1 MKEDKVLILRTCANNMADHGGIIWPLSGIVECKYWKPVKGFENGLTGLIWGKGSDSPLSL 60

Query: 61 HADAHWVVCMVDTADIIWLGEEGMIKFPRAEVVYAGNRAGAMSCIAAGIEQHSPPKPEPP 120
HADA WVV VD + I + G IKFPRAEV++ G + AM I +
Sbjct: 61 HADARWVVAEVDADECIAIETHGWIKFPRAEVLHVGTKTSAMQFILHHRADYV------- 113

Query: 121 ADSVIAAEFTPKAAHAQFTAPIVESGAHSTAPLPSPPNGIGPQAAQPSNAILRTREIATY 180
+ + P + + T + + Q Q T EIATY
Sbjct: 114 --ACTEMQAGPGSPDVTSEVKVGNRSLPVTDDIDATIESGSTQPTQ-------TIEIATY 164

Query: 181 GSTLTGADQSQLIAGYGSTETAGNGSELIAGYGSTGVAGSDSTIVAGYGSSQTAGGGSTL 240
GSTL+G QSQLIAGYGSTETAG+ S LIAGYGSTG AG+DST+VAGYGS+QTAG S+
Sbjct: 165 GSTLSGTHQSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQTAGEESSQ 224

Query: 241 TAGYGSTQTARNGSELTAGYGSTETAGADSSLIAGYGSTQTSGGDSSLTAGYGSTQTAQN 300
AGYGSTQT GS+LTAGYGST TAG DSSLIAGYGSTQT+G DSSLTAGYGSTQTAQ
Sbjct: 225 MAGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQK 284

Query: 301 GSDLTAGYGSTSTAGTDSSLIAGYGSTQTSGGESSLTAGYGSTQTAQDGSDLTAGYGSTG 360
GSDLTAGYGST TAG DSSLIAGYGSTQT+G ES+ TAGYGSTQTAQ GSDLTAGYGSTG
Sbjct: 285 GSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTG 344

Query: 361 TAGADSSLIAGYGSTQTSGNDSSLTAGYGSTQTARTGSDLTAGYGSTSTAGADSTLIAGY 420
TAG DSSLIAGYGSTQT+G DSSLTAGYGSTQTA+ GSDLTAGYGST TAGADS+LIAGY
Sbjct: 345 TAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGY 404

Query: 421 GSTQTSGGDSSLTAGYGSTQTARKGSDLTTGYGSTSTAGADSTLIAGYGSTQTSGSESSL 480
GSTQT+G +S+ TAGYGSTQTA+KGSDLT GYGST TAG DS+LIAGYGSTQT+G +SSL
Sbjct: 405 GSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSL 464

Query: 481 TAGYGSTQTARKGSDLTAGYGSTSTAGADSTLIAGYGSTQTSGGESSLTAGYGSTQTARK 540
TAGYGSTQTA+KGSDLTAGYGSTSTAG +S+LIAGYGSTQT+G S+LTAGYGSTQTA+
Sbjct: 465 TAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQN 524

Query: 541 GSDLTAGYGSTSTAGGDSTLVAGYGSTQTSGGDSSLTAGYGSTQTARSGSDLTTGYGSTS 600
SDL GYGSTSTAG +S+L+AGYGSTQT+ +S LTAGYGSTQTAR GSDLT GYGST
Sbjct: 525 ESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGSTG 584

Query: 601 TAGGESTLIAGYGSTQTSGNASSLTAGYGSTQTARSGSDLTTGYGSTSTAGADSTLIAGY 660
TAG +S++IAGYGSTQT+ SSLTAGYGSTQTAR S LTTGYGSTSTAGADS+LIAGY
Sbjct: 585 TAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGY 644

Query: 661 GSTQTSGGESSLTAGYGSTQTARKGSDLTTGYGSTSTAGADSTLIAGYGSTQTAGGESSL 720
GSTQT+G S LTAGYGSTQTA++GSDLT GYGSTSTAGADS+LIAGYGSTQTAG S L
Sbjct: 645 GSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYNSIL 704

Query: 721 TAGYGSTQTARKGSDLTAGYGSTSTAGSDSSLIAGYGSTQTAGFKSILTTGYGSTQTAQE 780
TAGYGSTQTA++GSDLT+GYGSTSTAG+DSSLIAGYGSTQTA + S LT GYGSTQTA+E
Sbjct: 705 TAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYGSTQTARE 764

Query: 781 GSLLTAGYGSSSTAGSDSSLIAGYGSTQTAGFKSILTAGYGSTQTAQERSTLTTGYGSTS 840
S+LT GYGS+STAG+DSSLIAGYGSTQTAG+ SILTAGYGSTQTAQERS LTTGYGSTS
Sbjct: 765 QSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDLTTGYGSTS 824

Query: 841 TAGHDSTLIAGYGSTQTAGYKSILTTGYGSTQTAQESSSLIAGYGSSSMAGPDSSLIAGY 900
TAG DS+LIAGYGSTQTAGY SILT GYGSTQTAQE+S L GYGS+S AG DSSLIAGY
Sbjct: 825 TAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYDSSLIAGY 884

Query: 901 GSTQTAGYDSFLTAGYGSTQTAQSSSWLITGYGSTSTASFQSSLIAGYGSTQTAGYESTL 960
GSTQTAGY+S LTAGYGSTQTAQ +S L TGYGSTSTA ++SSLIAGYGSTQTA ++STL
Sbjct: 885 GSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSLIAGYGSTQTASFKSTL 944

Query: 961 TAGYGSTQTAQEISWLTTGYGSTQTAGHGSILTAGYGSNSTAGYESTLIAGYGSTQTAGY 1020
AGYGS+QTA+E S LT GYGST AG+ S L AGYGS TAGY+STL AGYGSTQTA +
Sbjct: 945 MAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQTAGYQSTLTAGYGSTQTAEH 1004

Query: 1021 ESTLTAGYGSTLTALENSSLTAGYGSTEIAGFSSTLIAGYGSSQTAGGDSTLTAGYGSTL 1080
STLTAGYGST TAG DS+L AGYGS+L
Sbjct: 1005 SSTLTAGYGST--------------------------------ATAGADSSLIAGYGSSL 1032

Query: 1081 MALDNSTLTAGYGSTETAGQDSSLIAGYGSNLTSGVRSYLTAGYGSNQIASYGSSLIAGH 1140
+ S LTAGYGST +G S L AGYGS+L SG RS LTAGYGSNQIAS+ SSLIAG
Sbjct: 1033 TSGIRSFLTAGYGSTLISGLRSVLTAGYGSSLISGRRSSLTAGYGSNQIASHRSSLIAGP 1092

Query: 1141 ESTQIAGHRSMLIAGKLSSQTAGSRSTLIAGRGSIQTAGDRSKLIAGADSTQIAGDRSKL 1200
ESTQI G+RSMLIAGK SSQTAG RSTLI+G S+Q AG+R KLIAGADSTQ AGDRSKL
Sbjct: 1093 ESTQITGNRSMLIAGKGSSQTAGYRSTLISGADSVQMAGERGKLIAGADSTQTAGDRSKL 1152

Query: 1201 LAGSNSFLTAGDRSKLTAGDDCTLMAGDRSKLTAGKNSILTAGANSRLIGSLGSTLTGGE 1260
LAG+NS+LTAGDRSKLTAG+DC LMAGDRSKLTAG NSILTAG S+LIGS GSTLT GE
Sbjct: 1153 LAGNNSYLTAGDRSKLTAGNDCILMAGDRSKLTAGINSILTAGCRSKLIGSNGSTLTAGE 1212

Query: 1261 DSVLIFRCWDGKRYTNIIAKTGEEGVEADIAYQVDDDKNVVEKFDD 1306
+SVLIFRCWDGKRYTN++AKTG+ G+EAD+ YQ+D+D N+V K ++
Sbjct: 1213 NSVLIFRCWDGKRYTNVVAKTGKGGIEADMPYQMDEDNNIVNKPEE 1258


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
XC_RS02645    DNABINDNGFIS    114    3e-37    DNA-binding protein FIS signature.
		>DNABINDNGFIS#DNA-binding protein FIS signature.

Length = 98

Score = 114 bits (287), Expect = 3e-37
Identities = 38/74 (51%), Positives = 55/74 (74%)

Query: 16 KSPLREHVAQSVRRYLRDLDGSDADDVYEIVLREMEIPLFVEVLNHCEGNQSRAAAMLGI 75
+ PLR+ V Q+++ Y L+G D +D+YE+VL E+E PL V+ + GNQ+RAA M+GI
Sbjct: 24 QKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQPLLDMVMQYTRGNQTRAALMMGI 83

Query: 76 HRATLRKKLKEYGL 89
+R TLRKKLK+YG+
Sbjct: 84 NRGTLRKKLKKYGM 97


65    XC_RS03265    XC_RS03295    N        Y        N    Pathogenicity Island (unbiased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
XC_RS032650173.050360carboxylesterase
XC_RS032700172.592539histidine kinase
XC_RS032750172.540436transcriptional regulator
XC_RS03280-1150.581137porphobilinogen deaminase
XC_RS03285010-2.357080salt-induced outer membrane protein
XC_RS03290114-3.659074transcriptional regulator
XC_RS03295114-3.458784outer membrane lipoprotein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
XC_RS03265    TCRTETOQM    30    0.008    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 30.2 bits (68), Expect = 0.008
Identities = 9/16 (56%), Positives = 9/16 (56%)

Query: 6 IERETGPNPQWAVIWL 21
I E PNP WA I L
Sbjct: 432 IHIEVPPNPFWASIGL 447


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
XC_RS03270    PF06580    157    4e-47    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 157 bits (398), Expect = 4e-47
Identities = 70/303 (23%), Positives = 128/303 (42%), Gaps = 17/303 (5%)

Query: 59 LWLALAVSVLLCVLRPSLSRLPPRLGGLAALSIAAVVAMLGAGIVHGLYAVLDQAPLGPL 118
+ L L + + R +L L L V+ M+ ++ +L P+
Sbjct: 51 MGLVLTHAYRSFIKRQGWLKLNMGQIILRVLPACVVIGMVWFVANTSIWRLLAFINTKPV 110

Query: 119 VGFWRFTLGSAATVVLIT-ALALRYFYVS----------DRWEAQVQANARAEADALQAR 167
L VV++T +L YF D+W+ A A+ AL+A+
Sbjct: 111 AFTLPLALSIIFNVVVVTFMWSLLYFGWHFFKNYKQAEIDQWKMASMAQ-EAQLMALKAQ 169

Query: 168 IRPHFLFNSMNLIASLLRRDPVVAEQAVLDLSDLFRAALGAGEG-VSTLRAECELAERYL 226
I PHF+FN++N I +L+ DP A + + LS+L R +L +L E + + YL
Sbjct: 170 INPHFMFNALNNIRALILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYL 229

Query: 227 AIESLRLGDRLQVRWHKQEPLPWALPMPRLVLQPLVENAVLHGVSRMPEGGTLYLSLRQR 286
+ S++ DRLQ + + +P +++Q LVEN + HG++++P+GG + L +
Sbjct: 230 QLASIQFEDRLQFENQINPAI-MDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKD 288

Query: 287 GSQLQIRIVNPAPQPGTQAPLVVAGAGHAQASISHRLAFQFGAGARMAAGWSEGYYACEI 346
+ + + N G ++ RL +G A++ +G +
Sbjct: 289 NGTVTLEVENTGSLALKNTKE---STGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345

Query: 347 TLP 349
+P
Sbjct: 346 LIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
XC_RS03275    HTHFIS    65    3e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 64.9 bits (158), Expect = 3e-14
Identities = 30/137 (21%), Positives = 51/137 (37%), Gaps = 6/137 (4%)

Query: 1 MTATQVRVLIADDEPLARERLRLLLGEHLHVQVVGEAENGEQVLQLCEQQQPDLVLLDIA 60
MT +L+ADD+ R L L + + N + + DLV+ D+
Sbjct: 1 MTGA--TILVADDDAAIRTVLNQALSRAGYDVRI--TSNAATLWRWIAAGDGDLVVTDVV 56

Query: 61 MPGVDGLETARLLRQRPQPPAVVFCTAYD--QHALSAFDAAALDYLMKPVRPERLAAALQ 118
MP + + +++ V+ +A + A+ A + A DYL KP L +
Sbjct: 57 MPDENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116

Query: 119 RVATYLAGRTPSLAPAA 135
R R L +
Sbjct: 117 RALAEPKRRPSKLEDDS 133


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
XC_RS03295    BCTLIPOCALIN    50    3e-10    Bacterial lipocalin signature.
		>BCTLIPOCALIN#Bacterial lipocalin signature.

Length = 171

Score = 50.4 bits (120), Expect = 3e-10
Identities = 29/117 (24%), Positives = 52/117 (44%), Gaps = 1/117 (0%)

Query: 33 DLSKIMGTWYVIARMPNPVERGHVASRDEYTLVEDGKVAVRYRYREGFEEPEKEVNARAS 92
+L+ +G WY +AR+ + ERG EY + DG ++V R + KE +A
Sbjct: 30 ELNNYLGKWYEVARLDHSFERGLSQVTAEYRVRNDGGISVLNRGYSEEKGEWKEAEGKAY 89

Query: 93 VDADSGNRDWRVWFYKVIPAKQRILEIAPDG-SWMLISYPGRDLAWIFARKPDMSRD 148
S + +V F+ + E+ + S+ +S P + W+ +R P + R
Sbjct: 90 FVNGSTDGYLKVSFFGPFYGSYVVFELDRENYSYAFVSGPNTEYLWLLSRTPTVERG 146


66    XC_RS03675    XC_RS03765    N        Y        N    Pathogenicity Island (unbiased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
XC_RS03675-2121.355563chemotaxis protein CheY
XC_RS036800131.837156histidine kinase
XC_RS036852121.8797322-amino-4-hydroxy-6-
XC_RS03690090.587515pteridine reductase
XC_RS03695190.539174hypothetical protein
XC_RS03700091.002042membrane protein
XC_RS03705191.125388hypothetical protein
XC_RS037101101.208493hypothetical protein
XC_RS037150101.273232TonB-dependent receptor
XC_RS037204132.989177type II secretion system protein C
XC_RS037253132.892426type II secretion system protein D
XC_RS037304152.928542general secretion pathway protein E
XC_RS037354172.978209general secretion pathway protein F
XC_RS037406183.518648general secretion pathway protein GspG
XC_RS037458193.671960general secretion pathway protein H
XC_RS037509183.169073general secretion pathway protein I
XC_RS037557153.043372type II secretion system protein J
XC_RS037604121.526791type II secretion system protein K
XC_RS037652101.409027type II secretion system protein L
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
XC_RS03675    HTHFIS    53    5e-11    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 52.5 bits (126), Expect = 5e-11
Identities = 25/124 (20%), Positives = 51/124 (41%), Gaps = 14/124 (11%)

Query: 1 MTAIRTILLAEDSPADAEMAVDALRDARLANPIVHVEDGVETMDYLLRRGAFANREEGLP 60
MT IL+A+D A + AL A + + ++ G
Sbjct: 1 MTGAT-ILVADDDAAIRTVLNQALSRAGY--DVRITSNAATLWRWI---------AAGDG 48

Query: 61 AVLLLDIKMPRLDGLEVLKQIRNEESLKRLPVVILSSSREESDLARSWDLGVNAYVVKPV 120
+++ D+ MP + ++L +I+ LPV+++S+ ++ + G Y+ KP
Sbjct: 49 DLVVTDVVMPDENAFDLLPRIKKAR--PDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPF 106

Query: 121 DVDQ 124
D+ +
Sbjct: 107 DLTE 110


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
XC_RS03680    HTHFIS    70    6e-15    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 69.9 bits (171), Expect = 6e-15
Identities = 35/147 (23%), Positives = 63/147 (42%), Gaps = 4/147 (2%)

Query: 12 KILLVEDSPEDAELLSDQLLDAGIDAAFERVDSEPSLRAAMERFQPDIVLSDLSMPGFSG 71
IL+ +D +L+ L AG D + +L + D+V++D+ MP +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDV--RITSNAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 72 HQALRLVRQSGA-TPFIFVSGTMGEETAVKALQDGANDYIIKH-NPTRLPSAVLRAIREA 129
L ++++ P + +S TA+KA + GA DY+ K + T L + RA+ E
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEP 122

Query: 130 RAEIERQRVESELMRAQRLESLAMLAA 156
+ + +S+ S AM
Sbjct: 123 KRRPSKLEDDSQDGMPLVGRSAAMQEI 149



Score = 41.7 bits (98), Expect = 5e-06
Identities = 27/126 (21%), Positives = 53/126 (42%), Gaps = 15/126 (11%)

Query: 380 GQRILLVDGEATRLSLLGNALSSQGYQPQLATDGAAALQLVQQHAMPDLVIIDSDIILLS 439
G IL+ D +A ++L ALS GY ++ ++ A + + DLV+ D + +
Sbjct: 3 GATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGD-GDLVVTDVVMPDEN 61

Query: 440 AVSVLLSMQELGYQGPAIVLED-------VGAPLQRAHLPHDIPVHVLRKPLEMRRVFRA 492
A +L +++ P +V+ + A + A+ L KP ++ +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAY-------DYLPKPFDLTELIGI 114

Query: 493 VAHALE 498
+ AL
Sbjct: 115 IGRALA 120


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
XC_RS03690    DHBDHDRGNASE    115    4e-33    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 115 bits (288), Expect = 4e-33
Identities = 77/253 (30%), Positives = 122/253 (48%), Gaps = 16/253 (6%)

Query: 6 KVVLITGAARRIGAQIATTLHGAGYRVALHAHRSGDALAARVAALCAQRAGSACALQADL 65
K+ ITGAA+ IG +A TL G +A + + L V++L A+ A A A AD+
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIA-AVDYNPEKLEKVVSSLKAE-ARHAEAFPADV 66

Query: 66 RTPEAPAQLVDACVAAFGRLDAVVNNASAFYPTVLGEATPAQWDELFAVNARAPFFIAQA 125
R A ++ G +D +VN A P ++ + +W+ F+VN+ F +++
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 126 AAAQLRAHH-GAIVNLTDLHAEQPMRQHPLYGASKSALEMLTRSLALELAPQ-VRVNAVA 183
+ + G+IV + A P Y +SK+A M T+ L LELA +R N V+
Sbjct: 127 VSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVS 186

Query: 184 PGAI-------LWPEDGKADAAKQALLAR----TPLARIGTPEEVAEAVRWLLDD-ASFI 231
PG+ LW ++ A+ + L PL ++ P ++A+AV +L+ A I
Sbjct: 187 PGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHI 246

Query: 232 TGHTLRVDGGRRL 244
T H L VDGG L
Sbjct: 247 TMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
XC_RS03710    GPOSANCHOR    33    0.002    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 33.5 bits (76), Expect = 0.002
Identities = 9/35 (25%), Positives = 12/35 (34%)

Query: 27 ADRVASPGEGAFPPAPTPAPTPAPTPAPTPAPAPT 61
A+ +A G + TP P P AP
Sbjct: 452 AEELAKLRAGKASDSQTPDAKPGNKAVPGKGQAPQ 486


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
XC_RS03720    BCTERIALGSPC    45    1e-07    Bacterial general secretion pathway protein C signa...
		>BCTERIALGSPC#Bacterial general secretion pathway protein C signature.

Length = 272

Score = 45.0 bits (106), Expect = 1e-07
Identities = 54/273 (19%), Positives = 89/273 (32%), Gaps = 44/273 (16%)

Query: 15 MSARAVRTAAVCVLLVLLAVQGVRLVWLLVTPLG----------------PLGTAQG--- 55
+S +R +L++L Q + W + P P+
Sbjct: 9 LSPSVIRRILFYLLMLLFCQQLAMIFWRIGLPDNAPVSSVQITPAQARQQPVTLNDFTLF 68

Query: 56 --ATTAAPLPALQRDVFFRAPADSGDLGLVLHGVRVGG--ADSAAYLSTGDGRQGAYRIG 111
+ AL P + L L L GV G + S A +S D Q + +
Sbjct: 69 GVSPEKNKAGALDASQMSNLPPST--LNLSLTGVMAGDDDSRSIAIISK-DNEQFSRGVN 125

Query: 112 DAV-GPGLTLQAIATDHVMVRAGSALRRLPLIEHAAASAAITAPLPASGAPAAAAAVASN 170
+ V G + +I D V+++ L L SG+ A
Sbjct: 126 EEVPGYNAKIVSIRPDRVVLQYQGRYEVLGLYSQ-----------EDSGSDGVPGA---Q 171

Query: 171 VGARTAAAGTTAVDPQQLLRTTGLRANADGGGFTVMPRGDDALLRQAGLAPGDVLTQLNG 230
V + +T + + + + + G+ + P + GL D+ LNG
Sbjct: 172 VNEQLQQRASTTM--SDYVSFSPIMNDNKLQGYRLNPGPKSDSFYRVGLQDNDMAVALNG 229

Query: 231 RTL-DAEHLHELQDELRDGQTATLTYRRDGQTH 262
L DAE + + + D TLT RDGQ
Sbjct: 230 LDLRDAEQAKKAMERMADVHNFTLTVERDGQRQ 262


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
XC_RS03725    BCTERIALGSPD    376    e-123    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 376 bits (967), Expect = e-123
Identities = 208/684 (30%), Positives = 329/684 (48%), Gaps = 58/684 (8%)

Query: 4 VRRPWLLSATLLLALPSLTMLPLHAADAPAVRMQDVDLRAFIQDVSRATGITFIVDTRVQ 63
V R + L+ LL +L P A + + D++ FI VS+ T I+D V+
Sbjct: 6 VIRSFSLT---LLIFAALLFRPAAAEEFS-ASFKGTDIQEFINTVSKNLNKTVIIDPSVR 61

Query: 64 GSVNVARAQAMSEADLLGMLLAVLRANGLIAVSSGPSTYRIIPDDTAAQQPG-----SAA 118
G++ V ++E L+VL G ++ +++ A +A
Sbjct: 62 GTITVRSYDMLNEEQYYQFFLSVLDVYGFAVINMNNGVLKVVRSKDAKTAAVPVASDAAP 121

Query: 119 SGNLGFATQVFTLQRVDARSAAEILKPLVGRGGVIMAM--PQGNSLLIADYADNLRRIRG 176
T+V L V AR A +L+ L GV + N LL+ A ++R+
Sbjct: 122 GIGDEVVTRVVPLTNVAARDLAPLLRQLNDNAGVGSVVHYEPSNVLLMTGRAAVIKRLLT 181

Query: 177 LVAQIDTDR-AAIDTVTLRNSSAQELARTLTTLF----GQAGERSAVLSVLPVESSNSLI 231
+V ++D ++ TV L +SA ++ + +T L A S V +V+ E +N+++
Sbjct: 182 IVERVDNAGDRSVVTVPLSWASAADVVKLVTELNKDTSKSALPGSMVANVVADERTNAVL 241

Query: 232 VRGDPALVQRVVRTALDLDGRAERRGDVSVVRLQHASAEQLLPVLQQLVGQTPGNEAEPG 291
V G+P QR++ LD + +G+ V+ L++A A L+ VL + T +E +
Sbjct: 242 VSGEPNSRQRIIAMIKQLDRQQATQGNTKVIYLKYAKASDLVEVLTG-ISSTMQSEKQAA 300

Query: 292 QETRPTAVDVAASAGAAQAQVIAPAAGKRPVIVRYPGSNALIINADPETQRALMDVIRQL 351
+ ++ A +NALI+ A P+ L VI QL
Sbjct: 301 KPVAALDKNIIIKAH--------------------GQTNALIVTAAPDVMNDLERVIAQL 340

Query: 352 DVHREQVLVEAIVVEISDTAAKRLGVQLLLAGRNGTVPLLATQYSGAAPGIVPLAAAAAG 411
D+ R QVLVEAI+ E+ D LG+Q A +N + TQ++ + I A A
Sbjct: 341 DIRRPQVLVEAIIAEVQDADGLNLGIQW--ANKNAGM----TQFTNSGLPISTAIAGANQ 394

Query: 412 TRSNNGEDDSVLEQARNVAAQSLLGLSGGLIGLAGQSNDAVFGMIIDAVKSDTGSNLLST 471
+ S+ S L G+ Q N + M++ A+ S T +++L+T
Sbjct: 395 YNKDGTVSSSLA---------SALSSFNGIAAGFYQGN---WAMLLTALSSSTKNDILAT 442

Query: 472 PSIMTLDNEQARILVGQEVPITTGEVLGAANDNPFRTIQRQDVGVELEVRPQINTAGGIT 531
PSI+TLDN +A VGQEVP+ TG + DN F T++R+ VG++L+V+PQIN +
Sbjct: 443 PSIVTLDNMEATFNVGQEVPVLTGSQTTS-GDNIFNTVERKTVGIKLKVKPQINEGDSVL 501

Query: 532 LAIKQEVSAIAGPVSTQSSEL--VFNKRQIETRVVVENGAIVALGGLLDQNDRQTVEKVP 589
L I+QEVS++A S+ SS+L FN R + V+V +G V +GGLLD++ T +KVP
Sbjct: 502 LEIEQEVSSVADAASSTSSDLGATFNTRTVNNAVLVGSGETVVVGGLLDKSVSDTADKVP 561

Query: 590 LLGDVPGLGALFRHKSRNRDKTNLMVFIRPTIIRDAADAQRMTAPRYNYLRDRQLADGDP 649
LLGD+P +GALFR S+ K NLM+FIRPT+IRD + ++ ++ +Y D Q
Sbjct: 562 LLGDIPVIGALFRSTSKKVSKRNLMLFIRPTVIRDRDEYRQASSGQYTAFNDAQSKQRGK 621

Query: 650 EAALDALVRDYLRAQPPQLPASAP 673
E L +D L P Q A+
Sbjct: 622 ENNDAMLNQDLLEIYPRQDTAAFR 645


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
XC_RS03735    BCTERIALGSPF    342    e-117    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 342 bits (878), Expect = e-117
Identities = 170/405 (41%), Positives = 238/405 (58%), Gaps = 10/405 (2%)

Query: 1 MAKFDYTVLDLHGRNRHGVISADSVRSARSLLEQRQWVPLRVEPAAATTP---------M 51
MA++ Y LD G+ G ADS R AR LL +R VPL V+
Sbjct: 1 MAQYHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLSLR 60

Query: 52 RAARFSGKDLVLFTRQLATLVDTA-PLEEALRTIGTQSERRGVRAVTGQTHALVVEGFRL 110
R R S DL L TRQLATLV + PLEEAL + QSE+ + + + V+EG L
Sbjct: 61 RKIRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSL 120

Query: 111 SDAMARQGTAFPPLYRAMVAAGESAGALPQVLERLADLLERQAQVRSKLQSALVYPAALA 170
+DAM +F LY AMVAAGE++G L VL RLAD E++ Q+RS++Q A++YP L
Sbjct: 121 ADAMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLT 180

Query: 171 MTAGVVVIVLMTFVVPKVVDQFDSMGRALPWLTRAVIALSQFLLHAGIPLLVAGVIAAVV 230
+ A VV +L++ VVPKVV+QF M +ALP TR ++ +S + G +L+A + +
Sbjct: 181 VVAIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMA 240

Query: 231 AVQVRKRPPVRLAIDRAILRAPLLGRLLRDLHAARMARTLAIMVNSGLPLMEGLMIAART 290
+ ++ R++ R +L PL+GR+ R L+ AR ARTL+I+ S +PL++ + I+
Sbjct: 241 FRVMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGDV 300

Query: 291 VDNHALRLATDSMVTAIREGGSLGAAMKRAGVFPPTLLYMASSGENSGRLAPMLERAADY 350
+ N R A+REG SL A+++ +FPP + +M +SGE SG L MLERAAD
Sbjct: 301 MSNDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAADN 360

Query: 351 LEREFESFTTAAMSLLEPLIIVLLGGVVAVIVLSILLPILQFNTL 395
+REF S T A+ L EPL++V + VV IVL+IL PILQ NTL
Sbjct: 361 QDREFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTL 405


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
XC_RS03740    BCTERIALGSPG    182    3e-62    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 182 bits (462), Expect = 3e-62
Identities = 65/134 (48%), Positives = 91/134 (67%), Gaps = 3/134 (2%)

Query: 17 RGFTLVELMVVIVIIGLLATVVMINVMPSQDRAMVEKARADVAVLEQALETYRLDNLTYP 76
RGFTL+E+MVVIVIIG+LA++V+ N+M ++++A +KA +D+ LE AL+ Y+LDN YP
Sbjct: 8 RGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKLDNHHYP 67

Query: 77 STEQGLQALLSAPGGLSRPERYRQGGYIRRLPEDPWGHAYQYRRPGRSGGFDVYSFGADG 136
+T QGL++L+ AP Y + GYI+RLP DPWG+ Y PG G +D+ S G DG
Sbjct: 68 TTNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLVNPGEHGAYDLLSAGPDG 127

Query: 137 AEGGDADNADIGNW 150
G + DI NW
Sbjct: 128 EMGTE---DDITNW 138


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
XC_RS03745    BCTERIALGSPH    41    5e-07    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 40.7 bits (95), Expect = 5e-07
Identities = 14/74 (18%), Positives = 31/74 (41%), Gaps = 1/74 (1%)

Query: 13 GFTLLEVLAVLVITALASTVVLLTLPDT-QRTLPDQADALATALSHARDEAILSLRMTEV 71
GFTLLE++ +L++ +++ +VLL P + + L + + + + V
Sbjct: 5 GFTLLEMMLILLLMGVSAGMVLLAFPASRDDSAAQTLARFEAQLRFVQQRGLQTGQFFGV 64

Query: 72 VLSAGGYAFRRQAR 85
+ + F
Sbjct: 65 SVHPDRWQFLVLEA 78


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
XC_RS03750    BCTERIALGSPG    31    5e-04    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 31.4 bits (71), Expect = 5e-04
Identities = 19/59 (32%), Positives = 34/59 (57%), Gaps = 4/59 (6%)

Query: 3 RRADTQAGFSLLELLVALAIFG-MAVVGLLNLSGESTRTAVILEERALAAVVAENQAIE 60
R D Q GF+LLE++V + I G +A + + NL G + ++A++ +VA A++
Sbjct: 2 RATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADK---QKAVSDIVALENALD 57


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
XC_RS03755    BCTERIALGSPG    32    5e-04    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 32.2 bits (73), Expect = 5e-04
Identities = 12/48 (25%), Positives = 24/48 (50%)

Query: 1 MMRLRRAAGFTLIELLVALAVFALVAAAAVGVLRQSIEQRAAVQARLQ 48
M + GFTL+E++V + + ++A+ V L + E+ +A
Sbjct: 1 MRATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSD 48


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
XC_RS03765    SUBTILISIN    31    0.009    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 30.6 bits (69), Expect = 0.009
Identities = 22/94 (23%), Positives = 30/94 (31%), Gaps = 13/94 (13%)

Query: 22 DAQGH-VHVHGTAASTPPAARTVLVVPGTQVH-LRWLALPGRSPAQSLAAARLQLAEHLA 79
D GH HV GT A+T V V P + ++ L G + E
Sbjct: 82 DYNGHGTHVAGTIAATENENGVVGVAPEADLLIIKVLNKQGSGQYDWIIQGIYYAIEQKV 141

Query: 80 ----------SDVSTLHVVIAANAQADGTRLVAA 103
DV LH + A A ++ A
Sbjct: 142 DIISMSLGGPEDVPELHEAV-KKAVASQILVMCA 174


67    XC_RS04075    XC_RS04120    N        Y        N    Pathogenicity Island (unbiased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
XC_RS04075-1110.8499253-ketoacyl-ACP reductase
XC_RS040800120.941790citrate transporter
XC_RS04085-1121.320924porin
XC_RS040902151.480279transcriptional regulator
XC_RS040950132.465711sensor histidine kinase
XC_RS04100-2141.939957ABC transporter substrate-binding protein
XC_RS04105-2151.114564hypothetical protein
XC_RS04110-3190.786866LuxR family transcriptional regulator
XC_RS04115-3131.137664hypothetical protein
XC_RS04120-3141.246427sensor histidine kinase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
XC_RS04075    DHBDHDRGNASE    102    2e-28    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 102 bits (256), Expect = 2e-28
Identities = 68/257 (26%), Positives = 116/257 (45%), Gaps = 19/257 (7%)

Query: 4 RIAYVTSGMGSVGTAICQKLARTGHTVVAGCGPNSPRKSAWLREQRELGFDFVASEGNAA 63
+IA++T +G A+ + LA G +A N + + + A +
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQG-AHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVR 67

Query: 64 DWDSTMAAFAKVKAEVGEIDVLVNNAGGSRDTLFRQMSRDDWNAVIASNLHSLFNITKQV 123
D + A+++ E+G ID+LVN AG R L +S ++W A + N +FN ++ V
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 124 VDGMTSRGWGRIVNIGSVSAHKGQIGQINFATAKAAMHGFSRALAQEVASRGVTVNTISP 183
M R G IV +GS A + +A++KAA F++ L E+A + N +SP
Sbjct: 128 SKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSP 187

Query: 184 G--------------YIASASISNFPPDVLDRLATSVPVRRLGKPAEVAGLCAWLASDDA 229
G A I L+ T +P+++L KP+++A +L S A
Sbjct: 188 GSTETDMQWSLWADENGAEQVIKGS----LETFKTGIPLKKLAKPSDIADAVLFLVSGQA 243

Query: 230 AYVTGADYAVNGGLYMG 246
++T + V+GG +G
Sbjct: 244 GHITMHNLCVDGGATLG 260


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
XC_RS04090    HTHFIS    77    2e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 76.8 bits (189), Expect = 2e-18
Identities = 31/123 (25%), Positives = 59/123 (47%), Gaps = 1/123 (0%)

Query: 2 RLLLVEDNADLADAIVRRMRRSGHAVDWQQDGLAAASVLRYQSFDLVVLDIGLPKLDGLR 61
+L+ +D+A + + + + R+G+ V + + DLVV D+ +P +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 VLAGMRERGDTTPVLMLTARDGIEDRVQALDVGADDYLGKPFDFREF-EARCRVLLRRNR 120
+L +++ PVL+++A++ ++A + GA DYL KPFD E R L R
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 121 GQA 123
+
Sbjct: 125 RPS 127


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
XC_RS04100    HTHFIS    32    0.004    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.7 bits (72), Expect = 0.004
Identities = 15/107 (14%), Positives = 29/107 (27%), Gaps = 12/107 (11%)

Query: 49 AVIQDYQRLHPGTEVI----YEDVIAWDIYEQYLHPAANAPQADLLISASMDLQTKLVND 104
++ ++ P V+ + A+ A + DL +
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTA--------IKASEKGAYDYLPKPFDLTELIGII 115

Query: 105 GHALTHRSAQTEALPAWAQWRHEVFGISYEPVAIVYNTRKLAAARVP 151
G AL + L +Q + G S I +L +
Sbjct: 116 GRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLT 162


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
XC_RS04110    HTHFIS    61    3e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 61.0 bits (148), Expect = 3e-13
Identities = 33/108 (30%), Positives = 53/108 (49%), Gaps = 3/108 (2%)

Query: 1 MADLTILVADDHPLFRAAVIHVLHQTLPTAEVVEASSATTLSAMLRSHPQAELVLLDLAM 60
M TILVADD R + L + +V S+A TL + + +LV+ D+ M
Sbjct: 1 MTGATILVADDDAAIRTVLNQAL--SRAGYDVRITSNAATLWRWIAAGDG-DLVVTDVVM 57

Query: 61 PGARGFSALLHVRGEHPDIPVVVISSNDHPRVIRRAQQFGAAGFIPKS 108
P F L ++ PD+PV+V+S+ + +A + GA ++PK
Sbjct: 58 PDENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKP 105


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
XC_RS04120  HTHFIS  62     5e-12    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 62.2 bits (151), Expect = 5e-12
Identities = 24/120 (20%), Positives = 44/120 (36%), Gaps = 3/120 (2%)

Query: 762 RVWCVDDDPRVCEATRALLERWECRVDFAGGPDDALAAASPDDAPELLLLDVRMGNHYGP 821
+ DDD + L R V + +L++ DV M +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIA-AGDGDLVVTDVVMPDENAF 63

Query: 822 MLLPQLVAQWQREPRVILVTAEPSPALREHALEAG-WGFLSKPVRPPALRALVTQMLLRR 880
LLP++ P V++++A+ + A E G + +L KP L ++ + L
Sbjct: 64 DLLPRIKKARPDLP-VLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEP 122


68  XC_RS04670  XC_RS04700  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
XC_RS04670  -1      18       3.671312   membrane protein
XC_RS04675  -1      14       1.700878   membrane protein
XC_RS04680  -3      14       1.230925   hypothetical protein
XC_RS04685  -2      13       0.646261   hypothetical protein
XC_RS04690  -1      12       0.148400   ATPase
XC_RS04695  1       11       -1.204468  ATPase AAA
XC_RS04700  1       10       -1.122336  fimbrial protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS04670  PF03544  29     0.038    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 29.2 bits (65), Expect = 0.038
Identities = 21/140 (15%), Positives = 35/140 (25%), Gaps = 5/140 (3%)

Query: 299 IVVEASARGATQAQFPELPTPSVPDAQVFAEPAQYEERFVDGSPQLRLTRRYSIVPNRAG 358
V A + Q ELP P+ P + PA E P + P
Sbjct: 26 GAVVAGLLYTSVHQVIELPAPAQPISVTMVAPADLEPPQAVQPPPEPVVEP-EPEPEPI- 83

Query: 359 PLLIPGLQVAWWDVGAAAAKTASLPDLTLDVAARSAAFAAPAAAPSPASQPAGEAVAPAP 418
P V + K P V + P+ + A +
Sbjct: 84 PEPPKEAPV---VIEKPKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSS 140

Query: 419 AASTLRLQGAPAASRPWGWI 438
A+ + + + +
Sbjct: 141 TATAATSKPVTSVASGPRAL 160


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
XC_RS04675  IGASERPTASE  36     5e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 36.2 bits (83), Expect = 5e-04
Identities = 19/168 (11%), Positives = 47/168 (27%), Gaps = 22/168 (13%)

Query: 414 ALRQQPQLQDAIANRAAVEAARKRQQQNKDGKGKPQQ------DQNKGDKQQDQNKDGQG 467
R Q I ++A N + + + + + +
Sbjct: 986 EKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSK 1045

Query: 468 KPEQTAGKDGKDGKGGQDGKPSSQPPKDGQTPGQPSPARQGEQKQPDSPPQTPDAQAQQQ 527
+ +T K + Q + T A++ + + AQ+ +
Sbjct: 1046 QESKTVEK-------------NEQDATE-TTAQNREVAKEAKSNVKANTQTNEVAQSGSE 1091

Query: 528 ADAAQRR--KMEQAMAQAGAKGVPSKEQQAAAAAAAETPEQREQRQAV 573
Q K + + V +++ Q ++ ++EQ + V
Sbjct: 1092 TKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETV 1139



Score = 32.0 bits (72), Expect = 0.008
Identities = 24/157 (15%), Positives = 43/157 (27%), Gaps = 12/157 (7%)

Query: 416 RQQPQLQDAIANRAAVEAARK----RQQQNKDGKGKPQQDQNKGDKQQDQNKDGQGKPEQ 471
Q A A EA Q G ++ + ++ + + K +
Sbjct: 1055 EQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKV 1114

Query: 472 TAGKDGKDGKGGQDGKPSSQPPKDGQTPGQPSPARQGEQKQPDSPPQTPDAQAQQQADAA 531
K + K P + + Q +P+ + P + P +Q AD
Sbjct: 1115 ETEKTQEVPKVTSQVSPKQEQSETVQPQAEPAR-----ENDPTVNIKEPQSQTNTTADTE 1169

Query: 532 QRRKMEQAMAQAGAKGVPSKEQQAAAAAAAETPEQRE 568
Q K + + V + E PE
Sbjct: 1170 QPAKETSSN---VEQPVTESTTVNTGNSVVENPENTT 1203


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
XC_RS04695  HTHFIS  35     3e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 35.2 bits (81), Expect = 3e-04
Identities = 40/158 (25%), Positives = 60/158 (37%), Gaps = 24/158 (15%)

Query: 35 IVGQS----ALVERLLIALLADGHLLVEGAPGLAKTT---AIRALASRLETDFARVQ--- 84
+VG+S + L + D L++ G G K A+ R F +
Sbjct: 139 LVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAA 198

Query: 85 FTPDLLPADLTG------TEIWRPQEGRFEFMPGPIFHPILLADEINRAPAKVQSALLEA 138
DL+ ++L G T GRFE G L DEI P Q+ LL
Sbjct: 199 IPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGT----LFLDEIGDMPMDAQTRLLRV 254

Query: 139 MGERQVT-VGRHTYALPQLFLVMATQNPIEQ---EGTF 172
+ + + T VG T + +V AT ++Q +G F
Sbjct: 255 LQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLF 292


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS04700  BCTERIALGSPD  217    7e-64    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 217 bits (553), Expect = 7e-64
Identities = 113/528 (21%), Positives = 195/528 (36%), Gaps = 63/528 (11%)

Query: 141 DSLAYQTGNEYVVEITPRKGQPAVGGVSVSAVTQAAAQIAARGYSGRPVTFNFQDVPVRT 200
A G+E V + P + V+ + Q+ G V
Sbjct: 117 SDAAPGIGDEVVTRVVP------LTNVAARDLAPLLRQLNDNAGVGSVV-----HYEPSN 165

Query: 201 VLQLIAEESNLN----IVASDTVQGNVTLRLMNVPWDQALDIVLRAKGLDKRRDGGVVWV 256
VL + + + IV G+ ++ + + W A D+V L+K +
Sbjct: 166 VLLMTGRAAVIKRLLTIVERVDNAGDRSVVTVPLSWASAADVVKLVTELNKDTSKSALPG 225

Query: 257 APQPELAKFEQDKEDARIAIENREDLITDYVQ----------------INYHNAAVIFKA 300
+ + E+ N I ++ + Y A+ + +
Sbjct: 226 SMVANVVADERTNAVLVSGEPNSRQRIIAMIKQLDRQQATQGNTKVIYLKYAKASDLVEV 285

Query: 301 LTEAKGIGGGGGGGGQGGQGGAGQQDNGFLSPRGRLVADERTNTLMISDIPKKVAQMREL 360
L GI Q + A N + A +TN L+++ P + + +
Sbjct: 286 L---TGISSTMQSEKQAAKPVAALDKNIIIK------AHGQTNALIVTAAPDVMNDLERV 336

Query: 361 ISHIDRPVDQVLIESRIVIATDTFARDLGARFGVTGATGRGILSGSLESNVNYLNTSAQS 420
I+ +D QVL+E+ I D +LG ++ A + L + T+
Sbjct: 337 IAQLDIRRPQVLVEAIIAEVQDADGLNLGIQWANKNAGMTQFTNSGLPIS-----TAIAG 391

Query: 421 RLEQANGGQVTTLPAHLFPSGLNVDLGAGGFTNSGAAGLAYTLLGSHFNLDIELSAMQEE 480
+ G V++ S L+ G G N + L+A+
Sbjct: 392 ANQYNKDGTVSSS----LASALSSFNGIAAGFYQG-------------NWAMLLTALSSS 434

Query: 481 GRGEVVSNPRIVTANQREGVIKQGREIGYVTISGAGVAGGGSQANVQFKEVLLELKVTPT 540
+ ++++ P IVT + E G+E+ +T S +G V+ K V ++LKV P
Sbjct: 435 TKNDILATPSIVTLDNMEATFNVGQEVPVLTGSQTT-SGDNIFNTVERKTVGIKLKVKPQ 493

Query: 541 ITNDNRVFLNMNVKKDEVARFITLPQYGTVPEINRREVNTAVLVADGETVVIGGVYEFTD 600
I + V L + + VA + N R VN AVLV GETVV+GG+ + +
Sbjct: 494 INEGDSVLLEIEQEVSSVADAASSTSSDLGATFNTRTVNNAVLVGSGETVVVGGLLDKSV 553

Query: 601 RESVAKVPFLGDIPFLGNLFKKRGRSKEKAELLVFVTPKVLRVASAAR 648
++ KVP LGDIP +G LF+ + K L++F+ P V+R R
Sbjct: 554 SDTADKVPLLGDIPVIGALFRSTSKKVSKRNLMLFIRPTVIRDRDEYR 601



Score = 51.1 bits (122), Expect = 9e-09
Identities = 31/208 (14%), Positives = 75/208 (36%), Gaps = 29/208 (13%)

Query: 175 AAAQIAARGYSGRPVTFNFQDVPVRTVLQLIAEESNLNIVASDTVQGNVTLR----LMNV 230
A + R + + +F+ ++ + +++ N ++ +V+G +T+R L
Sbjct: 16 IFAALLFRPAAAEEFSASFKGTDIQEFINTVSKNLNKTVIIDPSVRGTITVRSYDMLNEE 75

Query: 231 PWDQALDIVLRAKGLDK-RRDGGVVWVAPQPELAKFEQDKEDARIAIENREDLITDYVQI 289
+ Q VL G + GV+ V + AK + A ++++T V +
Sbjct: 76 QYYQFFLSVLDVYGFAVINMNNGVLKVVRSKD-AKTAAVPVASDAAPGIGDEVVTRVVPL 134

Query: 290 NYHNAAVIFKALTEAKGIGGGGGGGGQGGQGGAGQQDNGFLSPRGRLVADERTNTLMISD 349
A + L + G G +V E +N L+++
Sbjct: 135 TNVAARDLAPLLRQLNDNAGV-----------------------GSVVHYEPSNVLLMTG 171

Query: 350 IPKKVAQMRELISHIDRPVDQVLIESRI 377
+ ++ ++ +D D+ ++ +
Sbjct: 172 RAAVIKRLLTIVERVDNAGDRSVVTVPL 199


69  XC_RS05290  XC_RS05340  N        Y        Y  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
XC_RS05290  -2      17       -1.179449  N-acetyltransferase GCN5
XC_RS05295  1       27       -5.532443  CopG family transcriptional regulator
XC_RS05300  1       31       -6.359346  hypothetical protein
XC_RS05305  2       36       -7.813004  transposase
XC_RS05310  3       32       -8.395776  dephospho-CoA kinase
XC_RS05315  3       22       -6.770367  type 4 prepilin-like proteins leader
XC_RS05320  1       17       -5.066009  type II secretion system protein F
XC_RS05325  0       12       -2.478403  pilin
XC_RS05330  0       14       -1.747628  pilin
XC_RS05335  -1      16       -0.974633  type II secretory protein GspE
XC_RS05340  -1      21       -0.005775  chemotaxis protein CheY
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS05290  SACTRNSFRASE  29     0.005    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 29.1 bits (65), Expect = 0.005
Identities = 15/63 (23%), Positives = 26/63 (41%), Gaps = 2/63 (3%)

Query: 89 LAVDQSLHGRGFGRALMQDAGKRILHAADTIGIRGLLVHALSADAKAFYERIGFEPSPLD 148
+AV + +G G AL+ A + G+ L ++ A FY + F +D
Sbjct: 95 IAVAKDYRKKGVGTALLHKAIEWAKE-NHFCGLM-LETQDINISACHFYAKHHFIIGAVD 152

Query: 149 PMV 151
M+
Sbjct: 153 TML 155


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS05315  PREPILNPTASE  330    e-116    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 330 bits (849), Expect = e-116
Identities = 130/282 (46%), Positives = 175/282 (62%), Gaps = 1/282 (0%)

Query: 1 MAFLDQHPGLGFPAAAGLGLLIGSFLNVVILRLPKRMEWQWRRDAREILELPDI-YEPPP 59
+ P L F L+IGSFLNVVI RLP +E +W+ + R D + PP
Sbjct: 5 LELAHGLPWLYFSLVFLFSLMIGSFLNVVIHRLPIMLEREWQAEYRSYFNPDDEGVDEPP 64

Query: 60 PGIVVEPSHDPVTGDKLKWWENIPLFSWLMLRGKSRYSGKPISIQYPLVELLTSILCVAS 119
++V S P + ENIPL SWL LRG+ R PIS +YPLVELLT++L VA
Sbjct: 65 YNLMVPRSCCPHCNHPITALENIPLLSWLWLRGRCRGCQAPISARYPLVELLTALLSVAV 124

Query: 120 VWRFGFGWQGFGAIVLSCFLVAMSGIDLRHKLLPDQLTLPLMWLGLVGSMDNLYMPAKPA 179
GW A++L+ LVA++ IDL LLPDQLTLPL+W GL+ ++ ++ A
Sbjct: 125 AMTLAPGWGTLAALLLTWVLVALTFIDLDKMLLPDQLTLPLLWGGLLFNLLGGFVSLGDA 184

Query: 180 LLGAAVGYVSLWTVWWLFKQLTGKEGMGHGDFKLLAALGAWCGLKGILPIILISSLVGAV 239
++GA GY+ LW+++W FK LTGKEGMG+GDFKLLAALGAW G + + ++L+SSLVGA
Sbjct: 185 VIGAMAGYLVLWSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALPIVLLLSSLVGAF 244

Query: 240 LGSIWLFAKGRDRATPIPFGPYLAIAGWVVFFWGNDLVDGYL 281
+G + + ++ PIPFGPYLAIAGW+ WG+ + YL
Sbjct: 245 MGIGLILLRNHHQSKPIPFGPYLAIAGWIALLWGDSITRWYL 286


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS05320  BCTERIALGSPF  380    e-132    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 380 bits (978), Expect = e-132
Identities = 118/407 (28%), Positives = 219/407 (53%), Gaps = 9/407 (2%)

Query: 20 MVPFVWEGTDKRGIKMKGEQPARNANMLRAELRRQGITPLVV-----KTKPKPLFGAA-- 72
M + ++ D +G K +G Q A +A R LR +G+ PL V + G +
Sbjct: 1 MAQYHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLSLR 60

Query: 73 -GKKISSKDIAFFSRQMATMMKSGVPIVGSLEIIGEGHKNPRMKKMVGQIRTDIEGGSSL 131
++S+ D+A +RQ+AT++ + +P+ +L+ + + + P + +++ +R+ + G SL
Sbjct: 61 RKIRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSL 120

Query: 132 HEAISRHPVQFDDLYRNLVRAGEGAGVLETVLDTVANYKENIEALKGKIKKALFYPAMVM 191
+A+ P F+ LY +V AGE +G L+ VL+ +A+Y E + ++ +I++A+ YP ++
Sbjct: 121 ADAMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLT 180

Query: 192 AVAIIVSGILLVFVVPQFEDVFKGFGAELPAFTQMIVAASRFMVSYWWLMLLGSIAAIAG 251
VAI V ILL VVP+ + F LP T++++ S + ++ MLL +A
Sbjct: 181 VVAIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMA 240

Query: 252 FIFAYKRSPRMRHGMDRLVLKVPVIGQIMHNSSIARFARTTAVTFKAGVPLVEALSIVAG 311
F R + R R +L +P+IG+I + AR+ART ++ + VPL++A+ I
Sbjct: 241 FRVML-RQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGD 299

Query: 312 ATGNKVYEEAVLRMRDDVSVGYPVNMAMKQVNLFPHMVVQMTSIGEEAGALDAMLFKVAE 371
N + D V G ++ A++Q LFP M+ M + GE +G LD+ML + A+
Sbjct: 300 VMSNDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAAD 359

Query: 372 YFEQEVNNAVDALSSLLEPLIMVFIGTIVGGMVIGMYLPIFKLGSVV 418
++E ++ + L EPL++V + +V +V+ + PI +L +++
Sbjct: 360 NQDREFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLM 406


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS05325  BCTERIALGSPG  46     3e-09    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 46.0 bits (109), Expect = 3e-09
Identities = 22/74 (29%), Positives = 39/74 (52%), Gaps = 2/74 (2%)

Query: 1 MKKQNGFTLIELMIVVAIIAILAAIALPAYQDYLARSQVSEGLSLASGAKTAVAETYANT 60
KQ GFTL+E+M+V+ II +LA++ +P ++ + +S + A+ +
Sbjct: 4 TDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKLDN 63

Query: 61 GAFPATNAAAGLEA 74
+P TN GLE+
Sbjct: 64 HHYPTTN--QGLES 75


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS05330  BCTERIALGSPG  44     1e-08    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 44.5 bits (105), Expect = 1e-08
Identities = 20/72 (27%), Positives = 38/72 (52%), Gaps = 2/72 (2%)

Query: 2 RVRGFTLIELMIVVAVIAILAAIALPAYQDYLVRAQVSEGLSLASGAKVAVEEFHWAKSA 61
+ RGFTL+E+M+V+ +I +LA++ +P +A + +S + A++ +
Sbjct: 6 KQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKLDNHH 65

Query: 62 SPSTNIEAGLGS 73
P+T GL S
Sbjct: 66 YPTT--NQGLES 75


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
XC_RS05340  HTHFIS  507    e-180    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 507 bits (1308), Expect = e-180
Identities = 166/474 (35%), Positives = 257/474 (54%), Gaps = 17/474 (3%)

Query: 6 SALVVDDERDIRELLVLTLGRMGLRISTAANLAEARELLASNPYDLCLTDMRLPDGNGIE 65
+ LV DD+ IR +L L R G + +N A +A+ DL +TD+ +PD N +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 66 LVTEIARQYPQTPVAMITAFGSMDLAVEALKAGAFDFVSKPVDISVLRGLVKHALELNNR 125
L+ I + P PV +++A + A++A + GA+D++ KP D++ L G++ AL R
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 126 DRPAPPPPPLEQASRLLGDSTAMESLRSTIGKVARSQAPVYIVGESGVGKELVARTIHEQ 185
RP+ + L+G S AM+ + + ++ ++ + I GESG GKELVAR +H+
Sbjct: 125 -RPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDY 183

Query: 186 GARAAGPFIPVNCGAIPAELMESEFFGHKKGSFTGAHADKPGLFQAAHGGTLFLDEVAEL 245
G R GPF+ +N AIP +L+ESE FGH+KG+FTGA G F+ A GGTLFLDE+ ++
Sbjct: 184 GKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDM 243

Query: 246 PLQMQVKLLRAIQEKSVRPVGASGETLVDVRILSATHKDLGDLVSDGRFRHDLYYRINVI 305
P+ Q +LLR +Q+ VG DVRI++AT+KDL ++ G FR DLYYR+NV+
Sbjct: 244 PMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVV 303

Query: 306 ELRVPPLRERSGDLPQLAAAIIARLARSHGRPIPLLTQSALDALDTYGFPGNVRELENIL 365
LR+PPLR+R+ D+P L + + + G + Q AL+ + + +PGNVRELEN++
Sbjct: 304 PLRLPPLRDRAEDIPDLVRHFVQQAEK-EGLDVKRFDQEALELMKAHPWPGNVRELENLV 362

Query: 366 ERALALAEDDQISASDLRLPAH---------------GGHRLAASPGSAAIEPREAVVDI 410
R AL D I+ + G ++ + + + D
Sbjct: 363 RRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDA 422

Query: 411 DPASSALPSYIEQLERAAIQKALEENRWNKTKTAAQLGITFRALRYKLKKLGME 464
P S + ++E I AL R N+ K A LG+ LR K+++LG+
Sbjct: 423 LPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVS 476


70  XC_RS05550  XC_RS05565  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias   Product
XC_RS05550  4       13       2.200684  diaminopimelate decarboxylase
XC_RS05555  3       14       2.082341  iron transporter
XC_RS05560  1       12       1.908158  transporter
XC_RS05565  1       11       0.772303  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
XC_RS05550  ALARACEMASE  31     0.007    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 31.3 bits (71), Expect = 0.007
Identities = 47/230 (20%), Positives = 76/230 (33%), Gaps = 40/230 (17%)

Query: 31 DLAALDAHAAWMRAQLPPGCTLFYAAKANA----EPQILQTLAPYVDGFEAASGGE-LAW 85
DL AL + + +R Q ++ KANA +I + DGF + E +
Sbjct: 10 DLQALKQNLSIVR-QAATHARVWSVVKANAYGHGIERIWSAI-GATDGFALLNLEEAITL 67

Query: 86 LHQQQPDAALLFGGPGKLES-ELAQAVRLPDCTVHVESLGELQRLAAIARQIAQPDRRRI 144
+ L+ G + E+ RL C VH + AR + +
Sbjct: 68 RERGWKGPILMLEGFFHAQDLEIYDQHRLTTC-VHSN---WQLKALQNARL-----KAPL 118

Query: 145 PVFLRMNIAVPGAQTTRLMMGGQPSPFGLDPDDLDAALLQLRASPALELRGFHFHLMSHQ 204
++L++N + RL G PD + QLRA + H
Sbjct: 119 DIYLKVNSGM-----NRL---------GFQPDRVLTVWQQLRAMANVGEMTLMSHFAE-- 162

Query: 205 RDAAAQLHLIAAYLRTVQHWRQRHGLGPLLVNAGGGFGVDYLTPEASFDW 254
A I+ + ++ + L N+ PEA FDW
Sbjct: 163 ---AEHPDGISGAMARIEQAAEGLECRRSLSNSAATL----WHPEAHFDW 205


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS05555  PF04183  278    4e-88    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 278 bits (712), Expect = 4e-88
Identities = 109/523 (20%), Positives = 190/523 (36%), Gaps = 51/523 (9%)

Query: 100 DGHALARSLLQALGSTHAVNPELLAQSDNSVAIT----AALL--RQAQAATPTGDTLIDA 153
D LA++LL L +++ +A+ + T LL R+ +A+ + D
Sbjct: 69 DEPVLAQTLLMQLKQVLSMSDATVAEHMQDLYATLLGDLQLLKARRGLSASDLINLNADR 128

Query: 154 EQALLWGHALHPTPKSREGVALAQVLACAPEARAAFALYWFRIDRRLLRVQ---GRDVRA 210
Q LL GH K R G + APE F L+W + R + + D+
Sbjct: 129 LQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRCDNEMDIHQ 188

Query: 211 TLQ---------------QLSGHDDFY---PCHPWEVQRLRDDPLLQELQARGAITPVGV 252
L Q +G D + P HPW+ Q+ + + A G + +G
Sbjct: 189 LLTAAMDPQEFARFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADF-AEGRMVSLGE 247

Query: 253 LGEPLRPTSSVRTLYHP--ALAYFLKCSVHVRLTNCVRKNAWYELESAVALTHLLAPSWQ 310
G+ S+RTL + +K + + T+C R + + + L +
Sbjct: 248 FGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASRWLQQVFA 307

Query: 311 ALAAQV-PGFDVMLEPAATSLDVTQFDPALAAADPLATRALAESFGILYRQAPTAAQRAR 369
A V G ++ EPAA + + AA A E G+++R+ P +
Sbjct: 308 TDATLVQSGAVILGEPAAGYVSHEGY-----AALARAPYRYQEMLGVIWRENPCRWLKPD 362

Query: 370 WRPQVAAALFTCDAQGHSVAAAALRAHSVRRLDRRTATVLWFRAYAGLLLDGVWSALFQH 429
P + A L CD +A A + R T W +++ ++ L ++
Sbjct: 363 ESPVLMATLMECDENNQPLAGA-----YIDRSGLDAET--WLTQLFRVVVVPLYHLLCRY 415

Query: 430 GIALEPHLQNTVIGFSDGWPTRVWIRDLEGT-KLLAQRWPAARLHGVGERARQSLYYTPE 488
G+AL H QN + +G P RV ++D +G +L+ + +P + + + R
Sbjct: 416 GVALIAHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPE--MDSLPQEVRDVTSRLSA 473

Query: 489 QAWNRVAYCALVNNLAEAIFHLSDGDAVLEARLWQCVAEIAARWQQRNGAQPALQGLID- 547
+ I L V E R +Q +A + + + +++ L
Sbjct: 474 DYLIHDLQTGHFVTVLRFISPLMVRLGVPERRFYQLLAAVLSDYMKKHPQMSERFALFSL 533

Query: 548 GAPLPGKNNLGTRLLQRADRQSDYTALPNPIA----PMHAVQQ 586
P + L L D LPN + P+ V Q
Sbjct: 534 FRPQIIRVVLNPVKLTWPDLDGGSRMLPNYLEDLQNPLWLVTQ 576


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS05560  TCRTETA  61     3e-12    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 60.6 bits (147), Expect = 3e-12
Identities = 52/156 (33%), Positives = 70/156 (44%), Gaps = 3/156 (1%)

Query: 20 LGMPLFLPQVLAELAPSAAI-GWSGVLYVLPTLCTALTASTWGRLADRYGRKRSLLRAQL 78
L MP+ LP +L +L S + G+L L L A G L+DR+GR+ LL +
Sbjct: 23 LIMPV-LPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLA 81

Query: 79 GLALGFAIAGFAPTLPWLVVGLIVQGTCGGSLAAANAYLASQPQAGPLARALDWTQYSAR 138
G A+ +AI AP L L +G IV G G + A A AY+A AR +
Sbjct: 82 GAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDGDERARHFGFMSACFG 141

Query: 139 LAMVSAPALLGLAVALGPAQALYRALALLPLVAFAL 174
MV+ P L GL P + A A L + F
Sbjct: 142 FGMVAGPVLGGLMGGFSPHAPFFAA-AALNGLNFLT 176


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS05565  PF04183  151    2e-41    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 151 bits (384), Expect = 2e-41
Identities = 90/384 (23%), Positives = 139/384 (36%), Gaps = 49/384 (12%)

Query: 121 DEAACAAAHRGLARDAYA-AQAPHLAQALRAADAAERAYRCDQLASYRD-HPFYPTARAK 178
+A A + L Q + L A+D D+L HP + + +
Sbjct: 88 SDATVAEHMQDLYATLLGDLQLLKARRGLSASDLINLNA--DRLQCLLSGHPKFVFNKGR 145

Query: 179 AGLAPAELRAYAPEFAPTFALQWLAIPQAQVSCTSTPPAELWPDLARVGLPAELAQTHQL 238
G L YAPE+A TF L WLA+ + + ++ L P E A+ Q+
Sbjct: 146 RGWGKEALERYAPEYANTFRLHWLAVKREHMIWRCDNEMDIHQLLTAAMDPQEFARFSQV 205

Query: 239 W------------PVHPLVWARLEQDGFALPAGSVRAPLTYLP-----VRPTLSVRTLVP 281
W PVHP W + F A + L S+RTL
Sbjct: 206 WQENGLDHNWLPLPVHPWQWQQKIATDFI--ADFAEGRMVSLGEFGDQWLAQQSLRTLTN 263

Query: 282 LQHPH-LHLKLPIPMRTLGALNLRLIKPSTLYDGHWLEQALRRIDAHDAALRGRCVFV-D 339
L +KLP+ + R I + G + L+++ A DA L +
Sbjct: 264 ASRRGGLDIKLPLTIYNTSC--YRGIPGRYIAAGPLASRWLQQVFATDATLVQSGAVILG 321

Query: 340 ESHGGHV-------------GQTRHLAYLLRRYPPLDTA---TLVPVAALCALMPDGRPM 383
E G+V L + R P + V +A L + +P+
Sbjct: 322 EPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLKPDESPVLMATLMECDENNQPL 381

Query: 384 AIHLAETFSGGDLLAWWRDYTELLLAVHLRLWLRYGIALEANQQNSVLVYAPGQPTRLLM 443
A + SG D W +++ L RYG+AL A+ QN L G P R+L+
Sbjct: 382 AGAYIDR-SGLDAETWLTQLFRVVVVPLYHLLCRYGVALIAHGQNITLAMKEGVPQRVLL 440

Query: 444 KDN-DAARIALPQLRAQLPEIDTL 466
KD R+ ++ + PE+D+L
Sbjct: 441 KDFQGDMRL----VKEEFPEMDSL 460


71  XC_RS05745  XC_RS05785  N        Y        Y  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
XC_RS05745  1       21       -0.771339  membrane protein
XC_RS05750  -1      20       -1.143369  hypothetical protein
XC_RS05755  0       19       -1.452946  7-carboxy-7-deazaguanine synthase
XC_RS05760  -2      18       -2.151903  7-cyano-7-deazaguanine synthase
XC_RS05770  -3      19       -1.703787  *hypothetical protein
XC_RS05775  -3      15       -1.207691  hypothetical protein
XC_RS05780  -2      12       -0.481355  histidine kinase
XC_RS05785  -1      11       -0.322786  chemotaxis protein CheY
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
XC_RS05745  OMPADOMAIN  108    4e-31    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 108 bits (272), Expect = 4e-31
Identities = 36/112 (32%), Positives = 51/112 (45%), Gaps = 11/112 (9%)

Query: 67 VYFDLDQDSLKPEFQAIMACHAKYLR--DRPSSRITLQGNADERGSREYNMGLGERRGNA 124
V F+ ++ +LKPE QA + L D + + G D GS YN GL ERR +
Sbjct: 221 VLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQS 280

Query: 125 VSSALQAAGGSASQLTVVSYGEERPVCTETSE---------SCWSQNRRVEI 167
V L + G A +++ GE PV T + C + +RRVEI
Sbjct: 281 VVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEI 332


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
XC_RS05750  RTXTOXIND  31     0.004    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.3 bits (71), Expect = 0.004
Identities = 20/72 (27%), Positives = 35/72 (48%), Gaps = 7/72 (9%)

Query: 30 RVAVLEQQQANSQANNDL---LNQLQQARSDLQALRATVEQLQHD--NEQLKQ--QSKDQ 82
+ AVLEQ+ +A N+L +QL+Q S++ + + + + NE L + Q+ D
Sbjct: 251 KHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDN 310

Query: 83 YLDLDGRLNRLE 94
L L + E
Sbjct: 311 IGLLTLELAKNE 322


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS05780  PF06580  33     0.002    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 32.5 bits (74), Expect = 0.002
Identities = 35/212 (16%), Positives = 71/212 (33%), Gaps = 26/212 (12%)

Query: 126 ERKQHEQHLQLLINELN-HRVKNSLVMVQSLARQSLKNAESLDDANEKIDARLMALSS-A 183
E L L ++N H + N+L +++L + A + L +LS
Sbjct: 155 ASMAQEAQLMALKAQINPHFMFNALNNIRALILEDPTKAREM----------LTSLSELM 204

Query: 184 HNTLTREN--WVS-ADIAELTRDAVELCESHQGQRFELHGSSCRLDP--RRALALAMALH 238
+L N VS AD + ++L R + +++P M +
Sbjct: 205 RYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFE---NQINPAIMDVQVPPMLVQ 261

Query: 239 ELCTNAVKHG-ALSSAEGRVRIAWECTMREGENRLELLWQESGGPEVQP-PQRKGFGSRL 296
L N +KHG A G++ + + + L + +G ++ + G G +
Sbjct: 262 TLVENGIKHGIAQLPQGGKILL----KGTKDNGTVTLEVENTGSLALKNTKESTGTGLQN 317

Query: 297 LERGLKHDLNGDVSWVFDTAGVSYRVSLALPA 328
+ L+ + + +P
Sbjct: 318 VRERLQMLYGTEAQIKLSEKQGKVNAMVLIPG 349


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
XC_RS05785  HTHFIS  51     8e-11    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 51.0 bits (122), Expect = 8e-11
Identities = 23/119 (19%), Positives = 52/119 (43%), Gaps = 9/119 (7%)

Query: 3 ARVLVVEDESLVAMLLEDCLAELGYEVAGTVGDVSTALEAVKKGNLDMAVLDVNLGGTMS 62
A +LV +D++ + +L L+ GY+V + +T + G+ D+ V DV + +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVR-ITSNAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 63 FPIAEELDAR--GVPYIFVTGYASSG-IPEKFRHR--HGVQKPFRFRDLQDALALLQKA 116
F + + +P + ++ + + + KPF DL + + ++ +A
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPF---DLTELIGIIGRA 118


72  XC_RS05905  XC_RS05970  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias   Product
XC_RS05905  -1      10       3.347459  hypothetical protein
XC_RS05910  -2      10       2.407597  glycosyl transferase
XC_RS05915  -2      10       2.720671  penicillin-binding protein 1B
XC_RS05920  -1      10       2.912491  hypothetical protein
XC_RS05925  -2      9        2.082082  helicase
XC_RS05930  -2      10       1.477208  hypothetical protein
XC_RS05935  -3      9        0.874901  ADP-ribosylglycohydrolase
XC_RS05940  0       11       0.468413  cell envelope biogenesis protein TonB
XC_RS05945  2       13       1.324154  glutathione synthetase
XC_RS05950  2       13       1.590109  pilus response regulator PilG
XC_RS05955  2       13       2.001851  chemotaxis protein CheY
XC_RS05960  1       14       2.156294  pilus biogenesis protein
XC_RS05965  0       10       2.120610  pilus biogenesis protein
XC_RS05970  0       9        2.182755  chemotaxis protein CheY
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
XC_RS05905  RTXTOXIND  31     0.016    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.6 bits (69), Expect = 0.016
Identities = 33/210 (15%), Positives = 61/210 (29%), Gaps = 15/210 (7%)

Query: 144 LRLRQLQARRAGLDALVAQAVAAQQQGRLDGTPDSA--LPLYQRVLSLAPDRTDALEGRE 201
L+L L A L + A +Q R S L + L P + E
Sbjct: 125 LKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEV 184

Query: 202 DALTDLLAQARHALARDALAEAAALLAAAKRYDAGHADVPSTEGQYHRLMDQRRQRADTL 261
LT L+ + + L A + E R+ R +L
Sbjct: 185 LRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLS-RVEKSRLDDFSSL 243

Query: 262 LRRGRLA-PAVRDFTAVLAAEPGDAQAQRG----VERVAAEYAGQATRQAGDFQFDAATQ 316
L + +A AV + + + + +E + F+ + +
Sbjct: 244 LHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDK 303

Query: 317 SLQQARALVPSGPSIAAAEQAIARARDAQR 346
L+Q +I +A+ + Q+
Sbjct: 304 -LRQTTD------NIGLLTLELAKNEERQQ 326


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS05915  PF05272  30     0.036    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.036
Identities = 19/95 (20%), Positives = 27/95 (28%), Gaps = 8/95 (8%)

Query: 470 EAQRQVGSLLKPFVY--LLALASPDRWALSSWVDDSPVTVQLARGKTWSPGNSDNRSHGT 527
+ + LLKP + AL S A D+ R W
Sbjct: 439 RLRLRGRWLLKPRRAALIEALRSAPALAGCVAFDELREQPVAVRAFPWRKAPGPLEDADV 498

Query: 528 VRLIDALAHSYNQATVRVGMQVGPERVAQLIQVLA 562
+RL D + +Y + Q I V A
Sbjct: 499 LRLADYVETTYGTGEAS------AQTTEQAINVAA 527


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
XC_RS05920  BACINVASINC  29     0.008    Salmonella/Shigella invasin protein C signature.
		>BACINVASINC#Salmonella/Shigella invasin protein C signature.

Length = 409

Score = 29.1 bits (64), Expect = 0.008
Identities = 24/97 (24%), Positives = 39/97 (40%), Gaps = 6/97 (6%)

Query: 69 RETAKAKRQIGDLAAAAAALDQALGLVSGDPAILQERAEVAVLQGDWNASERFAKQAIEL 128
R A+ + GDL + + S A QER+E + Q + + + +A E
Sbjct: 315 RIDARKMQMTGDLIMKNSVTVGGIAGASRQYAATQERSEQQISQVNNRVASTASDEARES 374

Query: 129 GSKTGPLCRRHWATIEQARLARGEKENAASAKAQIVG 165
K+ L + T+E ++ ASA A I G
Sbjct: 375 SRKSTSLIQEMLKTMESI------NQSKASALAAIAG 405


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS05940  PF03544  118    2e-34    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 118 bits (298), Expect = 2e-34
Identities = 41/262 (15%), Positives = 86/262 (32%), Gaps = 37/262 (14%)

Query: 11 MDERQRLTATLVISLLLHGLLILGVGFAVSEDAPLVPTLDVIFSQTSTPLTPKQADFLAQ 70
+D +R ++S+ +HG ++ G+ + +P P P +A
Sbjct: 8 LDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPA----------PAQPISVTMVAP 57

Query: 71 ANQQGGGNHDTAQRPRDSQPGVVPQDRNGLAPQAQRATTVQAPLPTQTRVVSSRRGEQAV 130
A D P P+ P+ + P +
Sbjct: 58 A--------DLEPPQAVQPP---PEPVVEPEPEPEPIPEPPKEAPVV------------I 94

Query: 131 PTPQPNPQTDPLSPADAQRVQRDAEMARLAAEVHLRSEQYAKRPNRKFVSASTREYAYAN 190
P+P P+ P ++ +RD + + A+ + +A+++
Sbjct: 95 EKPKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVA 154

Query: 191 YLRAWVDRAERVGNLNYPDEARRRRLGGKVVITVGVRRDGSVESSRVLVSSGTPVLDAAA 250
+ R YP A+ R+ G+V + V DG V++ ++L + + +
Sbjct: 155 SGPRALSRN----QPQYPARAQALRIEGQVKVKFDVTPDGRVDNVQILSAKPANMFEREV 210

Query: 251 LRVVQLAQPFPPLPRSKDDVDI 272
++ + P P S V+I
Sbjct: 211 KNAMRRWRYEPGKPGSGIVVNI 232


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
XC_RS05950  HTHFIS  73     2e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 73.3 bits (180), Expect = 2e-18
Identities = 28/115 (24%), Positives = 49/115 (42%), Gaps = 2/115 (1%)

Query: 15 KVMVIDDSKTIRRTAETLLKREGCEVVTATDGFEALAKIADQQPQIIFVDIMMPRLDGYQ 74
++V DD IR L R G +V ++ IA ++ D++MP + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 75 TCALIKGNQLFKSTPVIMLSSKDGLFDKARGRIVGSEQYLTKPFTREELLSAIRT 129
IK + PV+++S+++ + G+ YL KPF EL+ I
Sbjct: 65 LLPRIK--KARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
XC_RS05955  HTHFIS  88     6e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 88.3 bits (219), Expect = 6e-24
Identities = 36/116 (31%), Positives = 57/116 (49%), Gaps = 2/116 (1%)

Query: 2 ARIILIEDSPTDRAVFSQWLEKAGHTVVATDNAEEGLELIRSQAPDLVLMDVVLPGMSGF 61
A I++ +D R V +Q L +AG+ V T NA I + DLV+ DVV+P + F
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 QATRALARDQATKDIPVLLVSTKGMETDKAWGLRQGASDYIVKPPREDDLIARIKQ 117
+ + + D+PVL++S + +GA DY+ KP +LI I +
Sbjct: 64 DLLPRIKKAR--PDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
XC_RS05970  HTHFIS  68     3e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 67.5 bits (165), Expect = 3e-13
Identities = 24/116 (20%), Positives = 53/116 (45%), Gaps = 2/116 (1%)

Query: 2202 QVPLVMVVDDSLTMRKVTSRVLERHNLDVTTARDGVEALELLEERVPDLMLLDIEMPRMD 2261
++V DD +R V ++ L R DV + + DL++ D+ MP +
Sbjct: 2 TGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 2262 GYELATAMRADPRFKAVPIVMITSRSGEKHRQRAFEIGVQRYLGKPYQELDLMRNV 2317
++L ++ +P++++++++ +A E G YL KP+ +L+ +
Sbjct: 62 AFDLLPRIKK--ARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGII 115


73  XC_RS06330  XC_RS06345  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias   Product
XC_RS06330  -1      11       1.250967  histidine kinase
XC_RS06335  -2      10       1.410792  histidine kinase
XC_RS06340  -2      9        1.390485  histidine kinase
XC_RS06345  -1      16       2.198896  MFS transporter
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
XC_RS06330  HTHFIS  68     8e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 68.3 bits (167), Expect = 8e-14
Identities = 22/116 (18%), Positives = 50/116 (43%), Gaps = 4/116 (3%)

Query: 1059 LLLVEDDVTVAQVIVGLLQGRGHHVTHVLHGLAALAEVSARSFDAGLCDLDLPGLDGAAL 1118
+L+ +DD + V+ L G+ V + ++A D + D+ +P + L
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 1119 VAQLRARGVCFPIVAVTARADADAEPQALAAGCNGFLRKPL----TGDLLAQALAQ 1170
+ +++ P++ ++A+ +A G +L KP ++ +ALA+
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
XC_RS06335  HTHFIS  76     4e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 75.6 bits (186), Expect = 4e-16
Identities = 28/129 (21%), Positives = 58/129 (44%), Gaps = 2/129 (1%)

Query: 1052 RILLVEDDPTVAEVISGLLAGRGHRVVHAAHGLAALSEAVDGEFDVALLDLDLPGLDGFA 1111
IL+ +DD + V++ L+ G+ V ++ G+ D+ + D+ +P + F
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 1112 LAAQLRRLGHAFPLLAVTARADGDAQTQAQAAGFDGFLRKPVTADMLVEAIAAVQAVARE 1171
L ++++ P+L ++A+ +A G +L KP L+ I +A+A
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG--RALAEP 122

Query: 1172 RDRATPLLG 1180
+ R + L
Sbjct: 123 KRRPSKLED 131


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
XC_RS06340  HTHFIS  77     1e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 77.2 bits (190), Expect = 1e-16
Identities = 28/123 (22%), Positives = 51/123 (41%)

Query: 1071 RILLVEDEPTIAEVIVGLLRAQGHSVVHAPHGLAALTEAADNAFDLALLDLDLPGLDGFA 1130
IL+ +D+ I V+ L G+ V + A DL + D+ +P + F
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 1131 LARQLRVFGYDMPLIAVTARSDEAAEPTAQQAGFDSFLRKPLTGDMLADTIAEALRRVRP 1190
L +++ D+P++ ++A++ A + G +L KP L I AL +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 1191 REE 1193
R
Sbjct: 125 RPS 127


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS06345  TCRTETB  119    1e-31    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 119 bits (301), Expect = 1e-31
Identities = 84/408 (20%), Positives = 172/408 (42%), Gaps = 17/408 (4%)

Query: 17 LLWLVSLAIFMQMLDATIVNTALPSMARSLRESPLQMQSVVFSYALAVAMFIPASGWIAD 76
L+WL L+ F +L+ ++N +LP +A + P V ++ L ++ G ++D
Sbjct: 16 LIWLCILS-FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSD 74

Query: 77 RFGTRRTFLAAIIVFTLGSLLCAAAQQ-LPQLVAARVVQGIGGAMLLPVGRLAVLKTVAR 135
+ G +R L II+ GS++ L+ AR +QG G A + + V + + +
Sbjct: 75 QLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPK 134

Query: 136 ADFLRAMSFIAIPALIGPLIGPTLGGWLVEIASWHWVFLINLP-IGVIGFIAALKIMPDH 194
+ +A I +G +GP +GG + HW +L+ +P I +I +K++
Sbjct: 135 ENRGKAFGLIGSIVAMGEGVGPAIGGMIAH--YIHWSYLLLIPMITIITVPFLMKLLKKE 192

Query: 195 YGDARKRFDLVGYLMLAFGMVALSLALDGIAELGLRHAFVMLLAIGGLAALAGYWLHAAS 254
+ FD+ G ++++ G+V L + V+ I + H
Sbjct: 193 -VRIKGHFDIKGIILMSVGIVFFMLFTT-SYSISFLIVSVLSFLI--------FVKHIRK 242

Query: 255 APAALFPLALFKVASYRIGILGNLFARVGSGSMPFLIPLLLQVGLGMSPMNAG-LMMVPV 313
L K + IG+L ++P +++ +S G +++ P
Sbjct: 243 VTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPG 302

Query: 314 ALAGMAAKRAAVKLVGRFGYRRILMLNTVLVGLAMASFALISAGQPLWLRVLQLACFGAV 373
++ + LV R G +L + + ++ + + + ++ ++ + G +
Sbjct: 303 TMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGL 362

Query: 374 NSLQFTVMNTVTLRDLDRDQASPGNSLLSMVMMLATGFGAAAAGSLLA 421
+ + TV++T+ L + +A G SLL+ L+ G G A G LL+
Sbjct: 363 SFTK-TVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLS 409


74  XC_RS07070  XC_RS07105  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS07070  -1      12       0.706622    chemotaxis protein CheA
XC_RS07075  0       13       1.468925    anti-anti-sigma factor
XC_RS07080  -1      13       1.734863    transcription-repair coupling factor
XC_RS07085  -2      12       2.044059    acyl-CoA synthetase
XC_RS07090  -3      11       1.763992    hypothetical protein
XC_RS07095  -2      11       1.648590    transcriptional regulator
XC_RS07100  -2      13       1.296184    hypothetical protein
XC_RS07105  -2      12       1.316295    histidine kinase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS07070  PF06580  36     4e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.0 bits (83), Expect = 4e-04
Identities = 20/134 (14%), Positives = 38/134 (28%), Gaps = 51/134 (38%)

Query: 399 LVRNAMDHGIEPADVRVANGKPARGTVGLNAFHDSGSIVIQITDDGGGLNRDKILAKALE 458
LV N + HGI P G + L D+G++ +++ + G ++
Sbjct: 263 LVENGIKHGIAQ--------LPQGGKILLKGTKDNGTVTLEVENTGSLALKNTK------ 308

Query: 459 RGLVEPGRQLSDREVFAMIFEPGFSTAEKVTNLSGRGVGMDVVKRNITALRGC---VDID 515
G G+ V+ + L G + +
Sbjct: 309 ---------------------------------ESTGTGLQNVRERLQMLYGTEAQIKLS 335

Query: 516 STAGVGTTISVRLP 529
G + V +P
Sbjct: 336 EKQGKVNAM-VLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS07085  SACTRNSFRASE  35     5e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 35.3 bits (81), Expect = 5e-05
Identities = 11/53 (20%), Positives = 21/53 (39%)

Query: 109 ILVSSFVAGQGLGRQLMRKLVKWARRKYLDCLYGDVLQSNLPMLQLAESLGFK 161
I V+ +G+G L+ K ++WA+ + L + N+ F
Sbjct: 95 IAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFI 147


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
XC_RS07095  HTHFIS  83     1e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 83.0 bits (205), Expect = 1e-20
Identities = 37/136 (27%), Positives = 60/136 (44%), Gaps = 1/136 (0%)

Query: 6 ARVLVVEDETAIADTVLYALRSEGYVPEHCLLGREALERLRNDPADLVVLDVGLPDINGF 65
A +LV +D+ AI + AL GY + DLVV DV +PD N F
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 66 EVCRTLRS-FSEVPVIFLTARNDEIDRVLGLELGADDYMAKPFSPRELVARVRARLRRRS 124
++ ++ ++PV+ ++A+N + + E GA DY+ KPF EL+ + L
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 125 VAVAQAEPGWQPHGAF 140
++ E Q
Sbjct: 124 RRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS07105  PF06580  39     4e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 38.7 bits (90), Expect = 4e-05
Identities = 28/96 (29%), Positives = 42/96 (43%), Gaps = 24/96 (25%)

Query: 384 LLENA----IAFSPDGGTVQLRTQREDGQVQLLVEDRGSGVPDYAIERVFERFYSLARPQ 439
L+EN IA P GG + L+ +++G V L VE+ GS E
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKE------------- 309

Query: 440 TGQRSSGLGLPFVRE-VARLHGGD--VRLRNRDGGG 472
S+G GL VRE + L+G + ++L + G
Sbjct: 310 ----STGTGLQNVRERLQMLYGTEAQIKLSEKQGKV 341


75  XC_RS07145  XC_RS07175  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS07145  -1      11       0.167018    glutamate--tRNA ligase
XC_RS07150  0       10       0.467502    Fur family transcriptional regulator
XC_RS07155  -1      11       0.506392    transcriptional regulator
XC_RS07160  -1      11       0.825394    MexE family multidrug efflux RND transporter
XC_RS07165  -1      12       0.339101    multidrug efflux RND transporter permease
XC_RS07170  -1      12       1.568417    multidrug transporter
XC_RS07175  -1      13       1.338554    TetR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
XC_RS07145  adhesinmafb  30     0.021    Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 30.0 bits (67), Expect = 0.021
Identities = 23/74 (31%), Positives = 33/74 (44%), Gaps = 10/74 (13%)

Query: 367 PLETYDEAAVIKHLKLGAEVPLGKAREMLAALNEWSVEN------VSAALHAAAAALELG 420
PL + AVI L A G + A++ W EN V A + AAAA
Sbjct: 269 PLPAEGKFAVIGGLGSVA----GFEKNTREAVDRWIQENPNAAETVEAVFNVAAAAKVAK 324

Query: 421 MGKVAQPLRVAITG 434
+ K A+P + A++G
Sbjct: 325 LAKAAKPGKAAVSG 338


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS07155  HTHTETR  72     9e-18    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 72.0 bits (176), Expect = 9e-18
Identities = 31/211 (14%), Positives = 64/211 (30%), Gaps = 18/211 (8%)

Query: 1 MRVRTEEKRDTIVQAASEVFLELGFEGASMSQIAARVGGSKRTLYGYFGSKEELFVAVAK 60
+ +E R I+ A +F + G S+ +IA G ++ +Y +F K +LF +
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIW- 63

Query: 61 DMSDRYFDPLLDALSRSSGPI-ADALQRFGEDVLTFLCAPPNITSWQTIIGVSGRSDVGA 119
+ + + D L E ++ L + + ++ +
Sbjct: 64 ---ELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 120 LFFSAGQEEGLKRFAEYLQTQVDCGQLRCDDTMLAAQQFAALVEAETLMPCLFGALK--- 176
+ +R L+ + A+ A + + G +
Sbjct: 121 --GEMAVVQQAQRNLCLESYDRIEQTLK---HCIEAKMLPADLMTRRAAIIMRGYISGLM 175

Query: 177 -----NPSPEYLRDATRRAVELFLVGYACKP 202
P L+ R V + L Y P
Sbjct: 176 ENWLFAPQSFDLKKEARDYVAILLEMYLLCP 206


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
XC_RS07160  RTXTOXIND  40     1e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 40.2 bits (94), Expect = 1e-05
Identities = 24/109 (22%), Positives = 41/109 (37%), Gaps = 10/109 (9%)

Query: 66 EVRPQVGGIVQTRQFTEGGDVKAGQTLYQIDPATYRASYASAQATLAKAQANLRTARLKA 125
E++P IV+ EG V+ G L ++ A+A K Q++L ARL+
Sbjct: 98 EIKPIENSIVKEIIVKEGESVRKGDVLLKLTALG-------AEADTLKTQSSLLQARLEQ 150

Query: 126 DRYK-ELVQIKAISQQEGD--DTAATLGQAEADVAAGKASVETARINLA 171
RY+ I+ E D +E +V + ++
Sbjct: 151 TRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQ 199



Score = 37.5 bits (87), Expect = 8e-05
Identities = 17/102 (16%), Positives = 37/102 (36%), Gaps = 3/102 (2%)

Query: 99 TYRASYASAQATLAKAQANLRTARLKADRYKELVQIKAISQQEGDDTAATLGQAEADVAA 158
Y A L ++ L + KE + + ++Q ++ L Q ++
Sbjct: 256 EQENKYVEAVNELRVYKSQLEQIESEILSAKE--EYQLVTQLFKNEILDKLRQTTDNIGL 313

Query: 159 GKASVETARINLAFARMDAPISGRIGRSSV-TPGALVTANQA 199
+ + + AP+S ++ + V T G +VT +
Sbjct: 314 LTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAET 355



Score = 30.2 bits (68), Expect = 0.015
Identities = 13/34 (38%), Positives = 17/34 (50%), Gaps = 1/34 (2%)

Query: 65 AEVRPQVGGIVQTRQ-FTEGGDVKAGQTLYQIDP 97
+ +R V VQ + TEGG V +TL I P
Sbjct: 328 SVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVP 361


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS07165  ACRIFLAVINRP  1210   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1210 bits (3133), Expect = 0.0
Identities = 665/1034 (64%), Positives = 808/1034 (78%), Gaps = 6/1034 (0%)

Query: 1 MARFFIDRPIFAWVLAIIVMLAGILSIATLPIAQYPSIAPPAVAITANYPGASAQTLEDT 60
MA FFI RPIFAWVLAII+M+AG L+I LP+AQYP+IAPPAV+++ANYPGA AQT++DT
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTQVIEQKMKGLDNLSYMASTSESSGAVTITLTFENGTDPDTAQVQVQNKLSLATALLPQ 120
VTQVIEQ M G+DNL YM+STS+S+G+VTITLTF++GTDPD AQVQVQNKL LAT LLPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 EVQQQGVTVTKSATNFLNVLAFTSEDGSMSDSDLSDYVAANVQETISRVQGVGDTTLFGS 180
EVQQQG++V KS++++L V F S++ + D+SDYVA+NV++T+SR+ GVGD LFG+
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 QYSMRIWLDPDKLNNFNLTPVDVRAAIQAQNAQVSAGQLGALPAVANQQLNATITAQTRL 240
QY+MRIWLD D LN + LTPVDV ++ QN Q++AGQLG PA+ QQLNA+I AQTR
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 KTAEEFENILLRTRTDGSQVRLRDVARIELGSESYNTVGRYNGKPAAGLAIKLATGANAL 300
K EEF + LR +DGS VRL+DVAR+ELG E+YN + R NGKPAAGL IKLATGANAL
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 DTVRAIDKSLEEQEKFFPPGMKVQKPYDTTPFVRISIEQVVHTLVEAVVLVFLVMYLFLQ 360
DT +AI L E + FFP GMKV PYDTTPFV++SI +VV TL EA++LVFLVMYLFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NFRATLIPTIAVPVVLLGTFGVLAVFGFTINTLTMFAMVLAIGLLVDDAIVVVENVERVM 420
N RATLIPTIAVPVVLLGTF +LA FG++INTLTMF MVLAIGLLVDDAIVVVENVERVM
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 AEEQLSPKEATRKSMDQITGALVGVALVLAAVFVPMAFFGGSTGVIYRQFSITIVSAMTL 480
E++L PKEAT KSM QI GALVG+A+VL+AVF+PMAFFGGSTG IYRQFSITIVSAM L
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SVLVAMVLTPALCATLLKP---GHGMATTGFFGWFNRVFDRSNGRYQGVVRHMLGKGWRY 537
SVLVA++LTPALCATLLKP H GFFGWFN FD S Y V +LG RY
Sbjct: 481 SVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY 540

Query: 538 MIAYVVIVALVVVGFMKTPVGFLPDEDQGTLFVLVQLPPGATDARTGAVLKQVEHHFLVD 597
++ Y +IVA +VV F++ P FLP+EDQG ++QLP GAT RT VL QV ++L +
Sbjct: 541 LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN 600

Query: 598 QKDSVAGIFAVSGFSFAGTGQNVGFAFVKLRPWDERTGKGQSVTDVAAKAGAFFATLRDA 657
+K +V +F V+GFSF+G QN G AFV L+PW+ER G S V +A +RD
Sbjct: 601 EKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDG 660

Query: 658 QVFAVAPPAVSELGNATGFDLMLQDRANLGHAALMQARNQLLAELSQD-KRLVAVRPNGQ 716
V PA+ ELG ATGFD L D+A LGH AL QARNQLL +Q LV+VRPNG
Sbjct: 661 FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGL 720

Query: 717 EDTPEFKLEIDPHKAEAMGVSIADINNTFSSAWGSTYVNDFIDKGRVKKVMLQADAPYRM 776
EDT +FKLE+D KA+A+GVS++DIN T S+A G TYVNDFID+GRVKK+ +QADA +RM
Sbjct: 721 EDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRM 780

Query: 777 NPQDIDRWFVRNSAGTMVPFNAFATARWTQGSPRLERYNSLPSVEILGMALPGAASSGQA 836
P+D+D+ +VR++ G MVPF+AF T+ W GSPRLERYN LPS+EI G A PG SSG A
Sbjct: 781 LPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPG-TSSGDA 839

Query: 837 MQIVEAAAAKLPAGIGFEWTGLSRQEKASTGQTGLLYGLSILIVFLCLAALYESWSIPFS 896
M ++E A+KLPAGIG++WTG+S QE+ S Q L +S ++VFLCLAALYESWSIP S
Sbjct: 840 MALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVS 899

Query: 897 VILVVPLGVFGTLVAAVLTWKMNDVYFQVGLLTTIGLASKNAILIVEFAKELHE-GGKSL 955
V+LVVPLG+ G L+AA L + NDVYF VGLLTTIGL++KNAILIVEFAK+L E GK +
Sbjct: 900 VMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGV 959

Query: 956 VAAALEAARMRLRPILMTSLAFILGVVPLVLTSGAGAGAQHALGTAVIGGMISGTVLAIF 1015
V A L A RMRLRPILMTSLAFILGV+PL +++GAG+GAQ+A+G V+GGM+S T+LAIF
Sbjct: 960 VEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIF 1019

Query: 1016 FVPLFFVLVRGLFE 1029
FVP+FFV++R F+
Sbjct: 1020 FVPVFFVVIRRCFK 1033



Score = 62.9 bits (153), Expect = 5e-12
Identities = 48/334 (14%), Positives = 106/334 (31%), Gaps = 18/334 (5%)

Query: 711 VRPNGQEDTPEFKLEIDPHKAEAMGVSIADINNTFSSA----WGSTYVNDFIDKGRVKKV 766
V+ G + ++ +D ++ D+ N G+
Sbjct: 175 VQLFGAQY--AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNA 232

Query: 767 MLQADAPYRMNPQDIDRWFVR-NSAGTMVPFNAFATARWTQGSPR-LERYNSLPSVEILG 824
+ A ++ NP++ + +R NS G++V A + + R N P+ +
Sbjct: 233 SIIAQTRFK-NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGI 291

Query: 825 MALPGA---ASSGQAMQIVEAAAAKLPAGIGFEW---TGLSRQEKASTGQTGLLYGLSIL 878
GA ++ + P G+ + T Q L I+
Sbjct: 292 KLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEA--IM 349

Query: 879 IVFLCLAALYESWSIPFSVILVVPLGVFGTLVAAVLTWKMNDVYFQVGLLTTIGLASKNA 938
+VFL + ++ + VP+ + GT + G++ IGL +A
Sbjct: 350 LVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDA 409

Query: 939 ILIVE-FAKELHEGGKSLVAAALEAARMRLRPILMTSLAFILGVVPLVLTSGAGAGAQHA 997
I++VE + + E A ++ ++ ++ +P+ G+
Sbjct: 410 IVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQ 469

Query: 998 LGTAVIGGMISGTVLAIFFVPLFFVLVRGLFERR 1031
++ M ++A+ P +
Sbjct: 470 FSITIVSAMALSVLVALILTPALCATLLKPVSAE 503


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
XC_RS07170  RTXTOXIND  43     2e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 42.9 bits (101), Expect = 2e-06
Identities = 28/210 (13%), Positives = 62/210 (29%), Gaps = 9/210 (4%)

Query: 227 QLTLRQAQTTVESARVDVERYTAQVAQDRNALVLLVGTQLPESLLPRALPDGASVEGNVL 286
+LT A+ + + + + + + + +LPE LP P +V +
Sbjct: 126 KLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLP-DEPYFQNVSEEEV 184

Query: 287 AAVPAGVPSQLLQRRPDILEAERNLRAANANIGAARAAFFPSISLTATLGSSSSSLSGLF 346
+ + + Q + + E NL A A +L+ S S L
Sbjct: 185 LRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLL 244

Query: 347 ESGTRAWSFVPQLTLPLFNAGRNRANLDMAKANRDIEVARYEKSIQTAFREVSDALAQRD 406
+ + + A ++ +E E ++ L + +
Sbjct: 245 HKQ-----AIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNE 299

Query: 407 TLGRQLQAQQALVDATADSYRLSQARFERG 436
L + Q + T L++ +
Sbjct: 300 ILDKLRQTTDNIGLLTL---ELAKNEERQQ 326


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS07175  HTHTETR  64     1e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 64.3 bits (156), Expect = 1e-14
Identities = 40/209 (19%), Positives = 66/209 (31%), Gaps = 8/209 (3%)

Query: 1 MPAPGRRVPHAKRGATLAAAQQLFTQQGFERTSMDRIAEDAGVSKATVYAYFASKEVLFR 60
M ++ R L A +LF+QQG TS+ IA+ AGV++ +Y +F K LF
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 TTLEAMAHAASNRWDGLL-TLPGPVAQRLAAVASVLLQVWTNAARQDAAYGLVRPPSLPS 119
E PG L + +L+ R+ ++
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 120 ---QIREEMWTLCFERYDSTMRALLAREVAQGALVIDNLPD-ASVQFFGLITAVPTEVPA 175
+ ++ + L + L D + A++ G I+ +
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 176 PGMPPDAATAQRYIDGAVAAFLRAYRPVP 204
D R VA L Y P
Sbjct: 181 APQSFDLKKEARD---YVAILLEMYLLCP 206


76  XC_RS07600  XC_RS07630  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS07600  -2      21       -2.497185   hypothetical protein
XC_RS07605  -2      25       -3.844587   acyl-CoA hydrolase
XC_RS07610  -1      24       -4.027127   transposase
XC_RS07615  -1      27       -4.735645   acyl-CoA hydrolase
XC_RS07620  -1      26       -4.880646   histidine kinase
XC_RS07625  -1      28       -5.392071   hypothetical protein
XC_RS07630  -1      30       -5.160122   chemotaxis protein CheY
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS07600  cloacin  34     8e-04    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 33.5 bits (76), Expect = 8e-04
Identities = 21/77 (27%), Positives = 34/77 (44%), Gaps = 8/77 (10%)

Query: 105 GGSTPTTPGWTSGGGSSGAGRSGTGGARVSSGLSEGDV-------LPDLIDGDAGLPVLA 157
GG + + W G G G +G G +G + V P L AG ++
Sbjct: 47 GGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVS 106

Query: 158 IVA-LLAASVAALVAAV 173
I A L+A++A ++AA+
Sbjct: 107 ISAGALSAAIADIMAAL 123


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS07620  PF06580  36     4e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.0 bits (83), Expect = 4e-04
Identities = 22/102 (21%), Positives = 45/102 (44%), Gaps = 22/102 (21%)

Query: 553 LIENAVQHA----PAGSRVVITARTLDENGRASVECRVQDAGSGFAPDDLPRIFDPFFTR 608
L+EN ++H P G ++++ +NG ++E V++ GS +
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGT--KDNGTVTLE--VENTGSLALKNT----------- 307

Query: 609 RRKGTGLGLAIVQRIVEEHKGTIAG--HNSPEGGAVMVMRLP 648
++ TG GL V+ ++ GT A + +G ++ +P
Sbjct: 308 -KESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
XC_RS07625  HTHFIS  47     8e-10    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 46.7 bits (111), Expect = 8e-10
Identities = 17/60 (28%), Positives = 28/60 (46%)

Query: 15 FESTSAQDAHAGDLNLTLQQIERQHIKRVLHDVGGKVEQASLRLGVPRSTLYQKIKLHGI 74
F S +G + L ++E I L G +A+ LG+ R+TL +KI+ G+
Sbjct: 416 FASFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGV 475


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
XC_RS07630  HTHFIS  374    e-129    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 374 bits (962), Expect = e-129
Identities = 127/358 (35%), Positives = 203/358 (56%), Gaps = 4/358 (1%)

Query: 7 RTRILIVDDEDSIRFGMRDFLESRGYGVVDADSCQRARELFQASPPDVAVIDYRLHDGSA 66
IL+ DD+ +IR + L GY V + A D+ V D + D +A
Sbjct: 3 GATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 67 IDLLRDFRQIDADVPMIVLTAYGSIDLAVQAVKEGAEQFLTKPIEMPALHVILKRLLATR 126
DLL ++ D+P++V++A + A++A ++GA +L KP ++ L I+ R LA
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEP 122

Query: 127 RLQRQQQVVATRDRRERVNPLSGGSPAIRDLAEQASKVMHTDSPILILGETGTGKSLLAK 186
+ + + D + PL G S A++++ +++M TD ++I GE+GTGK L+A+
Sbjct: 123 KRRPSK----LEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVAR 178

Query: 187 WLHHHGMRADEAFVDLNCAGLSPDFLETELFGHEKGAFTGATASKQGMLEIADGGTVFLD 246
LH +G R + FV +N A + D +E+ELFGHEKGAFTGA G E A+GGT+FLD
Sbjct: 179 ALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLD 238

Query: 247 EIGDVDPRVQPKLLKVLEEKRFRRLGAVREREVDIRLIAATHHDLASKVKEGTFRSDLYF 306
EIGD+ Q +LL+VL++ + +G D+R++AAT+ DL + +G FR DLY+
Sbjct: 239 EIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYY 298

Query: 307 RISSIPLTMPALRDRKEDIVPLARTLLARSLADPSRTPQLTDDAALALREYAWPGNIR 364
R++ +PL +P LRDR EDI L R + ++ + + +A ++ + WPGN+R
Sbjct: 299 RLNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKEGLDVKRFDQEALELMKAHPWPGNVR 356


77  XC_RS08085  XC_RS08170  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS08085  0       22       -5.519959   short-chain dehydrogenase
XC_RS08090  2       29       -7.122341   membrane protein
XC_RS08095  3       32       -7.706604   Oar protein
XC_RS08100  1       39       -6.999932   LOG family protein
XC_RS08105  2       53       -9.006709   pre-pilin like leader sequence
XC_RS08110  3       31       -6.454942   pilus modification protein PilV
XC_RS08115  4       29       -5.890329   pilus assembly protein PilW
XC_RS08120  5       19       -5.341487   pilus assembly protein
XC_RS08125  5       20       -5.303681   type IV pilin
XC_RS08135  3       20       -5.901008   *fimbrial protein
XC_RS08140  4       23       -7.556725   UvrABC system protein B
XC_RS08150  4       39       -9.650353   *hypothetical protein
XC_RS08155  4       38       -10.008083  hypothetical protein
XC_RS08160  4       52       -11.265247  hypothetical protein
XC_RS08165  4       51       -11.412850  conjugative transfer protein
XC_RS08170  5       51       -11.833425  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS08085  DHBDHDRGNASE  74     6e-18    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 74.3 bits (182), Expect = 6e-18
Identities = 43/203 (21%), Positives = 86/203 (42%), Gaps = 3/203 (1%)

Query: 9 AAALAGRVVLVTGAAGGLGAAAAQACAQAGATVVLLGRKLRPLERVYDALAGQGAAPLLY 68
A + G++ +TGAA G+G A A+ A GA + + LE+V +L + +
Sbjct: 3 AKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAF 62

Query: 69 PLDLAGATPDDYAALAQRLQSELGGLSGVLHCAAEFAGLTPAELAAPAEFARSIHVNLTA 128
P D+ + R++ E+G + +L A + E+ + VN T
Sbjct: 63 PADV--RDSAAIDEITARIEREMGPID-ILVNVAGVLRPGLIHSLSDEEWEATFSVNSTG 119

Query: 129 RAWLTQACLPLLRQQQDAALVFVVDDPARVGQAYWGAYGAAQHAQRGLIASLHHETAAGS 188
+++ + ++ ++V V +PA V + AY +++ A L E A +
Sbjct: 120 VFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYN 179

Query: 189 VRVSGLQPGPMRTALRARAFTHQ 211
+R + + PG T ++ + +
Sbjct: 180 IRCNIVSPGSTETDMQWSLWADE 202


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS08105  BCTERIALGSPG  34     1e-04    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 33.7 bits (77), Expect = 1e-04
Identities = 14/49 (28%), Positives = 28/49 (57%)

Query: 5 RARGFTLVELMTTVAVVAIVAAIGYPSFQGVIRSNRAVTANNEVVGLLN 53
+ RGFTL+E+M + ++ ++A++ P+ G A +++V L N
Sbjct: 6 KQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALEN 54


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS08110  BCTERIALGSPG  28     0.008    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 28.3 bits (63), Expect = 0.008
Identities = 8/23 (34%), Positives = 17/23 (73%), Gaps = 2/23 (8%)

Query: 8 REASGFTLIEVLIAIIVLAFGLL 30
+ GFTL+E+++ I+++ G+L
Sbjct: 5 DKQRGFTLLEIMVVIVII--GVL 25


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS08115  BCTERIALGSPH  29     0.019    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 29.1 bits (65), Expect = 0.019
Identities = 23/67 (34%), Positives = 31/67 (46%), Gaps = 9/67 (13%)

Query: 6 RAAGLSLIEMMIALVIGLVLLLGVIQVFSASRTAFQLSEGASRAQENARF--ALDFLARD 63
R G +L+EMM+ L++ V V+ F ASR S AQ ARF L F+ +
Sbjct: 2 RQRGFTLLEMMLILLLMGVSAGMVLLAFPASR-------DDSAAQTLARFEAQLRFVQQR 54

Query: 64 IRMAGHF 70
G F
Sbjct: 55 GLQTGQF 61


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS08125  BCTERIALGSPG  34     4e-05    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 34.5 bits (79), Expect = 4e-05
Identities = 10/39 (25%), Positives = 26/39 (66%)

Query: 1 MIELMIVVAVIAILSAIAYPSYAEYVRKSRRAQAKADLV 39
++E+M+V+ +I +L+++ P+ K+ + +A +D+V
Sbjct: 12 LLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIV 50


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS08135  BCTERIALGSPG  34     1e-04    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 34.1 bits (78), Expect = 1e-04
Identities = 10/38 (26%), Positives = 24/38 (63%)

Query: 8 PAKGYTATELLIVMAVLGLLAAIALPSFSSLIERQRLQ 45
+G+T E+++V+ ++G+LA++ +P+ E+ Q
Sbjct: 6 KQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQ 43


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS08165  PF04335  219    6e-73    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 219 bits (560), Expect = 6e-73
Identities = 53/230 (23%), Positives = 104/230 (45%), Gaps = 12/230 (5%)

Query: 14 QVGAAVQKAVNYEVSIADLARRSERRAWLVATVSMLITVITAGGYYYMLPLKEKVPYLVM 73
++ A ++A ++E A RS++ AW+VA V+ + + PLK PY++
Sbjct: 9 ELKAYFEEAASWERDKLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVIT 68

Query: 74 ADAYSGTSTIAKLEANYGGRTISTSEALARSNIARFILARESFDVSTIGDRDWNTVAAMA 133
D +G ++I +G TI+ EA+ + +A ++ RE + + + ++ V M+
Sbjct: 69 VDRNTGEASI--AAKLHGDATITYDEAVRKYFLATYVRYREGWIAAAR-EEYFDAVMVMS 125

Query: 134 ATNVLAEYRTLHAGNNPLRPFNTYGRSRAIRINILSITLVGGKGKAYTGATVRFQRNVYD 193
A + + +NP P N + + I ++ +GG A V F +
Sbjct: 126 ARPEQDRWSRFYKTDNPQSPQNILANRTDVFVEIKRVSFLGGN-----VAQVYFTKESVT 180

Query: 194 KTSTVTTLLDNKIATMGFAYQDNLQMSDSLRVENPLGFRVTDYRVDSDYS 243
+++ T + +AT+ + D + R +NPLG++V YR D +
Sbjct: 181 GSNSTKT---DAVATIKYKV-DGTPSKEVDRFKNPLGYQVESYRADVEVP 226


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
XC_RS08170  TYPE4SSCAGX  36     8e-05    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 36.3 bits (83), Expect = 8e-05
Identities = 27/89 (30%), Positives = 42/89 (47%), Gaps = 10/89 (11%)

Query: 44 TGLGITTQIELSPNEKILDYSTGFTGGWELTRRENVFYLKPKNVDVD-------TNMMIR 96
T L T I+L +E I +TGF GW + N +++PK+V + N +
Sbjct: 59 TSLDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALM 118

Query: 97 TATHSYILELK---VVATDWQRLEQAKQA 122
T + L+ K V A D + LE+ K+A
Sbjct: 119 TRDYQEFLKTKKLIVDAPDPKELEEQKKA 147



Score = 29.8 bits (66), Expect = 0.011
Identities = 11/27 (40%), Positives = 17/27 (62%)

Query: 165 YDYDYATRTKKSWLVPSRVYDDGKFTY 191
Y+Y A + ++PS ++DDG FTY
Sbjct: 401 YNYYQAPEKRSKHIMPSEIFDDGTFTY 427


78  XC_RS08710  XC_RS08805  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
XC_RS08710  -1      14       -0.776635   preprotein translocase subunit SecD
XC_RS08715  -3      15       0.310614    preprotein translocase subunit SecF
XC_RS08720  -3      18       2.718915    lipase
XC_RS08725  -1      22       3.905060    hypothetical protein
XC_RS08730  -2      22       2.591090    RpfN protein
XC_RS08735  -2      20       3.091542    PTS system fructose-specific EIIBC component
XC_RS08740  -1      19       2.800890    1-phosphofructokinase
XC_RS08745  -2      18       1.402836    PTS fructose transporter subunit IIA
XC_RS08750  -2      16       0.331893    LacI family transcriptional regulator
XC_RS08755  -2      15       0.055924    multidrug efflux RND transporter permease
XC_RS08760  -2      12       0.762986    GntR family transcriptional regulator
XC_RS08765  -1      11       0.436599    TetR family transcriptional regulator
XC_RS08770  -1      10       0.355070    glutamate dehydrogenase
XC_RS08775  2       10       0.598674    AraC family transcriptional regulator
XC_RS08780  2       11       0.917425    hypothetical protein
XC_RS08785  1       12       1.236501    major facilitator transporter
XC_RS08790  0       11       1.020150    chemotaxis protein CheY
XC_RS08795  -1      9        0.900013    histidine kinase
XC_RS08800  -1      12       0.955959    membrane protein
XC_RS08805  -1      13       1.445905    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS08710  SECFTRNLCASE  89     3e-21    Bacterial translocase SecF protein signature.
		>SECFTRNLCASE#Bacterial translocase SecF protein signature.

Length = 333

Score = 88.7 bits (220), Expect = 3e-21
Identities = 37/175 (21%), Positives = 83/175 (47%), Gaps = 3/175 (1%)

Query: 439 VIGPSLGAENVERGVTAVVFSFVFTLVFFTVYYRMFGAITSV-ALLFNLLIVIAVMSLFG 497
+GP + E V V +++ + V + + V + A+ +V AL+ ++L+ + + ++
Sbjct: 142 SVGPKVSGELVWTAVWSLLAATVVIMFYIWVRFEWQFALGAVVALVHDVLLTVGLFAVLQ 201

Query: 498 ATMTLPGFAGLALSVGLSVDANVLINERIREELRL--GVPPKSAIVAGYEKAGGTILDAN 555
L A L G S++ V++ +R+RE L +P + + + +
Sbjct: 202 LKFDLTTVAALLTITGYSINDTVVVFDRLRENLIKYKTMPLRDVMNLSVNETLSRTVMTG 261

Query: 556 LTGLIVGVALYAFGTGPLKGFALTMIIGIFASMFTAITVSRALAVLIYGSRKKLK 610
+T L+ V + +G ++GF M+ G+F ++++ V++ + + I R K K
Sbjct: 262 MTTLLALVPMLIWGGDVIRGFVFAMVWGVFTGTYSSVYVAKNIVLFIGLDRNKEK 316


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS08715  SECFTRNLCASE  282    3e-96    Bacterial translocase SecF protein signature.
		>SECFTRNLCASE#Bacterial translocase SecF protein signature.

Length = 333

Score = 282 bits (722), Expect = 3e-96
Identities = 97/320 (30%), Positives = 161/320 (50%), Gaps = 10/320 (3%)

Query: 4 FPLHLIPNDTKIDFMRLRKPVLILMLVLAIASIGIIVGKGFNYALEFTGGTLVQTSFQKT 63
F L L+P T DF R + +V+ IAS+ + + G N+ ++F GGT ++T
Sbjct: 3 FRLKLVPEKTNFDFFRWQWATFGAAIVMMIASVILPLVIGLNFGIDFKGGTTIRTESTTA 62

Query: 64 VDVDQVREQLAKAGFENAQVQNAR------GGNEVMIRLQAREQHNNRDDAAT---TVAE 114
+DV R L + + R + MIR+Q +E + +
Sbjct: 63 IDVGVYRAALEPLELGDVIISEVRDPSFREDQHVAMIRIQMQEDGQGAEGQGAQGQELVN 122

Query: 115 EVRKAVSTEQNPATVQPGEFVGPQVGKDLALNGVYATVFMLVGFLIYIAFRFEWKFAVVA 174
+V A++ + E VGP+V +L V++ + V + YI RFEW+FA+ A
Sbjct: 123 KVETALTAVDPALKITSFESVGPKVSGELVWTAVWSLLAATVVIMFYIWVRFEWQFALGA 182

Query: 175 SLTALFDLLVTVAFVSLTGREFDLTVLAGLLSVMGFAINDIIVVFDRVRENFRALRVEPL 234
+ + D+L+TV ++ +FDLT +A LL++ G++IND +VVFDR+REN + PL
Sbjct: 183 VVALVHDVLLTVGLFAVLQLKFDLTTVAALLTITGYSINDTVVVFDRLRENLIKYKTMPL 242

Query: 235 -EVLNRSINQTLSRTVITAVMFFLSALALYIYGGESMEGLAETHMIGAVIVVISSIIVAV 293
+V+N S+N+TLSRTV+T + L+ + + I+GG+ + G + G SS+ VA
Sbjct: 243 RDVMNLSVNETLSRTVMTGMTTLLALVPMLIWGGDVIRGFVFAMVWGVFTGTYSSVYVAK 302

Query: 294 PMLSIGPFAVTKQDLLPKAK 313
++ K+ P K
Sbjct: 303 NIVLFIGLDRNKEKKDPSDK 322


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
XC_RS08735  RTXTOXINA  32     0.010    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 31.9 bits (72), Expect = 0.010
Identities = 26/104 (25%), Positives = 43/104 (41%), Gaps = 8/104 (7%)

Query: 53 LTNGAAHVLIVGDADADTARFGDAQLLHLSLGAVLDDPAAAVSQLAATTAPASTSATTDA 112
+ + + I+ +ADADT A + L+ VL + +SQ A +T+ A
Sbjct: 248 ILSAISASFILSNADADTRTKAAAG-VELTT-KVLGNVGKGISQYIIAQRAAQGLSTSAA 305

Query: 113 SGAGGKRIVAITSCP---TGIAHTFMAAEGLQQAA---KKLGYQ 150
+ V + P IA F A +++ + KKLGY
Sbjct: 306 AAGLIASAVTLAISPLSFLSIADKFKRANKIEEYSQRFKKLGYD 349


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS08745  PHPHTRNFRASE  587    0.0      Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 587 bits (1516), Expect = 0.0
Identities = 210/568 (36%), Positives = 323/568 (56%), Gaps = 11/568 (1%)

Query: 274 AIVGIGASPGVAIGIVHRLRAAQTEVADQPIGLGDGGV-LLHDALTRTRQQLAAIQDDTQ 332
I GI AS GVAI ++ I + L AL +++++L AI+D T+
Sbjct: 4 KITGIAASSGVAIAKAFIHLEPNVDIEKTSITDVSTEIEKLTAALEKSKEELRAIKDQTE 63

Query: 333 RRLGASDAAIFKAQAELLNDTDLITR-TCQLMVEGHGVAWSWHQAVEQIASGLAALGNPV 391
+GA A IF A +L+D +L+ ++ E ++ + + S ++ N
Sbjct: 64 ASMGADKAEIFAAHLLVLDDPELVDGIKGKIENEQMNAEYALKEVSDMFVSMFESMDNEY 123

Query: 392 LAGRAADLRDVGRRVLAQLDPAAAGAGLTDLPEQPCILLAGDLSPSDTANLDTDCVLGLA 451
+ RAAD+RDV +RVL L G+ L + E +++A DL+PSDTA L+ V G A
Sbjct: 124 MKERAADIRDVSKRVLGHLIGVETGS-LATIAE-ETVIIAEDLTPSDTAQLNKQFVKGFA 181

Query: 452 TAQGGPTSHTAILSRTLGLPALVAAGGQLLDIEDGVTAIIDGSSGRLYINPSEQDLDAAR 511
T GG TSH+AI+SR+L +PA+V I+ G I+DG G + +NP+E+++ A
Sbjct: 182 TDIGGRTSHSAIMSRSLEIPAVVGTKEVTEKIQHGDMVIVDGIEGIVIVNPTEEEVKAYE 241

Query: 512 THIAEQQAIREREAAQRALPAETTDGHHIDIGANVNLPEQVAMALTQGAEGVGLMRTEFL 571
A + ++ A P+ T DG H+++ AN+ P+ V L G EG+GL RTEFL
Sbjct: 242 EKRAAFEKQKQEWAKLVGEPSTTKDGAHVELAANIGTPKDVDGVLANGGEGIGLYRTEFL 301

Query: 572 FLERGSTPTEDEQYQTYLAMARALDGRPLIVRALDIGGDKQVAHLELPHEENPFLGVRGA 631
+++R PTE+EQ++ Y + + +DG+P+++R LDIGGDK++++L+LP E NPFLG R
Sbjct: 302 YMDRDQLPTEEEQFEAYKEVVQRMDGKPVVIRTLDIGGDKELSYLQLPKELNPFLGFRAI 361

Query: 632 RLLLRRPDLLEPQLRALYRAAKDGARLSIMFPMITSVPELISLREICARIRAELDA---- 687
RL L + D+ QLRAL RA+ G L +MFPMI ++ EL + I + +L +
Sbjct: 362 RLCLEKQDIFRTQLRALLRASTYG-NLKVMFPMIATLEELRQAKAIMQEEKDKLLSEGVD 420

Query: 688 --PELPIGIMIEVPAAAAQADVLARHADFFSIGTNDLTQYVLAIDRQNPELAAEADSLHP 745
+ +GIM+E+P+ A A++ A+ DFFSIGTNDL QY +A DR N ++ HP
Sbjct: 421 VSDSIEVGIMVEIPSTAVAANLFAKEVDFFSIGTNDLIQYTMAADRMNERVSYLYQPYHP 480

Query: 746 AVLRMIRSTIDGARKHDRWVGVCGGLAGDPFGASLLAGLGVQELSMTPNDIPAVKARLRG 805
A+LR++ I A +WVG+CG +AGD LL GLG+ E SM+ I +++L
Sbjct: 481 AILRLVDMVIKAAHSEGKWVGMCGEMAGDEVAIPLLLGLGLDEFSMSATSILPARSQLLK 540

Query: 806 RALSALQQLAEQALQCETAEQVRALEAQ 833
+ L+ A++AL +TAE+V L +
Sbjct: 541 LSKEELKPFAQKALMLDTAEEVEQLVKK 568


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS08755  ACRIFLAVINRP  1089   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1089 bits (2817), Expect = 0.0
Identities = 525/1038 (50%), Positives = 710/1038 (68%), Gaps = 17/1038 (1%)

Query: 1 MPKFFIEHPVFAWVVAILISLAGVISILNLGIESYPTIAPPQVTVTANFPGASADTAEKA 60
M FFI P+FAWV+AI++ +AG ++IL L + YPTIAPP V+V+AN+PGA A T +
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTQVIEQQLTGIDHLLYFNSSSASNGRVTITLTFETGTDADIAQVQVQNKVSLATPRLPS 120
VTQVIEQ + GID+L+Y +S+S S G VTITLTF++GTD DIAQVQVQNK+ LATP LP
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 EVTQQGVVVAKANAGFLMVAALRSDNPSIDRDALNDIVGSRVLEQISRVPGVGSTNQFGA 180
EV QQG+ V K+++ +LMVA SDNP +D ++D V S V + +SR+ GVG FGA
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 EYAMNIWLNPEKLQGYNLSATQVLTAVRNQNVQFAAGSVGADPTPQGISFTATVSAEGRF 240
+YAM IWL+ + L Y L+ V+ ++ QN Q AAG +G P G A++ A+ RF
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 SSPQEFENIILRTDNNGATVRLKDVARVTVGPSNYGFDTQYNGKPTGAFGIQLLPGANAL 300
+P+EF + LR +++G+ VRLKDVARV +G NY + NGKP GI+L GANAL
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 NVSEAVGAKLDELQPTFPQGVTWFAPYESTTFVRISIEEVVHTLVEAIVLVFLVMLLFLQ 360
+ ++A+ AKL ELQP FPQG+ PY++T FV++SI EVV TL EAI+LVFLVM LFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NFRATVIPTLVIPVALLGTFFGMYVIGFTINQLTLFAMVLAIGIVVDDAIVVIENVERIM 420
N RAT+IPT+ +PV LLGTF + G++IN LT+F MVLAIG++VDDAIVV+ENVER+M
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 SEEHLEPKAATQKAMTQITGAVVAITVVLAAVFIPSAMQPGASGAIYKQFALTIAMAMAF 480
E+ L PK AT+K+M+QI GA+V I +VL+AVFIP A G++GAIY+QF++TI AMA
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SAFLALSFTPALCAAFLK---STHSDKKNWIYRTFDKYYDKLAHRYVGVVGNTLKRSPAW 537
S +AL TPALCA LK + H + K + F+ +D + Y VG L + +
Sbjct: 481 SVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY 540

Query: 538 MVVFVVLVVLCGFLFTRMPGSFLPEEDQGFAVAIVQLPPGATKIRTNEAFAQMRAVLEKQ 597
++++ ++V LF R+P SFLPEEDQG + ++QLP GAT+ RT + Q+ K
Sbjct: 541 LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN 600

Query: 598 PA--VEGMLQIAGFSFLGSGENVGMGFIRLKPWEERDV---TAEQLIQQLNGAFYGIKGA 652
VE + + GFSF G +N GM F+ LKPWEER+ +AE +I + I+
Sbjct: 601 EKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDG 660

Query: 653 QIFVVNLPTVQGLGQFGGFDMWLQDRSGAGQEALTQARNIVLGKAAQKQDTVVGVRPNGL 712
+ N+P + LG GFD L D++G G +ALTQARN +LG AAQ ++V VRPNGL
Sbjct: 661 FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGL 720

Query: 713 ENSPQLQLHVDRVQAQSMGLEVSDIYSSIQLMLAPVYVNDYFSEGRIKRVNIRADDQFRT 772
E++ Q +L VD+ +AQ++G+ +SDI +I L YVND+ GR+K++ ++AD +FR
Sbjct: 721 EDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRM 780

Query: 773 GPESLRNFFTPSSTATGADGQPGMIPLSNVVKAEWTYASPALNRYNGYSAVNIVGNPAPG 832
PE + + S A+G+ M+P S + W Y SP L RYNG ++ I G APG
Sbjct: 781 LPEDVDKLYVRS-----ANGE--MVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPG 833

Query: 833 GSSGQAMQAMEEIVNNDLPPGFGFDWSGMSYQEIIAGNAATLLLALSVVVVFLCLAALYE 892
SSG AM ME + + LP G G+DW+GMSYQE ++GN A L+A+S VVVFLCLAALYE
Sbjct: 834 TSSGDAMALMENLASK-LPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYE 892

Query: 893 SWSIPVAVLLVVPIGVLGAITFSMLRGLPNDLYFKIGMITVIGLAAKNAILIVEFAVE-Q 951
SWSIPV+V+LVVP+G++G + + L ND+YF +G++T IGL+AKNAILIVEFA +
Sbjct: 893 SWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLM 952

Query: 952 RAAGKTLREATLEAAHLRFRPILMTSFAFILGVLPLAISTGAGANSRHSIGTGVIGGMVF 1011
GK + EATL A +R RPILMTS AFILGVLPLAIS GAG+ +++++G GV+GGMV
Sbjct: 953 EKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVS 1012

Query: 1012 ATVLGVIFIPLFFVVVRR 1029
AT+L + F+P+FFVV+RR
Sbjct: 1013 ATLLAIFFVPVFFVVIRR 1030



Score = 77.2 bits (190), Expect = 2e-16
Identities = 86/536 (16%), Positives = 188/536 (35%), Gaps = 47/536 (8%)

Query: 533 RSPAWMVVFVVLVVLCGFL-FTRMPGSFLPEEDQGFAVAIVQLP-PGATKIRTNEAFAQM 590
R P + V +++++ G L ++P + P V PGA +
Sbjct: 7 RRPIFAWVLAIILMMAGALAILQLPVAQYPTIA--PPAVSVSANYPGAD---AQTVQDTV 61

Query: 591 RAVLEKQ-PAVEGMLQIAGFSFLGSGENVGMGFIRLKPWEERDVTAEQLIQQLNGAFY-- 647
V+E+ ++ ++ ++ S + + F + + D+ Q+ +L A
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTF---QSGTDPDIAQVQVQNKLQLATPLL 118

Query: 648 --GIKGAQIFVVNLPTVQGLGQFGGFDMWLQDRSGAGQEALTQARNIVLGKAAQKQDTVV 705
++ I V + + ++ D G Q+ ++ + + + V
Sbjct: 119 PQEVQQQGISVEKSSSSYLM-----VAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVG 173

Query: 706 GVRPNGLENSPQLQLHVDRVQAQSMGLEVSDIYSSIQLMLAPV----YVNDYFSEGRIKR 761
V+ G + + ++ L D + L D+ + +++ + G+
Sbjct: 174 DVQLFGAQYAMRIWLDADLLNKY--KLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLN 231

Query: 762 VNIRADDQFRTGPESLRNFFTPSSTATGADGQPGMIPLSNVVKAEWTYASPA-LNRYNGY 820
+I A +F+ PE + +DG + L +V + E + + R NG
Sbjct: 232 ASIIAQTRFKN-PEEFGK----VTLRVNSDGSV--VRLKDVARVELGGENYNVIARINGK 284

Query: 821 SAVNIVGNPAPGGS----SGQAMQAMEEIVNNDLPPGFG----FDWSGMSYQEIIAGNAA 872
A + A G + + + E+ P G +D + Q I
Sbjct: 285 PAAGLGIKLATGANALDTAKAIKAKLAEL-QPFFPQGMKVLYPYDTTPF-VQLSIHEVVK 342

Query: 873 TLLLALSVVVVFLCLAALYESWSIPVAVLLVVPIGVLGAITFSMLRGLPNDLYFKIGMIT 932
TL A +++VFL + ++ + + VP+ +LG G + GM+
Sbjct: 343 TLFEA--IMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVL 400

Query: 933 VIGLAAKNAILIVEFAVEQRAAGKTL-REATLEAAHLRFRPILMTSFAFILGVLPLAIST 991
IGL +AI++VE K +EAT ++ ++ + +P+A
Sbjct: 401 AIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFG 460

Query: 992 GAGANSRHSIGTGVIGGMVFATVLGVIFIPLFFVVVRRMLGDKLDEPPKEYLAVQN 1047
G+ ++ M + ++ +I P + + + + E + N
Sbjct: 461 GSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFN 516


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS08760  RTXTOXIND     43     2e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 42.5 bits (100), Expect = 2e-06
Identities = 15/109 (13%), Positives = 41/109 (37%), Gaps = 7/109 (6%)

Query: 59 RSADVRARVDGVLLKRLYTEGTDVKEGQPLFEIDPMPLRATLLQAQGQLAAAEATYANAK 118
RS +++ + ++ + + EG V++G L ++ L A+ +++ A+
Sbjct: 95 RSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTA-------LGAEADTLKTQSSLLQAR 147

Query: 119 IAAQRARSLAPQQYVSRADIDTAEATERSSGASVQQARGVVESASIQLS 167
+ R + L+ +++ S ++ + Q S
Sbjct: 148 LEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFS 196


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS08765  HTHTETR       58     2e-12    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 57.7 bits (139), Expect = 2e-12
Identities = 29/172 (16%), Positives = 57/172 (33%), Gaps = 9/172 (5%)

Query: 5 SPRAARRSDCDRRIHAAVHALLAERGMR-VSMTAVAERAGCSKQTLYSHYGCKENLLRDV 63
+ + I L +++G+ S+ +A+ AG ++ +Y H+ K +L ++
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 64 LQEHV----QLATVPLGTATGDLREDLLAFALAHLDRLNRPDV---LQTCRLVEAESHRF 116
+ +L GD L + L+ + L + E
Sbjct: 63 WELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGE 122

Query: 117 PGQSQQIFHEGVVGMQERLASRFAQAIDAGQLRHD-DPHFMAELLLSMIVGL 167
QQ + +R+ I+A L D A ++ I GL
Sbjct: 123 MAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGL 174


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS08785  TCRTETB       116    2e-30    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 116 bits (292), Expect = 2e-30
Identities = 80/411 (19%), Positives = 161/411 (39%), Gaps = 17/411 (4%)

Query: 24 LILACAI-FMEQMDATVLATALPTLARDFGVAAPAMSIAMTSYLLALAVLIPASGAIADR 82
LI C + F ++ VL +LP +A DF + + T+++L ++ G ++D+
Sbjct: 16 LIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQ 75

Query: 83 FGLRRVFSASIWVFIGGSILCSLADS-LPTMVAARVLQGAGGAMMAPLGRLILLRTVERR 141
G++R+ I + GS++ + S ++ AR +QGAG A L +++ R + +
Sbjct: 76 LGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKE 135

Query: 142 HLVSAMAWTLVPAFIGPMLGPPLGGFFVSYLDWRWIFYINVPIGITGFLLVRRFIPDIPS 201
+ A +G +GP +GG Y+ W ++ I + IT L + +
Sbjct: 136 NRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFL-MKLLK--KE 192

Query: 202 ESAPARFDLRGFVLCGTALGCLLFGLEMVSQEDGIPRASWLLAIGGATALG-YLWHARQH 260
FD++G +L + + S I + ++ H R+
Sbjct: 193 VRIKGHFDIKGIILMSVGIVFFMLFTTS---------YSISFLIVSVLSFLIFVKHIRKV 243

Query: 261 PAPLLDLSLLRIDSFRLSVIGGALMRITQGAHPFLLPLLFQISFGFSAARSGRLILATAL 320
P +D L + F + V+ G ++ T ++P + + S A G +I+
Sbjct: 244 TDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGT 303

Query: 321 GALLMRS-ITPQLLRRFGYRNSLIGNGVLASLGYMVCALFRPDWPPALMFGLLLCCGAFM 379
++++ I L+ R G L S+ ++ + + ++ G +
Sbjct: 304 MSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLG-GL 362

Query: 380 SFQFAAYNTIAYENVPAARMGRASSLYTTLQQLMLSVGVCAGAMILKLAML 430
SF +TI ++ G SL L G+ +L + +L
Sbjct: 363 SFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLL 413


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS08790  HTHFIS        63     3e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 63.3 bits (154), Expect = 3e-13
Identities = 33/146 (22%), Positives = 63/146 (43%), Gaps = 6/146 (4%)

Query: 1 MPARPLLCVDDESSNLATLRQLL-RDDYPLVFAKSGGEALDAVARHAPALILLDVELPDM 59
M +L DD+++ L Q L R Y + + +A L++ DV +PD
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 60 DGYAVARTLKQ-DPDSTAIPILFVTSRSTEHDERTGLEAGAADYVSRPYSAALLKARIAT 118
+ + + +K+ PD +P+L +++++T E GA DY+ +P+ L I
Sbjct: 61 NAFDLLPRIKKARPD---LPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117

Query: 119 HLTLAENARLAQQYRDAIHLLGTAGQ 144
L R ++ D+ + G+
Sbjct: 118 ALAE-PKRRPSKLEDDSQDGMPLVGR 142


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS08795  HTHFIS        73     2e-15    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 73.3 bits (180), Expect = 2e-15
Identities = 35/142 (24%), Positives = 58/142 (40%), Gaps = 4/142 (2%)

Query: 1029 LEGAQLLLVDDSEINCEVAQRILEGEGAMVTVAHDGEQAVNTLKRAPDLFHLVLMDVQMP 1088
+ GA +L+ DD V + L G V + + + LV+ DV MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGD--GDLVVTDVVMP 58

Query: 1089 VVDGYEATRRLRQIPSLASLPVIALTAGAFRPQQEKALEAGMNGFIAKPFNVEELVTAIR 1148
+ ++ R+++ LPV+ ++A KA E G ++ KPF++ EL+ I
Sbjct: 59 DENAFDLLPRIKKA--RPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116

Query: 1149 HFLQPGSRRVSSLPHERAPAAG 1170
L RR S L +
Sbjct: 117 RALAEPKRRPSKLEDDSQDGMP 138



Score = 62.9 bits (153), Expect = 5e-12
Identities = 23/115 (20%), Positives = 46/115 (40%), Gaps = 13/115 (11%)

Query: 889 SMPRVLIADDHDAALNNLVRIATELGWRVDAVASGQAAMQAIEDATEPYDIFLLDWRMPD 948
+ +L+ADD A L + + G+ V ++ + I D+ + D MPD
Sbjct: 2 TGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAA--GDGDLVVTDVVMPD 59

Query: 949 IDGVAIARQVRARATPGRHPVIVM---------VTAYERRLLEQHPEQQDLDAVM 994
+ + +++ PV+VM + A E+ + P+ DL ++
Sbjct: 60 ENAFDLLPRIKKARP--DLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELI 112


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS08800  TRNSINTIMINR  30     0.027    Translocated intimin receptor (Tir) signature.
		>TRNSINTIMINR#Translocated intimin receptor (Tir) signature.

Length = 549

Score = 29.7 bits (66), Expect = 0.027
Identities = 21/112 (18%), Positives = 46/112 (41%), Gaps = 5/112 (4%)

Query: 318 SEIREALRNDLAKEDSSMAAHMERSMQSLGQSLSQDPS--LRDALNVHMLDAADKLTTRL 375
+ A ++ L +E + + ++ + G ++ PS L+D + + A +
Sbjct: 276 NAAESATKDQLTQEAFKNPENQKVNIDANGNAI---PSGELKDDIVEQIAQQAKEAGEVA 332

Query: 376 RASVTEHIASTMKSWDERHLVEQLELGVGRDLQYIRFNGTLVGGLIGLALHA 427
R E A + ++++H Q EL + + Y + +V G IG +
Sbjct: 333 RQQAVESNAQAQQRYEDQHARRQEELQLSSGIGYGLSSALIVAGGIGAGVTT 384


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS08805  IGASERPTASE   36     1e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 36.2 bits (83), Expect = 1e-04
Identities = 31/181 (17%), Positives = 56/181 (30%), Gaps = 42/181 (23%)

Query: 122 PEASDAELSVDT-------------------SDAAARPAEALDSPPWEETSTPVSRAELA 162
PE +VDT ++ AR EA PP +TP E
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAP--ATPSETTETV 1040

Query: 163 REARLSSNQVTMIERIHKQAAPEVVQAVRAGEISISAAAAVATLSEDEQRAAAQAGKAEL 222
E ++ E+ + A Q + E + A E+
Sbjct: 1041 AENSKQESKTV--EKNEQDATETTAQNREVAK-------------EAKSNVKANTQTNEV 1085

Query: 223 KQAAKRVRDAKRKPKPDEAEGGEAERRDVKALQRRVTELENENAALQTKVAALQAQLERL 282
Q+ ++ + + A + E+ + TE E + ++V+ Q Q E +
Sbjct: 1086 AQSGSETKETQTTETKETATVEKEEK------AKVETEKTQEVPKVTSQVSPKQEQSETV 1139

Query: 283 R 283
+
Sbjct: 1140 Q 1140


79  XC_RS08865  XC_RS08900  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
XC_RS08865  -2      14       0.705044   glutamine synthetase
XC_RS08870  -1      13       1.121378   aminotransferase
XC_RS08875  -1      13       0.756923   putrescine/spermidine ABC transporter
XC_RS08880  1       12       1.160649   transporter
XC_RS08885  0       9        0.441820   transporter
XC_RS08890  -1      9        0.513796   membrane protein
XC_RS08895  -3      8        -1.535487  putrescine transporter ATP-binding subunit
XC_RS08900  -3      8        -0.758054  putrescine/spermidine ABC transporter permease
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS08865  adhesinmafb   30     0.014    Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 30.4 bits (68), Expect = 0.014
Identities = 35/167 (20%), Positives = 57/167 (34%), Gaps = 25/167 (14%)

Query: 13 KQPESALRRWLKERHITEVECLVPDITGNARG--KIIPADKFSHDYGTRLPEGIFATTVT 70
K A+ RW++E P+ + A K + P V+
Sbjct: 290 KNTREAVDRWIQEN---------PNAAETVEAVFNVAAAAKVAKLAKAAKPG---KAAVS 337

Query: 71 GDFPDDYYALTSPSDSDMHLRPDASTVRMVPWAADPTAQVIHDCYTKDGQPHEL-APRNV 129
GDF D Y + SDS L +A + + + D +K E+ A N
Sbjct: 338 GDFADSYKKKLALSDSARQLYQNAKYREALDIHYEDLIRRKTDGSSKFINGREIDAVTN- 396

Query: 130 LRRVLEAYAEVK--LQPVVAPELEFFLVQKNTDPDFPLLPPAGRSGR 174
+A + K + + P + FL QKN + A + G+
Sbjct: 397 -----DALIQAKRTISAIDKP--KNFLNQKNRKQIKATIEAANQQGK 436


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS08880  RTXTOXIND     97     5e-24    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 96.8 bits (241), Expect = 5e-24
Identities = 54/370 (14%), Positives = 115/370 (31%), Gaps = 81/370 (21%)

Query: 80 SVAVAPRVSGYVTQVMVGDNQIVDAGQPLLQIDD------------RTYQATLQQA---- 123
S + P + V +++V + + V G LL++ QA L+Q
Sbjct: 96 SKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQI 155

Query: 124 ------------------------------------EAAIAAREADIAAATANVAAQESS 147
+ + + N+ + +
Sbjct: 156 LSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAE 215

Query: 148 LVQARTQVASAAASLRFAQAEVKRFAPLAASGA-DTHEHQE------SLQHDLERARAQY 200
+ ++ R ++ + F+ L A H E ++L ++Q
Sbjct: 216 RLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQL 275

Query: 201 EAAQAQAKGAQSQIQASSA--------QLEQAKAGVKQATADADQARVAVEDTRLTSRIH 252
E +++ A+ + Q + +L Q + T + + + + + + +
Sbjct: 276 EQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVS 335

Query: 253 GRVGD-KTVQVGQFLAAGTRTMTIVPQQSLYLV-ANFKETQVGLMRPGQPAEIEVDALSG 310
+V K G + M IVP+ V A + +G + GQ A I+V+A
Sbjct: 336 VKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPY 395

Query: 311 VK---LHGKVESLSPGTGSQFALLPPENATGNFTKVVQRVPVRIRVLAGEEARKVLVPGM 367
+ L GKV++++ + G V+ + + L GM
Sbjct: 396 TRYGYLVGKVKNINLDA-------IEDQRLGLVFNVIISIEENCLSTGNKNIP--LSSGM 446

Query: 368 SVEVTVDTRS 377
+V + T
Sbjct: 447 AVTAEIKTGM 456


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS08885  TCRTETB       102    2e-25    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 102 bits (255), Expect = 2e-25
Identities = 83/407 (20%), Positives = 164/407 (40%), Gaps = 20/407 (4%)

Query: 25 WLAVLAGTIGSFMATLDISIVNAALPTIQGEVGASGTEGTWISTAYLVAEIIMIPLTGWF 84
WL +L SF + L+ ++N +LP I + W++TA+++ I + G
Sbjct: 18 WLCIL-----SFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKL 72

Query: 85 VRTLGLRNFLLICAVMFTAFSVVCGLSTS-LTMMIIGRVGQGLAGGALIPTALTIVATRL 143
LG++ LL ++ SV+ + S +++I+ R QG A + +VA +
Sbjct: 73 SDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYI 132

Query: 144 PPSQQTMGTALFGMTVIMGPVIGPLLGGWLTENVSWHYAFFINVPICVGLVALLLLGLKH 203
P + L G V MG +GP +GG + + W Y + +P+ + L+ L
Sbjct: 133 PKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSY--LLLIPMITIITVPFLMKLLK 190

Query: 204 EKGNWAGLLDADWLGIYGLTAGLGGLTVVLEEGQRERWFESSEINTLSVFALSGFLALVI 263
++ G D GI ++ G+ + +L F +S + + ++ FL V
Sbjct: 191 KEVRIKGHF--DIKGIILMSVGI--VFFML--------FTTSYSISFLIVSVLSFLIFVK 238

Query: 264 SQFGQRAPVIRLSLLLHRSFGAVFIMIMAVGMILFGVMYMIPQFLSVISGYNTEQAGYVL 323
P + L + F + + + G + M+P + + +T + G V+
Sbjct: 239 HIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVI 298

Query: 324 LLSGLPTVLLMPMMPKLLEVVDVRILVIAGLLCFAAACFVNLSLTADTVGMHFVAGQLLQ 383
+ G +V++ + +L + V+ + F + F+ S +T +
Sbjct: 299 IFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFV 358

Query: 384 GCGLALAMMSLNQAAISSVPPELAGDASGLFNAGRNLGGSVGLALIS 430
GL+ ++ SS+ + AG L N L G+A++
Sbjct: 359 LGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVG 405


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS08900  PF06057       30     0.012    Type IV secretory pathway VirJ component
		>PF06057#Type IV secretory pathway VirJ component

Length = 243

Score = 29.8 bits (67), Expect = 0.012
Identities = 11/50 (22%), Positives = 21/50 (42%), Gaps = 7/50 (14%)

Query: 99 LLIGYP-----MAYVIARLPLATRN--VAMMLVVLPSWTSFLIRVYAWIG 141
+LIGY + +V+ +P R + +L+ + F I V +
Sbjct: 120 ILIGYSFGAEVIPFVLNEMPARYRKNVLGAVLLSPSQSSDFEIHVSEMVT 169


80  XC_RS09310  XC_RS09345  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
XC_RS09310  -2      15       0.491437   peptidase
XC_RS09315  0       18       -0.266007  dihydroorotase
XC_RS09320  -1      15       1.320681   membrane protein
XC_RS09325  1       16       1.377731   hypothetical protein
XC_RS09330  0       14       1.751010   molecular chaperone DnaK
XC_RS09335  0       15       2.037844   hypothetical protein
XC_RS09340  -2      15       1.126308   hypothetical protein
XC_RS09345  -1      15       1.126960   multidrug transporter
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS09310  RTXTOXIND     29     0.031    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 28.6 bits (64), Expect = 0.031
Identities = 9/25 (36%), Positives = 14/25 (56%)

Query: 229 LSRIDVKVGDRVEQGQVIAAVGATG 253
+ I VK G+ V +G V+ + A G
Sbjct: 107 VKEIIVKEGESVRKGDVLLKLTALG 131


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS09315  UREASE        34     9e-04    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 34.3 bits (79), Expect = 9e-04
Identities = 25/97 (25%), Positives = 38/97 (39%), Gaps = 19/97 (19%)

Query: 4 TVIVNARLVNEGKEFDADLLIEGGRIARI----------DSHIAPAAGDTVVDAAGRWLL 53
TVI NA +++ AD+ ++ GRIA I I G V+ G+ +
Sbjct: 70 TVITNALILDHWGIVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEVIAGEGKIVT 129

Query: 54 PGMIDDQVHFREPGLTHKGDIATESGAAVAGGLTSFM 90
G +D +HF P A+ GLT +
Sbjct: 130 AGGMDSHIHFICPQQIE---------EALMSGLTCML 157


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS09325  IGASERPTASE   26     0.028    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 25.8 bits (56), Expect = 0.028
Identities = 15/79 (18%), Positives = 28/79 (35%), Gaps = 3/79 (3%)

Query: 2 AAKKTAQKAVQAAKKS-AKPVAKKAAAPAAAKPAAKKATPA--AKQPVAKKAPATKTAAK 58
+ +V + + A+ PA A P+ T A +KQ + A +
Sbjct: 1001 NNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATE 1060

Query: 59 PAPASKPAAPNARETLSAN 77
++ A A+ + AN
Sbjct: 1061 TTAQNREVAKEAKSNVKAN 1079



Score = 25.4 bits (55), Expect = 0.038
Identities = 14/83 (16%), Positives = 29/83 (34%), Gaps = 3/83 (3%)

Query: 3 AKKTAQKAVQAAKKSAKPVAKKAAAPAAAKPAAKKATPAAKQPVAK---KAPATKTAAKP 59
+KT + ++ S K + P A T K+P ++ A + A +
Sbjct: 1116 TEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKET 1175

Query: 60 APASKPAAPNARETLSANDVLAN 82
+ + + + N V+ N
Sbjct: 1176 SSNVEQPVTESTTVNTGNSVVEN 1198


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS09345  TCRTETA       41     9e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 40.6 bits (95), Expect = 9e-06
Identities = 46/202 (22%), Positives = 71/202 (35%), Gaps = 9/202 (4%)

Query: 68 FCIAPFAGYLVDHLPRRRLGMTAALGLVATALILTAITHGWLPVNGVWPIYAAIALTGAA 127
F AP G L D RR + + +L A + A +W +Y + G
Sbjct: 57 FACAPVLGALSDRFGRRPV-LLVSLAGAAVDYAIMATAPF------LWVLYIGRIVAGIT 109

Query: 128 RSFLSPVYNALFARALPREAFARGASIGSVTFQAGMVIGPALGGVLVAWGGKGLAYGVAA 187
+ V A A + AR S F GMV GP LGG++ + + AA
Sbjct: 110 GA-TGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAA 168

Query: 188 SVALVAILALFLLRVTEPVNAGPRAPIFRSIAEGAQFVLSNQIMLGAMALDMFSVLLGGA 247
L + FLL + P + ++ ++ MA+ L+G
Sbjct: 169 LNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQV 228

Query: 248 VSMLPA-FIHDILHYGPEGLGI 268
+ L F D H+ +GI
Sbjct: 229 PAALWVIFGEDRFHWDATTIGI 250


81  XC_RS09785  XC_RS09815  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
XC_RS09785  -1      11       -1.113787  phosphoenolpyruvate synthase
XC_RS09790  -2      8        0.458476   hypothetical protein
XC_RS09795  -2      8        0.929976   hypothetical protein
XC_RS09800  -1      8        0.821504   7,8-dihydro-8-oxoguanine-triphosphatase
XC_RS09805  -2      9        0.985119   3-hydroxybutyrate dehydrogenase
XC_RS09810  -2      9        0.248066   CDP-diacylglycerol--serine
XC_RS09815  -1      9        0.123515   PHA synthase subunit
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS09785  PHPHTRNFRASE  280    4e-87    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 280 bits (719), Expect = 4e-87
Identities = 141/574 (24%), Positives = 236/574 (41%), Gaps = 89/574 (15%)

Query: 260 KAIRMVYSDVPGERVRIEDTPVE---LRNTFSISDEDVQELSKQAL---------VIEKH 307
KA + +V E+ I D E L S E+++ + Q + H
Sbjct: 18 KAFIHLEPNVDIEKTSITDVSTEIEKLTAALEKSKEELRAIKDQTEASMGADKAEIFAAH 77

Query: 308 YGRPMDIEWAKDGVSGKLFIVQARPETVKSRSHATQIERFSLEAKDAKILVEGRAVGAKI 367
D E + GK+ Q E + F E+ D + + E RA A I
Sbjct: 78 LLVLDDPELVDG-IKGKIENEQMNAEYALKEVSDMFVSMF--ESMDNEYMKE-RA--ADI 131

Query: 368 GSGVARVVRSL-----EDMNRVQAGDVLIA-DMTDPDWEPVMK-RASAIVTNRGGRTCHA 420
RV+ L + + V+IA D+T D + K T+ GGRT H+
Sbjct: 132 RDVSKRVLGHLIGVETGSLATIAEETVIIAEDLTPSDTAQLNKQFVKGFATDIGGRTSHS 191

Query: 421 AIIARELGVPAVVGSGNATDVISDGQEVTVSCAEG---------DTGFIYEGLLPFERTT 471
AI++R L +PAVVG+ T+ I G V V EG + E FE+
Sbjct: 192 AIMSRSLEIPAVVGTKEVTEKIQHGDMVIVDGIEGIVIVNPTEEEVKAYEEKRAAFEKQK 251

Query: 472 TDLGNMPPAP--------LKIMMNVANPERAFDFGQLPNAGIGLARLEMIIAAHIGIHPN 523
+ + P +++ N+ P+ GIGL R E + +
Sbjct: 252 QEWAKLVGEPSTTKDGAHVELAANIGTPKDVDGVLANGGEGIGLYRTEFLYMDR-----D 306

Query: 524 ALLEYDKQDADVRKKIDAKIAGYGDPVSFYINRLAEGIATLTASVAPNTVIVRLSDFKSN 583
L ++Q ++ + G PV ++R D +
Sbjct: 307 QLPTEEEQFEAYKEVVQRM---DGKPV-----------------------VIRTLDIGGD 340

Query: 584 EYANLIGGSRYEPHEENPMIGFRGASRYVDPSFTKAFSLECKAVLKVRNEMGLDNLWVMI 643
+ + + P E NP +GFR ++ F + +A+L+ NL VM
Sbjct: 341 KELSYL----QLPKELNPFLGFRAIRLCLE--KQDIFRTQLRALLRAS---TYGNLKVMF 391

Query: 644 PFVRTLEEGRKVIEVLEQNGLKQGENG------LKIIMMCELPSNALLADEFLEIFDGFS 697
P + TLEE R+ ++++ K G +++ +M E+PS A+ A+ F + D FS
Sbjct: 392 PMIATLEELRQAKAIMQEEKDKLLSEGVDVSDSIEVGIMVEIPSTAVAANLFAKEVDFFS 451

Query: 698 IGSNDLTQLTLGLDRDSSIVAHLFDERNPAVKKLLSMAIKSARAKGKYVGICGQGPSDHP 757
IG+NDL Q T+ DR + V++L+ +PA+ +L+ M IK+A ++GK+VG+CG+ D
Sbjct: 452 IGTNDLIQYTMAADRMNERVSYLYQPYHPAILRLVDMVIKAAHSEGKWVGMCGEMAGD-E 510

Query: 758 ELAEWLMQEGIESVSLNPDTVVDTWLRLAKLKSE 791
L+ G++ S++ +++ +L KL E
Sbjct: 511 VAIPLLLGLGLDEFSMSATSILPARSQLLKLSKE 544


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS09795  BACTRLTOXIN   28     0.012    Bacterial toxin signature.
		>BACTRLTOXIN#Bacterial toxin signature.

Length = 266

Score = 28.3 bits (63), Expect = 0.012
Identities = 7/30 (23%), Positives = 14/30 (46%)

Query: 73 YDLCDPVTGEPDPSAYVRLYRDARQAETTH 102
YD+ + D S Y+ +Y D + ++
Sbjct: 225 YDMMPAPGDKFDQSKYLMMYNDNKTVDSKS 254


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS09805  DHBDHDRGNASE  101    4e-28    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 101 bits (253), Expect = 4e-28
Identities = 72/255 (28%), Positives = 110/255 (43%), Gaps = 11/255 (4%)

Query: 2 RSILITGAGSGIGAGLATVLAADGHHLLVSDADLAAAEQTTQQVRAAGGSAEALVLDVTD 61
+ ITGA GIG +A LA+ G H+ D + E+ ++A AEA DV D
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 62 EHSIAQALAAAARAPE---VLVNNAGLQHVAPLEEFPMQRWALLVDVMLTGAARLSRAVL 118
+I + A R +LVN AG+ + + W V TG SR+V
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 119 PGMRAGGYGRIVNIGSIHSLVASPYKSAYVAAKHGLIGLSKVLALETADCDITANTLCPS 178
M G IV +GS + V +AY ++K + +K L LE A+ +I N + P
Sbjct: 129 KYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPG 188

Query: 179 YVRTPLVERQIADQARTRGIAEDAVIR---EVMLKPMPKGAFIEYDELAGTVAFLMSHAA 235
T + AD+ + VI+ E +P + ++A V FL+S A
Sbjct: 189 STETDMQWSLWADEN-----GAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQA 243

Query: 236 RNLTGQAIAIDGGWT 250
++T + +DGG T
Sbjct: 244 GHITMHNLCVDGGAT 258


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS09815  PF03544       33     0.001    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 33.0 bits (75), Expect = 0.001
Identities = 12/88 (13%), Positives = 20/88 (22%)

Query: 300 RGAQAKPAAASAPKPDAGPAPVAAASAPPRAAAQPTTPPEAAPAKRATRQTAAAPETSRP 359
+ +P P+P V P P + +
Sbjct: 72 PVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFEN 131

Query: 360 KASNTRPAAKKATKTSKSPARAPSGGKP 387
A ++ TSK SG +
Sbjct: 132 TAPARPTSSTATAATSKPVTSVASGPRA 159



Score = 32.3 bits (73), Expect = 0.003
Identities = 14/77 (18%), Positives = 21/77 (27%), Gaps = 4/77 (5%)

Query: 303 QAKPAAASAPKPDAGPAPVAAASAPPR----AAAQPTTPPEAAPAKRATRQTAAAPETSR 358
Q P P+P+ P P AP P ++ R
Sbjct: 67 QPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVKPVESRPA 126

Query: 359 PKASNTRPAAKKATKTS 375
NT PA ++ +
Sbjct: 127 SPFENTAPARPTSSTAT 143



Score = 28.4 bits (63), Expect = 0.046
Identities = 18/94 (19%), Positives = 28/94 (29%), Gaps = 5/94 (5%)

Query: 297 RMLRGAQAKPAAASAPKPDAGPAPVAAASAPPRAAAQPTTPPEAAPAKRA----TRQTAA 352
+ A A+P + + P P A PP +P PE P +
Sbjct: 40 VIELPAPAQPISVTMVAPADLE-PPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPK 98

Query: 353 APETSRPKASNTRPAAKKATKTSKSPARAPSGGK 386
+PK K+ K +S +P
Sbjct: 99 PKPKPKPKPVKKVEQPKRDVKPVESRPASPFENT 132


82  XC_RS09845  XC_RS09910  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
XC_RS09845  -1      9        1.424023   transcription accessory protein
XC_RS09850  0       10       1.180141   histidine kinase
XC_RS09855  -1      11       0.749488   transcriptional regulator
XC_RS09865  -2      12       -0.202172  *hypothetical protein
XC_RS09890  -2      11       0.063401   ****cytochrome C biogenesis protein CcsA
XC_RS09895  -2      12       0.360632   cytochrome C
XC_RS09900  -2      12       0.502366   MexH family multidrug efflux RND transporter
XC_RS09905  -1      13       0.623436   transporter
XC_RS09910  -2      13       0.931361   transporter
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS09845  PF04183       32     0.014    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 31.8 bits (72), Expect = 0.014
Identities = 23/129 (17%), Positives = 36/129 (27%), Gaps = 9/129 (6%)

Query: 166 ARAILMERWGEDAALVGELRSWLNDNGVIRARVAEGKEEAGAKYR-DYFDHAESLARIPS 224
A L RW + V + L +G + E A + +
Sbjct: 293 AAGPLASRWLQQ---VFATDATLVQSGAVILG-----EPAAGYVSHEGYAALARAPYRYQ 344

Query: 225 HRLLALFRARREEFLYLDLDPGTDADAGHQYAEGRVARSAGISNEGRPADRWLLDACRLT 284
L ++R +L D P A + A I G A+ WL R+
Sbjct: 345 EMLGVIWRENPCRWLKPDESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVV 404

Query: 285 WRAKLHMHL 293
H+
Sbjct: 405 VVPLYHLLC 413


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS09850  HTHFIS        84     8e-19    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 84.1 bits (208), Expect = 8e-19
Identities = 32/121 (26%), Positives = 57/121 (47%), Gaps = 4/121 (3%)

Query: 1012 LDGVRLLLVDDDQDSREAVMHFLMLAGAQVQAAGSVDAAEAHLAAAHYDVLVSDIAMPLR 1071
+ G +L+ DDD R + L AG V+ + +AA D++V+D+ MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 1072 DGYDLIRSVRSGRPELPRHIRAIALTAYVREEDRDRAIVAGFDAHMGKPVEPPGLIDLIE 1131
+ +DL+ ++ RP+LP + ++A +A G ++ KP + LI +I
Sbjct: 61 NAFDLLPRIKKARPDLPV----LVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116

Query: 1132 R 1132
R
Sbjct: 117 R 117


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS09855  HTHFIS        41     2e-07    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 41.4 bits (97), Expect = 2e-07
Identities = 23/119 (19%), Positives = 46/119 (38%), Gaps = 7/119 (5%)

Query: 8 RVLVVENDDMNAMLLEMQLIQAGAAVVGPVGEVDDALQLIEADAPDTAVLDYRLGNGQTS 67
+LV ++D +L L +AG V + I A D V D + + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMP-DENA 62

Query: 68 EPVARRLTER--GIPFVLATGVAS-ASIPHGFERGVI--LTKPYMSDELVDALAKARQR 121
+ R+ + +P ++ + + + E+G L KP+ EL+ + +A
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS09900  RTXTOXIND     56     1e-10    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 55.6 bits (134), Expect = 1e-10
Identities = 41/242 (16%), Positives = 80/242 (33%), Gaps = 33/242 (13%)

Query: 50 VETAKAARRAVAASYTGTAALEPRAEAQVVAKTSGVALAVMVEEGQKVSAGQALVRLDPD 109
+ + Q +AK AV+ +E + V A L
Sbjct: 220 LARINRYENLSRVEKSRLDDFSSLLHKQAIAK-----HAVLEQENKYVEAVNELRVYKSQ 274

Query: 110 RAHL--AVAQSEAQLRKLENSYRRATQLVGQQLVSA-ADVDQLKFDVENSRAQHRLASLE 166
+ + ++ + + + ++ + +L ++ L +
Sbjct: 275 LEQIESEILSAKEEYQLVTQLFKN---EILDKLRQTTDNIGLL-------TLELAKNEER 324

Query: 167 LSYTTVQAPISGVIASRSIKT-GNFVQINTPIFRIV-DDSQLEATLNVPERELATLKSGQ 224
+ ++AP+S + + T G V + IV +D LE T V +++ + GQ
Sbjct: 325 QQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQ 384

Query: 225 PVTLLADALPGQQF---VGKVDRIAP--VVDSGSGT-FRVVCAF-------GAGAEALQP 271
+ +A P ++ VGKV I + D G F V+ + G L
Sbjct: 385 NAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLGLVFNVIISIEENCLSTGNKNIPLSS 444

Query: 272 GM 273
GM
Sbjct: 445 GM 446



Score = 44.4 bits (105), Expect = 5e-07
Identities = 16/74 (21%), Positives = 33/74 (44%), Gaps = 9/74 (12%)

Query: 78 VVAKTSGVALAVMVEEGQKVSAGQALVRLDPDRAHLAVAQSEAQLRKLENSYR--RATQL 135
+ + + ++V+EG+ V G L++L +EA K ++S R Q
Sbjct: 99 IKPIENSIVKEIIVKEGESVRKGDVLLKLTA-------LGAEADTLKTQSSLLQARLEQT 151

Query: 136 VGQQLVSAADVDQL 149
Q L + ++++L
Sbjct: 152 RYQILSRSIELNKL 165


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS09905  ACRIFLAVINRP  651    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 651 bits (1680), Expect = 0.0
Identities = 259/1143 (22%), Positives = 479/1143 (41%), Gaps = 138/1143 (12%)

Query: 22 LVAFATRRRVTIAMITVTMLLFGLIALRSLKVNLLPDLSYPTLTVRTEYTGAAPAEIETL 81
+ F RR + ++ + +++ G +A+ L V P ++ P ++V Y GA ++
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 82 VTEPVEEAVGVVKNLRKLKSIS-RTGQSDVVLEFAWGTNMDQAGLEVRDKMEAL--SLPL 138
VT+ +E+ + + NL + S S G + L F GT+ D A ++V++K++ LP
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 139 EAKPPVLLRFNPSTEPIMRLVLSPKQAPASDTDAIRQLTGLRRYADEDLKKKLEPVAGVA 198
E + + S+ +M + + Y ++K L + GV
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFV-------SDNPGTTQDDISDYVASNVKDTLSRLNGVG 173

Query: 199 AVKVGGGLEDEIQVDIDQQKLAQLSLPIDNVITRLKEENVNISGGRL------EEGSQRY 252
V++ G + +++ +D L + L +VI +LK +N I+ G+L
Sbjct: 174 DVQLFGA-QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNA 232

Query: 253 LVRTVNQFVDLDEIRNMLVTTQSSSSSAADAAMQQMYAIAASTGSQAALAAAAEVQSTSA 312
+ +F + +E + + S
Sbjct: 233 SIIAQTRFKNPEEFGKVTLRVNSD------------------------------------ 256

Query: 313 SSTSSIAGGMPVRLKDVAQVRQGYKEREAIIRLGGKEAVELAIYKEGDANTVSTAASLRK 372
G VRLKDVA+V G + I R+ GK A L I AN + TA +++
Sbjct: 257 --------GSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALDTAKAIKA 308

Query: 373 RLEQIKATVPGDVEITTIEDQSHFIEHAISDVKKDAVIGGVLAILIIFLFLRDGWSTFVI 432
+L +++ P +++ D + F++ +I +V K +L L+++LFL++ +T +
Sbjct: 309 KLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQNMRATLIP 368

Query: 433 SLSLPVSIITTFFFMGQLGLSLNVMSLGGLALATGLVVDDSIVVLESIAKA-RERGLSVL 491
++++PV ++ TF + G S+N +++ G+ LA GL+VDD+IVV+E++ + E L
Sbjct: 369 TIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMMEDKLPPK 428

Query: 492 DAAMVGTREVSMAVMASTLTTIAVFLPLVFVEGIAGQLFRDQALTVAIAIAISLVVSMTL 551
+A ++ A++ + AVF+P+ F G G ++R ++T+ A+A+S++V++ L
Sbjct: 429 EATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSVLVALIL 488

Query: 552 IPMLSSLKGAPPMAFPDEPSHPQWQPQQRWLKPVAAGRRGAGASVRYAFFGAAWAVVKAW 611
P L + LKPV+A + FFG
Sbjct: 489 TPALCA----------------------TLLKPVSAEHHEN----KGGFFGWFNTT---- 518

Query: 612 RGATRVVAPVMRKASDIAMAPYGRAERGYLTMLPAALRRPWLVLGLAGAAFVGTVLLVPM 671
+ + Y + L L + G V+L
Sbjct: 519 ---------------------FDHSVNHYTNSVGKILGSTGRYLLIYALIVAGMVVLFLR 557

Query: 672 LGADLIPQLAQDRFEMTVKLPSGTPLAQTDALVRELQ--LAHDKDPGIASLYGVSGAGTR 729
L + +P+ Q F ++LP+G +T ++ ++ ++ + S++ V+G
Sbjct: 558 LPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNEKANVESVFTVNGFSF- 616

Query: 730 LDANPTESGENIGKLTVVMAG-----GGSPAVEAAATERLRSSMTAHPGAQV-DFARPAL 783
+ +N G V + G + EA R + + V F PA+
Sbjct: 617 -----SGQAQNAGMAFVSLKPWEERNGDENSAEAVI-HRAKMELGKIRDGFVIPFNMPAI 670

Query: 784 FSF--STPLEVEL---RGQDLGELEQAGQKLAQMLRAN-GHYADVKSTVEEGFPEIQIRF 837
+T + EL G L QA +L M + V+ E + ++
Sbjct: 671 VELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGLEDTAQFKLEV 730

Query: 838 DQERAGALGLTTRQIADVIVKKVRGDVATRYSFRDRKIDVLVRAQHSDRASVDAIRQLIV 897
DQE+A ALG++ I I + G + R R + V+A R + + +L V
Sbjct: 731 DQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRMLPEDVDKLYV 790

Query: 898 NPGSSRPVRLAAVAEVVATTGPSEIHRADQTRVAIVSASLR-DIDLGGAVREVETLVRND 956
+ V +A G + R + + G A+ +E L
Sbjct: 791 RSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAMALMENLASKL 850

Query: 957 PLAAGVGMHIGGQGEELAQSVKSLLFAFGLAIFLVYLVMASQFESLLHPFVILFTIPLAM 1016
P AG+G G + S ++ +V+L +A+ +ES P ++ +PL +
Sbjct: 851 P--AGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSVMLVVPLGI 908

Query: 1017 VGAVLALLMTGKPVSVVVFIGLILLVGLVTKNAIILIDKVNQLRE-EGVAKREALIEGAR 1075
VG +LA + + V +GL+ +GL KNAI++++ L E EG EA + R
Sbjct: 909 VGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVR 968

Query: 1076 SRLRPIIMTTLCTLFGFLPLAVAMGEGAEVRAPMAITVIGGLLVSTLLTLLVIPVVYDLL 1135
RLRPI+MT+L + G LPLA++ G G+ + + I V+GG++ +TLL + +PV + ++
Sbjct: 969 MRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVVI 1028

Query: 1136 DRR 1138
R
Sbjct: 1029 RRC 1031


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS09910  ACRIFLAVINRP  547    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 547 bits (1412), Expect = 0.0
Identities = 228/1047 (21%), Positives = 441/1047 (42%), Gaps = 67/1047 (6%)

Query: 3 VAAFSIRRPVTTIMCFVSLVVVGLIAAFRLPLEALPDISAPFLFVQLPYTGSTPDEVERN 62
+A F IRRP+ + + L++ G +A +LP+ P I+ P + V Y G+ V+
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 63 LVRPAEEALATMTGIKRMRSTATADG-ANIFIEFSDWDRDIAIAASDARERLDAIRDDFP 121
+ + E+ + + + M ST+ + G I + F D IA + +L P
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQS-GTDPDIAQVQVQNKLQLATPLLP 119

Query: 122 EDLQRFHIFKWSSSDEPVLKVRLAS---QADLTGAYDMLDREFKRRIERIPGVAKVEISG 178
+++Q+ I SS ++ S D + K + R+ GV V++ G
Sbjct: 120 QEVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG 179

Query: 179 APPNEVEIAIAPDRLTAHDLSLNDLSERLGKLNFSVSAGQI------DDNGQRIRVQPVG 232
A + I + D L + L+ D+ +L N ++AGQ+ +
Sbjct: 180 AQ-YAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQT 238

Query: 233 ELRDLQELRDLVLNAKG----LRLGDIAQVRLKPTRMNYGRRLDGRPAIGLDIYKERSAN 288
++ +E + L +RL D+A+V L N R++G+PA GL I AN
Sbjct: 239 RFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGAN 298

Query: 289 LVEVSKAALKEVEDIRAE-PAMRDVQVKVIDNQGKAVTSSLAELAEAGAVGLLLSITVLF 347
++ +KA ++ +++ P ++V + V S+ E+ + ++L V++
Sbjct: 299 ALDTAKAIKAKLAELQPFFPQ--GMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMY 356

Query: 348 FFLRHWPSTLMVTLAIPICFAITLGFMYFVGVTLNILTMMGLLLAVGMLVDNAVVVVESI 407
FL++ +TL+ T+A+P+ T + G ++N LTM G++LA+G+LVD+A+VVVE++
Sbjct: 357 LFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENV 416

Query: 408 YQERERMPGQPQLAALLGTRSVAIALSAGTLCHCIVFVPNLFGETNNISIFMAQIAITIS 467
+ P+ A + AL + VF+P + + Q +ITI
Sbjct: 417 ERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIP-MAFFGGSTGAIYRQFSITIV 475

Query: 468 VSLLASWLVAISLIPMLSARMKTP----------PLVTSERGVIARLQRRYAKVLAWTLA 517
++ S LVA+ L P L A + P Y + L
Sbjct: 476 SAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILG 535

Query: 518 HRG-WSVAGILLVSAISLVPMKLTKVDMFGGDGGNEGYIQY--QWKGSYTREQLGDEIAR 574
G + + L+V+ + ++ ++L + D +G Q T+E +
Sbjct: 536 STGRYLLIYALIVAGMVVLFLRLPSSFLPEED---QGVFLTMIQLPAGATQE----RTQK 588

Query: 575 VENHLEANRAKYHITQIYSWFSEV------EGSNTVVTFDATKVKDLPPLLEQIRKELPR 628
V + + K + S F+ + N + F + K + E + +
Sbjct: 589 VLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIH 648

Query: 629 SARADYS---IGN---------QGDGGSGNQGVQVQ-LVGDSTQALQELADEVVPLLAQR 675
A+ + G G + ++ G AL + ++++ + AQ
Sbjct: 649 RAKMELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQH 708

Query: 676 -KELRDVRVDTGDRTSELAIRVDRERAAAFGFSAEQVASFVGLALRGTPLREFRRGDNEV 734
L VR + + T++ + VD+E+A A G S + + AL GT + +F
Sbjct: 709 PASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVK 768

Query: 735 PVWVRFAGAEQSKPEDLAGFTVRTKDGRSVPLLSLVDVKIRPAATQIGRTNRQTTLTIKA 794
++V+ + PED+ VR+ +G VP + + ++ R N ++ I+
Sbjct: 769 KLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQG 828

Query: 795 NLGTKVTVPEARAAMEQPLKAMQFPAGYSYTFDGGDYQDDGEAMGQMVFNLVIALVMIYV 854
+ +A A ME + PAG Y + G + + Q + I+ V++++
Sbjct: 829 EAAPGTSSGDAMALMENLASKL--PAGIGYDW-TGMSYQERLSGNQAPALVAISFVVVFL 885

Query: 855 VMAAVFESLLFPAAIMSGVLFSIFGVFWLFWITGTSFGIMSFIGILVLMGVVVNNGIVMI 914
+AA++ES P ++M V I GV + + +G+L +G+ N I+++
Sbjct: 886 CLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIV 945

Query: 915 EHINNLRRR-GMGRTQALIEGSRERLRPIMMTMGTAILAMVPISLTSTTMFSDGPPYFPM 973
E +L + G G +A + R RLRPI+MT IL ++P+++++ +
Sbjct: 946 EFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGA---GSGAQNAV 1002

Query: 974 ARAIAGGLAFSTVVSLLFLPTIYAILD 1000
+ GG+ +T++++ F+P + ++
Sbjct: 1003 GIGVMGGMVSATLLAIFFVPVFFVVIR 1029


83  XC_RS10840  XC_RS10880  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
XC_RS10840  -3      10       1.692559   membrane protein
XC_RS10845  -2      11       2.551949   histidine kinase
XC_RS10850  4       14       1.751688   hypothetical protein
XC_RS10855  4       14       1.494623   hypothetical protein
XC_RS10860  4       14       1.510979   beta-N-acetylglucosaminidase
XC_RS10865  4       14       1.444700   alpha/beta hydrolase
XC_RS10875  4       14       1.194086   serine protease
XC_RS10880  4       15       0.720761   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS10840  PF06580  26     0.048    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 26.4 bits (58), Expect = 0.048
Identities = 15/120 (12%), Positives = 44/120 (36%), Gaps = 17/120 (14%)

Query: 10 TWAIHPFHAAVLGGVLPLFLGALLSDYAYWSSYQIQWSNFASWLLVGAMVFTSIALLCGI 69
+ P +++ + +G +L+ + W ++ + C +
Sbjct: 32 SLYGSPKLHSMIFNIAISLMGLVLTHAYRSFIKRQGWLKLNMGQII-----LRVLPACVV 86

Query: 70 VGLVRGSRHVLYVVALVATWVIGFWNALHHARDAWAIMPMALVLSVIVTLLALLATWAGF 129
+G+ ++ VA + W + + ++ A+ + L LS+I ++ + W+
Sbjct: 87 IGM-------VWFVANTSIWRLLAF--INTKPVAFT---LPLALSIIFNVVVVTFMWSLL 134


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS10855  PF06580  32     5e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 32.1 bits (73), Expect = 5e-04
Identities = 12/54 (22%), Positives = 24/54 (44%), Gaps = 9/54 (16%)

Query: 2 FVLLQALGWALF----LAIAYVARPSEDSVPELLQLAGVVAMSICGLLGSLALR 51
+ Q +GW ++ A + P+L + +A+S+ GL+ + A R
Sbjct: 12 YWYCQGIGWGVYTLTGFGFASLYGS-----PKLHSMIFNIAISLMGLVLTHAYR 60


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
XC_RS10875  SUBTILISIN  111    1e-28    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 111 bits (278), Expect = 1e-28
Identities = 62/318 (19%), Positives = 111/318 (34%), Gaps = 39/318 (12%)

Query: 106 NADLAQQAGARGQGVKIGVMDDNLVQTYAPVAGKVDAFTDYTAVPGAAESTSNRLRGHGS 165
A G+GVK+ V+D + + ++ ++T GHG+
Sbjct: 30 QAPAVWNQTR-GRGVKVAVLDTGCDADHPDLKARIIGGRNFTDDDEGDPEIFKDYNGHGT 88

Query: 166 VVSALVLGSAQDGFAGGVAPDADLIYGRICAENSCGTQQARRAAVDMAAA-GVRIANLSI 224
V+ + + + GVAP+ADL+ ++ + G + A V I ++S+
Sbjct: 89 HVAGTIAATENENGVVGVAPEADLLIIKVLNKQGSGQYDWIIQGIYYAIEQKVDIISMSL 148

Query: 225 GASYADAASSANAALAWRFALTPLVQADALIVASTGNDGAAEAS-----YPAAAPVQEAS 279
G A V + L++ + GN+G + YP
Sbjct: 149 GGPEDVPELHEAVKKA--------VASQILVMCAAGNEGDGDDRTDELGYPGCYN----- 195

Query: 280 LRNNWLAVGAVEIDSAGNPAGLSSYSNHCGSAAQWCLVAPGMYAAPALAGTELQGQIAGT 339
++VGA+ D S +SN LVAPG + G + +GT
Sbjct: 196 ---EVISVGAINFDR-----HASEFSNSNNE---VDLVAPGEDILSTVPGGKYA-TFSGT 243

Query: 340 SFSTAAVSGVAAQVLGVYPW-----MSASNLQQTLLTTATDLGDPGVDALYGWGMVNAAK 394
S +T V+G A + + ++ L L+ LG + G G++
Sbjct: 244 SMATPHVAGALALIKQLANASFERDLTEPELYAQLIKRTIPLG--NSPKMEGNGLLYLTA 301

Query: 395 AIKGPGQFASNWAANVTS 412
+ F + A + S
Sbjct: 302 VEELSRIFDTQRVAGILS 319


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
XC_RS10880  IGASERPTASE  35     0.006    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 35.4 bits (81), Expect = 0.006
Identities = 46/213 (21%), Positives = 70/213 (32%), Gaps = 38/213 (17%)

Query: 2096 FGGSLSDAGNLE-----KVGTGVFQLTGTSSIGGTTLVSAGTLDVDGTLASAGGLTVANG 2150
+ GNL K F LTG +++ G V GTL G +
Sbjct: 657 GEEEGKNNGNLNVTFKGKSEQNRFLLTGGTNLNGDLTVEKGTL-------FLSGRPTPHA 709

Query: 2151 GALSGSGVVDAAVTVADGGRVVVSS---GALLTTGTLSLAPNATIDAFLGIPSQTGVLAV 2207
++G A+ VVV T+++ NA + G V +
Sbjct: 710 RDIAGISSTKKDPHFAENNEVVVEDDWINRNFKATTMNVTGNA--SLYSGRN----VANI 763

Query: 2208 NGDLTLDGTLNITDIGGFGNGVYRLIDYTG-------GLSDNGLAFGTIPGSVDPTQLAL 2260
++T + G+ V DYTG LSD L S +PT L
Sbjct: 764 TSNITASNKAQVHIGYKTGDTVCVRSDYTGYVTCTTDKLSDKALN------SFNPTNLRG 817

Query: 2261 QTALAQQVNVVVSAPGSNVQFWDGAQTVGNAQI 2293
L + N V+ + Q+ GN+Q+
Sbjct: 818 NVNLTESANFVL----GKANLFGTIQSRGNSQV 846


84  XC_RS11240  XC_RS11325  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
XC_RS11240  0       17       -1.375446  chemotaxis protein
XC_RS11245  1       19       -1.800016  flagellar basal body rod protein FlgB
XC_RS11250  1       19       -1.213760  flagellar basal body rod protein FlgC
XC_RS11255  0       18       -0.455123  flagellar basal body rod modification protein
XC_RS11260  -1      17       0.071439   flagellar hook protein FlgE
XC_RS11265  -1      16       0.601281   flagellar basal body rod protein FlgF
XC_RS11270  -1      17       0.389463   flagellar basal body rod protein FlgG
XC_RS11275  -2      16       0.021416   flagellar L-ring protein
XC_RS11280  -1      17       -0.165926  flagellar P-ring protein
XC_RS11285  -1      17       -0.626637  flagellar rod assembly protein FlgJ
XC_RS11290  -1      17       -1.035456  flagellar hook-associated protein FlgK
XC_RS11295  0       17       -1.145956  flagellar hook-associated protein FlgL
XC_RS11300  0       19       -0.541341  flagellin
XC_RS11305  0       21       1.017425   flagellar protein
XC_RS11310  2       21       0.551470   flagellar biosynthesis protein FliS
XC_RS11315  1       19       1.272415   hypothetical protein
XC_RS11320  0       16       0.274201   hypothetical protein
XC_RS11325  0       15       0.014261   response regulator
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
XC_RS11240  HTHFIS  38     5e-05    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 37.5 bits (87), Expect = 5e-05
Identities = 15/75 (20%), Positives = 29/75 (38%), Gaps = 9/75 (12%)

Query: 184 VLVVDDSRVARQQIRSVLDQLGVAATLLSDGRQALDHLLQVAASGENPAERYAMVISDIE 243
+LV DD R + L + G + S+ + A +V++D+
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIA---------AGDGDLVVTDVV 56

Query: 244 MPAMDGYTLTTEIRR 258
MP + + L I++
Sbjct: 57 MPDENAFDLLPRIKK 71


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
XC_RS11260  FLGHOOKAP1  45     3e-07    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 45.3 bits (107), Expect = 3e-07
Identities = 25/67 (37%), Positives = 37/67 (55%), Gaps = 3/67 (4%)

Query: 4 NTSLSGINAANADLNVTSNNIANVNTTGFKESRAEFADMFQSTSYGLSRNAVGSGVRVSN 63
N ++SG+NAA A LN SNNI++ N G+ A Q+ S + VG+GV VS
Sbjct: 5 NNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMA---QANSTLGAGGWVGNGVYVSG 61

Query: 64 VAQQFSQ 70
V +++
Sbjct: 62 VQREYDA 68



Score = 42.6 bits (100), Expect = 2e-06
Identities = 17/65 (26%), Positives = 33/65 (50%)

Query: 343 TSGAARVGAPDTSDLGQIESGSLEASTVDLTEQLVNMIVAQRNFQANSQMISTQDQVTQT 402
T+ A + + Q+ + S V+L E+ N+ Q+ + AN+Q++ T + +
Sbjct: 482 TATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDA 541

Query: 403 IINIR 407
+INIR
Sbjct: 542 LINIR 546


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
XC_RS11265  FLGHOOKAP1  30     0.008    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 30.3 bits (68), Expect = 0.008
Identities = 9/31 (29%), Positives = 19/31 (61%)

Query: 5 LYVAMTGARASLQAQSTVSHNLANVDTVGFK 35
+ AM+G A+ A +T S+N+++ + G+
Sbjct: 4 INNAMSGLNAAQAALNTASNNISSYNVAGYT 34


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
XC_RS11270  FLGHOOKAP1  39     1e-05    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 39.2 bits (91), Expect = 1e-05
Identities = 12/41 (29%), Positives = 20/41 (48%)

Query: 219 LEGSNVNTVEELVSMIETQRAYEMNAKAISTTDSMLGYLNN 259
S VN EE ++ Q+ Y NA+ + T +++ L N
Sbjct: 504 QSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALIN 544



Score = 37.6 bits (87), Expect = 3e-05
Identities = 19/82 (23%), Positives = 31/82 (37%), Gaps = 20/82 (24%)

Query: 5 LWVAKTGLDAQQTRMSVISNNLANTNTTGFKRDRAAFEDLLYQQVRAPGGSTSAQTQLPT 64
+ A +GL+A Q ++ SNN+++ N G+ R T T
Sbjct: 4 INNAMSGLNAAQAALNTASNNISSYNVAGYTRQT-----------------TIMAQANST 46

Query: 65 ---GLQLGTGVRVVSTFKGFDQ 83
G +G GV V + +D
Sbjct: 47 LGAGGWVGNGVYVSGVQREYDA 68


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS11275  FLGLRINGFLGH  142    3e-44    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 142 bits (359), Expect = 3e-44
Identities = 76/196 (38%), Positives = 111/196 (56%), Gaps = 9/196 (4%)

Query: 39 VPVVAPVAQPTAGAIYAAGPSLN-----LYGDRRARDVGDLLTVNLVENTTASSTANTSI 93
VP PVA G+I+ + +N L+ DRR R++GD LT+ L EN +AS +++ +
Sbjct: 40 VPGPTPVA---NGSIFQSAQPINYGYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANA 96

Query: 94 SKADTVDMSTPTLLGVPLTVNGIDVLRNSTSGDRSFDGKGNTAQSNRMQGSVTVTVMQRL 153
S+ + T+ + G SG +F+GKG SN G++TVTV Q L
Sbjct: 97 SRDGKTNFGFDTVPRYLQGLFGNARADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVL 156

Query: 154 PNGNLVIQGQKNLRLTQGDELVQVQGIVRAADIAPDNSVPSSKVADARIAYGGRGAIAQS 213
NGNL + G+K + + QG E ++ G+V I+ N+VPS++VADARI Y G G I ++
Sbjct: 157 VNGNLHVVGEKQIAINQGTEFIRFSGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEA 216

Query: 214 NAMGWLSRFFNSRLSP 229
MGWL RFF + LSP
Sbjct: 217 QNMGWLQRFFLN-LSP 231


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS11280  FLGPRINGFLGI  366    e-128    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 366 bits (940), Expect = e-128
Identities = 157/364 (43%), Positives = 221/364 (60%), Gaps = 9/364 (2%)

Query: 10 LLAAAVALCAIAAPASAERIKDLAQVGGVRGNALVGYGLVVGLDGSGDRTSQAPFTVQSL 69
+ +A L A A RIKD+A + R N L+GYGLVVGL G+GD +PFT QS+
Sbjct: 12 VFSALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSM 71

Query: 70 KNLLGELGVNVPANVNPQLKNVAAVAIHAELPPFAKPGQPIDITVSSIANAVSLRGGSLL 129
+ +L LG+ KN+AAV + A LPPFA PG +D+TVSS+ +A SLRGG+L+
Sbjct: 72 RAMLQNLGITTQGG-QSNAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLI 130

Query: 130 MAPLKGADGQVYAMAQGNLVVGGFGAQGKDGSRVSVNIPSVGRIPNGATVERALPDVFAG 189
M L GADGQ+YA+AQG L+V GF AQG D + ++ + + R+PNGA +ER LP F
Sbjct: 131 MTSLSGADGQIYAVAQGALIVNGFSAQG-DAATLTQGVTTSARVPNGAIIERELPSKFKD 189

Query: 190 SGEITLNLHQNDFTTVSRMVAAIDN----SFGAGTARAVDGVTVSVRSPTDPSARIGLLA 245
S + L L DF+T R+ ++ +G A D ++V+ P + L+A
Sbjct: 190 SVNLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKP-RVADLTRLMA 248

Query: 246 RLENVELSPGDAPAKVVVNARTGTVVIGQLVRVMPAAIAHGSLTVTISENTNVSQPGAFS 305
+EN+ + D PAKVV+N RTGT+VIG VR+ A+++G+LTV ++E+ V QP FS
Sbjct: 249 EIENLTVET-DTPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQPAPFS 307

Query: 306 GGRTAVTPQSTIKATSEGSRMFKFEGGTTLDQIVRAVNEVGAAPGDLVAILEALKQAGAL 365
G+TAV PQ+ I A EGS++ E G L +V +N +G ++AIL+ +K AGAL
Sbjct: 308 RGQTAVQPQTDIMAMQEGSKVAIVE-GPDLRTLVAGLNSIGLKADGIIAILQGIKSAGAL 366

Query: 366 TAEL 369
AEL
Sbjct: 367 QAEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS11285  FLGFLGJ  130    7e-37    Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 130 bits (327), Expect = 7e-37
Identities = 63/140 (45%), Positives = 82/140 (58%), Gaps = 4/140 (2%)

Query: 221 FVAKIWTHAQKAARELGVDPRALVAQAALETGWGRRGI--GNGGDSNNLFGIKATG-WNG 277
F+A++ AQ A+++ GV ++AQAALE+GWG+R I NG S NLFG+KA+G W G
Sbjct: 152 FLAQLSLPAQLASQQSGVPHHLILAQAALESGWGQRQIRRENGEPSYNLFGVKASGNWKG 211

Query: 278 AKVTTGTHEYVNGVKTTETADFRAYGSAEESFADYVRLLKNNSRYQTALQAGTDIKGFAR 337
T EY NG A FR Y S E+ +DYV LL N RY A+ + A+
Sbjct: 212 PVTEITTTEYENGEAKKVKAKFRVYSSYLEALSDYVGLLTRNPRY-AAVTTAASAEQGAQ 270

Query: 338 GLQQAGYATDPGYAAKIAAI 357
LQ AGYATDP YA K+ +
Sbjct: 271 ALQDAGYATDPHYARKLTNM 290



Score = 72.1 bits (176), Expect = 3e-16
Identities = 44/121 (36%), Positives = 65/121 (53%), Gaps = 15/121 (12%)

Query: 19 AKIDKVSRQLEGQFAQMLVKSMRDASSGDPMFPGENQ-MFREMYDQQMAKALTDGKGLGL 77
A I V+RQ+EG F QM++KSMRDA D +F E+ ++ MYDQQ+A+ +T GKGLGL
Sbjct: 31 ANIRPVARQVEGMFVQMMLKSMRDALPKDGLFSSEHTRLYTSMYDQQIAQQMTAGKGLGL 90

Query: 78 SAMISKQLSGDTGGPA-------LNTSLSTADAAKAYALVAGKR-------DASLPLPAR 123
+ M+ KQ++ + P + L T + AL + D SLP ++
Sbjct: 91 AEMMVKQMTPEQPLPEESTPAAPMKFPLETVVRYQNQALSQLVQKAVPRNYDDSLPGDSK 150

Query: 124 D 124

Sbjct: 151 A 151


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
XC_RS11290  FLGHOOKAP1  229    1e-69    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 229 bits (585), Expect = 1e-69
Identities = 150/441 (34%), Positives = 223/441 (50%), Gaps = 16/441 (3%)

Query: 2 SIMSTGTSALIAFQRALSTVSHNVANINTEGYSRQRVEFATRTPTDMGYAFVGNGAKISD 61
S+++ S L A Q AL+T S+N+++ N GY+RQ A T +VGNG +S
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSG 61

Query: 62 VGRVADQLAISRLL----DSGGELSRLQQLSSLSNRVDSLYSNTATNVAGLWSNFFDSAS 117
V R D ++L S G +R +Q+S + N + + S+ AT + +FF S
Sbjct: 62 VQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQM----QDFFTSLQ 117

Query: 118 ALSSNASSTAERQSMLDSGNSLATRFKQLNGQMDSLSNEVNSGLTSSVDEVNRLTQQIAK 177
L SNA A RQ+++ L +FK + + +VN + +SVD++N +QIA
Sbjct: 118 TLVSNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIAS 177

Query: 178 INGTI----GNSIDNASADLLDQRDALVSKLVGYTGGTAVIQDGGFMNVFTAGGQALVVG 233
+N I G + +LLDQRD LVS+L G +QDGG N+ A G +LV G
Sbjct: 178 LNDQISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQG 237

Query: 234 TTAAKLTTVADPYQPTKLQVAMQTQGQNVSLSPSSL--GGQIGGLLEFRSNVLEPTQAEL 291
+TA +L V P++ VA P L G +GG+L FRS L+ T+ L
Sbjct: 238 STARQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTL 297

Query: 292 GRLAVGMASSFNTAHSQGMDLYGALGGNFFNIGSPTTAANPKNTGTAALTASFSNLGAVD 351
G+LA+ A +FNT H G D G G +FF IG P N KN G A+ A+ ++ AV
Sbjct: 298 GQLALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVL 357

Query: 352 GQNVTLSFDAGSWKATRTDTGSTVPLTGTGTAADPLVVNGVSMVVGGAPANGDKFLLQPT 411
+ +SFD W+ TR + +T T T A + +G+ + G PA D F L+P
Sbjct: 358 ATDYKISFDNNQWQVTRLASNTT--FTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPV 415

Query: 412 AGLAGSLSVAITDPSRIAAAT 432
+ ++ V ITD ++IA A+
Sbjct: 416 SDAIVNMDVLITDEAKIAMAS 436



Score = 79.6 bits (196), Expect = 9e-18
Identities = 36/105 (34%), Positives = 56/105 (53%)

Query: 517 AGSSDNGNAKLLAKVEDAKALNGGTVTLNGALSGLTTSVGSAARAANYAADAQEVINDQA 576
AG SDN N + L ++ GG + N A + L + +G+ ++ Q + Q
Sbjct: 440 AGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGNKTATLKTSSATQGNVVTQL 499

Query: 577 QASRDSISGVNLDEEAADMLKLQQAYQAAAQMISTADTIFQAILG 621
+ SISGVNLDEE ++ + QQ Y A AQ++ TA+ IF A++
Sbjct: 500 SNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALIN 544


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
XC_RS11295  FLAGELLIN  53     1e-09    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 52.7 bits (126), Expect = 1e-09
Identities = 59/350 (16%), Positives = 110/350 (31%), Gaps = 8/350 (2%)

Query: 4 RISTSMMYSQSVSAMTAKQSRLNQLEAQLSSGQRLVTAKDDPVAAGTAVGLDRALAAITR 63
I+T+ + + + + QS L+ +LSSG R+ +AKDD A + +T+
Sbjct: 3 VINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQ 62

Query: 64 FGENANNVQNRLGLQENALAQAGDKMARVTELAVQSNNSSLSPDDRKAIASELTALRDSM 123
NAN+ + E AL + + + RV EL+VQ+ N + S D K+I E+ + +
Sbjct: 63 ASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEI 122

Query: 124 VSLANSTDGTGRYLFAGTADGNAPFIKSNGN----VLYNGDQTQRQVEVAPDTFVSDTLP 179
++N T G + + + +N + + +
Sbjct: 123 DRVSNQTQFNGVKVLSQDN-QMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEAT 181

Query: 180 GSEIFMRIRTGDGSVDAHANATNTGTGLLLDFSRDASSGSWNGGSYSVQFTAADTYEVRD 239
++ + G A + ++ V
Sbjct: 182 VGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDA 241

Query: 240 STNALVSTGTYKDG--GDINAAGVRMRISGAPAVGDSFQIGASGTKDVFSTID-DMVAAL 296
N V G A + I G G + T D + D + +
Sbjct: 242 ENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVST 301

Query: 297 NSDTQTPTQKAAMINTLQSSMRDIAQASSKMIDARASGGAQLSAIDNANS 346
+ + T A I +++ SSK + G N
Sbjct: 302 TINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNE 351



Score = 31.2 bits (70), Expect = 0.007
Identities = 48/269 (17%), Positives = 80/269 (29%), Gaps = 1/269 (0%)

Query: 127 ANSTDGTGRYLFAGTADGNAPFIKSNGNVLYNGDQTQRQVEVAPDTFVSDTLPGSEIFMR 186
AN T D + G + DTF + +
Sbjct: 232 ANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKT 291

Query: 187 IRTGDGSVDAHANATNTGTGLLLDFSRDASSGSWNGGSYSVQFTAADTYEVRDSTNALVS 246
G+G V N + + A+ + S +T+ +
Sbjct: 292 GNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNE 351

Query: 247 TGTYKDGGDINAAGVRMRISGAPAVGDSFQIGASGTKDVFSTIDDMVAALNSDTQTPTQK 306
+ D NA +I+ A + G T + D A T
Sbjct: 352 SAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFID-KTASGVSTLINEDA 410

Query: 307 AAMINTLQSSMRDIAQASSKMIDARASGGAQLSAIDNANSLLESNEVTLKTTLSSIRDLD 366
AA + + + I A SK+ R+S GA + D+A + L + L + S I D D
Sbjct: 411 AAAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDAD 470

Query: 367 YASAIGQYELERASLQAAQTIFQQMQSSS 395
YA+ + + QA ++ Q
Sbjct: 471 YATEVSNMSKAQILQQAGTSVLAQANQVP 499


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
XC_RS11300  FLAGELLIN  126    3e-34    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 126 bits (318), Expect = 3e-34
Identities = 117/412 (28%), Positives = 168/412 (40%), Gaps = 18/412 (4%)

Query: 2 AQVINTNVMSLNAQRNLNTNSSSMALSIQQLSSGKRITSASVDAAGLAISERFTTQIRGL 61
AQVINTN +SL Q NLN + SS++ +I++LSSG RI SA DAAG AI+ RFT+ I+GL
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 DVASRNANDGISLAQTAEGAMVEIGNNLQRIRELSVQSANATNSATDREALNSEVKQLTS 121
ASRNANDGIS+AQT EGA+ EI NNLQR+RELSVQ+ N TNS +D +++ E++Q
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 EIDRVANQTSFNGTKLLNGDFSGALFQVGADAGQTIGI-----------------NSIVD 164
EIDRV+NQT FNG K+L+ D QVGA+ G+TI I N +
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDNQ-MKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKE 179

Query: 165 ANVDSLGKANFAASVSGAGVTGAATASGSLSGITLAFKDASGAAKSVAVADIKIASGDTA 224
A V L + + GA ++ + + + T
Sbjct: 180 ATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTD 239

Query: 225 ADINKKVASAINDKLDQTGMYASIDTSGNVKLESLKAGQDFTSLSGGTSGAAGITAGAGI 284
N G + +G +K D+ ++ G +
Sbjct: 240 DAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKV 299

Query: 285 QTASAASGSTASTLSSLDISTFSGAQKALEIVDKALTSVNSSRADMGAVQNRFTSTIANL 344
T T + + A + + VN +N
Sbjct: 300 STTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLE 359

Query: 345 AATSENLTASRSRIADTDYAKTTAELTRTRTQILQQAGTAMLAQAKSVPQNV 396
A + + + A + + + TA
Sbjct: 360 ANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAA 411



Score = 87.8 bits (217), Expect = 6e-21
Identities = 71/342 (20%), Positives = 126/342 (36%), Gaps = 5/342 (1%)

Query: 60 GLDVASRNANDGISLAQTAEGAMVEIGNNLQRIRELSVQSANATNSATDREALNSEVKQL 119
G +V L + + + + ++ A + T + +V
Sbjct: 171 GFNVNGPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVN 230

Query: 120 TSEIDRVANQTSFNGTKLLNGDFSGALFQVGADAGQTIGINSIVDANVDSLGKANFAASV 179
+ + N L A A D G +
Sbjct: 231 AANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTK 290

Query: 180 SGAGVTGAATASGSLSGITLAFKDASGAAKSVAVADIKIASGDTAADINKKVASAINDKL 239
+G G + + + +TL D + A +V A ++ + + +N + K
Sbjct: 291 TGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKN 350

Query: 240 DQTGMYASIDTSGNVKLESLKAGQDFTSLSGGTSGAAGITAGAGIQTASAASGSTASTLS 299
+ + + + + + +T + ++ ++
Sbjct: 351 ESAKLSDLEANNAVKGESKITVN---GAEYTANAAGDKVTLAGKTMFIDKTASGVSTLIN 407

Query: 300 SLDISTFSGAQKALEIVDKALTSVNSSRADMGAVQNRFTSTIANLAATSENLTASRSRIA 359
+ L +D AL+ V++ R+ +GA+QNRF S I NL T NL ++RSRI
Sbjct: 408 EDAAAAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIE 467

Query: 360 DTDYAKTTAELTRTRTQILQQAGTAMLAQAKSVPQNVLSLLQ 401
D DYA + + ++ QILQQAGT++LAQA VPQNVLSLL+
Sbjct: 468 DADYATEVSNM--SKAQILQQAGTSVLAQANQVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
XC_RS11325  HTHFIS  71     2e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 70.6 bits (173), Expect = 2e-16
Identities = 34/160 (21%), Positives = 66/160 (41%), Gaps = 9/160 (5%)

Query: 2 RVIIVDDHTLVRAGLSRLLQTFSGIDVVGEASNAQQALDMTSLHRPDLVLMDLSLPGRSG 61
+++ DD +R L++ L + +G DV SNA + DLV+ D+ +P +
Sbjct: 5 TILVADDDAAIRTVLNQAL-SRAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 62 LDAMTDVLRAAPRTHVVMMSMHDDPVHVRDALDRGAVGFVVKDAAPLELELALRAAAAGQ 121
D + + +A P V++MS + + A ++GA ++ K EL + A
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA---- 118

Query: 122 VFLSPQISSKMIAPMLGREKPVGIAALSPRQREILREIGR 161
+ + + + + S +EI R + R
Sbjct: 119 ---LAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLAR 155


85  XC_RS11370  XC_RS11460  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
XC_RS11370  2       26       2.537101   flagellar hook-basal body protein
XC_RS11375  2       26       3.535416   flagellar M-ring protein FliF
XC_RS11380  2       29       3.530384   flagellar motor switch protein FliG
XC_RS11385  3       28       3.911785   flagellar assembly protein FliH
XC_RS11390  3       27       3.612461   flagellar protein FliI
XC_RS11395  3       20       2.290753   flagellar export protein FliJ
XC_RS11400  3       15       1.970081   flagellar protein
XC_RS11405  3       15       -0.959320  flagellar biosynthesis protein
XC_RS11415  3       18       0.188663   flagellar motor switch protein FliN
XC_RS11420  2       11       2.220840   flagellar protein
XC_RS11425  2       10       2.695677   flagellar biosynthesis protein FliP
XC_RS11430  2       11       3.263782   hypothetical protein
XC_RS11435  1       13       3.067711   flagellar biosynthesis
XC_RS11440  1       16       3.150532   flagellar biosynthesis protein FliR
XC_RS11445  0       18       3.572500   diguanylate cyclase
XC_RS11450  -1      22       3.304715   diguanylate cyclase
XC_RS11455  0       26       3.022359   diguanylate cyclase
XC_RS11460  -1      27       2.227555   flagellar biosynthesis protein FlhB
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
XC_RS11370  FLGHOOKFLIE  61     1e-15    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 61.2 bits (148), Expect = 1e-15
Identities = 28/84 (33%), Positives = 48/84 (57%)

Query: 40 AGAQGTPATQAPSFSETLRGAIGGVNEAQQKSGALAKAFEMGDPSADLARVMVASQQSQV 99
A AQ + SF+ L A+ +++ Q + A+ F +G+P L VM Q++ V
Sbjct: 20 ARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTLGEPGVALNDVMTDMQKASV 79

Query: 100 AFRATVEVRNRLVQAYQDVMNMPL 123
+ + ++VRN+LV AYQ+VM+M +
Sbjct: 80 SMQMGIQVRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS11375  FLGMRINGFLIF  347    e-115    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 347 bits (891), Expect = e-115
Identities = 184/576 (31%), Positives = 295/576 (51%), Gaps = 48/576 (8%)

Query: 16 KAGQWFDRIRSMQITRKLTMMAMIALAVAAGLAVFFWSQKPGYQALYTGLDDKGNAEAAD 75
K +W +R+R+ ++ ++ + AVA +A+ W++ P Y+ L++ L D+
Sbjct: 11 KPLEWLNRLRANP---RIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQDGGAIVA 67

Query: 76 LLRTAQIPFKIDQSTGAISVPQDRLYDARLKLAGSGLTGQQTGGGFELMEKDPGFGVSQF 135
L IP++ +GAI VP D++++ RL+LA GL + GFEL++++ FG+SQF
Sbjct: 68 QLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLP-KGGAVGFELLDQEK-FGISQF 125

Query: 136 VENARYQHALETELSRTIGTLRPVREARVHLAIPKPSAFTRQRDVASASVVLELRGGQGL 195
E YQ ALE EL+RTI TL PV+ ARVHLA+PKPS F R++ SASV + L G+ L
Sbjct: 126 SEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRAL 185

Query: 196 ERNQVDAIVNLVASSIPDMTPERVTVVDQSGRMLSIADPNSDAAQHAAQFEQVRRQESSY 255
+ Q+ A+V+LV+S++ + P VT+VDQSG +L+ ++ + AQ + ES
Sbjct: 186 DEGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDLN-DAQLKFANDVESRI 244

Query: 256 NQRIRELLEPMTGAGRVNPEVSVDMDFSVVEEARELYN----GEPAKLRSEQVSDSS-TS 310
+RI +L P+ G G V+ +V+ +DF+ E+ E Y+ A LRS Q++ S
Sbjct: 245 QRRIEAILSPIVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVG 304

Query: 311 ATGPQGPPGATSNSPGQPPAPAATAGAPGAPAAGQAATPAAPTES----------SKSAT 360
A P G PGA SN P P A P TP T + ++ T
Sbjct: 305 AGYPGGVPGALSNQP--APPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNET 362

Query: 361 RNYELDRTLQHTRQPAGRIKRVSVAVLLDNVPRPGAKGKIVEQPLTAAELTRIEGLVKQA 420
NYE+DRT++HT+ G I+R+SVAV+++ K PLTA ++ +IE L ++A
Sbjct: 363 SNYEVDRTIRHTKMNVGDIERLSVAVVVNYKTLADGKP----LPLTADQMKQIEDLTREA 418

Query: 421 VGFDAARGDTVSVMNAPFVREAVAGEEGPKWWEDPRVQNGLRLLVGAVVVLALLF----G 476
+GF RGDT++V+N+PF G E P W + + L ++VL + +
Sbjct: 419 MGFSDKRGDTLNVVNSPFSAVDNTGGELPFWQQQSFIDQLLAAG-RWLLVLVVAWILWRK 477

Query: 477 VVRPTLRQLTGTAAVKEKHKKGGNDGTPQSADVRMVDEDDDLMPRLEEDTAQIGQDKKLP 536
VRP L + A ++ + + + L +E Q +++L
Sbjct: 478 AVRPQLTRRVEEAKAAQEQAQVRQET--------EEAVEVRLSK--DEQLQQRRANQRLG 527

Query: 537 IALPDAYEERVRLAREAVKADSKRVAQVVKGWVASE 572
E + RE D + VA V++ W++++
Sbjct: 528 ------AEVMSQRIREMSDNDPRVVALVIRQWMSND 557


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS11380  FLGMOTORFLIG  307    e-106    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 307 bits (789), Expect = e-106
Identities = 105/329 (31%), Positives = 199/329 (60%)

Query: 1 MSGVQRAAVLLLSLGETDAAEVLKHMDPKEVQKIGIAMATMSGISRDQVEKVMDDFNGEL 60
++G Q+AA+LL+S+G +++V K++ +E++ + +A + I+ + + V+ +F +
Sbjct: 15 LTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFKELM 74

Query: 61 AGKTSLGVGADDYIRNVLIQALGADKAGGLIDRILLGRNTTGLDTLKWMDPRAVADLVRN 120
+ + G DY R +L ++LG KA +I+ + + + ++ DP + + ++
Sbjct: 75 MAQEFIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQSRPFEFVRRADPANILNFIQQ 134

Query: 121 EHPQIIAIVMAHLDSDQAAEALKLLPERTRADVLLRIATLDGIPPNALSELNDIMERQFA 180
EHPQ IA+++++LD +A+ L LP + +V RIA +D P + E+ ++E++ A
Sbjct: 135 EHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLEKKLA 194

Query: 181 GNQNLKSSNVGGIKVAANILNFLDTGPEQGVLGEIGKIDADLASKIQDLMFVFDNLVDLD 240
+ ++ GG+ I+N D E+ ++ + + D +LA +I+ MFVF+++V LD
Sbjct: 195 SLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDIVLLD 254

Query: 241 DRGLQTLLREVSGERLGLALRGADVKVREKITRNMSQRAAEILLEDMEARGPVRLADVEA 300
DR +Q +LRE+ G+ L AL+ D+ V+EKI +NMS+RAA +L EDME GP R DVE
Sbjct: 255 DRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRKDVEE 314

Query: 301 AQKEILTIVRRLADEGAISLGGAGAEAMV 329
+Q++I++++R+L ++G I + G E ++
Sbjct: 315 SQQKIVSLIRKLEEQGEIVISRGGEEDVL 343


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS11385  FLGFLIH       44     7e-08    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 44.4 bits (104), Expect = 7e-08
Identities = 36/159 (22%), Positives = 76/159 (47%), Gaps = 7/159 (4%)

Query: 51 QEGFARGHAEGFAQGQSEVRRLTAQIDGILDNFTRPLARLENEVVGALGELTVRIAGSLV 110
QEG A+G +G A+ +S+ + A++ ++ F L L++ + L ++ + A ++
Sbjct: 73 QEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRLMQMALEAARQVI 132

Query: 111 GRAYQAEPQLLADLVQEAIDAVGSAGREVEVRLHPDDITALLPHLATSSTT---RVAPDL 167
G+ + L +Q+ + + ++R+HPDD+ + L + + R+ D
Sbjct: 133 GQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGATLSLHGWRLRGDP 192

Query: 168 TLSRGDLRVHAESVRVDGTLDARLRAALETVMRKSGAGL 206
TL G +V A+ +G LDA + + + R + G+
Sbjct: 193 TLHPGGCKVSAD----EGDLDASVATRWQELCRLAAPGV 227


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS11395  FLGFLIJ       27     0.017    Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 27.5 bits (60), Expect = 0.017
Identities = 34/140 (24%), Positives = 57/140 (40%), Gaps = 4/140 (2%)

Query: 1 MMQSKRIDPLLRRAQEQEDKVARDLAERQRALDTHQSRLEELRRYAEEYASSQMSGTSAV 60
M + + L A+++ + AR L E +R + +L+ L Y EY ++ S SA
Sbjct: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60

Query: 61 ALSNR----RAFLDRLDSAVLQQAQTVESNRAKVEAERTRLLLASREKQVLEQLAASYRA 116
SNR + F+ L+ A+ Q Q + KV+ + Q + L
Sbjct: 61 ITSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120

Query: 117 QENKVIERRDQREMDDLGAR 136
R DQ++MD+ R
Sbjct: 121 AALLAENRLDQKKMDEFAQR 140


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS11400  FLGHOOKFLIK   46     1e-07    Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 46.4 bits (109), Expect = 1e-07
Identities = 55/241 (22%), Positives = 96/241 (39%), Gaps = 17/241 (7%)

Query: 178 ASMSAATTATTATPTLPTDAAAPATTATAATALPSLGALAPAAATAKPAAATALSGEPQA 237
AS+SA P AP+T + TA+P A +P
Sbjct: 129 ASLSALFAMLPGFDNTPKVTDAPSTVLPTEKPTLFTKLTSEQLTTAQPDDAPGTPAQP-- 186

Query: 238 AALMSLATAALDSPADDLKTDAPDAPAFVLPTTTASALNRLQDAPAVFSASPTPTPEMGS 297
++ A S A+ + T +P A T P A+P + +GS
Sbjct: 187 ---LTPLVAEAQSKAEVISTPSPVTAAASPLITPHQT------QPLPTVAAPVLSAPLGS 237

Query: 298 DTFDDAIGARLSWLADQKIGHAHIKVTPNEMGPVEVRLHLEGDKVNASFTAANADVRQAL 357
+ ++ +S Q A +++ P ++G V++ L ++ ++ + + VR AL
Sbjct: 238 HEWQQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAAL 297

Query: 358 EQSLPRLRDMLGQQGFQLGQADVS------QQQQNPSGNRNGAGTDGNGLSLEDSPPVGI 411
E +LP LR L + G QLGQ+++S QQQ ++ + L+ ED + +
Sbjct: 298 EAALPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPV 357

Query: 412 P 412
P
Sbjct: 358 P 358


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS11415  FLGMOTORFLIN  115    1e-36    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 115 bits (289), Expect = 1e-36
Identities = 54/103 (52%), Positives = 78/103 (75%), Gaps = 1/103 (0%)

Query: 9 AAPATFESLHAERDQDATDLNLDVILDVPVTLSLEVGRARIPIRNLLQLNQGSVVELERG 68
AA A F+ L D ++D+I+D+PV L++E+GR R+ I+ LL+L QGSVV L+
Sbjct: 34 AADAVFQQL-GGGDVSGAMQDIDLIMDIPVKLTVELGRTRMTIKELLRLTQGSVVALDGL 92

Query: 69 AGEPLDVYVNGTLIAHGEVVVINDRFGIRLTDVVSPSERIRRL 111
AGEPLD+ +NG LIA GEVVV+ D++G+R+TD+++PSER+RRL
Sbjct: 93 AGEPLDILINGYLIAQGEVVVVADKYGVRITDIITPSERMRRL 135


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS11425  FLGBIOSNFLIP  241    4e-82    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 241 bits (616), Expect = 4e-82
Identities = 123/229 (53%), Positives = 162/229 (70%), Gaps = 1/229 (0%)

Query: 49 APATGNQIPTLPNVSVGRIGDQPVSLPLQTLLLMTAITLLPSMLLVLTAFTRITIVLGLL 108
P Q+P + + + G Q SLP+QTL+ +T++T +P++LL++T+FTRI IV GLL
Sbjct: 16 TPLAFAQLPGITSQPL-PGGGQSWSLPVQTLVFITSLTFIPAILLMMTSFTRIIIVFGLL 74

Query: 109 RQALGTGQTPSNQVLLGLAMFLTALVMMPVWQKMWGAGLQPYLNNQIDFSTAWTLTTQPL 168
R ALGT P NQVLLGLA+FLT +M PV K++ QP+ +I A QPL
Sbjct: 75 RNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEKISMQEALEKGAQPL 134

Query: 169 RAFMLAQIRETDLMTFAGMAGDGKYAGPDAVPFPVLVASFVTSELKTAFEIGFLIFIPFV 228
R FML Q RE DL FA +A G GP+AVP +L+ ++VTSELKTAF+IGF IFIPF+
Sbjct: 135 REFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELKTAFQIGFTIFIPFL 194

Query: 229 IIDLVVASVLMSMGMMMLSPMLISAPFKILLFILVDGWVLVVGTLAASF 277
IIDLV+ASVLM++GMMM+ P I+ PFK++LF+LVDGW L+VG+LA SF
Sbjct: 195 IIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLAQSF 243


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS11435  TYPE3IMQPROT  46     2e-10    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 45.9 bits (109), Expect = 2e-10
Identities = 17/69 (24%), Positives = 33/69 (47%)

Query: 13 GLVTVLWIAGPMLLAVLVVGVVIGVVQAATQLNEPTIGFVAKAVALTATLFATGSMLLGH 72
L VL ++G + ++G+++G+ Q TQL E T+ F K + + LF
Sbjct: 11 ALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLFLLSGWYGEV 70

Query: 73 LVEFTIELF 81
L+ + ++
Sbjct: 71 LLSYGRQVI 79


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS11440  TYPE3IMRPROT  121    2e-35    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 121 bits (305), Expect = 2e-35
Identities = 80/239 (33%), Positives = 130/239 (54%), Gaps = 2/239 (0%)

Query: 23 WTMLRTGALLTAMPLIGTRAVPGRVRVMLTGTLAMALAPILPPVPEWDGFNATAVLSIAR 82
W +LR AL++ P++ R+VP RV++ L + A+AP LP F+ A+ +
Sbjct: 18 WPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPANDVPV-FSFFALWLAVQ 76

Query: 83 ELAVGASMGFMLRLIFEAGALAGELVSQATGLSFAQMSDPLRGVTSGVIAQWFYIGFGLL 142
++ +G ++GF ++ F A AGE++ GLSFA DP + V+A+ + LL
Sbjct: 77 QILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHLNMPVLARIMDMLALLL 136

Query: 143 FFAANGHLAVIALLVDSYKALPIGTAVPDAAAFAEVAPTLLLQVLRGGLTLALPMMVAML 202
F NGHL +I+LLVD++ LPIG ++ AF + + GL LALP++ +L
Sbjct: 137 FLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALT-KAGSLIFLNGLMLALPLITLLL 195

Query: 203 AVNLAFGALAKAAPALNPMQLGLPLTVLLGLFLLSSFASEFAPPVQRLFDSAFDAARAL 261
+NLA G L + AP L+ +G PLT+ +G+ L+++ AP + LF F+ +
Sbjct: 196 TLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFCEHLFSEIFNLLADI 254


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS11445  GPOSANCHOR    35     0.002    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 35.0 bits (80), Expect = 0.002
Identities = 21/59 (35%), Positives = 29/59 (49%), Gaps = 6/59 (10%)

Query: 783 KLLARKQELEHL--IAERTAELEQDKRDLEAARAALTHKATHDELTGLLNRAGILEALR 839
+L A Q+LE I+E A + +RDL+A+R A K E L + I EA R
Sbjct: 327 QLEAEHQKLEEQNKISE--ASRQSLRRDLDASREA--KKQLEAEHQKLEEQNKISEASR 381



Score = 33.9 bits (77), Expect = 0.003
Identities = 31/112 (27%), Positives = 43/112 (38%), Gaps = 20/112 (17%)

Query: 783 KLLARKQELEHLIAERTAELEQDKRDLEAARAALTHKATHDELTGLLNRAGILEALRGML 842
L A K +LEH A + +RDL+A+R A K E L + I EA R L
Sbjct: 292 ALEAEKADLEHQSQVLNANRQSLRRDLDASREAK--KQLEAEHQKLEEQNKISEASRQSL 349

Query: 843 --DSAPLREQPLAVVLIDLDYFKQVNDQHGHL-----AGDAVLAGVGKRLNA 887
D RE KQ+ +H L +A + + L+A
Sbjct: 350 RRDLDASREA-----------KKQLEAEHQKLEEQNKISEASRQSLRRDLDA 390


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS11460  TYPE3IMSPROT  340    e-118    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 340 bits (875), Expect = e-118
Identities = 103/344 (29%), Positives = 181/344 (52%), Gaps = 2/344 (0%)

Query: 8 GERTELPTEKRLREAREQGNIPQSRELSTAAVFGTGVFALMLMARGIGDGASVWMKTALS 67
GE+TE PT K++R+AR++G + +S+E+ + A+ LM ++ + S M +
Sbjct: 3 GEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLM--LIP 60

Query: 68 PDPKMRENPMALFGHFGDLLLQLLWVMVPLIGICLAAGLVGPLLMSGLHFSGKAIMPDLN 127
+ AL ++LL+ ++ PL+ + + ++ G SG+AI PD+
Sbjct: 61 AEQSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIK 120

Query: 128 KLNPMNGIKRMWGSNSLAELVKSILRLLFVGLAASLCISKGLHGLRSLVNRPLEQAVGNG 187
K+NP+ G KR++ SL E +KSIL+++ + + + I L L L +E
Sbjct: 121 KINPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLL 180

Query: 188 LDFTKSLLFYTAGALVLLAAFDAPYQKWNWLRKLKMTREEIKREMKESEGSPEVKGRIRQ 247
+ L+ V+++ D ++ + ++++LKM+++EIKRE KE EGSPE+K + RQ
Sbjct: 181 GQILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQ 240

Query: 248 MQMQMSQRRMMEALPTADVVLMNPTHYAVALKYEGGKMRAPVVVAKGVDEMAFRIREACE 307
++ R M E + + VV+ NPTH A+ + Y+ G+ P+V K D +R+ E
Sbjct: 241 FHQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIAE 300

Query: 308 QHRVAIVTAPPLARALYREAQLGKEIPVRLYSVVAQVLSYVYQL 351
+ V I+ PLARALY +A + IP A+VL ++ +
Sbjct: 301 EEGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQ 344


86  XC_RS12070  XC_RS12090  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTagDNBiasCDNBias%GCBiasProduct
XC_RS12070743-9.493207hemin transporter
XC_RS12075743-9.321871ABC transporter permease
XC_RS12080641-9.001482hydrogenase expression protein HypA
XC_RS12085537-7.857936ABC transporter
XC_RS12090640-6.535455multidrug transporter
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS12070  RTXTOXINC     55     2e-12    Gram-negative bacterial RTX toxin-activating protein C...
		>RTXTOXINC#Gram-negative bacterial RTX toxin-activating protein C signature.

Length = 170

Score = 54.9 bits (132), Expect = 2e-12
Identities = 30/124 (24%), Positives = 44/124 (35%), Gaps = 21/124 (16%)

Query: 24 KKFSIAAAYVWLW-------------------PAIRLGQLVTIEDEDGVWTGYALWAYLT 64
K I WLW PAI+ Q V + D Y WA L+
Sbjct: 5 KPLEILGHVSWLWASSPLHRNWPVSLFAINVLPAIQANQYVLLTR-DDYPVAYCSWANLS 63

Query: 65 PETASHLVLQDPPFLPISDWNEGDQLWILDFVAMPGHHRRLARALRHRLRPHFKQAHRLV 124
E + D L DW GD+ W +D++A G + L + +R + +A R+
Sbjct: 64 LENEIKYL-NDVTSLVAEDWTSGDRKWFIDWIAPFGDNGALYKYMRKKFPDELFRAIRVD 122

Query: 125 RDKT 128

Sbjct: 123 PKTH 126


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS12080  PERTACTIN     38     3e-05    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 37.8 bits (87), Expect = 3e-05
Identities = 33/100 (33%), Positives = 41/100 (41%), Gaps = 3/100 (3%)

Query: 21 VDPSEPPTEIPPIVVNPPPSPPPTIPPSPPVEQPGPITPPGGGGGVPTPLPPLVSGLGAG 80
V PP P P P P P PP PP + P P PP P P PP L A
Sbjct: 563 VGAKAPPAPKPAPQPGPQPGPQPPQPPQPP-QPPQPPQPPQRQPEAPAPQPPAGRELSAA 621

Query: 81 ADAYLNKSDEVKSLLSKYLAGGADVEFKDLGTVKGEVDAG 120
A+A +N V + + A + K LG ++ DAG
Sbjct: 622 ANAAVNTGG-VGLASTLWYAESNALS-KRLGELRLNPDAG 659


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS12085  PF05272       32     0.008    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.4 bits (73), Expect = 0.008
Identities = 19/65 (29%), Positives = 26/65 (40%), Gaps = 11/65 (16%)

Query: 515 KWIMSALQLRA-----PAGQVIAIVGNSGVGKTTLIRVLAGLEDLQVGDFLVNGEDLRKV 569
K+I+ R + + G G+GK+TLI L GL DF +
Sbjct: 578 KYILMGHVARVMEPGCKFDYSVVLEGTGGIGKSTLINTLVGL------DFFSDTHFDIGT 631

Query: 570 GKSSY 574
GK SY
Sbjct: 632 GKDSY 636


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS12090  RTXTOXIND     127    2e-34    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 127 bits (320), Expect = 2e-34
Identities = 67/423 (15%), Positives = 152/423 (35%), Gaps = 52/423 (12%)

Query: 48 ALFLLLATFVLTASYSKREHVSGQIISTHGRVDIRSGTPGLILSTTLKPNALVKKGQVLA 107
+ +L+ +G++ + +I+ ++ +K V+KG VL
Sbjct: 69 VIAFILSVL---GQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLL 125

Query: 108 ELSADITD---------------EAGR----------------SLSDETIKRALTRSEEL 136
+L+A + E R L DE + ++ E L
Sbjct: 126 KLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVL 185

Query: 137 TKEQLQTHDFS--GQRERELTRQVEETTGAMQEVARKISILEKKYAKNKELLKTIEPLLA 194
L FS ++ + +++ V +I+ E K L LL
Sbjct: 186 RLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLH 245

Query: 195 EKYVSKYTYLTYENALLDAEAEIQDARAQQSTLRNQ----RAALLGEITEIKTTASRQAS 250
++ ++K+ L EN ++A E++ ++Q + ++ + K +
Sbjct: 246 KQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLR 305

Query: 251 EIEREKSTIEDQVARAKSD-RLQTITSPLSGTVAAIYA-SQGQRIGTDSIIASITPSESV 308
+ + ++A+ + + I +P+S V + ++G + T + I P +
Sbjct: 306 QTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDT 365

Query: 309 FEAEILIPSRAIGHVNVGTEVLLNIAAFPKAKYGAIQGRIASLSTQTSPIGELERRYGRQ 368
E L+ ++ IG +NVG ++ + AFP +YG + G++ +++
Sbjct: 366 LEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIE----------D 415

Query: 369 SPTEPVYTAKVALPSQTIGVAQEAKSFLPGMEVDAELILEGRKIWEWMFDPFQTMGSRLT 428
V+ +++ + + GM V AE+ R + ++ P + +
Sbjct: 416 QRLGLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGMRSVISYLLSPLEESVTESL 475

Query: 429 GEK 431
E+
Sbjct: 476 RER 478


87  XC_RS13880  XC_RS13910  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTagDNBiasCDNBias%GCBiasProduct
XC_RS13880-181.522391RND transporter
XC_RS138850100.056993multidrug transporter
XC_RS138900110.045916hypothetical protein
XC_RS13895-1120.620234prephenate dehydrogenase
XC_RS139000161.550326pyridoxine kinase
XC_RS139052191.027073chaperone protein DnaJ
XC_RS139103210.894692chaperone protein DnaK
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS13880  RTXTOXIND     71     1e-15    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 71.0 bits (174), Expect = 1e-15
Identities = 32/175 (18%), Positives = 64/175 (36%), Gaps = 20/175 (11%)

Query: 104 LTRLASERAQVQAEIAAIQQSRAVAGIDVERSRQLLAEGLAGRRDYELTQIKVAEADAKL 163
+E ++++ I+ A + + QL K+ + +
Sbjct: 261 YVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLF---------KNEILDKLRQTTDNI 311

Query: 164 AESRAKLTRIDIQLNRQSAQLVRAPRDGRVQQLNAASGSAMVSPGTVLAVIAPERVERAV 223
+L + + + ++RAP +VQQL + +V+ L VI PE V
Sbjct: 312 GLLTLELAKNEERQQAS---VIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEV 368

Query: 224 ELYIDGRDVPLIRPGRPVRLEFEGWPAIQFSGWPSVAHGMFDGRVRAIDPNAAPD 278
+ +D+ I G+ I+ +P +G G+V+ I+ +A D
Sbjct: 369 TALVQNKDIGFINVGQNAI--------IKVEAFPYTRYGYLVGKVKNINLDAIED 415



Score = 67.5 bits (165), Expect = 2e-14
Identities = 29/207 (14%), Positives = 73/207 (35%), Gaps = 20/207 (9%)

Query: 5 ADRNAHFPT-LAAMRPPSIAKA--LAWMLLIGVAIAAAILALAPWVQTASGKGQVVSLDP 61
D N P L + P + +A+ ++ + IA + L A+ G++
Sbjct: 36 KDENEFLPAHLELIETPVSRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLT---H 92

Query: 62 GDRQQQVTAFVPGRVETWYVHDGQHVSRGDPIARVGDLDPDLLTRLASERAQVQAEIAAI 121
R +++ V+ V +G+ V +GD + ++ L + + Q A +
Sbjct: 93 SGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALG----AEADTLKTQSSLLQARL 148

Query: 122 QQSRAVAGIDVERSRQLLAEGLAGRRDYELTQIKVAEADAKLAES-----RAKLTRIDIQ 176
+Q+R +L L ++ + L + + + + ++
Sbjct: 149 EQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELN 208

Query: 177 LNRQSAQLVRAPRDGRVQQLNAASGSA 203
L+++ A+ + ++N +
Sbjct: 209 LDKKRAER-----LTVLARINRYENLS 230


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS13885  RTXTOXIND     37     1e-04    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 37.1 bits (86), Expect = 1e-04
Identities = 27/197 (13%), Positives = 62/197 (31%), Gaps = 24/197 (12%)

Query: 287 RAVLTRIDQATARLMLAQNDLKPRLDVSFEVSKDLGAPGVGGPNRSPTDAIIGFRFSVPL 346
+ R++Q +++ +L ++ + V ++I +FS
Sbjct: 142 SLLQARLEQTRYQILSRSIELNKLPELKL--PDEPYFQNVSEEEVLRLTSLIKEQFST-W 198

Query: 347 ENRSAKGRV--AEARAEIEALDQRSRFLRDQISVEVESIVISLNAAQRLAG--------I 396
+N+ + + + RAE + R + VE L+ L +
Sbjct: 199 QNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSR----LDDFSSLLHKQAIAKHAV 254

Query: 397 ADEERSLAD---RLAAAERRRFELGSG----DFFLVNQREETANDARVRLIDAQARIASA 449
++E + L + + ++ S + N+ +L I
Sbjct: 255 LEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLL 314

Query: 450 RAELAAATADRDALMLR 466
ELA + A ++R
Sbjct: 315 TLELAKNEERQQASVIR 331


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS13900  PF04183       33     0.002    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 32.5 bits (74), Expect = 0.002
Identities = 18/95 (18%), Positives = 30/95 (31%), Gaps = 11/95 (11%)

Query: 106 SLANGEAFADWLEQTLPQAPQLRYCLDPVIGDTHTGPYVEPGLERVFAERLLPHAWLVTP 165
+A G + WL+Q L ++G+ G G A P+ +
Sbjct: 291 YIAAGPLASRWLQQVFATDATLVQSGAVILGEPAAGYVSHEGYA---ALARAPYRY---- 343

Query: 166 NAFELG---RLTGLPSLEQDDAIVAARALLARGPQ 197
LG R L+ D++ V L+
Sbjct: 344 -QEMLGVIWRENPCRWLKPDESPVLMATLMECDEN 377


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS13910  SHAPEPROTEIN  139    2e-38    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 139 bits (351), Expect = 2e-38
Identities = 81/392 (20%), Positives = 145/392 (36%), Gaps = 91/392 (23%)

Query: 5 IGIDLGTTNSCVAIMDGGKARVIENSEGDRTTPSIVAYTKDGE------VLVGASAKRQA 58
+ IDLGT N+ + + G + E PS+VA +D VG AK+
Sbjct: 13 LSIDLGTANTLIYVKGQGIV-LNE--------PSVVAIRQDRAGSPKSVAAVGHDAKQML 63

Query: 59 VTNPKNTFYAVKRLIGRKFTDGEVQKDISHVPYGILAHDNGDAWVQTSDSKRMAPQEISA 118
P N A++ + KD + + +++
Sbjct: 64 GRTPGN-IAAIRPM-----------KDGVIADFFV-------------------TEKMLQ 92

Query: 119 RVLEKMKKTAEDFLGEKVTEAVITVPAYFNDSQRQATKDAGRIAGLDVKRIINEPTAAAL 178
++++ + ++ VP +R+A +++ + AG +I EP AAA+
Sbjct: 93 HFIKQVHS---NSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGAREVFLIEEPMAAAI 149

Query: 179 AYGLDKNGGDRKIAVYDLGGGTFDVSIIEIAEVDGEKQFEVLATNGDTFLGGEDFDNRVI 238
GL + V D+GGGT +V++I + V + +GG+ FD +I
Sbjct: 150 GAGLPVS-EATGSMVVDIGGGTTEVAVISLNGV---------VYSSSVRIGGDRFDEAII 199

Query: 239 EYLVDEFNKDQGIDLRKDPLALQRLKDAAERAKIELSTS----QQTEVNLPYVTADASGP 294
Y+ + G + AER K E+ ++ + E+ + P
Sbjct: 200 NYVRRNYGSLIG-------------EATAERIKHEIGSAYPGDEVREIEVRGRNLAEGVP 246

Query: 295 KHLNIKLTRAKLEALVE------DLVKKSIEPCRTALNDAGLRASDINE--VILVGGQTR 346
+ + + LEAL E V ++E C L ASDI+E ++L GG
Sbjct: 247 RGFTLN-SNEILEALQEPLTGIVSAVMVALEQCPPEL------ASDISERGMVLTGGGAL 299

Query: 347 MPKVQQAVADFFGKEPRKDVNPDEAVAVGAAI 378
+ + + + + G +P VA G
Sbjct: 300 LRNLDRLLMEETGIPVVVAEDPLTCVARGGGK 331


88  XC_RS14060  XC_RS14110  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTagDNBiasCDNBias%GCBiasProduct
XC_RS14060-1101.298309hypothetical protein
XC_RS14065-1100.049816histidine kinase
XC_RS14070-112-0.119442hypothetical protein
XC_RS14075-1110.590346diguanylate cyclase
XC_RS14080-2120.258139RND transporter
XC_RS14085-114-0.019893short-chain dehydrogenase
XC_RS14090-2120.329007multidrug efflux RND transporter permease
XC_RS140950100.705710MexE family multidrug efflux RND transporter
XC_RS14100-1100.523812aklaviketone reductase
XC_RS14105-2120.795090transcriptional regulator
XC_RS14110-2121.251624cell envelope biogenesis protein OmpA
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS14060  AEROLYSIN     25     0.026    Aerolysin signature.
		>AEROLYSIN#Aerolysin signature.

Length = 493

Score = 25.4 bits (55), Expect = 0.026
Identities = 14/36 (38%), Positives = 18/36 (50%), Gaps = 1/36 (2%)

Query: 7 RYQLSCNALPHPAQGWRADWRIQQVGASKAGD-LAR 41
RYQ +P + W +W IQQ G S + LAR
Sbjct: 379 RYQWDKRYIPGEVKWWDWNWTIQQNGLSTMQNNLAR 414


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS14085  DHBDHDRGNASE  93     6e-25    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 93.2 bits (231), Expect = 6e-25
Identities = 60/200 (30%), Positives = 88/200 (44%), Gaps = 15/200 (7%)

Query: 5 KIALVTGGTRGIGLETVRQLAQAGVHTLLAGRKRDDAVAAALKLQAEGLPVEAIQLDVND 64
KIA +TG +GIG R LA G H + L+AE EA DV D
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 65 DISIAAAVGTVEQRHGHLDILINNAGIMIEDMQRKPSEQ-SLDTWKRTFDTNLFAVVGVT 123
+I +E+ G +DIL+N AG+ ++ S + W+ TF N V +
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGV----LRPGLIHSLSDEEWEATFSVNSTGVFNAS 124

Query: 124 KAFLPLLRRSLAGRIVNVSSILGSLTLHTQQGSPIYDFKIPAYDASKSALNSWTVHLAHE 183
++ + +G IV +GS + S + AY +SK+A +T L E
Sbjct: 125 RSVSKYMMDRRSGSIV----TVGSNPAGVPRTS------MAAYASSKAAAVMFTKCLGLE 174

Query: 184 LRESAIKVNMVHPGYVKTDM 203
L E I+ N+V PG +TDM
Sbjct: 175 LAEYNIRCNIVSPGSTETDM 194


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS14090  ACRIFLAVINRP  1037   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1037 bits (2684), Expect = 0.0
Identities = 437/1041 (41%), Positives = 637/1041 (61%), Gaps = 20/1041 (1%)

Query: 4 SRFFIDRPIFAAVLSIIIFAAGLIAMPLLPISEYPEVVPPSVQVRAVYPGANPKVIAETV 63
+ FFI RPIFA VL+II+ AG +A+ LP+++YP + PP+V V A YPGA+ + + +TV
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 64 ATPLEEAINGVENMMYMKSVAGSDGVLVVTVTFKPGTDPDQAQVQVQNRVSQAQARLPED 123
+E+ +NG++N+MYM S + S G + +T+TF+ GTDPD AQVQVQN++ A LP++
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE 121

Query: 124 VRRQGVTTQKQSPTLTMVVHLTSPKGKYDSLYLSNYATLKVKDELSRLPGVGQIQIFGAG 183
V++QG++ +K S + MV S +S+Y VKD LSRL GVG +Q+FG
Sbjct: 122 VQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG-A 180

Query: 184 DYAMRIWLDPDKVAARGLTASDVVAAIREQNVQVSAGQLGAEPMPNKSEFLLSINAQGRL 243
YAMRIWLD D + LT DV+ ++ QN Q++AGQLG P + SI AQ R
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 244 TTEEEFGNIVIRSGTSGEIVRLSDVARLELGAGKYTLRSQLDSKSAVGMGVFQSPGANAI 303
EEFG + +R + G +VRL DVAR+ELG Y + ++++ K A G+G+ + GANA+
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 304 ELSDAVRAKMAELEKQFPQDMAWSAAYDPTIFVRDSIKAVVSTLLEAVLLVVLVVILFLQ 363
+ + A++AK+AEL+ FPQ M YD T FV+ SI VV TL EA++LV LV+ LFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 364 TWRASIIPLLAVPVSVVGTFAALYVLGFSINTLTLFGLVLAIGIVVDDAIVVVENVER-N 422
RA++IP +AVPV ++GTFA L G+SINTLT+FG+VLAIG++VDDAIVVVENVER
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 423 IEEGLSPLAAAHQAMREVSGPIIAIALVLCAVFVPMAFLSGVTGQFYKQFAVTIAISTVI 482
+E+ L P A ++M ++ G ++ IA+VL AVF+PMAF G TG Y+QF++TI + +
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 483 SAINSLTLSPALAAMLLKAHDAPKDGPTRLIDRLFGWIFRPFNRFFNTSSHKYQGAVSRA 542
S + +L L+PAL A LLK FGW FN F+ S + Y +V +
Sbjct: 481 SVLVALILTPALCATLLKPV---SAEHHENKGGFFGW----FNTTFDHSVNHYTNSVGKI 533

Query: 543 LGKRGMVFMVYLLLLVGTGVMFKLVPGGFIPTQDKLYLIAGAKLPEGASLERTSEVISQI 602
LG G ++Y L++ G V+F +P F+P +D+ + +LP GA+ ERT +V+ Q+
Sbjct: 534 LGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQV 593

Query: 603 SDIALQTE--GVAHAVAFPGLNPLQFTNTPNTGTLFLTLKPFSERSR---TAAQINAEIN 657
+D L+ E V G + N G F++LKP+ ER+ +A +
Sbjct: 594 TDYYLKNEKANVESVFTVNGFSFS--GQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAK 651

Query: 658 ARISQIEKGFAFAFMPPPILGLGQGSGYSLYIQDRAGLGYGQLQTAVTAMSGAISQTPG- 716
+ +I GF F P I+ LG +G+ + D+AGLG+ L A + G +Q P
Sbjct: 652 MELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPAS 711

Query: 717 MQFPIGTYQANVPQLDAKVDRDKAKAQGVPLTNVFDTLQTYLGSAYINDFNRFGRTYQVI 776
+ + Q +VD++KA+A GV L+++ T+ T LG Y+NDF GR ++
Sbjct: 712 LVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLY 771

Query: 777 AQADGQFRDSVEDIANLRTRNDRGQMVPIGSMVTLGQTYGPDPVIRYNGYPAADLIGEAD 836
QAD +FR ED+ L R+ G+MVP + T YG + RYNG P+ ++ GEA
Sbjct: 772 VQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAA 831

Query: 837 TRVLSSAQAMQTLAGMAPKVLPNGMNIEWTDLSYQQSIQGNSALIVFPMAVLLAFLVLAA 896
SS AM + +A K LP G+ +WT +SYQ+ + GN A + ++ ++ FL LAA
Sbjct: 832 PGT-SSGDAMALMENLASK-LPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAA 889

Query: 897 LYESWTLPLAVILIVPMTLLSALFGVWLTGGDNNVFVQVGLVVLMGLACKNAILIVEFAR 956
LYESW++P++V+L+VP+ ++ L L N+V+ VGL+ +GL+ KNAILIVEFA+
Sbjct: 890 LYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAK 949

Query: 957 EL-EMHGKGIVEAALEACRLRLRPIVMTSIAFIAGTVPLVFGHGAGAEVRSVTGITVFAG 1015
+L E GKG+VEA L A R+RLRPI+MTS+AFI G +PL +GAG+ ++ GI V G
Sbjct: 950 DLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGG 1009

Query: 1016 MLGVTLFGLFLTPVFYVALRK 1036
M+ TL +F PVF+V +R+
Sbjct: 1010 MVSATLLAIFFVPVFFVVIRR 1030



Score = 86.8 bits (215), Expect = 2e-19
Identities = 72/328 (21%), Positives = 124/328 (37%), Gaps = 19/328 (5%)

Query: 735 VDRDKAKAQGVPLTNVFDTLQTY---LGSAYIND-FNRFGRTYQVIAQADGQFRDSVEDI 790
+D D + +V + L+ + + + G+ A +F+ + E+
Sbjct: 188 LDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK-NPEEF 246

Query: 791 ANLRTR-NDRGQMVPIGSMVTLGQTYGPDPVI-RYNGYPAADLI----GEADTRVLSSAQ 844
+ R N G +V + + + VI R NG PAA L A+ + A
Sbjct: 247 GKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALDTAKA- 305

Query: 845 AMQTLAGMAPKVLPNGMNIEWT-DLSY--QQSIQGNSALIVFPMAVLLAFLVLAALYESW 901
LA + P P GM + + D + Q SI + A++L FLV+ ++
Sbjct: 306 IKAKLAELQP-FFPQGMKVLYPYDTTPFVQLSIHEVVKTLF--EAIMLVFLVMYLFLQNM 362

Query: 902 TLPLAVILIVPMTLLSALFGVWLTGGDNNVFVQVGLVVLMGLACKNAILIVE-FARELEM 960
L + VP+ LL + G N G+V+ +GL +AI++VE R +
Sbjct: 363 RATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMME 422

Query: 961 HGKGIVEAALEACRLRLRPIVMTSIAFIAGTVPLVFGHGAGAEVRSVTGITVFAGMLGVT 1020
EA ++ +V ++ A +P+ F G+ + IT+ + M
Sbjct: 423 DKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSV 482

Query: 1021 LFGLFLTPVFYVALRKWVTRGEPAAPAA 1048
L L LTP L K V+
Sbjct: 483 LVALILTPALCATLLKPVSAEHHENKGG 510


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS14095  RTXTOXIND     44     8e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 44.0 bits (104), Expect = 8e-07
Identities = 26/172 (15%), Positives = 61/172 (35%), Gaps = 23/172 (13%)

Query: 1 MTPNATPFRFPLRTVLTGAVLAVVLAGCGSKAAETGAPPPPSVSVAPVLLKEISQWDEFS 60
TP + R ++ V+A +L+ G VA +
Sbjct: 50 ETPVSRRPRLVAYFIMGFLVIAFILSVLG-----------QVEIVATA-----------N 87

Query: 61 GRIEPV-ESVELRPRVSGYIDKVNYAEGTEVKKGDVLFSIDDRSYRAEFARANAALVRAR 119
G++ S E++P + + ++ EG V+KGDVL + A+ + ++L++AR
Sbjct: 88 GKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQAR 147

Query: 120 TQASLARSEATRARKLSEQQAISTETWEQRRAAADQADAEVLAAQAAVDTAK 171
+ + + + + + + + ++ + T +
Sbjct: 148 LEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQ 199



Score = 34.8 bits (80), Expect = 6e-04
Identities = 18/102 (17%), Positives = 37/102 (36%), Gaps = 7/102 (6%)

Query: 104 YRAEFARANAALVRARTQASLARSEATRARKLSEQ--QAISTETWEQRRAAADQADAEVL 161
++ A L ++Q SE A++ + Q E ++ R Q +
Sbjct: 257 QENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLR----QTTDNIG 312

Query: 162 AAQAAVDTAKLNMDWTRVRAPIDGRAGRAMV-TAGNLVTAGD 202
+ + + +RAP+ + + V T G +VT +
Sbjct: 313 LLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAE 354



Score = 29.4 bits (66), Expect = 0.024
Identities = 12/70 (17%), Positives = 27/70 (38%)

Query: 102 RSYRAEFARANAALVRARTQASLARSEATRARKLSEQQAISTETWEQRRAAADQADAEVL 161
RAE A + R + + +S L +QAI+ ++ +A E+
Sbjct: 210 DKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELR 269

Query: 162 AAQAAVDTAK 171
++ ++ +
Sbjct: 270 VYKSQLEQIE 279


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS14110  OMPADOMAIN    116    5e-33    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 116 bits (291), Expect = 5e-33
Identities = 42/126 (33%), Positives = 67/126 (53%), Gaps = 11/126 (8%)

Query: 117 TLNLPDGITFDFGKSALKPQFYTALNGVASTLREYN--QTMVEVVGHTDSVGSDAVNQRL 174
L + F+F K+ LKP+ AL+ + S L + V V+G+TD +GSDA NQ L
Sbjct: 214 HFTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGL 273

Query: 175 SEERANAVAQYLTAQGVQRERMETMGAGKRYPIADNSTDAGR---------AQNRRVEIR 225
SE RA +V YL ++G+ +++ G G+ P+ N+ D + A +RRVEI
Sbjct: 274 SERRAQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIE 333

Query: 226 LIPLRQ 231
+ ++
Sbjct: 334 VKGIKD 339


89  XC_RS14265  XC_RS14325  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTagDNBiasCDNBias%GCBiasProduct
XC_RS14265-391.096658hypothetical protein
XC_RS14270-2101.534455beta-glucosidase
XC_RS142801111.416340*hypothetical protein
XC_RS142851102.419209hypothetical protein
XC_RS14290-2152.205516multidrug resistance protein B
XC_RS14295-2132.202384multidrug transporter
XC_RS14300-1121.920028multidrug RND transporter
XC_RS14305-2101.695926MarR family transcriptional regulator
XC_RS14310-3111.146708transcriptional regulator
XC_RS14315-4110.045239DNA topoisomerase IV subunit A
XC_RS14320-210-0.144915thiopurine S-methyltransferase
XC_RS14325-190.245494bacterioferritin
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS14265  MALTOSEBP     35     7e-04    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 35.1 bits (80), Expect = 7e-04
Identities = 30/121 (24%), Positives = 52/121 (42%), Gaps = 22/121 (18%)

Query: 448 DTATFAARSNEGHQGLAVSGQW---RAETRAVPVGALFVPI-----AQPKARLVMAILEP 499
D + A N+G + ++G W +T V G +P ++P ++ A +
Sbjct: 235 DYSIAEAAFNKGETAMTINGPWAWSNIDTSKVNYGVTVLPTFKGQPSKPFVGVLSAGINA 294

Query: 500 QAPDSLLQWGLFNLAFERKEYMEDYVAEDVAREMLARDPALKA----QFEQRLASDPAFA 555
+P+ L KE++E+Y+ D E + +D L A +E+ LA DP A
Sbjct: 295 ASPNKELA----------KEFLENYLLTDEGLEAVNKDKPLGAVALKSYEEELAKDPRIA 344

Query: 556 A 556
A
Sbjct: 345 A 345


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS14270  PYOCINKILLER  34     0.003    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 33.6 bits (76), Expect = 0.003
Identities = 64/344 (18%), Positives = 107/344 (31%), Gaps = 34/344 (9%)

Query: 336 FDDMVTRILRSLFAHGAFDHPTQRQPIDGKAGQRAAQ--RVAEEGSVLLRNEQATLPLSK 393
F + ++ + + A + + Q AA+ R AEE + +A +
Sbjct: 193 FTEAISSLQIRMNTLTAAKASIEAAAANKAREQAAAEAKRKAEEQARQQAAIRAANTYAM 252

Query: 394 DVRRIAVIGGYADKGVMSGGGSSRVDYTINGGNAVPGLTPTTWPGPVIIHPSSPLQALRA 453
V + G++ + I+ AV G + P + + +S + R
Sbjct: 253 PANGSVVATAAGRGLIQVAQGAASLAQAISDAIAVLGRVLASAPSVMAVGFASLTYSSRT 312

Query: 454 A--LPDVQIDYVDGKDRAAAARAAKAADVAIVFATQWAAESVDLPDMQLPDGQDALIEAV 511
A D D V A AA + + + A + + LP +
Sbjct: 313 AEQWQDQTPDSVR------YALGMDAAKLGLPPSVNLNAVAKASGTVDLP----MRLTNE 362

Query: 512 AKANPKTTVVLETNG-------PVRMPWAEHVPAVLQAWYPGIGGGEAIANLLTGAVNPS 564
A+ N T V+ T+G PVRM + + P L +P
Sbjct: 363 ARGNTTTLSVVSTDGVSVPKAVPVRMAAYNATTGLYEVTVPSTTAEAPPLILTWTPASPP 422

Query: 565 GHLPVTWPVDESQLPRPSIPGLGFKPAKPGEDTIDYAIEGANVGYKWFAARNLTPRYPFG 624
G+ + P P G P K +T I +L +P
Sbjct: 423 GNQNPSSTTPVVPKPVPVYEGATLTPVKATPETYPGVITLPE---------DLIIGFPAD 473

Query: 625 HGLS--YTQFRMGGLRVEAA--NGQLTANFEVENTGQRDGAAVP 664
G+ Y FR AA GQ + + Q +GA +P
Sbjct: 474 SGIKPIYVMFRDPRDVPGAATGKGQPVSGNWLGAASQGEGAPIP 517


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS14290  TCRTETB  117    1e-30    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 117 bits (295), Expect = 1e-30
Identities = 94/400 (23%), Positives = 166/400 (41%), Gaps = 26/400 (6%)

Query: 33 LAMASFMQVLDTTIANVSLPTIAGNLGASSQQATWVITSFAVSTAIALPLTGWLSRRFGE 92
L + SF VL+ + NVSLP IA + WV T+F ++ +I + G LS + G
Sbjct: 19 LCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGI 78

Query: 93 TKLFVWSTLAFTIASLLCGLAQSM-GMLVVARALQGFVAGPMYPITQSLLVSIY-PREKR 150
+L ++ + S++ + S +L++AR +QG +P ++V+ Y P+E R
Sbjct: 79 KRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQG-AGAAAFPALVMVVVARYIPKENR 137

Query: 151 GQALALLAMITVVAPIAGPILGGWITDNYSWEWIFLINVPLGIIASSIVGSQLRHRPEQL 210
G+A L+ I + GP +GG I W +L+ +P + + I L ++
Sbjct: 138 GKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIP---MITIITVPFLMKLLKKE 192

Query: 211 ERPR--MDYIGLILLVVGVGALQLVLDLGNDEDWFSSDKIVVLACVAAVALVVFVIWELT 268
R + D G+IL+ VG+ L F++ + V+ ++ ++FV
Sbjct: 193 VRIKGHFDIKGIILMSVGIVFFML----------FTTSYSISFLIVSVLSFLIFVKHIRK 242

Query: 269 DKDPIVDLKLFRHRNFRAGTLAMVVAYAAFFSVSLLIPQWLQRDMGYTAIWAGLATAPIG 328
DP VD L ++ F G L + + ++P ++ + G G
Sbjct: 243 VTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPG 302

Query: 329 ILPVLMT-PFVGKYALRFDLRMLATIAFIFMSFTSFFRSNFNLQVDFAHVATIQLVMGVG 387
+ V++ G R + I F+S SF ++F L + +++ V
Sbjct: 303 TMSVIIFGYIGGILVDRRGPLYVLNIGVTFLS-VSFLTASFLL--ETTSWFMTIIIVFVL 359

Query: 388 VALFFMPVL--QILLSDLDGREIAAGSGLATFLRTLGGSF 425
L F + I+ S L +E AG L F L
Sbjct: 360 GGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGT 399


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
XC_RS14295  RTXTOXIND  72     6e-16    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 72.2 bits (177), Expect = 6e-16
Identities = 49/296 (16%), Positives = 89/296 (30%), Gaps = 42/296 (14%)

Query: 82 VERGQLLVQLDPSDTAVALQQAEANLAKTVRQVRGLYRSVEGAQAELSAREVTLRSARAD 141
V R L++ S Q E NL K + A ++ E R ++
Sbjct: 184 VLRLTSLIKEQFSTWQNQKYQKELNLDKKRAER-------LTVLARINRYENLSRVEKSR 236

Query: 142 FARRKDLAASGAIS--------------NEELAHARDELAAAEAAVSGSRESFERNRALI 187
L AI+ EL + +L E+ + ++E ++ L
Sbjct: 237 LDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLF 296

Query: 188 DDTEVATQ-PDVQAAAAQLRQ----AFLNHARTGVVAPVSGYVARRSAQ-VGQRVQPGTV 241
E+ + L + + APVS V + G V
Sbjct: 297 -KNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAET 355

Query: 242 LMAVVPPEQ-MWVEANFKETQLKHMRLGQTVELHSDLYGGGVDYT--GRIESLGLGTGSA 298
LM +VP + + V A + + + +GQ + + + YT G + G
Sbjct: 356 LMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAF----PYTRYGYLV----GK--- 404

Query: 299 FSLLPAQNASGNWIKIVQRVPVRIAIEPKQLAANPLRLGLSMKADVNLRDQQGSVL 354
+ + +V V + I + L M ++ SV+
Sbjct: 405 VKNINLDAIEDQRLGLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGMRSVI 460



Score = 52.5 bits (126), Expect = 1e-09
Identities = 28/183 (15%), Positives = 64/183 (34%), Gaps = 14/183 (7%)

Query: 16 PSKRRTLLRIVAILVILAVIGLVVWYFLIGRWAEDTDDAYVQG------NQVQITPMVGG 69
P RR R+VA ++ ++ + + A G +I P+
Sbjct: 52 PVSRR--PRLVAYFIMGFLVIAFILSV----LGQVEIVATANGKLTHSGRSKEIKPIENS 105

Query: 70 TVVSIGAEDGMRVERGQLLVQLDPSDTAVALQQAEANLAKTVRQVRGLYRSVEGAQAELS 129
V I ++G V +G +L++L + +++L R + Y+ + +
Sbjct: 106 IVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLL-QARLEQTRYQILSRSIELNK 164

Query: 130 AREVTLRS-ARADFARRKDLAASGAISNEELAHARDELAAAEAAVSGSRESFERNRALID 188
E+ L +++ ++ E+ + +++ E + R A I+
Sbjct: 165 LPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARIN 224

Query: 189 DTE 191
E
Sbjct: 225 RYE 227


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS14300  ACRIFLAVINRP  31     0.015    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 31.0 bits (70), Expect = 0.015
Identities = 14/40 (35%), Positives = 19/40 (47%), Gaps = 7/40 (17%)

Query: 15 AVRRLRPILVTTLA-------LALAACASSRGLAPQGTVL 47
RLRPIL+T+LA LA++ A S G +
Sbjct: 967 VRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGV 1006


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
XC_RS14325  HELNAPAPROT  43     4e-08    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 43.3 bits (102), Expect = 4e-08
Identities = 22/108 (20%), Positives = 43/108 (39%), Gaps = 15/108 (13%)

Query: 46 HEMQEE-----TEHADALLRRILFLEGDPDMRPAEF---------SPGKTVVEMLERDLV 91
HE EE E D + R+L + G P E+ + EM++ +
Sbjct: 47 HEKFEELYDHAAETVDTIAERLLAIGGQPVATVKEYTEHASITDGGNETSASEMVQALVN 106

Query: 92 VEYEVRAALAAGMKLCEDHGDYVSRDMLLKQLQDTEEDHAWWLEQQLG 139
++ + + L E++ D + D+ + +++ E+ W L LG
Sbjct: 107 DYKQISSESKFVIGLAEENQDNATADLFVGLIEEVEK-QVWMLSSYLG 153


90  XC_RS14935  XC_RS14975  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
XC_RS14935  0       11       -0.446255  transcriptional regulator
XC_RS14940  2       11       -1.962887  DNA repair protein RecO
XC_RS14945  2       11       -1.579378  GTPase Era
XC_RS14950  2       11       -0.478908  Ribonuclease 3
XC_RS14955  1       11       -0.826758  hypothetical protein
XC_RS14960  1       9        0.408844   signal peptidase I
XC_RS14965  0       8        0.779154   elongation factor 4
XC_RS14970  -1      8        2.105631   peptidase S1
XC_RS14975  0       8        1.982320   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
XC_RS14935  HTHFIS  67     4e-15    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 67.2 bits (164), Expect = 4e-15
Identities = 30/118 (25%), Positives = 45/118 (38%)

Query: 11 PRVLLVEDDPISRGFLQAVLESLPAQVDWADSLSSALDRARARRHDLWLIDVNLPDGTGS 70
+L+ +DD R L L V + ++ A DL + DV +PD
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 71 DLLRALRLLHPDVPALAHTADGNAAMQRGLQSDGFLEMLVKPLTSERLLQAVRRGLAR 128
DLL ++ PD+P L +A G + L KP L+ + R LA
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
XC_RS14945  TCRTETOQM  32     0.004    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 31.7 bits (72), Expect = 0.004
Identities = 20/70 (28%), Positives = 34/70 (48%), Gaps = 10/70 (14%)

Query: 61 LVDTPGLHREQKRAMNRVMNRAARGSLEGVDAAVLVIEAGRWDEEDT-LAFRVLSDADVP 119
++DTPG H + + R SL +D A+L+I A + T + F L +P
Sbjct: 72 IIDTPG-HMDFLAEVYR--------SLSVLDGAILLISAKDGVQAQTRILFHALRKMGIP 122

Query: 120 VVLVVNKVDR 129
+ +NK+D+
Sbjct: 123 TIFFINKIDQ 132


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS14955  PF05272  27     0.019    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 27.3 bits (60), Expect = 0.019
Identities = 19/89 (21%), Positives = 33/89 (37%), Gaps = 4/89 (4%)

Query: 33 MYQEYYAVRTSMKGLANEAGSADMDPSKLQDMFFKRLD--INSSESVKPGDVKFERIEGG 90
Q+ Y+V T+ +A+ + DP K M ++ +N + + +R G
Sbjct: 786 AAQKGYSVNTTFVTIADLVQALGADPGKSSPMLEGQVRDWLNENGWEYLRETSGQRRRGY 845

Query: 91 WRMKVNYEVRRELV--GNLDVVGKFDTEQ 117
R +V V E G D +Q
Sbjct: 846 MRPQVWPPVIAEDKEADQAHAPGDQDQQQ 874


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
XC_RS14965  TCRTETOQM  144    7e-39    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 144 bits (366), Expect = 7e-39
Identities = 95/455 (20%), Positives = 179/455 (39%), Gaps = 85/455 (18%)

Query: 3 NIRNFSIIAHVDHGKSTLADRIIQLCGG---LQAREMEAQVLDSNPIERERGITIKAQSV 59
I N ++AHVD GK+TL + ++ G L + + D+ +ER+RGITI+
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 60 SLPYTAKDGQTYFLNFIDTPGHVDFSYEVSRSLAACEGALLVVDAAQGVEAQSVANCYTA 119
S + +N IDTPGH+DF EV RSL+ +GA+L++ A GV+AQ+ +
Sbjct: 62 SFQWENTK-----VNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHAL 116

Query: 120 VEQGLEVVPVLNK-----IDLP----------TADIERAKA----------------EIE 148
+ G+ + +NK IDL +A+I + + +
Sbjct: 117 RKMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWD 176

Query: 149 AVIG--------------IDAEDAVAV----------------SAKTGLNIDLVLEAIVH 178
VI ++A + SAK + ID ++E I +
Sbjct: 177 TVIEGNDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITN 236

Query: 179 RIPPPKPRDTDKLQALIIDSWFDNYLGVVSLVRVMQGEIKPGSKIQVMSTGRTHLVDKVG 238
+ R +L + + ++ +R+ G + +++ + + +
Sbjct: 237 KFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKIKITEMYT 296

Query: 239 VFTPKRKELVALGAGEVGWINASIKDVHGAPVGDTLTLAADPAPHALPGFQEMQPRVFAG 298
+ ++ +GE+ + + + +GDT L + P +
Sbjct: 297 SINGELCKIDKAYSGEIVILQNEFLKL-NSVLGDTKLLPQRERI------ENPLPLLQTT 349

Query: 299 LFPVDAEDYPDLREALDKLRLNDAALRFE--PESSEAMGFGFRCGFLGMLHMEIVQERLE 356
+ P + L +AL ++ +D LR+ + E + FLG + ME+ L+
Sbjct: 350 VEPSKPQQREMLLDALLEISDSDPLLRYYVDSATHEII-----LSFLGKVQMEVTCALLQ 404

Query: 357 REYNLNLISTAPTVVY--EVLKTDGTIIPMDNPSK 389
+Y++ + PTV+Y LK I ++ P
Sbjct: 405 EKYHVEIEIKEPTVIYMERPLKKAEYTIHIEVPPN 439


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
XC_RS14970  V8PROTEASE  72     8e-16    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 72.0 bits (176), Expect = 8e-16
Identities = 33/163 (20%), Positives = 58/163 (35%), Gaps = 28/163 (17%)

Query: 133 AGKSMGSGFIISADGYVLTNHHVVDGASEVTVKLTDRR-----------EFKA-KVVGSD 180
G + SG ++ +LTN HVVD L F A ++
Sbjct: 99 TGTFIASGVVV-GKDTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYS 157

Query: 181 EQFDVALLKIEA--------KGLPTVRIGDSNTLKPGQWVVAIGSPFGLDHSVTAGIVSA 232
+ D+A++K + + + ++ + Q + G P V+
Sbjct: 158 GEGDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGDKP-------VAT 210

Query: 233 TGRSNPYADQRYVPFIQTDVAINQGNSGGPLLNTRGEVVGINS 275
S +Q D++ GNSG P+ N + EV+GI+
Sbjct: 211 MWESKGKITYLKGEAMQYDLSTTGGNSGSPVFNEKNEVIGIHW 253


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
XC_RS14975  IGASERPTASE  30     0.016    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 29.6 bits (66), Expect = 0.016
Identities = 29/208 (13%), Positives = 49/208 (23%), Gaps = 15/208 (7%)

Query: 74 TAKVRAAVAEEPAPAAAPAARPAARWRWGGGAALAASVAAIALFVSRERLPEVAAPAPAE 133
T + A E A+ A VA + E A E
Sbjct: 1050 TVEKNEQDATETTAQNREVAKEAKSNV--KANTQTNEVAQSGSETKETQTTETKETATVE 1107

Query: 134 -----QVFATTAQLPASTPAPQAPKAPDAPDDVVALAAAVPAAALASARRGAATRNQQVA 188
+V Q + +PK + A + + + N
Sbjct: 1108 KEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTAD 1167

Query: 189 RSVAARNQQAPARMVASAAAPAATPAVASPAAANPFTHPEATLQARPWPRAALAGSGESP 248
+ PA+ +S T + + +PE T A P S +
Sbjct: 1168 -------TEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPK 1220

Query: 249 LNASFS-QSRQAPAFYPFEPAPQAAAAA 275
S +S + + A
Sbjct: 1221 NRHRRSVRSVPHNVEPATTSSNDRSTVA 1248


91  XC_RS15120  XC_RS15190  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
XC_RS15120  3       13       1.163890   hydrogenase
XC_RS15125  5       15       1.356483   type III secretion system protein
XC_RS15130  6       15       1.516308   HPr kinase
XC_RS15135  4       14       0.959573   ATP synthase
XC_RS15140  -1      15       -0.171700  type III secretion system protein
XC_RS15145  1       13       -1.677181  HPr kinase
XC_RS15150  0       13       -2.093239  hypothetical protein
XC_RS15155  1       12       -1.447111  HPr kinase
XC_RS15160  1       13       -1.648651  HPr kinase
XC_RS15165  2       13       -1.820705  type III secretion system protein
XC_RS15170  5       17       -0.897227  hypersensitivity response secretion protein
XC_RS15175  3       19       -0.332166  type III secretion protein
XC_RS15180  4       19       -0.741061  aldolase
XC_RS15185  2       21       -2.776707  flagellar biosynthesis protein FliP
XC_RS15190  1       25       -3.119165  aldolase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS15120  TYPE3OMGPROT  324    e-105    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 324 bits (831), Expect = e-105
Identities = 103/309 (33%), Positives = 159/309 (51%), Gaps = 16/309 (5%)

Query: 300 ASVWPEMSQARRDAPLAVDAGS---GGELASDAPVIEADPRTNGILIRDRPERMAAYGTL 356
A++ + + VD AS +EADP N I++RD PERM Y L
Sbjct: 212 ATILQRVLSDATIQQVTVDNQRIPQAATRASAQARVEADPSLNAIIVRDSPERMPMYQRL 271

Query: 357 IQQLDNRPKLLQIDATIIEIRDGALQDLGVDWRFHSRRVDVQTGDGRGGQLGYDGSLSGA 416
I LD +++ +I++I L +LGVDWR V ++TG+ + G S
Sbjct: 272 IHALDKPSARIEVALSIVDINADQLTELGVDWR-----VGIRTGNNHQVVIKTTGDQSNI 326

Query: 417 AAAGAAAPLGGTLTAVLGDAGRYLMTRVSALEQTNKAKIVSTPQVATLDNVEAVMDHKQQ 476
A+ GA G+L G YL+ RV+ LE A++VS P + T +N +AV+DH +
Sbjct: 327 ASNGAL----GSLVDARGL--DYLLARVNLLENEGSAQVVSRPTLLTQENAQAVIDHSET 380

Query: 477 AFVRVSGYASADLYNLSAGVSLRVLPSVVPGSPNGQMRLDVRIEDGQLGANT--VDGIPV 534
+V+V+G A+L ++ G LR+ P V+ ++ L++ IEDG N+ ++GIP
Sbjct: 381 YYVKVTGKEVAELKGITYGTMLRMTPRVLTQGDKSEISLNLHIEDGNQKPNSSGIEGIPT 440

Query: 535 ITSSEITTQAFVNEGQSLLIAGYASDTDQTDLNNVPGLSRIPLVGNLFKHRQQSGSRLQR 594
I+ + + T A V GQSL+I G D L+ VP L IP +G LF+ + + R R
Sbjct: 441 ISRTVVDTVARVGHGQSLIIGGIYRDELSVALSKVPLLGDIPYIGALFRRKSELTRRTVR 500

Query: 595 LFLLTPHIV 603
LF++ P I+
Sbjct: 501 LFIIEPRII 509



Score = 236 bits (604), Expect = 5e-72
Identities = 68/230 (29%), Positives = 113/230 (49%), Gaps = 5/230 (2%)

Query: 16 LAAALLLGLLPLLPPHANAASVPWHSRSFKYVADRKDLKEVLRDLSASQSITTWISPEVT 75
+L G L LL ++ A + W + YVA + L+++L D A+ T +S ++
Sbjct: 8 FFKRVLTGTLLLLSSYSWAQELDWLPIPYVYVAKGESLRDLLTDFGANYDATVVVSDKIN 67

Query: 76 GTLSGKFEA-TPQKFLDDLSGTFGFVWYYDGSVLRIWGANETKNATLSLGAASTSALRDA 134
+SG+FE PQ FL ++ + VWYYDG+VL I+ +E + + L + + L+ A
Sbjct: 68 DKVSGQFEHDNPQDFLQHIASLYNLVWYYDGNVLYIFKNSEVASRLIRLQESEAAELKQA 127

Query: 135 LARMRLDDPRFPVRYDETAHLAVVSGPPGYVDTVAAIAKQVEQVARQR----DATEVQVF 190
L R + +PRF R D + L VSGPP Y++ V A +EQ + R A +++F
Sbjct: 128 LQRSGIWEPRFGWRPDASNRLVYVSGPPRYLELVEQTAAALEQQTQIRSEKTGALAIEIF 187

Query: 191 QLHYAQAADHTTRIGGQDIQVPGMASLLRNIYGVRGAPTAALPGPGANFG 240
L YA A+D T ++ PG+A++L+ + +
Sbjct: 188 PLKYASASDRTIHYRDDEVAAPGVATILQRVLSDATIQQVTVDNQRIPQA 237


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS15125  TYPE3IMRPROT  169    6e-54    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 169 bits (431), Expect = 6e-54
Identities = 51/240 (21%), Positives = 104/240 (43%), Gaps = 3/240 (1%)

Query: 8 LLALSSQGVSLLTLLALCGVRVFVLFFVLPATAQDSLPGMTRNGVIYVLSSFIAYGQPAD 67
L S Q +S L L +RV L P ++ S+P + G+ +++ IA PA+
Sbjct: 2 LQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPAN 61

Query: 68 ALARIEAAGLVGLVFKEAFIGLLIGFAASTVFWVAESVGLLIDDVSGYNNVQMINPLSGE 127
+ L L ++ IG+ +GF F + G +I G + ++P S
Sbjct: 62 DVPVFSFFAL-WLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120

Query: 128 QSTPVSTVLMQLAIVSFYALGGMLMLLGALFESFRWWPLSQLMPDMGAIGESFVIQQTDG 187
++ ++ LA++ F G L L+ L ++F P+ + + + +
Sbjct: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGG--EPLNSNAFLALTKAGSL 178

Query: 188 MMAAIVKLSAPVMLVLVLVDLAIGFVARAADKLDPSNLSQPIRGVLALLLLALLTSVFIA 247
+ + L+ P++ +L+ ++LA+G + R A +L + P+ + + L+A L +
Sbjct: 179 IFLNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAP 238


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS15140  FLGFLIH  30     0.008    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 29.8 bits (66), Expect = 0.008
Identities = 31/109 (28%), Positives = 43/109 (39%), Gaps = 9/109 (8%)

Query: 114 RLAEIVAHACEQVLHGHDPA----ALYARAAQALDGALDEANALQVSVHPDALDDARRAF 169
RL ++ A QV+ G P AL + Q L + Q+ VHPD L
Sbjct: 119 RLMQMALEAARQVI-GQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDML 177

Query: 170 DAAAAAGGWSMPVELCGDTTLALGACVCEWDTGVFETDLRDQLRSLRRV 218
A + GW L GD TL G C D G + + + + L R+
Sbjct: 178 GATLSLHGW----RLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRL 222


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS15150  FLGMRINGFLIF  81     1e-19    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 81.2 bits (200), Expect = 1e-19
Identities = 37/165 (22%), Positives = 71/165 (43%), Gaps = 8/165 (4%)

Query: 23 LYSGLTENDANDMLAVLLTAGVDAEKLTPDDGKTWAVNAPHDQVAYALNVLRTHGMPHER 82
L+S L++ D ++A L + G A+ P D+V L G+P +
Sbjct: 53 LFSNLSDQDGGAIVAQLTQMNIPYR-FANGSG---AIEVPADKVHELRLRLAQQGLP--K 106

Query: 83 HANLG-EMFKKDGLISTPTEERVRFIYGVSQQLSQTLSNIDGVIAADVQIVLPNNDPLSA 141
+G E+ ++ + E+V + + +L++T+ + V +A V + +P
Sbjct: 107 GGAVGFELLDQEKFGISQFSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVR 166

Query: 142 SVKPSSAAVFIKFRVGSDLT-SLVPSIKTLVMHSVEGLTYENVSV 185
K SA+V + G L + ++ LV +V GL NV++
Sbjct: 167 EQKSPSASVTVTLEPGRALDEGQISAVVHLVSSAVAGLPPGNVTL 211


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS15165  TYPE3IMSPROT  332    e-115    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 332 bits (854), Expect = e-115
Identities = 110/349 (31%), Positives = 190/349 (54%), Gaps = 2/349 (0%)

Query: 1 MSDEKTEKPTEKKLQDARRDGEVPISPDVTAAAVLLAALLVMKLAGSYFVEHLRALMSIG 60
MS EKTE+PT KK++DAR+ G+V S +V + A+++A ++ Y+ EH LM I
Sbjct: 1 MSGEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLIP 60

Query: 61 FDFTTNTRDATALHRALGRIGIQGVLLTLPFVTACLAAGLIGTFVQTGLNASLKPVTPKF 120
+ + + AL + + ++ L P +T + VQ G S + + P
Sbjct: 61 AE-QSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDI 119

Query: 121 DSLNPVNGVKKLFSLRSLINLLKLGIKAAVIGVVLWYGLRALMPTIIGLAYQPPADIAQI 180
+NP+ G K++FS++SL+ LK +K ++ +++W ++ + T++ L I +
Sbjct: 120 KKINPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPL 179

Query: 181 GWRALGILCALAVLVFVLVGAADWSVQHWLFIRDKRMSKDEQKREHKESEGDPEVKGKRK 240
+ L L + + FV++ AD++ +++ +I++ +MSKDE KRE+KE EG PE+K KR+
Sbjct: 180 LGQILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRR 239

Query: 241 EFAKELVFGDPRERVAKAKVMVVNPTHYAVALAYEPDGFGLPQVVAKGVDEGALELRAYA 300
+F +E+ + RE V ++ V+V NPTH A+ + Y+ LP V K D +R A
Sbjct: 240 QFHQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIA 299

Query: 301 HNQGIPIVANPPLARAL-HEVELGEAVPESLFETVAVVLRWVDELGRDN 348
+G+PI+ PLARAL + + +P E A VLRW++ +
Sbjct: 300 EEEGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQNIEK 348


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS15180  TYPE3OMOPROT  53     3e-10    Type III secretion system outer membrane O protein ...
		>TYPE3OMOPROT#Type III secretion system outer membrane O protein family signature.

Length = 303

Score = 53.5 bits (128), Expect = 3e-10
Identities = 41/177 (23%), Positives = 77/177 (43%), Gaps = 15/177 (8%)

Query: 144 PSPLPAWLSALRVTTRLRIGQRTATAALLQSLRPGDVLLHALATAPVRSGELLWGIPGGA 203
P+ LR R IG +LL + GDVLL + A V G
Sbjct: 138 PAVGGGRPKMLRWPLRFVIGSSDTQRSLLGRIGIGDVLLIRTSRAEVYCYAKKLG----- 192

Query: 204 VLRAPVRLTLQQMILETAPTMQHDMPASDSSSSATDVAAL-ELPVQLEVD--QLALSLSV 260
++ I+ +QH ++++ +A + L +LPV+LE + ++L+
Sbjct: 193 -----HFNRVEGGIIVETLDIQHIEEENNTTETAETLPGLNQLPVKLEFVLYRKNVTLAE 247

Query: 261 LSGLQPGQILELSVPVDQADIRLVVYGQTIGIGRLLAVGEHLGVQILS-MSETAHAD 316
L + Q+L L + ++ ++ G +G G L+ + + LGV+I +SE+ + +
Sbjct: 248 LEAMGQQQLLSLPTNAEL-NVEIMANGVLLGNGELVQMNDTLGVEIHEWLSESGNGE 303


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS15185  TYPE3IMPPROT  245    7e-85    Type III secretion system inner membrane P protein ...
		>TYPE3IMPPROT#Type III secretion system inner membrane P protein family signature.

Length = 224

Score = 245 bits (626), Expect = 7e-85
Identities = 80/219 (36%), Positives = 132/219 (60%), Gaps = 8/219 (3%)

Query: 3 MPDVGSLLLVVIMLGLLPFAAMVVTSYTKIVVVLGLLRNAIGVQQVPPNMVLNGVALLVS 62
M + SL+ ++ LLPF T + K +V ++RNA+G+QQ+P NM LNGVALL+S
Sbjct: 1 MGNDISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLS 60

Query: 63 CFVMAPVGMEAFKAAQNYSPGADN-SRVVVLLDACREPFRQFLLKHTREREKAFFIRSAQ 121
FVM P+ +A+ ++ ++ S + +D + +R +L+K++ FF +
Sbjct: 61 MFVMWPIMHDAYVYFEDEDVTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFFENAQL 120

Query: 122 QIWPKDKA-------DTLKPDDLLVLAPAFTLSELTEAFRIGFLLYLVFIVIDLVVANAL 174
+ ++ D ++ + L PA+ LSE+ AF+IGF LYL F+V+DLVV++ L
Sbjct: 121 KRQYGEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLVVSSVL 180

Query: 175 MAMGLSQVTPTNVAIPFKLLLFVALDGWSMLIHGLVLSY 213
+A+G+ ++P ++ P KL+LFVALDGW++L GL+L Y
Sbjct: 181 LALGMMMMSPVTISTPIKLVLFVALDGWTLLSKGLILQY 219


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS15190  TYPE3IMQPROT  62     9e-17    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 62.5 bits (152), Expect = 9e-17
Identities = 24/78 (30%), Positives = 43/78 (55%)

Query: 4 DDLVRFTSEALLLCLKVSLPVVGVAAVAGLLIAFIQAVMSLQDASISFALKLVVVVAAIA 63
DDLV ++AL L L +S VA + GLL+ Q V LQ+ ++ F +KL+ V +
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 VTAPWGASAIMQFGQALM 81
+ + W ++ +G+ ++
Sbjct: 62 LLSGWYGEVLLSYGRQVI 79


92  XC_RS15385  XC_RS15410  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
XC_RS15385  -1      14       0.938548   chemotaxis protein CheY
XC_RS15390  -1      13       0.869392   transcriptional regulator
XC_RS15395  -1      16       0.746400   chemotaxis protein CheY
XC_RS15400  -1      15       -0.058743  chemotaxis protein CheR
XC_RS15405  -2      14       0.291508   chemotaxis protein CheB
XC_RS15410  0       15       0.598939   histidine kinase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
XC_RS15385  HTHFIS  69     3e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 69.5 bits (170), Expect = 3e-17
Identities = 27/114 (23%), Positives = 54/114 (47%), Gaps = 5/114 (4%)

Query: 6 RLLMVEDQQELRDLIGEALRDAGITVDTADDGHCALRMLRENGPYDVVFSDIRMPNGMSG 65
+L+ +D +R ++ +AL AG V + R + G D+V +D+ MP +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAA-GDGDLVVTDVVMP-DENA 62

Query: 66 IELSEQVSQLLPQARIILASGFAKAQLPPLPAQ---VDFLPKPYRLRQLIGLLK 116
+L ++ + P +++ S ++ D+LPKP+ L +LIG++
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS15390  PF06580  33     0.002    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 33.3 bits (76), Expect = 0.002
Identities = 45/256 (17%), Positives = 84/256 (32%), Gaps = 63/256 (24%)

Query: 132 ERHEAQQLLQHAQQALLQSQKVEALGRLTLGLAHDFNNLLTVMVTSLDLIALRAGEDART 191
++ + + Q AQ L++Q + H N +L+ I ED
Sbjct: 150 DQWKMASMAQEAQLMALKAQ-INP---------HFMFN-------ALNNIRALILEDP-- 190

Query: 192 RMLVEAAQTAVDRGTLLTRQLLAFARGQR--LVPERHAPAAVVERSLELLRRACPAGISL 249
A+ + + L R L ++ ++ L E VV+ L+L +
Sbjct: 191 ----TKAREMLTSLSELMRYSLRYSNARQVSLADE----LTVVDSYLQLASIQFEDRLQF 242

Query: 250 RVDFADALPDVRVDPGQLEAALLNLVFNSCDAMPAGGSIVLSASTQQRAPLDDPHGSARA 309
A+ DV+V P ++ + N + + +P GG I+L +
Sbjct: 243 ENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDN------------G 290

Query: 310 YVGIAVSDTGPGMSAQVAQRASEPFFTTKEVGKGSGLGLSQVHG-----FAAQSGGFVDL 364
V + V +TG K + +G GL V + ++ + L
Sbjct: 291 TVTLEVENTGSLA--------------LKNTKESTGTGLQNVRERLQMLYGTEAQ--IKL 334

Query: 365 QTAPGRGTTVTLLLPA 380
G +L+P
Sbjct: 335 SEKQG-KVNAMVLIPG 349


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
XC_RS15395  HTHFIS  86     2e-19    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 85.7 bits (212), Expect = 2e-19
Identities = 32/118 (27%), Positives = 60/118 (50%), Gaps = 2/118 (1%)

Query: 923 LDGATVLLAEDDVRNIFALSSVLEPLGVTLQIARNGREALEHLAKHEVDLVLMDIMMPEM 982
+ GAT+L+A+DD L+ L G ++I N +A + DLV+ D++MP+
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 983 DGITAMRQIRANRQWQDLPIIALTAKAMADDREHCLQAGANDYIAKPIDVDKLVSLCR 1040
+ + +I+ DLP++ ++A+ + GA DY+ KP D+ +L+ +
Sbjct: 61 NAFDLLPRIKK--ARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116



Score = 67.5 bits (165), Expect = 1e-13
Identities = 37/151 (24%), Positives = 58/151 (38%), Gaps = 15/151 (9%)

Query: 660 ILAVEDETRFAQALVDLAHELDFDCVVAPNAEEALRLAAELRPSGILLDIGLPDASGLSV 719
IL +D+ L +D + NA R A ++ D+ +PD + +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 720 LERLK-RNPATRHIPVHVVSA---LERSQIALELGAVGYLIKPATRELLAGAIRQLEETN 775
L R+K P +PV V+SA + A E GA YL KP L G I +
Sbjct: 66 LPRIKKARP---DLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEP 122

Query: 776 ARALRRLL--------IVEDDSALRANLELL 798
R +L +V +A++ +L
Sbjct: 123 KRRPSKLEDDSQDGMPLVGRSAAMQEIYRVL 153



Score = 65.2 bits (159), Expect = 6e-13
Identities = 28/143 (19%), Positives = 59/143 (41%), Gaps = 7/143 (4%)

Query: 781 RLLIVEDDSALRANLELLLARDQLEIVAVGTIAEAMDQLGSATFDCMVTDLALPDGSGYD 840
+L+ +DD+A+R L L+R ++ A + + D +VTD+ +PD + +D
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 841 LLERMAGNDAVAFPPVIVYTGRALTRDEEQRLRRYSKSIIIKGVRSPERLLDEVTLFLHS 900
LL R+ A PV+V + + + + + + K L E+ +
Sbjct: 65 LLPRI--KKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFD-----LTELIGIIGR 117

Query: 901 VEASLPSDQQRLLREARRRDAVL 923
A +L +++ ++
Sbjct: 118 ALAEPKRRPSKLEDDSQDGMPLV 140


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
XC_RS15410  HTHFIS  79     3e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.1 bits (195), Expect = 3e-18
Identities = 33/151 (21%), Positives = 59/151 (39%), Gaps = 12/151 (7%)

Query: 16 AKILIVDDVPQNLVAMEALLQREGIQVLCAASGAQALELLLEHDVALALLDVHMPEMDGF 75
A IL+ DD + L R G V ++ A + D L + DV MP+ + F
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 76 SLAELMRGSQRSRHVPIIFLTASPNDPMRAFQGYETGAVDFLHKPIEPHVILSKVNVFIE 135
L ++ + +P++ ++A N M A + E GA D+L KP + ++
Sbjct: 64 DLLPRIK--KARPDLPVLVMSAQ-NTFMTAIKASEKGAYDYLPKPFDLTELIG------- 113

Query: 136 LYQQRRLLKARNASLERALTLNETMMAVLTH 166
R L + ++ M ++
Sbjct: 114 --IIGRALAEPKRRPSKLEDDSQDGMPLVGR 142


93  XC_RS18025  XC_RS18090  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
XC_RS18025  0       18       -0.215776  type II secretion system protein D
XC_RS18030  3       17       3.439358   general secretion pathway protein N
XC_RS18035  3       17       3.255502   general secretion pathway protein M
XC_RS18040  1       16       2.812113   type II secretion system protein L
XC_RS18045  3       18       1.990590   general secretion pathway protein K
XC_RS18050  2       13       1.522865   general secretion pathway protein J
XC_RS18055  1       12       0.442637   type II secretion system protein I
XC_RS18060  -2      15       0.913724   type II secretion system protein H
XC_RS18065  0       16       1.919308   type II secretion system protein G
XC_RS18070  0       16       2.029493   hypothetical protein
XC_RS18075  0       16       2.068780   type II secretion system protein F
XC_RS18080  -1      13       2.326766   general secretion pathway protein E
XC_RS18085  0       14       2.772034   protease
XC_RS18090  -1      13       3.019537   membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS18025  BCTERIALGSPD  408    e-135    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 408 bits (1050), Expect = e-135
Identities = 175/703 (24%), Positives = 298/703 (42%), Gaps = 115/703 (16%)

Query: 65 VIRRGSGTMINQSAAAAPSPTLGMASSGSATFNFEGESVQAVVKAILGDMLGQNYVIAPG 124
VIR S T++ +A A++ + +F+G +Q + + + L + +I P
Sbjct: 6 VIRSFSLTLLIFAALLFRP-----AAAEEFSASFKGTDIQEFINTVSKN-LNKTVIIDPS 59

Query: 125 VQGTVTLATPNPVSPAQALNLLEMVLG-WNNARMVFSGGRYNIVPA-DQALAGTVAPSTA 182
V+GT+T+ + + ++ Q VL + A + + G +V + D A S A
Sbjct: 60 VRGTITVRSYDMLNEEQYYQFFLSVLDVYGFAVINMNNGVLKVVRSKDAKTAAVPVASDA 119

Query: 183 SPSAARGFEVRVVPLKYISASEMKKVLEPYARPNAIVGTDA---SRNVITLGGTRAELEN 239
+P RVVPL ++A ++ +L NA VG+ NV+ + G A ++
Sbjct: 120 APGIGDEVVTRVVPLTNVAARDLAPLLRQL-NDNAGVGSVVHYEPSNVLLMTGRAAVIKR 178

Query: 240 YLRTVQIFDVDWLSGMSVGVFPIQSGKAEKISADLEKVFGEQSKT--PSAGMFRFMPLEN 297
L V+ VD SV P+ A + + ++ + SK+ P + + + E
Sbjct: 179 LLTIVE--RVDNAGDRSVVTVPLSWASAADVVKLVTELNKDTSKSALPGSMVANVVADER 236

Query: 298 ANAVLVI---TPQPRYLDQIQQWLDRIDSAGGGVRLFSYELKYIKAKDLADRLSEVFGGR 354
NAVLV + R + I+Q LDR + G ++ LKY KA DL + L+ +
Sbjct: 237 TNAVLVSGEPNSRQRIIAMIKQ-LDRQQATQGNTKVIY--LKYAKASDLVEVLTGI---- 289

Query: 355 GNGGNSGPSLVPGGVVNMLGNNSGGADRDESLGSSSGATGGDIGGTSNGSSQSGTSGSFG 414
+S S+ +
Sbjct: 290 ---------------------------------------------SSTMQSEKQAAKPVA 304

Query: 415 GSSGSGMLQLPPSTNQNGSVTLEVEGDKVGVSAVAETNTLLVRTSAQAWKSIRDVIEKLD 474
+ +++ TN L V ++ + VI +LD
Sbjct: 305 ALDKNIIIKAHGQTN-----ALIVTAAPDVMNDLER------------------VIAQLD 341

Query: 475 VMPMQVHIEAQIAEVTLTGRLQYGVNWYFENAVTTPSNADGSGGPNLPSAAGRGIW---G 531
+ QV +EA IAEV L G+ W +NA T SG P + AG + G
Sbjct: 342 IRRPQVLVEAIIAEVQDADGLNLGIQWANKNAGMT--QFTNSGLPISTAIAGANQYNKDG 399

Query: 532 DVSGSVTS-----NGVAWTFLGKNAAAIISALDQVTNLRLLQTPSVFVRNNAEATLNVGS 586
VS S+ S NG+A F N A +++AL T +L TPS+ +N EAT NVG
Sbjct: 400 TVSSSLASALSSFNGIAAGFYQGNWAMLLTALSSSTKNDILATPSIVTLDNMEATFNVGQ 459

Query: 587 RIPINSTSINTGLGSDSSFSSVQYIDTGVILKVRPRVTKDGMVFLDIVQEVSTPGARPAA 646
+P+ + S T D+ F++V+ G+ LKV+P++ + V L+I QEVS+
Sbjct: 460 EVPVLTGSQTT--SGDNIFNTVERKTVGIKLKVKPQINEGDSVLLEIEQEVSSV------ 511

Query: 647 CTAAATTTVNSAACNVDINTRRVKTEAAVQNGDTIMLAGLIDDSTTDGSNGIPFLSKLPV 706
A + S+ NTR V V +G+T+++ GL+D S +D ++ +P L +PV
Sbjct: 512 ---ADAASSTSSDLGATFNTRTVNNAVLVGSGETVVVGGLLDKSVSDTADKVPLLGDIPV 568

Query: 707 VGALFGRKTQNSDRREVIVLITPSIVRNPQDARDLTDEYGSKF 749
+GALF ++ +R +++ I P+++R+ + R + + F
Sbjct: 569 IGALFRSTSKKVSKRNLMLFIRPTVIRDRDEYRQASSGQYTAF 611


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
XC_RS18030  TONBPROTEIN  34     5e-04    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 33.8 bits (77), Expect = 5e-04
Identities = 20/84 (23%), Positives = 27/84 (32%), Gaps = 7/84 (8%)

Query: 145 IEGPGGTQTLELQVFNGQGGQPPTAIGGRPQAPGAVPPLPPNVPPAPATPAPPPAEVPQQ 204
IE P Q + + + +PP QA P P P PP E P
Sbjct: 36 IELPAPAQPISVTMVTPADLEPP-------QAVQPPPEPVVEPEPEPEPIPEPPKEAPVV 88

Query: 205 QPGGQAPPTVPPQRSDGAQEAPRP 228
+ P P+ QE P+
Sbjct: 89 IEKPKPKPKPKPKPVKKVQEQPKR 112


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS18055  PilS_PF08805  34     9e-05    PilS N terminal
		>PilS_PF08805#PilS N terminal

Length = 185

Score = 34.1 bits (78), Expect = 9e-05
Identities = 7/55 (12%), Positives = 22/55 (40%), Gaps = 4/55 (7%)

Query: 1 MKHQRGYSLIEVIVAFALLALALTLLLGSLSGAARQVRGADDSTRATLHAQSLLA 55
+ +G +L+EV++ ++ + S + S+ + +++A
Sbjct: 22 KEQDKGATLMEVLLVVGVIVVLAASAYKLYSMV----QSNIQSSNEQNNVLTVIA 72


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS18060  BCTERIALGSPH  31     0.002    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 30.7 bits (69), Expect = 0.002
Identities = 24/108 (22%), Positives = 49/108 (45%), Gaps = 1/108 (0%)

Query: 21 QLRGSSLLEMLLVIALIALAGVLAAAALTGGIDGMRLRSAGKAIAAQLRYTRTQAIATGT 80
+ RG +LLEM+L++ L+ ++ + A D ++ + AQLR+ + + + TG
Sbjct: 2 RQRGFTLLEMMLILLLMGVSAGMVLLAFPASRDDSAAQTLAR-FEAQLRFVQQRGLQTGQ 60

Query: 81 PQRFLIDPQQRRWEAPGGHHGDLPAALEVRFTGARQVQSRQDQGAIQF 128
+ P + ++ G PA + ++G R + R + A
Sbjct: 61 FFGVSVHPDRWQFLVLEARDGADPAPADDGWSGYRWLPLRAGRVATSG 108


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS18065  BCTERIALGSPG  141    2e-46    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 141 bits (358), Expect = 2e-46
Identities = 41/132 (31%), Positives = 61/132 (46%), Gaps = 18/132 (13%)

Query: 15 QAGMSLLEIIIVIVLIGAVLTLVGSRVLGGADRGKANLAKSQIQTLAGKIENFQLDTGKL 74
Q G +LLEI++VIV+IG + +LV ++G ++ A S I L ++ ++LD
Sbjct: 7 QRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKLDNHHY 66

Query: 75 PSKLDDLVTQPGGSSGWLGPYAKPVELN------------DPWGHTIEYRVPGDGQAFDL 122
P+ T G S P P+ N DPWG+ PG+ A+DL
Sbjct: 67 PT------TNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLVNPGEHGAYDL 120

Query: 123 ISLGKDGRPGGS 134
+S G DG G
Sbjct: 121 LSAGPDGEMGTE 132


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS18075  BCTERIALGSPF  436    e-154    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 436 bits (1122), Expect = e-154
Identities = 135/411 (32%), Positives = 215/411 (52%), Gaps = 12/411 (2%)

Query: 1 MPLYRYKALDAHGEMLDGQMEAANDAEVALRLQEQGHLPV---ETRLATGENGSPSLRML 57
M Y Y+ALDA G+ G EA + + L+E+G +P+ E R ++GS L L
Sbjct: 1 MAQYHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLS-L 59

Query: 58 LRKKPFDNAALVQFTQQLATLIGAGQPLDRALSILMDLPEDDKSRRVIADIRDTVRGGAP 117
RK + L T+QLATL+ A PL+ AL + E +++A +R V G
Sbjct: 60 RRKIRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHS 119

Query: 118 LSVALERQHGLFSKLYINMVRAGEAGGSMQDTLQRLADYLERSRALKGKVINALIYPAIL 177
L+ A++ G F +LY MV AGE G + L RLADY E+ + ++ ++ A+IYP +L
Sbjct: 120 LADAMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVL 179

Query: 178 LAVVGCALLFLLGYVVPQFAQMYESLDVALPWFTQAVLSVGLLVRDW--WLVLVVIPGVL 235
V + LL VVP+ + + + ALP T+ ++ + VR + W++L ++ G +
Sbjct: 180 TVVAIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFM 239

Query: 236 G--LWLDRKRRNAAFRAALDAWLLRQKVIGSLIARLETARLTRTLGTLLRNGVPLLAAIG 293
+ L +++R R + LL +IG + L TAR RTL L + VPLL A+
Sbjct: 240 AFRVMLRQEKR----RVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMR 295

Query: 294 IARNVMSNTALVEDVAAAADDVKNGHGLSMSLARGKRFPRLALQMIQVGEESGALDTMLL 353
I+ +VMSN ++ A D V+ G L +L + FP + MI GE SG LD+ML
Sbjct: 296 ISGDVMSNDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLE 355

Query: 354 KTADTFELETAQAIDRALAALVPLITLVLASVVGLVIISVLVPLYDLTNAI 404
+ AD + E + + AL PL+ + +A+VV +++++L P+ L +
Sbjct: 356 RAADNQDREFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLM 406


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
XC_RS18085  SUBTILISIN  204    1e-62    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 204 bits (521), Expect = 1e-62
Identities = 103/362 (28%), Positives = 152/362 (41%), Gaps = 68/362 (18%)

Query: 156 PQLVPNDPLYAQYQWHLSNPNGGINAPAAWDLSQGAGVVVAVLDTGILPGHPDFAGNLLQ 215
Q++ + + + I APA W+ ++G GV VAVLDTG HPD ++
Sbjct: 10 YQVIKQEQQVNEIPRGV----EMIQAPAVWNQTRGRGVKVAVLDTGCDADHPDLKARIIG 65

Query: 216 GYDFITDAEVSRRPTDARVPGALDYGDWEEADNVCYDGSVAQESSWHGTHVSGTVAEATH 275
G +F D+ D + ++ + HGTHV+GT+A AT
Sbjct: 66 GRNF--------------------------TDDDEGDPEIFKDYNGHGTHVAGTIA-ATE 98

Query: 276 NGVGMAGVAPKATILPVRVLGRCG-GYTSDIADAIVWASGGTVDGVPANTNPAEVINMSL 334
N G+ GVAP+A +L ++VL + G G I I +A VD +I+MSL
Sbjct: 99 NENGVVGVAPEADLLIIKVLNKQGSGQYDWIIQGIYYAIEQKVD----------IISMSL 148

Query: 335 GGGEPCDPATQVAINGAVSRGTTVVVAAGNSGEDAAN----HSPASCNNTITVGATRITG 390
GG E A+ AV+ V+ AAGN G+ P N I+VGA
Sbjct: 149 GGPEDVP-ELHEAVKKAVASQILVMCAAGNEGDGDDRTDELGYPGCYNEVISVGAINFDR 207

Query: 391 GIIYYSNYGSQVDLSGPGGGSVDGNPGGYIWQAGYDGATTPTSGSYSYMGIGGTSMASPH 450
+SN ++VDL PG I G Y GTSMA+PH
Sbjct: 208 HASEFSNSNNEVDLVAPGED---------ILSTVPGG---------KYATFSGTSMATPH 249

Query: 451 VAGVVALVQSASIGLGDGPLTPAAMEALLKQTSRRFPVTPPTSTPIGSGIVDAKAALEAV 510
VAG +AL++ + + LT + A L + + +P G+G++ A E
Sbjct: 250 VAGALALIKQLANASFERDLTEPELYAQLIKRTIPLGNSPKME---GNGLLYLTAVEELS 306

Query: 511 LV 512
+
Sbjct: 307 RI 308


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
XC_RS18090  OMADHESIN  56     2e-09    Yersinia outer membrane adhesin signature.
		>OMADHESIN#Yersinia outer membrane adhesin signature.

Length = 455

Score = 55.7 bits (133), Expect = 2e-09
Identities = 71/241 (29%), Positives = 108/241 (44%), Gaps = 24/241 (9%)

Query: 1270 GAESVADGTSAAAFGFGAEATSNYSTALGGYSTASGFNSTALGNFSTASGSSSVAVGGDA 1329
G + A G + A G AEA + A+G S A+G NS A+G S A G S+V G +
Sbjct: 62 GLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAAS 121

Query: 1330 TASGAYSIAAGQASVASGYNSVAVGGALLGLLPTEASGDFSTALGGAAWAPGLNSTALGN 1389
TA + VA+G ++ D A+G + A NS A+G+
Sbjct: 122 TAQK---------------DGVAIGA-------RASTSDTGVAVGFNSKADAKNSVAIGH 159

Query: 1390 FAESTGES--SVALGADSVADRDFAVSVGSAGNERQITNVAAGTQGTDAVNLDQLTAVAE 1447
+ S+A+G S DR+ +VS+G RQ+T++AAGT+ TDAVN+ QL E
Sbjct: 160 SSHVAANHGYSIAIGDRSKTDRENSVSIGHESLNRQLTHLAAGTKDTDAVNVAQLKKEIE 219

Query: 1448 TAQGTSKYFKASGSDDSDAGAYIEGDNALAAGEGANASSDNSTAVGAGAQAVAENATAVG 1507
Q + A +++A A + + L S T A +A A++ +
Sbjct: 220 KTQENTNKRSAELLANANAYADNKSSSVLGIANNYTDSKSAETLENARKEAFAQSKDVLN 279

Query: 1508 M 1508
M
Sbjct: 280 M 280



Score = 55.7 bits (133), Expect = 2e-09
Identities = 65/193 (33%), Positives = 94/193 (48%), Gaps = 22/193 (11%)

Query: 72 GRGASAPAAHATAIGAGSNASATGAVATGADSSASGVNSSAIGRQTNAIGENAVAIGYNS 131
G ASA H+ AIGA + A+ AVA GA S A+GVNS AIG + A+G++AV G S
Sbjct: 62 GLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAAS 121

Query: 132 FVRQAG----------ENGVALGANAGVTGANSVALGAGSRTHEDDVVSVGSGNGRGG-- 179
++ G + GVA+G N+ NSVA+G S + S+ G+
Sbjct: 122 TAQKDGVAIGARASTSDTGVAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRSKTDR 181

Query: 180 ---------PATRRITNVSAGVNANDAVNVAQLQAVSEVAEDTATFFKAQPGDDSIGAYA 230
R++T+++AG DAVNVAQL+ E ++ A+ ++ AYA
Sbjct: 182 ENSVSIGHESLNRQLTHLAAGTKDTDAVNVAQLKKEIEKTQENTNKRSAELLANA-NAYA 240

Query: 231 DGAGAVAAGDAAN 243
D + G A N
Sbjct: 241 DNKSSSVLGIANN 253



Score = 50.3 bits (119), Expect = 8e-08
Identities = 54/164 (32%), Positives = 81/164 (49%), Gaps = 5/164 (3%)

Query: 2104 AATSTAVGNAAVANHITGTAIGGSAYAHGPNDTAIGSNARVNADGSTAVGANTQIAAVAT 2163
AT+ A AAVA A G ++ A GP A+G +A STA I A A+
Sbjct: 76 GATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAASTAQKDGVAIGARAS 135

Query: 2164 NA---VAMGEGAQVSAASGTAIGQGARASAQG--AVALGQGSVADRANTVSVGSVGGERQ 2218
+ VA+G ++ A + AIG + +A ++A+G S DR N+VS+G RQ
Sbjct: 136 TSDTGVAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRSKTDRENSVSIGHESLNRQ 195

Query: 2219 VANVAAGTRATDAVNKGQLDSGVAAANSYTDSRYSAMADSFETY 2262
+ ++AAGT+ TDAVN QL + T+ R + + + Y
Sbjct: 196 LTHLAAGTKDTDAVNVAQLKKEIEKTQENTNKRSAELLANANAY 239



Score = 49.5 bits (117), Expect = 1e-07
Identities = 51/144 (35%), Positives = 72/144 (50%), Gaps = 3/144 (2%)

Query: 834 GANAYAADTGSIAVGTYANAYGPRAISLGGQSNAAGDESIALGWEAQAEGDQGIALGAGS 893
G NA A SIA+G A A A+++G S A G S+A+G ++A GD + GA S
Sbjct: 62 GLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAAS 121

Query: 894 QADAYSTAIGGYATASGASATAVGNNSRAVDGYATALGSDSMASGN--FSTTVGGASVAS 951
A AIG A+ S + AVG NS+A + A+G S + N +S +G S
Sbjct: 122 TAQKDGVAIGARASTSD-TGVAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRSKTD 180

Query: 952 GRGATAIGAESVARMDRDTAIGTE 975
+ +IG ES+ R A GT+
Sbjct: 181 RENSVSIGHESLNRQLTHLAAGTK 204



Score = 49.5 bits (117), Expect = 1e-07
Identities = 59/165 (35%), Positives = 82/165 (49%), Gaps = 26/165 (15%)

Query: 371 GTQTSASGTSSTAVGGPVDLIPGLGFFVQTQASGEAASALGAGAIASGTYTTAVGTLSEA 430
G SA G S A+G +A+ AA A+GAG+IA+G + A+G LS+A
Sbjct: 62 GLNASAKGIHSIAIGA------------TAEAAKGAAVAVGAGSIATGVNSVAIGPLSKA 109

Query: 431 SGTEATAVGYFAYAPGEG------------ATAVGPESWASGELSTALGYYS--TARGAN 476
G A G + A +G AVG S A + S A+G+ S A
Sbjct: 110 LGDSAVTYGAASTAQKDGVAIGARASTSDTGVAVGFNSKADAKNSVAIGHSSHVAANHGY 169

Query: 477 SVALGANSVATRADTVSVGAAGAERQITSVAAGTEGTDAVNLNQL 521
S+A+G S R ++VS+G RQ+T +AAGT+ TDAVN+ QL
Sbjct: 170 SIAIGDRSKTDRENSVSIGHESLNRQLTHLAAGTKDTDAVNVAQL 214



Score = 46.8 bits (110), Expect = 9e-07
Identities = 56/159 (35%), Positives = 79/159 (49%), Gaps = 4/159 (2%)

Query: 650 ALGVGSVAFGGTSTAVGGASVAFGTDSAAFGANAAAGGTASTAIGANSNAFGERTVALGG 709
ALG+ A G + A G S A GA A A A+ A+GA S A G +VA+G
Sbjct: 46 ALGLEYPVRPPVPGAGGLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGP 105

Query: 710 ASNASGDESIALGVSSLASALGTTAVGSNANASIANATAVGFNSSAGDDYATALGGDSN- 768
S A GD ++ G +S A G A+G+ A+ S AVGFNS A + A+G S+
Sbjct: 106 LSKALGDSAVTYGAASTAQKDG-VAIGARASTS-DTGVAVGFNSKADAKNSVAIGHSSHV 163

Query: 769 -ASGYFSTAVGGTSIANGRGATAIGYESIGNGAASTALG 806
A+ +S A+G S + + +IG+ES+ A G
Sbjct: 164 AANHGYSIAIGDRSKTDRENSVSIGHESLNRQLTHLAAG 202



Score = 45.3 bits (106), Expect = 2e-06
Identities = 64/211 (30%), Positives = 95/211 (45%), Gaps = 60/211 (28%)

Query: 903 GGYATASGASATAVGNNSRAVDGYATALGSDSMASGNFSTTVGGASVASGRGATAIGAES 962
G A+A G + A+G + A G A A+G+ S+A+G S +G S A G A GA S
Sbjct: 62 GLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAAS 121

Query: 963 VARMDRDTAIGTESVADGGDSTALGANARASYDSSVALGANANSSNYYSVALGTYAVATG 1022
A+ D VA+GA A++S
Sbjct: 122 TAQKD-----------------------------GVAIGARASTS--------------- 137

Query: 1023 GSATSIGGQSYAPGNESVALGWQSNASGTRSVSLGSGAYTPADDG--VALGAGSIADRDN 1080
+ VA+G+ S A SV++G ++ A+ G +A+G S DR+N
Sbjct: 138 --------------DTGVAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRSKTDREN 183

Query: 1081 TVSVGSVGSERQITNVAAGTEGTDAVNLDQL 1111
+VS+G RQ+T++AAGT+ TDAVN+ QL
Sbjct: 184 SVSIGHESLNRQLTHLAAGTKDTDAVNVAQL 214



Score = 44.9 bits (105), Expect = 3e-06
Identities = 58/199 (29%), Positives = 94/199 (47%), Gaps = 8/199 (4%)

Query: 820 GTESVAYGDDSTALGANAYAADTGSIAVGTYANAYGPRAISLGGQSNAAGDESIALGWEA 879
G + A G S A+GA A AA ++AVG + A G ++++G S A GD ++ G +
Sbjct: 62 GLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAAS 121

Query: 880 QAEGDQGIALGAGSQADAYSTAIGGYATASGASATAVGNNSR--AVDGYATALGSDSMAS 937
A+ D G+A+GA + A+G + A ++ A+G++S A GY+ A+G S
Sbjct: 122 TAQKD-GVAIGARASTSDTGVAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRSKTD 180

Query: 938 GNFSTTVGGASVASGRGATAIG-----AESVARMDRDTAIGTESVADGGDSTALGANARA 992
S ++G S+ A G A +VA++ ++ E+ ANA A
Sbjct: 181 RENSVSIGHESLNRQLTHLAAGTKDTDAVNVAQLKKEIEKTQENTNKRSAELLANANAYA 240

Query: 993 SYDSSVALGANANSSNYYS 1011
SS LG N ++ S
Sbjct: 241 DNKSSSVLGIANNYTDSKS 259



Score = 44.5 bits (104), Expect = 4e-06
Identities = 54/159 (33%), Positives = 80/159 (50%), Gaps = 4/159 (2%)

Query: 1170 AIGSGASATAQYANASGYNAAASGYGSVSTGAFSQASGDYGVALGGESEASGAQSTAVGA 1229
A+G A G NA+A G S++ GA ++A+ VA+G S A+G S A+G
Sbjct: 46 ALGLEYPVRPPVPGAGGLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGP 105

Query: 1230 AAGASGDGAFAGGALSVAEGTESTALGYFASATGESATAVGAESVADGTSAAAFGFGAEA 1289
+ A GD A GA S A+ + A+G AS T ++ AVG S AD ++ A G +
Sbjct: 106 LSKALGDSAVTYGAASTAQ-KDGVAIGARAS-TSDTGVAVGFNSKADAKNSVAIGHSSHV 163

Query: 1290 TSN--YSTALGGYSTASGFNSTALGNFSTASGSSSVAVG 1326
+N YS A+G S NS ++G+ S + +A G
Sbjct: 164 AANHGYSIAIGDRSKTDRENSVSIGHESLNRQLTHLAAG 202



Score = 42.6 bits (99), Expect = 2e-05
Identities = 44/132 (33%), Positives = 70/132 (53%), Gaps = 4/132 (3%)

Query: 764 GGDSNASGYFSTAVGGTSIANGRGATAIGYESIGNGAASTALGFASVAWGEGGTAIGTES 823
G +++A G S A+G T+ A A A+G SI G S A+G S A G+ G S
Sbjct: 62 GLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAAS 121

Query: 824 VAYGDDSTALGANAYAADTGSIAVGTYANAYGPRAISLGGQSNAAGDE--SIALGWEAQA 881
A D A+GA A +DTG +AVG + A ++++G S+ A + SIA+G ++
Sbjct: 122 TAQ-KDGVAIGARASTSDTG-VAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRSKT 179

Query: 882 EGDQGIALGAGS 893
+ + +++G S
Sbjct: 180 DRENSVSIGHES 191



Score = 42.6 bits (99), Expect = 2e-05
Identities = 57/210 (27%), Positives = 89/210 (42%), Gaps = 21/210 (10%)

Query: 1653 GFIPARASGTGAAAFGGGAWATADYTTAIGWNSYADGVNASALGQSAAALADNALAIGGN 1712
G + A A G + A G A A A+G S A GVN+ A+G + AL D+A+ G
Sbjct: 61 GGLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAA 120

Query: 1713 SRADAIGASVVGVDASATGINSTGVGRQVNVIGENAVSVGYNSFVRESAVNGVALGANAG 1772
S A G + +G AS + + V+VG+NS + ++
Sbjct: 121 STAQKDGVA-IGARASTS---------------DTGVAVGFNSKADAKNSVAIGHSSHVA 164

Query: 1773 ATGADSVALGSGSRTYEANTVSVGSGNGRGGPATRRIVNVSDGEVATDAVNKGQLDALAA 1832
A S+A+G S+T N+VS+G + R++ +++ G TDAVN QL
Sbjct: 165 ANHGYSIAIGDRSKTDRENSVSIGHES-----LNRQLTHLAAGTKDTDAVNVAQLKKEIE 219

Query: 1833 DVQTTSGMVQTTGEGVARATGDRATAAGAG 1862
Q + A A D +++ G
Sbjct: 220 KTQENTNKRSAELLANANAYADNKSSSVLG 249



Score = 40.3 bits (93), Expect = 9e-05
Identities = 38/103 (36%), Positives = 58/103 (56%), Gaps = 4/103 (3%)

Query: 1858 AAGAGATASGARSVAVAAGSTASATGASAMGVDSSASGVNSTAMGRQTNSIGENGVALGY 1917
A G A+A G S+A+ A + A+ A A+G S A+GVNS A+G + ++G++ V G
Sbjct: 60 AGGLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGA 119

Query: 1918 NSFVRQSGANAVALGANAGASGADSVALGSGSRTYEANVVSVG 1960
S ++ G VA+GA A S VA+G S+ N V++G
Sbjct: 120 ASTAQKDG---VAIGARASTSDT-GVAVGFNSKADAKNSVAIG 158



Score = 40.3 bits (93), Expect = 9e-05
Identities = 45/131 (34%), Positives = 70/131 (53%), Gaps = 4/131 (3%)

Query: 552 AAGSNAAAFNDYSTALGSSSVASAQGATAVGSGANATTDNATAVGFNSTAVAENTTALGG 611
A G NA+A +S A+G+++ A+ A AVG+G+ AT N+ A+G S A+ ++ G
Sbjct: 60 AGGLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGA 119

Query: 612 NSSASGDGSTAVGGATRATASGATALGYESIANGADSTALGVGS--VAFGGTSTAVGGAS 669
S+A DG GA +T+ A+G+ S A+ +S A+G S A G S A+G S
Sbjct: 120 ASTAQKDGVAI--GARASTSDTGVAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRS 177

Query: 670 VAFGTDSAAFG 680
+S + G
Sbjct: 178 KTDRENSVSIG 188



Score = 39.5 bits (91), Expect = 2e-04
Identities = 42/123 (34%), Positives = 61/123 (49%), Gaps = 12/123 (9%)

Query: 610 GGNSSASGDGSTAVGGATRATASGATALGYESIANGADSTALGVGSVAFGGTSTAVGGAS 669
G N+SA G S A+G A A A+G A S A GV SVA G S A+G ++
Sbjct: 62 GLNASAKGIHSIAIGATAEAAKGAAVAVG-------AGSIATGVNSVAIGPLSKALGDSA 114

Query: 670 VAFGTDSAAFGANAAAGGTAST-----AIGANSNAFGERTVALGGASNASGDESIALGVS 724
V +G S A A G AST A+G NS A + +VA+G +S+ + + ++ +
Sbjct: 115 VTYGAASTAQKDGVAIGARASTSDTGVAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIG 174

Query: 725 SLA 727
+
Sbjct: 175 DRS 177



Score = 38.0 bits (87), Expect = 5e-04
Identities = 37/130 (28%), Positives = 67/130 (51%)

Query: 1477 AAGEGANASSDNSTAVGAGAQAVAENATAVGMDALASGIGAAALGNNAQALGENSSAVGS 1536
A G A+A +S A+GA A+A A AVG ++A+G+ + A+G ++ALG+++ G+
Sbjct: 60 AGGLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGA 119

Query: 1537 NALASDIGATANGAGAQAISTYATALGSEAVASDNQATAVGFRSAASNVGSAAFGGYSES 1596
+ A G + + + A S+A A ++ A AA++ S A G S++
Sbjct: 120 ASTAQKDGVAIGARASTSDTGVAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRSKT 179

Query: 1597 SGRLSSALGY 1606
S ++G+
Sbjct: 180 DRENSVSIGH 189



Score = 37.2 bits (85), Expect = 8e-04
Identities = 38/141 (26%), Positives = 67/141 (47%)

Query: 1200 GAFSQASGDYGVALGGESEASGAQSTAVGAAAGASGDGAFAGGALSVAEGTESTALGYFA 1259
G + A G + +A+G +EA+ + AVGA + A+G + A G LS A G + G +
Sbjct: 62 GLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAAS 121

Query: 1260 SATGESATAVGAESVADGTSAAAFGFGAEATSNYSTALGGYSTASGFNSTALGNFSTASG 1319
+A + S +D A F A+A ++ + + A+ S A+G+ S
Sbjct: 122 TAQKDGVAIGARASTSDTGVAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRSKTDR 181

Query: 1320 SSSVAVGGDATASGAYSIAAG 1340
+SV++G ++ +AAG
Sbjct: 182 ENSVSIGHESLNRQLTHLAAG 202



Score = 35.6 bits (81), Expect = 0.002
Identities = 59/189 (31%), Positives = 85/189 (44%), Gaps = 25/189 (13%)

Query: 1538 ALASDIGATANGAGAQAISTYATALGSEAVASDNQATAVGFRSAASNVGSAAFGGYSESS 1597
A A D N Q ALG E A G ++A + S A G +E++
Sbjct: 23 AFADDYDGIPNLTAVQISPNADPALGLEYPVRPPVPGAGGLNASAKGIHSIAIGATAEAA 82

Query: 1598 GRLSSALGYGAVASSDYSTAVGAASLASGASAVAVGEFSEATGDESVAVGGSTFFGFIPA 1657
+ A+G G++A+ S A+G S A G SAV G S A D VA+G A
Sbjct: 83 KGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAASTAQKD-GVAIG---------A 132

Query: 1658 RASGTGAAAFGGGAWATADYTTAIGWNSYADGVNASALGQSAAALADN--ALAIGGNSRA 1715
RAS T+D A+G+NS AD N+ A+G S+ A++ ++AIG S+
Sbjct: 133 RAS-------------TSDTGVAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRSKT 179

Query: 1716 DAIGASVVG 1724
D + +G
Sbjct: 180 DRENSVSIG 188



Score = 35.6 bits (81), Expect = 0.003
Identities = 47/142 (33%), Positives = 71/142 (50%), Gaps = 4/142 (2%)

Query: 1135 AQGEDATAAGSNATADADYSSAFGASSQATAIGAVAIGSGASATAQYANASGYNAAASGY 1194
+ A G NA+A +S A GA+++A AVA+G+G+ AT + A G + A G
Sbjct: 53 VRPPVPGAGGLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGD 112

Query: 1195 GSVSTGAFSQASGDYGVALGGESEASGAQSTAVGAAAGASGDGAFAGGALS--VAEGTES 1252
+V+ GA S A D GVA+G + S AVG + A + A G S A S
Sbjct: 113 SAVTYGAASTAQKD-GVAIGARASTSDT-GVAVGFNSKADAKNSVAIGHSSHVAANHGYS 170

Query: 1253 TALGYFASATGESATAVGAESV 1274
A+G + E++ ++G ES+
Sbjct: 171 IAIGDRSKTDRENSVSIGHESL 192



Score = 34.9 bits (79), Expect = 0.004
Identities = 45/147 (30%), Positives = 73/147 (49%), Gaps = 4/147 (2%)

Query: 1518 AALGNNAQALGENSSAVGSNALASDIGATANGAGAQAISTYATALGSEAVASDNQATAVG 1577
A G NA A G +S A+G+ A A+ A A GAG+ A + A+G + A + A G
Sbjct: 59 GAGGLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYG 118

Query: 1578 FRSAASNVGSAAFGGYSESSGRLSSALGYGAVASSDYSTAVGAAS--LASGASAVAVGEF 1635
S A G A G S+ A+G+ + A + S A+G +S A+ ++A+G+
Sbjct: 119 AASTAQKDGVAI--GARASTSDTGVAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDR 176

Query: 1636 SEATGDESVAVGGSTFFGFIPARASGT 1662
S+ + SV++G + + A+GT
Sbjct: 177 SKTDRENSVSIGHESLNRQLTHLAAGT 203



Score = 34.1 bits (77), Expect = 0.007
Identities = 43/137 (31%), Positives = 57/137 (41%), Gaps = 1/137 (0%)

Query: 393 GLGFFVQTQASGEAASALGAGAIASGTYTTAVGTLSEASGTEATAVGYFAYAPGEGATAV 452
G+ Q S A ALG A G + A G + A+G A A A AV
Sbjct: 30 GIPNLTAVQISPNADPALGLEYPVRPPVPGAGGLNASAKGIHSIAIGATAEAAKGAAVAV 89

Query: 453 GPESWASGELSTALGYYSTARGANSVALGANSVATRADTVSVGAAGAERQITSVAAGTEG 512
G S A+G S A+G S A G ++V GA S A + D V++GA +
Sbjct: 90 GAGSIATGVNSVAIGPLSKALGDSAVTYGAASTAQK-DGVAIGARASTSDTGVAVGFNSK 148

Query: 513 TDAVNLNQLTAVSDVAS 529
DA N + S VA+
Sbjct: 149 ADAKNSVAIGHSSHVAA 165



Score = 33.7 bits (76), Expect = 0.010
Identities = 38/114 (33%), Positives = 57/114 (50%), Gaps = 2/114 (1%)

Query: 1127 GTGDGTAFAQGEDATAAGSNATADADYSSAFGASSQATAIGAVAIGSGASATAQYANASG 1186
G G A A+G + A G+ A A + A GA S AT + +VAIG + A A G
Sbjct: 59 GAGGLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYG 118

Query: 1187 YNAAASGYGSVSTGAFSQASGDYGVALGGESEASGAQSTAVGAAAGASGDGAFA 1240
+ A G V+ GA + S D GVA+G S+A S A+G ++ + + ++
Sbjct: 119 AASTAQKDG-VAIGARASTS-DTGVAVGFNSKADAKNSVAIGHSSHVAANHGYS 170



Score = 33.3 bits (75), Expect = 0.013
Identities = 38/130 (29%), Positives = 61/130 (46%), Gaps = 3/130 (2%)

Query: 1043 GWQSNASGTRSVSLGSGAYTPADDGVALGAGSIADRDNTVSVGSVGSERQITNVAAGTEG 1102
G ++A G S+++G+ A VA+GAGSIA N+V++G + + V G
Sbjct: 62 GLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAAS 121

Query: 1103 TDAVNLDQLNAVAGTAETTARLYAGTGDGTAFAQGEDATAAGSNATADADYSSAFGASSQ 1162
T + + A A T++T A + A A+ A S+ A+ YS A G S+
Sbjct: 122 TAQKDGVAIGARASTSDTGV---AVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRSK 178

Query: 1163 ATAIGAVAIG 1172
+V+IG
Sbjct: 179 TDRENSVSIG 188


94  XC_RS19325  XC_RS19360  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
XC_RS19325  -2      10       1.746739   PilA
XC_RS19330  -2      9        2.282426   hypothetical protein
XC_RS19335  -1      10       2.090528   DNA topoisomerase I
XC_RS19340  1       11       4.514171   threonylcarbamoyl-AMP synthase
XC_RS19345  -2      11       3.257295   hypothetical protein
XC_RS19350  -2      10       2.654009   hypothetical protein
XC_RS19355  -1      11       2.553377   diguanylate cyclase
XC_RS19360  -1      11       3.404881   tropinone reductase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS19325  PF07201  28     0.043    Hypersensitivity response secretion protein HrpJ
		>PF07201#Hypersensitivity response secretion protein HrpJ

Length = 293

Score = 27.5 bits (61), Expect = 0.043
Identities = 9/36 (25%), Positives = 13/36 (36%), Gaps = 7/36 (19%)

Query: 13 RGGPVDVEALRGLYRDGVI-------ALDSLVWREG 41
+ G ++ LR YRD V+ L R
Sbjct: 189 QSGVNPLQPLRDTYRDAVMGYQGIYAIWSDLQKRFP 224


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS19330  PF03544  30     0.017    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 29.6 bits (66), Expect = 0.017
Identities = 21/103 (20%), Positives = 30/103 (29%), Gaps = 1/103 (0%)

Query: 60 PAATPETASVFTPAPITADPSPGAATSAEQVPEAGFTPVTPQ-TTQPDSVSAAHISTPVD 118
PA +V P +P P E EA P+ +P + P
Sbjct: 57 PADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKR 116

Query: 119 AGSVASSAPSPGAWSARTAAEPSEHQPATTASPATPAQATPLP 161
S P+ + A S A T+ P T + P
Sbjct: 117 DVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASGPRA 159


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
XC_RS19350  IGASERPTASE  35     2e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 35.0 bits (80), Expect = 2e-04
Identities = 26/155 (16%), Positives = 43/155 (27%), Gaps = 9/155 (5%)

Query: 43 VQQQKKLMTAPAAMPLPSSAAAPVSARPAPARAATQAPPVAAPAPQAPAAVPTATTA--- 99
V+ +K + + +P A P V PQ+ T
Sbjct: 1114 VETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAK 1173

Query: 100 -TSSLPPPPLFECTAHDNGRYFTE--ESEPATRCLPMQVTNLAGGPAQGGGSACEVVTDR 156
TSS P+ E T + G E E+ P + + P + V
Sbjct: 1174 ETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRS---VRSV 1230

Query: 157 CAPVPDQSLCAAWRQRAEQAEAAWRFSDEAQSAAR 191
V + + R + ++ S AR
Sbjct: 1231 PHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDAR 1265



Score = 29.6 bits (66), Expect = 0.009
Identities = 30/176 (17%), Positives = 52/176 (29%), Gaps = 11/176 (6%)

Query: 40 PKGVQQQKKLMTAPAAMPLPSSAAAPVSARPAPARAATQAPPVAAPAPQAPAAVPTATTA 99
P+ ++ + + T P A P A PV PAP A P+ TT
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAP----ATPSETTE 1038

Query: 100 TSSLPPPPLFECTAHDNGRYFTEESEPATRCLPMQVTNLAGGPAQGGGSACEVVTDRCAP 159
T + + + E+ R +V A + EV
Sbjct: 1039 TVAENSKQESKTVEKNEQD--ATETTAQNR----EVAKEAKSNVKANTQTNEVAQSGSET 1092

Query: 160 VPDQSLCAAWRQRAEQAEAAWRFSDEAQSAARKQRYDQARRVMDES-RCAGTPATP 214
Q+ E+ E A +++ Q + ++ E+ + PA
Sbjct: 1093 KETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARE 1148


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS19360  DHBDHDRGNASE  112    3e-32    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 112 bits (282), Expect = 3e-32
Identities = 75/253 (29%), Positives = 112/253 (44%), Gaps = 10/253 (3%)

Query: 8 LDGQTALITGASAGIGLAIARELLGFGADLLMVARDADALAQARDELAEEFPERELHGLA 67
++G+ A ITGA+ GIG A+AR L GA + A D + + + + R
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIA--AVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 68 ADVSDDEERRAILDWVEDHADGLHLLINNAGGNITRAAIDYTEDEWRGIFETNVFSAFEL 127
ADV D I +E + +L+N AG +++EW F N F
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 128 SRYAHPLLTRHAASAIVNVGSVSGITHVRSGAPYGMTKAALQQMTRNLAVEWAEDGIRVN 187
SR + + +IV VGS S A Y +KAA T+ L +E AE IR N
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCN 183

Query: 188 AVAPWYIRTRRTSGPLSDPDYYEQVIERT--------PMRRIGEPEEVAAAVGFLCLPAG 239
V+P T +D + EQVI+ + P++++ +P ++A AV FL
Sbjct: 184 IVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQA 243

Query: 240 SYITGECIAVDGG 252
+IT + VDGG
Sbjct: 244 GHITMHNLCVDGG 256


95  XC_RS19430  XC_RS19515  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
XC_RS19430  -1      13       0.957850   histidine kinase
XC_RS19435  0       10       0.739439   cell division protein FtsX
XC_RS19440  -1      8        -0.191644  ABC transporter ATP-binding protein
XC_RS19445  -1      9        -0.066885  ATP-dependent RNA helicase RhlB
XC_RS19450  -2      11       -1.184767  thioredoxin
XC_RS19455  -2      8        -1.260771  transcription termination factor Rho
XC_RS19460  -2      10       -1.648274  hypothetical protein
XC_RS19465  -1      9        -1.705856  isocitrate dehydrogenase kinase/phosphatase
XC_RS19470  -1      10       -0.403680  biotin transporter BioY
XC_RS19475  -1      13       0.001794   hypothetical protein
XC_RS19480  -1      16       0.358789   isocitrate dehydrogenase
XC_RS19490  1       15       0.688860   peptidoglycan-binding protein LysM
XC_RS19495  1       15       0.881221   NADPH-dependent 7-cyano-7-deazaguanine
XC_RS19500  1       15       0.973043   N-acyl-L-amino acid amidohydrolase
XC_RS19505  2       14       0.603783   membrane protein
XC_RS19510  1       11       0.414231   acriflavine resistance protein B
XC_RS19515  0       9        0.403107   acriflavin resistance protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
XC_RS19430  HTHFIS  58     9e-12    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 57.9 bits (140), Expect = 9e-12
Identities = 31/115 (26%), Positives = 46/115 (40%), Gaps = 12/115 (10%)

Query: 140 GATVLYIEDSRVVAEATKRMLERQSLKVVHVLTAEDAFALLTAESLGRTERRIDVVLTDV 199
GAT+L +D + + L R V A + + A D+V+TDV
Sbjct: 3 GATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGD-------GDLVVTDV 55

Query: 200 TLKGELNGRDVVERIRIDFAYGKRRLPVLVMTGDTNPRNQSELLRAGANDLVQKP 254
+ E N D++ RI+ LPVLVM+ + GA D + KP
Sbjct: 56 VMPDE-NAFDLLPRIKKARP----DLPVLVMSAQNTFMTAIKASEKGAYDYLPKP 105



Score = 52.1 bits (125), Expect = 8e-10
Identities = 21/88 (23%), Positives = 39/88 (44%), Gaps = 4/88 (4%)

Query: 12 DAPRVMVVDGSKLVRKLIADVLKRDLPNVQVIGCASIAEARDALEAGAVDLVTTSLSLPD 71
++V D +R ++ L R V ++ A + AG DLV T + +PD
Sbjct: 2 TGATILVADDDAAIRTVLNQALSRA--GYDVRITSNAATLWRWIAAGDGDLVVTDVVMPD 59

Query: 72 GDGLTLARSVREAAGQAYVPVIVVSGDA 99
+ L +++A + +PV+V+S
Sbjct: 60 ENAFDLLPRIKKA--RPDLPVLVMSAQN 85


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
XC_RS19445  IGASERPTASE  32     0.008    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 32.0 bits (72), Expect = 0.008
Identities = 33/181 (18%), Positives = 54/181 (29%), Gaps = 24/181 (13%)

Query: 378 QKIPVEPVTTELLTPLPRTPRATVE--GEEVDDDAGDSVGTIFREAREQRAADEARRGGG 435
+ PV P P P TP T E E ++ E EQ A + +
Sbjct: 1021 DEAPVPP-------PAPATPSETTETVAENSKQESKTV------EKNEQDATETTAQNRE 1067

Query: 436 RSGPGGASRSGSGGGRRDGAGADGKPRPPRRKPRVEGEADPAAAPSETPVVVAAAAETPA 495
+ ++S + A + E + V E P
Sbjct: 1068 VAK---EAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPK 1124

Query: 496 VTAAEGERAPRKRRRRRNGRPVEGAEPVVASTPVPAPAAPRKPTQVVAKPVRAAAKPSGS 555
VT+ + K+ + +P AEP + P P+ T A + A + S +
Sbjct: 1125 VTS----QVSPKQEQSETVQP--QAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSN 1178

Query: 556 P 556

Sbjct: 1179 V 1179


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits      Score  E-value  Comments
XC_RS19455  cdtoxina  30     0.031    Cytolethal distending toxin A signature.
		>cdtoxina#Cytolethal distending toxin A signature.

Length = 258

Score = 29.7 bits (66), Expect = 0.031
Identities = 15/43 (34%), Positives = 18/43 (41%), Gaps = 2/43 (4%)

Query: 17 GSQQPNLPLPSSPAPEAPRPAQTPSPAAESAPAQS-SGDGGAA 58
+P LPLP P P P P P +APA S G+
Sbjct: 48 SPDEPGLPLPG-PGPALPTNGAIPIPEPGTAPAVSLMNMDGSV 89


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS19460  PF03544  36     7e-05    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 36.5 bits (84), Expect = 7e-05
Identities = 23/131 (17%), Positives = 36/131 (27%), Gaps = 24/131 (18%)

Query: 100 AKPGEKNQYLVSIRSAHFGGEASDGSSVRAKDMTP-----PSYPKAAFQDGATGIVYLLL 154
+N S+ S + A P YP A G V +
Sbjct: 125 PASPFENTAPARPTSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQVKVKF 184

Query: 155 KIGRDGKVADLVAEQVNLTVSIPAAKRERVRKTLADASIAKARTWTFDPPSDGPEHTAPY 214
+ DG+V + V + + PA R W ++P G
Sbjct: 185 DVTPDGRV-----DNVQILSAKPA-------NMFEREVKNAMRRWRYEPGKPGSG----- 227

Query: 215 WTMRVPVSFDI 225
+ V + F I
Sbjct: 228 --IVVNILFKI 236


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS19470  PF09025  28     0.036    YopR Core
		>PF09025#YopR Core

Length = 143

Score = 27.7 bits (61), Expect = 0.036
Identities = 18/68 (26%), Positives = 26/68 (38%), Gaps = 12/68 (17%)

Query: 38 QVRSLEQRLG--YPLLQRHARGVSATLQGQQLLDRIAPHLDAIA----------EAFQPF 85
QV + EQ LG P R G+ G++LL R A L + A P
Sbjct: 28 QVLAFEQALGGEPPAAGRRLAGLENGALGERLLQRFAQPLQGLEADRLELKAMLRAELPL 87

Query: 86 AARREDTL 93
+++ L
Sbjct: 88 GRQQQTFL 95


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
XC_RS19500  FLGHOOKFLIK  29     0.038    Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 29.0 bits (64), Expect = 0.038
Identities = 18/61 (29%), Positives = 29/61 (47%), Gaps = 4/61 (6%)

Query: 13 ALPTLAAAQAAPRPEVQAA--AAPLQAKVVQWRRDFHQHPELSNREERTAATVAAQLRKL 70
A P + Q P P V A +APL + +W++ QH L R+ + +A + + L
Sbjct: 211 ASPLITPHQTQPLPTVAAPVLSAPLGSH--EWQQSLSQHISLFTRQGQQSAELRLHPQDL 268

Query: 71 G 71
G
Sbjct: 269 G 269


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
XC_RS19505  RTXTOXIND  50     8e-09    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 50.2 bits (120), Expect = 8e-09
Identities = 30/158 (18%), Positives = 60/158 (37%), Gaps = 22/158 (13%)

Query: 55 AATQDVPVYATALGTVTAL-NTVTVNPQVSGQLMSLNFQEGQEVKKGALLAQIDPRT--- 110
+ V + ATA G +T + + P + + + +EG+ V+KG +L ++
Sbjct: 75 SVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA 134

Query: 111 ----LQASYDQALAAKRQNQALLATA---RVNYQRSNDPAYKQYVS-----------RTD 152
Q+S QA + + Q L + ++ + D Y Q VS +
Sbjct: 135 DTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQ 194

Query: 153 LDTQRNQVAQYEAAVAANDAQMRSAQVQLQFTRVTAPI 190
T +NQ Q E + A+ + ++ + +
Sbjct: 195 FSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRV 232



Score = 34.4 bits (79), Expect = 9e-04
Identities = 22/178 (12%), Positives = 62/178 (34%), Gaps = 29/178 (16%)

Query: 93 EGQEVKKGALLAQIDPRTLQASYDQ-------ALAAKR----------QNQALLATARVN 135
+ + ++ +LA+I+ + ++ +L K+ +N+ + A +
Sbjct: 210 DKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELR 269

Query: 136 YQRSNDPAYKQYVSRTDLD-TQRNQVAQYEAAVAANDAQMRSAQV---------QLQFTR 185
+S + + + Q+ + E + + Q +
Sbjct: 270 VYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASV 329

Query: 186 VTAPIDGIAGIRGV-DVGNIVSTTSTIVTLT-QIRPIYVSFSLPERELPAVRSGQAAT 241
+ AP+ V G +V+T T++ + + + V+ + +++ + GQ A
Sbjct: 330 IRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAI 387



Score = 31.7 bits (72), Expect = 0.005
Identities = 11/104 (10%), Positives = 35/104 (33%), Gaps = 1/104 (0%)

Query: 79 NPQVSGQLMSLNFQEGQEVKKGALLAQIDPRTLQASYDQALAAKRQNQALLATARVNYQR 138
+V + Q + +++ +A LA + + L +
Sbjct: 181 EEEVLRLTSLIKEQFSTW-QNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDD 239

Query: 139 SNDPAYKQYVSRTDLDTQRNQVAQYEAAVAANDAQMRSAQVQLQ 182
+ +KQ +++ + Q N+ + + +Q+ + ++
Sbjct: 240 FSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEIL 283


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS19510  ACRIFLAVINRP  731    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 731 bits (1888), Expect = 0.0
Identities = 300/1072 (27%), Positives = 503/1072 (46%), Gaps = 65/1072 (6%)

Query: 4 STIFIRRPIATSLLMAGILLLGILGYRQLPVSALPEIDAPSLVVSTQYPGANATTMASLV 63
+ FIRRPI +L +++ G L QLPV+ P I P++ VS YPGA+A T+ V
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 64 TTPLERQLGQISGLQMMTSDS-SAGLSTIILQFSMDRDIDIAAQDVQAAIRQAT--LPSS 120
T +E+ + I L M+S S SAG TI L F D DIA VQ ++ AT LP
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE 121

Query: 121 LPYQPVYNRVNPADAAILTLKLTSDT--LPLREVNRYADAILAQRLSQVPGVGLVSIAGN 178
+ Q + + + ++ SD +++ Y + + LS++ GVG V + G
Sbjct: 122 VQQQGIS-VEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 179 VRPAVRIQVNPAQLSNMGLTMESLRSALTQTNVSAPKGSLN------GKTQSYSIGTNDQ 232
A+RI ++ L+ LT + + L N G L G+ + SI +
Sbjct: 181 QY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 233 LTDAAQYRETIISYN-NGAPVRLADVAKVVDGVENDQLAAWADGKPAVLLEIRRQPGANI 291
+ ++ + + N +G+ VRL DVA+V G EN + A +GKPA L I+ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 292 VQTVEQIRSILPQLQGVLPADVHLDVFSDRTETIRASVHEVKFTLVLTIFLVVAVIFVFL 351
+ T + I++ L +LQ P + + D T ++ S+HEV TL I LV V+++FL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 352 RRLWATIIPSVAVPLSLAGTFGVMAFAGMSLDNLSLMALVVATGFVVDDAIVMIENIVRY 411
+ + AT+IP++AVP+ L GTF ++A G S++ L++ +V+A G +VDDAIV++EN+ R
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 412 IEQGKSGP-EAAEIGARQIGFTVLSLTVSLVAVFLPLLLMPGVTGRLFHEFAWVLSIAVV 470
+ + K P EA E QI ++ + + L AVF+P+ G TG ++ +F+ + A+
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 471 ISMLVSLTLTPMMCAYLLKPDALPEGEDAHERAAAEGKTNLWTRTVGVYERSLDWVLDHQ 530
+S+LV+L LTP +CA LLKP + E ++ + +V Y S+ +L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHE--NKGGFFGWFNTTFDHSVNHYTNSVGKILGST 537

Query: 531 PLTLAVAIGAVALTVVLYVAIPKGLLPEQDTGLITGVVQVDQNVAFPQMEQRTQAVAAAL 590
L + VA VVL++ +P LPE+D G+ ++Q+ + ++ V
Sbjct: 538 GRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYY 597

Query: 591 RKDPA--VTGVAAFIGAGSMNPTLNQGQLSIVLKTRGDRD----DLETVVARLQKAVSGI 644
K+ V V G N G + LK +R+ E V+ R + + I
Sbjct: 598 LKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKI 657

Query: 645 PGVALFLKPVQDV-TLDTRVAATEYQYSMADVDSTELASWA-TRMTEAMRKLPELADVDN 702
+ + + L T A + L + A + L V
Sbjct: 658 RDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRP 717

Query: 703 NLANQGRALELSIDRDKASMLGVPMQTIDDTLYDAFGQRQISTIFTELNQYRVVLEVAPE 762
N +L +D++KA LGV + I+ T+ A G ++ ++ ++ +
Sbjct: 718 NGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAK 777

Query: 763 FRSSTALMNQLAVASNGSGALTGTNATSFGQVTSSNSSTATGVGNQNTGIVVGAGSIIPL 822
FR +++L V S G ++P
Sbjct: 778 FRMLPEDVDKLYVRSA-------------------------------------NGEMVPF 800

Query: 823 AALAEAKVTNTPLVVSHQQQLPAVTISFNLAPGHSLSQAVAAIEQARADLKIPTQVHAEF 882
+A + + LP++ I APG S A+A +E + K+P + ++
Sbjct: 801 SAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAMALMENLAS--KLPAGIGYDW 858

Query: 883 VGKAAEFTGSQTDIVWLLLASIVVIYIVLGVLYESYIHPLTIISTLPPAGVGALLALMVC 942
G + + S L+ S VV+++ L LYES+ P++++ +P VG LLA +
Sbjct: 859 TGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLF 918

Query: 943 GLSLSVDGIVGIVLLIGIVKKNAIMMIDFAIEA-RRTGVNAHEAIRRACLLRFRPIMMTT 1001
V +VG++ IG+ KNAI++++FA + + G EA A +R RPI+MT+
Sbjct: 919 NQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTS 978

Query: 1002 AAAMLGALPLALGTGIGSELRRPLGIAIVGGLLLSQLVTLYTTPVIYLYMER 1053
A +LG LPLA+ G GS + +GI ++GG++ + L+ ++ PV ++ + R
Sbjct: 979 LAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVVIRR 1030



Score = 73.3 bits (180), Expect = 3e-15
Identities = 72/454 (15%), Positives = 156/454 (34%), Gaps = 48/454 (10%)

Query: 614 QGQLSIVLKTRGDRDD-LETVVARLQKAVSGIPGVALFLKPVQDVTLDTRVAATEYQYSM 672
+++ ++ D D V +LQ A +P + + + + +
Sbjct: 87 SVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQQQGISVEKSSSSYLMVAGFVSDN 146

Query: 673 ADVDSTELASWATR-MTEAMRKLPELADVDNNLANQGRALELSIDRDKASMLGVPMQTID 731
+++ + + + + +L + DV L A+ + +D D
Sbjct: 147 PGTTQDDISDYVASNVKDTLSRLNGVGDV--QLFGAQYAMRIWLDADL------------ 192

Query: 732 DTLYDAFGQRQISTIFTELNQYRVVLEVAPEFRSSTALMNQLAVASNGS-GALTGTNATS 790
LN+Y++ L Q + G G
Sbjct: 193 ------------------LNKYKLTPV-----DVINQLKVQNDQIAAGQLGGTPALPGQQ 229

Query: 791 FGQVTSSNSSTATGVGNQNTGIVVGA-GSIIPLAALAEAKVT--NTPLVVSHQQQLPAVT 847
+ + + V + GS++ L +A ++ N ++ + PA
Sbjct: 230 LNASIIAQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGK-PAAG 288

Query: 848 ISFNLAPGHSLSQAVAAIEQARADLK--IPTQVHAEFVGKAAEF-TGSQTDIVWLLLASI 904
+ LA G + AI+ A+L+ P + + F S ++V L +I
Sbjct: 289 LGIKLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAI 348

Query: 905 VVIYIVLGVLYESYIHPLTIISTLPPAGVGALLALMVCGLSLSVDGIVGIVLLIGIVKKN 964
+++++V+ + ++ L +P +G L G S++ + G+VL IG++ +
Sbjct: 349 MLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDD 408

Query: 965 AIMMIDFAIEARR-TGVNAHEAIRRACLLRFRPIMMTTAAAMLGALPLALGTGIGSELRR 1023
AI++++ + EA ++ ++ +P+A G + R
Sbjct: 409 AIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYR 468

Query: 1024 PLGIAIVGGLLLSQLVTLYTTPVIYLYMERAGER 1057
I IV + LS LV L TP + + +
Sbjct: 469 QFSITIVSAMALSVLVALILTPALCATLLKPVSA 502


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS19515  ACRIFLAVINRP  752    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 752 bits (1943), Expect = 0.0
Identities = 287/1034 (27%), Positives = 488/1034 (47%), Gaps = 26/1034 (2%)

Query: 3 ISAPFIKRPIGTSLLAIGLFVIGLMCYLRLGVAALPNIQIPVIFVHATQSGADASTMAST 62
++ FI+RPI +LAI L + G + L+L VA P I P + V A GADA T+ T
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 63 VTAPLERHLGQLPGIDRMRSSS-SESSSMVFMIFQSNRDIDSAAQDVQTAINSAQSDLPS 121
VT +E+++ + + M S+S S S + + FQS D D A VQ + A LP
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 122 GLGTPMYQKANPNDDPVIAIALTSDT--QSADELYNVADSLLAQRLRQITGISSVDIAGA 179
+ + ++ SD + D++ + S + L ++ G+ V + GA
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 180 STPAVRVDVDLRALNALGLTPDDLRNAVRAANVTSPTGFL------SDGNTTMAIIANDS 233
A+R+ +D LN LTP D+ N ++ N G L +IIA
Sbjct: 181 QY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 234 VAKAADFAQLAISTQSNGRIVRLGDVATVYDGQQDAYQAAWFNGKPAVVMYAFTRAGANI 293
+F ++ + S+G +VRL DVA V G ++ A NGKPA + GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 294 VETVDQVKAQIPELRAYLQPGTTLTPYFDRTPTIRASLHEVQATLMISLAMVVLTMALFL 353
++T +KA++ EL+ + G + +D TP ++ S+HEV TL ++ +V L M LFL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 354 RRLAPTLIAAVTVPLSLAGSALVMYVLGFTLNNLSLLALVIAIGFVVDDAIVVIENIMRH 413
+ + TLI + VP+ L G+ ++ G+++N L++ +V+AIG +VDDAIVV+EN+ R
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 414 L-DEGMPRLQAALTGAREIGFTIVSITASLVAVFIPMLFASGMVGAFFREFTVTLVAAIV 472
+ ++ +P +A +I +V I L AVFIPM F G GA +R+F++T+V+A+
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 473 VSMLVSLTLTPALCSRFLSAHTAP--ETPSRFGAWLDRMHDRMLAVYTVALDFSLRHALL 530
+S+LV+L LTPALC+ L +A E F W + D + YT ++ L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 531 LSLTPLVLIAATIFLGGAVKKGSFPPQDTGLIWGRANSSATVSFADMVSRQRRITDMLMA 590
L +++A + L + P +D G+ A + ++TD +
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 591 DP-----AVKTVGARLGSGRQGSSASLNIELKKRDE--GRRETTAQVVARLSAKADRYPD 643
+ +V TV SG+ ++ + LK +E G + V+ R + +
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIR- 658

Query: 644 LDLRLRAIQDLPSDGGGGTSQGAQYRVSLQGNDLAQLQEWLPKVQAALKKNP-RLRDVGT 702
D + G + + G L + ++ ++P L V
Sbjct: 659 -DGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRP 717

Query: 703 DVDTSGLRQNIVIDRAKAARLGISVGAIDGALYGAFGQRSISTIYSDLNQYSVVVNALPS 762
+ + + +D+ KA LG+S+ I+ + A G ++ + V A
Sbjct: 718 NGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAK 777

Query: 763 QTATPKALDEVFVPNRAGQMAPITAVATQVPGLAPPQITHDNQYSTMDLSYNLAPGVSTG 822
P+ +D+++V + G+M P +A T P++ N +M++ APG S+G
Sbjct: 778 FRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSG 837

Query: 823 EADLIIKTTVEGLRMPGDIRISDGGGF-NVQLNPNSMGVLLLAAVLTVYIVLGMLYESLI 881
+A +++ ++P I G +L+ N L+ + + V++ L LYES
Sbjct: 838 DAMALMENLAS--KLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWS 895

Query: 882 HPVTILSTLPAAGVGALLALFITNTELSVISMIALVLLIGIVKKNAIMMIDFALVAQREH 941
PV+++ +P VG LLA + N + V M+ L+ IG+ KNAI++++FA +
Sbjct: 896 IPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKE 955

Query: 942 GKDARAAAREASIVRFRPIMMTTMVAILAAVPLAVGLGEGSELRRPLGIAMIGGLVFSQG 1001
GK A A +R RPI+MT++ IL +PLA+ G GS + +GI ++GG+V +
Sbjct: 956 GKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATL 1015

Query: 1002 LTLLSTPALYVIFS 1015
L + P +V+
Sbjct: 1016 LAIFFVPVFFVVIR 1029



Score = 112 bits (281), Expect = 4e-27
Identities = 80/506 (15%), Positives = 170/506 (33%), Gaps = 31/506 (6%)

Query: 2 NISAPFIKRPIGTSLLAIGLFVIGLMCYLRLGVAALPNIQIPVIFVHA-TQSGADASTMA 60
N + L+ + ++ +LRL + LP V +GA
Sbjct: 528 NSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQ 587

Query: 61 STVTAPLERHLGQLPGID--------RMRSSSSESSSMVFMIFQSNRDIDSAAQDVQTAI 112
+ + +L S ++++ M F+ + + + + I
Sbjct: 588 KVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVI 647

Query: 113 NSAQSDL---PSGLGTPMYQKANPNDDPVIAIALTSDTQSA-----DELYNVADSLLAQR 164
+ A+ +L G P + A + D L + LL
Sbjct: 648 HRAKMELGKIRDGFVIPF--NMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMA 705

Query: 165 LRQITGISSVDIAG-ASTPAVRVDVDLRALNALGLTPDDLRNAVRAANVTSPTGFLSDGN 223
+ + SV G T +++VD ALG++ D+ + A + D
Sbjct: 706 AQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRG 765

Query: 224 TTMAIIA---NDSVAKAADFAQLAISTQSNGRIVRLGDVATVYDGQQDAYQAAWFNGKPA 280
+ D +L + + +NG +V T + + + +NG P+
Sbjct: 766 RVKKLYVQADAKFRMLPEDVDKLYVRS-ANGEMVPFSAFTTSHWVYG-SPRLERYNGLPS 823

Query: 281 VVMYAFTRAGANIVETVDQVKAQIPELRAYLQPGTTLTPYFDRTPTIRASLHEVQATLMI 340
+ + G + A + L + L G + + R S ++ A + I
Sbjct: 824 MEIQGEAAPGTS----SGDAMALMENLASKLPAGIGYD-WTGMSYQERLSGNQAPALVAI 878

Query: 341 SLAMVVLTMALFLRRLAPTLIAAVTVPLSLAGSALVMYVLGFTLNNLSLLALVIAIGFVV 400
S +V L +A + + + VPL + G L + + ++ L+ IG
Sbjct: 879 SFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSA 938

Query: 401 DDAIVVIENIM-RHLDEGMPRLQAALTGAREIGFTIVSITASLVAVFIPMLFASGMVGAF 459
+AI+++E EG ++A L R I+ + + + +P+ ++G
Sbjct: 939 KNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGA 998

Query: 460 FREFTVTLVAAIVVSMLVSLTLTPAL 485
+ ++ +V + L+++ P
Sbjct: 999 QNAVGIGVMGGMVSATLLAIFFVPVF 1024


96  XC_RS21125  XC_RS21165  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias   Product
XC_RS21125  -3      14       1.520617  glycine--tRNA ligase beta subunit
XC_RS21130  -2      16       2.057072  glycine--tRNA ligase alpha subunit
XC_RS21135  -2      17       2.518204  type II secretion system protein E
XC_RS21140  -2      14       2.604434  glutamine amidotransferase
XC_RS21145  -2      16       2.397576  membrane protein
XC_RS21150  -2      13       2.663981  preprotein translocase subunit TatC
XC_RS21155  0       11       3.033500  Sec-independent protein translocase protein
XC_RS21160  -1      11       3.279466  Sec-independent protein translocase protein
XC_RS21165  -1      12       2.444606  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS21125  BCTERIALGSPD  31     0.023    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 30.7 bits (69), Expect = 0.023
Identities = 25/100 (25%), Positives = 40/100 (40%), Gaps = 6/100 (6%)

Query: 306 IANIVSKDVAEVAKGYERVIRPRFADAKFFFDEDLKQGLEAMGAGLASVTYQAKLGTVAD 365
IA I D + +G +VI ++A A DL + L + + + S AK D
Sbjct: 253 IAMIKQLDRQQATQGNTKVIYLKYAKAS-----DLVEVLTGISSTMQSEKQAAKPVAALD 307

Query: 366 KVARVAALAEAIAPQVGADPVQARRAAEL-AKNDLQSRMV 404
K + A + A V A P + A+ D++ V
Sbjct: 308 KNIIIKAHGQTNALIVTAAPDVMNDLERVIAQLDIRRPQV 347


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS21145  PF04335  31     0.003    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 31.3 bits (71), Expect = 0.003
Identities = 14/70 (20%), Positives = 28/70 (40%), Gaps = 11/70 (15%)

Query: 168 LLWLLLMVATFAAL--TLALFVM-------PPQVMFDRSTGGHALRESLRASLHNLP--A 216
L W++ VA A +A+ + P + DR+TG ++ L A
Sbjct: 34 LAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNTGEASIAAKLHGDATITYDEA 93

Query: 217 MLVFFVLAFI 226
+ +F+ ++
Sbjct: 94 VRKYFLATYV 103


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
XC_RS21155  TATBPROTEIN  86     1e-23    Bacterial sec-independent translocation TatB protein...
		>TATBPROTEIN#Bacterial sec-independent translocation TatB protein signature.

Length = 171

Score = 86.2 bits (213), Expect = 1e-23
Identities = 37/89 (41%), Positives = 56/89 (62%), Gaps = 1/89 (1%)

Query: 1 MFDIGFSELALIAVVALVVLGPERLPKAARFAGLWVRRARMQWDSVKQELERELEAEELK 60
MFDIGFSEL L+ ++ LVVLGP+RLP A + W+R R +V+ EL +EL+ +E +
Sbjct: 1 MFDIGFSELLLVFIIGLVVLGPQRLPVAVKTVAGWIRALRSLATTVQNELTQELKLQEFQ 60

Query: 61 RSLQDVQ-ASLREAEGQLRNTQQQVEDGA 88
SL+ V+ ASL +L+ + ++ A
Sbjct: 61 DSLKKVEKASLTNLTPELKASMDELRQAA 89


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
XC_RS21160  TATBPROTEIN  30     7e-04    Bacterial sec-independent translocation TatB protein...
		>TATBPROTEIN#Bacterial sec-independent translocation TatB protein signature.

Length = 171

Score = 29.6 bits (66), Expect = 7e-04
Identities = 10/41 (24%), Positives = 18/41 (43%)

Query: 1 MGSFSIWHWLIVLVIVLLVFGTKRLTSGAKDLGSAVKEFKK 41
M L+V +I L+V G +RL K + ++ +
Sbjct: 1 MFDIGFSELLLVFIIGLVVLGPQRLPVAVKTVAGWIRALRS 41


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS21165  PF05616  29     0.021    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 29.3 bits (65), Expect = 0.021
Identities = 17/54 (31%), Positives = 24/54 (44%), Gaps = 2/54 (3%)

Query: 236 GDTPPPAPRPVSQ-SPAQAPATPPPPSNEASTMPIQPSATPPQQQGFQPVSDGE 288
G P +P+ + SPA+ PA P P+ T P P P P +DG+
Sbjct: 318 GSAEAPNAQPLPEVSPAENPANNPAPNENPGTRP-NPEPDPDLNPDANPDTDGQ 370


97  XC_RS21560  XC_RS21610  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias   Product
XC_RS21560  -1      13       3.510490  membrane protein
XC_RS21565  -2      12       3.523801  RND transporter
XC_RS21570  -1      14       2.365882  multidrug transporter
XC_RS21575  -1      16       2.067010  DeoR family transcriptional regulator
XC_RS21580  0       16       3.647876  hypothetical protein
XC_RS21585  1       15       3.972615  LysR family transcriptional regulator
XC_RS21590  0       16       3.536645  short-chain dehydrogenase
XC_RS21595  0       16       3.077661  cardiolipin synthetase
XC_RS21600  1       15       4.019073  membrane protein
XC_RS21605  0       14       2.087482  hypothetical protein
XC_RS21610  0       13       1.722523  plasmid stabilization protein ParE
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
XC_RS21560  RTXTOXIND  50     5e-09    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 49.8 bits (119), Expect = 5e-09
Identities = 19/142 (13%), Positives = 50/142 (35%), Gaps = 2/142 (1%)

Query: 86 ALEQARAALAERQATLTQLRREIARDRSLQDLVAAEDAEVRRSNVQKAQAAVATAQSAVD 145
+A L ++ L Q+ EI + LV +++ + +
Sbjct: 260 KYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELA 319

Query: 146 LAQLNLDRTEVRSPADGHISDRTVR-VGDYVSAGRPVVAVL-DTGSFRVDGYFEETRLQG 203
+ + +R+P + V G V+ ++ ++ + + V + +
Sbjct: 320 KNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGF 379

Query: 204 VHPGQRVDVHLMGEPVTLHGHV 225
++ GQ + + P T +G++
Sbjct: 380 INVGQNAIIKVEAFPYTRYGYL 401



Score = 42.5 bits (100), Expect = 1e-06
Identities = 21/168 (12%), Positives = 58/168 (34%), Gaps = 19/168 (11%)

Query: 10 PALLTLAMVVVAAVVLQHLWRYYMDAPWTRDAHVGADVV------QVAPDVSGLVEQVAV 63
+A ++ +V+ + A + ++ P + +V+++ V
Sbjct: 55 RRPRLVAYFIMGFLVIAFILSVL--GQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIV 112

Query: 64 ADNQAVTRGQLLFVVDRARYAIALEQARAALAERQATLTQLRREIARD----RSLQDLVA 119
+ ++V +G +L + + +++L QA L Q R +I L +L
Sbjct: 113 KEGESVRKGDVLLKLTALGAEADTLKTQSSLL--QARLEQTRYQILSRSIELNKLPELKL 170

Query: 120 AEDAEVRRSNVQKAQAAVATAQSAVD-----LAQLNLDRTEVRSPADG 162
++ + + ++ + + Q L+ + R+
Sbjct: 171 PDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLT 218


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
XC_RS21565  RTXTOXIND  30     0.021    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.2 bits (68), Expect = 0.021
Identities = 14/117 (11%), Positives = 36/117 (30%), Gaps = 5/117 (4%)

Query: 357 TLPSSGAHARVRATEAGADAAVAQFDHTVLQA-LREVQTTLSRYAQDLDRLRLLEQA-QQ 414
LP V E ++ + + Q + + L + + + +
Sbjct: 169 KLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYEN 228

Query: 415 QAELASSQN---RRLYQGGRTPYLSSLDAERTLATADMTLADAQAQVSKDQIQLFLA 468
+ + S+ L + L+ E A L ++Q+ + + ++ A
Sbjct: 229 LSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSA 285


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS21575  TCRTETA  36     3e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 35.6 bits (82), Expect = 3e-04
Identities = 22/86 (25%), Positives = 42/86 (48%), Gaps = 14/86 (16%)

Query: 69 AIFA-MTFLMRPIGAWYFGRFADRYGRRLALTISVSMMALCSFVIAITPTVSTIGIAAPI 127
A++A M F P+ G +DR+GRR L +S++ A+ ++A P +
Sbjct: 50 ALYALMQFACAPVL----GALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLW-------- 97

Query: 128 ILLLARLLQGFATGGEYGTSATYMSE 153
+L + R++ G TG + Y+++
Sbjct: 98 VLYIGRIVAGI-TGATGAVAGAYIAD 122


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS21590  DHBDHDRGNASE  77     1e-18    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 77.0 bits (189), Expect = 1e-18
Identities = 43/187 (22%), Positives = 83/187 (44%), Gaps = 3/187 (1%)

Query: 6 LITGASSGFGRGLTQTLLARGDRVAA---TVRRADALADLQAAHGNALTVLQLDVRDTAA 62
ITGA+ G G + +TL ++G +AA + + + A DVRD+AA
Sbjct: 12 FITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDSAA 71

Query: 63 VQAVVAQAFAALGRIDVVISNAGYGTLGAAEAATDAQVRALIDTNLIGSISVIQAALPHL 122
+ + A+ +G ID++++ AG G + +D + A N G + ++ ++
Sbjct: 72 IDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSKYM 131

Query: 123 RRQGGGHVVQVSSEGGQIAYPGFSLYHASKWGIEGYVEAVRQEVAGFGIQFTLAEPGPAR 182
+ G +V V S + + Y +SK + + + E+A + I+ + PG
Sbjct: 132 MDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPGSTE 191

Query: 183 TNFGAAL 189
T+ +L
Sbjct: 192 TDMQWSL 198


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
XC_RS21600  VACCYTOTOXIN  30     0.004    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 30.4 bits (68), Expect = 0.004
Identities = 30/138 (21%), Positives = 49/138 (35%), Gaps = 18/138 (13%)

Query: 32 RQPWKPVAATLRTALLLVATAALGYAAYG----LPGVAPGVALGAVVGAGLGVVSLRYTH 87
R+ +P+ + L+ T +AA+ +P + G+A GA VG G++
Sbjct: 8 RKINRPLVSLALVGALVSITPQQSHAAFFTTVIIPAIVGGIATGAAVGTVSGLLGWGLKQ 67

Query: 88 AEWVDGRGWYTPNPWIGGALMLV---------LLGRLAWRWTDGAFSGGAA-----VAGS 133
AE + W A L L DG + G A V
Sbjct: 68 AEEANKTPDKPDKVWRIQAGKGFNEFPNKEYDLYKSLLSSKIDGGWDWGNAARHYWVKDG 127

Query: 134 QASPLTLGIAAALVLYSL 151
Q + L + + A+ Y+L
Sbjct: 128 QWNKLEVDMQNAVGTYNL 145


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
XC_RS21610  PF05616  35     3e-04    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 35.5 bits (81), Expect = 3e-04
Identities = 33/117 (28%), Positives = 44/117 (37%), Gaps = 6/117 (5%)

Query: 132 PPQGSASGGRTKVDFVGDTSQPDQPVPSPTPVPPSPTPTPVQPPPAASPVQSTLVQQAKN 191
P Q A+ GR D G+T+ Q +P P P S QP P SP ++ A N
Sbjct: 288 PVQVVATFGR---DSQGNTTVDVQVIPRPDLTPGSAEAPNAQPLPEVSPAENPANNPAPN 344

Query: 192 PVP---PQGDTAPGSLAERRRQTRRQTRPTPPQPPAPPAASTQRRPETWTGRPPGML 245
P P + P + T Q P P P + + R E G G+L
Sbjct: 345 ENPGTRPNPEPDPDLNPDANPDTDGQPGTRPDSPAVPDRPNGRHRKERKEGEDGGLL 401



 
Contact Sachin Pundhir for Bugs/Comments.