PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
A) Input parameters
Genome: acinet.gb
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
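
The E-value cutoff above (0.05, RPSBlast) presumably bounds which profile matches are reported as virulence hits; the hits shown in this report all fall at or below it. A minimal sketch of such a filter in Python, assuming hits are already parsed into (locus tag, profile, E-value) records. The `Hit` type, the `filter_hits` function, the inclusive comparison, and the `EXAMPLE_0001` record are illustrative assumptions, not part of PredictBias:

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """One profile match for an ORF (hypothetical record layout)."""
    locus_tag: str
    profile: str
    evalue: float


def filter_hits(hits: List[Hit], max_evalue: float = 0.05) -> List[Hit]:
    """Keep hits whose E-value is at or below the cutoff.

    Whether the real tool uses <= or < is an assumption here.
    """
    return [h for h in hits if h.evalue <= max_evalue]


hits = [
    Hit("A4U85_RS00965", "TCRTETA", 3e-21),    # value from this report
    Hit("A4U85_RS00970", "cloacin", 0.002),    # value from this report
    Hit("A4U85_RS03665", "TCRTETOQM", 0.018),  # value from this report
    Hit("EXAMPLE_0001", "HYPOTHETICAL", 0.2),  # made-up record above the cutoff
]

kept = filter_hits(hits)
print([h.locus_tag for h in kept])
```

With the three E-values taken from this report, only the made-up record is dropped.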
 
C) Potential GIs and PAIs in NZ_CP015121
S.No  Start          End            Bias  Virulence  Insertion elements  Prediction
1     A4U85_RS00965  A4U85_RS01180  Y     Y          Y                   Pathogenicity Island (biased-composition)

LocusTag       DNBias  CDNBias    %GCBias  Product
A4U85_RS00965       1       18  -3.861805  MFS transporter
A4U85_RS00970       5       22  -6.784156  single-stranded DNA-binding protein
A4U85_RS00975       6       24  -7.543322  site-specific integrase
A4U85_RS00980       6       22  -7.386325  site-specific integrase
A4U85_RS00985       5       21  -7.157636  hypothetical protein
A4U85_RS00990       4       21  -6.632951  hypothetical protein
A4U85_RS00995       4       19  -5.648236  AAA family ATPase
A4U85_RS01000      -1       13  -1.820058  type II toxin-antitoxin system Phd/YefM family
A4U85_RS01005      -2       13  -0.003568  AAA family ATPase
A4U85_RS01010      -2       14   3.849183  AzlD domain-containing protein
A4U85_RS01015      -1       14   3.995861  AzlC family ABC transporter permease
A4U85_RS01020      -1       14   3.411907  helix-turn-helix transcriptional regulator
A4U85_RS01025      -1       14   3.002302  amino acid permease
A4U85_RS01030      -1       14   3.199914  PLP-dependent aminotransferase family protein
A4U85_RS01035       0       17   2.038625  4-aminobutyrate--2-oxoglutarate transaminase
A4U85_RS01040      -1       14   0.831710  NAD-dependent succinate-semialdehyde
A4U85_RS01045       0       13  -0.278894  LysR family transcriptional regulator
A4U85_RS01050      -1       15   0.838409  hydrolase
A4U85_RS01055      -2       15   1.380378  pirin family protein
A4U85_RS01060      -1       15   0.751542  FAD-dependent oxidoreductase
A4U85_RS01065       1       19   0.758851  class I SAM-dependent methyltransferase
A4U85_RS01070       3       20   1.485725  GFA family protein
A4U85_RS01080       3       22   2.049245  *MFS transporter
A4U85_RS01085       3       22   1.401091  LysR family transcriptional regulator
A4U85_RS01090       4       21   0.365367  DMT family transporter
A4U85_RS01095       2       17  -1.349527  hypothetical protein
A4U85_RS01100       4       16  -4.297868  hypothetical protein
A4U85_RS01105       5       15  -4.781049  aldo/keto reductase
A4U85_RS01110       6       18  -6.785609  AraC family transcriptional regulator
A4U85_RS01120       8       18  -7.846079  TetR/AcrR family transcriptional regulator
A4U85_RS01125      10       22  -8.713020  RloB domain-containing protein
A4U85_RS01130       9       20  -7.549342  ATP-binding protein
A4U85_RS01135       7       21  -5.318462  hypothetical protein
A4U85_RS01140      -1       15  -1.764447  helix-turn-helix transcriptional regulator
A4U85_RS01150      -2       13  -0.460676  NAD(P)H-dependent oxidoreductase
A4U85_RS01155      -3       14  -0.115026  hypothetical protein
A4U85_RS01160      -2       17   1.151524  hypothetical protein
A4U85_RS01165      -1       19   1.398550  DHCW motif cupin fold protein
A4U85_RS01175       0       18   2.271205  *potassium transporter Kup
A4U85_RS01180       2       15   1.435201  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
A4U85_RS00965  TCRTETA  88     3e-21    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 88.3 bits (219), Expect = 3e-21
Identities = 74/380 (19%), Positives = 145/380 (38%), Gaps = 10/380 (2%)

Query: 8 RSTFALSSIFALRMLGLFMIIPVFSVVGQSYQYAT--PALIGLAVGIYGLSQAILQIPFS 65
R + S AL +G+ +I+PV + + ++ A G+ + +Y L Q
Sbjct: 5 RPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLG 64

Query: 66 LLADRFSRKPLVVFGLLLFAIGGAIAGLSDTIYGVIIGRAIAG-AGAVSAVVMALLADVT 124
L+DRF R+P+++ L A+ AI + ++ + IGR +AG GA AV A +AD+T
Sbjct: 65 ALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADIT 124

Query: 125 REEQRTKAMAAMGMSIGLSFVVAFSLGPWLTSLVGISGLFFVTTIMGLIAIAMLLLVPKV 184
++R + M G V LG + + F + GL + L+P+
Sbjct: 125 DGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPES 184

Query: 185 TRHHRNYQQGYMAQLKQVIQMGDLNRLHVSVFALHLLLTAMFIYVPSQLIEFAHIPLA-S 243
+ R + + + ++ A+ ++ + + + F
Sbjct: 185 HKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWD 244

Query: 244 HGLVYLPLLVISLFFAFPSIIIAEKYRKMRGIFLTAITGILA---GLLLLIFGYQSKYVL 300
+ + L + + +I G + G++A G +LL F +
Sbjct: 245 ATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAF 304

Query: 301 LAGLGIFFIAFNVMEALLPSWLSKSAPIQSKATAMGVNASSQFLGAFFGGTLGGQLLMLH 360
+ + + + L + LS+ + + G A+ L + G L +
Sbjct: 305 P--IMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAAS 362

Query: 361 -NTAIGWSVLAGIAIIWLLI 379
T GW+ +AG A+ L +
Sbjct: 363 ITTWNGWAWIAGAALYLLCL 382


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
A4U85_RS00970  cloacin  31     0.002    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 31.2 bits (70), Expect = 0.002
Identities = 20/54 (37%), Positives = 23/54 (42%)

Query: 130 NAPQQGGGYQNNNQGGGYGQNNGGYGGQGGFGNGGNSPQGGGFAPKAPQQPASA 183
N P GG + GGG G NGG G G G+G AP A PA +
Sbjct: 43 NNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALS 96



Score = 30.8 bits (69), Expect = 0.003
Identities = 18/50 (36%), Positives = 20/50 (40%)

Query: 135 GGGYQNNNQGGGYGQNNGGYGGQGGFGNGGNSPQGGGFAPKAPQQPASAP 184
G +NN GGG G GG G GGN GGG +AP
Sbjct: 38 GWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGNLSAVAAP 87


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
A4U85_RS01020  HTHTETR  30     0.005    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 29.6 bits (66), Expect = 0.005
Identities = 9/29 (31%), Positives = 16/29 (55%)

Query: 9 AKGLTRERQRAGLSLAEVARRAGVAKSTL 37
A L ++ + SL E+A+ AGV + +
Sbjct: 20 ALRLFSQQGVSSTSLGEIAKAAGVTRGAI 48


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS01050  ISCHRISMTASE  41     1e-06    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 40.8 bits (95), Expect = 1e-06
Identities = 28/161 (17%), Positives = 54/161 (33%), Gaps = 24/161 (14%)

Query: 7 RLDKDNAAVLLVDHQAGLLSLVRDIDP--DKFKNNVLAVANAAKYFNLPTILTTSFET-- 62
D + A +L+ D Q + + N+ + N +P + T +
Sbjct: 25 VPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANIRKLKNQCVQLGIPVVYTAQPGSQN 84

Query: 63 -----------GPNGPLVPELKEIHPDAPFIPRPGQI-------NAWDNEDFVKAVKATG 104
GP P ++I P + +A+ + ++ ++ G
Sbjct: 85 PDDRALLTDFWGPGLNSGPYEEKI--ITELAPEDDDLVLTKWRYSAFKRTNLLEMMRKEG 142

Query: 105 KKQLIIAGVVTEVCVAFPALSALAEGFEVFVITDASGTFNE 145
+ QLII G+ + A A E + F + DA F+
Sbjct: 143 RDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVADFSL 183


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
A4U85_RS01080  TCRTETB  53     7e-10    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 53.3 bits (128), Expect = 7e-10
Identities = 70/362 (19%), Positives = 131/362 (36%), Gaps = 48/362 (13%)

Query: 55 LPAFSQSFQISPASSSLALSLTTAFLAISIVLSSAFSQAIGRRGVIFTSMLCAAILNIVS 114
LP + F PAS++ + +I + S +G + ++ ++ +++
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 115 MFTPNWHSLLI-ARALEGLLLGGVPAVTMAWIAEEIAPEHLGKTMGLYIAGTAFGGMMGR 173
++ SLLI AR ++G PA+ M +A I E+ GK GL + A G +G
Sbjct: 97 FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGP 156

Query: 174 VGMGILVEYFSW---------------------------RTALGLLGAICFICSIAFLKL 206
G++ Y W + + G I I F L
Sbjct: 157 AIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFML 216

Query: 207 LPVSRN--FVQKKGLNLGFHIQMWRAH---------LSNTKLLRLFAIGFLLTSV---FV 252
S + F+ L+ ++ R N + G ++ FV
Sbjct: 217 FTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFV 276

Query: 253 TLFNYATFRLSGAPYSLSQTQISLIFLSYSFGMVSSSLAGSLADRFGKKTMMMSGFALMI 312
++ Y + S ++ +IF ++ + G L DR G ++ G +
Sbjct: 277 SMVPYMMKDVHQ--LSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLS 334

Query: 313 LGSL---MTLLSSLFGIIIGIAFITTGFFITHSLTSSSVGAESKQAKAHAS-SLYLLFYY 368
+ L L ++ + + I I F+ G T ++ S+ V + KQ +A A SL +
Sbjct: 335 VSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSF 394

Query: 369 MG 370
+
Sbjct: 395 LS 396


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
A4U85_RS01120  HTHTETR  47     5e-09    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 47.3 bits (112), Expect = 5e-09
Identities = 14/65 (21%), Positives = 22/65 (33%)

Query: 5 EASFRALRVLHTAKNLFNQYGFHKVGVDRIIADSQIPKATFYNYFHSKERLIQMSLTFQT 64
EA +L A LF+Q G + I + + + Y +F K L
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 65 DALKE 69
+ E
Sbjct: 68 SNIGE 72


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
A4U85_RS01180  GPOSANCHOR  31     0.001    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 31.2 bits (70), Expect = 0.001
Identities = 14/55 (25%), Positives = 21/55 (38%), Gaps = 6/55 (10%)

Query: 90 EMAPPAAPTDAVPPAPNQAAPAPQD----PNTPPPAANPNQSADPMAKDGA-LPA 139
E+A A + P+ A P + P PNQ+ PM + LP+
Sbjct: 454 ELAKLRAGKASDSQTPD-AKPGNKAVPGKGQAPQAGTKPNQNKAPMKETKRQLPS 507


2     A4U85_RS01880  A4U85_RS01905  Y     N          N                   Genomic Island

LocusTag       DNBias  CDNBias    %GCBias  Product
A4U85_RS01880       2       18   3.551559  shikimate dehydrogenase
A4U85_RS01885       3       19   3.787765  oxygen-dependent coproporphyrinogen oxidase
A4U85_RS01890       4       17   3.934554  GTP cyclohydrolase II
A4U85_RS01895       3       20   4.340392  1-deoxy-D-xylulose-5-phosphate synthase
A4U85_RS01900       5       22   3.920749  inositol monophosphatase
A4U85_RS01905       5       22   3.301589  DEAD/DEAH box helicase
3     A4U85_RS02500  A4U85_RS02560  Y     Y          Y                   Pathogenicity Island (biased-composition)

LocusTag       DNBias  CDNBias    %GCBias  Product
A4U85_RS02500       0       11  -4.011673  1-acyl-sn-glycerol-3-phosphate acyltransferase
A4U85_RS02505       0       12  -4.376017  phospholipase D family protein
A4U85_RS02510       3       16  -5.120323  metallophosphoesterase
A4U85_RS02515       2       16  -5.302376  OmpA family protein
A4U85_RS02520       3       16  -4.851083  sensor domain-containing diguanylate cyclase
A4U85_RS02525       3       16  -3.976309  YfiR family protein
A4U85_RS02555       3       15  -3.196456  **dihydrolipoyl dehydrogenase
A4U85_RS02560       2       14  -2.629978  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
A4U85_RS02515  OMPADOMAIN  100    7e-28    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 99.6 bits (248), Expect = 7e-28
Identities = 42/121 (34%), Positives = 63/121 (52%), Gaps = 11/121 (9%)

Query: 48 TLGLPERLLFDFNDATLKQSHEAELTRLANQLNKYDLN--KLKIVGHTDDVGNPEYNQKL 105
L +LF+FN ATLK +A L +L +QL+ D + ++G+TD +G+ YNQ L
Sbjct: 214 HFTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGL 273

Query: 106 SEERAQSVANLFLTHGFKKENIYVIGRGSTQPYVPNTTNENR---------AINRRVAIV 156
SE RAQSV + ++ G + I G G + P NT + + A +RRV I
Sbjct: 274 SERRAQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIE 333

Query: 157 I 157
+
Sbjct: 334 V 334


4     A4U85_RS03620  A4U85_RS03780  Y     Y          N                   Pathogenicity Island (biased-composition)

LocusTag       DNBias  CDNBias    %GCBias  Product
A4U85_RS03620       2       15   0.264508  iron-containing alcohol dehydrogenase
A4U85_RS03625       3       15  -0.145933  hypothetical protein
A4U85_RS03630       3       17  -0.165788  lipoyl(octanoyl) transferase LipB
A4U85_RS03635       4       18  -0.020395  DUF493 domain-containing protein
A4U85_RS03640       4       25   0.511091  RNA polymerase sigma factor RpoD
A4U85_RS03645       1       21   0.376418  hypothetical protein
A4U85_RS03650       0       22   0.764002  YecA family protein
A4U85_RS03655       1       30   1.777787  rhodanese-related sulfurtransferase
A4U85_RS03660       3       34   1.939987  DUF1289 domain-containing protein
A4U85_RS03665       3       36   1.708773  citrate (Si)-synthase
A4U85_RS03670       3       33   1.400919  succinate dehydrogenase, cytochrome b556
A4U85_RS03675       4       33   1.846799  succinate dehydrogenase, hydrophobic membrane
A4U85_RS03680       4       35   2.013392  succinate dehydrogenase flavoprotein subunit
A4U85_RS03685       3       32   1.297835  succinate dehydrogenase iron-sulfur subunit
A4U85_RS03690       3       34   1.249864  hypothetical protein
A4U85_RS19135       4       38   1.607041  2-oxoglutarate dehydrogenase E1 component
A4U85_RS03695       3       36   1.412411  2-oxoglutarate dehydrogenase complex
A4U85_RS03700       2       35   0.553404  dihydrolipoyl dehydrogenase
A4U85_RS03705       1       26   0.523303  ADP-forming succinate--CoA ligase subunit beta
A4U85_RS03715      12       26   5.993215  succinate--CoA ligase subunit alpha
A4U85_RS03720      11       24   5.229055  tryptophan--tRNA ligase
A4U85_RS03725      10       23   4.697840  DUF808 domain-containing protein
A4U85_RS03730      10       23   4.653830  neutral zinc metallopeptidase
A4U85_RS03735      10       23   4.587541  cation:proton antiporter
A4U85_RS03740      11       25   4.955740  Ig-like domain-containing protein
A4U85_RS03745       2       16  -5.358309  hypothetical protein
A4U85_RS03750       0       15  -4.154378  CapA family protein
A4U85_RS03755       0       23  -0.620841  hypothetical protein
A4U85_RS03760       0       26   0.523326  MGMT family protein
A4U85_RS03765       1       27   0.229555  universal stress protein
A4U85_RS03770       1       26   0.455951  CatB-related O-acetyltransferase
A4U85_RS03775       2       28   0.641534  transcription elongation factor GreA
A4U85_RS03780       2       26   0.717293  carbamoyl-phosphate synthase large subunit
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
A4U85_RS03665  TCRTETOQM  30     0.018    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 30.2 bits (68), Expect = 0.018
Identities = 7/26 (26%), Positives = 11/26 (42%)

Query: 179 YKYTVGQPFIYPRNDLNYAENFLHMM 204
Y T G+P PR + + +M
Sbjct: 610 YHVTTGEPVCQPRRPNSRIDKVRYMF 635


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
A4U85_RS03740  INTIMIN  53     4e-08    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 52.8 bits (126), Expect = 4e-08
Identities = 70/378 (18%), Positives = 110/378 (29%), Gaps = 21/378 (5%)

Query: 641 VEAIATDPAGNPSLPGTATVDAVGPNTDGVNFTVDSVTADNVINASEASGNVTVTGVLKN 700
V A A D GN S T+ + V TAD ++ + +T T +K
Sbjct: 527 VTARAYDRNGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKK 586

Query: 701 VPADAANTVVTVVINGQTYTATVDSTAGTWTVSVPGSDLTADADKTIDAKVTFTDAAGNS 760
AN V+ I + T +A + + G V A +
Sbjct: 587 NGVAQANVPVSFNIV----SGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMT 642

Query: 761 SSVNDTQTYTIDTTAPDAPVINPVNATDPITGTAEPGSTVTVTYPDGSTTTVVAGPDGTW 820
S++N +D T I D T A +T T V+ + T+
Sbjct: 643 SALNANAVIFVDQTKASITEIKA----DKTTAVANGQDAITYTVKVMKGDKPVSNQEVTF 698

Query: 821 TVPNPGLNDGDKVTAIATDPAGNPSLPGTAIVDAVGPNTDGVNFTVDSVTADNVINASEA 880
T G A T V V V + + +
Sbjct: 699 TT-TLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTID 757

Query: 881 SGNVTV--TGVLKNVPADAANTVVTVVINGQTYTATVDSTAGTWTVSVPGSGLTADADKT 938
GN+ + TGV +P V + A+ + TW + P +
Sbjct: 758 DGNIEIVGTGVKGKLPT------VWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQ 811

Query: 939 IDAKVTFTDAAGNSSSVNDTQTYTIDTTAPDAPVINPVNGTDPITGTAEPGSTVTVTYPD 998
+ K T SS N T TYTI T P++ ++ ++ A
Sbjct: 812 VTLKEKGTTTISVISSDNQTATYTIAT--PNSLIVPNMS-KRVTYNDAVNTCKNFGGKLP 868

Query: 999 GSTTTVVAGPDGTWTVPN 1016
S+ + W N
Sbjct: 869 -SSQNELENVFKAWGAAN 885



Score = 52.4 bits (125), Expect = 6e-08
Identities = 68/340 (20%), Positives = 107/340 (31%), Gaps = 32/340 (9%)

Query: 361 VTAVATDPAGNTSGPGTAIVDAVAPTVALNDVLTNDSTPALTGTVNDPTATVVVK--VDG 418
VTA A D GN+S + ++ ++ V D T T D T + V
Sbjct: 527 VTARAYDRNGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKK 586

Query: 419 IDYPAVNNGDGTWTLADNTLPVL----TDGPHTITVTATDAAGNAGTDTGVVTVDTAAPN 474
N ++ + T+G TVT + T+A N
Sbjct: 587 NGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALN 646

Query: 475 TAGVTFTIDSVTADNVINASE----AAGNVTITGVLKNIPADA--TNTAVTVVINGVTYN 528
V F + + I A + A G IT +K + D +N VT +
Sbjct: 647 ANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLS 706

Query: 529 ATVDKTAGTWTVSVPGSGLTADADKTIDAKVTFTDAAGNSSSVNDTQTYTLDTTAPNAPV 588
+ +KT V + T + A+V+ + V T T+D
Sbjct: 707 NSTEKTDTNGYAKVTLTSTTP-GKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIE--- 762

Query: 589 IDPVNGTDPITGTAEPGSTVTVTYPNGDTATVVAGPDG--SWSVPNPGLNDGDEVEAIAT 646
I GT G TV G +G +G +W NP + D T
Sbjct: 763 ---------IVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVT 813

Query: 647 DPAGNPSLPGTATVDAVGPNTDGVNFTVDSVTADNVINAS 686
GT T+ + + +T+ + + V N S
Sbjct: 814 -----LKEKGTTTISVISSDNQTATYTIATPNSLIVPNMS 848



Score = 49.7 bits (118), Expect = 4e-07
Identities = 69/379 (18%), Positives = 111/379 (29%), Gaps = 21/379 (5%)

Query: 832 KVTAIATDPAGNPSLPGTAIVDAVGPNTDGVNFTVDSVTADNVINASEASGNVTVTGVLK 891
KVTA A D GN S + + V TAD ++ + +T T +K
Sbjct: 526 KVTARAYDRNGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVK 585

Query: 892 NVPADAANTVVTVVINGQTYTATVDSTAGTWTVSVPGSGLTADADKTIDAKVTFTDAAGN 951
AN V+ I + T +A + + G V A
Sbjct: 586 KNGVAQANVPVSFNIV----SGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEM 641

Query: 952 SSSVNDTQTYTIDTTAPDAPVINPVNGTDPITGTAEPGSTVTVTYPDGSTTTVVAGPDGT 1011
+S++N +D T I D T A +T T V+ + T
Sbjct: 642 TSALNANAVIFVDQTKASITEIKA----DKTTAVANGQDAITYTVKVMKGDKPVSNQEVT 697

Query: 1012 WTVPNPGLNDGDKVTAIATDPAGNPSLPGTAIVDAVGPNTDGVNFTVDSVTADNVINASE 1071
+T G A T V V V + + +
Sbjct: 698 FTT-TLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTI 756

Query: 1072 ASGNVTV--TGVLKNVPADAANTVVTVVINGQTYTATVDSTAGTWTVSVSGSDLTADADK 1129
GN+ + TGV +P V + A+ + TW + +
Sbjct: 757 DDGNIEIVGTGVKGKLPT------VWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSG 810

Query: 1130 TIDAKVTFTDAAGNSSSVNDTQTYTVDTVAPNAPVLDPINATDPVSGQAEPGSTVTVTYP 1189
+ K T SS N T TYT+ T PN+ ++ ++ A
Sbjct: 811 QVTLKEKGTTTISVISSDNQTATYTIAT--PNSLIVPNMS-KRVTYNDAVNTCKNFGGKL 867

Query: 1190 DGTTATVVAGTDGSWSVPN 1208
++ + +W N
Sbjct: 868 P-SSQNELENVFKAWGAAN 885



Score = 48.5 bits (115), Expect = 9e-07
Identities = 72/372 (19%), Positives = 121/372 (32%), Gaps = 48/372 (12%)

Query: 1905 TVTATATDPAGNTSLPGTGTVSADITAPVVALDDVLTNDSTPALTGTVNDPTA-TVVVNV 1963
VTA A D GN+S T++ ++ V +T+ + + + A T V
Sbjct: 526 KVTARAYDRNGNSSNNVLLTITV-LSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATV 584

Query: 1964 DGVDYPAVNNG------DGTWTLADNTLPTLADGPHTITVTATDAAGNVGNDTAVVTIDT 2017
N GT L+ N+ T G T+T+ + V +
Sbjct: 585 KKNGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSA 644

Query: 2018 VAPNAPVLDPINAT-------DPVSGQAEPGSTVTVTYPDGTTATVVAGPDGSW--SVPN 2068
+ NA + D + A +T T V+ + ++ ++
Sbjct: 645 LNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGK 704

Query: 2069 PGNLVDGDTVTATATDPAGNTSLPGTGTVSA-------DITAPVVALDDVLTNDSIPA-- 2119
N + A +T+ PG VSA D+ AP V LT D
Sbjct: 705 LSNSTEKTDTNGYAKVTLTSTT-PGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEI 763

Query: 2120 LTGTVNDPTATVVVNVDGVDYPAV-NNGDGTWTLADNTLPTL----------ADGPHTIT 2168
+ V TV + V+ A NG TW A+ + ++ G TI+
Sbjct: 764 VGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTTIS 823

Query: 2169 VTATDAAGNVGNDTAVVTIDTVAPNAPVLDPINATDPVSGQAEPGSTVTVTYPDGTTATV 2228
V ++D N TA TI T PN+ ++ ++ A ++
Sbjct: 824 VISSD------NQTATYTIAT--PNSLIVPNMS-KRVTYNDAVNTCKNFGGKLP-SSQNE 873

Query: 2229 VAGPDGSWSVPN 2240
+ +W N
Sbjct: 874 LENVFKAWGAAN 885



Score = 48.1 bits (114), Expect = 1e-06
Identities = 76/373 (20%), Positives = 121/373 (32%), Gaps = 50/373 (13%)

Query: 2249 TVTATATDPAGNTSLPGTGTVSADITAPVVALDDVLTNDSTPALTGTVNDPTATIVV--N 2306
VTA A D GN+S T++ + + +D V D T T D T I
Sbjct: 526 KVTARAYDRNGNSSNNVLLTIT--VLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTAT 583

Query: 2307 VDGVDYPAVNNG------DGTRTLADNTLPTLADGPHTITVTATDAAGNVGNDTAVVTID 2360
V N GT L+ N+ T G T+T+ + V +
Sbjct: 584 VKKNGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTS 643

Query: 2361 TVAPNAPVLDPINAT-------DPVSGQAEPGSTVTVTYPDGTTATVVAGPDGSW--SVP 2411
+ NA + D + A +T T V+ + ++ ++
Sbjct: 644 ALNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLG 703

Query: 2412 NPGNLVDGDTVTATATDPAGNTSLPGTGTVSA-------DITAPVVALDDVLTNDSTPA- 2463
N + A +T+ PG VSA D+ AP V LT D
Sbjct: 704 KLSNSTEKTDTNGYAKVTLTSTT-PGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIE 762

Query: 2464 -LTGTVNDPTATVVVNVDGVDYPAV-NNGDGTWTLADNTLPTL----------ADGPHTI 2511
+ V TV + V+ A NG TW A+ + ++ G TI
Sbjct: 763 IVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTTI 822

Query: 2512 TVTATDAAGNVGNDTAVVTIDTVAPNAPVLDPINATDPVSGQAEPGSTVTVTYPDGTTAT 2571
+V ++D N TA TI T PN+ ++ ++ A ++
Sbjct: 823 SVISSD------NQTATYTIAT--PNSLIVPNMS-KRVTYNDAVNTCKNFGGKLP-SSQN 872

Query: 2572 VVAGPDGSWSVPN 2584
+ +W N
Sbjct: 873 ELENVFKAWGAAN 885



Score = 47.8 bits (113), Expect = 1e-06
Identities = 72/372 (19%), Positives = 121/372 (32%), Gaps = 48/372 (12%)

Query: 1561 TVTATATDPAGNTSLPGTGTVSADITAPVVALDDVLTNDSTPALTGTVNDPTA-TVVVNV 1619
VTA A D GN+S T++ ++ V +T+ + + + A T V
Sbjct: 526 KVTARAYDRNGNSSNNVLLTITV-LSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATV 584

Query: 1620 DGVDYPAVNNG------DGTWTLADNTLPTLADGPHTITVTATDAAGNVGNDTAVVTIDT 1673
N GT L+ N+ T G T+T+ + V +
Sbjct: 585 KKNGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSA 644

Query: 1674 VAPNAPVLDPINAT-------DPVSGQAEPGSTVTVTYPDGTTATVVAGPDGSW--SVPN 1724
+ NA + D + A +T T V+ + ++ ++
Sbjct: 645 LNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGK 704

Query: 1725 PGNLVDGDTVTATATDPAGNTSLPGTGTVSA-------DITAPVVALDDVLTNDSTPA-- 1775
N + A +T+ PG VSA D+ AP V LT D
Sbjct: 705 LSNSTEKTDTNGYAKVTLTSTT-PGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEI 763

Query: 1776 LTGTVNDPTATVVVNVDGVDYPAV-NNGDGTWTLADNTLPTL----------ADGPHTIT 1824
+ V TV + V+ A NG TW A+ + ++ G TI+
Sbjct: 764 VGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTTIS 823

Query: 1825 VTATDAAGNVGNDTAVVTIDTVAPNAPVLDPINATDPVSGQAEPGSTVTVTYPDGTTATV 1884
V ++D N TA TI T PN+ ++ ++ A ++
Sbjct: 824 VISSD------NQTATYTIAT--PNSLIVPNMS-KRVTYNDAVNTCKNFGGKLP-SSQNE 873

Query: 1885 VAGPDGSWSVPN 1896
+ +W N
Sbjct: 874 LENVFKAWGAAN 885



Score = 47.8 bits (113), Expect = 1e-06
Identities = 72/372 (19%), Positives = 121/372 (32%), Gaps = 48/372 (12%)

Query: 2593 TVTATATDPAGNTSLPGTGTVSADITAPVVALDDVLTNDSTPALTGTVNDPTA-TVVVNV 2651
VTA A D GN+S T++ ++ V +T+ + + + A T V
Sbjct: 526 KVTARAYDRNGNSSNNVLLTITV-LSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATV 584

Query: 2652 DGVDYPAVNNG------DGTWTLADNTLPTLADGPHTITVTATDAAGNVGNDTAVVTIDT 2705
N GT L+ N+ T G T+T+ + V +
Sbjct: 585 KKNGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSA 644

Query: 2706 VAPNAPVLDPINAT-------DPVSGQAEPGSTVTVTYPDGTTATVVAGPDGSW--SVPN 2756
+ NA + D + A +T T V+ + ++ ++
Sbjct: 645 LNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGK 704

Query: 2757 PGNLVDGDTVTATATDPAGNTSLPGTGTVSA-------DITAPVVALDDVLTNDSTPA-- 2807
N + A +T+ PG VSA D+ AP V LT D
Sbjct: 705 LSNSTEKTDTNGYAKVTLTSTT-PGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEI 763

Query: 2808 LTGTVNDPTATVVVNVDGVDYPAV-NNGDGTWTLADNTLPTL----------ADGPHTIT 2856
+ V TV + V+ A NG TW A+ + ++ G TI+
Sbjct: 764 VGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTTIS 823

Query: 2857 VTATDAAGNVGNDTAVVTIDTVAPNAPVLDPINATDPVSGQAEPGSTVTVTYPDGTTATV 2916
V ++D N TA TI T PN+ ++ ++ A ++
Sbjct: 824 VISSD------NQTATYTIAT--PNSLIVPNMS-KRVTYNDAVNTCKNFGGKLP-SSQNE 873

Query: 2917 VAGPDGSWSVPN 2928
+ +W N
Sbjct: 874 LENVFKAWGAAN 885



Score = 47.0 bits (111), Expect = 2e-06
Identities = 77/404 (19%), Positives = 123/404 (30%), Gaps = 66/404 (16%)

Query: 1024 KVTAIATDPAGNPSLPGTAIVDAVGPNTDGVNFTVDSVTADNVINASEASGNVTVTGVLK 1083
KVTA A D GN S + + V TAD ++ + +T T +K
Sbjct: 526 KVTARAYDRNGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVK 585

Query: 1084 NVPADAANTVVTVVINGQTYTATVDSTAGTWTVSVSGSDLTADADKTIDAKVTFTDAAGN 1143
AN V+ I + T +A + + SG V A
Sbjct: 586 KNGVAQANVPVSFNIV----SGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEM 641

Query: 1144 SSSVNDTQTYTVDTVAPNAPVLDPINATDPVSGQAEPGSTVTVTYPDGTTATVVAGTDGS 1203
+S++N VD + + D + A +T T V+ + +
Sbjct: 642 TSALNANAVIFVDQTKASITEIKA----DKTTAVANGQDAITYTVKVMKGDKPVSNQEVT 697

Query: 1204 W--SVPNPGNLVDGDTVTATATDPAGNTSLPGTGTVSA-------DITAPVVALDDVLTN 1254
+ ++ N + A +T+ PG VSA D+ AP V LT
Sbjct: 698 FTTTLGKLSNSTEKTDTNGYAKVTLTSTT-PGKSLVSARVSDVAVDVKAPEVEFFTTLTI 756

Query: 1255 DSTPA--LTGTVNDPTATVVVNVDGVDYPAV-NNGDGTWTLADNTLPTLADGPHTITVTA 1311
D + V TV + V+ A NG TW A+ + +
Sbjct: 757 DDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIAS------------ 804

Query: 1312 TDAAGNVGNDTAVVTIDTVAPNAPVLDPINATDPVSGQAEPGSTVTVTYPDGTTATVVAG 1371
V + VT+ + + +T++V D TAT
Sbjct: 805 ------VDASSGQVTL---------------------KEKGTTTISVISSDNQTATYTIA 837

Query: 1372 PDGSWSVPNPGNLVDGDTVTATATDPAGNTS--LPGTGTVSADI 1413
S VPN A + N LP + ++
Sbjct: 838 TPNSLIVPNMSK----RVTYNDAVNTCKNFGGKLPSSQNELENV 877



Score = 46.2 bits (109), Expect = 4e-06
Identities = 53/289 (18%), Positives = 93/289 (32%), Gaps = 26/289 (8%)

Query: 3453 ADGPHTVSVTATDVAGNVSTPVTGTVTVDATAPTLAIT---TDDLALAAGEDAN----IT 3505
+ V+ A D GN S V T+TV + + TD A A+ IT
Sbjct: 521 GSNVYKVTARAYDRNGNSSNNVLLTITV-LSNGQVVDQVGVTDFTADKTSAKADGTEAIT 579

Query: 3506 FTFSEAVAGFDVSDITVVGGTLTGLTTTDNITWTAVFTPDGTGTAPSIAVADGSYTDVAG 3565
+T + G +++ V ++G + +G+G A ++ +
Sbjct: 580 YTATVKKNGVAQANVPVSFNIVSGTAVLSANSANT----NGSGKA-TVTLKSDK---PGQ 631

Query: 3566 NLGTGDVLDGTDGFIVDLVAPVVTFADVSTNDTTPALTGTIDDPTA-TVVVTVDGVDYPA 3624
+ + + T + V V T T + A T V V D P
Sbjct: 632 VVVSAKTAEMTSALNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKP- 690

Query: 3625 VNNGDGTWTLADNTLPVL-----ADGPHTVSVTATDVAGNVSTPVTGTVTVDATAPTLAI 3679
V+N + T+T L +G V++T+T ++ + V VD AP +
Sbjct: 691 VSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEF 750

Query: 3680 TTD---DLALAAGEDANITFTFSEAVAGFDVSDITVVGGTLTGLTTTDN 3725
T D + + ++ GG + N
Sbjct: 751 FTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSAN 799



Score = 45.8 bits (108), Expect = 6e-06
Identities = 72/372 (19%), Positives = 120/372 (32%), Gaps = 48/372 (12%)

Query: 2937 TVTATATDPAGNTSLPGTGTVSADITAPVVALDDVLTNDSTPALTGTVNDPTA-TVVVNV 2995
VTA A D GN+S T++ ++ V +T+ + + + A T V
Sbjct: 526 KVTARAYDRNGNSSNNVLLTITV-LSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATV 584

Query: 2996 DGVDYPAVNNG------DGTWTLADNTLPTLADGPHTITVTATDAAGNVGNDTAVVTIDT 3049
N GT L+ N+ T G T+T+ + V +
Sbjct: 585 KKNGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSA 644

Query: 3050 VAPNAPVLDPINAT-------DPVSGQAEPGSTVTVTYPDGTTATVVAGPDGSW--SVPN 3100
+ NA + D + A +T T V+ + ++ ++
Sbjct: 645 LNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGK 704

Query: 3101 PGNLVDGDTVTATATDPAGNTSLPGTGTVSA-------DITAPVVALDDVLTNDSTPA-- 3151
N + A +T+ PG VSA D+ AP V LT D
Sbjct: 705 LSNSTEKTDTNGYAKVTLTSTT-PGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEI 763

Query: 3152 LTGTVNDPTATVVVNVDGVDYPAV-NNGDGTWTLADNTLPVL----------ADGPHTVS 3200
+ V TV + V+ A NG TW A+ + + G T+S
Sbjct: 764 VGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTTIS 823

Query: 3201 VTATDAAGNVGNDTAVVTIDTVAPNAPVLDPINATDPVSGQAEPGSTVTVTYPDGTTATV 3260
V ++D N TA TI T PN+ ++ ++ A ++
Sbjct: 824 VISSD------NQTATYTIAT--PNSLIVPNMS-KRVTYNDAVNTCKNFGGKLP-SSQNE 873

Query: 3261 VAGPDGSWSVPN 3272
+ +W N
Sbjct: 874 LENVFKAWGAAN 885



Score = 45.8 bits (108), Expect = 6e-06
Identities = 58/295 (19%), Positives = 99/295 (33%), Gaps = 38/295 (12%)

Query: 3833 ADGPHTVSVTATDVAGNVSTPVTGTVTVDATAPTLAIT---TDDLALAAGEDAN----IT 3885
+ V+ A D GN S V T+TV + + TD A A+ IT
Sbjct: 521 GSNVYKVTARAYDRNGNSSNNVLLTITV-LSNGQVVDQVGVTDFTADKTSAKADGTEAIT 579

Query: 3886 FTFSEAVAGFDVSDITVVGGTLTGLTTTDNITWTAVFTPDGTGTAPSIAVADGSYTDVVG 3945
+T + G +++ V ++G + +G+G A ++ + VV
Sbjct: 580 YTATVKKNGVAQANVPVSFNIVSGTAVLSANSANT----NGSGKA-TVTLKSDKPGQVVV 634

Query: 3946 NLGTGDVLDGTDGFIVDLVAPVVTFADVSTN-------DTTPALTGTIDDPTATVVVTVD 3998
+ T ++ + A V F D + D T A+ D T V V
Sbjct: 635 SAKTAEMTSALN-------ANAVIFVDQTKASITEIKADKTTAVANGQD--AITYTVKVM 685

Query: 3999 GVDYPAVNNGDGTWTLADNTLPVL-----ADGPHTVSVTATDVAGNVSTPVTGTVTVDAT 4053
D P V+N + T+T L +G V++T+T ++ + V VD
Sbjct: 686 KGDKP-VSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVK 744

Query: 4054 APTLAITTD---DLALAAGEDANITFTFSEAVAGFDVSDITVVGGTLTGLTTTDN 4105
AP + T D + + ++ GG + N
Sbjct: 745 APEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSAN 799



Score = 41.2 bits (96), Expect = 2e-04
Identities = 41/255 (16%), Positives = 79/255 (30%), Gaps = 28/255 (10%)

Query: 4213 ADGPHTVSVTATDVAGNVSTPVTGTVTVDATAPTLAI-------TTDDLALAAGESANIT 4265
+ V+ A D GN S V T+TV + + T D + A + IT
Sbjct: 521 GSNVYKVTARAYDRNGNSSNNVLLTITV-LSNGQVVDQVGVTDFTADKTSAKADGTEAIT 579

Query: 4266 FTFSEAVAGFDANDITLVGGTLSALVTTDNITWTAVFTPDGTGTAPSIAVADGSYTDLAG 4325
+T + G ++ + +S + +G+G A ++ +
Sbjct: 580 YTATVKKNGVAQANVPVSFNIVSGTAVLSANSANT----NGSGKA-TVTLKSDK---PGQ 631

Query: 4326 NLGTGDVLDGTDGFVVDIVAPVVGFTNVTTNDDTPPLTGTIDDPTA-TVVVTVDGVDYP- 4383
+ + + T + V V T T + A T V V D P
Sbjct: 632 VVVSAKTAEMTSALNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPV 691

Query: 4384 -----ATNNGDGTWTLADNTLPSLIDGPHTVSVTATDPAGNVSIPATGTVTISSSSILAF 4438
G + + + +G V++T+T P + V+ + + A
Sbjct: 692 SNQEVTFTTTLGKLSNSTEKTDT--NGYAKVTLTSTTPG---KSLVSARVSDVAVDVKAP 746

Query: 4439 DNTDHAVLSPQPSLV 4453
+ L+ +
Sbjct: 747 EVEFFTTLTIDDGNI 761



Score = 40.8 bits (95), Expect = 2e-04
Identities = 73/397 (18%), Positives = 120/397 (30%), Gaps = 73/397 (18%)

Query: 1217 TVTATATDPAGNTSLPGTGTVSADITAPVVALDDVLTNDSTPALTGTVNDPTA-TVVVNV 1275
VTA A D GN+S T++ ++ V +T+ + + + A T V
Sbjct: 526 KVTARAYDRNGNSSNNVLLTITV-LSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATV 584

Query: 1276 DGVDYPAVNNG------DGTWTLADNTLPTLADGPHTITVTATDAAGNVGNDTAVVTIDT 1329
N GT L+ N+ T G T+T+ + V +
Sbjct: 585 KKNGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSA 644

Query: 1330 VAPNAPVLDPINAT-------DPVSGQAEPGSTVTVTYPDGTTATVVAGPDGSW--SVPN 1380
+ NA + D + A +T T V+ + ++ ++
Sbjct: 645 LNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGK 704

Query: 1381 PGNLVDGDTVTATATDPAGNTSLPGTGTVSA-------DITAPVVALDDVLTNDSTPA-- 1431
N + A +T+ PG VSA D+ AP V LT D
Sbjct: 705 LSNSTEKTDTNGYAKVTLTSTT-PGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEI 763

Query: 1432 LTGTVNDPTATVVVNVDGVDYPAV-NNGDGTWTLADNTLPTLADGPHTITVTATDAAGNV 1490
+ V TV + V+ A NG TW A+ + + V
Sbjct: 764 VGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIAS------------------V 805

Query: 1491 GNDTAVVTIDTVAPNAPVLDPINATDPVSGQAEPGSTVTVTYPDGTTATVVAGPDGSWSV 1550
+ VT+ + + +T++V D TAT S V
Sbjct: 806 DASSGQVTL---------------------KEKGTTTISVISSDNQTATYTIATPNSLIV 844

Query: 1551 PNPGNLVDGDTVTATATDPAGNTS--LPGTGTVSADI 1585
PN A + N LP + ++
Sbjct: 845 PNMSK----RVTYNDAVNTCKNFGGKLPSSQNELENV 877



Score = 40.1 bits (93), Expect = 3e-04
Identities = 74/346 (21%), Positives = 113/346 (32%), Gaps = 31/346 (8%)

Query: 3281 TVTATATDPAGNTSLPGTGTVSADITAPVVALDDVLTNDSTPALTGTVNDPTA-TVVVNV 3339
VTA A D GN+S T++ ++ V +T+ + + + A T V
Sbjct: 526 KVTARAYDRNGNSSNNVLLTITV-LSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATV 584

Query: 3340 DGVDYPAVNNG------DGTWTLADNTLPTLADGPHTITVTATDAAGNVGN-DTAVVTID 3392
N GT L+ N+ T G T+T+ + V + TA +T
Sbjct: 585 KKNGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSA 644

Query: 3393 TSLPVVSLDDLTTNDTTPALTGAIDDPTATVVVNVDGIDYPATNNGDGTWTLADNTLPVL 3452
+ V D T T D T V D I Y
Sbjct: 645 LNANAVIFVDQTKASITEIKA----DKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTT 700

Query: 3453 ADGPHTVSVTATDVAGNVSTPVTGTVTVDATAPTLAITTDDLALAAGEDANITFTFSEAV 3512
G + S TD G VT+ +T P ++ + ++ A + F
Sbjct: 701 TLGKLSNSTEKTDTNG------YAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFF-TT 753

Query: 3513 AGFDVSDITVVGGTLTGLTTTDNITWTAVFTPDGTGTAPSIAVADGSYTDVAGNLGTGDV 3572
D +I +VG + G T + + V G G YT + N V
Sbjct: 754 LTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGN--------GKYTWRSANPAIASV 805

Query: 3573 LDGTDGFIVDLVAPVVTFADVSTNDTTPALTGTIDDPTATVVVTVD 3618
D + G V L T V ++D A T TI P + +V +
Sbjct: 806 -DASSG-QVTLKEKGTTTISVISSDNQTA-TYTIATPNSLIVPNMS 848



Score = 35.8 bits (82), Expect = 0.006
Identities = 74/362 (20%), Positives = 118/362 (32%), Gaps = 74/362 (20%)

Query: 1127 ADKTIDAKVTFTDAAGNSSSVNDTQTYTVDTVAPNAPVLDPINATDPVSGQAEPGSTVTV 1186
D GNSS+ N T TV N V+D + TD + + +
Sbjct: 521 GSNVYKVTARAYDRNGNSSN-NVLLTITV---LSNGQVVDQVGVTDFTADKTSAKA---- 572

Query: 1187 TYPDGTTATVVAGTDGSWSVPNPGNLVDGDTVTATA--------TDPAG------NTSLP 1232
DGT A T V V + V+ TA T+ +G + P
Sbjct: 573 ---DGTEAITYTATVKKNGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKP 629

Query: 1233 GTGTVSADITAPVVALDD---VLTNDSTPALTGTVNDPTATVVVNVDGVDY-PAVNNGDG 1288
G VSA AL+ + + + ++T D T V D + Y V GD
Sbjct: 630 GQVVVSAKTAEMTSALNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDK 689

Query: 1289 TWTLADNTLPTL------------ADGPHTITVTATDAA--------GNVGNDTAVVTID 1328
+ + T T +G +T+T+T +V D ++
Sbjct: 690 PVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVE 749

Query: 1329 TVAPNAPVLDPINATDPVSGQAEPGSTVTVTYPDGTTATVVAGPDG--SWSVPNPGNLVD 1386
+D N + G G TV G +G +G +W NP
Sbjct: 750 FFTT--LTIDDGNIE--IVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPA---- 801

Query: 1387 GDTVTATATDPAGNTSLPGTGTVSADITAPVVALDDVLTNDSTPALTGTVNDPTATVVVN 1446
A+ +G +L GT + + ++D+ A T T+ P + +V N
Sbjct: 802 ----IASVDASSGQVTLKEKGTTTISVI----------SSDNQTA-TYTIATPNSLIVPN 846

Query: 1447 VD 1448
+
Sbjct: 847 MS 848



Score = 34.7 bits (79), Expect = 0.018
Identities = 67/342 (19%), Positives = 94/342 (27%), Gaps = 68/342 (19%)

Query: 3329 NDPTATVVVNVDGVDYPAVNNGDGTWTLADNTLPTLADGPHTITVTATDAAGNVGNDTAV 3388
N+ T+ V + V + G + ADG IT TAT N A
Sbjct: 540 NNVLLTITV----LSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKK----NGVAQ 591

Query: 3389 VTIDTSLPVVSLDDLTTNDTTPALTGAIDDPTATVVVNVDGIDYPATNNGDGTWTLADNT 3448
+ S +VS A+ A A NG G T+
Sbjct: 592 ANVPVSFNIVS---------GTAVLSANS----------------ANTNGSGKATVT--- 623

Query: 3449 LPVLADGPHTVSVTATDVAGNVSTPVTGTV--TVDATAPTLAITTDDLALAAGEDANITF 3506
L V + A S V A I D A IT+
Sbjct: 624 ---LKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASITEIKADKTTAVANGQDAITY 680

Query: 3507 TFSEAVAGFDVSDITVVGGTLTGLTTTDNITWTAVFTPDGTGTAPSIAVADGSYTDVAGN 3566
T VS+ +T+T T + T +
Sbjct: 681 TVKVMKGDKPVSN--------------QEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTT 726

Query: 3567 LGTGDVLDGTDGFIVDLVAPVVTFADVSTNDTTPA--LTGTIDDPTATVVVTVDGVDYPA 3624
G V VD+ AP V F T D + + TV + V+ A
Sbjct: 727 PGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKA 786

Query: 3625 V-NNGDGTWTLADNTLPVL----------ADGPHTVSVTATD 3655
NG TW A+ + + G T+SV ++D
Sbjct: 787 SGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTTISVISSD 828



Score = 33.5 bits (76), Expect = 0.039
Identities = 68/358 (18%), Positives = 116/358 (32%), Gaps = 51/358 (14%)

Query: 2985 NDPTATVVVNVDGVDYPAVNNGDGTWTLADNTLPTLADGPHTITVTATDAAGNVGNDTAV 3044
N+ T+ V + V + G + ADG IT TAT V
Sbjct: 540 NNVLLTITV----LSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQANVP 595

Query: 3045 VTIDTVAPNAPVLDPINATDPVSGQAEPGSTVTVTYPDGTTATVVAGPDGSWSVPNPGNL 3104
V+ + V+ A VL +A SG+A TVT+ V A S N +
Sbjct: 596 VSFNIVSGTA-VLSANSANTNGSGKA----TVTLKSDKPGQVVVSAKTAEMTSALNANAV 650

Query: 3105 VDGDTVTATATDPAGNTSLPGTGTVSADITAPVVALDDVLTNDSTPALTGTVNDPTATVV 3164
+ D A+ T+ + + A V D ++ T T+ + +
Sbjct: 651 IFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNS-- 708

Query: 3165 VNVDGVDYPAVNNGDGTWTLADNTLPVLADGPHTVSVTATDAAGNVGNDTAVVTIDTVAP 3224
+ +G + + + P V+A +V D ++
Sbjct: 709 --------TEKTDTNGYA-----KVTLTSTTPGKSLVSAR--VSDVAVDVKAPEVEFFTT 753

Query: 3225 NAPVLDPINATDPVSGQAEPGSTVTVTYPDGTTATVVAGPDG--SWSVPNPGNLVDGDTV 3282
+D N + G G TV G +G +G +W NP
Sbjct: 754 --LTIDDGNIE--IVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPA-------- 801

Query: 3283 TATATDPAGNTSLPGTGTVSADITAPVVALDDVLTNDSTPALTGTVNDPTATVVVNVD 3340
A+ +G +L GT + + ++D+ A T T+ P + +V N+
Sbjct: 802 IASVDASSGQVTLKEKGTTTISVI----------SSDNQTA-TYTIATPNSLIVPNMS 848



Score = 33.5 bits (76), Expect = 0.039
Identities = 74/367 (20%), Positives = 114/367 (31%), Gaps = 69/367 (18%)

Query: 2297 NDPTATIVVNVDGVDYPAVNNGDGTRTLADNTLPTLADGPHTITVTATDAAGNVGNDTAV 2356
N+ TI V + V + G + ADG IT TAT V
Sbjct: 540 NNVLLTITV----LSNGQVVDQVGVTDFTADKTSAKADGTEAITYTAT-----------V 584

Query: 2357 VTIDTVAPNAPVLDPINATDPVSGQAEPGSTVTVTYPDGT-TATVVAGPDGSWSVPNPGN 2415
N PV I VSG A + T G T T+ + G V
Sbjct: 585 KKNGVAQANVPVSFNI-----VSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAK-- 637

Query: 2416 LVDGDTVTATATDPAGNTSLPGTGTVSADITAPVVALDDVLTNDSTPALTGTVNDPTATV 2475
TA T ++ A IT + A + A+T TV
Sbjct: 638 -------TAEMTSALNANAVIFVDQTKASITE-IKADKTTAVANGQDAITYTVKVMKGDK 689

Query: 2476 VVNVDGVDYPAVNNGDGTWTLADNTLPTLADGPHTITVTATDAA--------GNVGNDTA 2527
V+ V + G + + T +G +T+T+T +V D
Sbjct: 690 PVSNQEVTF---TTTLGKLSNSTEK--TDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVK 744

Query: 2528 VVTIDTVAPNAPVLDPINATDPVSGQAEPGSTVTVTYPDGTTATVVAGPDG--SWSVPNP 2585
++ +D N + G G TV G +G +G +W NP
Sbjct: 745 APEVEFFTT--LTIDDGNIE--IVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANP 800

Query: 2586 GNLVDGDTVTATATDPAGNTSLPGTGTVSADITAPVVALDDVLTNDSTPALTGTVNDPTA 2645
A+ +G +L GT + + ++D+ A T T+ P +
Sbjct: 801 A--------IASVDASSGQVTLKEKGTTTISVI----------SSDNQTA-TYTIATPNS 841

Query: 2646 TVVVNVD 2652
+V N+
Sbjct: 842 LIVPNMS 848


5  A4U85_RS03970  A4U85_RS04065  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTagDNBiasCDNBias%GCBiasProduct
A4U85_RS03970317-6.554469transcription elongation factor GreB
A4U85_RS03980417-6.648088*SgcJ/EcaC family oxidoreductase
A4U85_RS03985315-6.324503non-ribosomal peptide synthetase
A4U85_RS03990623-11.173331hypothetical protein
A4U85_RS03995521-10.250249phage-related reverse
A4U85_RS04000418-5.374702RNA-directed DNA polymerase
A4U85_RS04015-1152.312370TetR/AcrR family transcriptional regulator
A4U85_RS04020-1152.932034nitroreductase family protein
A4U85_RS040251162.619221SDR family oxidoreductase
A4U85_RS040301172.946613TetR/AcrR family transcriptional regulator
A4U85_RS040351173.077349glycerate kinase
A4U85_RS04040-1152.728653FdhF/YdeP family oxidoreductase
A4U85_RS040450122.837107formate dehydrogenase accessory
A4U85_RS040500132.212369DUF1232 domain-containing protein
A4U85_RS040550142.4984504-phosphoerythronate dehydrogenase
A4U85_RS040600152.061931EF-P lysine aminoacylase GenX
A4U85_RS040652152.103642Tim44 domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
A4U85_RS04015  HTHTETR  52  1e-10  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 51.9 bits (124), Expect = 1e-10
Identities = 20/105 (19%), Positives = 41/105 (39%), Gaps = 2/105 (1%)

Query: 5 ALPTRALYVVNKAIDLFHHRGFHLIGVDRIVKESEITKATFYNYFHSKERLIEICLMVQK 64
A TR +++ A+ LF +G + I K + +T+ Y +F K L + +
Sbjct: 9 AQETRQH-ILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 65 EKLQEQVIAMVEYDLGTP-AIDKLKKLYFLHTDLEGPYYLLFKAI 108
+ E + G P ++ + ++ L + + L I
Sbjct: 68 SNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEI 112


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
A4U85_RS04025  DHBDHDRGNASE  80  4e-20  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 80.5 bits (198), Expect = 4e-20
Identities = 52/185 (28%), Positives = 91/185 (49%), Gaps = 2/185 (1%)

Query: 7 VLITGASSGIGSVYADRFAQRGYHLILVARDTNRLDKISKDLQEKYGVQVEFIQADLSND 66
ITGA+ GIG A A +G H+ V + +L+K+ L+ + E AD+ +
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAE-ARHAEAFPADVRDS 69

Query: 67 QDIRKI-EDVLKNDADIEILVNNAGIALNGNFLTQDRNEIEKLLTLNMTAVVRLSHAMSQ 125
I +I + + I+ILVN AG+ G + E E ++N T V S ++S+
Sbjct: 70 AAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSK 129

Query: 126 SLIRKGKGAIINLGSVLGLAPEFGSTIYGASKSFIQFFSQGLHLELKDHGVHVQAVLPSA 185
++ + G+I+ +GS P Y +SK+ F++ L LEL ++ + V P +
Sbjct: 130 YMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPGS 189

Query: 186 TKTEI 190
T+T++
Sbjct: 190 TETDM 194


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
A4U85_RS04030  HTHTETR  55  7e-12  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 55.4 bits (133), Expect = 7e-12
Identities = 19/76 (25%), Positives = 35/76 (46%)

Query: 1 MKVSKTQVKENRDKIVEKATQLFRSKGYDGVGIAELMSSAGFTHGGFYKHFSSKTDLVTI 60
+ +K + +E R I++ A +LF +G + E+ +AG T G Y HF K+DL +
Sbjct: 2 ARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE 61

Query: 61 TAKYGLEQVLKRIEGL 76
+ + +
Sbjct: 62 IWELSESNIGELELEY 77


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
A4U85_RS04050  PF04647  26  0.050  Accessory gene regulator B
		>PF04647#Accessory gene regulator B

Length = 212

Score = 25.9 bits (57), Expect = 0.050
Identities = 15/88 (17%), Positives = 33/88 (37%), Gaps = 11/88 (12%)

Query: 39 VVAVAYAFSPIDLIPDFIPILGFIDDAVILPILIWLAVRFTPQQVIFDAEQQAKEWLDEH 98
+V A+ + P + +L I L L++L P+ +I
Sbjct: 86 LVFNVLAYIAHLIDPAYFQLLILIAFITSLLALLFLVPVDNPRNLI-----------SNT 134

Query: 99 EKRPKNYLVAVLIILIWLTLAVMAYFYF 126
E+R L +++++ ++ AY +
Sbjct: 135 EQRKTLKLKTSMVLMVLFGGSIGAYRLY 162


6  A4U85_RS04220  A4U85_RS04290  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTagDNBiasCDNBias%GCBiasProduct
A4U85_RS04220216-2.982184phosphoribosylglycinamide formyltransferase
A4U85_RS04225217-3.278017phosphoribosylformylglycinamidine cyclo-ligase
A4U85_RS04230319-4.103251AI-2E family transporter
A4U85_RS04235220-4.228052DnaA regulatory inactivator Hda
A4U85_RS04240219-4.447979rhombotarget A
A4U85_RS04245016-3.713846CSLREA domain-containing protein
A4U85_RS04250-113-1.493703DUF3106 domain-containing protein
A4U85_RS04255-114-0.574050hypothetical protein
A4U85_RS04260-213-0.087315RNA polymerase sigma factor
A4U85_RS042653130.902843RNA methyltransferase
A4U85_RS042704151.175008class 1 fructose-bisphosphatase
A4U85_RS042755151.284075NF038105 family protein
A4U85_RS042803151.390289peptidoglycan-associated lipoprotein Pal
A4U85_RS042852120.641022Tol-Pal system beta propeller repeat protein
A4U85_RS042902121.579450cell envelope integrity protein TolA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
A4U85_RS04280  OMPADOMAIN  109  4e-31  OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 109 bits (274), Expect = 4e-31
Identities = 32/117 (27%), Positives = 52/117 (44%), Gaps = 11/117 (9%)

Query: 77 VHFDYDSSDLSTEDYQTLQAHAQFL--MANANSKVALTGHTDERGTREYNMALGERRAKA 134
V F+++ + L E L L + + V + G+TD G+ YN L ERRA++
Sbjct: 221 VLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQS 280

Query: 135 VQNYLITSGVNPQQLEAVSYGKEAPV---------NPGHDESAWKENRRVEINYEAV 182
V +YLI+ G+ ++ A G+ PV +RRVEI + +
Sbjct: 281 VVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEVKGI 337


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
A4U85_RS04290  IGASERPTASE  73  1e-15  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 72.8 bits (178), Expect = 1e-15
Identities = 56/363 (15%), Positives = 113/363 (31%), Gaps = 39/363 (10%)

Query: 39 PEQPKKLTTVLVKPEDLPPPLAKEVEQPTVAENQAEEVLSPIVDETLPQNLPAAPPPP-- 96
PE K+ TV ++ P + + P+V N E P PP P
Sbjct: 983 PEVEKRNQTV--DTTNITTPNNIQADVPSVPSNNEEI--------ARVDEAPVPPPAPAT 1032

Query: 97 ----------TAQQLAAQKQKAEQAQQAKLAEEKRKAEEAAKVKQAAEQQRLEEAQKQQA 146
++Q + +K EQ A+ + A+EA +A Q +
Sbjct: 1033 PSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSET 1092

Query: 147 EAKRQAEAKARAEAEQKRKAEQNAKAEADAKARQKVTEEAKRKAETEARLKREAQKAENA 206
+ + E K A E++ K AK E + K E K ++ + ++ A
Sbjct: 1093 KETQTTETKETATVEKEEK----AKVETE-----KTQEVPKVTSQVSPKQEQSETVQPQA 1143

Query: 207 KLQAQQEAKRKAEADAKAKQQKAAEDAKRKAESDAKAKQQAADNAKRKAEADAKAKQQKA 266
+ + + + + +++ A+ + E+ + +Q ++ +
Sbjct: 1144 EPARENDPTVNIK-EPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENT 1202

Query: 267 AEDAKRKAEADAKAKQQKAAEDAKRKAEADAKAKQQKAAEDAKRKAEADAKAKQQKAAED 326
+ + + K ++ ++ D A D + A
Sbjct: 1203 TPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNA--- 1259

Query: 327 AKRKAEADAKAKQQKAAEDARRKAEIEAEEKAASAKKAQEEAAQKKGEAKKIASSAKRDF 386
+DA+AK Q A + + + + + K +SS R F
Sbjct: 1260 ----VLSDARAKAQFVALNVGKAVSQHISQLEMNNEGQYNVWVSNTSMNKNYSSSQYRRF 1315

Query: 387 EQK 389
K
Sbjct: 1316 SSK 1318


7  A4U85_RS04500  A4U85_RS04575  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTagDNBiasCDNBias%GCBiasProduct
A4U85_RS04500-117-3.621075tRNA-(ms[2]io[6]A)-hydroxylase
A4U85_RS04505120-4.091804SDR family oxidoreductase
A4U85_RS19140621-7.223873*hypothetical protein
A4U85_RS04515317-5.428899hypothetical protein
A4U85_RS04520217-5.017437PaaI family thioesterase
A4U85_RS04525215-5.243752Hsp70 family protein
A4U85_RS04530213-4.694679hypothetical protein
A4U85_RS0453519-3.239001pyridoxal phosphate-dependent aminotransferase
A4U85_RS04540010-0.244797hypothetical protein
A4U85_RS04545111-0.986748sensor domain-containing diguanylate cyclase
A4U85_RS04550110-1.061153hypothetical protein
A4U85_RS04560216-1.327702excinuclease ABC subunit UvrB
A4U85_RS04565117-1.443319lipocalin family protein
A4U85_RS04570116-0.609437EamA family transporter RarD
A4U85_RS04575215-0.417179glyceraldehyde-3-phosphate dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
A4U85_RS04505  DHBDHDRGNASE  93  9e-25  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 92.8 bits (230), Expect = 9e-25
Identities = 63/248 (25%), Positives = 105/248 (42%), Gaps = 23/248 (9%)

Query: 3 ALVTGASAGFGYSISKKLIESGYKVIGCGRRAEKLEELQKQLG-ENFYPLVF--DMTDTA 59
A +TGA+ G G ++++ L G + EKLE++ L E + F D+ D+A
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDSA 70

Query: 60 ---ENINKLFKELPNEFQIDQIDLLVNNAGLALGLEPADKADLDDWYTMIDTNVKGLVTV 116
E ++ +E + ID+LVN AG+ L ++W N G+
Sbjct: 71 AIDEITARIERE------MGPIDILVNVAGV-LRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 117 TRLILPSMVKKKSGLIINMGSIAGTYPYPGGNVYGATKAFVEQFSLNLRADLAGTGVRVT 176
+R + M+ ++SG I+ +GS P Y ++KA F+ L +LA +R
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCN 183

Query: 177 NIEPG---------LCGGTEFSLVRFKGDQEKANSLYDKKNPILPEDIANTVAWIAS-QP 226
+ PG L + KG E + K P DIA+ V ++ S Q
Sbjct: 184 IVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQA 243

Query: 227 PHININRI 234
HI ++ +
Sbjct: 244 GHITMHNL 251


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
A4U85_RS04530  SHAPEPROTEIN  109  2e-28  Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 109 bits (275), Expect = 2e-28
Identities = 67/349 (19%), Positives = 147/349 (42%), Gaps = 45/349 (12%)

Query: 8 IAIDLGTSNSLVGIFENGQSKLINNPYGMKTTPSAIALDEQGQILIGQAALELRSRGKEV 67
++IDLGT+N+L+ + GQ ++N PS +A+ + + + + G +
Sbjct: 13 LSIDLGTANTLI--YVKGQGIVLNE-------PSVVAIRQDR----AGSPKSVAAVGHDA 59

Query: 68 LTSFKRLMGTS-------KRLKLGQQSFSAVELSSLILKS-LKQDVEQALQCEVEEAVIT 119
K+++G + + +K G + ++ +L+ +KQ + ++
Sbjct: 60 ----KQMLGRTPGNIAAIRPMKDG--VIADFFVTEKMLQHFIKQVHSNSFMRPSPRVLVC 113

Query: 120 VPAYFNDIQRQATISAAELAGLKVSRLINEPTAAALAYGLGQTQDSCFLIFDLGGGTFDV 179
VP ++R+A +A+ AG + LI EP AAA+ GL ++ + ++ D+GGGT +V
Sbjct: 114 VPVGATQVERRAIRESAQGAGAREVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEV 173

Query: 180 SIVELFDGITEVRASAGDNYLGGDDFVQLLMKQYWKQHATIFG-YTENTIPFDIETALRA 238
+++ L + + +GGD F + ++ + + ++ G T I +I +A
Sbjct: 174 AVISLNGVVY-----SSSVRIGGDRFDEAIINYVRRNYGSLIGEATAERIKHEIGSAYPG 228

Query: 239 RAQHSLHVLSKEAKTDLIFKWE-DKEATLEITQEEFAVWAEPLLLRLRR---PLERALRD 294
+ V + + + + LE QE +++ L + L + +
Sbjct: 229 DEVREIEVRGRNLAEGVPRGFTLNSNEILEALQEPLTGIVSAVMVALEQCPPELASDISE 288

Query: 295 ARVLPQQVDQIIMVGGATRIPVVRKLVTKLFGRFPSTSVQPDEAIVRGT 343
+++ GG + + +L+ + G + P + RG
Sbjct: 289 R--------GMVLTGGGALLRNLDRLLMEETGIPVVVAEDPLTCVARGG 329


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
A4U85_RS04565  BCTLIPOCALIN  103  2e-30  Bacterial lipocalin signature.
		>BCTLIPOCALIN#Bacterial lipocalin signature.

Length = 171

Score = 103 bits (258), Expect = 2e-30
Identities = 54/154 (35%), Positives = 85/154 (55%), Gaps = 10/154 (6%)

Query: 43 VDKVELDRYLGVWYEVARKPAFFQKKCAYNVSATYTLNENGNIVVDNRCYDNQKQL-QQS 101
V EL+ YLG WYEVAR F++ + V+A Y + +G I V NR Y +K +++
Sbjct: 26 VSDFELNNYLGKWYEVARLDHSFERGLS-QVTAEYRVRNDGGISVLNRGYSEEKGEWKEA 84

Query: 102 IGEAFVVNPPYNTKLKVSFLPEAVRWIPIIRGDYWILKLD-EDYQTVLVGEPSRKYLWVL 160
G+A+ VN + LKVSF G Y + +LD E+Y V P+ +YLW+L
Sbjct: 85 EGKAYFVNGSTDGYLKVSFFGP-------FYGSYVVFELDRENYSYAFVSGPNTEYLWLL 137

Query: 161 SRTPHPHKEVVDEYLNYAKTLGFDIRDIIHTEHK 194
SRTP + ++D+++ +K GFD +I+ + +
Sbjct: 138 SRTPTVERGILDKFIEMSKERGFDTNRLIYVQQQ 171


8  A4U85_RS05420  A4U85_RS05835  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTagDNBiasCDNBias%GCBiasProduct
A4U85_RS05420323-0.396789site-specific integrase
A4U85_RS054253230.020412phage antirepressor
A4U85_RS054304240.559849hypothetical protein
A4U85_RS054355220.428770hypothetical protein
A4U85_RS054405220.614840hypothetical protein
A4U85_RS054455240.714380hypothetical protein
A4U85_RS054504230.957278hypothetical protein
A4U85_RS054554231.696761PD-(D/E)XK nuclease-like domain-containing
A4U85_RS05460322-0.259354hypothetical protein
A4U85_RS05465323-2.572859hypothetical protein
A4U85_RS05470321-3.504217hypothetical protein
A4U85_RS05475221-3.357189pentapeptide repeat-containing protein
A4U85_RS05480324-3.405145DUF4411 family protein
A4U85_RS05485424-4.516343ImmA/IrrE family metallo-endopeptidase
A4U85_RS05490524-3.566956hypothetical protein
A4U85_RS05495623-1.323386helix-turn-helix domain-containing protein
A4U85_RS055053240.026736helix-turn-helix transcriptional regulator
A4U85_RS055103240.542532helix-turn-helix domain-containing protein
A4U85_RS055152250.649514hypothetical protein
A4U85_RS055202260.621359helix-turn-helix domain-containing protein
A4U85_RS055253291.009817AAA family ATPase
A4U85_RS194105291.223672hypothetical protein
A4U85_RS055356290.461280hypothetical protein
A4U85_RS05540730-0.561339hypothetical protein
A4U85_RS05545931-0.664875hypothetical protein
A4U85_RS055501133-0.480998hypothetical protein
A4U85_RS055551130-1.260202VRR-NUC domain-containing protein
A4U85_RS055601031-1.691796hypothetical protein
A4U85_RS05565932-1.701826hypothetical protein
A4U85_RS05570829-0.649584hypothetical protein
A4U85_RS05585630-0.630504**hypothetical protein
A4U85_RS055903260.462759hypothetical protein
A4U85_RS055952240.719529DUF968 domain-containing protein
A4U85_RS056003240.800763recombination protein NinB
A4U85_RS056053240.952712hypothetical protein
A4U85_RS056102220.731584DUF2280 domain-containing protein
A4U85_RS056151211.694170hypothetical protein
A4U85_RS056203201.344107DUF4055 domain-containing protein
A4U85_RS056254211.275399minor capsid protein
A4U85_RS056302201.291363hypothetical protein
A4U85_RS056352191.505749hypothetical protein
A4U85_RS056402191.906791hypothetical protein
A4U85_RS056452210.094943hypothetical protein
A4U85_RS05650325-0.405555hypothetical protein
A4U85_RS05655427-0.878633hypothetical protein
A4U85_RS05660526-1.625289hypothetical protein
A4U85_RS05665525-0.807987hypothetical protein
A4U85_RS056705230.242023GIY-YIG nuclease family protein
A4U85_RS056802170.977990hypothetical protein
A4U85_RS056852180.842341hypothetical protein
A4U85_RS056902170.933428hypothetical protein
A4U85_RS056953180.142253hypothetical protein
A4U85_RS05700218-1.757655hypothetical protein
A4U85_RS05705723-4.053347hypothetical protein
A4U85_RS05715519-1.944339Arc family DNA-binding protein
A4U85_RS05720418-2.900056Arc family DNA-binding protein
A4U85_RS05725419-3.082817hypothetical protein
A4U85_RS05730418-2.755830DUF4468 domain-containing protein
A4U85_RS05735318-2.429818tape measure protein
A4U85_RS05740318-1.755555hypothetical protein
A4U85_RS057454170.313049hypothetical protein
A4U85_RS057504192.196792hypothetical protein
A4U85_RS057552182.766514DUF1833 family protein
A4U85_RS057601161.837040hypothetical protein
A4U85_RS057650131.496783hypothetical protein
A4U85_RS057700111.300013hypothetical protein
A4U85_RS05775015-1.730377glycoside hydrolase family protein
A4U85_RS057950130.275454*hypothetical protein
A4U85_RS05800-1132.420847DUF3820 family protein
A4U85_RS05805-1133.071767alpha/beta fold hydrolase
A4U85_RS19155-1133.249113hypothetical protein
A4U85_RS05810-1133.313647serine hydroxymethyltransferase
A4U85_RS05815-2133.331896multidrug efflux RND transporter outer membrane
A4U85_RS058200152.952787multidrug efflux RND transporter permease
A4U85_RS058250121.619887multidrug efflux RND transporter periplasmic
A4U85_RS058300140.367548multidrug efflux transcriptional repressor AdeL
A4U85_RS058352150.022312ABC transporter substrate-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
A4U85_RS05510  HTHFIS  29  0.003  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.4 bits (66), Expect = 0.003
Identities = 11/32 (34%), Positives = 18/32 (56%), Gaps = 2/32 (6%)

Query: 15 EVSRVLQAL--ASSNQSQVAEQLGIDPSTLSR 44
E +L AL NQ + A+ LG++ +TL +
Sbjct: 437 EYPLILAALTATRGNQIKAADLLGLNRNTLRK 468


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
A4U85_RS05675  ANTHRAXTOXNA  29  0.014  Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 28.6 bits (63), Expect = 0.014
Identities = 18/70 (25%), Positives = 33/70 (47%), Gaps = 14/70 (20%)

Query: 1 MNAIVKIETPEQAYQVIEQMKEQLKDVKCSLLEMTSMIG--------DIEKHEGEFFFNF 52
+N +VK E + I+Q ++ LK + +LE+ S +G D+ +H+
Sbjct: 63 INNLVKTEFTNETLDKIQQTQDLLKKIPKDVLEIYSELGGEIYFTDIDLVEHKE------ 116

Query: 53 LKSLSAQGKK 62
L+ LS + K
Sbjct: 117 LQDLSEEEKN 126


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
A4U85_RS05740  CHANLCOLICIN  31  0.047  Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 30.8 bits (69), Expect = 0.047
Identities = 34/167 (20%), Positives = 66/167 (39%), Gaps = 5/167 (2%)

Query: 932 KIQQTQDALDAQAKFLLQEVMTNKSYSKTKNALLNDDLDYRSLEKIVGKNFIGWDYEGKK 991
++ AL +AK + + K S ++ ++ D + ++L + + D E K
Sbjct: 176 AEEKRLAALSEEAKAV---EIAQKKLSAAQSEVVKMDGEIKTLNSRLSSSIHARDAEMKT 232

Query: 992 LGKDKASQHLAKQDSYYNQLNKILGASPDAASKAIGDLSKFEDEAYKARAKTLEEIKQLQ 1051
L + A Y +L++++ A+ + + FE + A + E KQ Q
Sbjct: 233 LAGKRNELAQASAK--YKELDELVKKLSPRANDPLQNRPFFEATRRRVGAGKIREEKQKQ 290

Query: 1052 ATYDSETVARSKKREEEINKATILGQSNLIPKINERYDAEDKLAQKQ 1098
T + R +I KA +N I ++AE+ L + Q
Sbjct: 291 VTASETRINRINADITQIQKAISQVSNNRNAGIARVHEAEENLKKAQ 337


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
A4U85_RS05810  SURFACELAYER  31  0.012  Lactobacillus surface layer protein signature.
		>SURFACELAYER#Lactobacillus surface layer protein signature.

Length = 439

Score = 30.8 bits (69), Expect = 0.012
Identities = 25/124 (20%), Positives = 43/124 (34%), Gaps = 9/124 (7%)

Query: 255 VFPGNQGGPLMHAIAAKAICFKEAMSDDFKAYQQQVVKNAQAMAEVFIARGYDVVSGG-- 312
+ NQG + ++ A A D K A+ + A+ +V S G
Sbjct: 212 IDADNQGQLNITSVVAAINSKYFAAQYDKKQLTNVTFDTETAVKDALKAQKIEVSSVGYF 271

Query: 313 TDNHLFLLSL-IKQDVTGKDADAWLGAAHITVNKNSVPNDPRSPFVTS------GIRIGT 365
H F +++ + GK A + V VP+ ++ + R+GT
Sbjct: 272 KAPHTFTVNVKATSNKNGKSATLPVTVTVPNVADPVVPSQSKTIMHNAYFYDKDAKRVGT 331

Query: 366 PAVT 369
VT
Sbjct: 332 DKVT 335


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
A4U85_RS05820  ACRIFLAVINRP  1098  0.0  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1098 bits (2841), Expect = 0.0
Identities = 429/1042 (41%), Positives = 651/1042 (62%), Gaps = 18/1042 (1%)

Query: 3 ISKFFIDRPIFAGVLSVLILLAGLLSVFQLPISEYPEVVPPSVVVRAQYPGANPKVIAET 62
++ FFI RPIFA VL++++++AG L++ QLP+++YP + PP+V V A YPGA+ + + +T
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 63 VASPLEESINGVEDMLYMQSQANSDGNLTITVNFKLGIDPDKAQQLVQNRVSQAMPRLPE 122
V +E+++NG+++++YM S ++S G++TIT+ F+ G DPD AQ VQN++ A P LP+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 123 DVQRLGVTTLKSSPTLTMVVHLTSPDNRYDMTYLRNYAVLNVKDRLARLQGVGEVGLFGS 182
+VQ+ G++ KSS + MV S + + +Y NVKD L+RL GVG+V LFG
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG- 179

Query: 183 GDYAMRVWLDPQKVAQRNLTATEIVNAIREQNIQVAAGTIGASPSNS--PLQLSVNAQGR 240
YAMR+WLD + + LT +++N ++ QN Q+AAG +G +P+ L S+ AQ R
Sbjct: 180 AQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 241 LTTEQEFADIILKTAPDGAVTRLGDVARVELAASQYGLRSLLDNKQAVAIPIFQAPGANA 300
+EF + L+ DG+V RL DVARVEL Y + + ++ K A + I A GANA
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 301 LQVSDQVRSTMKELSKDFPSSIKYDIVYDPTQFVRASIKAVVHTLLEAIALVVVVVILFL 360
L + +++ + EL FP +K YD T FV+ SI VV TL EAI LV +V+ LFL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 361 QTWRASIIPLLAVPVSIIGTFALMLAFGYSINALSLFGMVLAIGIVVDDAIVVVENVER- 419
Q RA++IP +AVPV ++GTFA++ AFGYSIN L++FGMVLAIG++VDDAIVVVENVER
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 420 NIEAGLNPREATYRAMREVSGPIIAIALTLVAVFVPLAFMTGLTGQFYKQFAMTIAISTV 479
+E L P+EAT ++M ++ G ++ IA+ L AVF+P+AF G TG Y+QF++TI +
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 480 ISAFNSLTLSPALAALLLKGHDAKPDALTRIMNRVFGRFFALFNRVFSRASDRYSQGVSR 539
+S +L L+PAL A LLK ++ + G FF FN F + + Y+ V +
Sbjct: 480 LSVLVALILTPALCATLLKP-------VSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGK 532

Query: 540 VISHKASAMGVYAALLGLTVGISYIVPGGFVPAQDKQYLISFAQLPNGASLDRTEAVIRK 599
++ + +YA ++ V + +P F+P +D+ ++ QLP GA+ +RT+ V+ +
Sbjct: 533 ILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQ 592

Query: 600 MSDTALK--QPGVESAVAFPGLSINGFTNSSSAGIVFVTLKPFDERKAKDLSANAIAGAL 657
++D LK + VES G S +G + +AG+ FV+LKP++ER + SA A+
Sbjct: 593 VTDYYLKNEKANVESVFTVNGFSFSG--QAQNAGMAFVSLKPWEERNGDENSAEAVIHRA 650

Query: 658 NQKYSAIQDAYIAVFPPPPVMGLGTMGGFKLQLEDRGALGYSALNDAAQNFM-KAAQSAP 716
+ I+D ++ F P ++ LGT GF +L D+ LG+ AL A + AAQ
Sbjct: 651 KMELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPA 710

Query: 717 ELGPMFSSYQINVPQLNVDLDRVKAKQQGVAVTDVFNTMQIYLGSQYVNDFNRFGRVYQV 776
L + + + Q +++D+ KA+ GV+++D+ T+ LG YVNDF GRV ++
Sbjct: 711 SLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKL 770

Query: 777 RAQADAPFRANPEDILQLKTRNSAGQMVPLSSLVNVTQTYGPEMVVRYNGYTSADINGGP 836
QADA FR PED+ +L R++ G+MVP S+ YG + RYNG S +I G
Sbjct: 771 YVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEA 830

Query: 837 APGYSSSQAEAAVERIAAQTLPRGIKFEWTDLTYQKILAGNAGLWVFPISVLLVFLVLAA 896
APG SS A A +E +A++ LP GI ++WT ++YQ+ L+GN + IS ++VFL LAA
Sbjct: 831 APGTSSGDAMALMENLASK-LPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAA 889

Query: 897 QYESLTLPLAVILIVPMGILAALTGVWLTAGDNNIFTQIGLMVLVGLACKNAILIVEFAR 956
YES ++P++V+L+VP+GI+ L L N+++ +GL+ +GL+ KNAILIVEFA+
Sbjct: 890 LYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAK 949

Query: 957 EL-EMQGATAFKAAVEASRLRLRPILMTSIAFIMGVVPLVTSTGAGSEMRHAMGVAVFFG 1015
+L E +G +A + A R+RLRPILMTS+AFI+GV+PL S GAGS ++A+G+ V G
Sbjct: 950 DLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGG 1009

Query: 1016 MIGVTFFGLFLTPAFYVLIRSL 1037
M+ T +F P F+V+IR
Sbjct: 1010 MVSATLLAIFFVPVFFVVIRRC 1031



Score = 83.3 bits (206), Expect = 3e-18
Identities = 86/461 (18%), Positives = 169/461 (36%), Gaps = 40/461 (8%)

Query: 610 VESAVAFP-GLSINGFTN-------SSSAGIVFVTLKPFDERKAKDLSANAIAGALNQKY 661
V+ V ++NG N S SAG V +TL F D++ + L
Sbjct: 57 VQDTVTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLT-FQSGTDPDIAQVQVQNKLQLAT 115

Query: 662 SAIQDAYIAVFPPPPVMGLGTMGGFKLQLE---DRGALGYSALNDAAQNFMKAAQSAPEL 718
+ + + + + D ++D + +K L
Sbjct: 116 PLLPQE----VQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVK-----DTL 166

Query: 719 GPMFSSYQINV----PQLNVDLDRVKAKQQGVAVTDVFNTM-----QIYLGSQYVNDFNR 769
+ + + + + LD + + DV N + QI G Q
Sbjct: 167 SRLNGVGDVQLFGAQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAG-QLGGTPAL 225

Query: 770 FGRVYQVRAQADAPFRANPEDILQLKTR-NSAGQMVPLSSLVNVTQTYGP-EMVVRYNGY 827
G+ A F NPE+ ++ R NS G +V L + V ++ R NG
Sbjct: 226 PGQQLNASIIAQTRF-KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGK 284

Query: 828 TSADINGGPAPGYSSSQ-AEAAVERIA--AQTLPRGIKFEWT-DLTYQKILAGNAGLWVF 883
+A + A G ++ A+A ++A P+G+K + D T L+ + +
Sbjct: 285 PAAGLGIKLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTL 344

Query: 884 PISVLLVFLVLAAQYESLTLPLAVILIVPMGILAALTGVWLTAGDNNIFTQIGLMVLVGL 943
+++LVFLV+ +++ L + VP+ +L + N T G+++ +GL
Sbjct: 345 FEAIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGL 404

Query: 944 ACKNAILIVE-FARELEMQGATAFKAAVEASRLRLRPILMTSIAFIMGVVPLVTSTGAGS 1002
+AI++VE R + +A ++ ++ ++ +P+ G+
Sbjct: 405 LVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTG 464

Query: 1003 EMRHAMGVAVFFGMIGVTFFGLFLTPAF-YVLIRSLNSKHK 1042
+ + + M L LTPA L++ ++++H
Sbjct: 465 AIYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHH 505


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
A4U85_RS05825  RTXTOXIND  51  3e-09  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 51.4 bits (123), Expect = 3e-09
Identities = 25/139 (17%), Positives = 58/139 (41%), Gaps = 14/139 (10%)

Query: 44 ATVDVASVVSKTITDWQEYSGRLEAIDQVDIRPQVSGKLIAVHFKDGSLVKKGDLLFTID 103
V++ + + +T +SGR + I +P + + + K+G V+KGD+L +
Sbjct: 78 GQVEIVATANGKLT----HSGRSKEI-----KPIENSIVKEIIVKEGESVRKGDVLLKLT 128

Query: 104 PRPFEAELNRAKAQLASAEAQVTYTASNLSRIQRLIQSNAVSRQE----LDLAENDARSA 159
EA+ + ++ L A + LSR L + + + +++E +
Sbjct: 129 ALGAEADTLKTQSSLLQARLEQ-TRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRL 187

Query: 160 NANLQAARAAVQSARLNLE 178
+ ++ + Q+ + E
Sbjct: 188 TSLIKEQFSTWQNQKYQKE 206



Score = 49.4 bits (118), Expect = 1e-08
Identities = 19/105 (18%), Positives = 40/105 (38%), Gaps = 12/105 (11%)

Query: 110 ELNRAKAQLASAEAQVTYTASNLSRIQRLIQSNAVSRQELDLAENDARS--------ANA 161
+ + + A ++ S L +I+ I S +++E L ++
Sbjct: 253 AVLEQENKYVEAVNELRVYKSQLEQIESEILS---AKEEYQLVTQLFKNEILDKLRQTTD 309

Query: 162 NLQAARAAVQSARLNLEYTRITAPVSGRISRAEV-TVGNVVSAGN 205
N+ + + + I APVS ++ + +V T G VV+
Sbjct: 310 NIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAE 354


9  A4U85_RS06200  A4U85_RS06315  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTagDNBiasCDNBias%GCBiasProduct
A4U85_RS06200-219-3.267002class I SAM-dependent methyltransferase
A4U85_RS06205-119-3.377088glycosyltransferase
A4U85_RS06210019-3.022284NirD/YgiW/YdeI family stress tolerance protein
A4U85_RS06215-113-1.957196BLUF domain-containing protein
A4U85_RS06220-114-1.962965hypothetical protein
A4U85_RS06225-213-1.839819LysE family translocator
A4U85_RS06230-313-1.523021AraC family transcriptional regulator
A4U85_RS06235-213-3.087603alpha/beta fold hydrolase
A4U85_RS06240-215-4.501866sodium/glutamate symporter
A4U85_RS06245116-5.585535ATP-binding protein
A4U85_RS06250220-7.054779FadR family transcriptional regulator
A4U85_RS06255324-8.190723hypothetical protein
A4U85_RS06260021-7.159901TetR/AcrR family transcriptional regulator
A4U85_RS06265012-3.532097transposase
A4U85_RS06270011-2.145877Csu fimbrial major subunit CsuAB
A4U85_RS06275114-2.333730spore coat protein U domain-containing protein
A4U85_RS06280011-1.330304Csu fimbrial biogenesis protein CsuB
A4U85_RS06285010-0.930223Csu fimbrial biogenesis chaperone CsuC
A4U85_RS06290111-1.645349Csu fimbrial usher CsuD
A4U85_RS06295112-2.215652Csu fimbrial tip adhesin CsuE
A4U85_RS19365314-4.023782hypothetical protein
A4U85_RS06305214-3.942244glycine betaine/L-proline transporter ProP
A4U85_RS06310318-4.306194NUDIX hydrolase
A4U85_RS06315218-3.769918HD domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
A4U85_RS06210  adhesinb  28  0.012  Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 27.5 bits (61), Expect = 0.012
Identities = 16/67 (23%), Positives = 24/67 (35%), Gaps = 11/67 (16%)

Query: 1 MKKILFTTLTGLVLLTSSAAFARTDPALLNQAAKNVVTVSKAKTLADETGVTLTGTIVKH 60
MKK F L L + +A ++ + NVV ++ I K+
Sbjct: 1 MKKCRFLVLLLLAFVGLAACSSQKSSTETGSSKLNVVAT-----------NSIIADITKN 49

Query: 61 IAGDHYE 67
IAGD
Sbjct: 50 IAGDKIN 56


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
A4U85_RS06260  HTHTETR  60  2e-13  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 60.4 bits (146), Expect = 2e-13
Identities = 14/73 (19%), Positives = 30/73 (41%)

Query: 17 TTLKGQERIKQILRNAEIVFLTKGYSGFSMRGVATQSNISLSTLQHYFQNKDILLKALLN 76
T + QE + IL A +F +G S S+ +A + ++ + +F++K L +
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 77 KLICDYIQRIEIL 89
+ +
Sbjct: 65 LSESNIGELELEY 77


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
A4U85_RS06290  PF00577  390  e-124  Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 390 bits (1003), Expect = e-124
Identities = 139/755 (18%), Positives = 267/755 (35%), Gaps = 79/755 (10%)

Query: 120 LDKLKDVSYEYQSSNQYFKLNFPPAWMPTQVLGKDSWYKPEVAQSGI-GLLNNYDF--YT 176
+ D + + Q L P A+M + G + PE+ GI L NY+F +
Sbjct: 139 TSMIHDATAQLDVGQQRLNLTIPQAFMSNRARG---YIPPELWDPGINAGLLNYNFSGNS 195

Query: 177 YRPYQGGSTSSLFTEQRFFSPLGV--IKNSGVYVKNQYKNEGNAESVDNDGYRRYDTSWQ 234
+ GG++ + + +G ++++ + N ++ S + ++ +T +
Sbjct: 196 VQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNS----SDSSSGSKNKWQHINTWLE 251

Query: 235 FDNQKNATSFLLGDIITGSKTTWGSSVRLGGFQVQRNYSTRPDLITYPLPQFIGQAALPS 294
D + LGD T + + G Q+ + + PD P G A +
Sbjct: 252 RDIIPLRSRLTLGDGYTQG-DIF-DGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTA 309

Query: 295 TVDLIINGQKTSSTEVQSGPFILNNVPFINGKGEAVVVTTDAVGRQVTTSVPFYISNTLL 354
V + NG ++ V GPF +N++ G+ V +A G +VP+ L
Sbjct: 310 QVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQ 369

Query: 355 KPGLFDYSLSLGKIREDYGLKNFSYGKFASAADARYGVNDWLTVEGRTELSSDLQLLGAG 414
+ G YS++ G+ R + + +G+ T+ G T+L+ + G
Sbjct: 370 REGHTRYSITAGEYRSGNAQQ---EKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFG 426

Query: 415 SVLKLANLGVLSASFTQSKADKSMSEDRTKDLEGNQYTVGYSYNRNRFGFSIN------- 467
+ LG LS TQ+ + +G Y+ + N G +I
Sbjct: 427 IGKNMGALGALSVDMTQANSTL----PDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYS 482

Query: 468 -------HNQRDDEYTDLSRLQYSNLISVNSNKSLTANTYFATKNS---------GTFGV 511
+ + +I V + N + + G
Sbjct: 483 TSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTST 542

Query: 512 GYINTKANDFKN-----RFLNLSWAPVLPTYMNGVTVSLSA--NRDFIEKEWSAAFQL-- 562
Y++ + T + +LS ++ +K L
Sbjct: 543 LYLSGSHQTYWGTSNVDEQFQAGLN----TAFEDINWTLSYSLTKNAWQKGRDQMLALNV 598

Query: 563 SIPL----------FQRNATVNSGYAFNKQGDTGY-VNFNRSVPSEGGFGVDL----TRR 607
+IP R+A+ + + + G ++ + +
Sbjct: 599 NIPFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGG 658

Query: 608 FNENSEDLNQARVNYRNSYINTDFGLSGNHDY-NYWFGLSGSLIYMAGDLFASNRLGESF 666
+ NS A +NYR Y N + G S + D ++G+SG ++ A + L ++
Sbjct: 659 GDGNSGSTGYATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTV 718

Query: 667 ALIDTNQVPDVLVRYENSLIGRSNKKGHIFVPSVTPYYSGKYSVDPIDLPSNFTITQVEQ 726
L+ D + EN R++ +G+ +P T Y + ++D L N +
Sbjct: 719 VLVKAPGAKD--AKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVA 776

Query: 727 RIAAKRGSGVVIKFPVHQSISANVYLTQADGKPVPVGAVV-HRADQESSYVGMDGIVYLE 785
+ RG+ V +F I + LT + KP+P GA+V + Q S V +G VYL
Sbjct: 777 NVVPTRGAIVRAEFKARVGIKLLMTLTH-NNKPLPFGAMVTSESSQSSGIVADNGQVYLS 835

Query: 786 NLKPNNTVTVQ--RSDQSICKADFSVDVEQAKQQI 818
+ V V+ + + C A++ + E +Q +
Sbjct: 836 GMPLAGKVQVKWGEEENAHCVANYQLPPESQQQLL 870


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS06305  TCRTETA          51    4e-09  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 51.4 bits (123), Expect = 4e-09
Identities = 51/289 (17%), Positives = 101/289 (34%), Gaps = 44/289 (15%)

Query: 77 FIFRPLGGLFFGHLGDKYGRQKVLAITVIIMSISTFGIGLIPSYETIGLWAPILLLIVKI 136
F P+ G L D++GR+ VL +++ ++ + P LW +L I +I
Sbjct: 57 FACAPVLG----ALSDRFGRRPVLLVSLAGAAVDYAIMATAPF-----LW---VLYIGRI 104

Query: 137 VQGFSIGGEYSGAAIFVAEYSPDRKR----GFMGSWLDFGSIAGFVLGAATVALITHAVG 192
V G + G + A ++A+ + +R GFM + FG +AG VLG
Sbjct: 105 VAGIT-GATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGL---------- 153

Query: 193 ETRFAEWGWRIPFFLALPLGIIGLYLRNRLEETPVYQQHSEQQAQKSKPQKFSFKEIFVK 252
+ PFF A L + L + + + P +
Sbjct: 154 ---MGGFSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMT 210

Query: 253 HKRSLLVC---IGLVISTNVTYYMLLTYLPSYFSHNLGYSEAHGALIIIAVMVGMLFVQP 309
+L+ + LV +++ + + + A + Q
Sbjct: 211 VVAALMAVFFIMQLVGQVPAALWVIFG------EDRFHWDATTIGISLAAFGILHSLAQA 264

Query: 310 VI-GYLSDKFGRRPFIFIGSFSLIFLSYPAFVLLNSGVNYQIFIGLLIL 357
+I G ++ + G R + +G + ++LL + +++L
Sbjct: 265 MITGPVAARLGERRALMLG----MIADGTGYILLAFATRGWMAFPIMVL 309



Score = 41.0 bits (96), Expect = 8e-06
Identities = 26/126 (20%), Positives = 54/126 (42%), Gaps = 12/126 (9%)

Query: 261 IGLVISTNVTYYMLLTYLPSYFSHNLGYSEAHGALIIIAVMVGMLFVQPVIGYLSDKFGR 320
IGL++ V +L + S AH +++ + PV+G LSD+FGR
Sbjct: 21 IGLIMP--VLPGLLRDLVHS------NDVTAHYGILLALYALMQFACAPVLGALSDRFGR 72

Query: 321 RPFIFIGSFSLIFLSYPAFVLLNSGVNYQIFIGLLILALSLNMSIGVMASTLPALFPTEI 380
RP + + SL + ++ + + ++IG ++ ++ + V + + + +
Sbjct: 73 RPVLLV---SLAGAAVDYAIMATAPFLWVLYIGRIVAGIT-GATGAVAGAYIADITDGDE 128

Query: 381 RYSALG 386
R G
Sbjct: 129 RARHFG 134


10  A4U85_RS06800  A4U85_RS06875  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias    %GCBias  Product
A4U85_RS06800       2       15  -2.093797  hypothetical protein
A4U85_RS06805       3       14  -1.207626  ion transporter
A4U85_RS06810       3       15  -1.246762  MFS transporter
A4U85_RS06815       2       17  -2.908750  hypothetical protein
A4U85_RS06820       1       17  -3.125611  DUF4198 domain-containing protein
A4U85_RS06825       1       16  -2.624962  LysR family transcriptional regulator
A4U85_RS06830       0       17  -3.662197  DUF441 domain-containing protein
A4U85_RS06835       3       16  -3.816553  hypothetical protein
A4U85_RS06840      -1       15  -2.120956  rRNA pseudouridine synthase
A4U85_RS06845       0       15  -2.049729  GNAT family N-acetyltransferase
A4U85_RS06850       0       14  -0.841363  type II toxin-antitoxin system RelB/DinJ family
A4U85_RS06855      -1       14  -1.306613  pseudouridylate synthase
A4U85_RS06860      -1       12  -1.148306  tRNA
A4U85_RS06865      -1       13  -1.847100  DNA mismatch repair endonuclease MutL
A4U85_RS06870       1       15  -2.735108  tRNA (adenosine(37)-N6)-dimethylallyltransferase
A4U85_RS06875       2       16  -2.618964  RNA chaperone Hfq
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS06810  TCRTETA          56    9e-11  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 56.0 bits (135), Expect = 9e-11
Identities = 70/391 (17%), Positives = 136/391 (34%), Gaps = 32/391 (8%)

Query: 15 SLFLAIFSLAVGGFCIGTTEFVAMGLIQEIAHNLKITVPEAGHFISAYALGVVIGAPIIA 74
L + + ++A+ IG V GL++++ H+ + G ++ YAL AP++
Sbjct: 6 PLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDV-TAHYGILLALYALMQFACAPVLG 64

Query: 75 ILGAKVPRKTLLLCLMLFYGIANACTALAHTPETVLVSRFIAGLPHGAYFGVGALVAAEL 134
L + R+ +LL + + A A A + + R +AG+ GA V A++
Sbjct: 65 ALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGIT-GATGAVAGAYIADI 123

Query: 135 AGPSRRASAVAQMMMGLTVATVIGVPLATWLGQHFGWRAGFEFSATIAFFTLIAVACFVP 194
RA M V G L +G F A F +A + + +P
Sbjct: 124 TDGDERARHFGFMSACFGFGMVAGPVLGGLMGG-FSPHAPFFAAAALNGLNFLTGCFLLP 182

Query: 195 NIPVQATAS-----IKTELAGLKNINMWLTLAVGAIGFGGMFSVYSYVSPILTEYT--KV 247
+ LA + +A F M V + + + +
Sbjct: 183 E-SHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRF 241

Query: 248 NIQIVPIALALWGIGMVIGGLAAGWLADKNL-----NKTIVGVLISSAIAFVVASFLMSN 302
+ I ++L G ++ LA + + ++ +I+ +++ +F
Sbjct: 242 HWDATTIGISLAAFG-ILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATR- 299

Query: 303 SYSAIGSLFLIGLTVMGLGG----ALQTRL-MDVAGDAQTLAASLNHSAFNLANALGAFL 357
G + + ++ GG ALQ L V + Q + +L + +G L
Sbjct: 300 -----GWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLL 354

Query: 358 GGWVLSHQMGWIAPIWVGFVLSLGGLIILLI 388
+ + W G+ G + LL
Sbjct: 355 FTAIYAAS----ITTWNGWAWIAGAALYLLC 381


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS06845  SACTRNSFRASE     55    2e-12  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 55.0 bits (132), Expect = 2e-12
Identities = 22/92 (23%), Positives = 42/92 (45%), Gaps = 5/92 (5%)

Query: 43 ENKESVFFIHIKDEKITGFVLLYLGFSSVACSTYYILDDVYVTPLFRRQGSAKQLIDTAI 102
E + F++ + G + + + Y +++D+ V +R++G L+ AI
Sbjct: 61 EEEGKAAFLYYLENNCIGRIKI-----RSNWNGYALIEDIAVAKDYRKKGVGTALLHKAI 115

Query: 103 LFAKQENALRISLETQSNNHESHRLYEKMGFI 134
+AK+ + + LETQ N + Y K FI
Sbjct: 116 EWAKENHFCGLMLETQDINISACHFYAKHHFI 147


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS06875  cloacin          37    2e-05  Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 37.0 bits (85), Expect = 2e-05
Identities = 22/73 (30%), Positives = 26/73 (35%)

Query: 76 GAQGAGFPAQGGSQGGFGGQGAGFGGAQGAGFGGQGGFGGQGGFGGQGGFGGQGGFGGQG 135
G G GG G G G G + G+G+ + G G G GG G G G
Sbjct: 8 GHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGG 67

Query: 136 GFGGQGGFGGQGG 148
GG G G
Sbjct: 68 NGNSGGGSGTGGN 80



Score = 28.9 bits (64), Expect = 0.012
Identities = 23/74 (31%), Positives = 26/74 (35%), Gaps = 1/74 (1%)

Query: 88 SQGGFGGQGAGFGGAQGAGFGGQGGFGGQGGFGGQGGFGGQGG-FGGQGGFGGQGGFGGQ 146
S G G G G GG G G GG G+ + +GG G G G G
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSG 61

Query: 147 GGFGGHQGGFDNDS 160
G GG G S
Sbjct: 62 HGNGGGNGNSGGGS 75


11  A4U85_RS06980  A4U85_RS07090  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias    %GCBias  Product
A4U85_RS06980       2       13  -0.551539  spore coat protein U domain-containing protein
A4U85_RS06985       0       15  -0.934609  molecular chaperone
A4U85_RS06990       0       15  -0.225675  fimbrial biogenesis outer membrane usher
A4U85_RS06995      -1       14  -0.447276  spore coat protein U domain-containing protein
A4U85_RS07000       2       15  -0.933479  glutathione S-transferase family protein
A4U85_RS07005       2       18  -1.703292  SDR family NAD(P)-dependent oxidoreductase
A4U85_RS07010      -1       17  -0.995055  SDR family NAD(P)-dependent oxidoreductase
A4U85_RS07015      -1       14  -1.262703  chorismate mutase
A4U85_RS07020      -1       15  -1.141428  Lrp/AsnC family transcriptional regulator
A4U85_RS07030      -1       14  -1.994876  hypothetical protein
A4U85_RS07035      -2       14  -1.939495  TetR/AcrR family transcriptional regulator
A4U85_RS07040      -2       12  -2.382467  TonB-dependent receptor
A4U85_RS07045       1       17  -4.778580  mechanosensitive ion channel
A4U85_RS07050       1       15  -3.382282  hypothetical protein
A4U85_RS07055       2       14  -3.097513  hypothetical protein
A4U85_RS07060       2       15  -2.392721  hypothetical protein
A4U85_RS07065       2       16  -3.067756  hypothetical protein
A4U85_RS07070       2       14  -2.450513  N-acetyltransferase
A4U85_RS07075       2       13  -2.229884  TonB-dependent siderophore receptor
A4U85_RS07080       2       15  -3.500847  OmpW family protein
A4U85_RS19160       5       18  -5.078631  hypothetical protein
A4U85_RS07085       3       18  -5.370708  hypothetical protein
A4U85_RS07090       3       17  -4.184937  ABC-F family ATP-binding cassette
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS06990  PF00577         287    2e-86  Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 287 bits (736), Expect = 2e-86
Identities = 151/800 (18%), Positives = 274/800 (34%), Gaps = 78/800 (9%)

Query: 62 LNISINSNP--SED--LVAVRQDQDKKLYIRARDLKTLRLKMDDSISDSQW------ICL 111
++I +N+ + D +Q + L ++ L +
Sbjct: 80 VDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLT 139

Query: 112 NELKDIRFKYLENEQSLNLQVPPHMMTGYSVDLKGQQITSPQLLKIKPLNAAILNYSLY- 170
+ + D + +Q LNL +P M+ + P L +NA +LNY+
Sbjct: 140 SMIHDATAQLDVGQQRLNLTIPQAFMS------NRARGYIPPELWDPGINAGLLNYNFSG 193

Query: 171 HTITNDENVFSGSAEGIFNSAIGNFSSGVL-------YNGNDENSYSHEKWVRLESKWQY 223
+++ N S A S + N + L YN +D +S S KW + + +
Sbjct: 194 NSVQNRIGGNSHYAYLNLQSGL-NIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLER 252

Query: 224 VDPEKIRIYTLGDFISNSSDWGSSVRLAGFQWSSAYTQRGDIVTSALPQFSGSAALPSTL 283
TLGD + + + G Q +S D P G A + +
Sbjct: 253 DIIPLRSRLTLGDGYTQGDIF-DGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQV 311

Query: 284 DLYVNQQKIYSGLVPSGPFDIKQLPFISG-NEVTLVTTDATGRQSITKKPYYFSSKILAK 342
+ N IY+ VP GPF I + ++ + +A G I PY + +
Sbjct: 312 TIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQRE 371

Query: 343 GINEFSVDVGVPRYNYGLYSNDYDDATFASGAIRYGYSNSLTLSGGVEASTDGLSNIGTG 402
G +S+ G R F + +G T+ GG + + D G
Sbjct: 372 GHTRYSITAGEYRSGNAQQEKPR----FFQSTLLHGLPAGWTIYGGTQLA-DRYRAFNFG 426

Query: 403 FAKNLFGIGVINADIAASQYKDENGYSALLGLEGRISKNISFN--------TSYRKIFDN 454
KN+ +G ++ D+ + + S G R N S N YR
Sbjct: 427 IGKNMGALGALSVDMTQANSTLPDD-SQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSG 485

Query: 455 YFDLARVSQVRY------LKDNQSDAESQNYLNYSALADEIFRAGINYNFYAG-YGA-YL 506
YF+ A + R +D + + Y+ ++ + + G YL
Sbjct: 486 YFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYL 545

Query: 507 GYNQIKY-----SDNQYKLLSANLSGSLNK-NWGFYTSAYKD-YENHKDYGIYFAL---- 555
+ Y D Q+ A L+ + NW S K+ ++ +D + +
Sbjct: 546 SGSHQTYWGTSNVDEQF---QAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPF 602

Query: 556 -------RYTPSNKFNAITSVSSDS-GRLSYRQEIFGLSDPQIGSFGWG---GYVERDQD 604
+ +A S+S D GR++ ++G + + + + GY
Sbjct: 603 SHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYG-TLLEDNNLSYSVQTGYAGGGDG 661

Query: 605 NHDNNASIYASYRARAAYLAGRYNRIGDNDQVALSATGSLVAAAGRLFAANEIGDGYAVV 664
N + +YR Y+ D Q+ +G ++A A + + D +V
Sbjct: 662 NSGSTGYATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLV 721

Query: 665 TNAGPQSQILNGGVNLGFTDKSGRFLIPSLMPYQENHIYLDPSFLPLNWSVNSTEQKTVV 724
G + + TD G ++P Y+EN + LD + L N +++ V
Sbjct: 722 KAPGAKDAKVENQ-TGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVP 780

Query: 725 GYRQGTMIDFGAHQVISGLVKLVDKNNSPLLPGYSVQ-INGQQDGVVGYDGEVFIPNLLK 783
+F A I L+ + NN PL G V + Q G+V +G+V++ +
Sbjct: 781 TRGAIVRAEFKARVGIKLLM-TLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPL 839

Query: 784 QNKLVVDLLDHGSCQVDFTY 803
K+ V + + Y
Sbjct: 840 AGKVQVKWGEEENAHCVANY 859


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS07000  2FE2SRDCTASE     31    0.003  Ferric iron reductase signature.
		>2FE2SRDCTASE#Ferric iron reductase signature.

Length = 262

Score = 31.2 bits (70), Expect = 0.003
Identities = 11/35 (31%), Positives = 17/35 (48%), Gaps = 3/35 (8%)

Query: 64 STRIARYLDETFPDTPRLYPEDANQKALAELWEDW 98
S+ +A Y D + + P + E+ K L LW W
Sbjct: 67 SSLLAVYSDHIYRNQPMMIREN---KPLISLWAQW 98


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS07005  DHBDHDRGNASE     52    3e-10  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 52.4 bits (125), Expect = 3e-10
Identities = 35/163 (21%), Positives = 68/163 (41%), Gaps = 10/163 (6%)

Query: 18 VGASQGIGAAVCHRFAKEGLKVYVAGRTFQKIEAVAAEIHANVGEAVAFRLDAEDINQVQ 77
GA+QGIG AV A +G + +K+E V + + A A AF D D +
Sbjct: 14 TGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDSAAID 73

Query: 78 ALFDTIISQNERITAVIHNVGGNIPSIFLRSPL-SFFTQMWQSTF----LSAYLVAQSCL 132
+ I + I ++ N+ + + S + W++TF + ++S
Sbjct: 74 EITARIEREMGPIDILV-----NVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 133 KIFKEQNHGTLIFTGASASLRGKPFFAAFTMGKSALRTYALNL 175
K ++ G+++ G++ + + AA+ K+A + L
Sbjct: 129 KYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCL 171


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS07035  HTHTETR          50    3e-10  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 50.4 bits (120), Expect = 3e-10
Identities = 15/65 (23%), Positives = 23/65 (35%)

Query: 5 EASFRALRMLHTARDLFKQYGFHKVGVDRIIAESKITKATFYNYFHSKERLIEMCLTFQK 64
EA +L A LF Q G + I + +T+ Y +F K L +
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 65 DGLKE 69
+ E
Sbjct: 68 SNIGE 72


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS07070  SACTRNSFRASE     38    4e-06  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 38.4 bits (89), Expect = 4e-06
Identities = 18/87 (20%), Positives = 38/87 (43%), Gaps = 17/87 (19%)

Query: 52 VAVEDNTIVGHVAISPVQISSGEKNWYGLG---PISVAPNKQGQGIGSLLMNSSLEKLKK 108
+ +N +G + I NW G I+VA + + +G+G+ L++ ++E K+
Sbjct: 69 LYYLENNCIGRIKIR--------SNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKE 120

Query: 109 LGAKGCVL------LGDPKYYSRFGFK 129
G +L + +Y++ F
Sbjct: 121 NHFCGLMLETQDINISACHFYAKHHFI 147


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS07080  OUTRMMBRANEA     30    0.014  Outer membrane protein A signature.
		>OUTRMMBRANEA#Outer membrane protein A signature.

Length = 346

Score = 29.9 bits (67), Expect = 0.014
Identities = 34/158 (21%), Positives = 52/158 (32%), Gaps = 27/158 (17%)

Query: 205 PAIEAQYQFGKSGVNKFRPYLGVGLMYAHFNDIKLNDEIRSDLISA---------GHMIQ 255
P E Q G G + PY+G + Y + + + A G+ I
Sbjct: 50 PTHENQLGAGAFGGYQVNPYVGFEMGYDWLGRMPYKGSVENGAYKAQGVQLTAKLGYPIT 109

Query: 256 NVLD--GKAGAALDRKESSGNMVVKVDADDAIAPIFTAGFTYDFNDSWYTVASISYAKLN 313
+ LD + G + R ++ N V + D ++P+F G Y T Y N
Sbjct: 110 DDLDIYTRLGGMVWRADTKSN-VYGKNHDTGVSPVFAGGVEYAITPEIATRL--EYQWTN 166

Query: 314 NRTQIDVINQNTGARLIHGSTKVDIDPIITYLGVGYRF 351
N I ++ LGV YRF
Sbjct: 167 NIGDAHTIGTRPDNGMLS-------------LGVSYRF 191


12  A4U85_RS07480  A4U85_RS07595  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias    %GCBias  Product
A4U85_RS07480       0       15  -3.175533  RIP metalloprotease RseP
A4U85_RS07485       0       16  -3.520475  outer membrane protein assembly factor BamA
A4U85_RS07490       1       18  -3.851246  OmpH family outer membrane protein
A4U85_RS07495      -1       17  -2.771670  UDP-3-O-(3-hydroxymyristoyl)glucosamine
A4U85_RS07500      -2       18  -2.483344  3-hydroxyacyl-ACP dehydratase FabZ
A4U85_RS07505       0       18  -2.733691  acyl-ACP--UDP-N-acetylglucosamine
A4U85_RS07510       1       20  -3.356751  tetratricopeptide repeat protein
A4U85_RS07520       0       20  -3.484499  regulatory protein RecX
A4U85_RS07525      -1       17  -3.339062  recombinase RecA
A4U85_RS07530       3       19  -4.871642  RNA-binding protein
A4U85_RS07535       5       20  -6.492314  HAD-IA family hydrolase
A4U85_RS07540       3       17  -5.192028  hypothetical protein
A4U85_RS07545       1       14  -3.677047  hypothetical protein
A4U85_RS07550       1       15  -3.642155  hypothetical protein
A4U85_RS07555       1       15  -3.134289  GNAT family N-acetyltransferase
A4U85_RS07560       0       15  -2.771204  Lrp/AsnC family transcriptional regulator
A4U85_RS07565       1       14  -2.642345  kynureninase
A4U85_RS07570       1       13  -2.819397  amino acid permease
A4U85_RS07575       2       15  -3.189691  alpha/beta hydrolase
A4U85_RS07580       1       13  -2.689862  S8 family peptidase
A4U85_RS07585       1       12  -3.017665  SulP family inorganic anion transporter
A4U85_RS07590       1       12  -3.503396  SOS response-associated peptidase family
A4U85_RS07595       1       13  -3.210100  PQQ-dependent sugar dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS07580  SUBTILISIN      202    3e-64  Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 202 bits (515), Expect = 3e-64
Identities = 80/284 (28%), Positives = 125/284 (44%), Gaps = 24/284 (8%)

Query: 116 TTQSNPDWGLDRIDQKALPLNSTYSYLQTGSGTTAYIVDTGILSSHQEFSGRVLSGYTAI 175
+ G++ I A+ + G G ++DTG + H + R++ G
Sbjct: 17 QQVNEIPRGVEMIQAPAVWNQTR------GRGVKVAVLDTGCDADHPDLKARIIGGRNFT 70

Query: 176 SDGNG----TTDCNGHGTHVAGTVGGT-----TYGVAKNVNLVPIRILGCDGSGASSNVI 226
D G D NGHGTHVAGT+ T GVA +L+ I++L GSG +I
Sbjct: 71 DDDEGDPEIFKDYNGHGTHVAGTIAATENENGVVGVAPEADLLIIKVLNKQGSGQYDWII 130

Query: 227 AGLDWILKNGKKPAVVNMSLGGATSSS-LDSAVENLYNNGYVMVVAAGNSNTDA----CT 281
G+ + ++ +++MSLGG L AV+ + +++ AAGN
Sbjct: 131 QGIYYAIEQKVD--IISMSLGGPEDVPELHEAVKKAVASQILVMCAAGNEGDGDDRTDEL 188

Query: 282 SSPARVSKAITVAATDNTDTRASYSNYGSCVDIFAPGSQINSSWIGSNTATKILNGTSMA 341
P ++ I+V A + + +SN + VD+ APG I S+ G AT +GTSMA
Sbjct: 189 GYPGCYNEVISVGAINFDRHASEFSNSNNEVDLVAPGEDILSTVPGGKYAT--FSGTSMA 246

Query: 342 TPHVAGVVAEMLQSTPTASPQTISTNLLNQASSNVVKNPSGSPN 385
TPHVAG +A + Q + + ++ L SP
Sbjct: 247 TPHVAGALALIKQLANASFERDLTEPELYAQLIKRTIPLGNSPK 290


13  A4U85_RS07900  A4U85_RS07980  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias    %GCBias  Product
A4U85_RS07900      -1       19   4.061853  DUF421 domain-containing protein
A4U85_RS07905       1       19   4.944048  cell division protein FtsB
A4U85_RS07910       1       17   4.509360  2-C-methyl-D-erythritol 4-phosphate
A4U85_RS07915       1       18   5.108839  3-oxoacid CoA-transferase subunit A
A4U85_RS07920       2       16   4.554085  CoA transferase subunit B
A4U85_RS07925       1       15   3.746619  3-oxoadipyl-CoA thiolase
A4U85_RS07930       0       14   2.586426  3-carboxy-cis,cis-muconate cycloisomerase
A4U85_RS07935      -1       15   1.700638  3-oxoadipate enol-lactonase
A4U85_RS07940       0       15   1.655460  MFS transporter
A4U85_RS07945       0       14   0.986837  4-carboxymuconolactone decarboxylase
A4U85_RS07950       1       14   1.477924  protocatechuate 3,4-dioxygenase subunit beta
A4U85_RS07955       1       14   1.093386  protocatechuate 3,4-dioxygenase subunit alpha
A4U85_RS07960       1       12   0.822022  type I 3-dehydroquinate dehydratase
A4U85_RS07965       2       13   0.605296  right-handed parallel beta-helix
A4U85_RS07970       1       13  -0.138016  carbohydrate porin
A4U85_RS07975       1       14  -0.781531  glucose/quinate/shikimate family membrane-bound
A4U85_RS07980       1       15  -3.626943  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS07935  ALARACEMASE      29    0.025  Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 28.6 bits (64), Expect = 0.025
Identities = 7/40 (17%), Positives = 15/40 (37%), Gaps = 1/40 (2%)

Query: 221 AEFMQKAINNSQLAKLE-ASHLSNIEQPQRFTQELTRFIQ 259
Q+ + + ++ SH + E P + + R Q
Sbjct: 139 LTVWQQLRAMANVGEMTLMSHFAEAEHPDGISGAMARIEQ 178


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS07940  TCRTETA          46    3e-07  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 45.6 bits (108), Expect = 3e-07
Identities = 38/179 (21%), Positives = 63/179 (35%), Gaps = 5/179 (2%)

Query: 33 IICFLIIFTDGIDTAAMGFIAPALAQDWGVDRSQ---LGPVMSAALGGMIIGALVSGPTA 89
I+ + D + + + P L +D G +++ A V G +
Sbjct: 8 IVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALS 67

Query: 90 DRFGRKIVLAFSMLVFGGFTLASAYATNLDSLVVLRFLTGIGLGAAMPNATTLFSEYCPT 149
DRFGR+ VL S+ A A L L + R + GI GA A ++
Sbjct: 68 DRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGIT-GATGAVAGAYIADITDG 126

Query: 150 RIRSLLVTCMFCGYNLGMATGGFISSWLIPTYGWHSLFLLGGWSPLILMVLVILVLPES 208
R+ M + GM G + + + H+ F + + +LPES
Sbjct: 127 DERARHFGFMSACFGFGMVAGPVLGGLM-GGFSPHAPFFAAAALNGLNFLTGCFLLPES 184



Score = 29.0 bits (65), Expect = 0.041
Identities = 33/132 (25%), Positives = 53/132 (40%), Gaps = 11/132 (8%)

Query: 289 LPTLMRETGASMERAAFIG---GLFQFGGVVSALFIGWAMDKFNPNRVIAIFYFAAGLFA 345
LP L+R+ S + A G L+ A +G D+F V+ + L
Sbjct: 28 LPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLV-----SLAG 82

Query: 346 IAVGQSL-GNSTLLAVLVLCAGIA-INGAQSSMP-ALSARFYPTQCRATGVSWMTGIGRF 402
AV ++ + L VL + +A I GA ++ A A RA +M+ F
Sbjct: 83 AAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDGDERARHFGFMSACFGF 142

Query: 403 GAVFGAWIGAVL 414
G V G +G ++
Sbjct: 143 GMVAGPVLGGLM 154


14  A4U85_RS08280  A4U85_RS08330  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias    %GCBias  Product
A4U85_RS08280       2       15  -3.010358  bile acid:sodium symporter family protein
A4U85_RS19380       2       16  -3.347267  hypothetical protein
A4U85_RS08290       3       16  -3.692888  TetR/AcrR family transcriptional regulator
A4U85_RS08295       3       15  -3.340181  ShlB/FhaC/HecB family hemolysin
A4U85_RS08300       2       16  -3.580793  DUF637 domain-containing protein
A4U85_RS08305       1       11  -3.631559  DUF596 domain-containing protein
A4U85_RS08310       1       11  -2.919670  hypothetical protein
A4U85_RS19180       0       11  -1.978238  hypothetical protein
A4U85_RS08315       1       10  -1.228726  MFS transporter
A4U85_RS08320       1       13  -0.709929  amidohydrolase family protein
A4U85_RS08325       2       15  -0.507809  SLC13 family permease
A4U85_RS08330       3       17  -0.131364  GntR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS08290  HTHTETR          54    2e-11  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 53.9 bits (129), Expect = 2e-11
Identities = 16/65 (24%), Positives = 28/65 (43%)

Query: 5 KTSSKKLQVIHTAIRLFVTYGFHTTGVDLIIKEAKITKATFYNYFHSKERLIEMCIAFQK 64
+ + ++ A+RLF G +T + I K A +T+ Y +F K L +
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 65 SLLKE 69
S + E
Sbjct: 68 SNIGE 72


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS08300  PF05860          66    1e-14  haemagglutination activity domain.
		>PF05860#haemagglutination activity domain.

Length = 117

Score = 65.6 bits (160), Expect = 1e-14
Identities = 19/144 (13%), Positives = 45/144 (31%), Gaps = 29/144 (20%)

Query: 74 AGIVADSAANAANRAVIGAGKNSAGTVVPVVNIQTPK-NGISHNIYKQFDVLAEGAVLNN 132
A I D+ + ++ T + + H+ +++F V G N
Sbjct: 1 AQITPDTTLPINSNITTEGNT-------RIIERGTQAGSNLFHS-FQEFSVPTSGTAFFN 52

Query: 133 SRQGATTKTVGNVAANPFLATGEARVILNEVNSSAASRFEGNLEVAGQMADVIIANPSGI 192
+ + I++ V + S +G + A++ + NP+GI
Sbjct: 53 N-------------------PTNIQNIISRVTGGSVSNIDGLIRANAT-ANLFLINPNGI 92

Query: 193 SIKGGGFINANKAIFTTGKPQLNA 216
++ + + +L
Sbjct: 93 IFGQNARLDIGGSFVGSTANRLKF 116


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS08315  TCRTETB          34    0.001  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 33.7 bits (77), Expect = 0.001
Identities = 25/129 (19%), Positives = 51/129 (39%), Gaps = 3/129 (2%)

Query: 275 LWMPQILKAFH-LTAMQTGLLNMIPFGLAAAFM-IVWGVHADKSGN-KSLNTAIPLFVTS 331
+P ++K H L+ + G + + P ++ + G+ D+ G LN + S
Sbjct: 277 SMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVS 336

Query: 332 FGLLLTIFTSSLTLSLLLFSLVLMGNYAIKGPFWALVSERLPPTLVAVGIAAVNTIAHIG 391
F + ++ ++ VL G K +VS L G++ +N + +
Sbjct: 337 FLTASFLLETTSWFMTIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLS 396

Query: 392 TGLMNSIMG 400
G +I+G
Sbjct: 397 EGTGIAIVG 405


15  A4U85_RS08470  A4U85_RS08520  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias    %GCBias  Product
A4U85_RS08470       0       17  -3.808539  NUDIX hydrolase
A4U85_RS08475       3       17  -4.364081  methylenetetrahydrofolate reductase C-terminal
A4U85_RS08480       8       20  -6.693675  hypothetical protein
A4U85_RS08485       7       20  -4.623892  hypothetical protein
A4U85_RS08490       7       21  -3.257387  hypothetical protein
A4U85_RS08495       8       19  -3.870708  hypothetical protein
A4U85_RS08500       8       21  -3.138064  LlaJI family restriction endonuclease
A4U85_RS08505       8       20  -3.447661  AAA family ATPase
A4U85_RS08510       9       22  -3.763346  DNA (cytosine-5-)-methyltransferase
A4U85_RS08515       8       19  -4.784283  DNA mismatch endonuclease Vsr
A4U85_RS08520       5       18  -4.384098  DUF262 domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS08480  PF05272          27    0.008  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 27.3 bits (60), Expect = 0.008
Identities = 8/22 (36%), Positives = 12/22 (54%)

Query: 51 YFRGQSWPEMVREDRSQSDARS 72
Y R Q WP ++ ED+ A +
Sbjct: 845 YMRPQVWPPVIAEDKEADQAHA 866


16  A4U85_RS08580  A4U85_RS08615  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias    %GCBias  Product
A4U85_RS08580       4       22  -7.907426  hypothetical protein
A4U85_RS08585       4       24  -8.475007  HD domain-containing protein
A4U85_RS08590       5       24  -9.052809  helix-turn-helix domain-containing protein
A4U85_RS08595       5       25  -8.944365  TetR/AcrR family transcriptional regulator
A4U85_RS08600       7       27  -9.442987  hypothetical protein
A4U85_RS08605       7       25  -8.677099  DEAD/DEAH box helicase family protein
A4U85_RS08610       2       17  -4.914534  GNAT family N-acetyltransferase
A4U85_RS08615       1       16  -3.072300  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS08580  PF07269          30    0.001  Transport secretion system IV, VirB7 protein
		>PF07269#Transport secretion system IV, VirB7 protein

Length = 55

Score = 29.6 bits (66), Expect = 0.001
Identities = 18/61 (29%), Positives = 27/61 (44%), Gaps = 7/61 (11%)

Query: 1 MKRCLLVLLLGLGLAACNDNDHDDQVSTEKPALAPSLDVGTYIISTETDQELPMAGKYYS 60
MK CLL L + L C N D+ ++ K + P L+VG + +D MA +
Sbjct: 1 MKYCLLCLA--IVLTGCQTN---DKPASCKGPIFP-LNVGRW-QPAPSDLHPGMADGQHE 53

Query: 61 G 61

Sbjct: 54 R 54


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS08595  HTHTETR          48    2e-09  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 48.5 bits (115), Expect = 2e-09
Identities = 12/77 (15%), Positives = 29/77 (37%), Gaps = 1/77 (1%)

Query: 7 PTRALKVVNTSIELFHRRGFHIAGVDRLVKESEITKATFYNYFHSKERLIEICLMVQKER 66
TR +++ ++ LF ++G + + K + +T+ Y +F K L + +
Sbjct: 11 ETRQ-HILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESN 69

Query: 67 LQEKVVAMVEYDHDTNA 83
+ E +
Sbjct: 70 IGELELEYQAKFPGDPL 86


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS08610  SACTRNSFRASE     43    1e-07  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 42.6 bits (100), Expect = 1e-07
Identities = 32/152 (21%), Positives = 65/152 (42%), Gaps = 21/152 (13%)

Query: 33 IGFLAPIEKNEVVNYWREVNSK---------------LANGNNRLWIAIQQGTIIGSVQL 77
G + P +N V Y E SK + ++ + IG +++
Sbjct: 23 FGRMIPAFENGVWTYTEERFSKPYFKQYEDDDMDVSYVEEEGKAAFLYYLENNCIGRIKI 82

Query: 78 SLVSKKNGVHRAEVEKLMVLTTARKQGIATLLMNELENFAREKSLRLLVLDTREGDVSEL 137
N A +E + V RK+G+ T L+++ +A+E L+L+T++ ++S
Sbjct: 83 R----SNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISAC 138

Query: 138 -LYSKIGFVRVGVIPSFALSSNGSYDGTAIYY 168
Y+K F +G + + S+ + + AI++
Sbjct: 139 HFYAKHHF-IIGAVDTMLYSNFPTANEIAIFW 169


17  A4U85_RS08685  A4U85_RS08735  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias    %GCBias  Product
A4U85_RS08685       2       15  -1.795467  hypothetical protein
A4U85_RS08690       2       15  -0.591918  molecular chaperone
A4U85_RS08695       1       10  -0.896660  fimbrial biogenesis outer membrane usher
A4U85_RS08700       1       11  -1.242861  fimbrial protein
A4U85_RS08705       2       16  -2.578216  DUF4882 family protein
A4U85_RS08710       2       17  -3.204983  S-(hydroxymethyl)glutathione dehydrogenase/class
A4U85_RS08715       1       16  -2.898321  hypothetical protein
A4U85_RS08720       3       18  -4.139166  catalase family protein
A4U85_RS08725       4       19  -5.288780  hypothetical protein
A4U85_RS08730       5       20  -5.796343  hypothetical protein
A4U85_RS08735       1       19  -3.971336  phospholipase D family protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS08700  PF00577         760      0.0  Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 760 bits (1965), Expect = 0.0
Identities = 257/852 (30%), Positives = 407/852 (47%), Gaps = 47/852 (5%)

Query: 37 EAAASAPVEAEFDSAFLIGDAQ-KVDISRFKYGNPVLPGEYNVDVYVNGQWFGKRRMIFK 95
A + E F+ FL D Q D+SRF+ G + PG Y VD+Y+N + R + F
Sbjct: 38 AQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNNGYMATRDVTFN 97

Query: 96 ALDPNQNAVTCFTGMNLLEYGVKQEILTKHAPLQKENNSCYKIEEWVENAFYEFDTSRLR 155
D Q V C T L G+ ++ L +++C + + +A + D + R
Sbjct: 98 TGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLA--DDACVPLTSMIHDATAQLDVGQQR 155

Query: 156 VDISIPQVALQKNAQGYVDPSVWDRGINAGFLSYSGSAYKTFNQSGDRSETTNAFMGVTA 215
++++IPQ + A+GY+ P +WD GINAG L+Y+ S N+ G S A++ + +
Sbjct: 156 LNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGNS--HYAYLNLQS 213

Query: 216 GLNLAGWQLRHNGQWQWQDTPAENQSKSDYQETSTYLQRAFPKYRGVLTLGDSFTNGEVF 275
GLN+ W+LR N W + + + + SK+ +Q +T+L+R R LTLGD +T G++F
Sbjct: 214 GLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLGDGYTQGDIF 273

Query: 276 DSYGYRGIDFSSDDRMLPNSMLGYAPRIRGNAKTNAKVEVRQQGQLIYQTTVAPGNFEIN 335
D +RG +SDD MLP+S G+AP I G A+ A+V ++Q G IY +TV PG F IN
Sbjct: 274 DGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNSTVPPGPFTIN 333

Query: 336 DLYPTGFGGEIEVSVIEANGEIQKFSVPYASVVQMLRPGMNRYSLTVGQFRDQDIDLD-P 394
D+Y G G+++V++ EA+G Q F+VPY+SV + R G RYS+T G++R + + P
Sbjct: 334 DIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYRSGNAQQEKP 393

Query: 395 WIIQGKYQQGINNYLTGYTGIQASENYAAILLGAAVAT-PIGAIAFDVTHSEAEFEKQAS 453
Q G+ T Y G Q ++ Y A G +GA++ D+T + + +
Sbjct: 394 RFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQANSTLPDDSQ 453

Query: 454 QSGQSFRLSYSKLITPTNTNLTLAAYRYSTENFYKLRDALLIRDLEEKGVNTYAAG---- 509
GQS R Y+K + + TN+ L YRYST ++ D R
Sbjct: 454 HDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIETQDGVIQVKP 513

Query: 510 ----------RQRSEFQITLNQGLPEGWGNFYVVGSWVDYWNRSESTKQYQIGYSNNYHG 559
+R + Q+T+ Q L Y+ GS YW S +Q+Q G + +
Sbjct: 514 KFTDYYNLAYNKRGKLQLTVTQQLGR-TSTLYLSGSHQTYWGTSNVDEQFQAGLNTAFED 572

Query: 560 LTYGLSAINRKVEYGSNDASHDTEYLMTLSFPINFKKN----------SVNVNVTASEDS 609
+ + LS K + D + ++ P + S + +++ +
Sbjct: 573 INWTLSYSLTK---NAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLNG 629

Query: 610 RT---VGASGMVG--DRFSYGASVSHQD----YANPTFNANGRYRTNYATVGGSYSIADS 660
R G G + + SY + + T A YR Y YS +D
Sbjct: 630 RMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIGYSHSDD 689

Query: 661 YQQAMVSLSGSVVAHSDGILFGPEQGQTMVLVHAPDAAGAKVNNTVGLSVNKAGYAVVPY 720
+Q +SG V+AH++G+ G T+VLV AP A AKV N G+ + GYAV+PY
Sbjct: 690 IKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDWRGYAVLPY 749

Query: 721 VTPYRLNDITLDPQEMSSEVELEETSQRIAPFAGAIAKVDFATKTGYAVYINSKTADGNS 780
T YR N + LD ++ V+L+ + P GAI + +F + G + + T +
Sbjct: 750 ATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIKLLMTL-THNNKP 808

Query: 781 LPFAAQVFNQKDEAVGIVAQGSMIYLRTPLAQDRLYVKWGDESNERCSVEYNISNELRNK 840
LPF A V ++ ++ GIVA +YL ++ VKWG+E N C Y + E ++
Sbjct: 809 LPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCVANYQLPPE--SQ 866

Query: 841 QQSIVMTEAVCK 852
QQ + A C+
Sbjct: 867 QQLLTQLSAECR 878


18  A4U85_RS08890  A4U85_RS08945  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS08890  1       17       -3.848476  GntP family permease
A4U85_RS08895  5       22       -5.794414  hypothetical protein
A4U85_RS08900  3       19       -5.145784  TetR/AcrR family transcriptional regulator
A4U85_RS08905  1       15       -4.105833  DoxX family protein
A4U85_RS08910  1       13       -3.109072  DUF3298 domain-containing protein
A4U85_RS19385  1       14       -1.191909  hypothetical protein
A4U85_RS08920  1       13       -1.315707  LysR family transcriptional regulator
A4U85_RS08925  -1      12       -1.732121  CoA transferase subunit A
A4U85_RS08930  -1      13       -2.058537  CoA transferase subunit B
A4U85_RS08935  0       13       -2.727030  short-chain fatty acid transporter
A4U85_RS08940  -2      14       -3.256323  thiolase family protein
A4U85_RS08945  -2      12       -3.763540  LysR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS08900  HTHTETR       45     5e-08    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 44.6 bits (105), Expect = 5e-08
Identities = 25/153 (16%), Positives = 50/153 (32%), Gaps = 5/153 (3%)

Query: 5 STKTNQIDNKNRLIIQTVGDLFLEYGYSRVSINLIISKIGGSKRDLYAQFGDKEGLFRSV 64
TK + + I+ LF + G S S+ I G ++ +Y F DK LF +
Sbjct: 4 KTKQEAQETRQH-ILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 65 IADVCQQVLDPLKALPVE--GGSIEQALTSFGRIFLSVLLSSRVIALQKLVLSEATRYPE 122
+ + + G + + S + R L +++ + E
Sbjct: 63 WELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGE 122

Query: 123 FAKT--FVQLGPVSAYNLAAELLMKRAEVGEIR 153
A + + +Y+ + L E +
Sbjct: 123 MAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLP 155


19  A4U85_RS09145  A4U85_RS09175  Y  N  N  Genomic Island
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS09145  -1      15       -3.392846  hypothetical protein
A4U85_RS09150  -1      16       -3.862544  O-succinylhomoserine sulfhydrylase
A4U85_RS09155  -1      17       -3.730959  alpha/beta fold hydrolase
A4U85_RS09160  1       20       -5.522666  methyltransferase domain-containing protein
A4U85_RS09165  0       22       -6.576809  cell envelope integrity protein TolA
A4U85_RS09170  0       20       -5.002029  hypothetical protein
A4U85_RS09175  -2      24       -3.102280  hypothetical protein
20  A4U85_RS09300  A4U85_RS09330  Y  N  N  Genomic Island
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS09300  -1      15       -3.693814  bifunctional adenosylcobinamide
A4U85_RS09305  0       17       -4.235170  2'-5' RNA ligase family protein
A4U85_RS09310  0       14       -3.344193  HAD family hydrolase
A4U85_RS09315  2       16       -3.176323  acetyltransferase
A4U85_RS09320  1       18       -3.526951  hypothetical protein
A4U85_RS09325  1       18       -3.295702  PepSY domain-containing protein
A4U85_RS09330  -1      16       -3.262391  hypothetical protein
21  A4U85_RS09400  A4U85_RS09435  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS09400  -1      14       -3.775239  alkane 1-monooxygenase
A4U85_RS09405  1       15       -3.319770  AraC family transcriptional regulator
A4U85_RS09410  2       17       -2.002374  SurA N-terminal domain-containing protein
A4U85_RS09415  2       21       -1.503915  HU family DNA-binding protein
A4U85_RS09420  2       20       -1.691796  phasin family protein
A4U85_RS09425  2       22       -1.727463  hypothetical protein
A4U85_RS09430  2       22       -0.524696  Rrf2 family transcriptional regulator
A4U85_RS09435  2       22       -0.921174  IscS subfamily cysteine desulfurase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS09415  DNABINDINGHU  121    7e-40    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 121 bits (305), Expect = 7e-40
Identities = 49/88 (55%), Positives = 68/88 (77%)

Query: 2 NKSELIDAIAEKGGVSKTDAGKALDATIASITEALKKGDTVTLVGFGTFSVKERAARTGR 61
NK +LI +AE ++K D+ A+DA ++++ L KG+ V L+GFG F V+ERAAR GR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 62 NPKTGEELQIKATKVPSFKAGKGLKDSV 89
NP+TGEE++IKA+KVP+FKAGK LKD+V
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAV 90


22  A4U85_RS09555  A4U85_RS09675  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS09555  2       11       -0.715877  ABC transporter substrate-binding protein
A4U85_RS19200  0       14       -0.148688  hypothetical protein
A4U85_RS09560  0       13       0.000714   TonB-dependent receptor
A4U85_RS09565  -1      18       -0.567640  energy transducer TonB
A4U85_RS09570  -1      18       -1.702815  MotA/TolQ/ExbB proton channel family protein
A4U85_RS09575  -3      14       -2.610415  biopolymer transporter ExbD
A4U85_RS09580  -4      14       -3.017508  biopolymer transporter ExbD
A4U85_RS09585  -3      14       -3.412130  hypothetical protein
A4U85_RS09590  -3      15       -3.232892  malate synthase G
A4U85_RS09595  -1      14       -3.874878  cell division protein ZapE
A4U85_RS09600  0       14       -3.868798  AraC family transcriptional regulator
A4U85_RS09605  1       23       -2.214831  NAD(P)/FAD-dependent oxidoreductase
A4U85_RS09610  3       24       -1.305166  lysine exporter LysO family protein
A4U85_RS09615  6       30       -0.511549  orotidine-5'-phosphate decarboxylase
A4U85_RS09620  5       28       -0.400188  DUF1049 domain-containing protein
A4U85_RS09625  4       24       0.418341   integration host factor subunit beta
A4U85_RS09630  2       22       -0.145378  30S ribosomal protein S1
A4U85_RS09635  -2      17       -1.383526  (d)CMP kinase
A4U85_RS09640  -4      18       -2.488990  SRPBCC family protein
A4U85_RS09645  -2      20       -2.692031  tRNA adenosine(34) deaminase TadA
A4U85_RS09650  -2      20       -2.733147  enoyl-CoA hydratase/isomerase family protein
A4U85_RS09655  0       23       -4.378825  uracil-DNA glycosylase
A4U85_RS09660  0       20       -3.994167  6-carboxytetrahydropterin synthase
A4U85_RS09665  0       18       -3.419394  type II secretion system minor pseudopilin GspK
A4U85_RS09670  -1      18       -2.931057  type II secretion system minor pseudopilin GspJ
A4U85_RS09675  -1      17       -3.453680  type II secretion system minor pseudopilin GspI
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS09565  PF03544       70     1e-16    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 70.0 bits (171), Expect = 1e-16
Identities = 34/182 (18%), Positives = 69/182 (37%), Gaps = 9/182 (4%)

Query: 41 IQKPAEKPVELQIIQDIKPPPPPKPEEPKPKEKPPEPPKMVEKVAKVPEPPKEVEKVATP 100
+ PA+ + +P P+PE E P E P ++EK P+P + K
Sbjct: 54 MVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQ 113

Query: 101 VQKTTPVAQPTKVATPAPAAPSTPSPSPVAAPAPVAAAAPAPKPAGVTRGVSEGSAGCEK 160
++ + + AP+ P+ S A + A P ++R +
Sbjct: 114 PKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASGPRALSRN---------Q 164

Query: 161 PEYPREALMNEEQGTVRIRVLVDTSGKVIDAKVKKSSGSKILDKAATKAYSLCTFKPAMK 220
P+YP A +G V+++ V G+V + ++ + + + ++ A ++P
Sbjct: 165 PQYPARAQALRIEGQVKVKFDVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKP 224

Query: 221 DG 222

Sbjct: 225 GS 226


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS09620  TYPE3IMSPROT  27     0.027    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 26.6 bits (59), Expect = 0.027
Identities = 11/88 (12%), Positives = 34/88 (38%), Gaps = 1/88 (1%)

Query: 4 ILIALLIIVFGYSLALVLQNPTELPVDLLFTQVPAMRLGLLLLLTLALGIVVGLLLGVQV 63
+ L +++ + ++++ + L + + L +L + I + + +
Sbjct: 141 LKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLLGQILRQLMVICTVGFVVISI 200

Query: 64 FRV-FQKSWEIKRLRKDIDHLRKEQIQS 90
F+ IK L+ D +++E +
Sbjct: 201 ADYAFEYYQYIKELKMSKDEIKREYKEM 228


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS09625  DNABINDINGHU  106    5e-34    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 106 bits (267), Expect = 5e-34
Identities = 34/89 (38%), Positives = 51/89 (57%), Gaps = 1/89 (1%)

Query: 7 NKSDLIERIALKNPHLAEPLVEEAVKIMIDQMIEALSSDNRIEIRGFGSFALHHREPRVG 66
NK DLI ++A L + AV + + L+ ++++ GFG+F + R R G
Sbjct: 3 NKQDLIAKVAEAT-ELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKG 61

Query: 67 RNPKTGKSVDVAAKAVPHFKPGKALRDAV 95
RNP+TG+ + + A VP FK GKAL+DAV
Sbjct: 62 RNPQTGEEIKIKASKVPAFKAGKALKDAV 90


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS09635  PF05272       28     0.034    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.1 bits (62), Expect = 0.034
Identities = 10/34 (29%), Positives = 13/34 (38%), Gaps = 2/34 (5%)

Query: 5 IITIDGPSGSGKGTLAAKLAAYYQF--HLLDSGA 36
+ ++G G GK TL L F D G
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGT 631


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS09670  BCTERIALGSPG  29     0.008    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 29.1 bits (65), Expect = 0.008
Identities = 11/26 (42%), Positives = 16/26 (61%)

Query: 24 RLTRASGFTLVELLVAIAIFAVLSLL 49
+ GFTL+E++V I I VL+ L
Sbjct: 3 ATDKQRGFTLLEIMVVIVIIGVLASL 28


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS09675  BCTERIALGSPH  38     3e-06    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 37.6 bits (87), Expect = 3e-06
Identities = 17/54 (31%), Positives = 29/54 (53%), Gaps = 3/54 (5%)

Query: 1 MKSKGFTLLEVMVALAIFAVAAVALTKVAMQYTQSTSNAILRTKAQFVAMNEVA 54
M+ +GFTLLE+M+ L + V+A V + + S ++ +T A+F A
Sbjct: 1 MRQRGFTLLEMMLILLLMGVSAGM---VLLAFPASRDDSAAQTLARFEAQLRFV 51


23  A4U85_RS09760  A4U85_RS09805  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS09760  2       20       -0.034241  peptidylprolyl isomerase
A4U85_RS09765  -1      18       -0.423053  fructose-bisphosphate aldolase class II
A4U85_RS09770  1       19       -2.127406  hypothetical protein
A4U85_RS09775  0       18       -2.630035  phosphoglycerate kinase
A4U85_RS09780  -1      15       -2.787305  DUF2237 domain-containing protein
A4U85_RS09785  -1      15       -3.233801  septum formation protein Maf
A4U85_RS09790  1       14       -3.792383  cupin domain-containing protein
A4U85_RS09795  2       15       -3.406754  TetR/AcrR family transcriptional regulator
A4U85_RS09800  2       15       -2.582214  helix-turn-helix transcriptional regulator
A4U85_RS09805  2       15       -2.056710  serine/threonine transporter SstT
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS09760  RTXTOXIND     31     0.008    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.3 bits (71), Expect = 0.008
Identities = 35/216 (16%), Positives = 71/216 (32%), Gaps = 14/216 (6%)

Query: 39 NSVILKSDLEQGMAEAAHELQAQKKEVPPQQYLQFQVLDQLILRQAQLEQVKKYGIKPDE 98
+++ +S L Q E Q + + + + ++ D+ + E+V + E
Sbjct: 135 DTLKTQSSLLQARLEQT-RYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKE 193

Query: 99 KSLNEAVLKVASQSGSKSLEAFQQKLDAIAPGTYENLRSRIAEDLAINR-LRQQQVMSRI 157
+ K + A + + A YENL L L +Q +++
Sbjct: 194 QFSTWQNQKYQKELNLDKKRAERLTVLARINR-YENLSRVEKSRLDDFSSLLHKQAIAKH 252

Query: 158 KISDQ-----DVDNFLKSPQGQ-AALGNQAHVIHMRISGDNPQEVQNVAKEVRSQLAQSN 211
+ +Q + N L+ + Q + ++ Q E+ +L Q+
Sbjct: 253 AVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQ----LVTQLFKNEILDKLRQTT 308

Query: 212 DLNALKKLSTATVKVEGADMGFR-PLSDIPAELAAR 246
D L L A + R P+S +L
Sbjct: 309 DNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVH 344


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS09795  HTHTETR       59     4e-13    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 58.9 bits (142), Expect = 4e-13
Identities = 24/112 (21%), Positives = 43/112 (38%), Gaps = 5/112 (4%)

Query: 1 MSKKDDIITTALRLFNSYSYNSIGVDRIISESGVAKMTFYKYFPSKEKLIEECLLLRNSL 60
+ I+ ALRLF+ +S + I +GV + Y +F K L E L S
Sbjct: 10 QETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESN 69

Query: 61 LQNSLTAAISKEDETNPLARIKAIFLWYSDWFNSED----FNGCMFQKALEE 108
+ L + +PL+ ++ I + + +E+ +F K
Sbjct: 70 IG-ELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120


24  A4U85_RS09945  A4U85_RS10090  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS09945  -2      20       -3.170769  tRNA epoxyqueuosine(34) reductase QueG
A4U85_RS09950  0       18       -4.210274  biotin synthase BioB
A4U85_RS09955  1       18       -4.237712  hypothetical protein
A4U85_RS09960  1       18       -4.445017  hypothetical protein
A4U85_RS09965  2       19       -3.750209  type 1 fimbrial protein
A4U85_RS09970  2       19       -3.749358  fimbria/pilus periplasmic chaperone
A4U85_RS09975  1       18       -3.587685  fimbrial biogenesis outer membrane usher
A4U85_RS09980  0       18       -3.360813  fimbrial protein
A4U85_RS09990  1       20       -3.606403  hypothetical protein
A4U85_RS09995  0       19       -3.478363  LysR family transcriptional regulator
A4U85_RS10000  1       18       -5.507834  DMT family transporter
A4U85_RS10005  2       17       -5.343748  hypothetical protein
A4U85_RS10010  3       19       -4.396731  cytosine permease
A4U85_RS10015  5       20       -4.570835  PACE efflux transporter
A4U85_RS10020  6       20       -4.853684  LysR family transcriptional regulator
A4U85_RS10025  6       20       -4.785149  integrase family protein
A4U85_RS10035  4       17       -3.484215  *hypothetical protein
A4U85_RS10040  4       18       -4.835574  ATP-binding protein
A4U85_RS10045  4       16       -5.472088  SIR2 family protein
A4U85_RS10050  4       16       -5.671921  GIY-YIG nuclease family protein
A4U85_RS10055  3       16       -5.444560  hypothetical protein
A4U85_RS10060  3       17       -5.724075  hypothetical protein
A4U85_RS10065  4       21       -6.493703  class I SAM-dependent DNA methyltransferase
A4U85_RS10070  5       22       -6.697002  Arc family DNA-binding protein
A4U85_RS10075  1       21       -6.647006  Arc family DNA-binding protein
A4U85_RS10080  0       19       -5.511289  IS21 family transposase
A4U85_RS10085  0       16       -6.186984  ATP-binding protein
A4U85_RS10090  0       15       -3.474401  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS09975  PF00577       724    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 724 bits (1870), Expect = 0.0
Identities = 244/840 (29%), Positives = 392/840 (46%), Gaps = 51/840 (6%)

Query: 47 FDTNFLVGNAQ-KIDIGRFKYGNPILPGEYSLDVYINGQWLGKRKFVFKSTRSNENAKTC 105
F+ FL + Q D+ RF+ G + PG Y +D+Y+N ++ R F + S + C
Sbjct: 49 FNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNNGYMATRDVTFNTGDSEQGIVPC 108

Query: 106 FTPDMLLEYGVKPEILHH-EVSSTFTCNDLDKWVNDAFYQFDTSRLRLDISIPQVALQKN 164
T L G+ + + + C L ++DA Q D + RL+++IPQ +
Sbjct: 109 LTRAQLASMGLNTASVSGMNLLADDACVPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNR 168

Query: 165 AQGYVDPRLWDRGINAAFFAYNASAYRIVNNNHEKNH-AFMGTNVGLNLYDWQLRHTGQW 223
A+GY+ P LWD GINA YN S + N +H A++ GLN+ W+LR W
Sbjct: 169 ARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTW 228

Query: 224 KWQDHNEIQEKVSSYTSNNTYAQKAFPKLNSIVTLGDYFTNNNFFDALPYRGINISSDDR 283
+ + + + NT+ ++ L S +TLGD +T + FD + +RG ++SDD
Sbjct: 229 SYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDN 288

Query: 284 MLPNSMLGYAPQIRGYAKTNAKVEVRQQGNLIYQTTVTPGSFEINDLYPTGFGGELQVSI 343
MLP+S G+AP I G A+ A+V ++Q G IY +TV PG F IND+Y G G+LQV+I
Sbjct: 289 MLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTI 348

Query: 344 YETNGEIQKFSIPYASVIEMLRPKMSRYSFTLGHFRDAN-INLKPWLVQGKYQRGINNYL 402
E +G Q F++PY+SV + R +RYS T G +R N KP Q G+
Sbjct: 349 KEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGW 408

Query: 403 TSYTGFQATENYQSFLIGSAFAT-PIGAISFDATQSSAEFEQKPTLKGRSYRLSYNRLFT 461
T Y G Q + Y++F G +GA+S D TQ+++ G+S R YN+
Sbjct: 409 TIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLN 468

Query: 462 PTNTNLTLATYKYSTENYLKLRDSILIRDLQQQDIDSFSVG--------------KQKSE 507
+ TN+ L Y+YST Y D+ R V ++ +
Sbjct: 469 ESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGK 528

Query: 508 FQITLNQGLPNQWGNFYLVGSWINYWNQPNRNQQFQFGYSNQFKDLTYSISAITHELDQE 567
Q+T+ Q L YL GS YW N ++QFQ G + F+D+ +++S
Sbjct: 529 LQLTVTQQLGR-TSTLYLSGSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSY------SL 581

Query: 568 NQNSGH---ETQYLLSLSFPLQFKKN----------TVNFNSSISEDSRTLGMSGYIG-- 612
+N+ + L+++ P + +++ S + R ++G G
Sbjct: 582 TKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTL 641

Query: 613 ---NRFDYGSSISYQDQG----QTSLNINGTYRTNYTTIGASFGQSDTYQQEMINLNGSL 665
N Y Y G ++ YR Y + SD +Q ++G +
Sbjct: 642 LEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGV 701

Query: 666 VAHSQGILFGPDQVQTMVLVYAPQATGARVGNTSGLSINKQGYAVIPYVTPYRLNDISLD 725
+AH+ G+ G T+VLV AP A A+V N +G+ + +GYAV+PY T YR N ++LD
Sbjct: 702 LAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALD 761

Query: 726 PQGMPSTVELTETSHRIAPYAGSITKVNFSTKTGYAVFISTQTPNGGHLPFAAQVFNQNN 785
+ V+L + P G+I + F + G + + T T N LPF A V ++++
Sbjct: 762 TNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIKLLM-TLTHNNKPLPFGAMVTSESS 820

Query: 786 EIVGMVAQGSRIYLRTPLTQDHLYVKWGQSNSEECQLDYDIQSKILQEKQSIIMTEAVCK 845
+ G+VA ++YL + VKWG+ + C +Y + + ++Q + A C+
Sbjct: 821 QSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCVANYQLPPE--SQQQLLTQLSAECR 878


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS10040  LIPPROTEIN48  30     0.028    Mycoplasma P48 major surface lipoprotein signature.
		>LIPPROTEIN48#Mycoplasma P48 major surface lipoprotein signature.

Length = 428

Score = 30.4 bits (68), Expect = 0.028
Identities = 21/64 (32%), Positives = 28/64 (43%), Gaps = 6/64 (9%)

Query: 267 QASLKTQRPTLIQALRSVREGVFNIEQSAHQEMKQYLRTLISITQIEINSGSPWGNFPKA 326
A L +P LI + + FN QSA + +K + T IEIN+ P NF A
Sbjct: 56 NAELLKLKPVLITDEGKIDDKSFN--QSAFEALKAINKQ----TGIEINNVEPSSNFESA 109

Query: 327 KNFF 330
N
Sbjct: 110 YNSA 113


25  A4U85_RS10620  A4U85_RS10685  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS10620  4       14       5.129159   allantoin permease
A4U85_RS10625  9       15       4.318858   Lrp/AsnC family transcriptional regulator
A4U85_RS10630  7       16       3.129201   LysE family translocator
A4U85_RS10635  4       15       2.280807   translesion error-prone DNA polymerase V
A4U85_RS10640  4       14       0.869759   hypothetical protein
A4U85_RS10645  3       13       0.445340   stress-induced protein
A4U85_RS10650  1       12       -3.158145  hypothetical protein
A4U85_RS10655  0       12       -2.157987  glucose 1-dehydrogenase
A4U85_RS10660  0       11       -2.406693  catalase HPII
A4U85_RS10665  4       19       -5.652494  iron-containing redox enzyme family protein
A4U85_RS10670  2       25       -5.728835  CinA family protein
A4U85_RS10675  0       25       -4.979341  hypothetical protein
A4U85_RS10680  0       26       -3.768977  hypothetical protein
A4U85_RS10685  -1      22       -3.481341  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS10655  DHBDHDRGNASE  112    1e-31    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 112 bits (280), Expect = 1e-31
Identities = 82/259 (31%), Positives = 122/259 (47%), Gaps = 17/259 (6%)

Query: 41 SEKLKGKVAVISGGDSGIGRSVAVLFAREGADI-AVLYLEEDQDAEITKQLIEKEGQQCL 99
++ ++GK+A I+G GIG +VA A +GA I AV Y E E ++ E +
Sbjct: 3 AKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKL--EKVVSSLKAEARHAE 60

Query: 100 LLKGDISDPDLAKQNIDKVLQHFGKINILVNNAGVQYQQKEIESISNEQLEKTFKTNIFA 159
D+ D + ++ + G I+ILVN AGV + I S+S+E+ E TF N
Sbjct: 61 AFPADVRDSAAIDEITARIEREMGPIDILVNVAGV-LRPGLIHSLSDEEWEATFSVNSTG 119

Query: 160 MFYLTKEAIPYM--EEGDSIINTTSITSYQGHDELIDYASTKGAITSFTRSLSNNLMKQK 217
+F ++ YM SI+ S + + YAS+K A FT+ L L +
Sbjct: 120 VFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEY- 178

Query: 218 KGIRVNGVAPGPIWT----PLIPSSFDAETV-----EKFGKDTPMGRMGQPSEVAPAYLF 268
IR N V+PG T L AE V E F P+ ++ +PS++A A LF
Sbjct: 179 -NIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLF 237

Query: 269 LASDDASYITGQVIHVNGG 287
L S A +IT + V+GG
Sbjct: 238 LVSGQAGHITMHNLCVDGG 256


26  A4U85_RS10920  A4U85_RS11095  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS10920  2       14       -0.073074  LysR family transcriptional regulator
A4U85_RS10925  2       17       1.352165   carboxymuconolactone decarboxylase family
A4U85_RS10930  3       17       1.802640   NAD-dependent succinate-semialdehyde
A4U85_RS10935  0       16       2.039814   hypothetical protein
A4U85_RS10940  1       16       2.847889   cytosine permease
A4U85_RS19215  1       17       3.616148   hypothetical protein
A4U85_RS10945  0       16       3.297810   aldehyde dehydrogenase
A4U85_RS10950  1       16       2.270376   LLM class flavin-dependent oxidoreductase
A4U85_RS19220  -1      15       2.259373   flavin reductase family protein
A4U85_RS19225  -1      16       3.055661   alpha/beta hydrolase
A4U85_RS10960  1       11       1.840026   amino acid synthesis family protein
A4U85_RS10965  0       13       1.478763   GntR family transcriptional regulator
A4U85_RS10970  0       14       2.184563   DUF1330 domain-containing protein
A4U85_RS10975  0       14       1.175143   cytochrome b
A4U85_RS10980  -1      13       0.733188   catalase family peroxidase
A4U85_RS10985  0       14       -1.416979  MFS transporter
A4U85_RS10990  0       17       -2.390182  hydroxymethylglutaryl-CoA lyase
A4U85_RS10995  0       15       -3.312231  CoA transferase
A4U85_RS11000  6       23       -6.867783  LysR family transcriptional regulator
A4U85_RS11005  7       25       -7.502407  alpha/beta hydrolase
A4U85_RS11010  5       25       -8.234491  helix-turn-helix transcriptional regulator
A4U85_RS11015  6       26       -7.337698  hypothetical protein
A4U85_RS11020  6       26       -7.584586  helix-turn-helix transcriptional regulator
A4U85_RS11030  4       28       -7.230168  AAA family ATPase
A4U85_RS11035  1       18       -4.206237  hypothetical protein
A4U85_RS11040  -1      16       -3.001625  sel1 repeat family protein
A4U85_RS11045  -2      14       -0.746082  suppressor of fused domain protein
A4U85_RS11050  -1      15       1.312824   hypothetical protein
A4U85_RS11055  -1      18       2.540233   PaaI family thioesterase
A4U85_RS11060  -1      18       2.626793   carbonic anhydrase
A4U85_RS11065  0       18       2.946713   phenylacetic acid degradation operon negative
A4U85_RS11070  1       17       3.090138   phenylacetate--CoA ligase
A4U85_RS11075  2       17       3.051183   3-oxoadipyl-CoA thiolase
A4U85_RS11080  1       17       2.258683   3-hydroxyacyl-CoA dehydrogenase
A4U85_RS11085  2       16       2.161759   2-(1,2-epoxy-1,2-dihydrophenyl)acetyl-CoA
A4U85_RS11090  2       16       2.315889   enoyl-CoA hydratase/isomerase family protein
A4U85_RS11095  2       14       2.205049   phenylacetate-CoA oxygenase/reductase subunit
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS10985  TCRTETA       46     2e-07    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 46.0 bits (109), Expect = 2e-07
Identities = 67/368 (18%), Positives = 125/368 (33%), Gaps = 31/368 (8%)

Query: 63 YGLGASLFVIGYVIFEVPSNLLLHKFGARKWIARIMISWGVATALMVFVTTEWQFYVLRF 122
YG+ +L+ + L +FG R + + V A+M W Y+ R
Sbjct: 45 YGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGR- 103

Query: 123 LIGAMEAGFAPGVLYYMTLWFPQSYRGRIISLFFLASAF-SGIFGGPISGVLLSSLDGVM 181
++ + Y+ R R F+++ F G+ GP+ G G+M
Sbjct: 104 IVAGITGATGAVAGAYIADITDGDERAR--HFGFMSACFGFGMVAGPVLG-------GLM 154

Query: 182 NMRGWHWLFLVGGVPCVLLGLCVLTYLKDHIDDAQWLSSSEKVHLKRLVEPKQPEKASHS 241
H F L L L + +R + + +
Sbjct: 155 GGFSPHAPFFAAAALNGLNFLTGCFLLPE-----------SHKGERRPLRREALNPLASF 203

Query: 242 LWAAIKTPGLLILGAIYFLIQ-IASYGLNFWGPQLIKSSGIDDTQMIGFLSAIPYLMGAI 300
WA T + L A++F++Q + W D T IG A ++ ++
Sbjct: 204 RWARGMTV-VAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDAT-TIGISLAAFGILHSL 261

Query: 301 TMVIV-GRLADKSGERLKFTAGLLSLGAIGFFTAGFFDANPVVVVISLALLGAGIVAAIP 359
++ G +A + GER G+++ G+ F + I + L GI +P
Sbjct: 262 AQAMITGPVAARLGERRALMLGMIA-DGTGYILLAFATRGWMAFPIMVLLASGGI--GMP 318

Query: 360 SFWNLPPKVLVGAGAGAAGGTALINTLGQVGGIVSPIMVGHIKDITGSTTPALYIIASLC 419
+ + + + G G+ + L + IV P++ I + +T IA
Sbjct: 319 ALQAMLSRQVDEERQGQLQGS--LAALTSLTSIVGPLLFTAIYAASITTWNGWAWIAGAA 376

Query: 420 VVCIALLI 427
+ + L
Sbjct: 377 LYLLCLPA 384


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS11020  HTHTETR       52     2e-10    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 51.9 bits (124), Expect = 2e-10
Identities = 25/190 (13%), Positives = 67/190 (35%), Gaps = 12/190 (6%)

Query: 13 VVNKAIDLFHHRGFHLIGVDRIVKESEITKATFYNYFQSKERLIEICLMVQKEKLQEQVV 72
+++ A+ LF +G + I K + +T+ Y +F+ K L + + + E +
Sbjct: 16 ILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGELEL 75

Query: 73 A-MVEYDLSTLAIDKLKKLYYLHTDLEGPYYLLFKAIFEIKNSYPIAYQTAMRYRTWLKN 131
++ L++ + ++ L + + L I K + + + L
Sbjct: 76 EYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQRNLCL 135

Query: 132 EIYSQLRVLNADA-------SFTDAKLFVYMVEGTIIQLLSS----DGALEREKMLDYFL 180
E Y ++ + + ++ G I L+ + + + +K ++
Sbjct: 136 ESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFAPQSFDLKKEARDYV 195

Query: 181 NSFVRNFSPC 190
+ + C
Sbjct: 196 AILLEMYLLC 205


27  A4U85_RS11275  A4U85_RS11305  Y  N  N  Genomic Island
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS11275  0       16       -3.664012  type VI secretion system-associated protein
A4U85_RS11285  0       13       -3.297820  type VI secretion system membrane subunit TssM
A4U85_RS11290  2       15       -4.119298  hypothetical protein
A4U85_RS11295  1       17       -3.753288  type VI secretion system baseplate subunit TssG
A4U85_RS11300  1       16       -3.998433  type VI secretion system baseplate subunit TssF
A4U85_RS11305  2       16       -2.996150  type VI secretion system baseplate subunit TssE
28  A4U85_RS11355  A4U85_RS11520  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS11355  0       14       -3.213205  allophanate hydrolase
A4U85_RS11360  -1      14       -0.683626  transcriptional regulator
A4U85_RS19230  0       14       -0.748881  hypothetical protein
A4U85_RS11370  -1      14       -0.339643  ATP-binding protein
A4U85_RS11375  1       19       3.073058   helix-turn-helix transcriptional regulator
A4U85_RS11380  1       20       3.178620   NAD(P)H-dependent oxidoreductase
A4U85_RS11385  1       19       2.963260   urea carboxylase
A4U85_RS11390  0       14       -0.717559  DUF1989 domain-containing protein
A4U85_RS11395  -1      16       -1.746359  DUF1989 domain-containing protein
A4U85_RS11400  2       18       -2.638199  glutathione-dependent formaldehyde
A4U85_RS11405  4       19       -3.293278  SRPBCC family protein
A4U85_RS11410  3       18       -2.971718  TetR/AcrR family transcriptional regulator
A4U85_RS11415  1       15       -1.643331  hypothetical protein
A4U85_RS11420  -1      14       -0.289638  SMI1/KNR4 family protein
A4U85_RS11425  -1      13       0.316270   DUF637 domain-containing protein
A4U85_RS11430  -1      17       2.933064   ShlB/FhaC/HecB family hemolysin
A4U85_RS11435  2       19       3.541003   chloride channel protein
A4U85_RS11440  2       19       3.388016   ATP-grasp domain-containing protein
A4U85_RS11445  1       19       2.534045   5-oxoprolinase/urea amidolyase family protein
A4U85_RS11450  0       13       0.422922   putative hydro-lyase
A4U85_RS11455  0       13       -1.581949  LamB/YcsF family protein
A4U85_RS11460  2       13       -2.641553  divalent metal cation transporter
A4U85_RS11465  4       16       -4.143871  LysR family transcriptional regulator MumR
A4U85_RS11470  4       21       -4.877223  hypothetical protein
A4U85_RS11475  7       21       -5.051306  serine hydrolase
A4U85_RS11480  6       19       -5.254363  HAD-IA family hydrolase
A4U85_RS11485  4       17       -3.174282  GNAT family N-acetyltransferase
A4U85_RS11490  4       16       -3.527783  hypothetical protein
A4U85_RS11495  3       17       -3.038043  3-hydroxyacyl-CoA dehydrogenase
A4U85_RS11500  2       15       -3.136106  DUF1826 domain-containing protein
A4U85_RS11505  0       16       -2.587979  MerR family transcriptional regulator
A4U85_RS11510  0       16       -2.534982  DUF2147 domain-containing protein
A4U85_RS11515  1       16       -2.963228  MFS transporter
A4U85_RS11520  1       16       -3.092260  TetR/AcrR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS11355  PERTACTIN     30     0.047    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 29.7 bits (66), Expect = 0.047
Identities = 33/117 (28%), Positives = 48/117 (41%), Gaps = 21/117 (17%)

Query: 54 ILKDQNITDLPLYGVPFAV----KDNIDVAGFHTTAACKEVQYLANQDAAVV----AKLK 105
+L D ++T +P G P AV + + V G H T +A D A+V A ++
Sbjct: 202 VLGDTSVTAVPASGAPAAVFVFGANELTVDGGHITGG--RAAGVAAMDGAIVHLQRATIR 259

Query: 106 KAGAIVVGKTNLDQFATGLVGVRSPYGAVKNSFNP--EYISGGSSSGSSVVVANGIV 160
+ A G + G P GAV F P + G S S+V +A IV
Sbjct: 260 RGDAPAGG---------AVPGGAVPGGAVPGGFGPLLDGWYGVDVSDSTVDLAQSIV 307


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS11410  HTHTETR       53     3e-11    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 53.5 bits (128), Expect = 3e-11
Identities = 17/65 (26%), Positives = 24/65 (36%)

Query: 5 EASFRALRVLHTAKDLFNQYGFHKVGIDRIIAESKVTKATFYNHFHSKERLIEMCLTFQK 64
EA +L A LF+Q G + I + VT+ Y HF K L +
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 65 DGLKE 69
+ E
Sbjct: 68 SNIGE 72


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS11425  PF05860       62     2e-13    haemagglutination activity domain.
		>PF05860#haemagglutination activity domain.

Length = 117

Score = 62.1 bits (151), Expect = 2e-13
Identities = 19/144 (13%), Positives = 45/144 (31%), Gaps = 29/144 (20%)

Query: 76 ADIVADSAANAANRAIIAAGKNSAGKTVPVVNIQTPK-NGISHNIYKQFDVLAEGAVLNN 134
A I D+ + ++ T + + H+ +++F V G N
Sbjct: 1 AQITPDTTLPIN-------SNITTEGNTRIIERGTQAGSNLFHS-FQEFSVPTSGTAFFN 52

Query: 135 SRQGATTKTVGAVAANPFLATGEARVILNEVNSSAASRFEGNLEVAGQMADVIIANPSGI 194
+ + I++ V + S +G + A++ + NP+GI
Sbjct: 53 N-------------------PTNIQNIISRVTGGSVSNIDGLIRANAT-ANLFLINPNGI 92

Query: 195 NIKGGGFINANKAIFTTGKPQLNA 218
++ + + +L
Sbjct: 93 IFGQNARLDIGGSFVGSTANRLKF 116


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS11430  PF00577       33     0.003    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 33.3 bits (76), Expect = 0.003
Identities = 16/127 (12%), Positives = 42/127 (33%), Gaps = 1/127 (0%)

Query: 187 DFLNLQQLDQGLENLKRAYTVDIQILPSNESVQETQGYSDLVIKLQAHSKISLNLGLDNS 246
D + +E V + +G L + Q +L L +
Sbjct: 491 DTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQ 550

Query: 247 GSKDTGKYIGSLGVNINNPFYLSDSLSFNFSHSLDNLHRDLNKNYFVSYQLPLGNYDFST 306
T +N F + + ++S + + + ++ ++ +P ++ S
Sbjct: 551 TYWGTSNVDEQFQAGLNTAF-EDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSD 609

Query: 307 SYSRYQY 313
S S++++
Sbjct: 610 SKSQWRH 616



Score = 32.5 bits (74), Expect = 0.005
Identities = 30/204 (14%), Positives = 58/204 (28%), Gaps = 24/204 (11%)

Query: 221 TQGY---SDLVIKLQAHSKISLNLGLDNSGSKDTGKYI------GSLGVNINNPFYLSDS 271
T GY +D I G+ K T Y G L + + + +
Sbjct: 483 TSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTST 542

Query: 272 LSFNFSHSLDNLHRDLNKNYFVSYQLPLGNYDFSTSYSRYQYEQNVLGANSV-------- 323
L + SH ++++ + + +++ SYS + +
Sbjct: 543 LYLSGSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPF 602

Query: 324 ---LRYHGLSEQGNLNVSRVLSRSG----QHKTSLYGKLYHKQNSNFIDDIEIEVQRRRT 376
LR S+ + + S +S + +YG L N ++
Sbjct: 603 SHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGN 662

Query: 377 SGWNAGIQHRQYLGVAVLDAGLDY 400
SG G + G +
Sbjct: 663 SGSTGYATLNYRGGYGNANIGYSH 686


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS11440  RTXTOXIND     31     0.011    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.3 bits (71), Expect = 0.011
Identities = 13/49 (26%), Positives = 23/49 (46%)

Query: 509 APINGVISAWKVENGEQVTEGQVVAIMEAMKMEVQVLAHRSGVIQIGAE 557
N ++ V+ GE V +G V+ + A+ E L +S ++Q E
Sbjct: 101 PIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLE 149


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS11475  BLACTAMASEA   32     0.005    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 31.7 bits (72), Expect = 0.005
Identities = 17/91 (18%), Positives = 40/91 (43%), Gaps = 13/91 (14%)

Query: 166 NDKTPMAVGSTFKLLVLKTYEDAIKKGELKRETIVSLKEKNRSLPTGVLQNLP-----AG 220
+++ PM STFK+++ + G+ + E + ++++ ++ P
Sbjct: 59 DERFPMM--STFKVVLCGAVLARVDAGDEQLERKIHYRQQD------LVDYSPVSEKHLA 110

Query: 221 TPVNLELLAQLMIQISDNTATDSLIEVLKKP 251
+ + L I +SDN+A + L+ + P
Sbjct: 111 DGMTVGELCAAAITMSDNSAANLLLATVGGP 141


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS11520  TCRTETA       70     4e-15    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 69.9 bits (171), Expect = 4e-15
Identities = 77/390 (19%), Positives = 148/390 (37%), Gaps = 45/390 (11%)

Query: 23 LVTCLLLMIMDGYDIQSMAYAAPLIIEEW---GVQKSMLGVVFSASLFGLFVGSFLLSSL 79
L+ L + +D I + P ++ + + G++ + F + +L +L
Sbjct: 7 LIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 80 SDRFGRRPILLISTFIFSILMLLTPHVGNIEQLTVIRFVTGIFLGGIMPNVMAYSSEIVP 139
SDRFGRRP+LL+S ++ + + L + R V GI G AY ++I
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGI-TGATGAVAGAYIADITD 125

Query: 140 YKSRIFTMMVISCGYTVGAMLGGGISALLVPWGGWQAIFYFGGIIPLIIFFITFFKLPES 199
R +S + G + G + L+ + A F+ + + F F LPES
Sbjct: 126 GDERARHFGFMSACFGFGMVAGPVLGGLMGGFSP-HAPFFAAAALNGLNFLTGCFLLPES 184

Query: 200 LYF----LSENSKNSKNSSKILFWLKKFYPALTFNAEIKIINNTEVQVKKSPLELFKNQR 255
L + N S + + + ++++ QV + +F R
Sbjct: 185 HKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVG----QVPAALWVIFGEDR 240

Query: 256 AFFTYSIWTISILNMISLYFLANWLPTLAKESGLSLNQALLIGSTLQLGGTIGSVVMGLK 315
F + TI I ++ + + + SL QA++ G G ++++G+
Sbjct: 241 --FHWDATTIGI--SLAAFGILH-----------SLAQAMITGPVAARLGERRALMLGMI 285

Query: 316 IDKTGFYKVLIPVFLVAVISVALIGYSVSHIVLLFIIIFIAGFAIVGGQPAINALSASYY 375
D TG+ L+ ++ + I++ +A I G PA+ A+ +
Sbjct: 286 ADGTGY---------------ILLAFATRGWMAFPIMVLLASGGI--GMPALQAMLSRQV 328

Query: 376 PVSLRTTGVGWSIGIARLGSVIGPLFGGYL 405
+ G + L S++GPL +
Sbjct: 329 DEERQGQLQGSLAALTSLTSIVGPLLFTAI 358



Score = 35.2 bits (81), Expect = 4e-04
Identities = 35/156 (22%), Positives = 55/156 (35%), Gaps = 8/156 (5%)

Query: 280 LPTLAKESGLSLNQALLIGSTLQLGGT---IGSVVMGLKIDKTGFYKVLIPVFLVAVISV 336
LP L ++ S + G L L + V+G D+ G VL+ A +
Sbjct: 28 LPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDY 87

Query: 337 ALIGYSVSHIVLLFIIIFIAGFAIVGGQPAINALSASYYPVSLRTTGVGWSIGIARLGSV 396
A++ + + +L+I +AG G A A R G+ G V
Sbjct: 88 AIMATA-PFLWVLYIGRIVAGITGATG-AVAGAYIADITDGDERARHFGFMSACFGFGMV 145

Query: 397 IGPLFGGYLSQFLVITHL-FVIAAIPSLFVIIMLMI 431
GP+ GG + F H F AA + +
Sbjct: 146 AGPVLGGLMGGFSP--HAPFFAAAALNGLNFLTGCF 179


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
A4U85_RS11525    HTHTETR    58    2e-12    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 57.7 bits (139), Expect = 2e-12
Identities = 33/169 (19%), Positives = 65/169 (38%), Gaps = 14/169 (8%)

Query: 32 ETSSKKLHIIRTAIRLFTTHGFHTTGVDLIVKESEIPKATLYNYFHSKERLIEMCIAFQK 91
E + HI+ A+RLF+ G +T + I K + + + +Y +F K L +
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 92 SLLKEDVLAIIYSSRYCTPTDKLKEIVVLHVN---SNSLYHLLLKAFFEIKVAYQQAYRM 148
S + E + + P L+EI++ + + LL++ F + +
Sbjct: 68 SNIGE-LELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVV 126

Query: 149 A-------IEYRKWLTREIFELIFSLEIRA-LKPD--ANMVLNLIDGLM 187
+E + + + I + + A L A ++ I GLM
Sbjct: 127 QQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLM 175


29    A4U85_RS11595    A4U85_RS11650    Y    Y    Y    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
A4U85_RS11595113-3.508473HlyD family type I secretion periplasmic adaptor
A4U85_RS11600115-4.374105type I secretion system permease/ATPase
A4U85_RS11605420-6.377427TolC family protein
A4U85_RS11610622-7.386325AAA family ATPase
A4U85_RS11615624-7.543322type II toxin-antitoxin system Phd/YefM family
A4U85_RS11620624-7.260682AAA family ATPase
A4U85_RS11625422-5.308041hypothetical protein
A4U85_RS11630420-4.827876hypothetical protein
A4U85_RS11635218-4.190071site-specific integrase
A4U85_RS11640014-2.384015site-specific integrase
A4U85_RS116450120.661310LPS export ABC transporter ATP-binding protein
A4U85_RS116502110.825739lipopolysaccharide transport periplasmic protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
A4U85_RS11595    RTXTOXIND    265    8e-87    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 265 bits (679), Expect = 8e-87
Identities = 105/430 (24%), Positives = 192/430 (44%), Gaps = 55/430 (12%)

Query: 19 PPLPRASLVIWIVGIGLVIFFIWAWFFKLEEVSTGTGKVIPSSKEQVIQSLEGGILTKLN 78
P R LV + + LVI FI + ++E V+T GK+ S + + I+ +E I+ ++
Sbjct: 52 PVSRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEII 111

Query: 79 VQEGDIVQKGQILAQLDPTRFASNVGESRSLLIAAQATAARLRAEVN------------- 125
V+EG+ V+KG +L +L ++ +++S L+ A+ R +
Sbjct: 112 VKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLP 171

Query: 126 ---GTPLVFPEIVAQDPKLVHEETALYESRRADLEQTL--------------SGLRQALQ 168
V E V + L+ E+ + +++++ E L + +
Sbjct: 172 DEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSR 231

Query: 169 LVQQELAMTEPLVAKGAASEVEVLRLRREANDLQNKMNDARNQ----------------- 211
+ + L L+ K A ++ VL + + N++ ++Q
Sbjct: 232 VEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQL 291

Query: 212 ----YYVKAREELSKANTDTQTQQQVVMGRNDSLQRAVFKAPVRGVVKEITVTTHGGVIP 267
+ + ++L + + + + Q +V +APV V+++ V T GGV+
Sbjct: 292 VTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVT 351

Query: 268 QNGKLMTIVPIDEQLLIEARILPRDIAFIRPGQEALVKITAYDYSIYGGLKGKVTVISPD 327
LM IVP D+ L + A + +DI FI GQ A++K+ A+ Y+ YG L GKV I+ D
Sbjct: 352 TAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLD 411

Query: 328 TIRDEVKQDQFYYRVYIRTDSDKLYNKEGKAFGITPGMVATVDIRTGEKTVLDYLLKPF- 386
I D+ + + V I + + L + K ++ GM T +I+TG ++V+ YLL P
Sbjct: 412 AIEDQ--RLGLVFNVIISIEENCL-STGNKNIPLSSGMAVTAEIKTGMRSVISYLLSPLE 468

Query: 387 NKAKEALRER 396
E+LRER
Sbjct: 469 ESVTESLRER 478


30    A4U85_RS11740    A4U85_RS12070    Y    Y    Y    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
A4U85_RS117402201.546867hypothetical protein
A4U85_RS117502263.634018hypothetical protein
A4U85_RS117554233.393242hypothetical protein
A4U85_RS117604233.882316hypothetical protein
A4U85_RS117652264.165642hypothetical protein
A4U85_RS117702254.081033hypothetical protein
A4U85_RS117752263.731424hypothetical protein
A4U85_RS117802243.283632head-tail connector protein
A4U85_RS117851252.965992hypothetical protein
A4U85_RS117901252.451747terminase
A4U85_RS117950200.630228terminase small subunit
A4U85_RS118002200.749468hypothetical protein
A4U85_RS118050180.863124hypothetical protein
A4U85_RS118100190.434582ATP-binding protein
A4U85_RS118150170.336363DUF1376 domain-containing protein
A4U85_RS11820016-0.120792DNA adenine methylase
A4U85_RS11825019-0.112648hypothetical protein
A4U85_RS11830219-1.649203hypothetical protein
A4U85_RS11835420-1.882180hypothetical protein
A4U85_RS11840217-1.306150hypothetical protein
A4U85_RS11845018-1.485667hypothetical protein
A4U85_RS11850-117-1.531107helix-turn-helix domain-containing protein
A4U85_RS11855016-1.330304helix-turn-helix transcriptional regulator
A4U85_RS11860118-0.759113hypothetical protein
A4U85_RS11865017-0.844931DUF2303 family protein
A4U85_RS11870115-1.308432hypothetical protein
A4U85_RS11875116-1.049561YqaJ viral recombinase family protein
A4U85_RS11880215-0.864917hypothetical protein
A4U85_RS11885014-1.247030ATP-dependent helicase
A4U85_RS11890420-5.193552hypothetical protein
A4U85_RS11895622-6.354459hypothetical protein
A4U85_RS19430525-8.262204hypothetical protein
A4U85_RS11905625-8.181278hypothetical protein
A4U85_RS11910824-8.041115hypothetical protein
A4U85_RS11915725-8.065801hypothetical protein
A4U85_RS11920725-6.561376DUF1311 domain-containing protein
A4U85_RS11925525-7.051149hypothetical protein
A4U85_RS11930424-5.458647beta family protein
A4U85_RS11935322-4.825703sce7726 family protein
A4U85_RS11940021-3.123830hypothetical protein
A4U85_RS119451180.429038hypothetical protein
A4U85_RS11950019-0.323360DUF4142 domain-containing protein
A4U85_RS119550210.243074hypothetical protein
A4U85_RS11960120-1.146685hypothetical protein
A4U85_RS11965120-1.334817helix-turn-helix transcriptional regulator
A4U85_RS11970420-2.405308MFS transporter
A4U85_RS11975822-6.918412minor capsid protein
A4U85_RS11980923-8.307226hypothetical protein
A4U85_RS119851021-8.271353hypothetical protein
A4U85_RS19245720-6.295541hypothetical protein
A4U85_RS11995519-5.534028NAD(P)-binding domain-containing protein
A4U85_RS12000321-5.928267hypothetical protein
A4U85_RS12005317-4.461617cold-shock protein
A4U85_RS12010317-4.214678LysE family translocator
A4U85_RS12015012-2.072054N-acetylmuramidase
A4U85_RS12020212-2.414315hypothetical protein
A4U85_RS12025413-2.903082hypothetical protein
A4U85_RS12030214-2.770245LexA family transcriptional regulator
A4U85_RS12035117-2.873188transposase
A4U85_RS12045016-2.432418*exodeoxyribonuclease VII large subunit
A4U85_RS12050120-2.381355exodeoxyribonuclease VII small subunit
A4U85_RS12055-120-1.834046hypothetical protein
A4U85_RS120601141.529656LysE family transporter
A4U85_RS120702132.106946hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
A4U85_RS11785    60KDINNERMP    28    0.003    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 28.4 bits (63), Expect = 0.003
Identities = 16/58 (27%), Positives = 25/58 (43%), Gaps = 6/58 (10%)

Query: 3 VKN-ILDGVTNILGMDAPKAQVIAPPKQPTRQDSKSPDS--SATIDRVQQAQNSMSGG 57
VK +LD N G D +A + P P +S P + + QAQ+ ++G
Sbjct: 63 VKTDVLDLTINTRGGDVEQALL---PAYPKELNSTQPFQLLETSPQFIYQAQSGLTGR 117


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
A4U85_RS11815    RTXTOXIND    28    0.037    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 28.3 bits (63), Expect = 0.037
Identities = 10/68 (14%), Positives = 25/68 (36%), Gaps = 1/68 (1%)

Query: 88 LDDLKSQAESNKSSKSERAKKAAEARWNKEQDTSNTNASTEHQSSNAYAYAQAMHKHDAS 147
L L ++A++ K+ S + + R+ + N E Y Q + + +
Sbjct: 127 LTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPE-LKLPDEPYFQNVSEEEVL 185

Query: 148 NAQGMLET 155
+++
Sbjct: 186 RLTSLIKE 193


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
A4U85_RS11890    HTHFIS    54    1e-12    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 54.5 bits (131), Expect = 1e-12
Identities = 23/49 (46%), Positives = 33/49 (67%)

Query: 22 NAHAEVMALLEKPLLAETLIKTRGNQTKAAELLGLNRGTLRQRLKAHNI 70
+ V+A +E PL+ L TRGNQ KAA+LLGLNR TLR++++ +
Sbjct: 427 GLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGV 475


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
A4U85_RS11970    TCRTETA    37    1e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 37.1 bits (86), Expect = 1e-04
Identities = 63/327 (19%), Positives = 111/327 (33%), Gaps = 11/327 (3%)

Query: 78 FLVPLGDLVNRRRLMTLQLLALISALLMVAFAHSTIVLLTGMLAVGLLGTAMTQGLIAYA 137
L L D RR ++ + L ++A A VL G + G+ G AY
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGA-VAGAYI 120

Query: 138 ASAAAPHEQGHVVGTAQSGVFIGLLLARVFSGGISDIAGWRGVYFCAA--IIMLMIALPL 195
A E+ G + G++ V G + + + AA + + L
Sbjct: 121 ADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFL 180

Query: 196 WRRLPHLNVQPVTMRYPQLLSSMLKLLRQEKVLQVRGVLALLMFAAFNILWSALVLPLSA 255
+P+ L+S + R V+ + +M + +AL +
Sbjct: 181 LPESHKGERRPLRREALNPLAS-FRWARGMTVVAALMAVFFIM-QLVGQVPAALWVIFGE 238

Query: 256 PPYNFSHTVIG-SFGLFGVIGALAATRAGQWADRGYAQRTSLAALLILLLAWWPLSLMTY 314
+++ T IG S FG++ +LA +R +L +I + L
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFAT 298

Query: 315 SLWALVIGIVLLDLGGQALHVTNQSMIFRTRPEAHSRLVGLYMLFYAVGSGLGAISTTAT 374
W +VLL GG + + + E +L G ++ S +G + TA
Sbjct: 299 RGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAI 358

Query: 375 YAYA-----GWLGVCSLGACVSLLALL 396
YA + GW + + L L
Sbjct: 359 YAASITTWNGWAWIAGAALYLLCLPAL 385



Score = 33.6 bits (77), Expect = 0.001
Identities = 37/182 (20%), Positives = 69/182 (37%), Gaps = 10/182 (5%)

Query: 23 RSVVLLFAIASGASVANVYYAQPLLDILASDFNVSHAVIGGVVTATQIGCALALVFLV-P 81
V L A+ + A + F+ IG + A I +LA + P
Sbjct: 210 TVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGP 269

Query: 82 LGDLVNRRRLMTLQLLALISALLMVAFAHSTIVLLTGMLAVGLLGTAMTQGLIAYAASAA 141
+ + RR + L ++A + +++AFA + M+ + G M L A +
Sbjct: 270 VAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGMP-ALQAMLSRQV 328

Query: 142 APHEQGHVVGTAQS-----GVFIGLLLARVFSGGISDIAGWRGVYFCAAIIMLMIALPLW 196
QG + G+ + + LL +++ + I W G + A + ++ LP
Sbjct: 329 DEERQGQLQGSLAALTSLTSIVGPLLFTAIYA---ASITTWNGWAWIAGAALYLLCLPAL 385

Query: 197 RR 198
RR
Sbjct: 386 RR 387


31    A4U85_RS12150    A4U85_RS12195    Y    Y    N    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
A4U85_RS121504241.751920peroxiredoxin
A4U85_RS121552160.347109KTSC domain-containing protein
A4U85_RS12160012-0.063753O-methyltransferase
A4U85_RS121650140.433676DUF1653 domain-containing protein
A4U85_RS12170517-3.863137alkyl hydroperoxide reductase subunit F
A4U85_RS12175722-5.450141hypothetical protein
A4U85_RS12180821-5.542681glutathione S-transferase N-terminal
A4U85_RS12185821-5.340373ExeM/NucH family extracellular endonuclease
A4U85_RS121901023-5.963468tRNA (guanosine(46)-N7)-methyltransferase TrmB
A4U85_RS12195924-5.791603putative Ig domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
A4U85_RS12175    IGASERPTASE    27    0.018    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 27.0 bits (59), Expect = 0.018
Identities = 21/93 (22%), Positives = 40/93 (43%), Gaps = 1/93 (1%)

Query: 22 DKAPETGATTGEHLENAAQQATADIKSAGDQAASDIATATDNAS-AKIDAAADHAADATA 80
AP T + T E + ++Q + ++ A A + A AK + A+ + A
Sbjct: 1027 PPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVA 1086

Query: 81 KAAAETEATARKATADTAQAVENAAADVKKDAQ 113
++ +ET+ T T +TA + A V+ +
Sbjct: 1087 QSGSETKETQTTETKETATVEKEEKAKVETEKT 1119


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
A4U85_RS12195    RTXTOXINA    132    8e-33    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 132 bits (334), Expect = 8e-33
Identities = 74/253 (29%), Positives = 117/253 (46%), Gaps = 32/253 (12%)

Query: 1299 DQISGSDIGDTIYGGTGNDIIKGLEGSDQLFGDEGDDWLDGSYENDFIYGGEGNDQLFGG 1358
++ T G L ++L G D GS D +G +G+D + G
Sbjct: 693 EKTQYRSYEFTHINGKNLTETDNLYSVEELIGTTRADKFFGSKFTDIFHGADGDDLIEGN 752

Query: 1359 DGEDQLEGGNGDDVIDSGEGNDQLIGGTGNDRLLGGEGNDQLQGGEGDDYLEASNGNDFI 1418
DG D+L G G+D + G G+DQL GG GND+L+G GN+ L GG+GDD +
Sbjct: 753 DGNDRLYGDKGNDTLSGGNGDDQLYGGDGNDKLIGVAGNNYLNGGDGDDEFQVQG----- 807

Query: 1419 EGGTGNDQLIGGSGEDSLYGNDGDDHLYGGDGNDILSGGQGNNILEGGNGEDEYVFDILG 1478
+ L GG G D LYG++G D L GG+G+D+L GG GN+I Y +
Sbjct: 808 -NSLAKNVLFGGKGNDKLYGSEGADLLDGGEGDDLLKGGYGNDI---------YRYLSGY 857

Query: 1479 YNNVINDT--DASVLRFLSVSSENIKLEIVGNNLNIF--------YGVDSSVTITDF--- 1525
+++I+D L + ++ + GN+L ++ G + +T ++
Sbjct: 858 GHHIIDDDGGKEDKLSLADIDFRDVAFKREGNDLIMYKGEGNVLSIGHKNGITFRNWFEK 917

Query: 1526 ----ILNYNIKEI 1534
I N+ I++I
Sbjct: 918 ESGDISNHEIEQI 930



Score = 124 bits (312), Expect = 4e-30
Identities = 72/187 (38%), Positives = 102/187 (54%), Gaps = 10/187 (5%)

Query: 1293 HLLGG--DDQISGSDIGDTIYGGTGNDIIKGLEGSDQLFGDEGDDWLDGSYENDFIYGGE 1350
L+G D+ GS D +G G+D+I+G +G+D+L+GD+G+D L G +D +YGG+
Sbjct: 721 ELIGTTRADKFFGSKFTDIFHGADGDDLIEGNDGNDRLYGDKGNDTLSGGNGDDQLYGGD 780

Query: 1351 GNDQLFGGDGEDQLEGGNGDDVI---DSGEGNDQLIGGTGNDRLLGGEGNDQLQGGEGDD 1407
GND+L G G + L GG+GDD + + L GG GND+L G EG D L GGEGDD
Sbjct: 781 GNDKLIGVAGNNYLNGGDGDDEFQVQGNSLAKNVLFGGKGNDKLYGSEGADLLDGGEGDD 840

Query: 1408 YLEASNGNDF--IEGGTGNDQLI-GGSGEDSLYGNDG--DDHLYGGDGNDILSGGQGNNI 1462
L+ GND G G+ + G ED L D D + +GND++ N+
Sbjct: 841 LLKGGYGNDIYRYLSGYGHHIIDDDGGKEDKLSLADIDFRDVAFKREGNDLIMYKGEGNV 900

Query: 1463 LEGGNGE 1469
L G+
Sbjct: 901 LSIGHKN 907



Score = 108 bits (272), Expect = 2e-25
Identities = 78/270 (28%), Positives = 111/270 (41%), Gaps = 38/270 (14%)

Query: 876 EGNDGNDEIWGQGGNDTINGGDGEDRISGD-----GINID---VQYHGDDTVS---GGNG 924
DG+D+++ G+ I G G D + D + ID G+ TV+ GG+
Sbjct: 615 HLGDGDDKVFLSAGSANIYAGKGHDVVYYDKTDTGYLTIDGTKATEAGNYTVTRVLGGDV 674

Query: 925 NDLIWGEGGSDTIYGGEGE---------DYIEGDETSIHIQYHKDDLLYGGEGNDTIFGD 975
L + G E +I G + + + L G D FG
Sbjct: 675 KVLQEVVKEQEVSVGKRTEKTQYRSYEFTHINGKNLTETDNLYSVEELIGTTRADKFFGS 734

Query: 976 GGRDIIYGGKDNDYIYGDSNQIDIQYHNNDTLYGDAGNDNIRGDGGSDIIYGGDGDDFLV 1035
DI +G +D I G ND LYGD GND + G G D +YGGDG+D L+
Sbjct: 735 KFTDIFHGADGDDLIEG--------NDGNDRLYGDKGNDTLSGGNGDDQLYGGDGNDKLI 786

Query: 1036 ADVINNRAFDGDDQLYGGNGKDSLSGFDGN---DTLDGGAGDDIIFGGKGDDYFISNEGN 1092
G++ L GG+G D + + L GG G+D ++G +G D EG+
Sbjct: 787 GV-------AGNNYLNGGDGDDEFQVQGNSLAKNVLFGGKGNDKLYGSEGADLLDGGEGD 839

Query: 1093 DQLSGGEGNDIYHIQVRDSEVKVLDSSGEN 1122
D L GG GNDIY + D G+
Sbjct: 840 DLLKGGYGNDIYRYLSGYGHHIIDDDGGKE 869



Score = 102 bits (256), Expect = 1e-23
Identities = 97/384 (25%), Positives = 146/384 (38%), Gaps = 68/384 (17%)

Query: 559 NIVFSGGGRDQVYGGVGNDIIMTSSRLNAVKDSLLNQMGDADTGLTDLDKEYNILNFNFE 618
+ VF G +Y G G+D++ D D K N+
Sbjct: 621 DKVFLSAGSANIYAGKGHDVVYYDK-------------TDTGYLTIDGTKATEAGNYTVT 667

Query: 619 INLGDNNRFSHYT----NNSIGQITEQI--LPSFNIYYTGKNGKYYIYNLLESLNYGHGY 672
LG + + S+G+ TE+ + GKN + L S+
Sbjct: 668 RVLGGDVKVLQEVVKEQEVSVGKRTEKTQYRSYEFTHINGKN--LTETDNLYSVEE---- 721

Query: 673 AGSDIAYGGIGNDIILGSSDTDFLYGAVGDDAIYGMDGKDYLYGGENKDILYGGDGADQI 732
G D GS TD +GA GDD I G DG D LYG + D L GG+G DQ+
Sbjct: 722 -----LIGTTRADKFFGSKFTDIFHGADGDDLIEGNDGNDRLYGDKGNDTLSGGNGDDQL 776

Query: 733 IGGQGDDELTGGLEGDELYGGDGDDIIQADLNDSIDTLAPPVEPKTYSRFGHDLVYGGAG 792
GG G+D+L G + L GGDGDD Q N ++++GG G
Sbjct: 777 YGGDGNDKLIGVAGNNYLNGGDGDDEFQVQGNS----------------LAKNVLFGGKG 820

Query: 793 KDQIWGNLGSDEIYGDDDDDQIAGDHPNLAGDIHGDDYLYG-GNGNDQIVGDGG------ 845
D+++G+ G+D + G + DD + G + N D Y Y G G+ I DGG
Sbjct: 821 NDKLYGSEGADLLDGGEGDDLLKGGYGN-------DIYRYLSGYGHHIIDDDGGKEDKLS 873

Query: 846 ------KDIIQGGNGNDLIFGDNSELDGKYHAEDNIEGNDGNDEIWGQGGNDTINGGDGE 899
+D+ GNDLI ++ I + ++ G N I +
Sbjct: 874 LADIDFRDVAFKREGNDLIMYKGEGNVLSIGHKNGITFRNWFEKESGDISNHEIEQIFDK 933

Query: 900 D--RISGDGINIDVQYHGDDTVSG 921
I+ D + ++Y + +
Sbjct: 934 SGRIITPDSLKKALEYQQRNNKAS 957



Score = 91.6 bits (227), Expect = 3e-20
Identities = 81/324 (25%), Positives = 121/324 (37%), Gaps = 35/324 (10%)

Query: 787 VYGGAGKDQIWGNLGSDEIYGDDDDDQIAGDHPNL-AGDIHGDDYLYGGNGNDQIVGDGG 845
+ G G D+++ + GS IY D + D + I G GN V G
Sbjct: 614 SHLGDGDDKVFLSAGSANIYAGKGHDVVYYDKTDTGYLTIDGTKATEAGNYTVTRVLGGD 673

Query: 846 KDIIQGGNGN-DLIFGDNSELDGKYHAEDNIEGNDGNDEIWGQGGNDTINGGDGEDRISG 904
++Q ++ G +E E E + + G D+ G
Sbjct: 674 VKVLQEVVKEQEVSVGKRTEKTQYRSYEFTHINGKNLTETDNLYSVEELIGTTRADKFFG 733

Query: 905 DGINIDVQYHGDDTVSGGNGNDLIWGEGGSDTIYGGEGEDYIEGDETSIHIQYHKDDLLY 964
D G +G+DLI G G+D +YG +G D + G DD LY
Sbjct: 734 SKFT--------DIFHGADGDDLIEGNDGNDRLYGDKGNDTLSGGN--------GDDQLY 777

Query: 965 GGEGNDTIFGDGGRDIIYGGKDNDYIYGDSNQIDIQYHN----NDTLYGDAGNDNIRGDG 1020
GG+GND + G G + + GG +D N + ND LYG G D + G
Sbjct: 778 GGDGNDKLIGVAGNNYLNGGDGDDEFQVQGNSLAKNVLFGGKGNDKLYGSEGADLLDGGE 837

Query: 1021 GSDIIYGGDGDDFLVADVINNRAFDGDDQLY-GGNGKDSLSGFDGNDTLDGGAGDDIIFG 1079
G D++ GG G+D + G + G +D LS D + D+ F
Sbjct: 838 GDDLLKGGYGNDIYR-----YLSGYGHHIIDDDGGKEDKLSLADID-------FRDVAFK 885

Query: 1080 GKGDDYFISNEGNDQLSGGEGNDI 1103
+G+D + + LS G N I
Sbjct: 886 REGNDLIMYKGEGNVLSIGHKNGI 909



Score = 87.7 bits (217), Expect = 5e-19
Identities = 63/215 (29%), Positives = 92/215 (42%), Gaps = 32/215 (14%)

Query: 2446 VFKLIFVVQNQTINGTSNADTLYGASGNDTLTGQAGNDILYGQAGNDTLNGGTGNDTMYG 2505
V +LI + G+ D +GA G+D + G GND LYG GNDTL+GG G+D +YG
Sbjct: 719 VEELIGTTRADKFFGSKFTDIFHGADGDDLIEGNDGNDRLYGDKGNDTLSGGNGDDQLYG 778

Query: 2506 GKGDDTYIVDSTADVISESVNEGTDIVQSSVTYTLLNNVENLTLTGTTAINGTGNALNNV 2565
G G+D I + + ++ +G D Q + NV
Sbjct: 779 GDGNDKLIGVAGNNYLNGG--DGDDEFQV----------------------QGNSLAKNV 814

Query: 2566 IVGNSAINTLTGGVGDDYLNGGVGADKLLGGIGNDSYVIDNT--GDIVTENAGEGIDTVL 2623
+ G + L G G D L+GG G D L GG GND Y + I+ ++ G+ L
Sbjct: 815 LFGGKGNDKLYGSEGADLLDGGEGDDLLKGGYGNDIYRYLSGYGHHIIDDDGGKEDKLSL 874

Query: 2624 SSITYT------LGNNLENLTLIGSTAINGTGNAL 2652
+ I + GN+L G+ G N +
Sbjct: 875 ADIDFRDVAFKREGNDLIMYKGEGNVLSIGHKNGI 909



Score = 79.6 bits (196), Expect = 1e-16
Identities = 85/425 (20%), Positives = 139/425 (32%), Gaps = 107/425 (25%)

Query: 1318 IIKGLEGSDQLFGDEGDDWLDGSYENDFIYGGEGNDQLFGGDGEDQLE------------ 1365
+G D++F G + +D +Y + + DG E
Sbjct: 613 ESHLGDGDDKVFLSAGSANIYAGKGHDVVYYDKTDTGYLTIDGTKATEAGNYTVTRVLGG 672

Query: 1366 ---------------------------------GGNGDDVIDSGEGNDQLIGGTGNDRLL 1392
G D+ ++LIG T D+
Sbjct: 673 DVKVLQEVVKEQEVSVGKRTEKTQYRSYEFTHINGKNLTETDNLYSVEELIGTTRADKFF 732

Query: 1393 GGEGNDQLQGGEGDDYLEASNGNDFIEGGTGNDQLIGGSGEDSLYGNDGDDHLYGGDGND 1452
G + D G +GDD IEG GND+L G G D+L G +GDD LYGGDGND
Sbjct: 733 GSKFTDIFHGADGDDL---------IEGNDGNDRLYGDKGNDTLSGGNGDDQLYGGDGND 783

Query: 1453 ILSGGQGNNILEGGNGEDEYVFDILGYNNVINDTDASVLRFLSVSSENIKLEIVGNNLNI 1512
L G GNN L GG+G+DE+ + + ++ G++
Sbjct: 784 KLIGVAGNNYLNGGDGDDEFQVQGNSLAKNVLFGGKGNDKLYGSEGADLLDGGEGDD--- 840

Query: 1513 FYGVDSSVTITDFILNYNIKEIQFSDTVWYPEDLSQKINFLHNGSDGNDIIV-GNPYFNN 1571
L G GNDI + Y ++
Sbjct: 841 ----------------------------------------LLKGGYGNDIYRYLSGYGHH 860

Query: 1572 TVYSKNGDD---AIYGGSGVDYIYGGNGNDAIVDPGNIGFLYGGEGDDNLIGNNNSEFYG 1628
+ G + ++ D + GND I+ G L G + N + G
Sbjct: 861 IIDDDGGKEDKLSLADIDFRDVAFKREGNDLIMYKGEGNVLSIGHKNGITFRNWFEKESG 920

Query: 1629 GEGYDSY-QVNGRNTKIFDSD-----FKGSIYIHNASDIHESNYVQEKANMIISPFINSL 1682
Q+ ++ +I D + + AS ++ ++ + + ++P IN +
Sbjct: 921 DISNHEIEQIFDKSGRIITPDSLKKALEYQQRNNKASYVYGNDALAYGSQGDLNPLINEI 980

Query: 1683 SEFLS 1687
S+ +S
Sbjct: 981 SKIIS 985



Score = 79.6 bits (196), Expect = 1e-16
Identities = 62/224 (27%), Positives = 94/224 (41%), Gaps = 36/224 (16%)

Query: 919 VSGGNGNDLIWGEGGSDTIYGGEGEDY------------IEGDETSIHIQYHKDDLLYGG 966
G+G+D ++ GS IY G+G D I+G + + Y +L G
Sbjct: 614 SHLGDGDDKVFLSAGSANIYAGKGHDVVYYDKTDTGYLTIDGTKATEAGNYTVTRVLGGD 673

Query: 967 --------EGNDTIFGDGGRDIIYGGKDNDYIYGDSNQIDIQYHNNDTLYGDAGNDNIRG 1018
+ + G Y + +I G + ++ + L G D G
Sbjct: 674 VKVLQEVVKEQEVSVGKRTEKTQYRSYEFTHINGKNLTETDNLYSVEELIGTTRADKFFG 733

Query: 1019 DGGSDIIYGGDGDDFLVADVINNRAFDGDDQLYGGNGKDSLSGFDGNDTLDGGAGDDIIF 1078
+DI +G DGDD + + DG+D+LYG GNDTL GG GDD ++
Sbjct: 734 SKFTDIFHGADGDDLIEGN-------DGNDRLYGD---------KGNDTLSGGNGDDQLY 777

Query: 1079 GGKGDDYFISNEGNDQLSGGEGNDIYHIQVRDSEVKVLDSSGEN 1122
GG G+D I GN+ L+GG+G+D + +Q VL N
Sbjct: 778 GGDGNDKLIGVAGNNYLNGGDGDDEFQVQGNSLAKNVLFGGKGN 821



Score = 75.4 bits (185), Expect = 3e-15
Identities = 58/258 (22%), Positives = 97/258 (37%), Gaps = 45/258 (17%)

Query: 1816 GNGNDQINSGLGSSQVYAGEGDDVI-----------------ITPDVFAIDKISGGAGN- 1857
G+G+D++ GS+ +YAG+G DV+ + + ++ GG
Sbjct: 617 GDGDDKVFLSAGSANIYAGKGHDVVYYDKTDTGYLTIDGTKATEAGNYTVTRVLGGDVKV 676

Query: 1858 --DVIYTYDPLSNNQISNSPYLNINLAR---NIYDIKDVVYGGEGDDYIYLGNQKTEAYG 1912
+V+ + + + Y + D +Y E + + + +G
Sbjct: 677 LQEVVKEQEVSVGKRTEKTQYRSYEFTHINGKNLTETDNLYSVE---ELIGTTRADKFFG 733

Query: 1913 DDGNDIIVSYSLSSNIQFLYGGDGNDTLKAGDGGAYLDGGAGTDHLVGGSGDDTFIVDEQ 1972
DI + + + G DGND L G L GG G D L GG G+D I
Sbjct: 734 SKFTDI---FHGADGDDLIEGNDGNDRLYGDKGNDTLSGGNGDDQLYGGDGNDKLIGVAG 790

Query: 1973 DTYEENDPNGGYDTIHISQNIDLSLGYLEAVTLLGSQNLSIYGNTSDNKLIGNAGNNYID 2032
+ Y + G D + N ++G ++KL G+ G + +D
Sbjct: 791 NNYL--NGGDGDDEFQVQGN--------------SLAKNVLFGGKGNDKLYGSEGADLLD 834

Query: 2033 GRAGSDYMQGGLGNDYYV 2050
G G D ++GG GND Y
Sbjct: 835 GGEGDDLLKGGYGNDIYR 852



Score = 68.1 bits (166), Expect = 4e-13
Identities = 49/190 (25%), Positives = 74/190 (38%), Gaps = 44/190 (23%)

Query: 1802 DYVGGTGSGDIIYAGNGNDQINSGLGSSQVYAGEGDDVIITPDVFAIDKISGGAGNDVIY 1861
D G+ DI + +G+D I G+ ++Y +G+D +SGG G+D +
Sbjct: 729 DKFFGSKFTDIFHGADGDDLIEGNDGNDRLYGDKGNDT-----------LSGGNGDDQL- 776

Query: 1862 TYDPLSNNQISNSPYLNINLARNIYDIKDVVYGGEGDDYIYLGNQKTEAYGDDGNDIIVS 1921
YGG+G+D + G DG+D
Sbjct: 777 -------------------------------YGGDGNDKLIGVAGNNYLNGGDGDDEFQV 805

Query: 1922 YSLSSNIQFLYGGDGNDTLKAGDGGAYLDGGAGTDHLVGGSGDDTFIVDEQDTYEENDPN 1981
S L+GG GND L +G LDGG G D L GG G+D + + D +
Sbjct: 806 QGNSLAKNVLFGGKGNDKLYGSEGADLLDGGEGDDLLKGGYGNDIYRYLSGYGHHIIDDD 865

Query: 1982 GG-YDTIHIS 1990
GG D + ++
Sbjct: 866 GGKEDKLSLA 875



Score = 67.3 bits (164), Expect = 9e-13
Identities = 77/391 (19%), Positives = 145/391 (37%), Gaps = 74/391 (18%)

Query: 1887 DIKDVVYGGEGDDYIYLGNQKTEAYGDDGNDIIVSYSLSSNIQFLYGGD-----GNDTLK 1941
+I+ + G+GDD ++L Y G+D+ V Y + G GN T+
Sbjct: 609 EIRIESHLGDGDDKVFLSAGSANIYAGKGHDV-VYYDKTDTGYLTIDGTKATEAGNYTVT 667

Query: 1942 ---AGDGGAYLDGGAGTDHLVGGSGDDTFIVDEQDTYEENDPNGGYDTIHISQNIDLSLG 1998
GD + + VG + T + T+ D ++ + + +G
Sbjct: 668 RVLGGDVKVLQEVVKEQEVSVGKRTEKTQYRSYEFTHINGKNLTETDNLYSVEEL---IG 724

Query: 1999 YLEAVTLLGSQ----------NLSIYGNTSDNKLIGNAGNNYIDGRAGSDYMQGGLGNDY 2048
A GS+ + I GN +++L G+ GN+ + G G D + GG GND
Sbjct: 725 TTRADKFFGSKFTDIFHGADGDDLIEGNDGNDRLYGDKGNDTLSGGNGDDQLYGGDGNDK 784

Query: 2049 YVVDTTETIETDENGNTYFIEGDQVVEDVDGGIDTLERWEDARFISQDENGNPVLTDSYK 2108
+ GN Y GD G+ D ++
Sbjct: 785 LI---------GVAGNNYLNGGD---------------------------GD----DEFQ 804

Query: 2109 ILENNIENLILKGNA--KTGFGNDLDNIIVGNEQDNYIDGLAGNDTYIFSRGGGTDTYSF 2166
+ N++ +L G +G++ +++ G E D+ + G GND Y + G G
Sbjct: 805 VQGNSLAKNVLFGGKGNDKLYGSEGADLLDGGEGDDLLKGGYGNDIYRYLSGYGHHI--I 862

Query: 2167 EDNIDAVNILKIQGYSANDVSAQKYGDSVYLSFK-------GTNDHIWLSNYYIADTENT 2219
+D+ + L + DV+ ++ G+ + + G + I N++ ++ +
Sbjct: 863 DDDGGKEDKLSLADIDFRDVAFKREGNDLIMYKGEGNVLSIGHKNGITFRNWFEKESGDI 922

Query: 2220 -TYKMDQINFDSGVIWGVDDINTLVNRALTN 2249
++++QI SG I D + + N
Sbjct: 923 SNHEIEQIFDKSGRIITPDSLKKALEYQQRN 953



Score = 65.4 bits (159), Expect = 3e-12
Identities = 73/305 (23%), Positives = 106/305 (34%), Gaps = 69/305 (22%)

Query: 2480 AGNDILYGQAGNDTLNGGTGNDTMYGGKGDDTYIVDSTADVISESVNEGTDIVQSSVTYT 2539
G+D ++ AG+ + G G+D +Y K D Y+ T D T+ +VT
Sbjct: 618 DGDDKVFLSAGSANIYAGKGHDVVYYDKTDTGYL---TIDGTKA-----TEAGNYTVTRV 669

Query: 2540 LLNNVENLTLTGTTAINGTGNALNNVIVGNSAINTLTGGVGDDYLNGGVGADKLLGGIGN 2599
L +V+ L G + T G + ++L+G
Sbjct: 670 LGGDVKVLQEVVKEQEVSVGKRTEKTQYRSYE-FTHINGKNLTETDNLYSVEELIGTTRA 728

Query: 2600 DSYVIDNTGDIVTENAGEGIDTVLSSITYTLGNNLENLTLIGSTAINGTGNALNNVLVGN 2659
D + DI + +G D + GN N+ L G+
Sbjct: 729 DKFFGSKFTDIF--HGADGDDLI-------------------------EGNDGNDRLYGD 761

Query: 2660 SAINTLTAGVGDDYLDGGAGADKLLGGIGNDTYVIDNTGDIVTENAGEGIDTVLSSITYT 2719
+TL+ G GDD L GG G DKL+G GN+ Y+ GD + G +
Sbjct: 762 KGNDTLSGGNGDDQLYGGDGNDKLIGVAGNN-YLNGGDGDDEFQVQGNSLAK-------- 812

Query: 2720 LSSNLENLTLTGSTAINATGNTLNNTLTGNSGVNALNGGAGNDILDGQGGNDQLTGGTGI 2779
N L G G + L G G D+LDG G+D L GG G
Sbjct: 813 ------------------------NVLFGGKGNDKLYGSEGADLLDGGEGDDLLKGGYGN 848

Query: 2780 DTALY 2784
D Y
Sbjct: 849 DIYRY 853



Score = 60.0 bits (145), Expect = 1e-10
Identities = 50/194 (25%), Positives = 78/194 (40%), Gaps = 8/194 (4%)

Query: 329 IVSGDGDDSITGTKYSDALYGGSDDDSLNGAEDNDYLNGGVGNDILDGGEGDDNLVDEKG 388
++ D G+K++D +G DD + G + ND L G GND L GG GDD L G
Sbjct: 722 LIGTTRADKFFGSKFTDIFHGADGDDLIEGNDGNDRLYGDKGNDTLSGGNGDDQLYGGDG 781

Query: 389 NDTYIFNGNFGKDSIYDLDGNGQIKIDDIALSV----GEKINNKLWKS--ADGQYSLALV 442
ND I G G + + DG+ + ++ +L+ G K N+KL+ S AD
Sbjct: 782 NDKLI--GVAGNNYLNGGDGDDEFQVQGNSLAKNVLFGGKGNDKLYGSEGADLLDGGEGD 839

Query: 443 EDFDGVATTQKVVISKEDDNNSIIIKYFKNGALGLNFGDLENNSTPPSDTSIYAFGSEGN 502
+ G ++ I K L L D + + + + EGN
Sbjct: 840 DLLKGGYGNDIYRYLSGYGHHIIDDDGGKEDKLSLADIDFRDVAFKREGNDLIMYKGEGN 899

Query: 503 NIIFGERYVASFSG 516
+ G + +F
Sbjct: 900 VLSIGHKNGITFRN 913



Score = 58.4 bits (141), Expect = 4e-10
Identities = 38/123 (30%), Positives = 51/123 (41%), Gaps = 31/123 (25%)

Query: 328 SIVSGDGDDSITGTKYSDALYGGSDDDSLNGAEDNDYLNGGVGNDI-------------- 373
+ G+D+++G D LYGG +D L G N+YLNGG G+D
Sbjct: 757 RLYGDKGNDTLSGGNGDDQLYGGDGNDKLIGVAGNNYLNGGDGDDEFQVQGNSLAKNVLF 816

Query: 374 ----------------LDGGEGDDNLVDEKGNDTYIFNGNFGKDSIYDLDG-NGQIKIDD 416
LDGGEGDD L GND Y + +G I D G ++ + D
Sbjct: 817 GGKGNDKLYGSEGADLLDGGEGDDLLKGGYGNDIYRYLSGYGHHIIDDDGGKEDKLSLAD 876

Query: 417 IAL 419
I
Sbjct: 877 IDF 879



Score = 45.7 bits (108), Expect = 3e-06
Identities = 27/89 (30%), Positives = 46/89 (51%), Gaps = 13/89 (14%)

Query: 498 GSEGNNIIFGERYVASFSGNDFIHATNEDAVIIAGDGNDLISTGDGNDVIYAG------- 550
G++GN+ ++G++ GND + N D + GDGND + GN+ + G
Sbjct: 751 GNDGNDRLYGDK------GNDTLSGGNGDDQLYGGDGNDKLIGVAGNNYLNGGDGDDEFQ 804

Query: 551 VVTSDQDSNIVFSGGGRDQVYGGVGNDII 579
V + N++F G G D++YG G D++
Sbjct: 805 VQGNSLAKNVLFGGKGNDKLYGSEGADLL 833


32    A4U85_RS12480    A4U85_RS12590    Y    Y    N    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
A4U85_RS12480325-1.877399DUF2280 domain-containing protein
A4U85_RS12485325-2.896959hypothetical protein
A4U85_RS12490427-3.591286hypothetical protein
A4U85_RS12495427-3.501962hypothetical protein
A4U85_RS12500225-2.501884hypothetical protein
A4U85_RS125102240.133618antitermination protein
A4U85_RS125154260.971722DUF1064 domain-containing protein
A4U85_RS125204283.411380hypothetical protein
A4U85_RS125254263.202922hypothetical protein
A4U85_RS125302263.249815hypothetical protein
A4U85_RS125352273.322132hypothetical protein
A4U85_RS125401281.609811YdaU family protein
A4U85_RS125452281.745653DNA cytosine methyltransferase
A4U85_RS12550224-1.421379hypothetical protein
A4U85_RS12555224-1.381626hypothetical protein
A4U85_RS12560-120-0.579553helix-turn-helix domain-containing protein
A4U85_RS12565-120-0.716518helix-turn-helix domain-containing protein
A4U85_RS12570020-0.392353hypothetical protein
A4U85_RS12575021-0.249774hypothetical protein
A4U85_RS12580019-0.410000hypothetical protein
A4U85_RS12585018-0.716860ATP-binding protein
A4U85_RS12590320-1.533556hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
A4U85_RS12540    IGASERPTASE    35    4e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 35.0 bits (80), Expect = 4e-04
Identities = 23/118 (19%), Positives = 44/118 (37%), Gaps = 6/118 (5%)

Query: 84 EREIAEYHGKKKQASEAGKASAAKRAAKKKGSSNSDSSKDDQASNENSTVVENPLNEEQT 143
RE K+ S+ + ++ AK+ S+ + N ++VVENP N
Sbjct: 1146 ARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPA 1205

Query: 144 DVQPTNNHKPLTINQEPIIDSSSNAREENSQFTPIQFAQYQIDDHKRYSMREFISEYS 201
QPT N ++ + + R S ++ A +D ++ + S +
Sbjct: 1206 TTQPTVN------SESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNT 1257


33    A4U85_RS12925    A4U85_RS13130    Y    Y    N    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
A4U85_RS12925115-3.205391DUF799 domain-containing protein
A4U85_RS12930-112-3.369989DUF4810 domain-containing protein
A4U85_RS12935-214-4.862224CsgG/HfaB family protein
A4U85_RS12940-216-5.108822hypothetical protein
A4U85_RS12945-312-2.076137hypothetical protein
A4U85_RS12950-212-1.632814DUF898 domain-containing protein
A4U85_RS12960-113-0.756262M48 family metallopeptidase
A4U85_RS129650160.033540TetR/AcrR family transcriptional regulator
A4U85_RS129701263.632643hypothetical protein
A4U85_RS129751263.849845type I restriction endonuclease subunit R
A4U85_RS129851335.173298restriction endonuclease subunit S
A4U85_RS129901345.243633SAM-dependent DNA methyltransferase
A4U85_RS129951416.348547DUF2559 family protein
A4U85_RS130001375.775336putative adenosine monophosphate-protein
A4U85_RS130051559.969486mercuric resistance transcriptional repressor
A4U85_RS1301038419.128955mercury(II) reductase
A4U85_RS1301548520.276675organomercurial transporter MerC
A4U85_RS1302569623.364971mercury resistance system periplasmic binding
A4U85_RS1303069724.112843mercuric ion transporter MerT
A4U85_RS1303559523.422459Hg(II)-responsive transcriptional regulator
A4U85_RS1304059422.697843signal peptidase II
A4U85_RS1925057817.730220cation transporter
A4U85_RS1925547517.024755Cd(II)/Pb(II)-responsive transcriptional
A4U85_RS1305046916.188333hypothetical protein
A4U85_RS1305525613.096299YqaJ viral recombinase family protein
A4U85_RS130651378.873216DUF932 domain-containing protein
A4U85_RS130702368.507130hypothetical protein
A4U85_RS130752305.763975hypothetical protein
A4U85_RS130803284.360034hypothetical protein
A4U85_RS130903252.187119hypothetical protein
A4U85_RS131005290.109505hypothetical protein
A4U85_RS13105526-1.538091inovirus-type Gp2 protein
A4U85_RS13115526-2.527307hypothetical protein
A4U85_RS13120525-2.100244hypothetical protein
A4U85_RS13125727-2.915033pyocin activator PrtN family protein
A4U85_RS13130526-2.719193AlpA family phage regulatory protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
A4U85_RS12935    PF06291    37    2e-06    Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 37.3 bits (86), Expect = 2e-06
Identities = 20/51 (39%), Positives = 25/51 (49%), Gaps = 13/51 (25%)

Query: 1 MKKILFSSFLAMSLVGCAAGPQPLYNWGTYTHQTYLMYNAPEKATPNTQIT 51
MKK+LFS+ LAM + GCA QT+ + N P TP IT
Sbjct: 6 MKKMLFSAALAMLITGCAQ-------------QTFTVGNKPTAVTPKETIT 43


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
A4U85_RS12980    HTHTETR    72    7e-18    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 72.0 bits (176), Expect = 7e-18
Identities = 28/193 (14%), Positives = 71/193 (36%), Gaps = 11/193 (5%)

Query: 9 KKYPKRDIVLSTATQLFIQENYISVGVDRIIAEADIAKATFYKHFPSKEELVFYCLKQLK 68
+ R +L A +LF Q+ S + I A + + Y HF K +L + +
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 69 LDIQMAVEEHISKLT-TPLEKLEQLYLWYIDWVNQN----GVRGCPFHKAGM--EVGNLY 121
+I E+ +K PL L ++ + ++ + FHK E+ +
Sbjct: 68 SNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQ 127

Query: 122 PSIQVALDEYWNWLFNLTTSILKEMEIK---DPIALTRLVLSLLDGIINSSVCEQGPIE- 177
+ + E ++ + ++ + ++ + G++ + + +
Sbjct: 128 QAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFAPQSFDL 187

Query: 178 PEKTWEFIQTLIE 190
++ +++ L+E
Sbjct: 188 KKEARDYVAILLE 200


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
A4U85_RS19250 | ACRIFLAVINRP | 27 | 0.041 | Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 26.7 bits (59), Expect = 0.041
Identities = 21/76 (27%), Positives = 33/76 (43%), Gaps = 10/76 (13%)

Query: 48 LFIGILLPMFAGIALLANAIAWLNHRQWRRTALGTIG-PILVLAAVFLMRAYGWQSGGLL 106
LF I+L L N R T + TI P+++L ++ A+G+ L
Sbjct: 344 LFEAIMLVFLVMYLFLQN---------MRATLIPTIAVPVVLLGTFAILAAFGYSINTLT 394

Query: 107 YVGLALMVGVSVWDFI 122
G+ L +G+ V D I
Sbjct: 395 MFGMVLAIGLLVDDAI 410


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
A4U85_RS13125 | RTXTOXINA | 31 | 0.009 | Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 30.7 bits (69), Expect = 0.009
Identities = 13/74 (17%), Positives = 30/74 (40%), Gaps = 1/74 (1%)

Query: 89 TFKDWQAYFNQARADHLNELRQHRYSETLNAAKLDNRLE-EILKSYSAVLVVRVDLAYVV 147
TF++W + ++H E + + L LE + + ++ + LAY
Sbjct: 910 TFRNWFEKESGDISNHEIEQIFDKSGRIITPDSLKKALEYQQRNNKASYVYGNDALAYGS 969

Query: 148 SPDIEQVDNDLETL 161
D+ + N++ +
Sbjct: 970 QGDLNPLINEISKI 983


34 | A4U85_RS13310 | A4U85_RS13690 | Y | Y | Y | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
A4U85_RS13310-210-3.257878UPF0149 family protein
A4U85_RS13315016-5.627433hypothetical protein
A4U85_RS13320116-5.369343cell division protein ZapA
A4U85_RS13330115-5.126806integrase family protein
A4U85_RS13335114-1.963550hypothetical protein
A4U85_RS13340114-1.685077phosphoethanolamine--lipid A transferase
A4U85_RS13345317-0.714492TonB-dependent receptor
A4U85_RS133502193.076847N-acetylmuramidase
A4U85_RS133552181.910192hypothetical protein
A4U85_RS133600171.515224hypothetical protein
A4U85_RS133652200.094862hypothetical protein
A4U85_RS13370219-0.648823DUF1833 family protein
A4U85_RS13375217-1.021938hypothetical protein
A4U85_RS13380320-1.297491hypothetical protein
A4U85_RS13385419-0.286564tape measure protein
A4U85_RS13390117-1.946408hypothetical protein
A4U85_RS13395218-0.396839hypothetical protein
A4U85_RS134002190.843609helix-turn-helix domain-containing protein
A4U85_RS134053190.926228DUF4258 domain-containing protein
A4U85_RS134153190.962037hypothetical protein
A4U85_RS134202170.429872hypothetical protein
A4U85_RS134252190.597134hypothetical protein
A4U85_RS13430220-0.460676hypothetical protein
A4U85_RS13435322-1.529507SH3 domain-containing protein
A4U85_RS13440223-1.562862hypothetical protein
A4U85_RS13445321-1.178426hypothetical protein
A4U85_RS13450522-1.099510hypothetical protein
A4U85_RS13455523-1.506272hypothetical protein
A4U85_RS13460220-0.045211hypothetical protein
A4U85_RS134650181.009747hypothetical protein
A4U85_RS134700181.415667hypothetical protein
A4U85_RS134751191.037638hypothetical protein
A4U85_RS134801191.285487hypothetical protein
A4U85_RS134850201.623580hypothetical protein
A4U85_RS134902211.999644hypothetical protein
A4U85_RS134951241.695221hypothetical protein
A4U85_RS194352241.280484hypothetical protein
A4U85_RS135001231.613166hypothetical protein
A4U85_RS135052221.681819minor capsid protein
A4U85_RS135102210.374444DUF4055 domain-containing protein
A4U85_RS13515323-1.521233terminase family protein
A4U85_RS13520424-3.883199DUF2280 domain-containing protein
A4U85_RS13525324-4.559238hypothetical protein
A4U85_RS13530728-6.419659hypothetical protein
A4U85_RS13535730-8.181868hypothetical protein
A4U85_RS13540435-7.638174hypothetical protein
A4U85_RS13545231-7.393990hypothetical protein
A4U85_RS19395128-5.058699hypothetical protein
A4U85_RS13555229-4.057313hypothetical protein
A4U85_RS13560127-3.096891hypothetical protein
A4U85_RS13565224-2.198946hypothetical protein
A4U85_RS13570222-2.574991hypothetical protein
A4U85_RS13575221-1.724781DUF559 domain-containing protein
A4U85_RS13580121-2.079818hypothetical protein
A4U85_RS13585022-2.310006hypothetical protein
A4U85_RS13590021-2.389390hypothetical protein
A4U85_RS13595-120-2.307729replication protein
A4U85_RS13600222-0.310240hypothetical protein
A4U85_RS13605321-1.922092hypothetical protein
A4U85_RS13610422-3.308461hypothetical protein
A4U85_RS13615322-3.336799hypothetical protein
A4U85_RS13620424-3.402253LexA family transcriptional regulator
A4U85_RS13625524-3.640346hypothetical protein
A4U85_RS13630525-3.545102hypothetical protein
A4U85_RS13635729-2.358496hypothetical protein
A4U85_RS136401240.458128hypothetical protein
A4U85_RS136451210.561266hypothetical protein
A4U85_RS136500210.749316hypothetical protein
A4U85_RS136550180.741768hypothetical protein
A4U85_RS136601160.291174hypothetical protein
A4U85_RS136650160.056334ATP-binding protein
A4U85_RS13670317-0.847212hypothetical protein
A4U85_RS13675220-1.739155hypothetical protein
A4U85_RS13680318-3.059211hypothetical protein
A4U85_RS13685118-4.351495hypothetical protein
A4U85_RS13690-319-3.950849hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
A4U85_RS13315 | GPOSANCHOR | 28 | 0.018 | Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 28.5 bits (63), Expect = 0.018
Identities = 32/187 (17%), Positives = 60/187 (32%), Gaps = 3/187 (1%)

Query: 3 EQLQRLQAHIGVLKTRLHHLESENSALSEAKELAETEHHAQVVQKNSIITKKQE---EIE 59
+ L + LK L E S E + + + + +K + +E
Sbjct: 71 LKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALE 130

Query: 60 TLTEQLTQLQGQFQQLNQDANTLAERYSRLEKSTTDLKNRFQEILAERNELRVTKEKLQS 119
T + + L + LA R + LEK+ N A+ L K L++
Sbjct: 131 GAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEA 190

Query: 120 QQRQTQQELHDLQQDRDRLLQKNELAKAKVEAIIQRLAILGTAQDQHAQEIQQLAHPNAE 179
+Q + ++ L K + +A+ A+ R A L A + +
Sbjct: 191 RQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKT 250

Query: 180 AGEETQS 186
E +
Sbjct: 251 LEAEKAA 257


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
A4U85_RS13345 | FLGPRINGFLGI | 30 | 0.042 | Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 29.9 bits (67), Expect = 0.042
Identities = 11/48 (22%), Positives = 21/48 (43%)

Query: 106 NIPFDPIFIEKVIVNKNTDNIRYGGNAIGGSVQIESGLIPKKIEEKPN 153
N+ + KV++N+ T I G + V + G + ++ E P
Sbjct: 252 NLTVETDTPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQ 299


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
A4U85_RS13490 | IGASERPTASE | 30 | 0.011 | IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.0 bits (67), Expect = 0.011
Identities = 31/190 (16%), Positives = 61/190 (32%), Gaps = 17/190 (8%)

Query: 37 DPKGLKSALQSERDAAKNAKLELQKLQKQFEGIDPEIVKKVFAQIDQDEEAKLIAEGKVN 96
P + +E A+N+K E + ++K + D ++ ++ ++ + A + N
Sbjct: 1027 PPAPATPSETTET-VAENSKQESKTVEKNEQ--DATETTAQNREVAKEAKSNVKANTQTN 1083

Query: 97 EVIQKRTE---KMREEHEKLLKAEKERADKAEAYAQKFKQSVIQSQIV-----QAAIELE 148
EV Q +E E ++ EKE K E + + + SQ+ ++ +
Sbjct: 1084 EVAQSGSETKETQTTETKETATVEKEEKAKVET-EKTQEVPKVTSQVSPKQEQSETVQPQ 1142

Query: 149 ALPEATPDIAFLAQSKFAL-----DENGKAVAVDENGEVVIGKDGQTPMTPKEWVESLRE 203
A P D + + D A N E + +
Sbjct: 1143 AEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENT 1202

Query: 204 QKPYYWPKPN 213
P N
Sbjct: 1203 TPATTQPTVN 1212


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
A4U85_RS13530 | PF03544 | 29 | 0.006 | Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 28.8 bits (64), Expect = 0.006
Identities = 10/46 (21%), Positives = 19/46 (41%)

Query: 10 TRKKEPKTKPKSRPLPKPTQKYLEAEATLKEELEDLSIGFEQKFQP 55
K+ P K +P PKP K ++ K +++ + F+
Sbjct: 86 PPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFEN 131


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
A4U85_RS13590 | BCTERIALGSPG | 29 | 0.013 | Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 28.7 bits (64), Expect = 0.013
Identities = 20/82 (24%), Positives = 32/82 (39%), Gaps = 16/82 (19%)

Query: 174 PILLAQKNEQKVHIPVSNEEAQKHLKSLMERLKI-NGRKPAPVQKLQAKEKEPELA---- 228
P L+ K + VS+ A L++ ++ K+ N P Q L++ + P L
Sbjct: 31 PNLMGNKEKADKQKAVSDIVA---LENALDMYKLDNHHYPTTNQGLESLVEAPTLPPLAA 87

Query: 229 --------KELGPDPFDNPYEY 242
K L DP+ N Y
Sbjct: 88 NYNKEGYIKRLPADPWGNDYVL 109


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
A4U85_RS13670 | IGASERPTASE | 32 | 0.003 | IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 32.0 bits (72), Expect = 0.003
Identities = 21/118 (17%), Positives = 41/118 (34%), Gaps = 9/118 (7%)

Query: 138 KPTEEVEEEKPSESHAESLKVKNSKPEVVEKAQTAEPAIESETADSEYQKKLDTLLQRVK 197
K E E+ + + K + E V+ AEPA E++ + + + +
Sbjct: 1111 KAKVETEKTQEVPKVTSQVSPKQEQSETVQPQ--AEPARENDPTVNIKEPQ-------SQ 1161

Query: 198 DSKTPDEVNAVYRYTRTWSDKQMEPLLLATHKRLEELEKSKAQATEPPSLMVQIQNAP 255
+ T D + E + T + E ++ AT P++ + N P
Sbjct: 1162 TNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKP 1219


35 | A4U85_RS14065 | A4U85_RS14120 | Y | Y | N | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
A4U85_RS140651153.044149peptide deformylase
A4U85_RS140701163.030458MFS transporter
A4U85_RS140750173.448668hypothetical protein
A4U85_RS140801184.451446thiamine pyrophosphate-binding protein
A4U85_RS140851184.047474aldehyde dehydrogenase
A4U85_RS140901183.021030aspartate dehydrogenase
A4U85_RS140950162.518643SDR family oxidoreductase
A4U85_RS141001172.820632alpha/beta hydrolase
A4U85_RS141051153.002777cupin domain-containing protein
A4U85_RS141101152.765338MFS transporter
A4U85_RS141152142.691010FAD-dependent oxidoreductase
A4U85_RS141202172.001965VOC family protein
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
A4U85_RS14070 | TCRTETA | 37 | 9e-05 | Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 37.5 bits (87), Expect = 9e-05
Identities = 49/275 (17%), Positives = 90/275 (32%), Gaps = 46/275 (16%)

Query: 62 NTYGIFAAGY-----FFRPLGGVVMAHFGDLVGRKKLFSLSILLMALPTLFIGILPTFEN 116
YGI A Y P+ G D GR+ + +S+ A+ + P
Sbjct: 43 AHYGILLALYALMQFACAPVLG----ALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLW- 97

Query: 117 IGYLAPLLLLLMRMVQGIAIGGEIPAAWTFVSEHVPE----RKIGLANGLLTAGLSLGIL 172
+L + R+V GI G A ++++ R G + G+ G +
Sbjct: 98 -------VLYIGRIVAGIT-GATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPV 149

Query: 173 LGALMSLWISLNFSEGQIHDWAWRIPFIVGGIFGLVALYLRTYLKETPVFKAMQARKEIS 232
LG LM G PF + +L + + +
Sbjct: 150 LGGLM----------GGFSP---HAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREA 196

Query: 233 KEMPVKQVLKTHKTAVAIGMLFTWFLTGCVVVVILAMPNLLIGSFGFERAQ------TFE 286
T VA ++ +F+ + ++ +P L FG +R
Sbjct: 197 LNPLASFRWARGMTVVAA-LMAVFFI----MQLVGQVPAALWVIFGEDRFHWDATTIGIS 251

Query: 287 MQSAAIVMQMVGCILAGYFADRFGCGKVMMVGALA 321
+ + I+ + ++ G A R G + +M+G +A
Sbjct: 252 LAAFGILHSLAQAMITGPVAARLGERRALMLGMIA 286


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
A4U85_RS14095 | DHBDHDRGNASE | 101 | 7e-28 | 2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 101 bits (253), Expect = 7e-28
Identities = 69/259 (26%), Positives = 116/259 (44%), Gaps = 12/259 (4%)

Query: 5 VEGKVAVVTGGSSGIGLAAVEILVAEGAKVAW--CGRDEERLNASKHYILEKFPHANIFT 62
+EGK+A +TG + GIG A L ++GA +A ++ S + A
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 63 KACNVLKKEEVQQFAKEVKLNLGNVDMLINNAGQGRVSNFENTQDEDWMKEIELKYFSVL 122
+ E + +E+ G +D+L+N AG R + DE+W + V
Sbjct: 66 VRDSAAIDEITARIEREM----GPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVF 121

Query: 123 HPVRAFLDDLKQSANASITNVNSLLALQPEPHMIATSSARAALLNLTHSLAHEFTQYGVR 182
+ R+ + + SI V S A P M A +S++AA + T L E +Y +R
Sbjct: 122 NASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIR 181

Query: 183 VNSILLGMVESA-QWKRRYETRSDLNLSWKEWTGNIAKNR-GIPMQRLGRPEEPARALVF 240
N + G E+ QW +D N + + G++ + GIP+++L +P + A A++F
Sbjct: 182 CNIVSPGSTETDMQW----SLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLF 237

Query: 241 LASPLASYTTGSAIDVSGG 259
L S A + T + V GG
Sbjct: 238 LVSGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
A4U85_RS14110 | TCRTETA | 46 | 1e-07 | Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 46.0 bits (109), Expect = 1e-07
Identities = 74/363 (20%), Positives = 131/363 (36%), Gaps = 26/363 (7%)

Query: 52 AKLGWLMTSFLLAYGFSSVFLSFLGDIFNPKKMLFWSVTSWGLLMLCMGFTTSYSGMLIL 111
A G L+ + L + L L D F + +L S+ + M + I
Sbjct: 43 AHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIG 102

Query: 112 RVLLGLAEGPLFALAYTIVKQTYTDRQQARASTMFLLGTPIGA-FLGFPITAAVLAHHDW 170
R++ G+ I T RA + G + P+ ++
Sbjct: 103 RIVAGITGATGAVAGAYIADIT---DGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSP 159

Query: 171 HTTFFVMAALTLIAILSIVFGLRNLQL--KKTVELEGESKRTNFKGHIANTKVLVSNSAF 228
H FF AAL + L+ F L ++ + E + +F+ T V + F
Sbjct: 160 HAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVF 219

Query: 229 WLVCLFNIALMTYLWGLNS-----WVPSYLMQDKGFNLKEFGVYSSFPFIAMLIGEVVGA 283
+++ L LW + W + G +L FG+ S AM+ G
Sbjct: 220 FIMQLVGQVPAA-LWVIFGEDRFHWDAT----TIGISLAAFGILHSL-AQAMITG----- 268

Query: 284 FLSDKLGRRAIQVFSGLLLAGIFMYVMVIMTEPLLIIAAMSLSAMAWGFGVAAVFALLAR 343
++ +LG R + G++ G ++ T + M L A + G G+ A+ A+L+R
Sbjct: 269 PVAARLGERRA-LMLGMIADGTGYILLAFATRGWMAFPIMVLLA-SGGIGMPALQAMLSR 326

Query: 344 VTTSNVGATAGGIFNGLGNFASAIAPVLIGYIVMQTHSFNLGITFLAAVAVIGSLFLVPL 403
G L + S + P+L I + + G ++A A+ L +P
Sbjct: 327 QVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAASITTWNGWAWIAGAALY--LLCLPA 384

Query: 404 LKR 406
L+R
Sbjct: 385 LRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
A4U85_RS14115 | PF06704 | 28 | 0.031 | DspF/AvrF protein
		>PF06704#DspF/AvrF protein

Length = 129

Score = 27.9 bits (62), Expect = 0.031
Identities = 11/51 (21%), Positives = 23/51 (45%), Gaps = 3/51 (5%)

Query: 185 PEVSAFLKNMHEAQGTKIHLDSKSLHLVEAPDQKVEVVNHPQHSQLFDCVV 235
+ S +K++ GT + + L ++ D + V+ P HS + V+
Sbjct: 6 TDFSRLIKSLGAQLGTSLTAQNGVCALYDSQDNEAAVIEMPDHS---EMVI 53


36 | A4U85_RS14255 | A4U85_RS14590 | Y | Y | Y | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
A4U85_RS14255215-2.407377alpha/beta fold hydrolase
A4U85_RS14260018-3.220296DUF1275 domain-containing protein
A4U85_RS14270018-3.294289DUF2235 domain-containing protein
A4U85_RS14275-116-1.795068DUF3304 domain-containing protein
A4U85_RS14280-116-0.307286DUF4123 domain-containing protein
A4U85_RS14285-1181.574445type VI secretion system tip protein VgrG
A4U85_RS142901294.147279hypothetical protein
A4U85_RS142951262.786675LysR family transcriptional regulator
A4U85_RS143001231.211721dihydrodipicolinate synthase family protein
A4U85_RS14305123-0.525267MFS transporter
A4U85_RS14310219-4.598577OprD family outer membrane porin
A4U85_RS14315823-9.464548hypothetical protein
A4U85_RS143201026-10.020656hypothetical protein
A4U85_RS14325925-8.828281hypothetical protein
A4U85_RS14330724-8.629449hypothetical protein
A4U85_RS14335826-8.775726hypothetical protein
A4U85_RS14340120-5.134185hypothetical protein
A4U85_RS14345120-4.543720PAAR domain-containing protein
A4U85_RS14350017-4.112259PAAR domain-containing protein
A4U85_RS14355-116-3.557193hypothetical protein
A4U85_RS14360-114-2.720988hypothetical protein
A4U85_RS14365-112-1.913022type I-F CRISPR-associated helicase Cas3
A4U85_RS14370114-1.371154hypothetical protein
A4U85_RS14375112-3.146271hypothetical protein
A4U85_RS14380-113-4.317753type I-F CRISPR-associated protein Csy2
A4U85_RS14385013-4.677024type I-F CRISPR-associated protein Csy3
A4U85_RS14390217-5.995784type I-F CRISPR-associated endoribonuclease
A4U85_RS14395317-6.787676type I-F CRISPR-associated endonuclease Cas1
A4U85_RS14405420-8.150682hypothetical protein
A4U85_RS14410522-7.233082nucleoid-associated protein
A4U85_RS19440524-6.290062single-stranded DNA-binding protein
A4U85_RS14415319-2.329764hypothetical protein
A4U85_RS14420219-1.844003hypothetical protein
A4U85_RS14425218-1.513114DUF3800 domain-containing protein
A4U85_RS14430120-0.439970hypothetical protein
A4U85_RS144351241.668501single-stranded DNA-binding protein
A4U85_RS144402261.799418DUF927 domain-containing protein
A4U85_RS144452260.275397hypothetical protein
A4U85_RS14450424-2.167642hypothetical protein
A4U85_RS14455326-1.086047hypothetical protein
A4U85_RS144601241.796214hypothetical protein
A4U85_RS144651253.526476DNA-binding protein
A4U85_RS14470-1223.912472hypothetical protein
A4U85_RS144750213.964539hypothetical protein
A4U85_RS144800215.366465tyrosine-type recombinase/integrase
A4U85_RS144901225.776700gamma-glutamyltransferase
A4U85_RS144950204.897682DHA2 family efflux MFS transporter permease
A4U85_RS14500-1173.568247EmrA/EmrK family multidrug efflux transporter
A4U85_RS145050172.793759thioesterase family protein
A4U85_RS14510-2172.337885porphobilinogen synthase
A4U85_RS14515-1192.003029FAD-dependent oxidoreductase
A4U85_RS145202161.124258TIGR01777 family oxidoreductase
A4U85_RS145252151.197582ABZJ_00895 family protein
A4U85_RS145301171.972879VOC family protein
A4U85_RS145351181.831903OsmC family protein
A4U85_RS145402181.667415exonuclease SbcCD subunit D
A4U85_RS145452151.467982AAA family ATPase
A4U85_RS145500171.800109YggS family pyridoxal phosphate-dependent
A4U85_RS14555-1161.379539type IV pilus twitching motility protein PilT
A4U85_RS14560-215-0.629288PilT/PilU family type 4a pilus ATPase
A4U85_RS14565-116-1.764185ferric iron uptake transcriptional regulator
A4U85_RS14570-116-1.961910outer membrane protein assembly factor BamE
A4U85_RS14575-113-1.577133RnfH family protein
A4U85_RS14580013-1.688923hypothetical protein
A4U85_RS14590213-1.977254bacteriohemerythrin
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
A4U85_RS14305 | TCRTETB | 51 | 7e-09 | Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 50.6 bits (121), Expect = 7e-09
Identities = 38/174 (21%), Positives = 73/174 (41%), Gaps = 1/174 (0%)

Query: 41 IATFFDAYTVLAIAFALPQLITEWHLTPAYVGAIIAAGYVGQLVGAIFFGSLAEKVGRLK 100
I +FF + + +LP + +++ PA + A + +G +G L++++G +
Sbjct: 21 ILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKR 80

Query: 101 VLSFTILLFVAMDISCLFAWSGMSLLIF-RFLQGVGTGGEVPVASAYINEFIGAEKRGKF 159
+L F I++ + S SLLI RF+QG G + + +I E RGK
Sbjct: 81 LLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKA 140

Query: 160 FLLYEVLFPLGLMFAGMAAFFLMPIYGWKVMFIVGLVPSLLVIPLRFFLPESPR 213
F L + +G + W + ++ ++ + V L L + R
Sbjct: 141 FGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVR 194


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
A4U85_RS14370 | PF07824 | 27 | 0.004 | Type III secretion chaperone
		>PF07824#Type III secretion chaperone

Length = 120

Score = 27.2 bits (60), Expect = 0.004
Identities = 8/33 (24%), Positives = 16/33 (48%)

Query: 5 IINQNVQANGDYEVHNLTTGCSFMPYPQNRVDL 37
+++ +V + E ++ C F P+N DL
Sbjct: 25 MLDDDVLIYIEKEGDSINLLCPFCALPENINDL 57


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
A4U85_RS14435 | PYOCINKILLER | 26 | 0.008 | Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 26.3 bits (57), Expect = 0.008
Identities = 8/25 (32%), Positives = 14/25 (56%)

Query: 30 TFSVATTEFWKDKTTGERKEANSIN 54
T+S T E W+D+T + A ++
Sbjct: 307 TYSSRTAEQWQDQTPDSVRYALGMD 331


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
A4U85_RS14495 | TCRTETB | 106 | 5e-27 | Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 106 bits (267), Expect = 5e-27
Identities = 87/397 (21%), Positives = 162/397 (40%), Gaps = 20/397 (5%)

Query: 27 FMVVLDTTIANVSVPHITGNLAVSSTQGTWVVTSYAVAEAICVPLTGWLAGRFGTVRVFI 86
F VL+ + NVS+P I + WV T++ + +I + G L+ + G R+ +
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 87 FGLIGFTVFSFLCGLATS-LEMLVFFRIGQGLCGGPLMPLSQTLLMRIFPQEKHAQAMGL 145
FG+I S + + S +L+ R QG L ++ R P+E +A GL
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 146 WAMTTVVGPILGPILGGLISDNLSWHWIFFINLP-VGIVCVLAAMRLLRVAETETISLRI 204
+G +GP +GG+I+ + HW + + +P + I+ V M+LL+ I
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHYI--HWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201

Query: 205 DTVGLGLLILWIGALQLMLDLGHERDWFNSTSIVVLALTAAIGFVVFLIWELTDKHPVVD 264
G++++ +G + ML F S+ + F++F+ P VD
Sbjct: 202 ----KGIILMSVGIVFFMLFTTSYSISFLIVSV--------LSFLIFVKHIRKVTDPFVD 249

Query: 265 VKVFRHRGFAISVLALSLGFGAFFGSIVLIPQWLQM--NLSYTATWAGYLTATMGFGSLT 322
+ ++ F I VL + FG G + ++P ++ LS + + +
Sbjct: 250 PGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIF 309

Query: 323 MSPIVAKLSTKHDPRALASFGLILLGIVTLMRAFWTTDADFMALAWPQILQGFAVPFFFI 382
I L + P + + G+ L + L +F + + + + F
Sbjct: 310 -GYIGGILVDRRGPLYVLNIGVTFLSVSFLTASF-LLETTSWFMTIIIVFVLGGLSFTKT 367

Query: 383 PLSNIALGSVLQQEIASAAGLMNFLRTMAGAIGASIA 419
+S I S+ QQE + L+NF ++ G +I
Sbjct: 368 VISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIV 404


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
A4U85_RS14500 | RTXTOXIND | 114 | 2e-30 | Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 114 bits (288), Expect = 2e-30
Identities = 70/411 (17%), Positives = 156/411 (37%), Gaps = 70/411 (17%)

Query: 25 KRKKFLGFFALILLIAAILYAIWALFLNHSVSTDNAYVGAETAQITSMVSGQVAQVLVKD 84
+R + + +F + L+ A + ++ + + + +I + + V +++VK+
Sbjct: 55 RRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKE 114

Query: 85 TQTVHRGDVLVRIDDR--DAKIALAQAEAELAKAKRQYKQTAANSSSLNS---------- 132
++V +GDVL+++ +A Q+ A+ ++ Q + S LN
Sbjct: 115 GESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEP 174

Query: 133 -------QVVVRADE-----INSAKAQVAQAQADYDKAALE------------------- 161
+ V+R ++ + Q Q + + DK E
Sbjct: 175 YFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEK 234

Query: 162 --LNRRAQLAASGAVSKEELTKAQSAVETAKAGLELAKAGLAQATSSRKAAESTLAANEA 219
L+ + L A++K + + ++ A L + K+ L Q S +A+
Sbjct: 235 SRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQ 294

Query: 220 LIQGVSETST------PDVQVAQAHVEQAQLDLERTVIRAPVDGVITRRNIQ-VGQRVAP 272
L + +E ++ + + + + + +VIRAPV + + + G V
Sbjct: 295 LFK--NEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTT 352

Query: 273 GTSMMMIVPLND-LYVDANFKESQLKKVRPGQPVTLTSDLYGDDVEYHGKVVGFSGGTGS 331
++M+IVP +D L V A + + + GQ + + + +G +VG
Sbjct: 353 AETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAF--PYTRYGYLVG------- 403

Query: 332 AFALIPAQNATGNWIKVVQRLPVRIALDPKELAEH----PLRVGLSMEAKV 378
I + +V V I+++ L+ PL G+++ A++
Sbjct: 404 KVKNINLDAIEDQRLGLVFN--VIISIEENCLSTGNKNIPLSSGMAVTAEI 452


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
A4U85_RS14520 | NUCEPIMERASE | 47 | 5e-08 | Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 46.7 bits (111), Expect = 5e-08
Identities = 24/159 (15%), Positives = 57/159 (35%), Gaps = 36/159 (22%)

Query: 4 NVLITGASGFIGTHLIRFLLQKNYNVIAV-------------TRQA-----------GKK 39
L+TGA+GFIG H+ + LL+ + V+ + R
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 40 SDHPALQWVQKFEDISTRQIDYVVNLAGANIGEKRWTESRKKHLIESRVNTTQKLYAWLK 99
+D + + ++ + V + + E+ +S + + +
Sbjct: 62 ADREGMTDL-----FASGHFERVFISP-HRLAVRYSLENPHA-YADSNLTGFLNILEGCR 114

Query: 100 QSQIFPEVIVSGSAIGYYGIDNQEKWTEVCTEQSSPQPI 138
++I + S S++ YG++ + ++ + S P+
Sbjct: 115 HNKIQHLLYASSSSV--YGLNRKMPFST---DDSVDHPV 148


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
A4U85_RS14545 | RTXTOXIND | 34 | 0.003 | Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 34.0 bits (78), Expect = 0.003
Identities = 37/221 (16%), Positives = 71/221 (32%), Gaps = 43/221 (19%)

Query: 647 TQCRAELEQVQKYLAQLQVKQTHLQQELEQAFNLNQLHIELNQAPEQILQTLNELRQAT- 705
A+ + Q L Q +++QT Q IELN+ PE L + +
Sbjct: 130 LGAEADTLKTQSSLLQARLEQTRYQILSRS--------IELNKLPELKLPDEPYFQNVSE 181

Query: 706 ---QTAISLFDSENVRLTQAIKQHNQLVQTIQRNESLLNTAQQ----WQQQVQHIVECLS 758
SL + + Q Q + + + T ++ + L
Sbjct: 182 EEVLRLTSLIKE---QFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLD 238

Query: 759 ETEQHAWQQASSQTAKQTWAILDARAKQLELQEQLAQRFEQQQQDLKMLAANLEQMIKQI 818
+ +QA ++ A +L+ K +E +L + LEQ+ +I
Sbjct: 239 DFSSLLHKQAIAKHA-----VLEQENKYVEAVNELRV-----------YKSQLEQIESEI 282

Query: 819 DEIEQNLQ--------EITLKGQQNNEKAVSLIQQMTGRSD 851
++ Q EI K +Q + L ++ +
Sbjct: 283 LSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEE 323



Score = 33.6 bits (77), Expect = 0.005
Identities = 22/122 (18%), Positives = 47/122 (38%), Gaps = 8/122 (6%)

Query: 468 ARQKLSEAKNELEQLTVSLGTVEQIELKLEQQRKDKDQKLAQVT----QLDLIQQKIKIY 523
+ L EQ + Q EL L+++R ++ LA++ + + ++ +
Sbjct: 181 EEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDF 240

Query: 524 HELYAELQQFTEKHTQASAQEDQLKTVCQLAEQDYQTAKTEREKLQHILQQQRLLHTENI 583
L + Q KH + ++ V +L Q + E E L +++ L T+
Sbjct: 241 SSLLHK--QAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILS--AKEEYQLVTQLF 296

Query: 584 EQ 585
+
Sbjct: 297 KN 298



Score = 32.5 bits (74), Expect = 0.011
Identities = 26/159 (16%), Positives = 54/159 (33%), Gaps = 9/159 (5%)

Query: 289 QLASEREQLKRLEVFSEIRPQVF-QQAQNLQTLQQLEPQIQQAQTKFNELVQIFETGQKQ 347
Q SE E L+ + E Q+ Q L + + + N + + +
Sbjct: 177 QNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSR 236

Query: 348 Y----QLAEQQL---KQTLDFEQQHQQALNQVRQSIQERAFIADEYKKCKEKRQVLEQNL 400
L +Q L+ E ++ +A+N++R + I E KE+ Q++ Q
Sbjct: 237 LDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLF 296

Query: 401 SPLQQQQNA-VQQHIAQLEQNKIHLQQQLIQTQQYAVLD 438
+ +I L +++ + A +
Sbjct: 297 KNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVS 335



Score = 31.0 bits (70), Expect = 0.037
Identities = 39/238 (16%), Positives = 85/238 (35%), Gaps = 37/238 (15%)

Query: 487 GTVEQIELKLEQQRKDKDQKLAQVTQLDLIQQKIKI-YHELYAELQQFTEKHTQASAQED 545
V++I +K E + K L ++T L +K L A L+Q + S + +
Sbjct: 105 SIVKEIIVK-EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELN 163

Query: 546 QLKTVCQLAEQDYQTAKTE-REKLQHILQQQRLLHTENIEQLRANLKEGEACLVCGSTHH 604
+L + E +Q E +L ++++Q Q NL + A
Sbjct: 164 KLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRA--------- 214

Query: 605 PYRIDDSAVSKALFDLQQQQEQQAIALEQTKFNAWQT-------QQHALTQCR------- 650
+ + + + +E+++ + + + +HA+ +
Sbjct: 215 ---------ERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAV 265

Query: 651 AELEQVQKYLAQLQVKQTHLQQELEQAFNLNQLHI--ELNQAPEQILQTLNELRQATQ 706
EL + L Q++ + ++E + L + I +L Q + I EL + +
Sbjct: 266 NELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEE 323


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
A4U85_RS14550 | ALARACEMASE | 37 | 5e-05 | Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 36.7 bits (85), Expect = 5e-05
Identities = 34/220 (15%), Positives = 71/220 (32%), Gaps = 21/220 (9%)

Query: 12 LQQIRTACELAQRAPETVQLLAVSKT----HPSERLREMYAAGQRAFGENYLQEALDKID 67
LQ ++ + ++A ++ +V K H ER+ F L+EA+
Sbjct: 11 LQALKQNLSIVRQAATHARVWSVVKANAYGHGIERIWS-AIGATDGFALLNLEEAI---- 65

Query: 68 ALQDLDIEWHFI--GHVQRNKTKHLAEQFDWVHGVDRLIIAERLSNQRGDDQAALNICLQ 125
L++ + + + + +Q V + L N R +
Sbjct: 66 TLRERGWKGPILMLEGFFHAQDLEIYDQHRLTTCVHSNWQLKALQNARLKAPLDI----Y 121

Query: 126 VNIDGQESKDGCAPEDVAELVAQMSQLPKIRLRGLMV-IPAPDNTAAFADAKKLFDAVKV 184
+ ++ ++ G P+ V + Q+ + + LM ++ + A +
Sbjct: 122 LKVNSGMNRLGFQPDRVLTVWQQLRAMANVGEMTLMSHFAEAEHPDGISGAMARIEQA-A 180

Query: 185 QHAHPEDWDTLSMGMSGDLEAAIAAGSTMVRVGTALFGAR 224
+ S+ S A VR G L+GA
Sbjct: 181 EGLECR----RSLSNSAATLWHPEAHFDWVRPGIILYGAS 216


37 | A4U85_RS14680 | A4U85_RS14705 | Y | Y | N | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
A4U85_RS146803320.551247arginyltransferase
A4U85_RS146853371.210566ribosomal protein S18-alanine
A4U85_RS146905401.759236metal-dependent hydrolase
A4U85_RS146955351.534362metal-dependent hydrolase
A4U85_RS147006402.128815elongation factor Tu
A4U85_RS147054331.295959elongation factor G
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
A4U85_RS14685 | SACTRNSFRASE | 46 | 6e-09 | Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 46.1 bits (109), Expect = 6e-09
Identities = 25/101 (24%), Positives = 42/101 (41%), Gaps = 2/101 (1%)

Query: 39 CTVIEINNKVVGFCILQPVLDE-ANLLLMAIDPQMQGKGLGYQLLDASIE-RLENHPVQI 96
+ + N +G ++ + A + +A+ + KG+G LL +IE ENH +
Sbjct: 67 AFLYYLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGL 126

Query: 97 FLEVRESNKAAIGLYEKTGFHQIDVRRNYYPTQEGGRENAV 137
LE ++ N +A Y K F V Y E A+
Sbjct: 127 MLETQDINISACHFYAKHHFIIGAVDTMLYSNFPTANEIAI 167


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
A4U85_RS14700 | TCRTETOQM | 78 | 1e-17 | Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 78.0 bits (192), Expect = 1e-17
Identities = 50/149 (33%), Positives = 77/149 (51%), Gaps = 5/149 (3%)

Query: 13 VNVGTIGHVDHGKTTLTAAI--ATICAKTYGGEAKDYSQIDSAPEEKARGITINTSHVEY 70
+N+G + HVD GKTTLT ++ + G K ++ D+ E+ RGITI T +
Sbjct: 4 INIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSF 63

Query: 71 DSPTRHYAHVDCPGHADYVKNMITGAAQMDGAILVCAATDGPMPQTREHILLSRQVGVPY 130
+D PGH D++ + + +DGAIL+ +A DG QTR R++G+P
Sbjct: 64 QWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIP- 122

Query: 131 IIVFLNKCDLVDDEELLELVEMEVRELLS 159
I F+NK D + L V +++E LS
Sbjct: 123 TIFFINKIDQNGID--LSTVYQDIKEKLS 149


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
A4U85_RS14705 | TCRTETOQM | 596 | 0.0 | Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 596 bits (1538), Expect = 0.0
Identities = 169/686 (24%), Positives = 285/686 (41%), Gaps = 78/686 (11%)

Query: 9 RYRNIGISAHIDAGKTTTTERILFYTGVSHKIGEVHDGAATMDWMEQEQERGITITSAAT 68
+ NIG+ AH+DAGKTT TE +L+ +G ++G V G D E++RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 69 TCFWSGMGNQFPQHRINVIDTPGHVDFTIEVERSMRVLDGACMVYCAVGGVQPQSETVWR 128
+ W ++N+IDTPGH+DF EV RS+ VLDGA ++ A GVQ Q+ ++
Sbjct: 62 SFQWEN-------TKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFH 114

Query: 129 QANKYKVPRLAFVNKMDRTGANFFRVVEQMKTRLGANPVPIVVPIGAEDTFTGVVDLIEM 188
K +P + F+NK+D+ G + V + +K +L A V V M
Sbjct: 115 ALRKMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIK----------QKVELYPNM 164

Query: 189 KAIIWDEASQGMKFEYGEIPADLVDTAQEWRTNMVEAAAEASEELMDKYLEEGDLSKEDI 248
+ E+ Q + E +++L++KY+ L ++
Sbjct: 165 CVTNFTESEQ------------------------WDTVIEGNDDLLEKYMSGKSLEALEL 200

Query: 249 IAGLRARTLASEIQVMLCGSAFKNKGVQRMLDAVIEFLPSPTEVKAIEGILDDKDETKAS 308
R + + GSA N G+ +++ + S T
Sbjct: 201 EQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSSTH----------------- 243

Query: 309 REASDEAPFSALAFKIMNDKFVGNLTFVRVYSGVLKQGDAVYNPVKSKRERIGRIVQMHA 368
++ FKI + L ++R+YSGVL D+V K K +I +
Sbjct: 244 ---RGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEK-IKITEMYTSIN 299

Query: 369 NERQDIDEIRAGDIAACVG----LKDVTTGDTLCDEKNIITLERMEFPDPVIQLAVEPKT 424
E ID+ +G+I L V GDT + ER+E P P++Q VEP
Sbjct: 300 GELCKIDKAYSGEIVILQNEFLKLNSV-LGDTKLLPQR----ERIENPLPLLQTTVEPSK 354

Query: 425 KADQEKMSIALGRLAKEDPSFRVHTDEESGQTIIAGMGELHLDIIVDRMKREFGVEANIG 484
+E + AL ++ DP R + D + + I++ +G++ +++ ++ ++ VE I
Sbjct: 355 PQQREMLLDALLEISDSDPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEKYHVEIEIK 414

Query: 485 KPMVAYRETIKKTVEQEGKFVRQTGGKGKFGHVYVRLEPLDVEAAGKEYEFAEEVVGGVV 544
+P V Y E K E + + + + + PL + G ++ V G +
Sbjct: 415 EPTVIYMERPLKKA--EYTIHIEVPPNPFWASIGLSVSPLPL---GSGMQYESSVSLGYL 469

Query: 545 PKEFFGAVDKGIQERMKNGVLAGYPVVGVKAVLFDGSYHDVDSDELSFKMAGSYAFRDGF 604
+ F AV +GI+ + G L G+ V K G Y+ S F+M
Sbjct: 470 NQSFQNAVMEGIRYGCEQG-LYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQVL 528

Query: 605 MKADPVLLEPIMKVEVETPEDYMGDIMGDLNRRRGMVQGMDDLPGGTKAIKAEVPLAEMF 664
KA LLEP + ++ P++Y+ D + + L + E+P +
Sbjct: 529 KKAGTELLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDT-QLKNNEVILSGEIPARCIQ 587

Query: 665 GYATQMRSMSQGRATYSMEFAKYAET 690
Y + + + GR+ E Y T
Sbjct: 588 EYRSDLTFFTNGRSVCLTELKGYHVT 613


38  A4U85_RS15075  A4U85_RS15100  Y  N  N  Genomic Island
LocusTag       DNBias  CDNBias  %GCBias     Product
A4U85_RS15075  3       14       -1.055982   cytochrome c biogenesis protein CcsA
A4U85_RS15080  3       17       -0.755727   tRNA-dihydrouridine synthase
A4U85_RS15085  3       17       -1.029506   hypothetical protein
A4U85_RS15090  3       17       -1.259655   hypothetical protein
A4U85_RS15095  3       16       -1.581674   hypothetical protein
A4U85_RS15100  4       16       -1.412976   hypothetical protein
39  A4U85_RS15210  A4U85_RS15385  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
A4U85_RS15210  2       24       0.632213    cold shock domain-containing protein
A4U85_RS15215  4       25       1.191660    uracil phosphoribosyltransferase
A4U85_RS15220  3       27       1.498598    NADH-quinone oxidoreductase subunit NuoN
A4U85_RS15225  1       31       2.200409    NADH-quinone oxidoreductase subunit M
A4U85_RS15230  2       32       2.765153    NADH-quinone oxidoreductase subunit L
A4U85_RS15235  3       32       3.715233    NADH-quinone oxidoreductase subunit NuoK
A4U85_RS15240  4       32       3.453226    NADH-quinone oxidoreductase subunit J
A4U85_RS15245  5       33       3.590331    NADH-quinone oxidoreductase subunit NuoI
A4U85_RS15250  5       31       3.513641    NADH-quinone oxidoreductase subunit NuoH
A4U85_RS15255  4       28       3.739057    NADH-quinone oxidoreductase subunit NuoG
A4U85_RS15260  1       21       2.663255    NADH-quinone oxidoreductase subunit NuoF
A4U85_RS15265  2       18       1.688880    NADH-quinone oxidoreductase subunit NuoE
A4U85_RS15270  1       16       1.579242    NADH-quinone oxidoreductase subunit C/D
A4U85_RS15275  1       14       0.705662    NADH-quinone oxidoreductase subunit B
A4U85_RS15280  1       15       -0.104762   NADH-quinone oxidoreductase subunit A
A4U85_RS15285  2       17       0.804273    sensor domain-containing diguanylate cyclase
A4U85_RS15290  4       20       1.322193    hypothetical protein
A4U85_RS15295  5       24       2.285607    sensor histidine kinase BfmS
A4U85_RS15300  3       24       3.024050    response regulator transcription factor BfmR
A4U85_RS15305  1       22       3.044018    hypothetical protein
A4U85_RS15310  1       21       3.177033    ribonucleoside-diphosphate reductase subunit
A4U85_RS15315  -1      16       1.564609    hypothetical protein
A4U85_RS15320  0       15       0.558993    ribonucleotide-diphosphate reductase subunit
A4U85_RS15325  1       15       -0.804105   hypothetical protein
A4U85_RS15330  6       25       -5.627513   TetR/AcrR family transcriptional regulator
A4U85_RS15335  9       28       -8.085651   darcynin
A4U85_RS15340  11      31       -9.688439   hypothetical protein
A4U85_RS15345  10      27       -8.087061   type IV secretion protein Rhs
A4U85_RS15350  6       23       -6.254447   hypothetical protein
A4U85_RS15355  1       16       -4.454025   sel1 repeat family protein
A4U85_RS15360  -1      18       -2.821965   hypothetical protein
A4U85_RS15365  1       18       -1.439237   hypothetical protein
A4U85_RS15370  1       21       0.221826    TetR/AcrR family transcriptional regulator
A4U85_RS15380  2       21       0.753451    flavin reductase
A4U85_RS15385  2       21       0.813132    methionine synthase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS15255  NUCEPIMERASE  30     0.046    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 29.8 bits (67), Expect = 0.046
Identities = 25/88 (28%), Positives = 41/88 (46%), Gaps = 26/88 (29%)

Query: 619 RFYQVYDPSYYKPEYAIKESWRWLHAIETGLKGKPI----------DWTVLDDVIETIVK 668
RF+ VY P + +P+ A+ +++ A+ L+GK I D+T +DD+ E I++
Sbjct: 177 RFFTVYGP-WGRPDMAL---FKFTKAM---LEGKSIDVYNYGKMKRDFTYIDDIAEAIIR 229

Query: 669 NVPVL---------EAIQDVAPDAGYRV 687
V+ E A A YRV
Sbjct: 230 LQDVIPHADTQWTVETGTPAASIAPYRV 257


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS15295  PF06580       39     4e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 39.1 bits (91), Expect = 4e-05
Identities = 25/150 (16%), Positives = 52/150 (34%), Gaps = 31/150 (20%)

Query: 378 VAVETEALKTQKEIELI--PPPLYVKVDAERRYLHRVV-----QNLVGNAVRYC------ 424
+A E + + ++ I L + + V Q LV N +++
Sbjct: 218 LADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQ 277

Query: 425 DNKVRITGGIHNDGMAFVCVEDDGPGIPEQDRKRVFEAFARLDDSRTRASGGYGLGLSIV 484
K+ + G ++G + VE+ G + ++ S G GL ++
Sbjct: 278 GGKILLKG-TKDNGTVTLEVENTGSLALKNTKE----------------STGTGL-QNVR 319

Query: 485 SRIAYWFGGEIKVDESPSLGGARFIMTWPA 514
R+ +G E ++ S G ++ P
Sbjct: 320 ERLQMLYGTEAQIKLSEKQGKVNAMVLIPG 349


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS15300  HTHFIS        87     6e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 87.2 bits (216), Expect = 6e-22
Identities = 33/137 (24%), Positives = 59/137 (43%), Gaps = 1/137 (0%)

Query: 8 PKILIVEDDERLARLTQEYLIRNGLEVGVETDGNRAIRRIISEQPDLVVLDVMLPGADGL 67
IL+ +DD + + + L R G +V + ++ R I + DLVV DV++P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 68 TVCREVRPHY-HQPILMLTARTEDMDQVLGLEMGADDYVAKPVQPRVLLARIRALLRRTD 126
+ ++ P+L+++A+ M + E GA DY+ KP L+ I L
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 127 KTVEDEVAQRIEFDDLV 143
+ + LV
Sbjct: 124 RRPSKLEDDSQDGMPLV 140


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS15325  VACCYTOTOXIN  35     0.001    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 35.4 bits (81), Expect = 0.001
Identities = 65/320 (20%), Positives = 108/320 (33%), Gaps = 26/320 (8%)

Query: 159 GNSITLIGDSSSSSVNNSATNTSNTVNDNDTTYNGNGSGAGNGSGDGLLNGIGSGNGEQN 218
GNS T DS+ + + +++ N GSGAG + +L
Sbjct: 172 GNSFTSYKDSADRTTRVDFNAKNILIDNFLEINNRVGSGAGRKASSTVLT--------LQ 223

Query: 219 YGIGNGIADDASITAPITLPINLSGNSITLIGNSSASSVNSSPTTTSNTVNDNDTTYNGN 278
G ++A I+ +NL+ NS+ L+GN + + + + +T+
Sbjct: 224 ASEGITSRENAEISLYDGATLNLASNSVKLMGNVWMGRLQYVGAYLAPSYSTINTS-KVT 282

Query: 279 GTGDSGVSALGDSGNGSGDGAGNGIASGNGEHNYGIGNGNGDDVDITAPITGVLNFSGNS 338
G + +GD + A GI + N H + ++I AP G N
Sbjct: 283 GEVNFNHLTVGDH-----NAAQAGIIASNKTHIGTLDLWQSAGLNIIAPPEGGYKDKPND 337

Query: 339 FTLIGNSSSSSVNTAPTTTSNTVNDNDTIDNGNSGGTGSGSGNGSGDGLLNGAASG---- 394
N++ ++ +S ++ I+ NS DG G +
Sbjct: 338 --KPSNTTQNNAKNDKQESSQNNSNTQVINPPNSAQKTEIQPTQVIDGPFAGGKNTVVNI 395

Query: 395 ---NGEHNYGIGNGNGDDVDITAPITGVFNFSGNSFSLIGNSSSSSINTAPTTTTNTVND 451
N + I G T + +L +S S+ T TV D
Sbjct: 396 NRINTNADGTIRVGGFKASLTTN--AAHLHIGKGGINLSNQASGRSLLVENLTGNITV-D 452

Query: 452 NDVTVNGNDGGGLLGGSSGN 471
+ VN GG L GSS N
Sbjct: 453 GPLRVNNQVGGYALAGSSAN 472



Score = 30.8 bits (69), Expect = 0.035
Identities = 32/167 (19%), Positives = 63/167 (37%), Gaps = 9/167 (5%)

Query: 29 GSGDGLLNGISSGNGEHNYGIGNGIADDASITAPITIPLNLSGNSITLIGN---SSSNSV 85
GSG G + + + GI + +A I+ LNL+ NS+ L+GN V
Sbjct: 208 GSGAGRKASSTVLTLQASEGITSRE--NAEISLYDGATLNLASNSVKLMGNVWMGRLQYV 265

Query: 86 NSSPTTTSNNVNDNDVTNNGNGSTIGSGTGNGSGDGLLNGAASGNGEHNYGIGNGIADDA 145
+ + + +N + VT N + + G N + G++ + G + G+
Sbjct: 266 GAYLAPSYSTINTSKVTGEVNFNHLTVGDHNAAQAGIIASNKTHIGTLDLWQSAGL---- 321

Query: 146 SITAPLSIPINLAGNSITLIGDSSSSSVNNSATNTSNTVNDNDTTYN 192
+I AP N +++ + ++ +N+ N
Sbjct: 322 NIIAPPEGGYKDKPNDKPSNTTQNNAKNDKQESSQNNSNTQVINPPN 368


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS15330  HTHTETR       70     6e-17    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 69.7 bits (170), Expect = 6e-17
Identities = 27/187 (14%), Positives = 65/187 (34%), Gaps = 17/187 (9%)

Query: 11 RPRQARSVATFEAILEAAARILESLGFAGFNTNAVAELAGVSIGSLYQYFPSKDALIVEL 70
R + + T + IL+ A R+ G + + +A+ AGV+ G++Y +F K L E+
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 71 IRRERAKLSNHIVEVIQQSDAADLKDKLKLIIQAAVQHQLSRPQLARTLEFASELIGKDI 130
+ + +E + D L+ I+ ++ ++ + +E
Sbjct: 63 WELSESNIGELELEYQAK-FPGDPLSVLREILIHVLESTVTEERRRLLMEI-------IF 114

Query: 131 EESEHQHELETIISDLFKRSGVSHAQTAAQDVIALSKGMINAAGIVGESDLNHLQQRVEK 190
+ E E+ + + + + + + + +R
Sbjct: 115 HKCEFVGEMAVVQ---------QAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAI 165

Query: 191 AVFGYLD 197
+ GY+
Sbjct: 166 IMRGYIS 172


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS15370  HTHTETR       50     6e-10    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 50.4 bits (120), Expect = 6e-10
Identities = 13/63 (20%), Positives = 26/63 (41%)

Query: 42 SSKKLQVVQTAIQLFTTHGFHNAGVDLIIKEAKIPKATFYNYFHSKERLIEMCIAFQKSL 101
+ ++ A++LF+ G + + I K A + + Y +F K L +S
Sbjct: 10 QETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESN 69

Query: 102 LKE 104
+ E
Sbjct: 70 IGE 72


40  A4U85_RS16070  A4U85_RS16155  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
A4U85_RS16070  2       27       2.032858    ketol-acid reductoisomerase
A4U85_RS16075  1       19       1.529834    acetolactate synthase small subunit
A4U85_RS16080  0       17       1.417581    acetolactate synthase 3 large subunit
A4U85_RS16085  1       16       1.809793    DUF4124 domain-containing protein
A4U85_RS16095  2       14       2.158856    leucine--tRNA ligase
A4U85_RS16100  3       13       2.257806    hypothetical protein
A4U85_RS16105  3       11       2.020412    DNA polymerase III subunit delta
A4U85_RS16110  2       12       2.103068    MacA family efflux pump subunit
A4U85_RS16115  1       12       2.533770    MacB family efflux pump subunit
A4U85_RS16120  2       15       1.847154    efflux transporter outer membrane subunit
A4U85_RS16125  -1      18       1.001391    enoyl-ACP reductase
A4U85_RS16130  1       17       0.502456    Bax inhibitor-1/YccA family protein
A4U85_RS16135  2       14       -0.125544   oligoribonuclease
A4U85_RS16140  1       14       1.142309    ribosome small subunit-dependent GTPase A
A4U85_RS16145  1       15       0.659574    rhodanese-like domain-containing protein
A4U85_RS16150  1       14       0.027080    glutaredoxin 3
A4U85_RS16155  2       15       0.547050    protein-export chaperone SecB
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS16085  PYOCINKILLER  26     0.045    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 25.9 bits (56), Expect = 0.045
Identities = 17/69 (24%), Positives = 23/69 (33%), Gaps = 10/69 (14%)

Query: 37 SGSTHYTTTPPPQGAKHLNKVSTYGSQPLLKNPTSNSEQSSQDKDKVVQEVTNVTVEKGA 96
+G T A L T S P +NP+S + + V V +GA
Sbjct: 395 TGLYEVTVPSTTAEAPPLILTWTPASPPGNQNPSSTTPVVPK----------PVPVYEGA 444

Query: 97 PAVPVPPAP 105
PV P
Sbjct: 445 TLTPVKATP 453


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS16110  RTXTOXIND     44     1e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 43.7 bits (103), Expect = 1e-06
Identities = 21/167 (12%), Positives = 55/167 (32%), Gaps = 10/167 (5%)

Query: 44 GDIENNVLATGTL-DATKLISVGAQVSGQVKKMYVQLGDQVKQGQLIAQIDSTTQENSLK 102
G +E A G L + + + + VK++ V+ G+ V++G ++ ++ +
Sbjct: 78 GQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTAL------- 130

Query: 103 TSDANIKNLEAQRLQQIASLNEKQLEYRRQQQMYAQDATPRADLESAEAAYKTAQAQVKA 162
++A+ ++ LQ Q+ R + + + + +
Sbjct: 131 GAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSL 190

Query: 163 LDAQIESAKITRSTAQTNIGYTRIVAPTDGTVVAIVTEEGQTVNANQ 209
+ Q + + Q + + A + I E +
Sbjct: 191 IKEQFSTWQ--NQKYQKELNLDKKRAERLTVLARINRYENLSRVEKS 235



Score = 42.1 bits (99), Expect = 3e-06
Identities = 28/159 (17%), Positives = 55/159 (34%), Gaps = 21/159 (13%)

Query: 87 QLIAQIDSTTQENSLKTSDANIKNLEAQRLQQIASLNEKQLEYRRQQQMYAQDATPRADL 146
Q IA+ QEN + ++ ++Q Q + + + EY+ Q++ +
Sbjct: 247 QAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEIL----- 301

Query: 147 ESAEAAYKTAQAQVKALDAQIESAKITRSTAQTNIGYTRIVAPTDGTVVAI-VTEEGQTV 205
+ + L ++ + + I AP V + V EG V
Sbjct: 302 ----DKLRQTTDNIGLLTLELAKNEE-------RQQASVIRAPVSVKVQQLKVHTEGGVV 350

Query: 206 NANQSAPTIVKIAKLQN-MTIKAQVSEADIMKVEKGQQV 243
+ T++ I + + + A V DI + GQ
Sbjct: 351 TTAE---TLMVIVPEDDTLEVTALVQNKDIGFINVGQNA 386


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS16125  DHBDHDRGNASE  60     7e-13    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 60.4 bits (146), Expect = 7e-13
Identities = 66/263 (25%), Positives = 98/263 (37%), Gaps = 27/263 (10%)

Query: 16 LAGKRFLIAGVASKLSIAYGIAQALHREGAEL-AFTYPNEKLKKRVDEFAEQFGSKLVFP 74
+ GK I G A I +A+ L +GA + A Y EKL+K V + FP
Sbjct: 6 IEGKIAFITGAAQ--GIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 75 CDVAVDAEIDNAFAELAKHWDGVDGVVHSIGF---APAHTLDGDFTDVTDRDGFKIAHDI 131
DV A ID A + + +D +V+ G H+L +D +
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSL-------SDEEWEATFSVN 116

Query: 132 SAYSFVAMARAAKPLLQARQGCLLTLTYQGSERVMPNYNVMGMAKASLEAGVRYLASSLG 191
S F A +K ++ R G ++T+ + + +KA+ + L L
Sbjct: 117 STGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELA 176

Query: 192 VDGIRVNAISAGPIRTL-----------AASGIKSFRKMLDANEKVAPLKRNVTIEEVGN 240
IR N +S G T A IK + PLK+ ++ +
Sbjct: 177 EYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTG---IPLKKLAKPSDIAD 233

Query: 241 AALFLCSPWASGITGEILYVDAG 263
A LFL S A IT L VD G
Sbjct: 234 AVLFLVSGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS16130  RTXTOXINA     30     0.007    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 30.3 bits (68), Expect = 0.007
Identities = 12/51 (23%), Positives = 22/51 (43%), Gaps = 9/51 (17%)

Query: 25 GAIQKSVMLTIIAAAVGVALFFYAAFTANVGIAYAASIVGAIGGLVLALIT 75
GAI S+ + L A+ ++ + A S+VGA ++ +T
Sbjct: 362 GAIDASL------TTISTVL---ASVSSGISAAATTSLVGAPVSALVGAVT 403


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS16155  SECBCHAPRONE  155    3e-51    Bacterial protein-transport SecB chaperone protein ...
		>SECBCHAPRONE#Bacterial protein-transport SecB chaperone protein signature.

Length = 170

Score = 155 bits (393), Expect = 3e-51
Identities = 59/147 (40%), Positives = 95/147 (64%), Gaps = 3/147 (2%)

Query: 3 EEQQVQPQLALERIYTKDISFEVPGA-QVFTKQWQPELNINLSSAAEKIDPTHFEVSLKV 61
+ QP L ++RIY KD+SFE P +F + W+P+L+ +LS+ A+++ +EV L +
Sbjct: 12 TQATQQPVLQIQRIYVKDVSFEAPNLPHIFQQDWEPKLSFDLSTEAKQVGDDLYEVCLNI 71

Query: 62 VVQANNDNE--TAFIVDVTQSGIFLIDNIEEDRLPYILGAYCPNILFPFLREAVNDLVTK 119
V+ ++ AFI +V Q+G+F I +EE ++ + L + CPN+LFP+ RE V+ LV +
Sbjct: 72 SVETTMESSGDVAFICEVKQAGVFTISGLEEMQMAHCLTSQCPNMLFPYARELVSSLVNR 131

Query: 120 GSFPQLLLTPINFDAEFEANMQRAQAA 146
G+FP L L+P+NFDA F +QR + A
Sbjct: 132 GTFPALNLSPVNFDALFMDYLQRQEQA 158


41  A4U85_RS17270  A4U85_RS17315  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
A4U85_RS17270  3       26       2.480639    TRAP transporter large permease subunit
A4U85_RS17275  7       33       2.196294    alpha/beta fold hydrolase
A4U85_RS17280  7       36       2.409402    DUF4442 domain-containing protein
A4U85_RS17285  8       38       2.674080    hypothetical protein
A4U85_RS17290  8       39       2.513593    hypothetical protein
A4U85_RS17295  9       41       2.368539    DNA-directed RNA polymerase subunit beta'
A4U85_RS17300  8       37       1.639868    DNA-directed RNA polymerase subunit beta
A4U85_RS17305  5       37       1.942606    50S ribosomal protein L7/L12
A4U85_RS17310  6       41       2.829780    50S ribosomal protein L10
A4U85_RS17315  2       31       2.414036    50S ribosomal protein L1
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS17285  ADHESNFAMILY  26     0.034    Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 26.4 bits (58), Expect = 0.034
Identities = 6/28 (21%), Positives = 12/28 (42%)

Query: 3 KKIGLISTVILSTVMFTGCQNMSPSDQR 30
KK+G + + LS ++ C +
Sbjct: 2 KKLGTLLVLFLSAIILVACASGKKDTTS 29


42  A4U85_RS17865  A4U85_RS17900  Y  N  N  Genomic Island
LocusTag       DNBias  CDNBias  %GCBias     Product
A4U85_RS17865  3       24       0.860006    helix-turn-helix domain-containing protein
A4U85_RS17870  4       33       2.228724    glutathione peroxidase
A4U85_RS17875  4       31       2.598023    hypothetical protein
A4U85_RS17880  5       33       3.690696    F0F1 ATP synthase subunit epsilon
A4U85_RS17885  5       34       3.580175    F0F1 ATP synthase subunit beta
A4U85_RS17890  3       31       3.279157    F0F1 ATP synthase subunit gamma
A4U85_RS17895  4       31       3.325235    F0F1 ATP synthase subunit alpha
A4U85_RS17900  3       21       2.625091    F0F1 ATP synthase subunit delta
43  A4U85_RS18040  A4U85_RS18150  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
A4U85_RS18040  3       20       3.127753    BolA family transcriptional regulator
A4U85_RS18055  3       21       3.066302    **4'-phosphopantetheinyl transferase superfamily
A4U85_RS18060  3       21       3.865567    alpha/beta fold hydrolase
A4U85_RS18065  2       22       3.970302    hypothetical protein
A4U85_RS18070  2       22       3.836295    outer membrane lipoprotein-sorting protein
A4U85_RS18075  3       23       3.700550    non-ribosomal peptide synthetase
A4U85_RS18080  1       20       2.531340    acyl carrier protein
A4U85_RS18085  -1      20       2.801640    acyl-CoA dehydrogenase
A4U85_RS18090  -2      19       1.924861    fatty acyl-AMP ligase
A4U85_RS18095  -1      17       1.704993    LuxR family transcriptional regulator AbaR
A4U85_RS18100  -2      16       2.724777    DUF4902 domain-containing protein
A4U85_RS18105  -2      15       3.532731    acyl-homoserine-lactone synthase AbaI
A4U85_RS18110  -1      16       4.107984    MHS family MFS transporter
A4U85_RS18115  -1      16       4.191856    enoyl-CoA hydratase/isomerase family protein
A4U85_RS18120  -1      17       4.468809    enoyl-CoA hydratase
A4U85_RS18125  -1      18       4.194140    acyl-CoA dehydrogenase family protein
A4U85_RS18130  0       21       3.841227    AMP-binding protein
A4U85_RS18135  1       23       3.607967    3-hydroxyisobutyrate dehydrogenase
A4U85_RS18140  1       23       3.423419    CoA-acylating methylmalonate-semialdehyde
A4U85_RS18145  -1      21       3.456299    LysR family transcriptional regulator
A4U85_RS18150  -1      20       3.221667    amino acid permease
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18060  NUCEPIMERASE  60     7e-12    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 60.2 bits (146), Expect = 7e-12
Identities = 31/127 (24%), Positives = 53/127 (41%), Gaps = 9/127 (7%)

Query: 16 TILVTGAAGFIGSRLIVELLREGHQVIAALRNAATKKDKLLGFIATQGLADPSISFVEYD 75
LVTGAAGFIG + LL GHQV+ + N D L + LA P F + D
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVV-GIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 76 LSRDFKLDSLLSDAQTKIHVIYHLAA----SFNWGISKAEAERTNIKSGLALIEWAATLK 131
L+ + L + ++ ++ A A+ +N+ L ++E
Sbjct: 61 LADREGMTDLFASGH--FERVFISPHRLAVRYSLENPHAYAD-SNLTGFLNILE-GCRHN 116

Query: 132 QLERFIW 138
+++ ++
Sbjct: 117 KIQHLLY 123


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18070  ACRIFLAVINRP  85     1e-18    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 85.3 bits (211), Expect = 1e-18
Identities = 50/233 (21%), Positives = 99/233 (42%), Gaps = 15/233 (6%)

Query: 725 QRYAKITILLKTGSN-----HRIKEILESLKTYMAGQLGDKAVVSFGGDVTQTIALTETM 779
+ A + I L TG+N IK L L+ + G K + + D T + +
Sbjct: 284 KPAAGLGIKLATGANALDTAKAIKAKLAELQPFFPQ--GMKVLYPY--DTTPFV---QLS 336

Query: 780 VHGKLMNILQISFAVFFISALVFRSISAGLIVLTPLLFSILAIFGVMGWLDIPLNIPNSL 839
+H + + + VF + L +++ A LI + +L F ++ +N
Sbjct: 337 IHEVVKTLFEAIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMF 396

Query: 840 ISAMAVGIGADYAIYFLYRLREILREEGGDIKDAIRKTLSTAGKASLFVATAVAGGYGVL 899
+A+G+ D AI + + ++ E+ K+A K++S A + +A ++ + +
Sbjct: 397 GMVLAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPM 456

Query: 900 SLSQG--FHVHQWLAMFIVIAMLFSVFTTLIMVPTM-ILILKPRFIFSSKKKS 949
+ G +++ ++ IV AM SV LI+ P + +LKP + K
Sbjct: 457 AFFGGSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENKG 509



Score = 61.8 bits (150), Expect = 1e-11
Identities = 27/156 (17%), Positives = 63/156 (40%), Gaps = 10/156 (6%)

Query: 792 FAVFFISALVFRSISAGLIVLTPLLFSILAIFGVMGWLDIPLNIPNSLISAMAVGIGADY 851
VF A ++ S S + V+ + I+ + + ++ + +G+ A
Sbjct: 881 VVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKN 940

Query: 852 AIYFLYRLREILREEGGDIKDAIRKTLSTAGKASL--FVATAVAGGYGVL----SLSQGF 905
AI + ++++ +EG + +A A + L + T++A GVL S G
Sbjct: 941 AILIVEFAKDLMEKEGKGVVEATLM----AVRMRLRPILMTSLAFILGVLPLAISNGAGS 996

Query: 906 HVHQWLAMFIVIAMLFSVFTTLIMVPTMILILKPRF 941
+ + ++ M+ + + VP ++++ F
Sbjct: 997 GAQNAVGIGVMGGMVSATLLAIFFVPVFFVVIRRCF 1032



Score = 41.0 bits (96), Expect = 4e-05
Identities = 42/223 (18%), Positives = 84/223 (37%), Gaps = 30/223 (13%)

Query: 394 VLVIGLLHFEAFRSKQGLILPLVTALLAVAWGMGMMGLFKQPMDIFNSPTPILILAIAAG 453
V ++ L + R+ + + LL + G + +F ++LAI
Sbjct: 351 VFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFG-----MVLAIGLL 405

Query: 454 --HAVQLLKRYYEDFDRLIAQGMEPKAANSEAVVQSLVRVGPVMVLAGGIAAAGFFSLLT 511
A+ +++ ++ + PK EA +S+ ++ +V + +A F +
Sbjct: 406 VDDAIVVVENVE---RVMMEDKLPPK----EATEKSMSQIQGALVGIAMVLSAVFIPMAF 458

Query: 512 FNIPT---IRSFGIFTGIGIISTLVIEMTFIPALRSML--PPPSVVKVKRKGLPIW---- 562
F T R F I + ++++ + PAL + L P + + G W
Sbjct: 459 FGGSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTT 518

Query: 563 -DWIPNRIGDV---ILSVRPRMMLMTAIAAMG---IFLAIGTS 598
D N + IL R +L+ A+ G +FL + +S
Sbjct: 519 FDHSVNHYTNSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSS 561



Score = 36.4 bits (84), Expect = 0.001
Identities = 32/204 (15%), Positives = 74/204 (36%), Gaps = 26/204 (12%)

Query: 350 MGPINKIVESEQSK---DMTISVGGNPVYLDKAEDYSKRINILFPIAVLVIGLLHFEAFR 406
G ++E+ SK + G + + L I+ +V+ L +
Sbjct: 836 SGDAMALMENLASKLPAGIGYDWTGMSYQERLSG---NQAPALVAISFVVVFLCLAALYE 892

Query: 407 SKQGLILPLVTALLAVAWGMGMMGLFKQPMDIFNSPTPILILAIAAGHAVQLLKRYYEDF 466
S + ++ L + + LF Q D++ + + ++A +A+ ++ +F
Sbjct: 893 SWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIV-----EF 947

Query: 467 --DRLIAQGMEPKAANSEAVVQSLVRVGPVMVLAGGIAAAGFFSLLTFNIPTIRSFGIFT 524
D + +G A A +R+ P+++ + A +L I G
Sbjct: 948 AKDLMEKEGKGVVEATLMA---VRMRLRPILM----TSLAFILGVLPLAISNGAGSGAQN 1000

Query: 525 GI------GIISTLVIEMTFIPAL 542
+ G++S ++ + F+P
Sbjct: 1001 AVGIGVMGGMVSATLLAIFFVPVF 1024


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18080  ISCHRISMTASE  27     0.008    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 26.9 bits (59), Expect = 0.008
Identities = 19/73 (26%), Positives = 34/73 (46%), Gaps = 5/73 (6%)

Query: 12 IRTLVAKEMRVEPETIDPDQKFTSYGLDSIVALSVSGDLEDLTKL--ELEPTLLWDYPTI 69
IR +A+ ++ PE I + GLDS+ +++ +E + E+ L + PTI
Sbjct: 235 IRKQIAELLQETPEDITDQEDLLDRGLDSVRIMTL---VEQWRREGAEVTFVELAERPTI 291

Query: 70 NALAEYLVSELQQ 82
+ L + QQ
Sbjct: 292 EEWQKLLTTRSQQ 304


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18105  AUTOINDCRSYN  124    3e-38    Autoinducer synthesis protein signature.
		>AUTOINDCRSYN#Autoinducer synthesis protein signature.

Length = 216

Score = 124 bits (314), Expect = 3e-38
Identities = 35/156 (22%), Positives = 63/156 (40%), Gaps = 5/156 (3%)

Query: 14 NNFSEGLYTKFKSYRYRVFVEYLGWELNCPNDEELDQFDKIDTAYVVAQDRESNIIGCAR 73
SE + + R F + L W + C + E DQ+D +T Y+ ++ +I R
Sbjct: 10 TLLSETKSGELFTLRKETFKDRLNWAVQCTDGMEFDQYDNNNTTYLFGIK-DNTVICSLR 68

Query: 74 LLPTTQPYLLGAIFPQLLNGMPIPCSPEIWELSRFSAVDFSNPPSSSSQAVSSPVSIAIL 133
+ T P ++ F + IP E SRF VD S P+S +
Sbjct: 69 FIETKYPNMITGTFFPYFKEINIPEGN-YLESSRF-FVDKSRAKDILGNE--YPISSMLF 124

Query: 134 QEAINFAREQGAKQLITTSPLGVERLLRAAGFRAHR 169
IN+++++G + T + +L+ +G+
Sbjct: 125 LSMINYSKDKGYDGIYTIVSHPMLTILKRSGWGIRV 160


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18110  TCRTETB       29     0.045    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 29.1 bits (65), Expect = 0.045
Identities = 29/121 (23%), Positives = 53/121 (43%), Gaps = 20/121 (16%)

Query: 75 LGGLVFGHFGDKIGRKSMLLLTLMLMGIPTVLIGLLPTYESIGYWAAIGLVILRFIQGMA 134
+G V+G D++G K +LL +++ +V+ + ++ S+ L++ RFIQG
Sbjct: 64 IGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSL-------LIMARFIQG-- 114

Query: 135 MGGEWGGAVLMAV------EHAPEGGKGFWGSLPQASTG-----GGLMLASIALGLVSLL 183
G A++M V + G GS+ G GG++ I + L+
Sbjct: 115 AGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLI 174

Query: 184 P 184
P
Sbjct: 175 P 175


44  A4U85_RS18195  A4U85_RS18395  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
A4U85_RS18195  2       18       -2.520001   ribonuclease E inhibitor RraB
A4U85_RS18200  2       19       -3.427646   RluA family pseudouridine synthase
A4U85_RS18205  1       15       -0.630782   DUF1003 domain-containing protein
A4U85_RS18210  1       16       -0.297816   SDR family NAD(P)-dependent oxidoreductase
A4U85_RS18215  0       16       0.745120    hypothetical protein
A4U85_RS18220  -1      17       1.590564    DUF3396 domain-containing protein
A4U85_RS18225  -1      19       2.244847    hypothetical protein
A4U85_RS18230  0       22       3.319296    type VI secretion system tip protein VgrG
A4U85_RS18235  0       22       3.399688    sel1 repeat family protein
A4U85_RS18240  0       23       3.343832    3-oxoacyl-ACP synthase
A4U85_RS18250  2       23       -2.519156   hypothetical protein
A4U85_RS18255  3       25       -5.106653   hypothetical protein
A4U85_RS18260  3       26       -5.457147   hypothetical protein
A4U85_RS18265  -2      17       0.473981    DUF4126 domain-containing protein
A4U85_RS19330  -2      19       0.269157    hypothetical protein
A4U85_RS18275  -1      19       0.981096    hypothetical protein
A4U85_RS18280  1       21       2.744767    DUF4433 domain-containing protein
A4U85_RS18285  2       26       4.123315    Fe/S-dependent 2-methylisocitrate dehydratase
A4U85_RS18290  2       26       4.815115    2-methylcitrate synthase
A4U85_RS18295  1       22       4.162010    methylisocitrate lyase
A4U85_RS18300  0       19       4.158037    GntR family transcriptional regulator
A4U85_RS18305  -1      19       3.625897    aspartate/tyrosine/aromatic aminotransferase
A4U85_RS18310  -2      18       3.050769    D-lactate dehydrogenase
A4U85_RS18315  -2      16       2.513516    FMN-dependent L-lactate dehydrogenase LldD
A4U85_RS18320  -2      13       1.408165    transcriptional regulator LldR
A4U85_RS18325  -3      11       0.023607    L-lactate permease
A4U85_RS18330  -3      11       -0.643805   phosphomannomutase/phosphoglucomutase
A4U85_RS18335  -3      11       -1.665153   UDP-glucose 4-epimerase GalE
A4U85_RS18340  -2      17       -3.077702   glucose-6-phosphate isomerase
A4U85_RS18345  -1      16       -4.423197   UDP-glucose/GDP-mannose dehydrogenase family
A4U85_RS18350  1       20       -7.380210   UTP--glucose-1-phosphate uridylyltransferase
A4U85_RS18355  5       24       -9.379943   sugar transferase
A4U85_RS18360  8       26       -11.556332  glycosyltransferase
A4U85_RS18365  8       29       -12.141665  glycosyltransferase family 4 protein
A4U85_RS18370  9       31       -11.444401  EpsG family protein
A4U85_RS18375  8       30       -10.141378  glycosyltransferase
A4U85_RS18380  5       27       -7.674206   glycosyltransferase family 4 protein
A4U85_RS18385  3       23       -6.271178   acyltransferase
A4U85_RS18390  3       19       -3.590288   oligosaccharide flippase family protein
A4U85_RS18395  2       16       -2.624853   DegT/DnrJ/EryC1/StrS family aminotransferase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18210  DHBDHDRGNASE  39     8e-06    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 38.9 bits (90), Expect = 8e-06
Identities = 27/77 (35%), Positives = 36/77 (46%)

Query: 4 NIIIFGYGTGISQAVAHKFGKEGYKIGLVARNAQKLEKAILELKAQGIEAYAFACDLAVL 63
I G GI +AVA +G I V N +KLEK + LKA+ A AF D+
Sbjct: 10 IAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDS 69

Query: 64 EDIPNLIKRIKDRLGEI 80
I + RI+ +G I
Sbjct: 70 AAIDEITARIEREMGPI 86


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18255  MICOLLPTASE   27     0.030    Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 26.6 bits (58), Expect = 0.030
Identities = 10/56 (17%), Positives = 19/56 (33%)

Query: 3 YTWDEFEQRLITYRDAWIDLARILDAYEHQIKELLQQIQLLTYEDSLPVFNQLYEI 58
Y + Y + +D Y +++ L L +++ V N LY
Sbjct: 269 YYTNSVIYNTKGYDAKNTEFYNRIDPYMERLESLCTIGDKLNNDNAWLVNNALYYT 324


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18300  ANTHRAXTOXNA  33     0.002    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 32.8 bits (74), Expect = 0.002
Identities = 17/46 (36%), Positives = 26/46 (56%), Gaps = 4/46 (8%)

Query: 232 LALYPLSAFRAMNK----AAETVYETLRKEGTQKNVVDIMQTRKEL 273
L LY F MNK E + E+L+KEG +K+ +D+++ K L
Sbjct: 257 LELYAPDMFEYMNKLEKGGFEKISESLKKEGVEKDRIDVLKGEKAL 302


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18340  NUCEPIMERASE  179    8e-56    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 179 bits (455), Expect = 8e-56
Identities = 85/348 (24%), Positives = 144/348 (41%), Gaps = 35/348 (10%)

Query: 3 KILVTGGAGYIGSHTCVELLEAGHEVIVFDNLSNSSKESLN--RVQEITQKGLTFVEGDI 60
K LVTG AG+IG H LLEAGH+V+ DNL++ SL R++ + Q G F + D+
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 61 RNSGELDRVFQEHAIDAVIHFAGLKAVGESQEKPLIYFDNNIAGSIQLVKSMEKAGVYTL 120
+ + +F + V AV S E P Y D+N+ G + +++ + L
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHL 121

Query: 121 VFSSSATVYDEANTSPLNEEMPTGMPSNNYGYTKLIVEQLLQKLSVADSKWSIALLRYFN 180
+++SS++VY P + + P + Y TK E + S LR+F
Sbjct: 122 LYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYS-HLYGLPATGLRFFT 180

Query: 181 PVGAHKSGRIGEDPQGIPNNLMPYVTQVAVGRREKLSIYGNDYDTIDGTGVRDYIHVVDL 240
G P G P+ + T+ A+ + + +Y G RD+ ++ D+
Sbjct: 181 VYG----------PWGRPDMALFKFTK-AMLEGKSIDVYN------YGKMKRDFTYIDDI 223

Query: 241 ANAHLCALNNRLQAQGC---------------RAWNIGTGNGSSVLQVKNTFEQVNGVPV 285
A A + + A R +NIG + ++ E G+
Sbjct: 224 AEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEA 283

Query: 286 AFEFAPRRAGDVATSFADNARAVAELGWKPQYGLEDMLKDSWNWQKQN 333
P + GDV + AD +G+ P+ ++D +K+ NW +
Sbjct: 284 KKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDF 331


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18345  PF05616       30     0.019    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 30.5 bits (68), Expect = 0.019
Identities = 10/24 (41%), Positives = 20/24 (83%)

Query: 104 LRLPSEYSKFPELTKQVHTQLQRM 127
+RL S+YS+FPE+ + + +Q++R+
Sbjct: 152 MRLMSDYSRFPEVKELMESQMERL 175


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18385  SALSPVBPROT   32     0.004    Salmonella virulence plasmid 65kDa B protein signature.
		>SALSPVBPROT#Salmonella virulence plasmid 65kDa B protein signature.

Length = 591

Score = 32.0 bits (72), Expect = 0.004
Identities = 35/132 (26%), Positives = 60/132 (45%), Gaps = 8/132 (6%)

Query: 80 ETMLPRFLDILSNHKNLNPEY-----IYNDEINKYLEVVARDNCLGVIALSKSAKKIQSD 134
ET+L R D LS ++ + E+ +Y ++I + L + + V K K SD
Sbjct: 425 ETLLSR--DYLSTNEPSDEEFKNAMSVYINDIAEGLSSLPETDHRVVYRGLKLDKPALSD 482

Query: 135 ILKAYPKVRDKIEKKMFVLYPPQKIYTTQHEIEEKVLKPLKLIFVGNDFYLKGGAECILA 194
+LK Y + + I K F+ P K + + + K K +G+ + KG AE +
Sbjct: 483 VLKEYTTIGNIIIDKAFMSTSPDKAWINDTILNIYLEKGHKGRILGDVAHFKGEAEMLFP 542

Query: 195 INELLE-EGIIS 205
N L+ E I++
Sbjct: 543 PNTKLKIESIVN 554


45  A4U85_RS18465  A4U85_RS19450  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
A4U85_RS18465  2       18       -3.858014   transposase
A4U85_RS18470  1       26       -2.467191   DUF4942 domain-containing protein
A4U85_RS18475  0       26       -1.818186   hypothetical protein
A4U85_RS18480  1       25       -2.100543   hypothetical protein
A4U85_RS18485  0       24       -1.099803   hypothetical protein
A4U85_RS18490  -1      25       -0.855158   hypothetical protein
A4U85_RS18495  0       22       -1.073187   hypothetical protein
A4U85_RS18500  2       23       -1.206608   hypothetical protein
A4U85_RS18505  1       22       -0.682067   hypothetical protein
A4U85_RS18510  -1      21       -0.971881   hypothetical protein
A4U85_RS18515  2       23       -1.178664   hypothetical protein
A4U85_RS18520  1       22       -1.840170   hypothetical protein
A4U85_RS18525  3       16       -2.752001   hypothetical protein
A4U85_RS18530  4       17       -2.714774   hypothetical protein
A4U85_RS18545  5       20       -3.696235   hypothetical protein
A4U85_RS18550  4       26       2.308536    hypothetical protein
A4U85_RS18555  5       30       3.832739    recombinase family protein
A4U85_RS18560  6       37       6.461024    Fic family protein
A4U85_RS18565  6       64       15.875550   hypothetical protein
A4U85_RS18570  7       69       16.981036   hypothetical protein
A4U85_RS18580  7       82       21.379553   sulfonamide-resistant dihydropteroate synthase
A4U85_RS18585  5       85       21.634320   phosphoglucosamine mutase
A4U85_RS18590  5       87       21.985531   hypothetical protein
A4U85_RS18595  7       90       22.117830   IS91-like element ISVsa3 family transposase
A4U85_RS18600  9       96       22.902544   hypothetical protein
A4U85_RS18605  8       95       22.992526   recombinase family protein
A4U85_RS18610  8       88       21.165670   helix-turn-helix transcriptional regulator
A4U85_RS18615  7       85       20.522283   arsenical resistance protein ArsH
A4U85_RS18620  5       72       18.351847   arsenate reductase ArsC
A4U85_RS18625  4       68       17.593765   ACR3 family arsenite efflux transporter
A4U85_RS18630  2       60       16.393039   hypothetical protein
A4U85_RS18635  2       59       15.765363   type I toxin-antitoxin system ptaRNA1 family
A4U85_RS18640  2       63       16.726358   hypothetical protein
A4U85_RS18645  2       61       16.617824   P-type conjugative transfer protein TrbL
A4U85_RS18650  2       64       16.568570   entry exclusion lipoprotein TrbK
A4U85_RS18655  2       69       17.212581   P-type conjugative transfer protein TrbJ
A4U85_RS18660  1       55       12.842518   stabilization protein
A4U85_RS18665  -1      44       9.721565    replication protein C
A4U85_RS18670  -1      34       5.292867    AAA family ATPase
A4U85_RS18675  -1      31       3.189021    AlpA family phage regulatory protein
A4U85_RS18680  -2      30       2.747559    site-specific integrase
A4U85_RS18685  0       17       -2.364235   hypothetical protein
A4U85_RS18690-116-2.554847hypothetical protein
A4U85_RS18695217-2.360735hypothetical protein
A4U85_RS19400518-0.268616hypothetical protein
A4U85_RS18700315-0.462249hypothetical protein
A4U85_RS18705114-0.254301hypothetical protein
A4U85_RS194502120.447901hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18590  PF05272       27     0.007    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 27.3 bits (60), Expect = 0.007
Identities = 21/62 (33%), Positives = 28/62 (45%), Gaps = 4/62 (6%)

Query: 3 RSQRRGRDPPCWLSEKEPLVGKRRVNFQRCWLPVSRIGLHRRGVERVNCGPKLLDLGEKG 62
RSQR G + W +P R + F P SR + G + +C +LLD G G
Sbjct: 197 RSQRDGSEAWKWRGWDDP----RPLYFPSHRAPESRTVVLVEGERKADCLQQLLDAGAPG 252

Query: 63 VY 64
VY
Sbjct: 253 VY 254


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18645  cloacin       30     0.017    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 30.5 bits (68), Expect = 0.017
Identities = 22/70 (31%), Positives = 30/70 (42%), Gaps = 2/70 (2%)

Query: 283 GLVSGGGIGAAGGIGNFGAGAAVGAAVTAASMATGGAALAGKAVMGAAAGAAGGASALQA 342
G SG G G G G+G + AA +A G AL+ G A + G AL A
Sbjct: 57 GGGSGHGNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAG--ALSA 114

Query: 343 AFQKASASME 352
A A+++
Sbjct: 115 AIADIMAALK 124


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18700  ISCHRISMTASE  28     0.018    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 28.1 bits (62), Expect = 0.018
Identities = 8/17 (47%), Positives = 12/17 (70%)

Query: 135 WVIDPERAIQKLHEMKE 151
WV DP RA+ +H+M+
Sbjct: 24 WVPDPNRAVLLIHDMQN 40


46  A4U85_RS18825  A4U85_RS18900  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
A4U85_RS18825320-2.921739signal peptidase II
A4U85_RS18830221-2.611567peptidylprolyl isomerase
A4U85_RS18835222-2.085860NAD(P)H-dependent oxidoreductase
A4U85_RS18840220-1.771540ankyrin repeat domain-containing protein
A4U85_RS18870-118-1.584573**peptidylprolyl isomerase
A4U85_RS18875118-1.123716alpha/beta hydrolase
A4U85_RS18880322-0.177956hypothetical protein
A4U85_RS18885321-1.954168hypothetical protein
A4U85_RS18890418-2.890874MBL fold metallo-hydrolase
A4U85_RS18895417-3.508760nucleotide exchange factor GrpE
A4U85_RS18900217-2.474095molecular chaperone DnaK
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18830  INFPOTNTIATR  28     0.017    Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 28.0 bits (62), Expect = 0.017
Identities = 18/79 (22%), Positives = 37/79 (46%), Gaps = 2/79 (2%)

Query: 3 EIIQPNEEIRITDGSKVDLHFSVAIENGVEIDNTRSREEPVSLTIGDGNLLPGFEKALLG 62
+II + V + ++ + +G D+T +P + + ++PG+ +AL
Sbjct: 131 KIIDAGTGAKPGKSDTVTVEYTGTLIDGTVFDSTEKAGKPATFQVS--QVIPGWTEALQL 188

Query: 63 LRAGDRRTVHLPPEDAFGP 81
+ AG V +P + A+GP
Sbjct: 189 MPAGSTWEVFVPADLAYGP 207


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18875  SACTRNSFRASE  28     0.036    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 28.0 bits (62), Expect = 0.036
Identities = 15/73 (20%), Positives = 23/73 (31%), Gaps = 18/73 (24%)

Query: 84 AIRQKPTDIELVEDI------------RLPLQSGTIFARHYHPA------PNKKLPLIVF 125
IR L+EDI L +A+ H + + F
Sbjct: 81 KIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHF 140

Query: 126 YHGGGFVVGGLDT 138
Y F++G +DT
Sbjct: 141 YAKHHFIIGAVDT 153


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18905  SHAPEPROTEIN  141    2e-39    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 141 bits (358), Expect = 2e-39
Identities = 79/380 (20%), Positives = 141/380 (37%), Gaps = 69/380 (18%)

Query: 5 IGIDLGTTNSCVAVLEGDKVKVIENAEGARTTPSIIAYKDGEILVGQSAKRQAVTNPKNT 64
+ IDLGT N+ + V V + R + VG AK+ P N
Sbjct: 13 LSIDLGTANTLIYVKGQGIVLNEPSVVAIRQDRA--GSPKSVAAVGHDAKQMLGRTPGN- 69

Query: 65 LFAIKRLIGRRYEDQAVQKDIGLVPYKIIKADNGDAWVEVNDKKLAPQQISAEILKK-MK 123
+ AI+ P K D +A ++ ++L+ +K
Sbjct: 70 IAAIR-------------------PMK--------------DGVIADFFVTEKMLQHFIK 96

Query: 124 KTAEDYLGETVTEAVITVPAYFNDAQRQATKDAGKIAGLDVKRIINEPTAAALAFGMDKK 183
+ + ++ VP +R+A +++ + AG +I EP AAA+ G+
Sbjct: 97 QVHSNSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGAREVFLIEEPMAAAIGAGLPVS 156

Query: 184 EGDRKVAVYDLGGGTFDVSIIEIADLDGDQQIEVLSTNGDTFLGGEDFDNALIEYLVEEF 243
E V D+GGGT +V++I + + + +GG+ FD A+I Y+ +
Sbjct: 157 E-ATGSMVVDIGGGTTEVAVISLNG---------VVYSSSVRIGGDRFDEAIINYVRRNY 206

Query: 244 KKEQNVNLKNDPLALQRLKEAAEKAKIELSSS----NATEINLPYITADATGPKHLVINV 299
+ AE+ K E+ S+ EI + P+ +N
Sbjct: 207 G-------------SLIGEATAERIKHEIGSAYPGDEVREIEVRGRNLAEGVPRGFTLN- 252

Query: 300 TRAKLEGLVADLVARTIEPCKIALKD-AGLSTSDISD--VILVGGQSRMPLVQQKVQEFF 356
+ LE L + + + +AL+ SDIS+ ++L GG + + + + + E
Sbjct: 253 SNEILEALQ-EPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNLDRLLMEET 311

Query: 357 GREPRKDVNPDEAVAIGAAI 376
G +P VA G
Sbjct: 312 GIPVVVAEDPLTCVARGGGK 331


47  A4U85_RS01525  A4U85_RS01550  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
A4U85_RS01525117-3.234892TPM domain-containing protein
A4U85_RS01530117-4.526751TPM domain-containing protein
A4U85_RS01535314-4.386428pilin
A4U85_RS01540011-3.249795O-antigen ligase family protein
A4U85_RS01545-111-2.081177O-antigen ligase family protein
A4U85_RS01550-118-1.239486bacterioferritin
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS01525  cloacin       34     0.001    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 33.5 bits (76), Expect = 0.001
Identities = 14/32 (43%), Positives = 16/32 (50%)

Query: 326 SGAGGGGRGGGFGGGGGGYGGGGGRFGGGGAS 357
SG GG G GGG G GGG GG ++
Sbjct: 52 SGIHWGGGSGHGNGGGNGNSGGGSGTGGNLSA 83



Score = 33.5 bits (76), Expect = 0.001
Identities = 17/31 (54%), Positives = 19/31 (61%), Gaps = 1/31 (3%)

Query: 330 GGGRGGGFG-GGGGGYGGGGGRFGGGGASGS 359
GGG G G GGG G+G GGG GG SG+
Sbjct: 47 GGGSGSGIHWGGGSGHGNGGGNGNSGGGSGT 77



Score = 33.1 bits (75), Expect = 0.001
Identities = 15/34 (44%), Positives = 17/34 (50%)

Query: 326 SGAGGGGRGGGFGGGGGGYGGGGGRFGGGGASGS 359
SG+G GG G GGG G GG G GG +
Sbjct: 50 SGSGIHWGGGSGHGNGGGNGNSGGGSGTGGNLSA 83



Score = 33.1 bits (75), Expect = 0.002
Identities = 14/31 (45%), Positives = 16/31 (51%)

Query: 329 GGGGRGGGFGGGGGGYGGGGGRFGGGGASGS 359
GG G G +GGG G GGG GGG+
Sbjct: 48 GGSGSGIHWGGGSGHGNGGGNGNSGGGSGTG 78



Score = 33.1 bits (75), Expect = 0.002
Identities = 15/32 (46%), Positives = 16/32 (50%)

Query: 327 GAGGGGRGGGFGGGGGGYGGGGGRFGGGGASG 358
G G G GG G G GGG G GGG +G
Sbjct: 47 GGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTG 78



Score = 30.8 bits (69), Expect = 0.008
Identities = 14/33 (42%), Positives = 15/33 (45%)

Query: 327 GAGGGGRGGGFGGGGGGYGGGGGRFGGGGASGS 359
G G G G G G G GG G GG G G+
Sbjct: 48 GGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGN 80


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS01535  BCTERIALGSPG  48     6e-10    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 48.0 bits (114), Expect = 6e-10
Identities = 17/54 (31%), Positives = 33/54 (61%)

Query: 1 MNAQKGFTLIELMIVVAIIGILAAIAIPAYQNYIAKSQASEAFTLADGLKTTIN 54
+ Q+GFTL+E+M+V+ IIG+LA++ +P K+ +A + L+ ++
Sbjct: 4 TDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALD 57


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS01540  ABC2TRNSPORT  30     0.022    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 29.5 bits (66), Expect = 0.022
Identities = 27/125 (21%), Positives = 49/125 (39%), Gaps = 8/125 (6%)

Query: 12 AIYLITAILGYQTGWS-IFQYDELRLLQFPLALYALLILIFAKKLKFSIYSQITFFIIAG 70
I ++ A LGY S ++ + L A +++ A + I+ Q T I
Sbjct: 131 GIGVVAAALGYTQWLSLLYALPVIALTGLAFASLGMVVTALAPSYDYFIFYQ-TLVITPI 189

Query: 71 LILTQQINWEIFEFQELWSVIALLFIFSALTYELRNIKNI--EHGIALIVVTAIIPCLFI 128
L L+ +F +L V F L++ + I+ I H + + C++I
Sbjct: 190 LFLSGA----VFPVDQLPIVFQTAARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYI 245

Query: 129 LISIF 133
+I F
Sbjct: 246 VIPFF 250


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS01550  HELNAPAPROT   31     0.002    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 30.6 bits (69), Expect = 0.002
Identities = 21/94 (22%), Positives = 38/94 (40%), Gaps = 10/94 (10%)

Query: 44 EYKESIRQMKHADKIIERILFLEGLPN--LQHLGKLY------IGQHTEEVLQCDIRKVK 95
E + + D I ER+L + G P ++ + E++Q + K
Sbjct: 52 ELYDHAA--ETVDTIAERLLAIGGQPVATVKEYTEHASITDGGNETSASEMVQALVNDYK 109

Query: 96 ENIEAIQKAVALAETEQDYVTRDLVQEILEKEEE 129
+ + + LAE QD T DL ++E+ E+
Sbjct: 110 QISSESKFVIGLAEENQDNATADLFVGLIEEVEK 143


48  A4U85_RS01580  A4U85_RS01615  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
A4U85_RS01580118-0.744106GspH/FimT family pseudopilin
A4U85_RS01590120-0.934501type IV pilus modification protein PilV
A4U85_RS01595019-0.749054PilW family protein
A4U85_RS01600017-0.270574pilus assembly protein
A4U85_RS01605-1180.102868VWA domain-containing protein
A4U85_RS016102201.305630prepilin-type N-terminal cleavage/methylation
A4U85_RS016152182.354762prepilin-type N-terminal cleavage/methylation
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS01585  BCTERIALGSPG  41     4e-07    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 40.6 bits (95), Expect = 4e-07
Identities = 15/42 (35%), Positives = 26/42 (61%), Gaps = 1/42 (2%)

Query: 1 MRGIIPQEGFTLVELMVTIAVMAIIALMAAPS-MSNLLESKR 41
MR Q GFTL+E+MV I ++ ++A + P+ M N ++ +
Sbjct: 1 MRATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADK 42


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS01595  BCTERIALGSPG  33     5e-04    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 33.3 bits (76), Expect = 5e-04
Identities = 14/48 (29%), Positives = 28/48 (58%), Gaps = 2/48 (4%)

Query: 7 SNKEQGFTLIELIVALA-LGLILVAAATQLFIGGLLSSRLQKANAEIQ 53
++K++GFTL+E++V + +G++ L G + QKA ++I
Sbjct: 4 TDKQRGFTLLEIMVVIVIIGVLASLVVPNLM-GNKEKADKQKAVSDIV 50


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS01610  BCTERIALGSPG  57     3e-13    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 57.2 bits (138), Expect = 3e-13
Identities = 22/59 (37%), Positives = 35/59 (59%)

Query: 11 QGVTLIELMVVIVIVAIFASIAIPSYQSYSRRATASAAKSEILKLAEQLEQHKSRNFTY 69
+G TL+E+MVVIVI+ + AS+ +P+ +A A S+I+ L L+ +K N Y
Sbjct: 8 RGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKLDNHHY 66


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS01615  BCTERIALGSPG  54     4e-12    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 53.7 bits (129), Expect = 4e-12
Identities = 18/63 (28%), Positives = 37/63 (58%)

Query: 2 KNGFTLIEIMIVVAIIAILAAIATPSYLQYLRKGHRTAVQSEMMNIAQTLESEKVVHNRY 61
+ GFTL+EIM+V+ II +LA++ P+ + K + S+++ + L+ K+ ++ Y
Sbjct: 7 QRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKLDNHHY 66

Query: 62 SSN 64
+
Sbjct: 67 PTT 69


49  A4U85_RS03125  A4U85_RS03160  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
A4U85_RS03125-112-1.297528efflux RND transporter permease subunit
A4U85_RS03130013-1.297509efflux RND transporter periplasmic adaptor
A4U85_RS03135213-1.122069hypothetical protein
A4U85_RS03140212-0.943360twitching motility response regulator PilG
A4U85_RS03145212-0.784127response regulator
A4U85_RS03150212-0.927018purine-binding chemotaxis protein CheW
A4U85_RS03155112-1.341155methyl-accepting chemotaxis protein
A4U85_RS03160111-1.343627Hpt domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS03125  ACRIFLAVINRP  785    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 785 bits (2029), Expect = 0.0
Identities = 284/1037 (27%), Positives = 496/1037 (47%), Gaps = 34/1037 (3%)

Query: 5 RISVKYPVFTIMMMLSLMVLGLASWKRMTVEEFPNIDFPFVVVTTQYAGASPEAVESDIT 64
++ P+F ++ + LM+ G + ++ V ++P I P V V+ Y GA + V+ +T
Sbjct: 3 NFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVT 62

Query: 65 KKLEDQINTISGIKQITSRS-SEGLSMVIAEFNLDTSSAIAAQDVRDKIAPVIAQFRDEI 123
+ +E +N I + ++S S S G + F T IA V++K+ E+
Sbjct: 63 QVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEV 122

Query: 124 DTPIVQRYDPSSSPIMSVVFESNSMSLAQ--LSSYVDKKIVPQLKTVSGVGNVNLLGDAK 181
+ SSS +M F S++ Q +S YV + L ++GVG+V L G A+
Sbjct: 123 QQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG-AQ 181

Query: 182 RQIRIKVHPEQLQSYGIGIDQVINTLKNENIEVPAGTL------QQKNSELVVQIQSKVI 235
+RI + + L Y + VIN LK +N ++ AG L + + Q++
Sbjct: 182 YAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 236 HPLGFGDLVI-ANKNGSPIFLKQVATVEDTQAELQSSAFYNGRTAVSVDILKSSDANVIQ 294
+P FG + + N +GS + LK VA VE A NG+ A + I ++ AN +
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 295 VVDKTYQTLEKLKAQMPAGLNYKVVADSSKGIRASIKDVVRTIIEGAVLAVLIVLLFLGS 354
L +L+ P G+ D++ ++ SI +VV+T+ E +L L++ LFL +
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 355 FRSTVITGLTLPITLLGTLTFIWAFGFSINMMTLLALSLSIGLLIDDAIVVRENIVRH-T 413
R+T+I + +P+ LLGT + AFG+SIN +T+ + L+IGLL+DDAIVV EN+ R
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 414 ELGKDHVTAALDGTKEIGLAVLATTLTIVAVFLPVAFMGGLIGRFFYQFGVTVSTAVLIS 473
E A +I A++ + + AVF+P+AF GG G + QF +T+ +A+ +S
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 474 MFISFTLDPMLSAHWKDPIKKKESRLQR-FFNYISNLLDGLTHIYEKLLKLALRFRFITV 532
+ ++ L P L A P+ + + FF + + D + Y + L +
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYL 541

Query: 533 IIAIVSLVVALGLSKMIGTEFVPTPDKGEIRIQFETPVDSSLEYTQAKLHQVDQII--RQ 590
+I + + + L + + F+P D+G + P ++ E TQ L QV +
Sbjct: 542 LIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNE 601

Query: 591 FPDVVSTYGVVNSEVDSGKNHAGLG-VTLKPKQERSADLTTLNNEFRDRLQSVAGIRVTS 649
+V S + V +AG+ V+LKP +ER+ D + + IR
Sbjct: 602 KANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGF 661

Query: 650 VAAAQDS------VSGGQKPIMISIKGSDLNELQKISDRFMAEMEK-IDGVVDLESSLKE 702
V + G +I G + L + ++ + + +V + + E
Sbjct: 662 VIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGLE 721

Query: 703 PKPTLGVHINRVLASDLDLSVSQIANAIRPLIAGDNVTTWEDRDGETYDVNIRLNENKRV 762
+ +++ A L +S+S I I + G V + D G + ++ + R+
Sbjct: 722 DTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFID-RGRVKKLYVQADAKFRM 780

Query: 763 LPQDVQNLYLNSNKTNANGQNILVPLSAVATTQEKLGASQINRRDLEREVLIEAN-TSGR 821
LP+DV LY+ +ANG+ +VP SA T+ G+ ++ R + + I+ G
Sbjct: 781 LPEDVDKLYV----RSANGE--MVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGT 834

Query: 822 PSGDIGQDIDKMQKAFKLPAGYTFDTQGANADMAESAGYALTAITLSIVFIYIVLGSQFN 881
SGD ++ + KLPAG +D G + S A + +S V +++ L + +
Sbjct: 835 SSGDAMALMENLAS--KLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYE 892

Query: 882 SFIHPAAIMASLPLSLIGVFLALFLFRSTLNLFSIIGIIMLMGLVTKNAILLIDFIKKAM 941
S+ P ++M +PL ++GV LA LF +++ ++G++ +GL KNAIL+++F K M
Sbjct: 893 SWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLM 952

Query: 942 E-DGISRYDAILQAGKTRLRPILMTTSAMVMGMVPLALGLGEGGEQSAPMAHAVIGGVIT 1000
E +G +A L A + RLRPILMT+ A ++G++PLA+ G G + V+GG+++
Sbjct: 953 EKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVS 1012

Query: 1001 STLLTLVVVPVIFTYLD 1017
+TLL + VPV F +
Sbjct: 1013 ATLLAIFFVPVFFVVIR 1029


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS03130  RTXTOXIND     44     5e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 44.4 bits (105), Expect = 5e-07
Identities = 37/217 (17%), Positives = 72/217 (33%), Gaps = 49/217 (22%)

Query: 102 RLNNQDNAARLAQAQANLASAQAQAELARNLMNRKQRLLNQGFIARVEF---EQSQVDYK 158
LN A A + + + + ++ ++ LL++ IA+ E V+
Sbjct: 206 ELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAV 265

Query: 159 GQLESVRAQ-------------------------------QANVDIA------KKADRDG 181
+L ++Q Q +I K +
Sbjct: 266 NELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQ 325

Query: 182 ---IITSPISGVITKRQV-EPGQTVSVGQTLFEIV-NPDQLEIQAKLPIEQQSALKVGSS 236
+I +P+S + + +V G V+ +TL IV D LE+ A + + + VG +
Sbjct: 326 QASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQN 385

Query: 237 IQYQI----QGNSKQLHAILTRISPVADQDSRQIEFF 269
++ L + I+ A +D R F
Sbjct: 386 AIIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLGLVF 422



Score = 37.1 bits (86), Expect = 9e-05
Identities = 20/120 (16%), Positives = 42/120 (35%), Gaps = 10/120 (8%)

Query: 75 IQAQVSATATAVTANVGQKVQKGQVLVRLNNQDNAARLAQAQANLASAQAQAELARNLMN 134
I+ ++ + G+ V+KG VL++L A + Q++L A+ + + L
Sbjct: 99 IKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSR 158

Query: 135 --RKQRLLNQGFIARVEFEQS--------QVDYKGQLESVRAQQANVDIAKKADRDGIIT 184
+L F+ K Q + + Q+ ++ R +T
Sbjct: 159 SIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLT 218


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS03140  HTHFIS        79     2e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 78.7 bits (194), Expect = 2e-20
Identities = 31/118 (26%), Positives = 55/118 (46%), Gaps = 2/118 (1%)

Query: 9 KVMVIDDSKTIRRTAETLLQREGCEVITAVDGFEALSKIAEANPDIVFVDIMMPRLDGYQ 68
++V DD IR L R G +V + IA + D+V D++MP + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 69 TCALIKNSQNYQNIPVIMLSSKDGLFDQAKGRVVGSDEYLTKPFSKDELLNAIRNHVS 126
IK + ++PV+++S+++ K G+ +YL KPF EL+ I ++
Sbjct: 65 LLPRIKKA--RPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS03145  HTHFIS        83     4e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 83.3 bits (206), Expect = 4e-22
Identities = 39/118 (33%), Positives = 58/118 (49%), Gaps = 2/118 (1%)

Query: 2 ARILIVDDSPTETFRFKEILTKHGYDVLEASNGADGVTLAKAEQPDLVLMDVVMPGVNGF 61
A IL+ DD + L++ GYDV SN A A DLV+ DVVMP N F
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 QATRQITRDEDTKHIPVVIVSTKDQATDRVWGKRQGAIDYLIKPIEEKQLIDVIKQFL 119
+I + +PV+++S ++ + +GA DYL KP + +LI +I + L
Sbjct: 64 DLLPRI-KKAR-PDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS03155  FLAGELLIN     30     0.025    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 30.4 bits (68), Expect = 0.025
Identities = 22/228 (9%), Positives = 63/228 (27%), Gaps = 10/228 (4%)

Query: 451 STAMNEMAQSIDQVSANASESAEVAQRSVQIASNGAQVVNRSIEGMDTIREQIQETSKRI 510
+ + + Q S NA++ +AQ +N +++ + + Q +
Sbjct: 50 ANRFTSNIKGLTQASRNANDGISIAQ----TTEGALNEINNNLQRVRELSVQATNGTNSD 105

Query: 511 KRLGESSQEIGNIVSLINDIADQT-----NILALNAAIQASMAGEAGRGFAVVADEVQRL 565
L EI + I+ +++QT +L+ + ++ + G + ++
Sbjct: 106 SDLKSIQDEIQQRLEEIDRVSNQTQFNGVKVLSQDNQMKIQVGANDGETITIDLQKIDVK 165

Query: 566 AERSASATKQIETLV-KTIQTDTNEAVISMEQTTTEVVRGANLAKDAGIALDEIQKVSGD 624
+ + + V + + + D D
Sbjct: 166 SLGLDGFNVNGPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPD 225

Query: 625 LAKLIASISDAAKLQSASASHIATTMTVVQEITSQTTTATFDTARSVS 672
+ A+ + + + + T + A +
Sbjct: 226 KVYVNAANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGK 273


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS03160  HTHFIS        87     1e-19    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 87.2 bits (216), Expect = 1e-19
Identities = 29/124 (23%), Positives = 56/124 (45%), Gaps = 2/124 (1%)

Query: 1382 IMIVDDSVTVRKVTSRLLERQGYDVVTAKDGVDAIEQLENIKPDLMLLDIEMPRMDGFEV 1441
I++ DD +R V ++ L R GYDV + + DL++ D+ MP + F++
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 1442 LNLVRHHDMHQYMPIIMITSRTGEKHRERAFSLGVSQYMGKPFQEEELLENIDALLVASE 1501
L ++ +P+++++++ +A G Y+ KPF EL+ I L +
Sbjct: 66 LPRIKKARPD--LPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 1502 SEVK 1505

Sbjct: 124 RRPS 127


50  A4U85_RS03450  A4U85_RS03475  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
A4U85_RS03450-311-3.100317porin
A4U85_RS03455-210-1.729418phosphoethanolamine--lipid A transferase
A4U85_RS03460-212-1.054544DNA-binding response regulator PmrA
A4U85_RS03465-111-0.774427two-component system sensor histidine kinase
A4U85_RS03470113-0.124367phosphatase PAP2 family protein
A4U85_RS19130211-0.727130hypothetical protein
A4U85_RS034750140.772343ammonium transporter
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS03450  56KDTSANTIGN  36     3e-04    Rickettsia 56kDa type-specific antigen protein sign...
		>56KDTSANTIGN#Rickettsia 56kDa type-specific antigen protein signature.

Length = 533

Score = 36.1 bits (83), Expect = 3e-04
Identities = 15/41 (36%), Positives = 18/41 (43%)

Query: 51 VQEQRQVQQQQQQVQQQQQVQLAEVKAQPQPVAAPVSPLAG 91
+ + Q QQ Q Q Q Q A+ AQ AA V L G
Sbjct: 330 IHLNFVMPPQAQQQQGQGQQQQAQATAQEAVAAAAVRLLNG 370


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS03460  HTHFIS        80     1e-19    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 80.3 bits (198), Expect = 1e-19
Identities = 34/131 (25%), Positives = 59/131 (45%), Gaps = 2/131 (1%)

Query: 2 TKILMIEDDFMIAESTITLLQYHQFEVEWVNNGLDGLAQLAKTKFDLILLDLGLPMMDGM 61
IL+ +DD I L ++V +N +A DL++ D+ +P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 QVLKQIRQRAA-TPVLIISARDQLQNRVDGLNLGADDYLIKPYEFDELLARI-HALLRRS 119
+L +I++ PVL++SA++ + GA DYL KP++ EL+ I AL
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 120 GVEAQLASQDQ 130
++L Q
Sbjct: 124 RRPSKLEDDSQ 134


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS03465  TATBPROTEIN   29     0.026    Bacterial sec-independent translocation TatB protein...
		>TATBPROTEIN#Bacterial sec-independent translocation TatB protein signature.

Length = 171

Score = 28.8 bits (64), Expect = 0.026
Identities = 27/102 (26%), Positives = 39/102 (38%), Gaps = 23/102 (22%)

Query: 153 LPFAIFALAAIIRRGLKPIDDFKNELKE-------RDS---------EELTPIEVHDYPQ 196
LP A+ +A IR +NEL + +DS LTP E+
Sbjct: 25 LPVAVKTVAGWIRALRSLATTVQNELTQELKLQEFQDSLKKVEKASLTNLTP-ELKASMD 83

Query: 197 ELLPTIDEMNRLFERISKAQNEQKQFIADAAHELRTPVTALN 238
EL + M +R A + +K +D AH + PV N
Sbjct: 84 ELRQAAESM----KRSYVANDPEKA--SDEAHTIHNPVVKDN 119


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS03475  PF05272       32     0.004    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.0 bits (72), Expect = 0.004
Identities = 18/76 (23%), Positives = 31/76 (40%), Gaps = 5/76 (6%)

Query: 101 GGIAERAKMRSQAIATLALVALVYP---FFEGMVWNGNYGLQKWLETTFGAAFHDFAGSV 157
G A+ QAI A + V+P + + W+ L+KWL G D+
Sbjct: 510 GTGEASAQTTEQAINVAADMNRVHPFRDWVKAQQWDEVPRLEKWLVHVLGKTPDDYKPRR 569

Query: 158 V--VHAMGGWIALAAV 171
+ + +G +I + V
Sbjct: 570 LRYLQLVGKYILMGHV 585


51  A4U85_RS04015  A4U85_RS04050  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
A4U85_RS04015-1152.312370TetR/AcrR family transcriptional regulator
A4U85_RS04020-1152.932034nitroreductase family protein
A4U85_RS040251162.619221SDR family oxidoreductase
A4U85_RS040301172.946613TetR/AcrR family transcriptional regulator
A4U85_RS040351173.077349glycerate kinase
A4U85_RS04040-1152.728653FdhF/YdeP family oxidoreductase
A4U85_RS040450122.837107formate dehydrogenase accessory
A4U85_RS040500132.212369DUF1232 domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS04015  HTHTETR       52     1e-10    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 51.9 bits (124), Expect = 1e-10
Identities = 20/105 (19%), Positives = 41/105 (39%), Gaps = 2/105 (1%)

Query: 5 ALPTRALYVVNKAIDLFHHRGFHLIGVDRIVKESEITKATFYNYFHSKERLIEICLMVQK 64
A TR +++ A+ LF +G + I K + +T+ Y +F K L + +
Sbjct: 9 AQETRQH-ILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 65 EKLQEQVIAMVEYDLGTP-AIDKLKKLYFLHTDLEGPYYLLFKAI 108
+ E + G P ++ + ++ L + + L I
Sbjct: 68 SNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEI 112


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS04025  DHBDHDRGNASE  80     4e-20    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 80.5 bits (198), Expect = 4e-20
Identities = 52/185 (28%), Positives = 91/185 (49%), Gaps = 2/185 (1%)

Query: 7 VLITGASSGIGSVYADRFAQRGYHLILVARDTNRLDKISKDLQEKYGVQVEFIQADLSND 66
ITGA+ GIG A A +G H+ V + +L+K+ L+ + E AD+ +
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAE-ARHAEAFPADVRDS 69

Query: 67 QDIRKI-EDVLKNDADIEILVNNAGIALNGNFLTQDRNEIEKLLTLNMTAVVRLSHAMSQ 125
I +I + + I+ILVN AG+ G + E E ++N T V S ++S+
Sbjct: 70 AAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSK 129

Query: 126 SLIRKGKGAIINLGSVLGLAPEFGSTIYGASKSFIQFFSQGLHLELKDHGVHVQAVLPSA 185
++ + G+I+ +GS P Y +SK+ F++ L LEL ++ + V P +
Sbjct: 130 YMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPGS 189

Query: 186 TKTEI 190
T+T++
Sbjct: 190 TETDM 194


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS04030  HTHTETR       55     7e-12    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 55.4 bits (133), Expect = 7e-12
Identities = 19/76 (25%), Positives = 35/76 (46%)

Query: 1 MKVSKTQVKENRDKIVEKATQLFRSKGYDGVGIAELMSSAGFTHGGFYKHFSSKTDLVTI 60
+ +K + +E R I++ A +LF +G + E+ +AG T G Y HF K+DL +
Sbjct: 2 ARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE 61

Query: 61 TAKYGLEQVLKRIEGL 76
+ + +
Sbjct: 62 IWELSESNIGELELEY 77


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS04050  PF04647       26     0.050    Accessory gene regulator B
		>PF04647#Accessory gene regulator B

Length = 212

Score = 25.9 bits (57), Expect = 0.050
Identities = 15/88 (17%), Positives = 33/88 (37%), Gaps = 11/88 (12%)

Query: 39 VVAVAYAFSPIDLIPDFIPILGFIDDAVILPILIWLAVRFTPQQVIFDAEQQAKEWLDEH 98
+V A+ + P + +L I L L++L P+ +I
Sbjct: 86 LVFNVLAYIAHLIDPAYFQLLILIAFITSLLALLFLVPVDNPRNLI-----------SNT 134

Query: 99 EKRPKNYLVAVLIILIWLTLAVMAYFYF 126
E+R L +++++ ++ AY +
Sbjct: 135 EQRKTLKLKTSMVLMVLFGGSIGAYRLY 162


52  A4U85_RS05090  A4U85_RS05115  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
A4U85_RS05090116-0.156826acinetobactin biosynthesis bifunctional
A4U85_RS05095118-1.199290acinetobactin biosynthesis histidine
A4U85_RS19150119-1.883358hypothetical protein
A4U85_RS05100014-1.313847acinetobactin export ABC transporter
A4U85_RS05105-113-1.383504acinetobactin export ABC transporter
A4U85_RS05110014-2.974472acinetobactin biosynthesis thioesterase BasH
A4U85_RS05115-115-2.605845acinetobactin biosynthesis phosphopantetheinyl
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS05090  ISCHRISMTASE  336    e-118    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 336 bits (863), Expect = e-118
Identities = 133/302 (44%), Positives = 190/302 (62%), Gaps = 19/302 (6%)

Query: 1 MAISKISTYLMPERESYPNNKTDWQLDPSRAVLLIHDMQRYFLNFYDAESELIKTVVNHL 60
MAI I Y MP P NK W DP+RAVLLIHDMQ YF++ + A + + + ++
Sbjct: 1 MAIPAIQPYQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANI 60

Query: 61 VQLRSWAHQNNVPVVYTAQPYEQPAEDRALLNAMWGPGLPASTIDQQKIIDQLSPAEHDI 120
+L++ Q +PVVYTAQP Q +DRALL WGPGL + ++ KII +L+P + D+
Sbjct: 61 RKLKNQCVQLGIPVVYTAQPGSQNPDDRALLTDFWGPGLNSGPYEE-KIITELAPEDDDL 119

Query: 121 VLTKWRYSAFKRSDLLERMQNWNRDQLIIGGVYAHIGCMVTAIEAFMSDIQPFLVGDAVA 180
VLTKWRYSAFKR++LLE M+ RDQLII G+YAHIGC+VTA EAFM DI+ F VGDAVA
Sbjct: 120 VLTKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVA 179

Query: 181 DFSEEEHRLALKYVSSRCGQVVDTESVVRQV----------------ATGITRPWLEQKV 224
DFS E+H++AL+Y + RC V T+S++ Q+ T + +++
Sbjct: 180 DFSLEKHQMALEYAAGRCAFTVMTDSLLDQLQNAPADVQKTSANTGKKNVFTCENIRKQI 239

Query: 225 QQLIEE--DELDPEENLILYGLDSLRIMQFSSELKAQGINISFEELGRTPTLSNWWSLVD 282
+L++E +++ +E+L+ GLDS+RIM + + +G ++F EL PT+ W L+
Sbjct: 240 AELLQETPEDITDQEDLLDRGLDSVRIMTLVEQWRREGAEVTFVELAERPTIEEWQKLLT 299

Query: 283 AR 284
R
Sbjct: 300 TR 301


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS05095  MICOLLPTASE   29     0.047    Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 28.9 bits (64), Expect = 0.047
Identities = 10/70 (14%), Positives = 23/70 (32%), Gaps = 8/70 (11%)

Query: 282 AAIRSQTNLQRRQRIQHCLKMAQYAVDRFQVVGI---PAWRNPNSI--TVVFPCPSEHIW 336
+++ + + + + +VV NP+ I V++ P E+
Sbjct: 401 FVVKAGDKVTEEKIKRLYWASKEVKAQFMRVVQNDKALEEGNPDDILTVVIYNSPEEY-- 458

Query: 337 KKHYLATSGN 346
K +G
Sbjct: 459 -KLNRIINGF 467
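
Hits such as this one sit very close to the E-value cutoff of 0.05 listed under the report's input parameters, so it can help to separate them from the clearly significant matches. The sketch below grades the four hits reported for this island. The stricter threshold `STRONG_CUTOFF = 1e-5` and the labels "strong", "marginal" and "rejected" are assumptions chosen only for illustration; they are not PredictBias settings.

```python
REPORT_CUTOFF = 0.05  # E-value (RPSBlast) from the report's input parameters
STRONG_CUTOFF = 1e-5  # hypothetical stricter threshold, for illustration only

# (locus tag, signature, E-value) copied from the hit tables of this island;
# the Expect value printed as "e-118" is taken to mean 1e-118.
hits = [
    ("A4U85_RS05090", "ISCHRISMTASE", 1e-118),
    ("A4U85_RS05095", "MICOLLPTASE", 0.047),
    ("A4U85_RS05100", "PF05272", 0.005),
    ("A4U85_RS05115", "ENTSNTHTASED", 5e-19),
]


def grade(evalue, report_cutoff=REPORT_CUTOFF, strong_cutoff=STRONG_CUTOFF):
    """Label a hit by where its E-value falls relative to the two cutoffs."""
    if evalue > report_cutoff:
        return "rejected"
    return "strong" if evalue <= strong_cutoff else "marginal"


graded = {locus: grade(evalue) for locus, _signature, evalue in hits}
for locus, label in graded.items():
    print(locus, label)
```

Under these assumed thresholds the isochorismatase and enterobactin synthetase matches are graded as strong, while the collagenase and virulence-associated E family matches pass the report cutoff only marginally.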


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS05100  PF05272       32     0.005    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.4 bits (73), Expect = 0.005
Identities = 12/49 (24%), Positives = 21/49 (42%), Gaps = 2/49 (4%)

Query: 347 MTAITGPSGVGKTTLLRALAGLAVLQEGVVQLA--KTDILQLHEKLRHE 393
+ G G+GK+TL+ L GL + + K Q+ + +E
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYEQIAGIVAYE 646


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS05115  ENTSNTHTASED  77     5e-19    Enterobactin synthetase component D signature.
		>ENTSNTHTASED#Enterobactin synthetase component D signature.

Length = 234

Score = 76.6 bits (188), Expect = 5e-19
Identities = 48/182 (26%), Positives = 84/182 (46%), Gaps = 19/182 (10%)

Query: 68 RQLEFLTGRLAAKYALQSFNLQDSIVYQGRHGEPLWPEGVMGGISHVGSKKSCYAIAYAR 127
R+ E L GR+AA +AL+ ++ ++ G +PLWP+G+ G ISH + A+A
Sbjct: 46 RKAEHLAGRIAAVHALREVGVR-TVPGMGDKRQPLWPDGLFGSISHCAT----TALAVIS 100

Query: 128 NNTPKEKIFGIDIESQKHYIFFQKKDEFYDVFLNKNEQEEIEKLLKDQAYLYLIIFSAKE 187
GIDIE E ++ +E++ ++ L + FSAKE
Sbjct: 101 RQR-----IGIDIEKIMSQ---HTATELAPSIIDSDERQILQASLLPFPLALTLAFSAKE 152

Query: 188 SIIKAFYLKYKQIIDFKNIKFKALDGAFLYFYLR---QESLIEITLEVKVYFFHTNNEII 244
S+ KAF + + F + K +L + +L ++ E T+ + +F +N +I
Sbjct: 153 SVYKAFSDR-VTLPGFNSAKVTSLTATHISLHLLPAFAATMAERTVRTE--WFQRDNSVI 209

Query: 245 TI 246
T+
Sbjct: 210 TL 211


53  A4U85_RS05900  A4U85_RS05930  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS05900  -1      16       -1.242913  NirD/YgiW/YdeI family stress tolerance protein
A4U85_RS05905  -3      14       -0.596551  response regulator transcription factor
A4U85_RS05910  -2      15       -0.492730  GHKL domain-containing protein
A4U85_RS05915  -1      17       -0.176902  hemin uptake protein HemP
A4U85_RS05920  -1      15       -0.149680  thiamine phosphate synthase
A4U85_RS05925  -2      14       -0.133052  tetratricopeptide repeat protein
A4U85_RS05930  0       14       -0.261476  penicillin-binding protein 1B
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS05900  NAFLGMOTY     28     0.011    Sodium-type flagellar protein MotY precursor signature.
		>NAFLGMOTY#Sodium-type flagellar protein MotY precursor signature.

Length = 293

Score = 27.8 bits (61), Expect = 0.011
Identities = 26/92 (28%), Positives = 44/92 (47%), Gaps = 6/92 (6%)

Query: 9 MTAGMATAGVVVANTPVNQ-AAIAPATVTTVKQALASKD-NTPVKLH-GQVVKSLGDEKY 65
M + T+GV+++ N A + V T +Q+ NTP++ + S GD +
Sbjct: 1 MNKWLITSGVMLSLLSANSYAVMGKRYVATPQQSQWEMVVNTPLECQLVHPIPSFGDAVF 60

Query: 66 QFRDKSGNITIDVDDELWQGRPVSANKNVTLI 97
R + I++D EL RP+ +NV+LI
Sbjct: 61 SSR---ASKKINLDFELKMRRPMGETRNVSLI 89


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS05905  HTHFIS        88     1e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 88.3 bits (219), Expect = 1e-22
Identities = 32/130 (24%), Positives = 55/130 (42%)

Query: 2 RILLAEDDRSQAESIQSWLELDGYQVDWVERGDYALTAIEQHDYDCILLDRGLPQLAGEK 61
IL+A+DD + + L GY V I D D ++ D +P
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 VLTTIRHKQKNVPVIFITARDSIHDRVEGLDLGANDYLVKPFSLEELSARIRAQLRKQTL 121
+L I+ + ++PV+ ++A+++ ++ + GA DYL KPF L EL I L +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 122 SQQSILTWGD 131
+
Sbjct: 125 RPSKLEDDSQ 134


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS05910  PF06580       31     0.013    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 30.6 bits (69), Expect = 0.013
Identities = 18/99 (18%), Positives = 42/99 (42%), Gaps = 20/99 (20%)

Query: 351 LFDNAIRY----SPALGAIHVELSQYQQKLKISIEDTGNGVDDEVLRRLGQRFFRVLGTK 406
L +N I++ P G I ++ ++ + + +E+TG+ L
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLA---------------LKNT 307

Query: 407 QQGSGLGLS-ITKKIIQLHGGELHFMHASQGGLKVEVIL 444
++ +G GL + +++ L+G E + + G ++L
Sbjct: 308 KESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVL 346


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS05930  TYPE3OMGPROT  31     0.029    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 30.6 bits (69), Expect = 0.029
Identities = 33/145 (22%), Positives = 54/145 (37%), Gaps = 28/145 (19%)

Query: 49 AKVFARP-LEIYNNAP-ITQANFTQELKLLGYKTSSNYDKSGTYVAQGSNMYVHTRGFDY 106
A+V +RP L NA + + T +K+ G + + + G+ + + R
Sbjct: 357 AQVVSRPTLLTQENAQAVIDHSETYYVKVTGKEVAELKG-----ITYGTMLRMTPRVLTQ 411

Query: 107 GDS--------IEPEQVLELSFANDQVVEVRSTKPSSTGVARL---EPLLIGGIYPQHNE 155
GD IE S + + + T + VAR+ + L+IGGIY
Sbjct: 412 GDKSEISLNLHIEDGNQKPNSSGIEGIPTISRTVVDT--VARVGHGQSLIIGGIYRDELS 469

Query: 156 DRVLIKLNSVPK----PLIEALIST 176
+ + VP P I AL
Sbjct: 470 VAL----SKVPLLGDIPYIGALFRR 490


54  A4U85_RS07155  A4U85_RS07180  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS07155  0       10       0.372415   3-oxoacyl-ACP reductase
A4U85_RS07160  -1      9        0.044987   hypothetical protein
A4U85_RS07165  -3      9        1.220172   serine hydrolase
A4U85_RS07170  -3      11       0.974965   TetR family transcriptional regulator
A4U85_RS07175  -2      10       1.126834   multidrug efflux MFS transporter AmvA
A4U85_RS07180  -2      9        0.918752   DNA polymerase III subunit gamma/tau
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS07155  DHBDHDRGNASE  75     3e-17    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 74.7 bits (183), Expect = 3e-17
Identities = 65/262 (24%), Positives = 114/262 (43%), Gaps = 23/262 (8%)

Query: 220 AKPLAGKTALVTGASRGIGEAIAHVLARDGAHVICLD-VPQQQADLDRVAADIGGSTLAI 278
AK + GK A +TGA++GIGEA+A LA GAH+ +D P++ + A
Sbjct: 3 AKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAF 62

Query: 279 DITAADAG---EKIKAAATKQGGLDIIVHNAGITRDKTLANMKPELWDLVININ----LS 331
D+ E + G +DI+V+ AG+ R + ++ E W+ ++N +
Sbjct: 63 PADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFN 122

Query: 332 AAERVNDYLLENDGLNANGRIVCVSSISGIAGNLGQTNYAASKAGVIGLVKFTA-PILKN 390
A+ V+ Y+++ G IV V S YA+SKA + K + +
Sbjct: 123 ASRSVSKYMMDRRS----GSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEY 178

Query: 391 GITINAVAPGFIETQMTAAIPFAIREAGRRMNS----------MQQGGLPVDVAETIAWF 440
I N V+PG ET M ++ A + + +++ P D+A+ + +
Sbjct: 179 NIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFL 238

Query: 441 ASTASTGVNGNVVRVCGQSLLG 462
S + + + + V G + LG
Sbjct: 239 VSGQAGHITMHNLCVDGGATLG 260


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS07170  HTHTETR       55     9e-12    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 55.0 bits (132), Expect = 9e-12
Identities = 28/169 (16%), Positives = 64/169 (37%), Gaps = 12/169 (7%)

Query: 5 NRDQRREMILQAAMQIALAEGFTAMTVRRIATEAQTSTGQVHHHFSSASHLKAEAFLKLM 64
+ R+ IL A+++ +G ++ ++ IA A + G ++ HF S L +E +
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 65 EQLDEIEQTLQTTSQFQRLFILLGAENIDRLHPYLRLWNEAELLIEQDIE---------- 114
+ E+E + ++F + + E + + LL+E
Sbjct: 68 SNIGELEL--EYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAV 125

Query: 115 IQKAYNLAMQSWHQAIVQSIECGQKEGEFKNRSNSTDIAWRLIAFVCGL 163
+Q+A + I Q+++ + + A + ++ GL
Sbjct: 126 VQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGL 174


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS07175  TCRTETB       262    2e-84    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 262 bits (670), Expect = 2e-84
Identities = 90/419 (21%), Positives = 190/419 (45%), Gaps = 13/419 (3%)

Query: 7 ILTIIVLIYLPVTIDATVMHVATPSLSAALNLTANQLLWIIDIYSLIMAGLILPMGALGD 66
IL + ++ ++ V++V+ P ++ N W+ + L + G L D
Sbjct: 15 ILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSD 74

Query: 67 RIGFKKLLFIGTAVFGVGSLAAAFSPTAYA-LIASRAVLGLGAAMLIPATLSGIRNAFTE 125
++G K+LL G + GS+ + ++ LI +R + G GAA PA + + +
Sbjct: 75 QLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAA-FPALVMVVVARYIP 133

Query: 126 EKQRNFALGLWSTVGGGGAAFGPLVGGFVLEHFHWGAVFLINIPIILAVLVMIVMIIPKQ 185
++ R A GL ++ G GP +GG + + HW +L+ IP+I + V +M + K+
Sbjct: 134 KENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKK 191

Query: 186 QEKTDQPINLGQALVLVVAILSLIYSIKSAMYNFSVLTVVMFVVGISTLIHFIRSQKRAT 245
+ + ++ +++ V I+ + S +F +++V+ F++ F++ ++ T
Sbjct: 192 EVRIKGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLSFLI-------FVKHIRKVT 244

Query: 246 TPMIDLELFKHPVISTSIVMAVVSMIALVGFELLLSQELQFVHGFSPLQA-AMFIIPFMI 304
P +D L K+ ++ + + GF ++ ++ VH S + ++ I P +
Sbjct: 245 DPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTM 304

Query: 305 AISLGGPLAGICLNKWGLRRVSSLGILVSALSLWGLAQLNFSTDHFLAWTCMVFLGFSIE 364
++ + G + GI +++ G V ++G+ ++S + L +T F+ + LG
Sbjct: 305 SVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSF 364

Query: 365 IALLASTAAIMSSVPPQKASAAGAIEGMAYELGAGLGVAIFGLMLSWFYSRSIILPAEL 423
+ ST S + + + ++ L G G+AI G +LS +LP E+
Sbjct: 365 TKTVISTIVSSSLKQQEAGAGMSLLNFTSF-LSEGTGIAIVGGLLSIPLLDQRLLPMEV 422


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS07180  TONBPROTEIN   53     1e-09    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 52.7 bits (126), Expect = 1e-09
Identities = 38/118 (32%), Positives = 45/118 (38%), Gaps = 26/118 (22%)

Query: 390 QVIAPVSAVQPIEVISQPAMVEPEPEPEPEPEPEPEPEP-EPEPEPEPEPEPEPQPNQDL 448
QVI + QPI V MV P P+ P EPEPEPEP PEP +
Sbjct: 34 QVIELPAPAQPISV----TMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKE----- 84

Query: 449 MVFDPNHHELIGLESAVVQETVSVLEEDFIPVPEQKLVQVQAETQVKQIEPEPASTAE 506
A V + P P +K VQ Q + VK +E PAS E
Sbjct: 85 ---------------APVVIEKPKPKPKPKPKPVKK-VQEQPKRDVKPVESRPASPFE 126



Score = 38.0 bits (88), Expect = 6e-05
Identities = 11/56 (19%), Positives = 24/56 (42%), Gaps = 1/56 (1%)

Query: 399 QPIEVISQPAMVEPEPEPEPEPEPEPEPEPEPEPEPEPEPEP-EPQPNQDLMVFDP 453
+P+P+P+P+P+P + + +P+ + +P E +P P
Sbjct: 75 PEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESRPASPFENTAP 130
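
A single hit can carry more than one alignment segment, as the two Score blocks above show for TONBPROTEIN. The sketch below collects every segment from a block of text and keeps the one with the highest bit score. It assumes each segment starts with a `Score = ... bits (...), Expect = ...` line; the helper name `parse_segments` is hypothetical and not part of the tool.

```python
import re

# One match per alignment segment (HSP) in a hit block.
HSP_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\(\d+\),\s*Expect\s*=\s*(\S+)"
)

# Statistics lines copied from the two TONBPROTEIN segments of A4U85_RS07180.
block = """\
Score = 52.7 bits (126), Expect = 1e-09
Identities = 38/118 (32%), Positives = 45/118 (38%), Gaps = 26/118 (22%)
Score = 38.0 bits (88), Expect = 6e-05
Identities = 11/56 (19%), Positives = 24/56 (42%), Gaps = 1/56 (1%)
"""


def parse_segments(text):
    """Return a list of (bit score, E-value) tuples, one per segment."""
    segments = []
    for match in HSP_RE.finditer(text):
        expect = match.group(2)
        # An Expect value printed as 'e-118' is read as 1e-118 (assumption).
        evalue = float("1" + expect if expect.startswith("e") else expect)
        segments.append((float(match.group(1)), evalue))
    return segments


segments = parse_segments(block)
best = max(segments, key=lambda seg: seg[0])
print(len(segments), "segments; best:", best)
```

The summary row for this locus reports the first segment (score 53, E-value 1e-09), which is also the one this selection returns.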


55  A4U85_RS08025  A4U85_RS08060  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS08025  0       12       0.254647   SDR family oxidoreductase
A4U85_RS08030  1       13       0.349827   OprD family outer membrane porin
A4U85_RS08035  2       14       0.908702   MFS transporter
A4U85_RS08040  0       13       1.493458   winged helix-turn-helix transcriptional
A4U85_RS08045  -1      13       1.625645   amidase
A4U85_RS08050  1       14       2.279155   acyl-CoA dehydrogenase family protein
A4U85_RS08055  1       14       1.677676   hypothetical protein
A4U85_RS08060  1       14       1.270384   nuclear transport factor 2 family protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS08025  DHBDHDRGNASE  62     9e-14    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 62.4 bits (151), Expect = 9e-14
Identities = 48/189 (25%), Positives = 88/189 (46%), Gaps = 4/189 (2%)

Query: 3 KTILITGASSGLGAGMAHEFAAKGYNLAICARRLDRLETLKTELEHQYGIKVIAKSLDVT 62
K ITGA+ G+G +A A++G ++A ++LE + + L+ + A DV
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAE-ARHAEAFPADVR 67

Query: 63 NYDQVFEVFRAFKQEFGYLDRIIVNAGVGNGRRIGKGNFEINRATAETNFISALAQCEAA 122
+ + E+ ++E G +D ++ AGV I + E AT N +
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 123 IEIFRAQNAGHLVVMSSMSAMRGLPK-HLSTYAASKAAVAHLAEGIRAELLDTPIKVSTI 181
+ + +G +V + S A G+P+ ++ YA+SKAA + + EL + I+ + +
Sbjct: 128 SKYMMDRRSGSIVTVGSNPA--GVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 182 FPGYIRTEM 190
PG T+M
Sbjct: 186 SPGSTETDM 194


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS08035  TCRTETB       51     5e-09    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 51.0 bits (122), Expect = 5e-09
Identities = 75/405 (18%), Positives = 153/405 (37%), Gaps = 43/405 (10%)

Query: 25 LCMLAYIFSFIDRQILALMIEPIKADLQLSDTQFSLLHGLAFSLFYAVMGLPLAYIADRF 84
LC+L++ FS ++ +L + + I D + ++ AF L +++ ++D+
Sbjct: 19 LCILSF-FSVLNEMVLNVSLPDIANDFNKPPASTNWVN-TAFMLTFSIGTAVYGKLSDQL 76

Query: 85 SRPKLISIGIIVWSLATATCGLSKNFIQ-LFLSRMAVGVGEAALSPAAYSMFSDMFSKDK 143
+L+ GII+ + + +F L ++R G G AA + + K+
Sbjct: 77 GIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKEN 136

Query: 144 LGRAVGIYSIGAFLGGGIAFLVGGYVIN--------LLKGVTLIEVPLLGAL----KAWQ 191
G+A G+ +G G+ +GG + + L+ +T+I VP L L +
Sbjct: 137 RGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVRIK 196

Query: 192 IAFLVVGLPGIIIGLLFILTVKDPARKGQQLNQSGQVDQVKFTQCLQFIKKHAKTFACHY 251
F + G+ + +G++F + + V + F ++ I+K F
Sbjct: 197 GHFDIKGIILMSVGIVFFMLFTTSYSISFLI-----VSVLSFLIFVKHIRKVTDPFVDPG 251

Query: 252 LGFTFYAM-----------ALYSLTSWTPAFYIRKFQLAPTETGYMLGTILLVANTLGVF 300
LG M + S P QL+ E G ++ ++ + +
Sbjct: 252 LGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGY 311

Query: 301 CAGWLNDWFIKKGRQDAPMFTGVIGIVGLIIP---IAFFTQTDQLWLSVTLLIPAMFFAS 357
G L D P++ IG+ L + +F +T ++++ ++ +
Sbjct: 312 IGGILVDRR-------GPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSF 364

Query: 358 FPLVISATALQMLAPNQFRARLSALFLLASNLIGLGVGTTLVAII 402
VIS L + A +S L + + G G +V +
Sbjct: 365 TKTVISTIVSSSLKQQEAGAGMSLLNFT--SFLSEGTGIAIVGGL 407


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS08040  SECA          28     0.014    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 28.3 bits (63), Expect = 0.014
Identities = 21/105 (20%), Positives = 40/105 (38%), Gaps = 8/105 (7%)

Query: 45 IDNTRRKIILSTNALGEASITDIANLSTLKLTTATKAVYRLVEDGIVEVYSSTTDERISM 104
ID R +I+S A + + N L K + + DE+
Sbjct: 216 IDEARTPLIISGPAEDSSEMYKRVNKIIPHLIRQEKEDSETFQGEG----HFSVDEKSRQ 271

Query: 105 VKLTAKGVELVEQINQISVVTLAGILNAFSE---DELHNLNHQLK 146
V LT +G+ L+E++ + G + +S +H++ L+
Sbjct: 272 VNLTERGLVLIEELLVKEGIMDEGE-SLYSPANIMLMHHVTAALR 315


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS08060  FLGMOTORFLIM  27     0.038    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 27.2 bits (60), Expect = 0.038
Identities = 8/28 (28%), Positives = 11/28 (39%), Gaps = 4/28 (14%)

Query: 71 DSWQSIYDMFKRYTQKESHFVMNAHFVN 98
+SW + D+ R Q E N F
Sbjct: 166 ESWTQVIDLRPRLGQIE----TNPQFAQ 189


56  A4U85_RS08225  A4U85_RS08255  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS08225  -2      13       1.273409   Rrf2 family transcriptional regulator
A4U85_RS08230  -1      13       1.603755   MBL fold metallo-hydrolase
A4U85_RS08235  -1      15       1.430432   TetR/AcrR family transcriptional regulator
A4U85_RS08240  0       17       1.938190   LysR family transcriptional regulator
A4U85_RS08245  0       17       2.633261   SDR family oxidoreductase
A4U85_RS08250  0       14       1.623252   glucose 1-dehydrogenase
A4U85_RS08255  -1      14       0.618356   SDR family NAD(P)-dependent oxidoreductase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS08225  FLGMOTORFLIG  28     0.023    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 27.8 bits (62), Expect = 0.023
Identities = 27/114 (23%), Positives = 47/114 (41%), Gaps = 14/114 (12%)

Query: 1 MAYITSSVE--YAIHCLLFLVNNEDKPLSSKDLAELQGVSPSFMAKIFPKLEK--AGLVV 56
+A I S ++ A L L E + ++ +A + SP + ++ LEK A L
Sbjct: 140 IALILSYLDPQKASFILSSL-PTEVQTNVARRIALMDRTSPEVVREVERVLEKKLASLSS 198

Query: 57 AQEGVRGGYLLARSAHEISFLD------IVNAIEGEKPLFECQEVRGKCAVFNN 104
GG + I+ D I+ ++E E P +E++ K VF +
Sbjct: 199 EDYTSAGG--VDNVVEIINMADRKTEKFIIESLEEEDPELA-EEIKKKMFVFED 249


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS08235  HTHTETR       63     1e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 62.7 bits (152), Expect = 1e-14
Identities = 31/186 (16%), Positives = 79/186 (42%), Gaps = 14/186 (7%)

Query: 1 MARP---RSEDKRNAILSAAIETLAELG-ERASTSKIAKVAGVAEGTLFTYFSNKEELLN 56
MAR +++ R IL A+ ++ G S +IAK AGV G ++ +F +K +L +
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 57 QLYLSLKAELRQVMMHGY-PTNADLQTQMSHIWQSYLDWSLEAPLKRKVMAQLSTSEQ-- 113
+++ ++ + ++ + D + + I L+ ++ +R +M + +
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 114 -----ITEQSKQIGMQTFCDLTQNIQECINDGKLKDY--PPLFIASILGALAEVTLNFIA 166
+ + + + ++++ + Q ++ CI L + G ++ + N++
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 167 QDPSQT 172
S
Sbjct: 181 APQSFD 186


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS08245  DHBDHDRGNASE  125    5e-37    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 125 bits (314), Expect = 5e-37
Identities = 88/255 (34%), Positives = 125/255 (49%), Gaps = 9/255 (3%)

Query: 8 LTGKIALVTGASRGIGEEIAKLLAEQGAHVIVSSRKIEDCQRVANEIIAANGKAEAAACH 67
+ GKIA +TGA++GIGE +A+ LA QGAH+ E ++V + + A AEA
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 68 VGKLEDIAAIFEYIRKEHGRLDILVNNAAANPYFGHILDTDIGAYNKTVEVNIRGYFFMS 127
V I I I +E G +DILV N A G I + T VN G F S
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILV-NVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNAS 124

Query: 128 VEAGKLMKEQGGGAIVNTASVNALQPGDRQGIYSITKAAVVNMTKAFAKECGPLGIRVNA 187
K M ++ G+IV S A P Y+ +KAA V TK E IR N
Sbjct: 125 RSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNI 184

Query: 188 LLPGLTKTKFASALFENED----IYKSWMDT----IPLRRHAEPREMAGTVLYLVSDAAS 239
+ PG T+T +L+ +E+ + K ++T IPL++ A+P ++A VL+LVS A
Sbjct: 185 VSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAG 244

Query: 240 YTNGECIVVDGGLTI 254
+ + VDGG T+
Sbjct: 245 HITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS08250  DHBDHDRGNASE  104    3e-29    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 104 bits (261), Expect = 3e-29
Identities = 75/252 (29%), Positives = 117/252 (46%), Gaps = 10/252 (3%)

Query: 7 GQVVLITGAASGFGALLAEQLAKYGAKLVLGDLNIEGLNTVVEPLRQAGVEVVAQVCDVS 66
G++ ITGAA G G +A LA GA + D N E L VV L+ A DV
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVR 67

Query: 67 CEADVQALVQSAVTQFGRVDVGINNAGMSPPMKSFIDTDEADLDLSFAVNAKGVFFGMKH 126
A + + + G +D+ +N AG+ P +DE + + +F+VN+ GVF +
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDE-EWEATFSVNSTGVFNASRS 126

Query: 127 QIRQMLQQGGGIILNVASVAGLGAAPKLAAYAAAKHAVVGLTKTAAVEYANKGIRVNAIC 186
+ M+ + G I+ V S +AAYA++K A V TK +E A IR N +
Sbjct: 127 VSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVS 186

Query: 187 PFYTTTPM---VVDSELKEKQDFLGQAS------PMKRLGHPSEVVAMMLMMCAKENSYL 237
P T T M + E +Q G P+K+L PS++ +L + + + ++
Sbjct: 187 PGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHI 246

Query: 238 TGQAIAIDGGVT 249
T + +DGG T
Sbjct: 247 TMHNLCVDGGAT 258


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS08255  DHBDHDRGNASE  85     3e-21    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 84.7 bits (209), Expect = 3e-21
Identities = 54/206 (26%), Positives = 93/206 (45%), Gaps = 10/206 (4%)

Query: 5 LSNRVAIVTGAGAGLGREHALLLARLGAKVVVNDLGSDVNGKGGSTMAAQKVVDEIIAAG 64
+ ++A +TGA G+G A LA GA + D + K S++ A+
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAE---------A 56

Query: 65 GEAMANGASVTDIEQVQQMVDETIARWGRVDILINNAGILRDKTFSKMSLDDFRTVIDVH 124
A A A V D + ++ G +DIL+N AG+LR +S +++ V+
Sbjct: 57 RHAEAFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVN 116

Query: 125 LMGAVNCTKAVWDIMREQKYGRIVMTTSSSGLYGNFGQSNYSAAKMALVGLMQTLALEGE 184
G N +++V M +++ G IV S+ + Y+++K A V + L LE
Sbjct: 117 STGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELA 176

Query: 185 KSNVRVNCLAP-TAATRMLEGLLPEE 209
+ N+R N ++P + T M L +E
Sbjct: 177 EYNIRCNIVSPGSTETDMQWSLWADE 202


57  A4U85_RS08340  A4U85_RS08395  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS08340  0       13       0.637705   MFS transporter
A4U85_RS08345  -1      12       -0.078348  LysR family transcriptional regulator
A4U85_RS08350  -1      13       1.451816   DMT family transporter
A4U85_RS08355  -1      13       2.115547   DMT family transporter
A4U85_RS08360  -1      13       1.827568   HlyD family secretion protein
A4U85_RS08365  0       13       1.841508   MFS transporter
A4U85_RS08370  0       13       1.981344   TetR/AcrR family transcriptional regulator
A4U85_RS08375  0       13       2.261962   aldehyde dehydrogenase (NADP(+))
A4U85_RS08380  1       15       1.271961   dihydroxy-acid dehydratase family protein
A4U85_RS08385  0       16       -0.035884  fumarylacetoacetate hydrolase family protein
A4U85_RS08390  0       17       0.210565   LysR family transcriptional regulator
A4U85_RS08395  -1      17       0.002139   NAD-dependent epimerase/dehydratase family
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS08340  TCRTETB       41     5e-06    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 41.4 bits (97), Expect = 5e-06
Identities = 82/432 (18%), Positives = 151/432 (34%), Gaps = 77/432 (17%)

Query: 8 RHSWVSLVICWIIWVVVAYDRELIFRAANMICNEFNLSPTQWGYTIAAITLSLAVLSIPV 67
RH+ + + +C + + V + ++ + I N+FN P + A L+ ++ +
Sbjct: 11 RHNQILIWLCILSFFSV-LNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVY 69

Query: 68 AALSDKHASGWKRGIFQWPLVIGFTFISLLSGITSLSSSFYKFVTL-RIMVSLGCGVAEP 126
LSD+ G KR L+ G S I + SF+ + + R + G
Sbjct: 70 GKLSDQL--GIKR-----LLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPA 122

Query: 127 VGVSNTAEWWPKEHRGFAIG--------------------AHHSGYPVGALLSGVAMATI 166
+ + A + PKE+RG A G AH+ + L+ + + T+
Sbjct: 123 LVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITV 182

Query: 167 IT--YFGPQNWRYAF---FLGIIFAVPALTFWAIYSTRKRYSEF------------HQSC 209
+ R GII + F+ +++T S H
Sbjct: 183 PFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLSFLIFVKHIRK 242

Query: 210 VDNQFTPPTDFVHDEGEEKTSTHSTWERLKQTLSSRGIVFTAASTLITHVVYIGFLTIFP 269
V + F P + + I GF+++ P
Sbjct: 243 VTDPFVDPGLGKN----------------------IPFMIGVLCGGIIFGTVAGFVSMVP 280

Query: 270 AFLYNIVGLDLAKSAGLSAVF--TITGMMGQIIWPTLSDKIGRRLTLILCGCWMAVS--I 325
+ ++ L A G +F T++ ++ I L D+ G L + +++VS
Sbjct: 281 YMMKDVHQLSTA-EIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLT 339

Query: 326 ASFCL--TSGVVSVIAIQLFFGLSANAIWPIFYATASDYAPAGAIGTANSLITVAQYVGG 383
ASF L TS +++I + + GLS + S G SL+ ++
Sbjct: 340 ASFLLETTSWFMTIIIVFVLGGLSFTKT--VISTIVSSSLKQQEAGAGMSLLNFTSFLSE 397

Query: 384 AVAPIIMGYLLT 395
I+G LL+
Sbjct: 398 GTGIAIVGGLLS 409



Score = 31.4 bits (71), Expect = 0.007
Identities = 42/185 (22%), Positives = 66/185 (35%), Gaps = 20/185 (10%)

Query: 261 YIGFLTIFPAFLYNIVGLDLAKSAGLS--------AVFTITGMMGQIIWPTLSDKIG-RR 311
+ F ++ + N+ D+A F +T +G ++ LSD++G +R
Sbjct: 21 ILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKR 80

Query: 312 LTLILCGCWMAVSIASFCLTSGVVSVIAIQLFFGLSANAIWPIFYATASDYAPAGAIGTA 371
L L S+ F S +I + G A A + + Y P G A
Sbjct: 81 LLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKA 140

Query: 372 NSLITVAQYVGGAVAPIIMGYLLTSFGGWHSHQGYIWCFLLMSCCAFIGVILQIILGYLI 431
LI +G V P I G + W +LL+ I +I L L+
Sbjct: 141 FGLIGSIVAMGEGVGPAIGGMIAHYIH---------WSYLLL--IPMITIITVPFLMKLL 189

Query: 432 KKEKS 436
KKE
Sbjct: 190 KKEVR 194


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS08360  RTXTOXIND     123    2e-33    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 123 bits (309), Expect = 2e-33
Identities = 63/421 (14%), Positives = 136/421 (32%), Gaps = 84/421 (19%)

Query: 14 PPKLTKMQYLKKHWVMVIAFIVVLVSILWILKVIFLPSSIVKTDDARVDV--EYSTIAPK 71
P L ++ ++A+ ++ ++ + + IV T + ++ I P
Sbjct: 43 PAHLELIETPVSRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPI 102

Query: 72 VSGNIEEIYIKDHQTVKKGQLLARIDARDYQAALAEAESNYAKAQAD------------- 118
+ ++EI +K+ ++V+KG +L ++ A +A + +S+ +A+ +
Sbjct: 103 ENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIEL 162

Query: 119 ----------------------------LNEAMLAVERQPTVIRET-----------EAQ 139
+ E + Q A+
Sbjct: 163 NKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLAR 222

Query: 140 LRKVEAGIKLTKDNTARYEQLQALGAESRLITQQSKTTLTEQYADLDSSKEKVIDAQYQL 199
+ + E ++ K + L A ++ + + E +L K ++ + ++
Sbjct: 223 INRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEI 282

Query: 200 NQYK---IQVQAK------------QAALKQAQAALDKAKLNLSYTEIRAPIDGMIGQKS 244
K V + L K + + IRAP+ + Q
Sbjct: 283 LSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLK 342

Query: 245 AN-VGNFVGAGNPLMVVVPLDQVY-VEANFREIELKQIKIGQPVTVYVDAYNV----ELK 298
+ G V LMV+VP D V A + ++ I +GQ + V+A+ L
Sbjct: 343 VHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLV 402

Query: 299 GVVDSFSPSTGAFFSPISATNATGNFTKIVQRLPLRIKLLENQPDIKLLRPGLSVVVSVD 358
G V + + G ++ + L +I L G++V +
Sbjct: 403 GKVKNINLDA-------IEDQRLGLVFNVIISIEENC-LSTGNKNIP-LSSGMAVTAEIK 453

Query: 359 T 359
T
Sbjct: 454 T 454


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS08365  TCRTETB       49     2e-08    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 49.5 bits (118), Expect = 2e-08
Identities = 59/334 (17%), Positives = 113/334 (33%), Gaps = 18/334 (5%)

Query: 22 NNRISSITLVDIRGEMGISVDSGYWVSSIYASAMIIGMILSTSWAVIFSMRRVLLFAIGL 81
N + +++L DI + S WV++ + IG + + ++R+LLF I +
Sbjct: 29 NEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIII 88

Query: 82 CLFSSVLIPFSPN-IEIFYLLRGLQGLANGLTIPLLMACALRFLGPEIRLWGLACYALTA 140
F SV+ + + + R +QG L+M R++ E R
Sbjct: 89 NCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIV 148

Query: 141 TFFPNLSAALAAFYLDVIGWKMIFFQTIPFCALSAALVYFGIPQDPLNYSRIKTYDWTGA 200
+ A+ I W + ++ V F + +D G
Sbjct: 149 AMGEGVGPAIGGMIAHYIHWSYLLL----IPMITIITVPFLMKLLKKEVRIKGHFDIKGI 204

Query: 201 ILAIIGLASLSTMLLHGNHLDWFHSKLICVLALMSAITLPLFLIHEWRYPTPLIKPQMLE 260
IL +G+ +L ++S ++ +F+ H + P + P + +
Sbjct: 205 ILMSVGIV---FFMLFTTSYSISF-------LIVSVLSFLIFVKHIRKVTDPFVDPGLGK 254

Query: 261 IRNFGYAVI-ALFCFVVIGMSTSTLPLNYLSAVHGYKPTQTMWIGLQIAALQFIYIPIVI 319
F V+ F + S +P + VH + + + + I +
Sbjct: 255 NIPFMIGVLCGGIIFGTVAGFVSMVPY-MMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIG 313

Query: 320 KVLNQAWVDSRYVHGFGLLLVMVGCLGASQLDTT 353
+L YV G+ + V L AS L T
Sbjct: 314 GILVDR-RGPLYVLNIGVTFLSVSFLTASFLLET 346


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS08370  HTHTETR       76     3e-19    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 76.2 bits (187), Expect = 3e-19
Identities = 34/158 (21%), Positives = 59/158 (37%), Gaps = 4/158 (2%)

Query: 12 QKILDAATKFFLIHGFSGTTTDMIQKEAGVSKATMYGCFKNKEAMFAAVIERQCTNMQKQ 71
Q ILD A + F G S T+ I K AGV++ +Y FK+K +F+ + E +N+ +
Sbjct: 14 QHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGEL 73

Query: 72 IM-SVETKAKNLRSALTEIGKTYLCFILSHSGLAFFRVCI---AEAVRFPELSGKFFEVG 127
+ + S L EI L ++ I E V + +
Sbjct: 74 ELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQRNL 133

Query: 128 PQRLANIIAGYLEKSVKQGEIELTSSSEVAAHIFLSLL 165
+ I L+ ++ + + AA I +
Sbjct: 134 CLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYI 171


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS08395  NUCEPIMERASE  80     3e-19    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 80.2 bits (198), Expect = 3e-19
Identities = 55/225 (24%), Positives = 96/225 (42%), Gaps = 39/225 (17%)

Query: 1 MNVLITGGTGFIGKQIAKEILKTGSLTLDGKQAKPIDKIILFDAFVGDD----------- 49
M L+TG GFIG ++K +L+ G +++ D +D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAG------------HQVVGIDNL--NDYYDVSLKQARL 46

Query: 50 -LPQDPKIEVVIGDITDKTTVANI--TEKIDVVWHLA--AVVSSAAEADFDLGMDVNLYG 104
L P + D+ D+ + ++ + + V+ V + E + D NL G
Sbjct: 47 ELLAQPGFQFHKIDLADREGMTDLFASGHFERVFISPHRLAVRYSLE-NPHAYADSNLTG 105

Query: 105 LLNLLEELRKKQTMPRVIFASGCAVFGG--QLPEVVTDDTVVTPKSSYGMQKAVGELLVS 162
LN+LE R + + +++AS +V+G ++P TDD+V P S Y K EL+
Sbjct: 106 FLNILEGCRHNK-IQHLLYASSSSVYGLNRKMP-FSTDDSVDHPVSLYAATKKANELMAH 163

Query: 163 DYSRKGFIDGRVLRLPTIVVRPGKPNKAASTFFSSIIREPLKGET 207
YS + LR T+ G+P+ A F ++ L+G++
Sbjct: 164 TYSHLYGLPATGLRFFTVYGPWGRPDMALFKFTKAM----LEGKS 204


58  A4U85_RS08635  A4U85_RS08695  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS08635  -2      13       0.480720   SDR family NAD(P)-dependent oxidoreductase
A4U85_RS08640  -1      14       0.040779   SDR family NAD(P)-dependent oxidoreductase
A4U85_RS08645  -1      13       -0.259556  alpha/beta fold hydrolase
A4U85_RS08650  -1      10       -0.560632  AraC family transcriptional regulator
A4U85_RS08655  -1      10       -1.959436  putative multidrug efflux protein AdeT1
A4U85_RS08660  -1      11       -2.262315  two-component sensor histidine kinase AdeS
A4U85_RS08665  -2      10       -2.028424  efflux system response regulator transcription
A4U85_RS08670  -2      10       -1.862860  multidrug efflux RND transporter periplasmic
A4U85_RS08675  -2      9        -1.764954  multidrug efflux RND transporter permease
A4U85_RS08680  0       13       -2.854621  excinuclease ABC subunit UvrA
A4U85_RS08685  2       15       -1.795467  hypothetical protein
A4U85_RS08690  2       15       -0.591918  molecular chaperone
A4U85_RS08695  1       10       -0.896660  fimbrial biogenesis outer membrane usher
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS08635  DHBDHDRGNASE  73     3e-17    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 72.8 bits (178), Expect = 3e-17
Identities = 58/206 (28%), Positives = 98/206 (47%), Gaps = 2/206 (0%)

Query: 1 MSKYKLKDKVVVITGSTGGLGLAIAQALQAKGAKLALLDLDLNKVESQAKQLGGKS-IAA 59
M+ ++ K+ ITG+ G+G A+A+ L ++GA +A +D + K+E L ++ A
Sbjct: 1 MNAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAE 60

Query: 60 GWVADVRSLESLETAMANAAQHFGKIDVVIANAGIGSTEALEHMAPETFERTIDINLTGV 119
+ ADVR +++ A + G ID+++ AG+ + ++ E +E T +N TGV
Sbjct: 61 AFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGV 120

Query: 120 FRTFRAAIPYVK-QTQGYLLAVSSMAAFVHSPLNTHYTSSKAGVWALCDSLRLELKYLNI 178
F R+ Y+ + G ++ V S A V Y SSKA L LEL NI
Sbjct: 121 FNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNI 180

Query: 179 GVGSLHPTFFKTPLMDNIQNDPAGKA 204
+ P +T + ++ D G
Sbjct: 181 RCNIVSPGSTETDMQWSLWADENGAE 206


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS08640  DHBDHDRGNASE  78     4e-19    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 78.2 bits (192), Expect = 4e-19
Identities = 58/187 (31%), Positives = 88/187 (47%), Gaps = 6/187 (3%)

Query: 8 KVVLITGAAGGIGAATAREFYALGANLVLTDMQQEAVDKLASEFEASRVLP--LALDITD 65
K+ ITGAA GIG A AR + GA++ D E ++K+ S +A D+ D
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 66 AVATKDVVQKTIKHFGHLDIAFANAGISWRDGASTIASCDEAEFDKIVEVDLLGVWRTVR 125
+ A ++ + + G +DI AG+ R G I S + E++ V+ GV+ R
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGV-LRPGL--IHSLSDEEWEATFSVNSTGVFNASR 125

Query: 126 AALPEV-TRNEGQILITSSVYCFVNGMANAPYAASKAAVEMLGRCLRTEIAYTGATASVV 184
+ + R G I+ S V + A YA+SKAA M +CL E+A ++V
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 185 YPGWTAT 191
PG T T
Sbjct: 186 SPGSTET 192


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
A4U85_RS08665  HTHFIS  95     1e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 95.3 bits (237), Expect = 1e-24
Identities = 31/118 (26%), Positives = 61/118 (51%), Gaps = 1/118 (0%)

Query: 15 ILVVEDDYDIGDIIENYLKREGMSVIRAMNGKQAIELHASQPIDLILLDIKLPELNGWEV 74
ILV +DD I ++ L R G V N A+ DL++ D+ +P+ N +++
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 75 LNKIRQ-KAQTPVIMLTALDQDIDKVMALRIGADDFVVKPFNPNEVVARVQAVLRRTQ 131
L +I++ + PV++++A + + + A GA D++ KPF+ E++ + L +
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
A4U85_RS08670  RTXTOXIND  52     3e-09    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 51.8 bits (124), Expect = 3e-09
Identities = 44/196 (22%), Positives = 81/196 (41%), Gaps = 19/196 (9%)

Query: 55 VHAFRTAEIRPQVGGIIEKVLFKQGSEVRAGQALYKINSETFEADVNSNRASLNKAEAEV 114
H+ R+ EI+P I+++++ K+G VR G L K+ + EAD ++SL +A E
Sbjct: 91 THSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQ 150

Query: 115 ARLKVQLERYEQ----------LLPSNAISKQEVSNAQAQYRQALADVAQMKAL--LARQ 162
R ++ E +S++EV + ++ + K L
Sbjct: 151 TRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLD 210

Query: 163 NLNLQYATVRAPISGRIGQSFVTEG------ALVGQGDTNTMATIQQIDKVYVDVKQSVS 216
+ TV A I+ S V + +L+ + A ++Q +K YV+ +
Sbjct: 211 KKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENK-YVEAVNELR 269

Query: 217 EYERLQAALQSGELSA 232
Y+ ++S LSA
Sbjct: 270 VYKSQLEQIESEILSA 285



Score = 50.2 bits (120), Expect = 8e-09
Identities = 39/216 (18%), Positives = 74/216 (34%), Gaps = 49/216 (22%)

Query: 97 EADVNSNRASLNKAEAEVARLKVQLERYEQLLPSNAISKQEVSNAQAQYRQALADVAQMK 156
++ ++ L + E+E+ K + + QL K E+ + RQ ++ +
Sbjct: 265 VNELRVYKSQLEQIESEILSAKEEYQLVTQLF------KNEIL---DKLRQTTDNIGLLT 315

Query: 157 ALLARQNLNLQYATVRAPISGRIGQ-SFVTEGALVGQGDT-------------NTMATIQ 202
LA+ Q + +RAP+S ++ Q TEG +V +T + +
Sbjct: 316 LELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNK 375

Query: 203 QIDKVY----VDVKQSV---SEYERLQAALQSGELSANSDKTVRITNSHGQPYNVTAKML 255
I + +K + Y L +++ L A D+ + G +NV +
Sbjct: 376 DIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQRL------GLVFNVIISI- 428

Query: 256 FEDINVDPETGDVTFRIEVNNTERKLLPGMYVRVNI 291
+ N L GM V I
Sbjct: 429 ------------EENCLSTGNKNIPLSSGMAVTAEI 452


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS08675  ACRIFLAVINRP  1059   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1059 bits (2740), Expect = 0.0
Identities = 501/1028 (48%), Positives = 703/1028 (68%), Gaps = 9/1028 (0%)

Query: 2 MSQFFIRRPVFAWVIAIFIIIFGLLSIPKLPIARFPSVAPPQVNISATYPGATAKTINDS 61
M+ FFIRRP+FAWV+AI +++ G L+I +LP+A++P++APP V++SA YPGA A+T+ D+
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 62 VVTLIERELSGVKNLLYYSATTDTSGTAEITATFKPGTDVEMAQVDVQNKIKAVEARLPQ 121
V +IE+ ++G+ NL+Y S+T+D++G+ IT TF+ GTD ++AQV VQNK++ LPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 122 VVRQQGLQVEASSSGFLMLVGINSPNNQYSEVDLSDYLVRNVVEELKRVEGVGKVQSFGA 181
V+QQG+ VE SSS +LM+ G S N ++ D+SDY+ NV + L R+ GVG VQ FGA
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 182 EKAMRIWVDPNKLVSYGLSISDVNNAIRENNVEIAPGRLGDLPAEKGQLITIPLSAQGQL 241
+ AMRIW+D + L Y L+ DV N ++ N +IA G+LG PA GQ + + AQ +
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 242 SSLEQFKNISLKSKTNGSVIKLSDVANVEIGSQAYNFAILENGKPATAAAIQLSPGANAV 301
+ E+F ++L+ ++GSV++L DVA VE+G + YN NGKPA I+L+ GANA+
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 302 KTAEGVRAKIEELKLNLPEGMEFSIPYDTAPFVKISIEKVIHTLLEAMVLVFIVMYLFLH 361
TA+ ++AK+ EL+ P+GM+ PYDT PFV++SI +V+ TL EA++LVF+VMYLFL
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 362 NVRYTLIPAIVAPIALLGTFTVMLLAGFSINVLTMFGMVLAIGIIVDDAIVVVENVERIM 421
N+R TLIP I P+ LLGTF ++ G+SIN LTMFGMVLAIG++VDDAIVVVENVER+M
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 422 ATEGLSPKDATSKAMKEITSPIIGITLVLAAVFLPMAFASGSVGVIYKQFTLTMSVSILF 481
+ L PK+AT K+M +I ++GI +VL+AVF+PMAF GS G IY+QF++T+ ++
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 482 SALLALILTPALCATILKPIDGHHQ--KKGFFAWFDRSFDKVTKKYELMLLKIIKHTVPM 539
S L+ALILTPALCAT+LKP+ H K GFF WF+ +FD Y + KI+ T
Sbjct: 481 SVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY 540

Query: 540 MVIFLVITGITFAGMKYWPTAFMPEEDQGWFMTSFQLPSDATAERTRNVVNQFENNLKDN 599
++I+ +I P++F+PEEDQG F+T QLP+ AT ERT+ V++Q + N
Sbjct: 541 LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN 600

Query: 600 --PDVKSNTTILGWGFSGAGQNVAVAFTTLKDFKERTS---SASKMTSDVNTSMANSTEG 654
+V+S T+ G+ FSG QN +AF +LK ++ER SA + + +G
Sbjct: 601 EKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDG 660

Query: 655 ETMAVLPPAIDELGTFSGFSLRLQDRANLGMPALLAAQDELMAMAAKN-KKFYMVWNEGL 713
+ PAI ELGT +GF L D+A LG AL A+++L+ MAA++ V GL
Sbjct: 661 FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGL 720

Query: 714 PQGDNISLKIDREKLSALGVKFSDVSDIISTSMGSMYINDFPNQGRMQQVIVQVEAKSRM 773
L++D+EK ALGV SD++ IST++G Y+NDF ++GR++++ VQ +AK RM
Sbjct: 721 EDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRM 780

Query: 774 QLKDILNLKVMGSSGQLVSLSEVVTPQWNKAPQQYNRYNGRPSLSIAGIPNFDTSSGEAM 833
+D+ L V ++G++V S T W + RYNG PS+ I G TSSG+AM
Sbjct: 781 LPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAM 840

Query: 834 REMEQLIAKLPKGIGYEWTGISLQEKQSESQMAFLLGLSMLVVFLVLAALYESWAIPLSV 893
ME L +KLP GIGY+WTG+S QE+ S +Q L+ +S +VVFL LAALYESW+IP+SV
Sbjct: 841 ALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV 900

Query: 894 MLVVPLGIFGAIIAIMSRGLMNDVFFKIGLITIIGLSAKNAILIVEFAK-MLKEEGMSLI 952
MLVVPLGI G ++A NDV+F +GL+T IGLSAKNAILIVEFAK ++++EG ++
Sbjct: 901 MLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVV 960

Query: 953 EATVAAAKLRLRPILMTSLAFTCGVIPLVIASGASSETQHALGTGVFGGMISATILAIFF 1012
EAT+ A ++RLRPILMTSLAF GV+PL I++GA S Q+A+G GV GGM+SAT+LAIFF
Sbjct: 961 EATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFF 1020

Query: 1013 VPVFFIFI 1020
VPVFF+ I
Sbjct: 1021 VPVFFVVI 1028



Score = 88.7 bits (220), Expect = 7e-20
Identities = 52/323 (16%), Positives = 128/323 (39%), Gaps = 13/323 (4%)

Query: 723 IDREKLSALGVKFSDVSDIISTS---MGSMYINDFPNQGRMQQVIVQVEAKSRMQ-LKDI 778
+D + L+ + DV + + + + + P QQ+ + A++R + ++
Sbjct: 188 LDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPG-QQLNASIIAQTRFKNPEEF 246

Query: 779 LNLKVMGS-SGQLVSLSEVVTPQWNKAPQQYN-RYNGRPSLSIAGIPNFDTSSGEA---- 832
+ + + G +V L +V + R NG+P+ + ++ +
Sbjct: 247 GKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALDTAKAI 306

Query: 833 MREMEQLIAKLPKGIGYEWT-GISLQEKQSESQMAFLLGLSMLVVFLVLAALYESWAIPL 891
++ +L P+G+ + + + S ++ L ++++VFLV+ ++ L
Sbjct: 307 KAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQNMRATL 366

Query: 892 SVMLVVPLGIFGAIIAIMSRGLMNDVFFKIGLITIIGLSAKNAILIVE-FAKMLKEEGMS 950
+ VP+ + G + + G + G++ IGL +AI++VE +++ E+ +
Sbjct: 367 IPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMMEDKLP 426

Query: 951 LIEATVAAAKLRLRPILMTSLAFTCGVIPLVIASGASSETQHALGTGVFGGMISATILAI 1010
EAT + ++ ++ + IP+ G++ + M + ++A+
Sbjct: 427 PKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSVLVAL 486

Query: 1011 FFVPVFFIFILGAVEKLFSSKKK 1033
P +L V K
Sbjct: 487 ILTPALCATLLKPVSAEHHENKG 509


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
A4U85_RS08700  PF00577  760    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 760 bits (1965), Expect = 0.0
Identities = 257/852 (30%), Positives = 407/852 (47%), Gaps = 47/852 (5%)

Query: 37 EAAASAPVEAEFDSAFLIGDAQ-KVDISRFKYGNPVLPGEYNVDVYVNGQWFGKRRMIFK 95
A + E F+ FL D Q D+SRF+ G + PG Y VD+Y+N + R + F
Sbjct: 38 AQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNNGYMATRDVTFN 97

Query: 96 ALDPNQNAVTCFTGMNLLEYGVKQEILTKHAPLQKENNSCYKIEEWVENAFYEFDTSRLR 155
D Q V C T L G+ ++ L +++C + + +A + D + R
Sbjct: 98 TGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLA--DDACVPLTSMIHDATAQLDVGQQR 155

Query: 156 VDISIPQVALQKNAQGYVDPSVWDRGINAGFLSYSGSAYKTFNQSGDRSETTNAFMGVTA 215
++++IPQ + A+GY+ P +WD GINAG L+Y+ S N+ G S A++ + +
Sbjct: 156 LNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGNS--HYAYLNLQS 213

Query: 216 GLNLAGWQLRHNGQWQWQDTPAENQSKSDYQETSTYLQRAFPKYRGVLTLGDSFTNGEVF 275
GLN+ W+LR N W + + + + SK+ +Q +T+L+R R LTLGD +T G++F
Sbjct: 214 GLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLGDGYTQGDIF 273

Query: 276 DSYGYRGIDFSSDDRMLPNSMLGYAPRIRGNAKTNAKVEVRQQGQLIYQTTVAPGNFEIN 335
D +RG +SDD MLP+S G+AP I G A+ A+V ++Q G IY +TV PG F IN
Sbjct: 274 DGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNSTVPPGPFTIN 333

Query: 336 DLYPTGFGGEIEVSVIEANGEIQKFSVPYASVVQMLRPGMNRYSLTVGQFRDQDIDLD-P 394
D+Y G G+++V++ EA+G Q F+VPY+SV + R G RYS+T G++R + + P
Sbjct: 334 DIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYRSGNAQQEKP 393

Query: 395 WIIQGKYQQGINNYLTGYTGIQASENYAAILLGAAVAT-PIGAIAFDVTHSEAEFEKQAS 453
Q G+ T Y G Q ++ Y A G +GA++ D+T + + +
Sbjct: 394 RFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQANSTLPDDSQ 453

Query: 454 QSGQSFRLSYSKLITPTNTNLTLAAYRYSTENFYKLRDALLIRDLEEKGVNTYAAG---- 509
GQS R Y+K + + TN+ L YRYST ++ D R
Sbjct: 454 HDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIETQDGVIQVKP 513

Query: 510 ----------RQRSEFQITLNQGLPEGWGNFYVVGSWVDYWNRSESTKQYQIGYSNNYHG 559
+R + Q+T+ Q L Y+ GS YW S +Q+Q G + +
Sbjct: 514 KFTDYYNLAYNKRGKLQLTVTQQLGR-TSTLYLSGSHQTYWGTSNVDEQFQAGLNTAFED 572

Query: 560 LTYGLSAINRKVEYGSNDASHDTEYLMTLSFPINFKKN----------SVNVNVTASEDS 609
+ + LS K + D + ++ P + S + +++ +
Sbjct: 573 INWTLSYSLTK---NAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLNG 629

Query: 610 RT---VGASGMVG--DRFSYGASVSHQD----YANPTFNANGRYRTNYATVGGSYSIADS 660
R G G + + SY + + T A YR Y YS +D
Sbjct: 630 RMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIGYSHSDD 689

Query: 661 YQQAMVSLSGSVVAHSDGILFGPEQGQTMVLVHAPDAAGAKVNNTVGLSVNKAGYAVVPY 720
+Q +SG V+AH++G+ G T+VLV AP A AKV N G+ + GYAV+PY
Sbjct: 690 IKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDWRGYAVLPY 749

Query: 721 VTPYRLNDITLDPQEMSSEVELEETSQRIAPFAGAIAKVDFATKTGYAVYINSKTADGNS 780
T YR N + LD ++ V+L+ + P GAI + +F + G + + T +
Sbjct: 750 ATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIKLLMTL-THNNKP 808

Query: 781 LPFAAQVFNQKDEAVGIVAQGSMIYLRTPLAQDRLYVKWGDESNERCSVEYNISNELRNK 840
LPF A V ++ ++ GIVA +YL ++ VKWG+E N C Y + E ++
Sbjct: 809 LPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCVANYQLPPE--SQ 866

Query: 841 QQSIVMTEAVCK 852
QQ + A C+
Sbjct: 867 QQLLTQLSAECR 878
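
The Score, Expect, and Identities lines in the alignment blocks above follow a fixed layout, so they can be collected into records for filtering or sorting. A minimal Python sketch, assuming the text layout shown on this page; the helper name `parse_hsp_stats` and the regular expressions are illustrative and are not part of PredictBias:

```python
import re

# Patterns for the two summary lines that precede each alignment block.
# They assume the layout shown on this page, e.g.
#   Score = 72.8 bits (178), Expect = 3e-17
#   Identities = 58/206 (28%), Positives = 98/206 (47%), Gaps = 2/206 (0%)
SCORE_RE = re.compile(r"Score = ([\d.]+) bits \((\d+)\), Expect = (\S+)")
IDENT_RE = re.compile(r"Identities = (\d+)/(\d+)")


def parse_hsp_stats(text):
    """Return one dict per Score line found in `text` (hypothetical helper)."""
    records = []
    for line in text.splitlines():
        m = SCORE_RE.search(line)
        if m:
            expect = m.group(3)
            # Very small E-values are abbreviated on this page, e.g. "e-138".
            if expect.startswith("e"):
                expect = "1" + expect
            records.append({
                "bits": float(m.group(1)),
                "raw_score": int(m.group(2)),
                "expect": float(expect),
            })
            continue
        m = IDENT_RE.search(line)
        if m and records:
            # Attach the identity fraction to the most recent Score record.
            matches, length = int(m.group(1)), int(m.group(2))
            records[-1]["identity"] = matches / length
    return records
```

Calling `parse_hsp_stats` on the pasted text of one hit returns a list with one entry per Score line, so hits with two alignment segments (such as the RTXTOXIND hit above) yield two records.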


59  A4U85_RS08865  A4U85_RS08900  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS08865  -3      11       0.593365   SDR family oxidoreductase
A4U85_RS08870  -3      11       -0.473721  dienelactone hydrolase family protein
A4U85_RS08875  -1      10       -1.934103  MFS transporter
A4U85_RS08880  -1      13       -2.403328  AraC family transcriptional regulator
A4U85_RS08885   1      16       -2.810268  3-hydroxybutyrate dehydrogenase
A4U85_RS08890   1      17       -3.848476  GntP family permease
A4U85_RS08895   5      22       -5.794414  hypothetical protein
A4U85_RS08900   3      19       -5.145784  TetR/AcrR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS08865  DHBDHDRGNASE  109    5e-31    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.
Length = 261

Score = 109 bits (274), Expect = 5e-31
Identities = 76/253 (30%), Positives = 113/253 (44%), Gaps = 16/253 (6%)

Query: 5 LRGKVAVVSGGATLIGKAVVQALVSAGAHVAILDIDAKGKAIAESFNHGVMFIQ----TD 60
+ GK+A ++G A IG+AV + L S GAH+A +D + + S D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 61 LTSDAAIQQAVADIHQHLGEVSYLVNLACTYLDDGFKS-SRQDWIQALDLNLVSTIELSR 119
+ AAI + A I + +G + LVN+A S S ++W +N SR
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 120 ALYNDLK-KQQGSIVNFTSISAKVAQTGRWLYPVSKAAIRQLTQSMAMDFAADGIRVNSV 178
++ + ++ GSIV S A V +T Y SKAA T+ + ++ A IR N V
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 179 SPGWT-----WSRVIAEVSGNNREKADSVAADYHL---LGRLGHPEEVANVVLFLLSSAA 230
SPG T WS E K + L +L P ++A+ VLFL+S A
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGS--LETFKTGIPLKKLAKPSDIADAVLFLVSGQA 243

Query: 231 SFVTGADYAVDGG 243
+T + VDGG
Sbjct: 244 GHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
A4U85_RS08875  TCRTETA  53     8e-10    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 53.3 bits (128), Expect = 8e-10
Identities = 33/151 (21%), Positives = 62/151 (41%), Gaps = 8/151 (5%)

Query: 24 ILGFFVFFCDGLDTGIIGFVAPALLDDWGITKPQLAP---VLSAALVGMSIGAIISGPLA 80
I+ D + G+I V P LL D + A +L+ + A + G L+
Sbjct: 8 IVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALS 67

Query: 81 DKFGRKGVIVFTSLLFSIFTILCGFANSTQDLMIYRFITGVGLGAAMPNISTIVSEYMPV 140
D+FGR+ V++ + ++ + A L I R + G+ GA +++
Sbjct: 68 DRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGIT-GATGAVAGAYIADITDG 126

Query: 141 KRKA----FLTGLAGCGFMLGISCGGVLSAY 167
+A F++ G G + G GG++ +
Sbjct: 127 DERARHFGFMSACFGFGMVAGPVLGGLMGGF 157



Score = 30.6 bits (69), Expect = 0.011
Identities = 28/125 (22%), Positives = 53/125 (42%), Gaps = 8/125 (6%)

Query: 73 AIISGPLADKFGRKGVIVFTSLLFSIFTILCGFANSTQDLMIYRFITGVGLGAAMPNIST 132
A+I+GP+A + G + ++ + IL FA + G G MP +
Sbjct: 264 AMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASG-GIGMPALQA 322

Query: 133 IVSEYMPVKRKAFLTG----LAGCGFMLGISCGGVLSAYLLESY-GWAKVIIIGGSIPLI 187
++S + +R+ L G L ++G + A + ++ GW I G ++ L+
Sbjct: 323 MLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAASITTWNGW--AWIAGAALYLL 380

Query: 188 LVVAL 192
+ AL
Sbjct: 381 CLPAL 385


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS08885  DHBDHDRGNASE  125    3e-37    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.
Length = 261

Score = 125 bits (316), Expect = 3e-37
Identities = 81/264 (30%), Positives = 121/264 (45%), Gaps = 16/264 (6%)

Query: 3 KLLDGKVAFITGSASGIGLEIAKKFAQEGAKVVISDMNAEKCQETASSLKEQGFDTLSAP 62
K ++GK+AFITG+A GIG +A+ A +GA + D N EK ++ SSLK + + P
Sbjct: 4 KGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 63 CDVTDEVAYKQAIELTQKTFGTVDILINNAGFQHVAPIEEFPTAVFQKLVQVMLTGAFIG 122
DV D A + ++ G +DIL+N AG I ++ V TG F
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 123 IKHVFPIMKAQKYGRIINMASINGLIGFAGKAGYNSAKHGVIGLTKVAALECARDGITVN 182
+ V M ++ G I+ + S + A Y S+K + TK LE A I N
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCN 183

Query: 183 ALCPGYVDTPLVRGQIADLAKTRNVSLDSALEDVILAM-------VPQKRLLSVEEIADY 235
+ PG +T + AD ++ E VI +P K+L +IAD
Sbjct: 184 IVSPGSTETDMQWSLWAD---------ENGAEQVIKGSLETFKTGIPLKKLAKPSDIADA 234

Query: 236 AIFLASSKAGGVTSQAVVMDGGYT 259
+FL S +AG +T + +DGG T
Sbjct: 235 VLFLVSGQAGHITMHNLCVDGGAT 258


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
A4U85_RS08900  HTHTETR  45     5e-08    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 44.6 bits (105), Expect = 5e-08
Identities = 25/153 (16%), Positives = 50/153 (32%), Gaps = 5/153 (3%)

Query: 5 STKTNQIDNKNRLIIQTVGDLFLEYGYSRVSINLIISKIGGSKRDLYAQFGDKEGLFRSV 64
TK + + I+ LF + G S S+ I G ++ +Y F DK LF +
Sbjct: 4 KTKQEAQETRQH-ILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 65 IADVCQQVLDPLKALPVE--GGSIEQALTSFGRIFLSVLLSSRVIALQKLVLSEATRYPE 122
+ + + G + + S + R L +++ + E
Sbjct: 63 WELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGE 122

Query: 123 FAKT--FVQLGPVSAYNLAAELLMKRAEVGEIR 153
A + + +Y+ + L E +
Sbjct: 123 MAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLP 155


60  A4U85_RS09185  A4U85_RS09250  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS09185  -1      23       -2.482190  YARHG domain-containing protein
A4U85_RS09190  -1      22       -2.364725  HIT family protein
A4U85_RS09195  -1      18       -2.612914  porin
A4U85_RS09200  -1      17       -2.431349  DUF937 domain-containing protein
A4U85_RS09205   0      13       -2.212805  hypothetical protein
A4U85_RS09210  -1      13       -2.262760  DUF2797 domain-containing protein
A4U85_RS09220   0      15       -2.286648  ABC transporter permease
A4U85_RS09225   0      14       -2.042229  ABC transporter permease
A4U85_RS09230  -1      14       -2.222350  ABC transporter ATP-binding protein
A4U85_RS09235   0      12       -2.882301  HlyD family efflux transporter periplasmic
A4U85_RS09240   0      12       -3.346478  CerR family C-terminal domain-containing
A4U85_RS09245   0      11       -2.859014  lipid-A-disaccharide synthase
A4U85_RS09250   1      11       -2.541544  TonB-dependent siderophore receptor
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
A4U85_RS09190  PF05704  25     0.038    Capsular polysaccharide synthesis protein
		>PF05704#Capsular polysaccharide synthesis protein

Length = 307

Score = 25.2 bits (55), Expect = 0.038
Identities = 8/57 (14%), Positives = 16/57 (28%), Gaps = 4/57 (7%)

Query: 32 LQDDYNLIYASKGFCFKDQDAKEKYGNENCHTTKP----KFSDKEQQRLDAIKERQK 84
+ D+ + A K N N H + + + + + QK
Sbjct: 222 IFHDFVSVMAVSKEYSKYWKEIPYVNNVNPHMLQYLGNLPYDNSMFNYIKSTSPVQK 278


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS09200  ECOLNEIPORIN  65     8e-14    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 64.8 bits (158), Expect = 8e-14
Identities = 78/375 (20%), Positives = 125/375 (33%), Gaps = 51/375 (13%)

Query: 1 MKKLLLAAAVATLSVNAVQAAPTLYGKLNVSINQVDNKNFDG-----KSDVTEVNSNSSR 55
MKK L+A +A L V A A TLYG + + + +G T + S+
Sbjct: 1 MKKSLIALTLAALPV-AAMADVTLYGTIKAGVETSRSVAHNGAQAASVETGTGIVDLGSK 59

Query: 56 IGVKGEEKLTDKLSAVYLAEWAISTDGSGSDTDLSARNRFIGLKTEGVGTLKVGKYDSYF 115
IG KG+E L + L A++ E S +G+D+ R FIGLK G G L+VG+ +S
Sbjct: 60 IGFKGQEDLGNGLKAIWQVEQKASI--AGTDSGWGNRQSFIGLKG-GFGKLRVGRLNSVL 116

Query: 116 KTSAGSNQDIFNDDTRLDITNIMYGENRLDNVVGFELDPKLLAGLTFNIMAQTGESTSDS 175
K G + L + I E RL + D AGL S S
Sbjct: 117 K-DTGDINPWDSKSDYLGVNKIAEPEARL---ISVRYDSPEFAGL------------SGS 160

Query: 176 KKGETGKDSKNDSFDSVSTSLGYENKDLGLAIAAAGDFGIKGKYAAYGLKDVYTDAYRVT 235
+ ++ + +S Y+N G + G + + + Y +R+
Sbjct: 161 VQYALNDNAGRHNSESYHAGFNYKNG--GFFVQYGGAYKRHHQVQENVNIEKY-QIHRLV 217

Query: 236 GSYDIAKSGFVVGALWQHAEPTDDLTAYGQTYKSDGSIDKAGKAYRGLEEEAYAVTAAYK 295
YD AL+ + Q + + V A
Sbjct: 218 SGYD-------NDALY--------ASVAVQQQDAKLVEENYSHN------SQTEVAATLA 256

Query: 296 IPNTKLKVKAEYASAETQVSGQADRK--IDLYGLGLDYQINKQARFYGIVAQQKRDWLND 353
+ + YA + D +G +Y +K+ +
Sbjct: 257 YRFGNVTPRVSYAHGFKGSFDATNYNNDYDQVVVGAEYDFSKRTSALVSAGWLQEGKGES 316

Query: 354 DDKQTVVGTGIEYNF 368
T G G+ + F
Sbjct: 317 KFVSTAGGVGLRHKF 331


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS09205  DNABINDINGHU  27     0.013    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 27.3 bits (61), Expect = 0.013
Identities = 10/36 (27%), Positives = 17/36 (47%), Gaps = 1/36 (2%)

Query: 139 IEQVAQQAQAPKEQVYGAIASVLPQVIDSLTPQGES 174
I +VA+ + K+ A+ +V V L +GE
Sbjct: 8 IAKVAEATELTKKDSAAAVDAVFSAVSSYLA-KGEK 42


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS09220  ABC2TRNSPORT  54     2e-10    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.
Length = 262

Score = 54.2 bits (130), Expect = 2e-10
Identities = 36/171 (21%), Positives = 72/171 (42%), Gaps = 2/171 (1%)

Query: 201 AREREQGTFDQLLVTPYTPLQIMIGKALPPIFVGLMQSTIILLIILFWFKIPMNGSIGLL 260
R Q T++ +L T I++G+ + I ++ + L
Sbjct: 92 GRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGYTQWLSLLYAL 151

Query: 261 YFGLLSFNVAVVGVGLSISALSLNMQQAMLFTFLLIMPLMLLSGLLTPVENMPKALQVAT 320
L+ +A +G+ ++AL+ + + + L+I P++ LSG + PV+ +P Q A
Sbjct: 152 PVIALT-GLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQTAA 210

Query: 321 YANPLRFGINLVQRVYLEGASFAQVKLNFLPMIVLGIVTLPLAAWLFRNRL 371
PL I+L++ + L V + + + ++ L+ L R RL
Sbjct: 211 RFLPLSHSIDLIRPIMLGHPV-VDVCQHVGALCIYIVIPFFLSTALLRRRL 260


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS09225  ABC2TRNSPORT  38     2e-05    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.
Length = 262

Score = 38.4 bits (89), Expect = 2e-05
Identities = 36/142 (25%), Positives = 62/142 (43%), Gaps = 6/142 (4%)

Query: 202 ARERERGTLEALFVTPVRPFEIVLAKLIPYVVVGMIDIVICIVAAHFIFEVPMRGSLFSI 261
R + T EA+ T +R +IVL ++ + V A + L+++
Sbjct: 92 GRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGYTQWLSLLYAL 151

Query: 262 LSASFLYLIVSLLLGLTISGFAQSQ--FQASQIALLASFMPALMLSGFVFDTRNLPLVVQ 319
+ L + L G+ ++ A S F Q ++ P L LSG VF LP+V Q
Sbjct: 152 PVIALTGLAFASL-GMVVTALAPSYDYFIFYQTLVIT---PILFLSGAVFPVDQLPIVFQ 207

Query: 320 IISQLLPATHFMVLIKTLFMGG 341
++ LP +H + LI+ + +G
Sbjct: 208 TAARFLPLSHSIDLIRPIMLGH 229


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
A4U85_RS09230  PF05272  32     0.009    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.6 bits (71), Expect = 0.009
Identities = 10/21 (47%), Positives = 13/21 (61%)

Query: 44 VLVGPDGAGKTTLLRLIAGLY 64
VL G G GK+TL+ + GL
Sbjct: 600 VLEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
A4U85_RS09235  RTXTOXIND  54     4e-10    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 54.1 bits (130), Expect = 4e-10
Identities = 27/173 (15%), Positives = 57/173 (32%), Gaps = 20/173 (11%)

Query: 54 SGRIQKLLVQEGDKVQAGQVLATLNTNALQIQAKQAQAQLKAQQEAIIKQEIGARPEEIT 113
+ +++++V+EG+ V+ G VL L + + Q+ L
Sbjct: 104 NSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLL------------------- 144

Query: 114 QAKAQLASAQAELDKTNKNLQRLQILVSSTDGRAISQQELDYAKSNQHSAEAAVRERQAN 173
QA+ + Q N L + +S++E+ S + + ++
Sbjct: 145 QARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQ 204

Query: 174 LELLIKGARKEDREATRAQYEVTKANLDLIKYNLTQAELRSPVNAVVRARLQE 226
EL + R E A+ + + K L A+ + + E
Sbjct: 205 KELNLDKKRAERLTV-LARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLE 256



Score = 38.7 bits (90), Expect = 3e-05
Identities = 21/104 (20%), Positives = 40/104 (38%), Gaps = 5/104 (4%)

Query: 49 LAFEQSGRIQKLLVQEGDKVQAGQVLATLNTNALQIQAKQAQAQLKAQQ-EAIIKQEIGA 107
L +Q+ +L QE V+A L + QI+++ A+ + Q + K EI
Sbjct: 243 LLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEIL- 301

Query: 108 RPEEITQAKAQLASAQAELDKTNKNLQRLQILVSSTDGRAISQQ 151
+++ Q + EL K + Q I + + +
Sbjct: 302 --DKLRQTTDNIGLLTLELAKNEERQQASVI-RAPVSVKVQQLK 342


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
A4U85_RS09240  HTHTETR  63     1e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 63.5 bits (154), Expect = 1e-14
Identities = 29/147 (19%), Positives = 53/147 (36%), Gaps = 1/147 (0%)

Query: 1 MSRSRRSDGDLTKTKIIEAAGPLIAQYGFAKTANKTIAKVANVDLAAINYHFDGRDGLYQ 60
M+R + + T+ I++ A L +Q G + T+ IAK A V AI +HF + L+
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 AVLMEAHAHYLDEQYLLELVESTYSPEEKLSLLLETLLHKLTEKDVWHGKVFIRELFSPS 120
+ E + E L + P L +L +L ++ + I
Sbjct: 61 E-IWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEF 119

Query: 121 EHLLSFIELTGMRKFFLIRKLISQVAN 147
++ ++ I Q
Sbjct: 120 VGEMAVVQQAQRNLCLESYDRIEQTLK 146


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
A4U85_RS09250  OMPADOMAIN  37     2e-04    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 37.2 bits (86), Expect = 2e-04
Identities = 30/161 (18%), Positives = 49/161 (30%), Gaps = 17/161 (10%)

Query: 556 DGNPLNRKVPTSDLESQGVEVGLSGQITDNVNLSLGYAQFSIKDTKNGGEARTYNPNQTL 615
D +N PT + G Q+ V +GY K E Y Q +
Sbjct: 41 DTGFINNNGPTHE-NQLGAGAFGGYQVNPYVGFEMGYDWLGRMPYKGSVENGAYK-AQGV 98

Query: 616 NLLTTYTPPVLPKL----KVGAGLQWQDGIKLYDSNVNGTIKQDAYALV-NLMASYEVND 670
L P+ L ++G + D SNV G + V Y +
Sbjct: 99 QLTAKLGYPITDDLDIYTRLGGMVWRAD----TKSNVYGKNHDTGVSPVFAGGVEYAITP 154

Query: 671 HITLQANGNNIFDKKYLNSFPDGQAFYGAPANYTVAVKFKY 711
I + + ++ N+ D P N +++ Y
Sbjct: 155 EIATRL------EYQWTNNIGDAHTIGTRPDNGMLSLGVSY 189


61  A4U85_RS09350  A4U85_RS09370  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS09350  -1      14       -2.593869  IucA/IucC family siderophore biosynthesis
A4U85_RS09355  -1      14       -2.565639  siderophore achromobactin biosynthesis protein
A4U85_RS09360  -3      14       -1.957425  DHA2 family efflux MFS transporter permease
A4U85_RS09365  -2      14       -2.218891  SidA/IucD/PvdA family monooxygenase
A4U85_RS09370  -2      12       -1.285985  siderophore biosynthesis protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
A4U85_RS09350  PF04183  186    2e-53    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 186 bits (473), Expect = 2e-53
Identities = 106/511 (20%), Positives = 191/511 (37%), Gaps = 35/511 (6%)

Query: 130 SADKVQAFYEQLQKCL-KQYHLLQQHR-VNAHDLLNQSLAHRFRILEQYAGYRDRPYHPL 187
S V + L L LL+ R ++A DL+N + +L P
Sbjct: 88 SDATVAEHMQDLYATLLGDLQLLKARRGLSASDLINLNADRLQCLLS------GHPKFVF 141

Query: 188 AKLKEGLSQQEYMQYCPEFAQELSIHWVAVHKDKMMFGDGVENIFKQQPSEIFIPRAERY 247
K + G ++ +Y PE+A +HW+AV ++ M++ E Q + P+ E
Sbjct: 142 NKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRCDNEMDIHQLLTAAMDPQ-EFA 200

Query: 248 QLKQEMFQRGLNETHIAMPIHPWQFEHLFPKFYADDIADGICHPLNFISKGMYASASMRS 307
+ Q + GL+ + +P+HPWQ++ + D A+G L A S+R+
Sbjct: 201 RFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADFAEGRMVSLGEFGDQWLAQQSLRT 260

Query: 308 LLSK-NVLEESLKLPIGIKALGSLRFLPIVKMINGEKNQKLLQQAKAKDAVLKQKLWLCE 366
L + +KLP+ I R +P + G + LQQ A DA L Q +
Sbjct: 261 LTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASRWLQQVFATDATLVQSGAVIL 320

Query: 367 ETQWWSYLPEKQNDRTADNEWLFVEKPTHLAAQRRHIPAELLQEPYQLIPMASLGHTIMG 426
Y+ + A + + E L R P L+ + MA+L
Sbjct: 321 GEPAAGYVSHEGYAALARAPYRYQE---MLGVIWRENPCRWLKPDESPVLMATLMECDEN 377

Query: 427 EPAIFDYILQLQHKDINSKQILIEFEKLCTCFFDVNLRLF-RLGLMGEIHGQNICLVLKN 485
+ + ++++ +L L R G+ HGQNI L +K
Sbjct: 378 NQPLAGAY--IDRSGLDAETW---LTQLFRVVVVPLYHLLCRYGVALIAHGQNITLAMKE 432

Query: 486 GEFDGLMFRD-HDSLRIYLPWVEQSGLKDPNYLSPHDFRNTLYHESVEALLFYIQTLGIQ 544
G ++ +D +R+ + + L P + R+ S + L+ +QT G
Sbjct: 433 GVPQRVLLKDFQGDMRLVKEE-----FPEMDSL-PQEVRDVTSRLSADYLIHDLQT-GHF 485

Query: 545 VNLGCIVDNLASHYQIEVKDLWSVLAHALQQVIQNLNFQ-PEILTQLQHLLFEVPEWPYK 603
V + + L + + + +LA V+ + + P++ + P+
Sbjct: 486 VTVLRFISPLMVRLGVPERRFYQLLA----AVLSDYMKKHPQMSERFALFSLFRPQIIRV 541

Query: 604 QLLRPLL---EQDTRIGSMPSGIGKTRNPLW 631
L L + D +P+ + +NPLW
Sbjct: 542 VLNPVKLTWPDLDGGSRMLPNYLEDLQNPLW 572


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
A4U85_RS09355  PF04183  409    e-138    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 409 bits (1053), Expect = e-138
Identities = 148/588 (25%), Positives = 257/588 (43%), Gaps = 42/588 (7%)

Query: 32 VEQRVIKQLLQALIFEDIIHSEYDGKNFIIEVQNSQGQTIRYVAAGQRQYSYKLVRLVRN 91
V +R++ ++L L +E + H+E G + + Q+ + R +
Sbjct: 9 VNRRLVAKMLSELEYEQVFHAESQGDDRYC------------INLPGAQWRFIAERGIWG 56

Query: 92 QDVFRQDENGHYQIATLNLVIDEILRTITDAAKVED-----FIFELKRTFIHDLQSQAC- 145
+ + A ++ +L + + D + +L T + DLQ
Sbjct: 57 W---LWIDAQTLRCADEPVLAQTLLMQLKQVLSMSDATVAEHMQDLYATLLGDLQLLKAR 113

Query: 146 FDHYALPAIQYPYDILESYLMDGHPYHPCYKSRVGFSLQDNVRYGVEFAQPIALVWLAVH 205
A I D L+ L+ GHP K R G+ + RY E+A L WLAV
Sbjct: 114 RGLSASDLINLNADRLQC-LLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVK 172

Query: 206 QDIVAKKHSEDIEPDLFFKEQLNSQDQELFLQHLSDRDLKAEEYIWIPVHPWQWENHLIS 265
++ + + +++ ++ Q+ F Q + L ++ +PVHPWQW+ + +
Sbjct: 173 REHMIWRCDNEMDIHQLLTAAMDPQEFARFSQVWQENGL-DHNWLPLPVHPWQWQQKIAT 231

Query: 266 IFAEEILNGKIVYLGQSQDRYLAQQSLRTMTNLQHPEKPYIKLSMSLTNTSSSRVLAKHT 325
F + G++V LG+ D++LAQQSLRT+TN IKL +++ NTS R +
Sbjct: 232 DFIADFAEGRMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRY 291

Query: 326 VMNGPIITDWLQRLIKQSKTAQELDFAVLHEVYGLSVD---FTKLPKSHAQQAYGTIGCL 382
+ GP+ + WLQ++ T + +L E V + L ++ + +G +
Sbjct: 292 IAAGPLASRWLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQ-EMLGVI 350

Query: 383 WRESVHQYLREGEDAIPLNGVSHIQKDGQALIAPWLQQYG--VESWTRQLLKVVITPLIH 440
WRE+ ++L+ E + + + ++ Q L ++ + G E+W QL +VV+ PL H
Sbjct: 351 WRENPCRWLKPDESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVPLYH 410

Query: 441 LLFAEGIATESHGQNIILVHKQGWPTRVLLKDFHDGVRYSPAHLAHPELAPELDQLPPEH 500
LL G+A +HGQNI L K+G P RVLLKDF +R PE+D LP E
Sbjct: 411 LLCRYGVALIAHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEF------PEMDSLPQEV 464

Query: 501 AKTNSMSFILTDDLNGIRDFSCACLFFVALTDIAIFLNQNFDLPEKSFWQWAAEVIQNYQ 560
+ + FV + L +PE+ F+Q A V+ +Y
Sbjct: 465 RD------VTSRLSADYLIHDLQTGHFVTVLRFISPLMVRLGVPERRFYQLLAAVLSDYM 518

Query: 561 QQHPEHASRYQLFDVFAEKLRIESLTKRRL-FGDRSIQIKFVDNPLAP 607
++HP+ + R+ LF +F ++ L +L + D + + N L
Sbjct: 519 KKHPQMSERFALFSLFRPQIIRVVLNPVKLTWPDLDGGSRMLPNYLED 566


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
A4U85_RS09360  TCRTETB  126    5e-34    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 126 bits (319), Expect = 5e-34
Identities = 88/430 (20%), Positives = 180/430 (41%), Gaps = 19/430 (4%)

Query: 33 LNNSSFNPAIPHLMSYFQVGEVWASWVVVAFLLAMSISLPLAGFLSQRFGKRSIYLIALL 92
LN N ++P + + F +WV AF+L SI + G LS + G + + L ++
Sbjct: 28 LNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGII 87

Query: 93 GFALASTAGGLFNQFESVLI-ARALQGFCSGLMIPLSLGLIFSVTPSEQRGSTTGLWGAM 151
S G + + F S+LI AR +QG + L + ++ P E RG GL G++
Sbjct: 88 INCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSI 147

Query: 152 IMLTLAVGPMLGALVLVWLNWKALFFINLPVACLALILGYVFLPKEQGDNKQEFDWAGFF 211
+ + VGP +G ++ +++W + + +P+ + + + L K++ K FD G
Sbjct: 148 VAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGII 205

Query: 212 FLGSSIVLLLGTLSQIHQIQDLLQPLYGVL-LVLSVLLFIRFIFVQKNKSMPLIELALFA 270
+ IV + L Y + L++SVL F+ F+ + + P ++ L
Sbjct: 206 LMSVGIVFFM-----------LFTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGLGK 254

Query: 271 TKGFRYSLVICVAQTVGLFIGMLLIPLWIQHLLKLSPLWTGFALMSSAVVTGICSQP-AG 329
F ++ + + ++P ++ + +LS G ++ ++ I G
Sbjct: 255 NIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGG 314

Query: 330 KYLDRYGAAKIMSLGLMITVASFLLLAWAPVQNVWFIVFCMILHGLGMGLSYMPSTTAGL 389
+DR G ++++G+ SFL ++ WF+ ++ G+ + +T
Sbjct: 315 ILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVISTIVS 374

Query: 390 NSLRQQQQHLVTQAAALNNLFRRIFAAVAVVIAALYLQLRQQSLPLNTQAIFTSFHTMQE 449
+SL+QQ+ +L N + + I L + L + S +
Sbjct: 375 SSLKQQE---AGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLDQRLLPMEVDQSTYLYSN 431

Query: 450 IFVCCAILIL 459
+ + + +I+
Sbjct: 432 LLLLFSGIIV 441


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
A4U85_RS09370  PF04183  211    5e-63    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 211 bits (539), Expect = 5e-63
Identities = 91/479 (18%), Positives = 183/479 (38%), Gaps = 35/479 (7%)

Query: 80 IDGQWQKISAGTIVSLLLEELVIESQFKLDA--ASLLEKWIQSRDALLQFLKQRHN-DFD 136
ID Q + + +++ L + + DA A ++ + LQ LK R
Sbjct: 60 IDAQTLRCADEPVLAQTLLMQLKQVLSMSDATVAEHMQDLYATLLGDLQLLKARRGLSAS 119

Query: 137 DLVKAGQNFIESEQALILGHSMHPAPKSRNGFVHEDWLKFSPEHAGKTQLHYWLVHQNYI 196
DL+ + Q L+ GH K R G+ E +++PE+A +LH+ V + ++
Sbjct: 120 DLINL---NADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHM 176

Query: 197 AEGCATEQPISDQVKDAI---RWYLSESDLNLLKTHVEFKLLPLHPWQARYLQGKPWFEQ 253
C E I + A+ + + LP+HPWQ + +
Sbjct: 177 IWRCDNEMDIHQLLTAAMDPQEFARFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIAD 236

Query: 254 LKQTGQLIDIGLRGWQSSPTTSIRTLASFNAPW--MVKTSLSVMITNSIRVNLAKECHRG 311
+ G+++ +G G Q S+RTL + + +K L++ T+ R + G
Sbjct: 237 FAE-GRMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAG 295

Query: 312 EISYRLWHSDLGKKILKQCPTLKAVNDPAWIALQIDGEIINETICIFRDQPFAVQQQVTC 371
++ R + +PA + +G P+ Q+ +
Sbjct: 296 PLASRWLQQVFATDATLVQSGAVILGEPAAGYVSHEGYA------ALARAPYRYQEMLGV 349

Query: 372 I---ASLCQDHPNKELNRFNALFDQIAQKNQQT-------NFKEIALDWFDHFLKIGLAP 421
I P++ L + +N Q A W ++ + P
Sbjct: 350 IWRENPCRWLKPDESPVLMATLME--CDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVP 407

Query: 422 LMYVYHKYGMAFESHQQNVLLELEDGLPKNLWLRDNQG-FYYIEEFATEIVEALPDLLEK 480
L ++ +YG+A +H QN+ L +++G+P+ + L+D QG ++E E+ ++LP +
Sbjct: 408 LYHLLCRYGVALIAHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEM-DSLPQEVRD 466

Query: 481 AHAVGPKDF-VDERFSYYFFGNTLFGLINAIGATGYISEDELLIHLQQNLLQLLEQYPD 538
+ D+ + + + +F T+ I+ + + E L L ++++P
Sbjct: 467 VTSRLSADYLIHDLQTGHFV--TVLRFISPLMVRLGVPERRFYQLLAAVLSDYMKKHPQ 523


62  A4U85_RS09670  A4U85_RS09705  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS09670  -1      18       -2.931057  type II secretion system minor pseudopilin GspJ
A4U85_RS09675  -1      17       -3.453680  type II secretion system minor pseudopilin GspI
A4U85_RS09680  -1      17       -2.983299  type II secretion system protein
A4U85_RS09685  -1      18       -2.331272  TetR/AcrR family transcriptional regulator
A4U85_RS09690  -2      18       -2.223480  TatD family hydrolase
A4U85_RS09695  -2      16       -2.317042  PilZ domain-containing protein
A4U85_RS09700  -2      16       -2.072447  DNA polymerase III subunit delta'
A4U85_RS09705  -2      16       -1.311969  3-deoxy-manno-octulosonate cytidylyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS09670  BCTERIALGSPG  29     0.008    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G

signature.
Length = 145

Score = 29.1 bits (65), Expect = 0.008
Identities = 11/26 (42%), Positives = 16/26 (61%)

Query: 24 RLTRASGFTLVELLVAIAIFAVLSLL 49
+ GFTL+E++V I I VL+ L
Sbjct: 3 ATDKQRGFTLLEIMVVIVIIGVLASL 28


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS09675  BCTERIALGSPH  38     3e-06    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H

signature.
Length = 170

Score = 37.6 bits (87), Expect = 3e-06
Identities = 17/54 (31%), Positives = 29/54 (53%), Gaps = 3/54 (5%)

Query: 1 MKSKGFTLLEVMVALAIFAVAAVALTKVAMQYTQSTSNAILRTKAQFVAMNEVA 54
M+ +GFTLLE+M+ L + V+A V + + S ++ +T A+F A
Sbjct: 1 MRQRGFTLLEMMLILLLMGVSAGM---VLLAFPASRDDSAAQTLARFEAQLRFV 51


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS09680  BCTERIALGSPH  49     9e-10    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H

signature.
Length = 170

Score = 48.8 bits (116), Expect = 9e-10
Identities = 29/148 (19%), Positives = 54/148 (36%), Gaps = 11/148 (7%)

Query: 9 SQKGFTLIEVMVVIVIMTIMTSLVVLNIGGVDQKKAMQARELFLLDLQKINKESLDQSRV 68
Q+GFTL+E+M+++++M + +V+L A Q F L+ + + L +
Sbjct: 2 RQRGFTLLEMMLILLLMGVSAGMVLLAFPASRDDSAAQTLARFEAQLRFVQQRGLQTGQF 61

Query: 69 LALETHGERDVSPFSYELYEYHDQSTLQVQDIKNRWQKYTEFKTRQLPAHVSFSVQPLDD 128
+ V P ++ + + W Y R S S +
Sbjct: 62 FGVS------VHPDRWQFLVLEARDGADPAPADDGWSGYRWLPLRAGRVATSGS---IAG 112

Query: 129 Q--NYSKAKNTDLIGGQTPQLIWFGNGE 154
N + A+ G P ++ F GE
Sbjct: 113 GKLNLAFAQGEAWTPGDNPDVLIFPGGE 140


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
A4U85_RS09685  HTHTETR  54     3e-11    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 53.9 bits (129), Expect = 3e-11
Identities = 17/82 (20%), Positives = 37/82 (45%)

Query: 3 RQAQFRAREALIFQVAEQLLLENGEAGMTLDVLAAELDLAKGTLYKHFQSKDELYMLLII 62
+ + + I VA +L + G + +L +A + +G +Y HF+ K +L+ +
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 63 RNERMLLEMVQDTEKAFPEHLA 84
+E + E+ + + FP
Sbjct: 65 LSESNIGELELEYQAKFPGDPL 86


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
A4U85_RS09705  HTHFIS  30     0.010    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.8 bits (67), Expect = 0.010
Identities = 18/72 (25%), Positives = 29/72 (40%), Gaps = 2/72 (2%)

Query: 20 LLLIHDRPMILRVVDQAKKVEGFDDLCVATDDERIAEICRAEGVDVVLTSADHPSGTDRL 79
+L+ D I V++QA G+D + I +G D+V+T P +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDG-DLVVTDVVMP-DENAF 63

Query: 80 SEVARIKGWDAD 91
+ RIK D
Sbjct: 64 DLLPRIKKARPD 75


63  A4U85_RS10095  A4U85_RS10130  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS10095  -2      13       -2.604724  TetR/AcrR family transcriptional regulator
A4U85_RS10100  -4      12       -1.044165  1-acyl-sn-glycerol-3-phosphate acyltransferase
A4U85_RS10105  -3      11       -0.235922  YebC/PmpR family DNA-binding transcriptional
A4U85_RS10110  -2      10       -0.816026  Lrp/AsnC family transcriptional regulator
A4U85_RS10115  -2      12       -0.257507  DcaP-like protein
A4U85_RS10120  -2      10       -0.819968  amino acid ABC transporter ATP-binding protein
A4U85_RS10125  -2      12       -0.520611  amino acid ABC transporter permease
A4U85_RS10130  -2      13       -0.172134  amino acid ABC transporter permease
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
A4U85_RS10095  HTHTETR  80     6e-21    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 79.7 bits (196), Expect = 6e-21
Identities = 32/190 (16%), Positives = 64/190 (33%), Gaps = 9/190 (4%)

Query: 1 MSKKEDIINTALELFNQIGYNATGVDKIIAESNVAKMTFYKYFPSKESLIMECLHHRNIN 60
++ I++ AL LF+Q G ++T + +I + V + Y +F K L E N
Sbjct: 10 QETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESN 69

Query: 61 IQNSIYEKLSLHPDVS---PIEKIHLIFNWYIDWINSKNFNGCLFKKAFI--EVSKQYTS 115
I E + P E + + + + +F K E++ +
Sbjct: 70 IGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQA 129

Query: 116 IREPFQEYTNWLINLLNSLLVELDIK---DPTPLTHIIISIIDGIIIDGTIDKDLID-PS 171
R E + + L + + I+ I G++ + D
Sbjct: 130 QRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFAPQSFDLKK 189

Query: 172 KKWQYIEYLI 181
+ Y+ L+
Sbjct: 190 EARDYVAILL 199


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS10110  BACYPHPHTASE  31     0.002    Salmonella/Yersinia modular tyrosine phosphatase si...
		>BACYPHPHTASE#Salmonella/Yersinia modular tyrosine phosphatase

signature.
Length = 468

Score = 30.9 bits (69), Expect = 0.002
Identities = 28/114 (24%), Positives = 54/114 (47%), Gaps = 13/114 (11%)

Query: 28 INLSVSSVHRRIKHLIE---ANIMGQLKREINFSKLGFTLHILLQVSLSKHDSETFDKFL 84
+NLS+S +HR++ L++ + G+L+ + +K T L S ++ + F +
Sbjct: 1 MNLSLSDLHRQVSRLVQQESGDCTGKLRGNVAANK-ETTFQGLTIASGARESEKVFAQ-- 57

Query: 85 SEIEAIPEVTNAFLVTGQSADFILELVARNMDDYSEILLRRIGKIDNV-VALHS 137
+ V N L +A + V N+++Y LR +G ++V V+L S
Sbjct: 58 ---TVLSHVANVVLTQEDTAKLLQSTVKHNLNNYD---LRSVGNGNSVLVSLRS 105


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
A4U85_RS10125  TCRTETOQM  28     0.036    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family

signature.
Length = 639

Score = 27.9 bits (62), Expect = 0.036
Identities = 27/112 (24%), Positives = 43/112 (38%), Gaps = 12/112 (10%)

Query: 82 FEFQSDTYWGPLFSSVVTFAIFEAAFFSEIVRSGIQSISKGQVNAGYALGFTYGQSMRYV 141
+++S G L S F+ A E +R G + G + F YG V
Sbjct: 458 MQYESSVSLGYLNQS------FQNAVM-EGIRYGCEQGLYGWNVTDCKICFKYGLYYSPV 510

Query: 142 VLPQAFRNMLPVLLTQTI-----ILFQDVSLVYVISAPDFLGRADTLANTYG 188
P FR + P++L Q + L + + + ++L RA T A Y
Sbjct: 511 STPADFRMLAPIVLEQVLKKAGTELLEPYLSFKIYAPQEYLSRAYTDAPKYC 562


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS10130  TYPE3IMSPROT  28     0.044    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein

family signature.
Length = 354

Score = 27.8 bits (62), Expect = 0.044
Identities = 10/56 (17%), Positives = 23/56 (41%)

Query: 63 IVAFLIAFLLGSLLGVIRTLPNKPLAFIGNCYVEIFRNIPLIVQLFFWAFVFPEFL 118
+++ LI ++ L + LP + I +I R + +I + F ++
Sbjct: 149 LLSILIWIIIKGNLVTLLQLPTCGIECITPLLGQILRQLMVICTVGFVVISIADYA 204


64  A4U85_RS11410  A4U85_RS11440  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS11410  3       18       -2.971718  TetR/AcrR family transcriptional regulator
A4U85_RS11415  1       15       -1.643331  hypothetical protein
A4U85_RS11420  -1      14       -0.289638  SMI1/KNR4 family protein
A4U85_RS11425  -1      13       0.316270   DUF637 domain-containing protein
A4U85_RS11430  -1      17       2.933064   ShlB/FhaC/HecB family hemolysin
A4U85_RS11435  2       19       3.541003   chloride channel protein
A4U85_RS11440  2       19       3.388016   ATP-grasp domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
A4U85_RS11410  HTHTETR  53     3e-11    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 53.5 bits (128), Expect = 3e-11
Identities = 17/65 (26%), Positives = 24/65 (36%)

Query: 5 EASFRALRVLHTAKDLFNQYGFHKVGIDRIIAESKVTKATFYNHFHSKERLIEMCLTFQK 64
EA +L A LF+Q G + I + VT+ Y HF K L +
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 65 DGLKE 69
+ E
Sbjct: 68 SNIGE 72


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
A4U85_RS11425  PF05860  62     2e-13    haemagglutination activity domain.
		>PF05860#haemagglutination activity domain.

Length = 117

Score = 62.1 bits (151), Expect = 2e-13
Identities = 19/144 (13%), Positives = 45/144 (31%), Gaps = 29/144 (20%)

Query: 76 ADIVADSAANAANRAIIAAGKNSAGKTVPVVNIQTPK-NGISHNIYKQFDVLAEGAVLNN 134
A I D+ + ++ T + + H+ +++F V G N
Sbjct: 1 AQITPDTTLPIN-------SNITTEGNTRIIERGTQAGSNLFHS-FQEFSVPTSGTAFFN 52

Query: 135 SRQGATTKTVGAVAANPFLATGEARVILNEVNSSAASRFEGNLEVAGQMADVIIANPSGI 194
+ + I++ V + S +G + A++ + NP+GI
Sbjct: 53 N-------------------PTNIQNIISRVTGGSVSNIDGLIRANAT-ANLFLINPNGI 92

Query: 195 NIKGGGFINANKAIFTTGKPQLNA 218
++ + + +L
Sbjct: 93 IFGQNARLDIGGSFVGSTANRLKF 116


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
A4U85_RS11430  PF00577  33     0.003    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 33.3 bits (76), Expect = 0.003
Identities = 16/127 (12%), Positives = 42/127 (33%), Gaps = 1/127 (0%)

Query: 187 DFLNLQQLDQGLENLKRAYTVDIQILPSNESVQETQGYSDLVIKLQAHSKISLNLGLDNS 246
D + +E V + +G L + Q +L L +
Sbjct: 491 DTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQ 550

Query: 247 GSKDTGKYIGSLGVNINNPFYLSDSLSFNFSHSLDNLHRDLNKNYFVSYQLPLGNYDFST 306
T +N F + + ++S + + + ++ ++ +P ++ S
Sbjct: 551 TYWGTSNVDEQFQAGLNTAF-EDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSD 609

Query: 307 SYSRYQY 313
S S++++
Sbjct: 610 SKSQWRH 616



Score = 32.5 bits (74), Expect = 0.005
Identities = 30/204 (14%), Positives = 58/204 (28%), Gaps = 24/204 (11%)

Query: 221 TQGY---SDLVIKLQAHSKISLNLGLDNSGSKDTGKYI------GSLGVNINNPFYLSDS 271
T GY +D I G+ K T Y G L + + + +
Sbjct: 483 TSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTST 542

Query: 272 LSFNFSHSLDNLHRDLNKNYFVSYQLPLGNYDFSTSYSRYQYEQNVLGANSV-------- 323
L + SH ++++ + + +++ SYS + +
Sbjct: 543 LYLSGSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPF 602

Query: 324 ---LRYHGLSEQGNLNVSRVLSRSG----QHKTSLYGKLYHKQNSNFIDDIEIEVQRRRT 376
LR S+ + + S +S + +YG L N ++
Sbjct: 603 SHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGN 662

Query: 377 SGWNAGIQHRQYLGVAVLDAGLDY 400
SG G + G +
Sbjct: 663 SGSTGYATLNYRGGYGNANIGYSH 686


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
A4U85_RS11440  RTXTOXIND  31     0.011    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 31.3 bits (71), Expect = 0.011
Identities = 13/49 (26%), Positives = 23/49 (46%)

Query: 509 APINGVISAWKVENGEQVTEGQVVAIMEAMKMEVQVLAHRSGVIQIGAE 557
N ++ V+ GE V +G V+ + A+ E L +S ++Q E
Sbjct: 101 PIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLE 149


65  A4U85_RS12105  A4U85_RS12130  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias   Product
A4U85_RS12105  -1      19       2.809653  benzoate 1,2-dioxygenase large subunit
A4U85_RS12110  -1      16       1.700990  benzoate 1,2-dioxygenase small subunit
A4U85_RS12115  0       14       1.536759  ring-hydroxylating dioxygenase ferredoxin
A4U85_RS12120  -1      12       1.282742  1,6-dihydroxycyclohexa-2,4-diene-1-carboxylate
A4U85_RS12125  -2      11       0.963568  benzoate/H(+) symporter BenE
A4U85_RS12130  -1      12       0.757757  aromatic acid/H+ symport family MFS transporter
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
A4U85_RS12105  PF05932  29     0.016    Tir chaperone protein (CesT)
		>PF05932#Tir chaperone protein (CesT)

Length = 127

Score = 29.0 bits (65), Expect = 0.016
Identities = 9/52 (17%), Positives = 15/52 (28%)

Query: 253 AGSWGKQGGGSYGFENGHMLLWTQWANPEDRPNFPKADEYTEKYGEAMSKWM 304
A + G G + L + P ++ + P E M W
Sbjct: 72 ALNPLLNAGPGLGLDEKSGLYHAYQSIPREKLSVPTLKREMAGLLEWMRGWR 123


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS12115  ANTHRAXTOXNA  29     0.029    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 29.3 bits (65), Expect = 0.029
Identities = 16/41 (39%), Positives = 25/41 (60%), Gaps = 1/41 (2%)

Query: 247 VTNDFDLVALE-KLNELQAKFPWFEYRTVVASPESNHERKG 286
+T D+DL AL L E++ + P E+ VV +P S ++KG
Sbjct: 488 LTADYDLFALAPSLTEIKKQIPQKEWDKVVNTPNSLEKQKG 528


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS12120  DHBDHDRGNASE  98     1e-26    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 98.2 bits (244), Expect = 1e-26
Identities = 66/259 (25%), Positives = 115/259 (44%), Gaps = 7/259 (2%)

Query: 3 NRQRFTDKVVIVTGSAQGIGRGVALQVAAEGGQVIMAD-RSEYVEEVLTEIQRAGGEAVT 61
N + K+ +TG+AQGIG VA +A++G + D E +E+V++ ++ A
Sbjct: 2 NAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEA 61

Query: 62 INADLETYASAQAVVAKAIEHYGRVDVLINNVGGAIWMKPFEEFSEEEIIKEVNRSLFPT 121
AD+ A+ + A+ G +D+L+ NV G + S+EE + +
Sbjct: 62 FPADVRDSAAIDEITARIEREMGPIDILV-NVAGVLRPGLIHSLSDEEWEATFSVNSTGV 120

Query: 122 LWCCRAVLPAMIKQQSGVIVNVSSIA--TRGINRIPYSASKGGVNALTASLAFEHAKDGI 179
R+V M+ ++SG IV V S + Y++SK T L E A+ I
Sbjct: 121 FNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNI 180

Query: 180 RVNAVATGGTEAPPRKVPRNTNPLSQNEKDWMQQVVDQTKDRTFMGRYGTIQEQVNAILF 239
R N V+ G TE + + + ++ ++ K + + + +A+LF
Sbjct: 181 RCNIVSPGSTET---DMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLF 237

Query: 240 LASDEASYITGSIIPVGGG 258
L S +A +IT + V GG
Sbjct: 238 LVSGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
A4U85_RS12130  TCRTETB  75     2e-16    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 74.5 bits (183), Expect = 2e-16
Identities = 73/405 (18%), Positives = 147/405 (36%), Gaps = 17/405 (4%)

Query: 21 HWKVLIWCLLIIIFDGYDLVIYGVALPLLMQQWSLTAVEAGLLASAALFGMMFGAMIFGT 80
H ++LIW ++ F + ++ V+LP + ++ + +A + G ++G
Sbjct: 12 HNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGK 71

Query: 81 LSDKLGRKKTILICVTLFSGFTFIGAFAKGPTEFAIL-RFIAGLGIGGVMPNVVALMTEY 139
LSD+LG K+ +L + + + IG I+ RFI G G V+ ++ Y
Sbjct: 72 LSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARY 131

Query: 140 APKKIRSTLVAIMFSGYAIGGMTSALLGAWLVKDMGWQIMFLIAGIPLLLLPLIWKFLPE 199
PK+ R ++ S A+G +G + + W + LI I ++ +P + K L +
Sbjct: 132 IPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKK 191

Query: 200 SLAFLVKSNHSEQAKSIVSKIAPQTQVNANTQLVLNEST-------TTDAPVRALFQQGR 252
+ + V + + + L S V F
Sbjct: 192 EVRIKGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPG 251

Query: 253 TFSTFMFWIAFFMCLLMVYALGSW--LPKLMLQAGYSLG---ASMLFLFALNIGGMVGAI 307
F I ++ + + + M++ + L + +F + ++
Sbjct: 252 LGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGY 311

Query: 308 GGGALADRFHLKPVITIMFIVGSAALILLGI---NSPQFILYSLIAIAGAATIGSQILLY 364
GG L DR V+ I S + + + F+ ++ + G + ++ ++
Sbjct: 312 IGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSF-TKTVIS 370

Query: 365 TFVAQFYPTALRSTGMGWASGIGRIGAIIGPVLTGALLSFELPHQ 409
T V+ GM + + G + G LLS L Q
Sbjct: 371 TIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLDQ 415



Score = 31.8 bits (72), Expect = 0.006
Identities = 27/121 (22%), Positives = 48/121 (39%), Gaps = 6/121 (4%)

Query: 304 VGAIGGGALADRFHLKPVITIMFIVGSAALILLGINSPQF---ILYSLIAIAGAATIGSQ 360
+G G L+D+ +K ++ I+ ++ + F I+ I AGAA +
Sbjct: 64 IGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPA- 122

Query: 361 ILLYTFVAQFYPTALRSTGMGWASGIGRIGAIIGPVLTGALLS-FELPHQMNFLAIAIPG 419
L+ VA++ P R G I +G +GP + G + + + I I
Sbjct: 123 -LVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIIT 181

Query: 420 V 420
V
Sbjct: 182 V 182


66  A4U85_RS13820  A4U85_RS13860  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS13820  -2      21       -1.032685  copper resistance protein NlpE
A4U85_RS13825  -2      20       -1.079718  isocitrate lyase
A4U85_RS13830  -1      19       -0.969074  LysR family transcriptional regulator
A4U85_RS13835  -3      12       -1.246049  HlyC/CorC family transporter
A4U85_RS13840  -2      12       -1.476994  CitMHS family transporter
A4U85_RS13850  -2      14       -0.832447  hypothetical protein
A4U85_RS13855  -2      15       -1.246325  DUF2239 family protein
A4U85_RS13860  -1      16       -0.766697  sulfate adenylyltransferase subunit CysN
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS13825  VACJLIPOPROT  29     0.005    VacJ lipoprotein signature.
		>VACJLIPOPROT#VacJ lipoprotein signature.

Length = 251

Score = 29.5 bits (66), Expect = 0.005
Identities = 13/30 (43%), Positives = 16/30 (53%)

Query: 1 MKKSLLAIALMSTLLVACNKHENKTETTSD 30
MK L A+AL +TLLV C + SD
Sbjct: 1 MKLRLSALALGTTLLVGCASSGTDQQGRSD 30


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
A4U85_RS13835  PF05043  30     0.016    Transcriptional activator
		>PF05043#Transcriptional activator

Length = 493

Score = 29.9 bits (67), Expect = 0.016
Identities = 15/61 (24%), Positives = 29/61 (47%), Gaps = 3/61 (4%)

Query: 3 LERVDLNLLIYLDVLLREK---NVTRAAEQLGVTQPAMSNILRRLRNLFNDPLLIRSSEG 59
L + L L++L K + + AE L T+ A+ + L +++ F D + S+ G
Sbjct: 5 LSKKSHRQLELLELLFEHKRWFHRSELAELLNCTERAVKDDLSHVKSAFPDLIFHSSTNG 64

Query: 60 M 60
+
Sbjct: 65 I 65


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
A4U85_RS13860  PF05704  28     0.025    Capsular polysaccharide synthesis protein
		>PF05704#Capsular polysaccharide synthesis protein

Length = 307

Score = 27.9 bits (62), Expect = 0.025
Identities = 18/83 (21%), Positives = 31/83 (37%)

Query: 51 LDLSGSEQELQQRYAEPEEIKKVGRPKLGVISREITLQKKHWDWLDQQSASASAVIRKLI 110
L LS E+EL R ++IKK I +E QK + Q A ++++ +
Sbjct: 31 LKLSKKEKELIWRNTVKKDIKKSICFFNDEIIQEPMRQKYIFICWLQGIEKAPYIVQQCV 90

Query: 111 DKELNNPNSEGNIMLAKQAIDRF 133
N I++ +
Sbjct: 91 ASVKKNSGDFKVIIIDGNNYKEW 113


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
A4U85_RS13865  TCRTETOQM  68     5e-14    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family

signature.
Length = 639

Score = 67.6 bits (165), Expect = 5e-14
Identities = 56/181 (30%), Positives = 77/181 (42%), Gaps = 26/181 (14%)

Query: 33 VDDGKSTLIGRLLYDSKLIYEDQLQAVTRDSKKVGTTGDAPDLALLVDGLQAEREQGITI 92
VD GK+TL LLY+S I +L +V + GTT D ER++GITI
Sbjct: 12 VDAGKTTLTESLLYNSGAI--TELGSVDK-----GTT--------RTDNTLLERQRGITI 56

Query: 93 DVAYRYFSTEKRKFIIADTPGHEQYTRNMATGASTADLAIILIDARYGVQTQTRRHTFIA 152
F E K I DTPGH + + S D AI+LI A+ GVQ QTR
Sbjct: 57 QTGITSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHAL 116

Query: 153 SLLGIKNIVVAINKMDLVEYSSERFNEIQVEYDAFVSQLGDRRPANILFVPISALNGDNV 212
+GI I INK+D + ++ + ++ A I+ L +
Sbjct: 117 RKMGIPTIFF-INKID----------QNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMC 165

Query: 213 V 213
V
Sbjct: 166 V 166


67  A4U85_RS15020  A4U85_RS15065  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS15020  1       9        0.808054   trehalose-6-phosphate synthase
A4U85_RS15025  1       9        0.906933   MFS transporter
A4U85_RS15030  1       9        0.793763   bacterioferritin
A4U85_RS15035  1       10       0.974701   NAD-dependent DNA ligase LigA
A4U85_RS15040  1       11       0.203493   cell division protein ZipA
A4U85_RS15045  1       11       0.606495   chromosome segregation protein SMC
A4U85_RS15050  -3      12       0.564217   GntR family transcriptional regulator
A4U85_RS15055  -1      11       0.950785   sulfite exporter TauE/SafE family protein
A4U85_RS15060  -2      10       0.382311   biotin--[acetyl-CoA-carboxylase] ligase
A4U85_RS15065  -1      11       -0.227430  pantothenate kinase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
A4U85_RS15020  HTHFIS  29     0.042    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.0 bits (65), Expect = 0.042
Identities = 13/74 (17%), Positives = 25/74 (33%), Gaps = 11/74 (14%)

Query: 350 RDGMNLVAKEYIAAQDPENPGVLILSKYAGAAEQMTQAL-------IVDPLDRAAMMDSL 402
+ +L+ + I P+ P VL++S +A + P D ++ +
Sbjct: 60 ENAFDLLPR--IKKARPDLP-VLVMSAQ-NTFMTAIKASEKGAYDYLPKPFDLTELIGII 115

Query: 403 KTALEMSKAERINR 416
AL K
Sbjct: 116 GRALAEPKRRPSKL 129


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
A4U85_RS15025  TCRTETA  42     2e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 42.1 bits (99), Expect = 2e-06
Identities = 59/352 (16%), Positives = 118/352 (33%), Gaps = 20/352 (5%)

Query: 52 ANFGLLLLCMGIGSMIAMPATGALVKRWGCRPLIAVATILLMILLPSLTIWHSLVSMAVA 111
A++G+LL + P GAL R+G RP++ V+ + + L + +
Sbjct: 43 AHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIG 102

Query: 112 LFIFGTAAGSLGVAINLQAVVVEKHSLRALMSSFHGMCSLGGLIGAMLVTALLAIGLSPL 171
+ G + VA A + + RA F C G++ ++ L+ G SP
Sbjct: 103 RIVAGITGATGAVAGAYIADITDGDE-RARHFGFMSACFGFGMVAGPVLGGLMG-GFSPH 160

Query: 172 MSTLSVVMVLLVVSFVAIPSALTTFEQDEQGAAEITDAPKKSSRPNGTILLIGMMCFIAF 231
+ + + + + + + P S R + ++ + + F
Sbjct: 161 APFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFF 220

Query: 232 L----SEGAAMDWGGIYLTSKYQLNPAFAGLAYTFFAL--SMTSGRFAGHILLKQWGEKT 285
+ + A W I+ ++ + G++ F + S+ G + + GE+
Sbjct: 221 IMQLVGQVPAALW-VIFGEDRFHWDATTIGISLAAFGILHSLAQAMITG-PVAARLGERR 278

Query: 286 IVTYSAIVAALAMVTIVMAPVWQVVVLGYALLGLG--CSNIVPVMFSRVGRQNDMPKAAA 343
+ I + + A + LL G + M SR + +
Sbjct: 279 ALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQG 338

Query: 344 LSLVSTIAYTGSLSGPALIGLI-----GQWTSLTTVLSGVAVLLTMIAILNR 390
+ + S+ GP L I W + G A+ L + L R
Sbjct: 339 SL--AALTSLTSIVGPLLFTAIYAASITTWNGWAWIA-GAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
A4U85_RS15030  HELNAPAPROT  36     2e-05    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family

signature.
Length = 153

Score = 36.0 bits (83), Expect = 2e-05
Identities = 19/97 (19%), Positives = 35/97 (36%), Gaps = 14/97 (14%)

Query: 46 HEMQEE-----ASHADAIIRRVLFLGAKPNMHREDINVGTDV---------VSCLKADLA 91
HE EE A D I R+L +G +P ++ + ++A +
Sbjct: 47 HEKFEELYDHAAETVDTIAERLLAIGGQPVATVKEYTEHASITDGGNETSASEMVQALVN 106

Query: 92 LEYHVREKLATGIKLCEEKGDYISRDMLRQQLSDTEE 128
+ + I L EE D + D+ + + E+
Sbjct: 107 DYKQISSESKFVIGLAEENQDNATADLFVGLIEEVEK 143


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS15040  SECFTRNLCASE  34     8e-04    Bacterial translocase SecF protein signature.
		>SECFTRNLCASE#Bacterial translocase SecF protein signature.

Length = 333

Score = 33.7 bits (77), Expect = 8e-04
Identities = 28/136 (20%), Positives = 47/136 (34%), Gaps = 21/136 (15%)

Query: 3 INTIIGIVVAIIIMLIGLRMILKKPNHAEPSLD--SDLHINPESNQPVIPRHVRDQLEQP 60
++ I I+ ++IGL +D I ES + R LE
Sbjct: 26 AAIVMMIASVILPLVIGLNF----------GIDFKGGTTIRTESTTAIDVGVYRAALE-- 73

Query: 61 EVTVASAAVAERVEPTLSEP-------AQSEEKGTKELEQASQAQTVQTQVPVENTPVEV 113
+ + ++E +P+ E Q +E G Q +Q Q + +V T V+
Sbjct: 74 PLELGDVIISEVRDPSFREDQHVAMIRIQMQEDGQGAEGQGAQGQELVNKVETALTAVDP 133

Query: 114 EEVKAEENTVSPTVSE 129
+V P VS
Sbjct: 134 ALKITSFESVGPKVSG 149


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
A4U85_RS15045  GPOSANCHOR  61     2e-11    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 61.2 bits (148), Expect = 2e-11
Identities = 48/272 (17%), Positives = 100/272 (36%), Gaps = 7/272 (2%)

Query: 651 EQVLQKQQPELQALDQIIVQQKDELGQLQVDLQQKQQVIKQKQKDLQQLDVQIAKQQTAA 710
+ +AL + +EL + L++ + + +K +Q+L+ + A + A
Sbjct: 70 KLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKAL 129

Query: 711 QAFLLQKQQLKDQLAQLDTQLEEDAMQKDDLEIDLHALAMKLETILPDYKTLQFQVEELT 770
+ + ++ L+ + A +K DLE L KTL+ + L
Sbjct: 130 EGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALE 189

Query: 771 EQLEEQQQVLQQQQQEREILRRNSTQTTQQIELLEKDISFLQSQYQQITAQMEQAKKFVD 830
+ E ++ L+ LE + + L ++ + +E A F
Sbjct: 190 ARQAELEKALEGAMNFSTADSAKIKT-------LEAEKAALAARKADLEKALEGAMNFST 242

Query: 831 PIQLELPNLESEFQQQFAQTEKLQKTWNEWQIELNSVQEKQQTLTDQRHQYQQQDEKLRE 890
++ LE+E A+ +L+K + K +TL ++ + + L
Sbjct: 243 ADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEH 302

Query: 891 QLEAKRLAWQAAKSDREHYQEQLKELNAELQT 922
Q + Q+ + D + +E K+L AE Q
Sbjct: 303 QSQVLNANRQSLRRDLDASREAKKQLEAEHQK 334



Score = 43.1 bits (101), Expect = 6e-06
Identities = 37/246 (15%), Positives = 87/246 (35%)

Query: 178 RETLQHLEHTEQNLSRLEDIALELKSQLKTLKRQSEAAVQYKTLENQIRTLKIEILSFQA 237
++L Q L + + A ++ E + L
Sbjct: 105 DKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKAL 164

Query: 238 EKSVRLQEEYTVQMNELGETFKLVRSELSTIEHDLESTSALFQRLIQQSSPLQQEWQQAE 297
E ++ + ++ L + + + +E LE + L+ E
Sbjct: 165 EGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALA 224

Query: 298 KKLSELKMTLEQKQSLFQQNSTTLVQLEQQKAQTKERLQLSELQLETLNSQLEEQTEALT 357
+ ++L+ LE + +S + LE +KA + R E LE + + +
Sbjct: 225 ARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIK 284

Query: 358 AVEHTAAEAEQNFASLQSQQRQAQQQFEQVKAQVEKQQQQKMQMSAQIEQLGKNVQRIEQ 417
+E A E A L+ Q + + ++ ++ ++ K Q+ A+ ++L + + E
Sbjct: 285 TLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEA 344

Query: 418 QKETLQ 423
+++L+
Sbjct: 345 SRQSLR 350



Score = 42.4 bits (99), Expect = 1e-05
Identities = 48/312 (15%), Positives = 119/312 (38%), Gaps = 11/312 (3%)

Query: 644 RIRLDEIEQVLQKQQPELQALDQIIVQQKDELGQLQVDLQQKQQVIKQKQKDLQQLDVQI 703
++ + E AL+ + + L IK + + L +
Sbjct: 168 MNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARK 227

Query: 704 AKQQTAAQAFLLQKQQLKDQLAQLDTQLEEDAMQKDDLEIDLHALAMKLETILPDYKTLQ 763
A + A + + ++ L+ + ++ +LE L
Sbjct: 228 ADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNF-------STADS 280

Query: 764 FQVEELTEQLEEQQQVLQQQQQEREILRRNSTQTTQQIELLEKDISFLQSQYQQITAQME 823
+++ L + + + + ++L N + ++ + L++++Q++ Q +
Sbjct: 281 AKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNK 340

Query: 824 QAKKFVDPIQLELPNLESEFQQQFAQTEKLQKTWNEWQIELNSVQEKQQTLTDQRHQYQQ 883
++ ++ +L +Q A+ +KL++ + +I S Q ++ L R ++
Sbjct: 341 ISEASRQSLRRDLDASREAKKQLEAEHQKLEE---QNKISEASRQSLRRDLDASREA-KK 396

Query: 884 QDEKLREQLEAKRLAWQAAKSDREHYQEQLKELNAELQTGLKIDLNEHQQKLEKVQKQFE 943
Q EK E+ +K A + + E ++ ++ AELQ L+ + ++KL K ++
Sbjct: 397 QVEKALEEANSKLAALEKLNKELEESKKLTEKEKAELQAKLEAEAKALKEKLAKQAEELA 456

Query: 944 KIGAVNLAASQE 955
K+ A + SQ
Sbjct: 457 KLRAGKASDSQT 468



Score = 40.8 bits (95), Expect = 3e-05
Identities = 34/268 (12%), Positives = 81/268 (30%), Gaps = 23/268 (8%)

Query: 742 EIDLHALAMKLETILPDYKTLQFQVEELTEQLEEQQQVLQQQQQEREILRRNSTQTTQQI 801
L + + + + TL+ + +L+ + + + +E + + + +
Sbjct: 49 TDTLEKVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSL 108

Query: 802 ELLEKDISFLQSQYQQITAQMEQAKKFVDPIQLELPNLESEFQQQFAQTEKLQKTWNEWQ 861
I L+++ + +E A F ++ LE+E A+ L+K
Sbjct: 109 SEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAM 168

Query: 862 IELNSVQEKQQTLTDQRHQYQQQDEKLREQLEAKRLAWQAAKSDREHYQEQLKELNAELQ 921
+ K +TL ++ + + +L + LE A + + + + L A
Sbjct: 169 NFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKA 228

Query: 922 TGLKIDLNEHQQKLEKVQKQFEKIGAVNLAASQEFEEVSQRFDELSHQIQDLENTVTQLK 981
+ A S + L + LE +L+
Sbjct: 229 ----------------------DLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELE 266

Query: 982 DAMKSIDQETRKLFMSTFDQINQELQNL 1009
A++ + + E L
Sbjct: 267 KALE-GAMNFSTADSAKIKTLEAEKAAL 293



Score = 33.5 bits (76), Expect = 0.006
Identities = 39/241 (16%), Positives = 100/241 (41%), Gaps = 11/241 (4%)

Query: 184 LEHTEQNLSRLEDIALELKSQLKTL-KRQSEAAVQYKTLENQIRTLKIEILSFQA--EKS 240
E+ L + + +++KTL ++ + LE + + A +
Sbjct: 227 KADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTL 286

Query: 241 VRLQEEYTVQMNELGETFKLVRSELSTIEHDLESTSALFQRLIQQSSPLQQEWQQAEKKL 300
+ + +L +++ + ++ DL+++ ++L + L+++ + +E
Sbjct: 287 EAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASR 346

Query: 301 SELKMTLEQKQSLFQQNSTTLVQLEQQKAQTKERLQLSELQLETLNSQLEEQTEALTAVE 360
L+ L+ + + QLE + + +E+ ++SE ++L L+ EA VE
Sbjct: 347 QSLRRDLDASREAKK-------QLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQVE 399

Query: 361 HTAAEAEQNFASLQSQQRQAQQQFEQV-KAQVEKQQQQKMQMSAQIEQLGKNVQRIEQQK 419
EA A+L+ ++ ++ + K + E Q + + + A E+L K + + + +
Sbjct: 400 KALEEANSKLAALEKLNKELEESKKLTEKEKAELQAKLEAEAKALKEKLAKQAEELAKLR 459

Query: 420 E 420

Sbjct: 460 A 460



Score = 33.1 bits (75), Expect = 0.007
Identities = 32/178 (17%), Positives = 63/178 (35%), Gaps = 1/178 (0%)

Query: 333 ERLQLSELQLETLNSQLEEQTEALTAVEHTAAEAEQNFASLQSQQRQAQQQFEQVKAQVE 392
L+L L N L++ + LT A E + S++ Q+ E KA +E
Sbjct: 67 NTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLE 126

Query: 393 KQQQQKMQMSAQIEQLGKNVQRIEQQKETLQHQANQIQSQVHEDEQGELEQLQQQLCREI 452
K + M S K ++ E+ + + + + + L E
Sbjct: 127 KALEGAMNFSTADSAKIKTLEA-EKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEK 185

Query: 453 STLEAEIEQYVQRIEQAQQAHQVNKNQQQTLKTEIQVLLSEQKNLSQLVAKQSPKQNQ 510
+ LEA + + +E A + + +TL+ E L + + +L + +
Sbjct: 186 AALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTA 243


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
A4U85_RS15065  PF03309  96     4e-26    Bvg accessory factor
		>PF03309#Bvg accessory factor

Length = 271

Score = 96.4 bits (240), Expect = 4e-26
Identities = 43/263 (16%), Positives = 97/263 (36%), Gaps = 34/263 (12%)

Query: 4 LWLDIGNTRLKYWI----TENQQIIEH--AAELHLQSPADLLLGLIQHFKHQG--LHRIG 55
L +D+ NT + ++ ++++ + +L L + L
Sbjct: 3 LAIDVRNTHTVVGLISGSGDHAKVVQQWRIRTEPEVTADELALTIDGLIGDDAERLTGAS 62

Query: 56 ISSVLDTENNQRIQQILKWLEI-PVVFAKVHAEYAGLQCGYEVPSQLGIDRWLQ-VLAVA 113
S + + ++ + ++ P V + G+ + P ++G DR + + A
Sbjct: 63 GLSTVPSVLHEVRVMLEQYWPNVPHVLIEPGVR-TGIPLLVDNPKEVGADRIVNCLAAYH 121

Query: 114 EEKENYCIIGCGTALTID-LTKGKQHLGGYILPNLYLQRDALIQNTK-----GIKIPDSA 167
+ ++ G+++ +D ++ + LGG I P + + DA + + P S
Sbjct: 122 KYGTAAIVVDFGSSICVDVVSAKGEFLGGAIAPGVQVSSDAAAARSAALRRVELTRPRSV 181

Query: 168 FDNLNPGNNTVDAVHHGILLGLISTIESIMQQS----------PKKLLLTGGDAPLFAKF 217
G NTV+ + G + G ++ ++ + ++ TG APL
Sbjct: 182 I-----GKNTVECMQAGAVFGFAGLVDGLVNRIRDDVDGFSGADVAVVATGHTAPLVLPD 236

Query: 218 LQKYQPTVETDLLLKGLQQYIAH 240
L + + L L GL + +
Sbjct: 237 L-RTVEHYDRHLTLDGL-RLVFE 257


68  A4U85_RS15295  A4U85_RS15330  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS15295  5       24       2.285607   sensor histidine kinase BfmS
A4U85_RS15300  3       24       3.024050   response regulator transcription factor BfmR
A4U85_RS15305  1       22       3.044018   hypothetical protein
A4U85_RS15310  1       21       3.177033   ribonucleoside-diphosphate reductase subunit
A4U85_RS15315  -1      16       1.564609   hypothetical protein
A4U85_RS15320  0       15       0.558993   ribonucleotide-diphosphate reductase subunit
A4U85_RS15325  1       15       -0.804105  hypothetical protein
A4U85_RS15330  6       25       -5.627513  TetR/AcrR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS15295  PF06580       39     4e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 39.1 bits (91), Expect = 4e-05
Identities = 25/150 (16%), Positives = 52/150 (34%), Gaps = 31/150 (20%)

Query: 378 VAVETEALKTQKEIELI--PPPLYVKVDAERRYLHRVV-----QNLVGNAVRYC------ 424
+A E + + ++ I L + + V Q LV N +++
Sbjct: 218 LADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQ 277

Query: 425 DNKVRITGGIHNDGMAFVCVEDDGPGIPEQDRKRVFEAFARLDDSRTRASGGYGLGLSIV 484
K+ + G ++G + VE+ G + ++ S G GL ++
Sbjct: 278 GGKILLKG-TKDNGTVTLEVENTGSLALKNTKE----------------STGTGL-QNVR 319

Query: 485 SRIAYWFGGEIKVDESPSLGGARFIMTWPA 514
R+ +G E ++ S G ++ P
Sbjct: 320 ERLQMLYGTEAQIKLSEKQGKVNAMVLIPG 349


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS15300  HTHFIS        87     6e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 87.2 bits (216), Expect = 6e-22
Identities = 33/137 (24%), Positives = 59/137 (43%), Gaps = 1/137 (0%)

Query: 8 PKILIVEDDERLARLTQEYLIRNGLEVGVETDGNRAIRRIISEQPDLVVLDVMLPGADGL 67
IL+ +DD + + + L R G +V + ++ R I + DLVV DV++P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 68 TVCREVRPHY-HQPILMLTARTEDMDQVLGLEMGADDYVAKPVQPRVLLARIRALLRRTD 126
+ ++ P+L+++A+ M + E GA DY+ KP L+ I L
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 127 KTVEDEVAQRIEFDDLV 143
+ + LV
Sbjct: 124 RRPSKLEDDSQDGMPLV 140


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS15325  VACCYTOTOXIN  35     0.001    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 35.4 bits (81), Expect = 0.001
Identities = 65/320 (20%), Positives = 108/320 (33%), Gaps = 26/320 (8%)

Query: 159 GNSITLIGDSSSSSVNNSATNTSNTVNDNDTTYNGNGSGAGNGSGDGLLNGIGSGNGEQN 218
GNS T DS+ + + +++ N GSGAG + +L
Sbjct: 172 GNSFTSYKDSADRTTRVDFNAKNILIDNFLEINNRVGSGAGRKASSTVLT--------LQ 223

Query: 219 YGIGNGIADDASITAPITLPINLSGNSITLIGNSSASSVNSSPTTTSNTVNDNDTTYNGN 278
G ++A I+ +NL+ NS+ L+GN + + + + +T+
Sbjct: 224 ASEGITSRENAEISLYDGATLNLASNSVKLMGNVWMGRLQYVGAYLAPSYSTINTS-KVT 282

Query: 279 GTGDSGVSALGDSGNGSGDGAGNGIASGNGEHNYGIGNGNGDDVDITAPITGVLNFSGNS 338
G + +GD + A GI + N H + ++I AP G N
Sbjct: 283 GEVNFNHLTVGDH-----NAAQAGIIASNKTHIGTLDLWQSAGLNIIAPPEGGYKDKPND 337

Query: 339 FTLIGNSSSSSVNTAPTTTSNTVNDNDTIDNGNSGGTGSGSGNGSGDGLLNGAASG---- 394
N++ ++ +S ++ I+ NS DG G +
Sbjct: 338 --KPSNTTQNNAKNDKQESSQNNSNTQVINPPNSAQKTEIQPTQVIDGPFAGGKNTVVNI 395

Query: 395 ---NGEHNYGIGNGNGDDVDITAPITGVFNFSGNSFSLIGNSSSSSINTAPTTTTNTVND 451
N + I G T + +L +S S+ T TV D
Sbjct: 396 NRINTNADGTIRVGGFKASLTTN--AAHLHIGKGGINLSNQASGRSLLVENLTGNITV-D 452

Query: 452 NDVTVNGNDGGGLLGGSSGN 471
+ VN GG L GSS N
Sbjct: 453 GPLRVNNQVGGYALAGSSAN 472



Score = 30.8 bits (69), Expect = 0.035
Identities = 32/167 (19%), Positives = 63/167 (37%), Gaps = 9/167 (5%)

Query: 29 GSGDGLLNGISSGNGEHNYGIGNGIADDASITAPITIPLNLSGNSITLIGN---SSSNSV 85
GSG G + + + GI + +A I+ LNL+ NS+ L+GN V
Sbjct: 208 GSGAGRKASSTVLTLQASEGITSRE--NAEISLYDGATLNLASNSVKLMGNVWMGRLQYV 265

Query: 86 NSSPTTTSNNVNDNDVTNNGNGSTIGSGTGNGSGDGLLNGAASGNGEHNYGIGNGIADDA 145
+ + + +N + VT N + + G N + G++ + G + G+
Sbjct: 266 GAYLAPSYSTINTSKVTGEVNFNHLTVGDHNAAQAGIIASNKTHIGTLDLWQSAGL---- 321

Query: 146 SITAPLSIPINLAGNSITLIGDSSSSSVNNSATNTSNTVNDNDTTYN 192
+I AP N +++ + ++ +N+ N
Sbjct: 322 NIIAPPEGGYKDKPNDKPSNTTQNNAKNDKQESSQNNSNTQVINPPN 368


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS15330  HTHTETR       70     6e-17    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 69.7 bits (170), Expect = 6e-17
Identities = 27/187 (14%), Positives = 65/187 (34%), Gaps = 17/187 (9%)

Query: 11 RPRQARSVATFEAILEAAARILESLGFAGFNTNAVAELAGVSIGSLYQYFPSKDALIVEL 70
R + + T + IL+ A R+ G + + +A+ AGV+ G++Y +F K L E+
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 71 IRRERAKLSNHIVEVIQQSDAADLKDKLKLIIQAAVQHQLSRPQLARTLEFASELIGKDI 130
+ + +E + D L+ I+ ++ ++ + +E
Sbjct: 63 WELSESNIGELELEYQAK-FPGDPLSVLREILIHVLESTVTEERRRLLMEI-------IF 114

Query: 131 EESEHQHELETIISDLFKRSGVSHAQTAAQDVIALSKGMINAAGIVGESDLNHLQQRVEK 190
+ E E+ + + + + + + + +R
Sbjct: 115 HKCEFVGEMAVVQ---------QAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAI 165

Query: 191 AVFGYLD 197
+ GY+
Sbjct: 166 IMRGYIS 172


69  A4U85_RS15415  A4U85_RS15455  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS15415  -1      18       2.605109   MFS transporter
A4U85_RS15420  -1      18       1.911739   extracellular solute-binding protein
A4U85_RS15425  -2      16       2.099798   LysR family transcriptional regulator
A4U85_RS15430  0       15       1.803461   FAD-dependent tricarballylate dehydrogenase
A4U85_RS15435  0       15       1.555915   tricarballylate utilization 4Fe-4S protein TcuB
A4U85_RS15440  0       15       1.206427   HPP family protein
A4U85_RS15445  0       16       2.338947   TetR/AcrR family transcriptional regulator
A4U85_RS15450  0       15       2.600650   CoA transferase
A4U85_RS15455  1       15       2.877850   MFS transporter
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS15415  TCRTETA       38     5e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 38.3 bits (89), Expect = 5e-05
Identities = 66/356 (18%), Positives = 128/356 (35%), Gaps = 46/356 (12%)

Query: 64 LMRPLGAIFLGAYVDKVGRRKGLIVTLSLMAIGTILITFVPGYETIGIIAPILVVIGRLL 123
LM+ A LGA D+ GRR L+V+L+ A+ ++ P ++ IGR++
Sbjct: 54 LMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLW--------VLYIGRIV 105

Query: 124 QGFSAGVESGGVSIYLAEIATDKNRGFITSWQSGSQQIAVVFAALLGYWLNTILTHAQVG 183
G + G Y+A+I R + S +V +LG + G
Sbjct: 106 AGIT-GATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLM---------G 155

Query: 184 EWGWRIPFLI-----GCLIIPLIFLFRRTLEETEDFKAQKTHPSSKEIFSTLVSNWRIVL 238
+ PF G + FL + + P +E + L S +R
Sbjct: 156 GFSPHAPFFAAAALNGLNFLTGCFLLPES-------HKGERRPLRREALNPLAS-FRWAR 207

Query: 239 AGMMMSAMTTTTF-------YFITVYTTVYAKRTLEMSVTDSLLATVFVGLSNFFWLPMG 291
+++A+ F ++ R + T + F L + +
Sbjct: 208 GMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMIT 267

Query: 292 GLLSDKIG-RRPVLVGITTLAIFTSYPVLSWLVSDISFSNLLITLAYFSFFFGMYNGTMV 350
G ++ ++G RR +++G+ +A T Y +L++ +++ LA +
Sbjct: 268 GPVAARLGERRALMLGM--IADGTGYILLAFATRGWMAFPIMVLLASGGIGMPALQAMLS 325

Query: 351 ATLAEVMPKRVRTVGFSLAFSLAAAIFGGMTPMACTFLVENTGNASTPAFWLMLAA 406
+ E +++ G A + +I G P+ T + + W+ AA
Sbjct: 326 RQVDEERQGQLQ--GSLAALTSLTSIVG---PLLFTAIYAASITTWNGWAWIAGAA 376


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS15435  TCRTETA       29     0.024    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 29.4 bits (66), Expect = 0.024
Identities = 14/46 (30%), Positives = 23/46 (50%), Gaps = 3/46 (6%)

Query: 305 DRGFIFLLLIVSASGLALMAFRNAPYMALLLIFHLATVMTFFITMP 350
+R + L +I +G L+AF +MA ++ LA + I MP
Sbjct: 276 ERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLA---SGGIGMP 318


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS15445  HTHTETR       62     3e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 62.0 bits (150), Expect = 3e-14
Identities = 28/176 (15%), Positives = 60/176 (34%), Gaps = 18/176 (10%)

Query: 7 KILDTAEKLFNENSFVGVGVDLIRDESGCSKTTMYTYYKNKNQLVKSVLVARDERFKQSL 66
ILD A +LF++ + I +G ++ +Y ++K+K+ L + + +
Sbjct: 15 HILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGELE 74

Query: 67 LGYVGDATG------LEAINKILDWHTNWFRQDFFKGCLFVR--AVAESNQDDQDIISIS 118
L Y G E + +L+ R+ +F + V E Q ++
Sbjct: 75 LEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQRNLC 134

Query: 119 KAHKQWIKELIAANCNMPNGE--------ALSELIYTVIEGLISRFLVDGFDETLA 166
I++ + + + ++ I GL+ +L L
Sbjct: 135 LESYDRIEQTLKH--CIEAKMLPADLMTRRAAIIMRGYISGLMENWLFAPQSFDLK 188


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS15450  HTHFIS        29     0.038    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.0 bits (65), Expect = 0.038
Identities = 10/19 (52%), Positives = 12/19 (63%)

Query: 293 RNELIPLLSEHFLQKTAKE 311
R E IP L HF+Q+ KE
Sbjct: 313 RAEDIPDLVRHFVQQAEKE 331


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS15455  TCRTETA       53     1e-09    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 52.5 bits (126), Expect = 1e-09
Identities = 65/396 (16%), Positives = 128/396 (32%), Gaps = 31/396 (7%)

Query: 34 ALLFAYFAMVVDGIDIMLLSYSLTSLKAEFGLSTFQAGALGSA----SLAGMGIGGILGG 89
L+ + +D + I L+ L L + S G +L +LG
Sbjct: 6 PLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGA 65

Query: 90 WACDKFGRVRTIANSVTFFSVATCLLGFTQSFEQFMALRFIGALGIGALYMACNTLMAEY 149
+ D+FGR + S+ +V ++ R + + GA +A+
Sbjct: 66 LS-DRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGI-TGATGAVAGAYIADI 123

Query: 150 VPTTYRTTVLGTLQTGQTVGYIAATLLAGAIIPDYGWRVLFFLTVVPAFVNIFLQRF-VP 208
R G + G +A +L G ++ + FF +N F +P
Sbjct: 124 TDGDERARHFGFMSACFGFGMVAGPVL-GGLMGGFSPHAPFFAAAALNGLNFLTGCFLLP 182

Query: 209 EPKSWQLTKIESLQGKRQPKERVVVEKPKSSSIYKQIFNNFKHRKMFLLWMTTAFFLQ-F 267
E +G+R+P R P +S + + + M F +Q
Sbjct: 183 ESH----------KGERRP-LRREALNPLASFRWARGM------TVVAALMAVFFIMQLV 225

Query: 268 GYYGINNWMPSYLETEVHMNFKNLT-SYMVGSYTAMILGKILAGYLADKFNRRAVFVFGT 326
G W+ + E H + + S + ++ G +A + R + G
Sbjct: 226 GQVPAALWV-IFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGM 284

Query: 327 IASAVFLPIIIFFNTPDNILYLLITFGFLYGIPYGVNATYMAESFSTDVRGTAIGGAYNI 386
IA ++ F +++ GI ++ + +G G +
Sbjct: 285 IADGTGYILLAFATRGWMAFPIMVLLAS-GGIGMPALQAMLSRQVDEERQGQLQGSLAAL 343

Query: 387 GRVGAAIAPATIGFL--ASGGTFTMAFIVMGAAYFV 420
+ + + P + AS T+ + GAA ++
Sbjct: 344 TSLTSIVGPLLFTAIYAASITTWNGWAWIAGAALYL 379


70  A4U85_RS16040  A4U85_RS16085  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS16040  -1      12       -1.590330  peptide chain release factor 3
A4U85_RS16045  -1      11       -2.411466  iron-containing redox enzyme family protein
A4U85_RS16050  -2      12       -0.832327  TetR/AcrR family transcriptional regulator
A4U85_RS16055  -2      12       0.223528   TetR/AcrR family transcriptional regulator
A4U85_RS16065  0       18       1.065655   EAL domain-containing protein
A4U85_RS16070  2       27       2.032858   ketol-acid reductoisomerase
A4U85_RS16075  1       19       1.529834   acetolactate synthase small subunit
A4U85_RS16080  0       17       1.417581   acetolactate synthase 3 large subunit
A4U85_RS16085  1       16       1.809793   DUF4124 domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS16040  TCRTETOQM     212    3e-63    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 212 bits (540), Expect = 3e-63
Identities = 110/459 (23%), Positives = 203/459 (44%), Gaps = 48/459 (10%)

Query: 15 RTFAIISHPDAGKTTMTEKLLLWGKAIQVAGMVKSRKSDRAATSDWMEMEKERGISITTS 74
+++H DAGKTT+TE LL AI G V +D +E++RGI+I T
Sbjct: 4 INIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKG----TTRTDNTLLERQRGITIQTG 59

Query: 75 VMQFPYKGHTINLLDTPGHEDFSEDTYRTLTAVDSALMVIDGAKGVEERTIKLMEVCRMR 134
+ F ++ +N++DTPGH DF + YR+L+ +D A+++I GV+ +T L R
Sbjct: 60 ITSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKM 119

Query: 135 DTPIISFVNKMDREIREPLELLDEIENVLNIRCVPITWPLGMGRDFAGVYNILEDKLYVY 194
P I F+NK+D+ + + +I+ L+ V K+ +Y
Sbjct: 120 GIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQ------------------KVELY 161

Query: 195 KAGFGSTITDIEVRDGY--NHADIREKVGELAWASFEESLELVQMANEPLDRELFLQGKQ 252
+ T+ E D + D+ EK +SLE +++ E + F
Sbjct: 162 PNMCVTNFTESEQWDTVIEGNDDLLEKYMS------GKSLEALELEQE--ESIRFHNCSL 213

Query: 253 TPVLFGTALGNFGVDHVLDAFMNWAPEPKAHPTQERMVEAKEEGFSGFVFKIQANMDPKH 312
PV G+A N G+D++++ N + G VFKI+ K
Sbjct: 214 FPVYHGSAKNNIGIDNLIEVITNKFYSS---------THRGQSELCGKVFKIE--YSEK- 261

Query: 313 RDRIAFMRICSGKYEKGLKMNHVRIGKEVRISDALTFLAGEREHLEEAWPGDIIGLHNHG 372
R R+A++R+ SG + K ++I++ T + GE +++A+ G+I+ L N
Sbjct: 262 RQRLAYIRLYSGVLHLRDSVRISEKEK-IKITEMYTSINGELCKIDKAYSGEIVILQNEF 320

Query: 373 TIQIGDTFTSGENLHFTGIPHFAPEMFR-RVRLKDPLKSKQLQKGLKELSEEGAT-QVFM 430
+++ + L + + V P + + L L E+S+ + ++
Sbjct: 321 -LKLNSVLGDTKLLPQRERIENPLPLLQTTVEPSKPQQREMLLDALLEISDSDPLLRYYV 379

Query: 431 PQISNDLIVGAVGVLQFDVVAYRLKEEYKVDCVYEPVSV 469
++++I+ +G +Q +V L+E+Y V+ + +V
Sbjct: 380 DSATHEIILSFLGKVQMEVTCALLQEKYHVEIEIKEPTV 418


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS16050  HTHTETR       67     5e-16    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 66.6 bits (162), Expect = 5e-16
Identities = 31/171 (18%), Positives = 59/171 (34%), Gaps = 18/171 (10%)

Query: 1 MSKRETIITTAMTLFNQKSYTSIGVDKIIAESKVAKMTFYKYFSSKEVLIEECLRR---R 57
R+ I+ A+ LF+Q+ +S + +I + V + Y +F K L E
Sbjct: 10 QETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESN 69

Query: 58 ILEVQTSLLDKVNSVDDPLNKLKSIFNWYIDWINTED----FSGCLFKKATIEVLQLYPS 113
I E++ K DPL+ L+ I ++ TE+ +F K E +
Sbjct: 70 IGELELEYQAKFP--GDPLSVLREILIHVLESTVTEERRRLLMEIIFHKC--EFVGEMAV 125

Query: 114 IKKQVNKYREWIYSLVLSIFLE-------LEIEDPKVLSSLFLNIIDGLII 157
+++ Y + + + + I GL+
Sbjct: 126 VQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLME 176


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS16055  HTHTETR       51     2e-10    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 51.2 bits (122), Expect = 2e-10
Identities = 30/163 (18%), Positives = 55/163 (33%), Gaps = 12/163 (7%)

Query: 12 SVLHKSRYLFNKHGFHNVGVDRIVREAEVPKASFYNYFHSKERLIEICLHFQKDVLKEQV 71
+L + LF++ G + + I + A V + + Y +F K L + + E
Sbjct: 15 HILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGELE 74

Query: 72 HSI---IYIQKDLILREKLKKIFFLHTSLDGYYHLLFRAIFEIEKLYPAAYQVVIQYRHW 128
+LRE L I L +++ L I + + VV Q +
Sbjct: 75 LEYQAKFPGDPLSVLREIL--IHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQRN 132

Query: 129 LTTEVYKLLLTVKKDATKS-------DSDMFLFTLEGAIIQLL 164
L E Y + K ++ + + G I L+
Sbjct: 133 LCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLM 175


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS16065  TCRTETB       32     0.008    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 32.2 bits (73), Expect = 0.008
Identities = 29/147 (19%), Positives = 60/147 (40%), Gaps = 18/147 (12%)

Query: 45 LLLNGLLLAAAISIVHIVGMHAYHLFEAASSNVPLITLAFGISAVLSSVAIWLTSRFTLP 104
LLL G+++ S++ VG H++ + + A + V+ VA ++
Sbjct: 81 LLLFGIIINCFGSVIGFVG-HSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGK 139

Query: 105 IFRLILSSIIMGIGISASY---YVSMLGWNIDIYKKDYTSFLILFSVLIAMSGSGLAFLL 161
F LI S + MG G+ + + W S+L+L ++ ++ L
Sbjct: 140 AFGLIGSIVAMGEGVGPAIGGMIAHYIHW----------SYLLLIPMITIITV----PFL 185

Query: 162 AYKLKESERHRISLKLAFAVMMTLSIM 188
LK+ R + + ++M++ I+
Sbjct: 186 MKLLKKEVRIKGHFDIKGIILMSVGIV 212


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS16085  PYOCINKILLER  26     0.045    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 25.9 bits (56), Expect = 0.045
Identities = 17/69 (24%), Positives = 23/69 (33%), Gaps = 10/69 (14%)

Query: 37 SGSTHYTTTPPPQGAKHLNKVSTYGSQPLLKNPTSNSEQSSQDKDKVVQEVTNVTVEKGA 96
+G T A L T S P +NP+S + + V V +GA
Sbjct: 395 TGLYEVTVPSTTAEAPPLILTWTPASPPGNQNPSSTTPVVPK----------PVPVYEGA 444

Query: 97 PAVPVPPAP 105
PV P
Sbjct: 445 TLTPVKATP 453


71  A4U85_RS16125  A4U85_RS16165  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS16125  -1      18       1.001391   enoyl-ACP reductase
A4U85_RS16130  1       17       0.502456   Bax inhibitor-1/YccA family protein
A4U85_RS16135  2       14       -0.125544  oligoribonuclease
A4U85_RS16140  1       14       1.142309   ribosome small subunit-dependent GTPase A
A4U85_RS16145  1       15       0.659574   rhodanese-like domain-containing protein
A4U85_RS16150  1       14       0.027080   glutaredoxin 3
A4U85_RS16155  2       15       0.547050   protein-export chaperone SecB
A4U85_RS16160  1       15       0.945180   4'-phosphopantetheinyl transferase superfamily
A4U85_RS16165  1       18       1.168663   ribosome biogenesis GTPase Der
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS16125  DHBDHDRGNASE  60     7e-13    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 60.4 bits (146), Expect = 7e-13
Identities = 66/263 (25%), Positives = 98/263 (37%), Gaps = 27/263 (10%)

Query: 16 LAGKRFLIAGVASKLSIAYGIAQALHREGAEL-AFTYPNEKLKKRVDEFAEQFGSKLVFP 74
+ GK I G A I +A+ L +GA + A Y EKL+K V + FP
Sbjct: 6 IEGKIAFITGAAQ--GIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 75 CDVAVDAEIDNAFAELAKHWDGVDGVVHSIGF---APAHTLDGDFTDVTDRDGFKIAHDI 131
DV A ID A + + +D +V+ G H+L +D +
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSL-------SDEEWEATFSVN 116

Query: 132 SAYSFVAMARAAKPLLQARQGCLLTLTYQGSERVMPNYNVMGMAKASLEAGVRYLASSLG 191
S F A +K ++ R G ++T+ + + +KA+ + L L
Sbjct: 117 STGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELA 176

Query: 192 VDGIRVNAISAGPIRTL-----------AASGIKSFRKMLDANEKVAPLKRNVTIEEVGN 240
IR N +S G T A IK + PLK+ ++ +
Sbjct: 177 EYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTG---IPLKKLAKPSDIAD 233

Query: 241 AALFLCSPWASGITGEILYVDAG 263
A LFL S A IT L VD G
Sbjct: 234 AVLFLVSGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS16130  RTXTOXINA     30     0.007    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 30.3 bits (68), Expect = 0.007
Identities = 12/51 (23%), Positives = 22/51 (43%), Gaps = 9/51 (17%)

Query: 25 GAIQKSVMLTIIAAAVGVALFFYAAFTANVGIAYAASIVGAIGGLVLALIT 75
GAI S+ + L A+ ++ + A S+VGA ++ +T
Sbjct: 362 GAIDASL------TTISTVL---ASVSSGISAAATTSLVGAPVSALVGAVT 403


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS16155  SECBCHAPRONE  155    3e-51    Bacterial protein-transport SecB chaperone protein ...
		>SECBCHAPRONE#Bacterial protein-transport SecB chaperone protein signature.

Length = 170

Score = 155 bits (393), Expect = 3e-51
Identities = 59/147 (40%), Positives = 95/147 (64%), Gaps = 3/147 (2%)

Query: 3 EEQQVQPQLALERIYTKDISFEVPGA-QVFTKQWQPELNINLSSAAEKIDPTHFEVSLKV 61
+ QP L ++RIY KD+SFE P +F + W+P+L+ +LS+ A+++ +EV L +
Sbjct: 12 TQATQQPVLQIQRIYVKDVSFEAPNLPHIFQQDWEPKLSFDLSTEAKQVGDDLYEVCLNI 71

Query: 62 VVQANNDNE--TAFIVDVTQSGIFLIDNIEEDRLPYILGAYCPNILFPFLREAVNDLVTK 119
V+ ++ AFI +V Q+G+F I +EE ++ + L + CPN+LFP+ RE V+ LV +
Sbjct: 72 SVETTMESSGDVAFICEVKQAGVFTISGLEEMQMAHCLTSQCPNMLFPYARELVSSLVNR 131

Query: 120 GSFPQLLLTPINFDAEFEANMQRAQAA 146
G+FP L L+P+NFDA F +QR + A
Sbjct: 132 GTFPALNLSPVNFDALFMDYLQRQEQA 158


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS16160  PF05704       28     0.026    Capsular polysaccharide synthesis protein
		>PF05704#Capsular polysaccharide synthesis protein

Length = 307

Score = 27.9 bits (62), Expect = 0.026
Identities = 20/119 (16%), Positives = 47/119 (39%), Gaps = 3/119 (2%)

Query: 3 SKHVDIYIGKLNDFLKRKQH-DFPDFRTFN-EYKKQQIKSIRNRLIVEKLQLISTELEFA 60
+ +I F+K K + Y K++ K + + + +++ E++
Sbjct: 178 LESETTHISNWLIFVKSKNDPFLVGLKNSMVTYLKKKEKPADYYIFHDFVSVMAVSKEYS 237

Query: 61 KHEHGKPYLLNHTLHFNHSHSQQYYALALSERIKDIGIDVEELDRKVRLDSLAQHAFHS 119
K+ PY+ N H Y ++ IK V++L K+ ++L ++ ++
Sbjct: 238 KYWKEIPYVNNVNPHMLQYLGNLPYDNSMFNYIKSTS-PVQKLTYKLDYNNLKRNTYYD 295


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS16165  TCRTETOQM     32     0.007    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 31.7 bits (72), Expect = 0.007
Identities = 29/141 (20%), Positives = 57/141 (40%), Gaps = 28/141 (19%)

Query: 177 LRLAIIGRPNVGKSTLVNRLL----GEDRVVAFDQPGTTRDSIYIPFER----------- 221
+ + ++ + GK+TL LL + + D+ T D+ + +R
Sbjct: 4 INIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSF 63

Query: 222 --EGRKYTLIDTAGVRRKGKVDEMIEKFSIVKTLQAMKDAHVVVVVVDAREGIVEQDLHL 279
E K +IDT G +D + E + ++L + A ++++ A++G+ Q L
Sbjct: 64 QWENTKVNIIDTP-----GHMDFLAE---VYRSLSVLDGA---ILLISAKDGVQAQTRIL 112

Query: 280 IGYALEAGRAMVIAINKWDNM 300
+ G + INK D
Sbjct: 113 FHALRKMGIPTIFFINKIDQN 133


72  A4U85_RS16590  A4U85_RS16620  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS16590  -3      13       -1.306883  glutamate/aspartate:proton symporter GltP
A4U85_RS16595  -3      12       -0.306714  endonuclease/exonuclease/phosphatase family
A4U85_RS16600  -2      12       0.357848   aspartate-semialdehyde dehydrogenase
A4U85_RS16605  -2      11       -0.016310  hypothetical protein
A4U85_RS16610  -2      11       0.360517   hypothetical protein
A4U85_RS16615  0       11       0.944828   asparaginase
A4U85_RS16620  -1      12       1.473194   tRNA pseudouridine(38-40) synthase TruA
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS16590  V8PROTEASE    32     0.005    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 31.9 bits (72), Expect = 0.005
Identities = 7/41 (17%), Positives = 18/41 (43%)

Query: 293 AYGAPKAISSFVIPTGYSFNLDGSTLYQSIAAIFIAQLYGI 333
+ A ++ + TGY + +T+++S I + +
Sbjct: 186 SNNAETQVNQNITVTGYPGDKPVATMWESKGKITYLKGEAM 226


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS16595  56KDTSANTIGN  29     0.027    Rickettsia 56kDa type-specific antigen protein sign...
		>56KDTSANTIGN#Rickettsia 56kDa type-specific antigen protein signature.

Length = 533

Score = 29.2 bits (65), Expect = 0.027
Identities = 15/38 (39%), Positives = 23/38 (60%)

Query: 333 EQTQADEEEAQAAIQEGIAKAEKEEKIVTDEIAQPYKE 370
+Q Q +++AQA QE +A A +D+IAQ YK+
Sbjct: 343 QQGQGQQQQAQATAQEAVAAAAVRLLNGSDQIAQLYKD 380


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS16600  FLGPRINGFLGI  29     0.028    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 29.1 bits (65), Expect = 0.028
Identities = 18/50 (36%), Positives = 28/50 (56%), Gaps = 6/50 (12%)

Query: 288 LDEIEDMIRNSNQWAKVVPNTREASM-----TDLTPVAVT-GTLTVPVGR 331
+ EIE++ ++ AKVV N R ++ ++ VAV+ GTLTV V
Sbjct: 247 MAEIENLTVETDTPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTE 296


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS16605  TACYTOLYSIN   31     0.007    Bacterial thiol-activated pore-forming cytolysin sig...
		>TACYTOLYSIN#Bacterial thiol-activated pore-forming cytolysin signature.

Length = 574

Score = 31.5 bits (71), Expect = 0.007
Identities = 34/145 (23%), Positives = 55/145 (37%), Gaps = 23/145 (15%)

Query: 150 IALNLPESTQYTPTENTSNVAETTGNEH----NLNISTGTA-----PALNSSSPAPVAPF 200
I NL + + +NT+N TT NE + ++T A LNS+ +AP
Sbjct: 25 IVGNLVTANADSNKQNTANTETTTTNEQPKPESSELTTEKAGQKMDDMLNSNDMIKLAP- 83

Query: 201 TEKPSTQTPQPTIATMNIESSAVEKQPSTSLKSSGGNQTTTVKKAAQSKTAQQKNVLSKN 260
M +ES+ E++ S K S + T + S + VL+KN
Sbjct: 84 -------------KEMPLESAEKEEKKSEDNKKSEEDHTEEINDKIYSLNYNELEVLAKN 130

Query: 261 ASKKQKISTYKGPAPTGKYIVQRNE 285
+ +G K+IV +
Sbjct: 131 GETIENFVPKEGVKKADKFIVIERK 155


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS16620  TYPE3OMGPROT  34     4e-04    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 34.1 bits (78), Expect = 4e-04
Identities = 19/92 (20%), Positives = 34/92 (36%), Gaps = 21/92 (22%)

Query: 13 IQYRGWQTQQPGVASVQETI--ERVLSKIADEPITL-HGAGRTDAGVHATNMVAHFDTNA 69
I YR + PGVA++ + + + + ++ + + A R A + A NA
Sbjct: 199 IHYRDDEVAAPGVATILQRVLSDATIQQVTVDNQRIPQAATRASA---QARVEADPSLNA 255

Query: 70 I----RPERGWIMGANSQLPKDISIQWIKQMD 97
I PER + + I +D
Sbjct: 256 IIVRDSPER-----------MPMYQRLIHALD 276


73  A4U85_RS17060  A4U85_RS17120  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS17060  1       16       2.019791   translation initiation factor IF-2
A4U85_RS17065  -1      10       1.135179   transcription termination/antitermination
A4U85_RS17070  -1      10       0.861393   ribosome maturation factor RimP
A4U85_RS17085  -1      11       0.959404   **preprotein translocase subunit SecG
A4U85_RS17090  0       11       0.699558   triose-phosphate isomerase
A4U85_RS17095  -2      11       1.096987   type IV-A pilus assembly ATPase PilB
A4U85_RS17100  0       11       1.046576   type II secretion system F family protein
A4U85_RS17105  0       13       0.234493   prepilin peptidase
A4U85_RS17110  -2      12       0.513566   dephospho-CoA kinase
A4U85_RS17115  -3      12       0.750767   DMT family transporter
A4U85_RS17120  -2      13       1.130767   23S rRNA
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS17060  TCRTETOQM     83     3e-18    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 82.6 bits (204), Expect = 3e-18
Identities = 81/391 (20%), Positives = 128/391 (32%), Gaps = 103/391 (26%)

Query: 406 IMGHVDHGKTSLLDRIRRSKVAAGEAG------------------GITQHIGAYHVETDK 447
++ HVD GKT+L + + + A E G GIT G + +
Sbjct: 8 VLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQWEN 67

Query: 448 GIITFLDTPGHAAFTSMRARGAKATDIVVLVVAADDGVMPQTAEAIDHARAAGTPIIVAI 507
+ +DTPGH F + R D +L+++A DGV QT R G P I I
Sbjct: 68 TKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIPTIFFI 127

Query: 508 NKMDKESADPDRVL---------------------NELTTKEIVPEEW------------ 534
NK+D+ D V N T E+W
Sbjct: 128 NKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEGNDDLLE 187

Query: 535 -----------------------GGDVPVAKVSAHTGQGIDELLDLILIQSELMELKASA 571
PV SA GID L+++I ++
Sbjct: 188 KYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVIT--NKFYSSTHRG 245

Query: 572 EGAAQGVVIEARVDKGRGAVTSILVQNGTLNIGDLVLAGSSYGRVRAMSDENGKPIKSAG 631
+ G V + + R + I + +G L++ D V +S++ I
Sbjct: 246 QSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVR----------ISEKEKIKITEMY 295

Query: 632 PSIPVEILGLPEAPMAGDEVLVVNDEKKAREVADARADREREKRIERQSAMRLENIMASM 691
SI E+ + +A +G+ V++ N+ K V + +RIE
Sbjct: 296 TSINGELCKIDKA-YSGEIVILQNEFLKLNSVLGDTKLLPQRERIENPL----------- 343

Query: 692 GKKDVPTVNVVLRTDVRGTLEALNAALHELS 722
P + + E L AL E+S
Sbjct: 344 -----PLLQTTVEPSKPQQREMLLDALLEIS 369


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS17085  SECGEXPORT    96     2e-29    Protein-export SecG membrane protein signature.
		>SECGEXPORT#Protein-export SecG membrane protein signature.

Length = 110

Score = 96.2 bits (239), Expect = 2e-29
Identities = 45/98 (45%), Positives = 66/98 (67%)

Query: 1 MHSFVLVVHIILAVLMIALILVQHGKGADAGASFGGGGAATVFGASGSGNFLTRVTAILT 60
M+ +LVV +I+A+ ++ LI++Q GKGAD GASFG G +AT+FG+SGSGNF+TR+TA+L
Sbjct: 1 MYEALLVVFLIVAIGLVGLIMLQQGKGADMGASFGAGASATLFGSSGSGNFMTRMTALLA 60

Query: 61 ALFFVTSLTLAIFAKKQTTEAYSLKTVQTTAPAQTTSP 98
LFF+ SL L +T + + + A + T P
Sbjct: 61 TLFFIISLVLGNINSNKTNKGSEWENLSAPAKTEQTQP 98


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS17100  BCTERIALGSPF  398    e-139    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 398 bits (1024), Expect = e-139
Identities = 119/409 (29%), Positives = 219/409 (53%), Gaps = 12/409 (2%)

Query: 9 MPTFAYEGVDRKGVKIKGELPAKNMALAKVTLRKQGVTVRNIREKRKNILEG-------L 61
M + Y+ +D +G K +G A + A+ LR++G+ ++ E R + +
Sbjct: 1 MAQYHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLSLR 60

Query: 62 FKKKVTTLDITIFTRQLATMMKAGVPLVQGFEIVAEGLENPAMREVVLGIKGEVEGGSTF 121
K +++T D+ + TRQLAT++ A +PL + + VA+ E P + +++ ++ +V G +
Sbjct: 61 RKIRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSL 120

Query: 122 ASALRKYPQHFDNLFCSLVESGEQSGALETMLDRVAIYKEKSELLKQKIKKAMKYPATVI 181
A A++ +P F+ L+C++V +GE SG L+ +L+R+A Y E+ + ++ +I++AM YP +
Sbjct: 121 ADAMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLT 180

Query: 182 VVAIVVTIILMVKVVPVFQDLFASFGADLPAFTQMVVNMSKWMQEY--WFIMIIAIGAVI 239
VVAI V IL+ VVP + F LP T++++ MS ++ + W ++ + G +
Sbjct: 181 VVAIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMA 240

Query: 240 AAFLEAKKRSKKFRDGLDKLALKLPIFGDLVYKAIIARYSRTLATTFAAGVPLIDALEST 299
+ R +K R + L LP+ G + ARY+RTL+ A+ VPL+ A+ +
Sbjct: 241 FRVM---LRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRIS 297

Query: 300 AGATNNVIYEKAVMKIREDVATGQQLQFAMRVSNRFPSMAIQMVAIGEESGALDSMLDKV 359
+N + + V G L A+ + FP M M+A GE SG LDSML++
Sbjct: 298 GDVMSNDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERA 357

Query: 360 ATYYENEVDNAVDGLTSMMEPLIMAILGVLVGGLVIAMYLPIFQMGSVV 408
A + E + + + EPL++ + +V +V+A+ PI Q+ +++
Sbjct: 358 ADNQDREFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLM 406


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS17105  PREPILNPTASE  320    e-112    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 320 bits (823), Expect = e-112
Identities = 148/286 (51%), Positives = 188/286 (65%), Gaps = 2/286 (0%)

Query: 1 MQDIIAYFIQNLTALYIAVALVSLCIGSFLNVVIYRTPRMMEQDWQQECQMLLNPEQPII 60
M ++ + V L SL IGSFLNVVI+R P M+E++WQ E + NP+ +
Sbjct: 1 MALLLELAHGLPWLYFSLVFLFSLMIGSFLNVVIHRLPIMLEREWQAEYRSYFNPDDEGV 60

Query: 61 DHEKLTLSKPASSCPACQQPIRWYQNIPVISWLVLRGKCGHCQHPISIRYPVIELLTMLC 120
D L P S CP C PI +NIP++SWL LRG+C CQ PIS RYP++ELLT L
Sbjct: 61 DEPPYNLMVPRSCCPHCNHPITALENIPLLSWLWLRGRCRGCQAPISARYPLVELLTALL 120

Query: 121 SLVVVMMFGPTIQMLFGLVLTWVLIALTFIDFDTQLLPDRFTLPLAALGLGINTFNIYTS 180
S+ V M P L L+LTWVL+ALTFID D LLPD+ TLPL GL N + S
Sbjct: 121 SVAVAMTLAPGWGTLAALLLTWVLVALTFIDLDKMLLPDQLTLPLLWGGLLFNLLGGFVS 180

Query: 181 PNSAIWGYLIGFLCLWIVYYLFKVITGKEGMGYGDFKLLAALGAWMGPLMLPLIVLLSSL 240
A+ G + G+L LW +Y+ FK++TGKEGMGYGDFKLLAALGAW+G LP+++LLSSL
Sbjct: 181 LGDAVIGAMAGYLVLWSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALPIVLLLSSL 240

Query: 241 LGAIIGIILLKLRNDN--QPFAFGPYIAIAGWVAFLWGDQIMKIYL 284
+GA +GI L+ LRN + +P FGPY+AIAGW+A LWGD I + YL
Sbjct: 241 VGAFMGIGLILLRNHHQSKPIPFGPYLAIAGWIALLWGDSITRWYL 286


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS17120  INVEPROTEIN   34     6e-04    Salmonella/Shigella invasion protein E (InvE) signat...
		>INVEPROTEIN#Salmonella/Shigella invasion protein E (InvE) signature.

Length = 372

Score = 33.6 bits (76), Expect = 6e-04
Identities = 27/91 (29%), Positives = 45/91 (49%), Gaps = 9/91 (9%)

Query: 30 LKGRDDQRLQKILQLAEPFGISVQK-ASRDSLEKLAGL-PFHQGVVAAVRPHPTLNEKDL 87
L+ + ++IL+L ISV A D L + L P +V +R L KDL
Sbjct: 86 LEDEALPKAKQILKL-----ISVHGGALEDFLRQARSLFPDPSDLVLVLRE--LLRRKDL 138

Query: 88 DQLLAETPDALLLALDQVTDPHNLGACIRTA 118
++++ + ++LL +++ TDP L A I A
Sbjct: 139 EEIVRKKLESLLKHVEEQTDPKTLKAGINCA 169


74  A4U85_RS17330  A4U85_RS17385  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS17330  -2      13       0.850102   preprotein translocase subunit SecE
A4U85_RS17340  -2      12       0.883458   *elongation factor Tu
A4U85_RS17365  -2      12       0.126932   ****anthranilate synthase component I
A4U85_RS17370  -3      12       -0.350628  phosphoglycolate phosphatase
A4U85_RS17375  -2      12       -0.785013  FHA domain-containing protein
A4U85_RS17380  -2      12       -1.708892  type II secretion system secretin GspD
A4U85_RS17385  -1      14       -2.577950  general secretion pathway protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS17330  SECETRNLCASE  75     2e-20    Bacterial translocase SecE signature.
		>SECETRNLCASE#Bacterial translocase SecE signature.

Length = 127

Score = 75.3 bits (185), Expect = 2e-20
Identities = 45/126 (35%), Positives = 64/126 (50%), Gaps = 5/126 (3%)

Query: 21 SAEVVRSGSPLDIVLWVIAIALLLSATMVNQHLPAYWAPANDVWVRVGVIFACIVVALGL 80
+ E SG L+ + WV+ +ALLL A + N P +R + I A G+
Sbjct: 4 NTEAQGSGRGLEAMKWVVVVALLLVAIVGNYLYRDIMLP-----LRALAVVILIAAAGGV 58

Query: 81 LYATHQGKGFVRLLKDARVELRRVTWPTKQETVTTSWQVLLVVVVASLVLWCFDYGLGWL 140
T +GK V ++AR E+R+V WPT+QET+ T+ V V V SL+LW D L L
Sbjct: 59 ALLTTKGKATVAFAREARTEVRKVIWPTRQETLHTTLIVAAVTAVMSLILWGLDGILVRL 118

Query: 141 IKLIIG 146
+ I G
Sbjct: 119 VSFITG 124


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS17340  TCRTETOQM     78     1e-17    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 78.0 bits (192), Expect = 1e-17
Identities = 50/149 (33%), Positives = 77/149 (51%), Gaps = 5/149 (3%)

Query: 13 VNVGTIGHVDHGKTTLTAAI--ATICAKTYGGEAKDYSQIDSAPEEKARGITINTSHVEY 70
+N+G + HVD GKTTLT ++ + G K ++ D+ E+ RGITI T +
Sbjct: 4 INIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSF 63

Query: 71 DSPTRHYAHVDCPGHADYVKNMITGAAQMDGAILVCAATDGPMPQTREHILLSRQVGVPY 130
+D PGH D++ + + +DGAIL+ +A DG QTR R++G+P
Sbjct: 64 QWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIP- 122

Query: 131 IIVFLNKCDLVDDEELLELVEMEVRELLS 159
I F+NK D + L V +++E LS
Sbjct: 123 TIFFINKIDQNGID--LSTVYQDIKEKLS 149


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS17380  BCTERIALGSPD  427    e-142    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 427 bits (1100), Expect = e-142
Identities = 226/692 (32%), Positives = 337/692 (48%), Gaps = 73/692 (10%)

Query: 12 ALLAAAPLIATVSSSAYAQTWKINLRDADLTAFINEVADITGKNFAVDPRVRGNVTVISN 71
LL A L+ A A+ + + + D+ FIN V+ K +DP VRG +TV S
Sbjct: 13 TLLIFAALLFR---PAAAEEFSASFKGTDIQEFINTVSKNLNKTVIIDPSVRGTITVRSY 69

Query: 72 KPLNKDEVYDLFLGVLNVNGVVAIPSGN-TIKLVPDSNVKNSGIPYDSR-NRVRGDQIVT 129
LN+++ Y FL VL+V G I N +K+V + K + +P S GD++VT
Sbjct: 70 DMLNEEQYYQFFLSVLDVYGFAVINMNNGVLKVVRSKDAKTAAVPVASDAAPGIGDEVVT 129

Query: 130 RVIWLENTNPNDLIPALRPLMPQFAHMAAI--AGTNALIVSDRAANIYQLENIIRNLDGT 187
RV+ L N DL P LR L + + +N L+++ RAA I +L I+ +D
Sbjct: 130 RVVPLTNVAARDLAPLLRQLNDNAGVGSVVHYEPSNVLLMTGRAAVIKRLLTIVERVDNA 189

Query: 188 GQNDIEAITLQSSQAEEIITQLEAMSATGASKDFSGARI-RIIADNRTNRILIKGDPQTR 246
G + + L + A +++ + ++ + G+ + ++AD RTN +L+ G+P +R
Sbjct: 190 GDRSVVTVPLSWASAADVVKLVTELNKDTSKSALPGSMVANVVADERTNAVLVSGEPNSR 249

Query: 247 KRIRHMIEMLDVPSADRLGGLKVFRLKYASAKNLSEILQGLVTGQAVSSSNNSNNSSNSS 306
+RI MI+ LD A G KV LKYA A +L E+L G
Sbjct: 250 QRIIAMIKQLDRQQA-TQGNTKVIYLKYAKASDLVEVLTG-------------------- 288

Query: 307 NPINSLIGNNQNSGSNTSGSSGTSISTPAINLNGNSNSSNQNNITSFNQNGVSIIADNAQ 366
I + S + + I A
Sbjct: 289 ------ISSTMQSEKQAAKPVAAL------------------------DKNIIIKAHGQT 318

Query: 367 NSLVVKADPQLMREIESAIQQLDVRRQQVLIEAAIIEVSGKDADQLGVQWALGDINSGIG 426
N+L+V A P +M ++E I QLD+RR QVL+EA I EV D LG+QWA N G
Sbjct: 319 NALIVTAAPDVMNDLERVIAQLDIRRPQVLVEAIIAEVQDADGLNLGIQWA----NKNAG 374

Query: 427 LINFTNAGSSLASLAAGYLTGGAAG-LGSAIGAGSSIALGKYKEGADGSRQLYGALIQAL 485
+ FTN+G +++ AG G + S++ + S G A + + L+ AL
Sbjct: 375 MTQFTNSGLPISTAIAGANQYNKDGTVSSSLASALSSFNG---IAAGFYQGNWAMLLTAL 431

Query: 486 KENTASNLLSTPSIVTMDNEEAYIVVGQNVPFVTGSVTTNSTGINPYTTVERKDVGVTLK 545
+T +++L+TPSIVT+DN EA VGQ VP +TGS TT+ N + TVERK VG+ LK
Sbjct: 432 SSSTKNDILATPSIVTLDNMEATFNVGQEVPVLTGSQTTSGD--NIFNTVERKTVGIKLK 489

Query: 546 VIPHIGENGTVRLEIEQEVSNVQASKGQAA---DLITNKRAIKTAVLAEHGQTVVLGGLV 602
V P I E +V LEIEQEVS+V + + N R + AVL G+TVV+GGL+
Sbjct: 490 VKPQINEGDSVLLEIEQEVSSVADAASSTSSDLGATFNTRTVNNAVLVGSGETVVVGGLL 549

Query: 603 SDDVEFNRQGIPGLSSIPYLGRLFRSDTRSNTKRNLLVFIHPTIVGDANDVRRLSQQRYN 662
V +P L IP +G LFRS ++ +KRNL++FI PT++ D ++ R+ S +Y
Sbjct: 550 DKSVSDTADKVPLLGDIPVIGALFRSTSKKVSKRNLMLFIRPTVIRDRDEYRQASSGQYT 609

Query: 663 QLYSLQL-AMDKNGNFAKLPEQVDDIYNQKMT 693
Q K N A L + + +IY ++ T
Sbjct: 610 AFNDAQSKQRGKENNDAMLNQDLLEIYPRQDT 641
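
Hits with very small E-values are printed without a mantissa (Expect = e-142), which Python's float() does not accept as written. The following is a minimal sketch of a parser for the Score line, assuming the layout shown on this page. The helper names are hypothetical, and the 0.05 cutoff is the RPSBlast E-value listed under the input parameters of this report.

```python
import re

# E-value (RPSBlast) from the input parameters of this report.
E_VALUE_CUTOFF = 0.05

# Layout assumed from this page: bit score, raw score in parentheses, Expect.
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s+bits\s+\((\d+)\),\s*Expect\s*=\s*(\S+)"
)


def parse_score_line(line):
    """Return (bit_score, raw_score, e_value) from a BLAST-style Score line."""
    match = SCORE_RE.search(line)
    if match is None:
        raise ValueError(f"not a score line: {line!r}")
    bits, raw, expect = match.groups()
    if expect.startswith("e"):
        # "e-142" has no mantissa; read it as 1e-142 so float() can parse it.
        expect = "1" + expect
    return float(bits), int(raw), float(expect)


def within_cutoff(e_value, cutoff=E_VALUE_CUTOFF):
    """True if the hit is at or below the E-value cutoff."""
    return e_value <= cutoff


bits, raw, e_value = parse_score_line("Score = 427 bits (1100), Expect = e-142")
print(bits, raw, e_value, within_cutoff(e_value))
```

Reading a bare exponent as a mantissa of 1 is a convention chosen for this sketch; it keeps the value usable for sorting and threshold checks.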


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS17385  BCTERIALGSPC  63     1e-13    Bacterial general secretion pathway protein C signa...
		>BCTERIALGSPC#Bacterial general secretion pathway protein C signature.

Length = 272

Score = 62.7 bits (152), Expect = 1e-13
Identities = 57/274 (20%), Positives = 98/274 (35%), Gaps = 33/274 (12%)

Query: 19 LSVVVLAILILWLCWKLASFFWLVIAP---PQLMQFDRVELGSQQPQIPNIST-FSLFNE 74
+ ++ +L+L C +LA FW + P P QQP N T F + E
Sbjct: 14 IRRILFYLLMLLFCQQLAMIFWRIGLPDNAPVSSVQITPAQARQQPVTLNDFTLFGVSPE 73

Query: 75 P----------SANAAQESVNLELQGVMVGYPNRFSSAVIKIDNTAERYRVGETIGSTSY 124
+N ++NL L GVM G + S A+I DN V E + +
Sbjct: 74 KNKAGALDASQMSNLPPSTLNLSLTGVMAGDDDSRSIAIISKDNEQFSRGVNEEVPGYNA 133

Query: 125 QLAEVYWDHVVLSQGNGSTRELQFKGLPNGLYQPMTPDASQQSATPSQPTEPMNTAQQAL 184
++ + D VVL G Y+ + + + S + P +N Q
Sbjct: 134 KIVSIRPDRVVLQY--------------QGRYEVLGLYSQEDSGSDGVPGAQVNEQLQQR 179

Query: 185 GQAIQQMQGNREQYLRDMGVSGNSGEGYEVTERTPTALRNKLGLRPGDRIVSLNGQTVGQ 244
+ + D N +GY + + ++GL+ D V+LNG +
Sbjct: 180 ASTTMSDYVSFSPIMND-----NKLQGYRLNPGPKSDSFYRVGLQDNDMAVALNGLDLRD 234

Query: 245 GQTDVQLLEQARRAGQVKIEIKRGDQVMTIQQNF 278
+ + +E+ + ++R Q I F
Sbjct: 235 AEQAKKAMERMADVHNFTLTVERDGQRQDIYMEF 268


75  A4U85_RS17925  A4U85_RS17960  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS17925  -1      11       1.266314   zinc ABC transporter substrate-binding protein
A4U85_RS17930  -2      10       1.212069   transcriptional repressor
A4U85_RS17935  -1      11       1.040602   metal ABC transporter ATP-binding protein
A4U85_RS17940  -1      12       0.976358   metal ABC transporter permease
A4U85_RS17945  -1      11       0.488213   LysE family transporter
A4U85_RS17950  0       12       0.425230   Dyp-type peroxidase
A4U85_RS17955  -1      12       0.425653   NAD-dependent malic enzyme
A4U85_RS17960  0       12       -0.000200  arginine--tRNA ligase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS17925  ADHESNFAMILY  81     4e-20    Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 81.4 bits (201), Expect = 4e-20
Identities = 49/225 (21%), Positives = 79/225 (35%), Gaps = 21/225 (9%)

Query: 11 VVSTHPIYLIAKEITKGVEEPQLLLQ-GQSGHDVQLTPAHRKAINDASLVIWLGKAHE-- 67
V + I I K I + ++ GQ H+ + P K ++A L+ + G E
Sbjct: 36 VATNSIIADITKNIAGDKIDLHSIVPIGQDPHEYEPLPEDVKKTSEADLIFYNGINLETG 95

Query: 68 --APLNKLLSN-----NKKAIALLDSGILSILPQRNTRGAALPNTVDTHVWLEPNNAVRI 120
A KL+ N NK A+ S + ++ G D H WL N +
Sbjct: 96 GNAWFTKLVENAKKTENKDYFAV--SDGVDVIY---LEGQNEKGKEDPHAWLNLENGIIF 150

Query: 121 GFFIAALRSQQHPENKAKYWNNANTFARNMLQAAQAYDS-----SSNGKPYWSYHDAYQY 175
IA S + P NK Y N + + + + + K + A++Y
Sbjct: 151 AKNIAKQLSAKDPNNKEFYEKNLKEYTDKLDKLDKESKDKFNKIPAEKKLIVTSEGAFKY 210

Query: 176 LERSLNLKFAGALTDDPHVAPTAAQIKYLND-SRPKAQMCLLAES 219
++ + A + T QIK L + R L ES
Sbjct: 211 FSKAYGVPSAYIWEINTEEEGTPEQIKTLVEKLRQTKVPSLFVES 255


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS17935  PF05272       33     0.002    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.7 bits (74), Expect = 0.002
Identities = 13/31 (41%), Positives = 17/31 (54%), Gaps = 6/31 (19%)

Query: 31 KVDFALHENEIVTLIGPNGAGKSTLIKVLLG 61
K D+++ L G G GKSTLI L+G
Sbjct: 594 KFDYSV------VLEGTGGIGKSTLINTLVG 618


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS17955  ARGDEIMINASE  30     0.031    Bacterial arginine deiminase signature.
		>ARGDEIMINASE#Bacterial arginine deiminase signature.

Length = 409

Score = 29.8 bits (67), Expect = 0.031
Identities = 14/50 (28%), Positives = 23/50 (46%), Gaps = 5/50 (10%)

Query: 124 GVFISYPDRDVIDDILQNVNKNNVKVIVITDGERILGLGDQGIGGMGIPI 173
G I+Y R+ + + + +N +KV I E L G G M +P+
Sbjct: 360 GEIIAY-SRNHVTN--KLFEENGIKVHRIPSSE--LSRGRGGPRCMSMPL 404


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS17960  BLACTAMASEA   30     0.032    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 29.8 bits (67), Expect = 0.032
Identities = 19/76 (25%), Positives = 27/76 (35%), Gaps = 22/76 (28%)

Query: 92 NADQRF------------AILDQIQAQKESFGRSQSNAAKKIQVEFVSANPTSSLHVGHG 139
AD+RF A+L ++ A E R + + V +P S H+ G
Sbjct: 57 RADERFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDL----VDYSPVSEKHLADG 112

Query: 140 RGAAYGMTVANLLEAT 155
MTV L A
Sbjct: 113 ------MTVGELCAAA 122


76  A4U85_RS18405  A4U85_RS18455  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS18405  -1      12       -1.781168  Gfo/Idh/MocA family oxidoreductase
A4U85_RS18410  0       13       -1.891094  Vi polysaccharide biosynthesis
A4U85_RS18420  -1      11       -0.929747  polysaccharide biosynthesis/export family
A4U85_RS18425  -1      9        -0.968220  low molecular weight phosphotyrosine protein
A4U85_RS18430  -1      8        -0.538100  polysaccharide biosynthesis tyrosine autokinase
A4U85_RS18435  -2      12       0.735691   FKBP-type peptidyl-prolyl cis-trans isomerase
A4U85_RS18440  -2      11       1.048272   FKBP-type peptidyl-prolyl cis-trans isomerase
A4U85_RS18445  -1      11       0.425051   murein biosynthesis integral membrane protein
A4U85_RS18450  0       12       -0.870822  1,6-anhydro-N-acetylmuramyl-L-alanine amidase
A4U85_RS18455  0       11       -0.853251  carboxylating nicotinate-nucleotide
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18410  TCRTETOQM     29     0.035    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 28.7 bits (64), Expect = 0.035
Identities = 32/154 (20%), Positives = 58/154 (37%), Gaps = 17/154 (11%)

Query: 7 IGAAGYIAPRHLKAIKET-GNTLAVAMDVNDSVGIMDSHFPEAEFFTEFEE-----FEAY 60
I G + IKE + + V + ++F E+E + E E Y
Sbjct: 130 IDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEGNDDLLEKY 189

Query: 61 VEDQKLKGEKLD-----YVAICS--PNYLHAPHMKYALKNGIDVICEK---PLVLNSEDL 110
+ + L+ +L+ CS P Y + + N I+VI K +L
Sbjct: 190 MSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSSTHRGQSEL 249

Query: 111 NMLAEYEKQYGAKVNSILQLRLHPSIIALRDKVQ 144
++ +Y K + +RL+ ++ LRD V+
Sbjct: 250 CGKV-FKIEYSEKRQRLAYIRLYSGVLHLRDSVR 282


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18430  RTXTOXIND     33     0.004    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 33.3 bits (76), Expect = 0.004
Identities = 24/153 (15%), Positives = 54/153 (35%), Gaps = 23/153 (15%)

Query: 246 QGQDKEHITKVLNAILATYSAQ------NIERRSAESA----------QTLKFLDEQLPD 289
++ +T ++ +T+ Q N++++ AE + +L D
Sbjct: 180 SEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDD 239

Query: 290 LKKQLDDAERQFNKFRQQYNT-VDVTKESELYLTQSITLETKKAELEQKQAEMAAKYTAE 348
L + +Q N V+ E +Y +Q +E++ +++ + +
Sbjct: 240 FSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFK-- 297

Query: 349 HPAMREINGQLAAINKQIGELNSTLKQLPDVQR 381
EI +L IG L L + + Q+
Sbjct: 298 ----NEILDKLRQTTDNIGLLTLELAKNEERQQ 326


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18435  INFPOTNTIATR  144    5e-45    Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 144 bits (365), Expect = 5e-45
Identities = 80/218 (36%), Positives = 117/218 (53%), Gaps = 10/218 (4%)

Query: 29 TTEVGRKADKNASPIQKISYVLGYEVAQQTPP---ELDTKAFVQGIHDARNKQPSAYTQE 85
T A + K+SY +G ++ + +++ +G+ D + T+E
Sbjct: 17 TAMAATDATSLTTDKDKLSYSIGADLGKNFKNQGIDINPDVLAKGMQDGMSGAQLILTEE 76

Query: 86 DLKAAVAAYEKELQQK--MQHQDKPEQAGTATDSADAQFLAENKTKAGVKTTVSGLQYII 143
+K ++ ++K+L K + K E+ D+ FL+ NK+K G+ SGLQY I
Sbjct: 77 QMKDVLSKFQKDLMAKRSAEFNKKAEENKAKGDA----FLSANKSKPGIVVLPSGLQYKI 132

Query: 144 TKEGTGKQPTAQSIVKVHYEGRLINGQVFDSSYKRGQPVEFPLNQVIPGWTEGLQLMKEG 203
GTG +P V V Y G LI+G VFDS+ K G+P F ++QVIPGWTE LQLM G
Sbjct: 133 IDAGTGAKPGKSDTVTVEYTGTLIDGTVFDSTEKAGKPATFQVSQVIPGWTEALQLMPAG 192

Query: 204 GKATFFIPSNLAYGPQELPG-IPANSTLIFDVELISVK 240
F+P++LAYGP+ + G I N TLIF + LISVK
Sbjct: 193 STWEVFVPADLAYGPRSVGGPIGPNETLIFKIHLISVK 230


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18440  INFPOTNTIATR  178    4e-58    Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 178 bits (452), Expect = 4e-58
Identities = 93/225 (41%), Positives = 132/225 (58%), Gaps = 3/225 (1%)

Query: 11 VIAASTMSLSV---FAAAPITNKSPAKDQFSYSYGYLMGRNNTDALTDLNLDIFYQGLQE 67
++ A+ M L++ AA T+ + KD+ SYS G +G+N + D+N D+ +G+Q+
Sbjct: 5 LVTAAIMGLAMSTAMAATDATSLTTDKDKLSYSIGADLGKNFKNQGIDINPDVLAKGMQD 64

Query: 68 GAQNKTARLTDEEMAKAINDYKKTLEAKQLVEFQKQGQQNAQAGAAFLAENAKKSGVIAT 127
G LT+E+M ++ ++K L AK+ EF K+ ++N G AFL+ N K G++
Sbjct: 65 GMSGAQLILTEEQMKDVLSKFQKDLMAKRSAEFNKKAEENKAKGDAFLSANKSKPGIVVL 124

Query: 128 KSGLQYQVLKEGTGKTPKATSRVKVNYEGRLLDGTVFDSSIARNHPVDFQLNQVIAGWTE 187
SGLQY+++ GTG P + V V Y G L+DGTVFDS+ P FQ++QVI GWTE
Sbjct: 125 PSGLQYKIIDAGTGAKPGKSDTVTVEYTGTLIDGTVFDSTEKAGKPATFQVSQVIPGWTE 184

Query: 188 GLQTMKEGGKTRFFIPAKLAYGEVGAGDSIGPNSTLIFDIELLQV 232
LQ M G F+PA LAYG G IGPN TLIF I L+ V
Sbjct: 185 ALQLMPAGSTWEVFVPADLAYGPRSVGGPIGPNETLIFKIHLISV 229


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18445  ACRIFLAVINRP  31     0.012    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 31.3 bits (71), Expect = 0.012
Identities = 34/167 (20%), Positives = 60/167 (35%), Gaps = 41/167 (24%)

Query: 218 IPPKVDFKHEGVERILKL---MLPALFGVSVTQINLLLNTIWASFMQDGSVSWLYSAERM 274
+P + + G+ +L PAL +S + L L ++ S+ SV M
Sbjct: 850 LPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV--------M 901

Query: 275 TELPLGLIGVAIGTVILPSLSARHAEQDQAKFRSMIDWAAKV--IVLVGLPASIALFMLS 332
+PLG++GV + + F D V + +GL A A+ ++
Sbjct: 902 LVVPLGIVGVLLAATL---------------FNQKNDVYFMVGLLTTIGLSAKNAILIVE 946

Query: 333 ----------TPIIQALFQRGEFDLRDTQMTALALQCMSAGVISFML 369
+++A LR MT+LA GV+ +
Sbjct: 947 FAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLA---FILGVLPLAI 990


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18455  PF07328       29     0.012    T-DNA border endonuclease VirD1
		>PF07328#T-DNA border endonuclease VirD1

Length = 144

Score = 28.9 bits (64), Expect = 0.012
Identities = 9/45 (20%), Positives = 16/45 (35%)

Query: 58 VNALISAYDNTVQVTWLKQEGDRVAANEAFLKLAGSARSLLTVER 102
+N + A + T + +R KL+ L+ V R
Sbjct: 85 INQIAKAANRTHDPAYHSFMAERKVLGLELSKLSAVLAPLMEVSR 129


77  A4U85_RS18730  A4U85_RS18770  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS18730  -1      13       0.488772   TetR family transcriptional regulator
A4U85_RS18735  -1      11       0.747526   TetR/AcrR family transcriptional regulator
A4U85_RS18740  1       11       2.220819   thiol:disulfide interchange protein DsbA/DsbL
A4U85_RS18745  1       12       1.763175   bifunctional 3-demethylubiquinone
A4U85_RS18750  -1      10       1.582245   HAD-IA family hydrolase
A4U85_RS18755  -4      11       2.176244   YciK family oxidoreductase
A4U85_RS18760  -3      14       2.837170   RcnB family protein
A4U85_RS18765  -3      14       2.750115   RcnB family protein
A4U85_RS18770  -2      12       2.581596   amino-acid N-acetyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18730  HTHTETR       53     1e-10    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 52.7 bits (126), Expect = 1e-10
Identities = 23/72 (31%), Positives = 39/72 (54%), Gaps = 1/72 (1%)

Query: 6 ERKQQSRQALLDAALHLSTSGRSFSSISLREVAREVGLVPTAFYRHFQDMDELGKELVDQ 65
+ Q++RQ +LD AL L S + SS SL E+A+ G+ A Y HF+D +L E+ +
Sbjct: 7 QEAQETRQHILDVALRL-FSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWEL 65

Query: 66 VALHLKSVLHQL 77
++ + +
Sbjct: 66 SESNIGELELEY 77


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18735  HTHTETR       56     8e-12    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 55.8 bits (134), Expect = 8e-12
Identities = 17/64 (26%), Positives = 32/64 (50%), Gaps = 1/64 (1%)

Query: 12 RKEKILSVAEKLLLENN-QEITLDELVAELDIAKGTLYKHFRSKNELLLELIIQNEKQIL 70
++ IL VA +L + +L E+ + +G +Y HF+ K++L E+ +E I
Sbjct: 12 TRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIG 71

Query: 71 EISQ 74
E+
Sbjct: 72 ELEL 75


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18740  BLACTAMASEA   29     0.014    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 28.6 bits (64), Expect = 0.014
Identities = 14/49 (28%), Positives = 19/49 (38%), Gaps = 7/49 (14%)

Query: 63 EPHMQTWLKQIPSDVRFVRTPAAMNKVWEQGARTYYTSEALGVRKRTHL 111
E + +P D R TPA+M R TS+ L R + L
Sbjct: 162 ETELNEA---LPGDARDTTTPASMAATL----RKLLTSQRLSARSQRQL 203


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18755  DHBDHDRGNASE  87     8e-23    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 87.4 bits (216), Expect = 8e-23
Identities = 54/203 (26%), Positives = 93/203 (45%), Gaps = 6/203 (2%)

Query: 13 LKDRIILITGAGDGIGRAAALSYALHGATVVLHGRTLNKLEVIYDEIEGLGAPQPAILPL 72
++ +I ITGA GIG A A + A GA + KLE + ++ A P
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEA-FPA 64

Query: 73 QLSSASDRDYDFLVSTLEKQFGRLDGILHNAGILGERVELAH-YPAEVWDDVMAVNLRAP 131
+ ++ D + + +E++ G +D +++ AG+L R L H E W+ +VN
Sbjct: 65 DVRDSAA--IDEITARIEREMGPIDILVNVAGVL--RPGLIHSLSDEEWEATFSVNSTGV 120

Query: 132 FALTQALLPLLQKSENASVVFTSSGVGREARALWGAYSVSKVAIEAVSKIFAAEHTYPNI 191
F ++++ + + S+V S R AY+ SK A +K E NI
Sbjct: 121 FNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNI 180

Query: 192 RFNCINPGATRTAMRAKAYPEED 214
R N ++PG+T T M+ + +E+
Sbjct: 181 RCNIVSPGSTETDMQWSLWADEN 203


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18770  SACTRNSFRASE  30     0.014    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 29.9 bits (67), Expect = 0.014
Identities = 23/85 (27%), Positives = 36/85 (42%), Gaps = 10/85 (11%)

Query: 367 RSAEIACVAVHPSYRKSNRGSQILQFLEEKAKQQGIRQLFVLTTR----TAHWFLEHGFH 422
A I +AV YRK G+ +L E AK+ L + T H++ +H F
Sbjct: 88 GYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFI 147

Query: 423 QVSVD-----DLPNAR-QALYNYQR 441
+VD + P A A++ Y +
Sbjct: 148 IGAVDTMLYSNFPTANEIAIFWYYK 172


78  A4U85_RS18875  A4U85_RS18910  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
A4U85_RS18875  1       18       -1.123716  alpha/beta hydrolase
A4U85_RS18880  3       22       -0.177956  hypothetical protein
A4U85_RS18885  3       21       -1.954168  hypothetical protein
A4U85_RS18890  4       18       -2.890874  MBL fold metallo-hydrolase
A4U85_RS18895  4       17       -3.508760  nucleotide exchange factor GrpE
A4U85_RS18900  2       17       -2.474095  molecular chaperone DnaK
A4U85_RS18905  1       16       -2.026998  matrixin family metalloprotease
A4U85_RS18910  -2      14       -2.217215  matrixin family metalloprotease
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18875  SACTRNSFRASE  28     0.036    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 28.0 bits (62), Expect = 0.036
Identities = 15/73 (20%), Positives = 23/73 (31%), Gaps = 18/73 (24%)

Query: 84 AIRQKPTDIELVEDI------------RLPLQSGTIFARHYHPA------PNKKLPLIVF 125
IR L+EDI L +A+ H + + F
Sbjct: 81 KIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHF 140

Query: 126 YHGGGFVVGGLDT 138
Y F++G +DT
Sbjct: 141 YAKHHFIIGAVDT 153


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18905  SHAPEPROTEIN  141    2e-39    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 141 bits (358), Expect = 2e-39
Identities = 79/380 (20%), Positives = 141/380 (37%), Gaps = 69/380 (18%)

Query: 5 IGIDLGTTNSCVAVLEGDKVKVIENAEGARTTPSIIAYKDGEILVGQSAKRQAVTNPKNT 64
+ IDLGT N+ + V V + R + VG AK+ P N
Sbjct: 13 LSIDLGTANTLIYVKGQGIVLNEPSVVAIRQDRA--GSPKSVAAVGHDAKQMLGRTPGN- 69

Query: 65 LFAIKRLIGRRYEDQAVQKDIGLVPYKIIKADNGDAWVEVNDKKLAPQQISAEILKK-MK 123
+ AI+ P K D +A ++ ++L+ +K
Sbjct: 70 IAAIR-------------------PMK--------------DGVIADFFVTEKMLQHFIK 96

Query: 124 KTAEDYLGETVTEAVITVPAYFNDAQRQATKDAGKIAGLDVKRIINEPTAAALAFGMDKK 183
+ + ++ VP +R+A +++ + AG +I EP AAA+ G+
Sbjct: 97 QVHSNSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGAREVFLIEEPMAAAIGAGLPVS 156

Query: 184 EGDRKVAVYDLGGGTFDVSIIEIADLDGDQQIEVLSTNGDTFLGGEDFDNALIEYLVEEF 243
E V D+GGGT +V++I + + + +GG+ FD A+I Y+ +
Sbjct: 157 E-ATGSMVVDIGGGTTEVAVISLNG---------VVYSSSVRIGGDRFDEAIINYVRRNY 206

Query: 244 KKEQNVNLKNDPLALQRLKEAAEKAKIELSSS----NATEINLPYITADATGPKHLVINV 299
+ AE+ K E+ S+ EI + P+ +N
Sbjct: 207 G-------------SLIGEATAERIKHEIGSAYPGDEVREIEVRGRNLAEGVPRGFTLN- 252

Query: 300 TRAKLEGLVADLVARTIEPCKIALKD-AGLSTSDISD--VILVGGQSRMPLVQQKVQEFF 356
+ LE L + + + +AL+ SDIS+ ++L GG + + + + + E
Sbjct: 253 SNEILEALQ-EPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNLDRLLMEET 311

Query: 357 GREPRKDVNPDEAVAIGAAI 376
G +P VA G
Sbjct: 312 GIPVVVAEDPLTCVARGGGK 331


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18910  FRAGILYSIN    41     4e-06    Fragilysin metallopeptidase (M10C) enterotoxin signat...
		>FRAGILYSIN#Fragilysin metallopeptidase (M10C) enterotoxin signature.

Length = 405

Score = 40.8 bits (95), Expect = 4e-06
Identities = 19/49 (38%), Positives = 27/49 (55%), Gaps = 3/49 (6%)

Query: 251 TLAHEFGHALGLKHTDDPKSLMYPLLREQDIHNFKLTNSDLDLLATLYG 299
+AHE GH LG +HTD+ K LMY H L+ ++D++A G
Sbjct: 353 VMAHELGHILGAEHTDNSKDLMYATFTGYLSH---LSEKNMDIIAKNLG 398


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
A4U85_RS18915  FRAGILYSIN    36     1e-04    Fragilysin metallopeptidase (M10C) enterotoxin signat...
		>FRAGILYSIN#Fragilysin metallopeptidase (M10C) enterotoxin signature.

Length = 405

Score = 36.2 bits (83), Expect = 1e-04
Identities = 13/24 (54%), Positives = 17/24 (70%)

Query: 253 TIAHELGHALGLKHSDQPGALMYS 276
+AHELGH LG +H+D LMY+
Sbjct: 353 VMAHELGHILGAEHTDNSKDLMYA 376



 
Contact Sachin Pundhir for Bugs/Comments.