[BiO BB] spell checker for biological words

Dan Bolser dmb at mrc-dunn.cam.ac.uk
Sun Jul 9 11:05:11 EDT 2006


Mike Marchywka wrote:
> On a related topic, does anyone know where to get good lists
> of chemical names (systematic and trivial)?
> I hunted around the IUPAC site for a while and could extract
> some for organic things but I really had to play with
> it and it isn't quite complete.
> 
> The FDA has some drug listings that are fairly easy to parse with
> bash to extract drug vocabularies.
> 
> While I don't need a spell checker, this does come up when
> you want to scan patents or SEC filings for word categories.
> There is probably something obvious on Google
> related to this but I haven't found it. Word categorization
> is probably a common interest in many text analysis issues.
> 
> If you are really looking for spell check algorithms,
> CiteSeer sometimes has some nice articles.
> 

You can grab lists of systematic protein and chemical names from KEGG 
(who provide nice versions of ENZYME and LIGAND databases). I zapped the 
systematic protein names through aspell, which makes things a lot easier.
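
As a rough sketch of that kind of preprocessing (not the exact steps used here), the snippet below splits a list of names into a de-duplicated, one-word-per-line vocabulary that could be handed to a spell checker as a custom word list. The function name `extract_words`, the `min_len` threshold, and the choice to keep only alphabetic tokens are assumptions made for illustration; the two sample names are ordinary enzyme systematic names, not a dump of any particular database.

```python
import re


def extract_words(names, min_len=3):
    """Split names into lowercase alphabetic tokens; return them sorted and unique.

    Digits, '+', ':' and '-' act as separators. That is an assumption: some
    spell checkers may reject tokens containing non-alphabetic characters.
    """
    words = set()
    for name in names:
        for token in re.findall(r"[A-Za-z]+", name):
            # Skip very short fragments such as the stereo descriptor 'D'.
            if len(token) >= min_len:
                words.add(token.lower())
    return sorted(words)


if __name__ == "__main__":
    # Illustrative input only; a real run would read names from a file.
    sample = [
        "ATP:D-hexose 6-phosphotransferase",
        "alcohol:NAD+ oxidoreductase",
    ]
    for word in extract_words(sample):
        print(word)
```

The output can be redirected to a file and imported into whatever dictionary format the spell checker expects.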




> 
> This seems to have come up on CPAN before:
> http://search.cpan.org/dist/Text-Aspell/Aspell.pm
> http://www.cpanforum.com/posts/2165
> 
> You could probably write a simple one in a few lines of Perl but
> I don't know offhand where to get a dictionary.
> Perl hashes do a lot of thrashing when they get too big
> and I've never figured out how to fix this (and I don't
> hold out a lot of hope with Cygwin either :)).
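
For what it's worth, the "few lines" idea quoted above might look something like the sketch below, written in Python rather than Perl. It assumes the dictionary is already available as a list of words, keeps it in a set for membership tests, and uses the standard library's `difflib.get_close_matches` to propose corrections. That function ranks candidates by a sequence-similarity ratio, which is a stand-in here and not the algorithm aspell or any other real spell checker uses. The name `check_words` and its parameters are made up for illustration.

```python
import difflib


def check_words(text_words, dictionary, max_suggestions=3):
    """Return {unknown_word: [suggestions]} for words missing from the dictionary."""
    known = {w.lower() for w in dictionary}
    candidates = sorted(known)
    report = {}
    for word in text_words:
        lowered = word.lower()
        if lowered not in known:
            # Rank dictionary entries by similarity to the unknown word.
            report[word] = difflib.get_close_matches(
                lowered, candidates, n=max_suggestions
            )
    return report


if __name__ == "__main__":
    # Tiny illustrative dictionary; a real one would be loaded from a file.
    dictionary = ["kinase", "ligase", "oxidoreductase"]
    print(check_words(["kinase", "kinsae"], dictionary))
```

Scanning the whole dictionary for every unknown word is slow for large word lists, so this is only reasonable for small vocabularies or occasional lookups.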
> 
> 
> 
> 
> -----Original Message-----
> From: bio_bulletin_board-bounces+mmarchywka=eyewonder.com at bioinformatics.org
> [mailto:bio_bulletin_board-bounces+mmarchywka=eyewonder.com at bioinformatics.org]
> On Behalf Of Deepan Chakravarthy
> Sent: Friday, July 07, 2006 02:37 PM
> To: bio_bulletin_board at bioinformatics.org
> Subject: [BiO BB] spell checker for biological words
> 
> 
> Hello,
>   I am hunting for an open-source biological spell checker. If anyone is
> familiar with an algorithm for writing one, please do comment.
> Thanks,
> Deepan
> Home Page: www.codeshepherd.com
> Fun Page: www.sudoku-solver.net/sudoku.html
> 
> _______________________________________________
> Bioinformatics.Org general forum  -  BiO_Bulletin_Board at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bio_bulletin_board
> 



