> From Thomas....
>
> [Konrad, can you take a shot at this question?]

I'll try...

> What python data type would you recommend for the class representation
> of a nucleotide sequence ?
> - string, list or array (module) ?
> I am not (yet) familiar with the performance questions of python types, but
> I got the impression that lists are very slow - and I have no idea how the
> array module is implemented. (btw I used strings in Tcl)

The main question is what operations you want to perform on nucleotide
sequences. Here are some considerations:

- Strings are compact and benefit from a large range of string
  operations (in module "string"). However, elements can only be
  characters, and strings are immutable, i.e. they cannot be changed
  once created, so any modification requires constructing a new
  string. But being immutable can be an advantage as well, e.g.
  you can use strings as keys in dictionaries.

- Lists can store any data type, and can be modified in a very
  general way (including insertion of lists etc.), but there are
  fewer operations available on them.

- Tuples are just immutable lists.

- Arrays don't seem to be very useful for non-numerical data, with
  two exceptions: they can most easily be accessed from C modules,
  and they facilitate certain structural operations.

In terms of performance, there is not much difference between these
types for basic operations (creation, indexing, etc.). The main concern
should be to use as many built-in operations as possible for typical
manipulations; any piece of Python code is much slower than a simple
call to a built-in function implemented in C!

So the first thing to do is to find out which operations are to be
performed on nucleotide sequences, and which of them occur most
frequently.

Konrad.
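P.S. To make the considerations above a bit more concrete, here is a
minimal sketch. It assumes a current Python, where string operations are
available as methods on the string itself rather than only through module
"string". The example sequence, the variable names and the helper
count_g_loop are all made up for illustration.

```python
from array import array

seq = "ATGCGTAC"  # hypothetical example sequence

# Strings are immutable: a "modification" builds a new string,
# and the original is left untouched.
mutated = seq[:2] + "T" + seq[3:]

# Immutability makes strings hashable, so they can be dictionary keys.
counts = {seq: 1}

# Lists are mutable and allow in-place changes,
# including insertion of a whole list via slice assignment.
bases = list(seq)
bases[2] = "T"
bases[4:4] = ["N", "N"]

# Tuples behave like immutable lists and can also serve as keys.
codon = tuple(seq[:3])

# Arrays hold compact typed data; here one unsigned byte per base.
packed = array("B", seq.encode("ascii"))


def count_g_loop(s):
    """Count 'G' with an explicit Python loop (the slow way)."""
    n = 0
    for ch in s:
        if ch == "G":
            n += 1
    return n


# Same result, but the built-in does the work in C.
g_builtin = seq.count("G")
g_loop = count_g_loop(seq)
```

Both counts agree, but on long sequences the built-in call should be
considerably faster than the loop. How much faster depends on the
interpreter and the data, so it is worth timing with realistic sequences
before settling on a representation.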
-- 
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen at cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------