Revision
1630 -
Directory Listing
Modified
Fri Feb 15 11:54:22 2013 UTC
(9 years, 4 months ago)
by
jmduarteg
New feature: now blast xml parser can read either raw xml or gzipped xml files. Also HomologList will use xml.gz files only as cache files
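A minimal sketch of how a reader can accept both raw and gzipped XML. The log does not say how the project's parser tells the two apart; this version assumes detection by the gzip magic bytes (0x1f 0x8b) rather than by file extension. The class `MaybeGzip` and its methods are hypothetical names, not part of the project.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration only, not a class from the project.
final class MaybeGzip {

    /** Wraps the stream in a GZIPInputStream only if it starts with the gzip magic bytes 0x1f 0x8b. */
    static InputStream open(InputStream raw) throws IOException {
        BufferedInputStream in = new BufferedInputStream(raw);
        in.mark(2);              // remember the start so the peeked bytes can be re-read
        int b1 = in.read();
        int b2 = in.read();
        in.reset();              // rewind to the first byte
        boolean gzipped = (b1 == 0x1f && b2 == 0x8b);
        return gzipped ? new GZIPInputStream(in) : in;
    }

    /** Decodes raw or gzipped bytes to text (UTF-8 is an assumption of this sketch). */
    static String decode(byte[] data) {
        try (InputStream in = open(new ByteArrayInputStream(data))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Gzips text; present only so the example can be exercised without files. */
    static byte[] gzip(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

The stream returned by `open` can be handed to any XML parser that accepts an `InputStream`, so the parsing code itself does not need to know which form the file was stored in.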
Revision
1603 -
Directory Listing
Modified
Tue Jan 22 14:57:37 2013 UTC
(9 years, 5 months ago)
by
jmduarteg
Small fix to allow for underscore in blast query id
Revision
1602 -
Directory Listing
Modified
Tue Jan 22 10:19:21 2013 UTC
(9 years, 5 months ago)
by
jmduarteg
Small fix: moved up the naming of blast cache file
Revision
1601 -
Directory Listing
Modified
Tue Jan 22 08:28:36 2013 UTC
(9 years, 5 months ago)
by
jmduarteg
Small fix to be sure that the uniprot ver reported in error message is the actual one that the japi server returned, at the moment they seem to be having issues and 2 calls to the server return different versions!
Revision
1599 -
Directory Listing
Modified
Fri Nov 16 10:02:32 2012 UTC
(9 years, 7 months ago)
by
jmduarteg
Fixed important issues with searching and alignment of homologs:
- ids and coverage were calculated wrongly, based on all BlastHsps for a hit. Now based on single Hsps
- fixed bug in BlastHsp.getQueryCoverage: was miscalculating the coverage by 1 unit
- the useHspsOnly parameter in alignment didn't make much sense. We always need to use the hsp segments only: extending to the full sequence is dangerous and in any case makes no sense if the clustering is done on hsps only
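For illustration, a sketch of a per-HSP query coverage calculation. The log does not show the actual fix in BlastHsp.getQueryCoverage; this assumes 1-based inclusive query coordinates, where a typical off-by-one is omitting the `+ 1` when counting aligned positions. The class `HspCoverage` and its parameter names are hypothetical.

```java
// Hypothetical helper for illustration only, not the project's BlastHsp class.
final class HspCoverage {

    /**
     * Fraction of the query covered by a single HSP.
     * Coordinates are assumed to be 1-based and inclusive, so an HSP
     * spanning positions 11..60 covers 60 - 11 + 1 = 50 positions.
     */
    static double queryCoverage(int queryStart, int queryEnd, int queryLength) {
        if (queryLength <= 0 || queryStart < 1 || queryEnd < queryStart || queryEnd > queryLength) {
            throw new IllegalArgumentException("Inconsistent HSP coordinates");
        }
        int alignedPositions = queryEnd - queryStart + 1; // the "+ 1" is the easy part to forget
        return (double) alignedPositions / queryLength;
    }
}
```

Computing the value from one HSP's own start and end, rather than from all HSPs of a hit pooled together, matches the first item in the list above.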
Revision
1597 -
Directory Listing
Modified
Wed Nov 7 08:58:43 2012 UTC
(9 years, 7 months ago)
by
jmduarteg
Several changes related to blast:
- now supporting blast+ blastp.
- blast xml parser can now ignore the DTD url, avoiding unnecessary network access if desired
- refactoring: legacy blast related stuff is now refactored to contain "legacy", runBlastp refers to blast+
Also updated uniprot jar.
Revision
1593 -
Directory Listing
Modified
Wed Oct 31 11:43:11 2012 UTC
(9 years, 8 months ago)
by
jmduarteg
Now supporting clustalo as well as tcoffee for alignment of homologs
Revision
1570 -
Directory Listing
Modified
Thu Aug 9 16:07:22 2012 UTC
(9 years, 10 months ago)
by
jmduarteg
Now logging t_coffee total runtime
Revision
1564 -
Directory Listing
Modified
Wed Jul 25 16:12:03 2012 UTC
(9 years, 11 months ago)
by
jmduarteg
Minor fixes in entropy calculation
Revision
1560 -
Directory Listing
Modified
Wed May 23 10:04:05 2012 UTC
(10 years, 1 month ago)
by
jmduarteg
Now checking that alignment from cache contains actually what we are expecting
Revision
1558 -
Directory Listing
Modified
Mon May 14 14:14:13 2012 UTC
(10 years, 1 month ago)
by
jmduarteg
Solved a couple of small issues in HomologList
Revision
1557 -
Directory Listing
Modified
Thu May 10 08:15:52 2012 UTC
(10 years, 1 month ago)
by
jmduarteg
Fixed bug: for a uniref100 cluster member, was taking the representative's uniprot id/tax id instead of the member's. Database was not correctly modelled -> need also the tax ids for cluster members (and then to query the tax ids from members and not from representative)
Revision
1556 -
Directory Listing
Modified
Wed May 9 12:04:21 2012 UTC
(10 years, 1 month ago)
by
jmduarteg
New feature: now reducing redundancy via clustering with blastclust
Revision
1555 -
Directory Listing
Modified
Tue Apr 24 16:54:50 2012 UTC
(10 years, 2 months ago)
by
jmduarteg
Improved a bit the skimming strategy to be a bit less naive: now it skims more from the very high identities [e.g. it seems to improve case 3b37_1, a xtal contact]
Revision
1554 -
Directory Listing
Modified
Tue Apr 24 13:33:47 2012 UTC
(10 years, 2 months ago)
by
jmduarteg
Writing fasta tag with subinterval also for query
Revision
1553 -
Directory Listing
Modified
Tue Apr 24 13:26:04 2012 UTC
(10 years, 2 months ago)
by
jmduarteg
Now we can do alignments using hsp regions only (instead of always full sequences)
Revision
1552 -
Directory Listing
Modified
Tue Apr 24 09:04:01 2012 UTC
(10 years, 2 months ago)
by
jmduarteg
Fixed bug: redundancy (duplicate) elimination was done on whole uniprot sequences and not on HSPs
Revision
1551 -
Directory Listing
Modified
Tue Apr 24 08:07:13 2012 UTC
(10 years, 2 months ago)
by
jmduarteg
Introduced identicals elimination, removed the taxonomy grouping in removeRedundancy (didn't make any sense anymore without coding sequences)
Revision
1544 -
Directory Listing
Modified
Tue Mar 6 10:10:06 2012 UTC
(10 years, 4 months ago)
by
jmduarteg
Changed how uniprotver is read
Revision
1542 -
Directory Listing
Modified
Wed Feb 29 17:37:21 2012 UTC
(10 years, 4 months ago)
by
jmduarteg
Fixed issues with homologs with a uniparc id as reference, now the reference can be either uniprot or uniparc
Revision
1540 -
Directory Listing
Modified
Tue Feb 28 14:18:48 2012 UTC
(10 years, 4 months ago)
by
jmduarteg
Fixed bug in uniref xml parser: was not taking the right uniprot representative for old style (e.g. ver 1.0) xml files. They also contain several uniprot ids sometimes: first one being the active representative and the remaining being the inactive ones.
Revision
1538 -
Directory Listing
Modified
Sun Feb 26 15:12:05 2012 UTC
(10 years, 4 months ago)
by
jmduarteg
Implemented retrieval of uniprot KB data from local database
Revision
1535 -
Directory Listing
Modified
Fri Feb 24 13:29:26 2012 UTC
(10 years, 4 months ago)
by
jmduarteg
Restructured how the UniprotConnection is used. Now all uniprot connection stuff done within the class
Revision
1534 -
Directory Listing
Modified
Thu Feb 23 17:55:08 2012 UTC
(10 years, 4 months ago)
by
jmduarteg
Now the reference entry in HomologList is also a UnirefEntry. Not using anymore UniprotEntry
Revision
1533 -
Directory Listing
Modified
Thu Feb 23 16:31:32 2012 UTC
(10 years, 4 months ago)
by
jmduarteg
Major change: Homolog and HomologList are now using the new class UnirefEntry. Removed a lot of the code related to Ka/Ks calculation and EMBL CDS retrieval
Revision
1531 -
Directory Listing
Modified
Thu Feb 23 11:37:40 2012 UTC
(10 years, 4 months ago)
by
jmduarteg
More complete parsing: cluster members and inactive ids
Revision
1530 -
Directory Listing
Modified
Thu Feb 23 08:43:35 2012 UTC
(10 years, 4 months ago)
by
jmduarteg
Some refactoring
Revision
1523 -
Directory Listing
Modified
Wed Feb 8 10:46:23 2012 UTC
(10 years, 4 months ago)
by
jmduarteg
Implemented caching of alignment files
Revision
1508 -
Directory Listing
Modified
Tue Dec 13 15:59:12 2011 UTC
(10 years, 6 months ago)
by
jmduarteg
Adding flag to use uniparc or not
Revision
1507 -
Directory Listing
Modified
Tue Dec 13 14:47:32 2011 UTC
(10 years, 6 months ago)
by
jmduarteg
Now supporting both uniprot and uniparc entries in HomologList
Revision
1506 -
Directory Listing
Modified
Mon Dec 12 10:54:04 2011 UTC
(10 years, 6 months ago)
by
jmduarteg
New methods for filtering by domain of life
Revision
1505 -
Directory Listing
Modified
Wed Dec 7 17:30:05 2011 UTC
(10 years, 7 months ago)
by
jmduarteg
Now UniprotHomologList contains a full list and a filtered list, making it more flexible
Revision
1504 -
Directory Listing
Modified
Wed Dec 7 10:35:38 2011 UTC
(10 years, 7 months ago)
by
jmduarteg
Now catching OutOfMemory errors and continuing in redundancy removal procedure
Revision
1503 -
Directory Listing
Modified
Wed Dec 7 08:43:11 2011 UTC
(10 years, 7 months ago)
by
jmduarteg
New method to produce statistics of the sequence entropies variability
Revision
1498 -
Directory Listing
Modified
Mon Dec 5 11:31:10 2011 UTC
(10 years, 7 months ago)
by
jmduarteg
Fixed minor bug: now will re-blast only if maxNumSeqs above 500 and more hits requested
Revision
1497 -
Directory Listing
Modified
Sat Dec 3 16:12:05 2011 UTC
(10 years, 7 months ago)
by
jmduarteg
New feature: now can specify a subinterval in UniprotHomologList
Revision
1495 -
Directory Listing
Modified
Wed Nov 30 11:29:36 2011 UTC
(10 years, 7 months ago)
by
jmduarteg
Fixed issue with uniparc and uniprot isoform ids
Revision
1493 -
Directory Listing
Modified
Fri Nov 25 10:48:53 2011 UTC
(10 years, 7 months ago)
by
jmduarteg
Modified the uniref regex to also capture uniprot isoforms identifiers
Revision
1492 -
Directory Listing
Modified
Fri Nov 18 10:48:57 2011 UTC
(10 years, 7 months ago)
by
jmduarteg
Improved the blast cache checking
Revision
1491 -
Directory Listing
Modified
Thu Nov 17 14:56:11 2011 UTC
(10 years, 7 months ago)
by
jmduarteg
Now can pass option -v to blast (max number of hits reported)
Revision
1489 -
Directory Listing
Modified
Wed Nov 16 14:26:57 2011 UTC
(10 years, 7 months ago)
by
jmduarteg
Fixed logging and error handling of tcoffee: now it will really log errors to log file, temp files won't be deleted when exit!=0, command line is logged
Revision
1488 -
Directory Listing
Modified
Wed Nov 16 11:42:28 2011 UTC
(10 years, 7 months ago)
by
jmduarteg
Fixed minor issue: wasn't logging an error through logger
Revision
1487 -
Directory Listing
Modified
Wed Nov 16 10:49:18 2011 UTC
(10 years, 7 months ago)
by
jmduarteg
More logging
Revision
1486 -
Directory Listing
Modified
Wed Nov 16 08:20:28 2011 UTC
(10 years, 7 months ago)
by
jmduarteg
Now blasting against uniref also supported in UniprotHomologList
Revision
1484 -
Directory Listing
Modified
Fri Nov 11 16:06:07 2011 UTC
(10 years, 7 months ago)
by
jmduarteg
Now explicitly specifying number of threads in TcoffeeRunner
Revision
1467 -
Directory Listing
Modified
Mon Sep 5 16:53:40 2011 UTC
(10 years, 10 months ago)
by
jmduarteg
New method for getting URL of uniprot entry
Revision
1406 -
Directory Listing
Modified
Mon May 30 14:08:31 2011 UTC
(11 years, 1 month ago)
by
jmduarteg
Some new methods for disulfide detection and other minor things
Revision
1392 -
Directory Listing
Modified
Thu May 19 14:27:50 2011 UTC
(11 years, 1 month ago)
by
jmduarteg
MAJOR CHANGE: now able to read and treat properly HETATMS and nucleotides from PDB entries. All tests pass. Anyway surely there will be some bugs to iron out still.
Revision
1372 -
Directory Listing
Modified
Wed Mar 30 09:55:49 2011 UTC
(11 years, 3 months ago)
by
jmduarteg
MAJOR INTERFACE CHANGE: the main PDB data loading interface has changed. Now all loading of data occurs through PdbAsymUnit. Refactoring of some classes: Pdb is now PdbChain.
Revision
1346 -
Directory Listing
Modified
Fri Mar 18 08:25:07 2011 UTC
(11 years, 3 months ago)
by
jmduarteg
Now detecting type of sequence: protein/nucleotide
Revision
1340 -
Directory Listing
Modified
Sun Mar 13 17:44:18 2011 UTC
(11 years, 3 months ago)
by
hstehr
Fixed bugs in method for reading from PIR file; added script to convert from PIR to Fasta
Revision
1335 -
Directory Listing
Modified
Wed Mar 9 08:38:35 2011 UTC
(11 years, 3 months ago)
by
jmduarteg
Now throwing InterruptedException. Refactored BlastError to BlastException
Revision
1334 -
Directory Listing
Modified
Wed Mar 9 08:13:04 2011 UTC
(11 years, 3 months ago)
by
jmduarteg
Refactoring xxxError to xxxException
Revision
1329 -
Directory Listing
Modified
Sat Mar 5 15:38:12 2011 UTC
(11 years, 4 months ago)
by
jmduarteg
Important change: now the PDB file reader will try to read and fix the numbering of PDB files. Whenever the alignment is wrong it will realign and renumber using the jaligner package. The result of reading original PDB files now will be the same as that of reading CIF files, including proper mapping of classical PDB numbers to SEQRES residue serials (as in CIF).
There is one change of behaviour in comparison to before: when no SEQRES present the sequence is taken to be that of ATOM lines instead of padding it with Xs.
Still in this version the re-alignment is not perfect as there are some times when ambiguities occur and they are not solved (e.g. in 2nwr where alignment in an unobserved loop can be at two possible places for a GLY). That is anyway a rather minor problem (coordinates are still fine, just the chain is not ordered correctly at 1 or 2 points) and rare (~1% of files)
Revision
1326 -
Directory Listing
Modified
Thu Mar 3 13:21:18 2011 UTC
(11 years, 4 months ago)
by
jmduarteg
Renamed xxxxError classes to xxxxException as they should be.
Revision
1317 -
Directory Listing
Modified
Wed Feb 9 17:41:17 2011 UTC
(11 years, 4 months ago)
by
jmduarteg
Catching the case where the server returns non-requested records and throwing an IOException for it.
Revision
1310 -
Directory Listing
Modified
Mon Jan 31 16:49:11 2011 UTC
(11 years, 5 months ago)
by
jmduarteg
Fixed bug: was not checking whether the uniprot japi was actually returning all requested records for ids given when using getMultipleEntries. Now checking, logging it and removing the not-found ids from the homolog list.
Revision
1306 -
Directory Listing
Modified
Tue Jan 25 14:15:20 2011 UTC
(11 years, 5 months ago)
by
jmduarteg
Added Serializable interface to many classes. To be able to serialize classes containing objects of these types.
Revision
1305 -
Directory Listing
Modified
Fri Jan 21 10:11:16 2011 UTC
(11 years, 5 months ago)
by
jmduarteg
Renaming all xxxxError named exceptions to xxxxException. Long overdue thing (was historical because it came from python). Learnt just recently that java does have a concept of Error too.
Revision
1304 -
Directory Listing
Modified
Sun Jan 16 16:06:09 2011 UTC
(11 years, 5 months ago)
by
jmduarteg
Fixed important bug: we didn't model the blast data properly. A hit is composed of several hsps. Now our classes follow that data model properly (before we treated a hit as an hsp).
Revision
1300 -
Directory Listing
Modified
Wed Jan 12 10:22:15 2011 UTC
(11 years, 5 months ago)
by
jmduarteg
Better docs
Revision
1298 -
Directory Listing
Modified
Tue Jan 11 14:35:17 2011 UTC
(11 years, 5 months ago)
by
jmduarteg
Now storing the query coverage cutoff.
Revision
1285 -
Directory Listing
Modified
Thu Dec 2 11:03:18 2010 UTC
(11 years, 7 months ago)
by
jmduarteg
Now throwing MatchNotFound exception instead of catching it.
Revision
1284 -
Directory Listing
Modified
Mon Nov 29 10:06:28 2010 UTC
(11 years, 7 months ago)
by
cvehlow
Overrides removed
Revision
1273 -
Directory Listing
Modified
Thu Oct 28 13:56:03 2010 UTC
(11 years, 8 months ago)
by
jmduarteg
Fixed bug: EMBL DB fetch now returns sequences in upper case, and we were not converting properly to lower case in one place. Now should be fixed.
Revision
1267 -
Directory Listing
Modified
Thu Oct 21 16:26:54 2010 UTC
(11 years, 8 months ago)
by
jmduarteg
Now possible to not retrieve any CDS data
Revision
1260 -
Directory Listing
Modified
Fri Oct 8 15:00:58 2010 UTC
(11 years, 8 months ago)
by
jmduarteg
Using the new UniprotVerMismatchException instead of IOException
Revision
1259 -
Directory Listing
Modified
Fri Oct 8 13:45:56 2010 UTC
(11 years, 8 months ago)
by
jmduarteg
New exception for mismatching uniprot versions
Revision
1253 -
Directory Listing
Modified
Wed Oct 6 17:01:19 2010 UTC
(11 years, 9 months ago)
by
jmduarteg
Some minor updates
Revision
1172 -
Directory Listing
Modified
Wed Aug 4 15:26:03 2010 UTC
(11 years, 11 months ago)
by
jmduarteg
Now using apache commons logging
Revision
1168 -
Directory Listing
Modified
Mon Aug 2 16:47:37 2010 UTC
(11 years, 11 months ago)
by
jmduarteg
Now catching the case when the best translation contains stop codons. This is usually due to a wrong genetic code, but for the moment it's still very difficult to know the genetic code from encoding organelle+organism taxonomy, so this is the temporary solution.
Revision
1166 -
Directory Listing
Modified
Fri Jul 30 14:58:26 2010 UTC
(11 years, 11 months ago)
by
jmduarteg
Fixed bug: not allowing anymore the presence of gaps when choosing the representative CDS to uniprot match. Now the nucleotide alignments should be always correct (before they were shifted if there were gaps in CDS-to-uniprot)
Revision
1164 -
Directory Listing
Modified
Fri Jul 30 09:56:39 2010 UTC
(11 years, 11 months ago)
by
jmduarteg
Now returning null for the representative CDS when the gene encoding organelle is not nucleus/plasmid. Before the whole program would stop, which wasn't ideal. Eventually we will need to use the proper genetic code when needed.
Revision
1162 -
Directory Listing
Modified
Wed Jul 28 09:15:19 2010 UTC
(11 years, 11 months ago)
by
jmduarteg
Allowing plasmids as gene encoding organelles without having to treat them especially as they don't have a different genetic code.
Revision
1159 -
Directory Listing
Modified
Fri Jul 23 14:56:35 2010 UTC
(11 years, 11 months ago)
by
jmduarteg
Reverting change made in last revision. It intended to fix a bug for a particular test case (don't know anymore which one!) but it actually broke most other cases.
Revision
1157 -
Directory Listing
Modified
Wed Jul 7 15:31:14 2010 UTC
(12 years ago)
by
jmduarteg
Fixed bug, wasn't computing the nucleotide alignment correctly (was missing the last codon)
Revision
1155 -
Directory Listing
Modified
Wed Jul 7 13:15:06 2010 UTC
(12 years ago)
by
jmduarteg
Implemented a simple skimming strategy. Put back some debug level logging to info level, logging ain't easy!
Revision
1154 -
Directory Listing
Modified
Wed Jul 7 10:39:28 2010 UTC
(12 years ago)
by
jmduarteg
And all other ambiguous codons
Revision
1153 -
Directory Listing
Modified
Wed Jul 7 08:46:37 2010 UTC
(12 years ago)
by
jmduarteg
Another ambiguous nucleotide letter
Revision
1151 -
Directory Listing
Modified
Tue Jul 6 16:58:04 2010 UTC
(12 years ago)
by
jmduarteg
Fixed bug: was null pointing with null cache file (which is a valid value)
Revision
1149 -
Directory Listing
Modified
Mon Jul 5 16:46:57 2010 UTC
(12 years ago)
by
jmduarteg
Changed level of some logging messages
Revision
1148 -
Directory Listing
Modified
Mon Jul 5 16:16:44 2010 UTC
(12 years ago)
by
jmduarteg
New removeRedundancy method
Revision
1147 -
Directory Listing
Modified
Mon Jul 5 08:35:15 2010 UTC
(12 years ago)
by
jmduarteg
Implemented logging with log4j library
Revision
1146 -
Directory Listing
Modified
Fri Jul 2 09:54:46 2010 UTC
(12 years ago)
by
jmduarteg
Now can write sequences with or without query
Revision
1145 -
Directory Listing
Modified
Thu Jul 1 16:05:26 2010 UTC
(12 years ago)
by
jmduarteg
Now can filter also on query coverage
Revision
1144 -
Directory Listing
Modified
Thu Jul 1 10:09:55 2010 UTC
(12 years ago)
by
jmduarteg
Changed the order of checks in isReferenceSeqPositionReliable
Revision
1143 -
Directory Listing
Modified
Thu Jul 1 10:02:53 2010 UTC
(12 years ago)
by
jmduarteg
New methods to check for reliability of positions with respect of the CDS matching
Revision
1142 -
Directory Listing
Modified
Wed Jun 30 09:55:27 2010 UTC
(12 years ago)
by
jmduarteg
Now nucleotide alignment contains reference sequence as well. Moved entropy and kaks calc methods to UniprotHomologList.
Revision
1140 -
Directory Listing
Modified
Tue Jun 29 08:49:34 2010 UTC
(12 years ago)
by
jmduarteg
Fixed bug: now nucleotide alignment is in same order as protein alignment
Revision
1139 -
Directory Listing
Modified
Mon Jun 28 17:44:19 2010 UTC
(12 years ago)
by
jmduarteg
Fixed bug: when retrieving (or reading from cache) embl cds sequences and embl dbfetch doesn't have a certain identifier, then we were adding nulls to the list of emblcds sequences of the UniprotHomologList (resulting in a null pointer down the line)
Revision
1137 -
Directory Listing
Modified
Mon Jun 28 13:38:32 2010 UTC
(12 years ago)
by
jmduarteg
Now checking the sequence identity of the representative CDS to be above a minimum value.
Revision
1136 -
Directory Listing
Modified
Mon Jun 28 12:49:31 2010 UTC
(12 years ago)
by
jmduarteg
New ambiguous nucleotide w.
Revision
1133 -
Directory Listing
Modified
Fri Jun 25 13:36:03 2010 UTC
(12 years ago)
by
jmduarteg
Fixed bug: getNucleotideAlignment was nullpointing when encountering a null return from getRepresentativeCDS() (which happens whenever there is no representative CDS for the particular UniprotEntry). Now checking for nulls before using the homolog in the alignment.
Revision
1132 -
Directory Listing
Modified
Fri Jun 25 13:01:53 2010 UTC
(12 years ago)
by
jmduarteg
New SelectonRunner class. More taxonomy information retrieved from uniprot.
Revision
1126 -
Directory Listing
Modified
Tue Jun 22 12:19:07 2010 UTC
(12 years ago)
by
jmduarteg
New code to obtain the nucleotide alignment by mapping the aminoacid alignment to the nucleotide sequences.
Revision
1123 -
Directory Listing
Modified
Mon Jun 21 12:50:54 2010 UTC
(12 years ago)
by
jmduarteg
Improved the selection of representative CDS
Revision
1122 -
Directory Listing
Modified
Mon Jun 21 10:48:05 2010 UTC
(12 years ago)
by
jmduarteg
Some refactoring
Revision
1121 -
Directory Listing
Modified
Fri Jun 18 16:02:55 2010 UTC
(12 years ago)
by
jmduarteg
Now implementing HasFeatures
Revision
1120 -
Directory Listing
Modified
Fri Jun 18 13:29:34 2010 UTC
(12 years ago)
by
jmduarteg
New class UniprotEntry. Things make a lot more sense now...
Revision
1119 -
Directory Listing
Modified
Fri Jun 18 11:55:46 2010 UTC
(12 years ago)
by
jmduarteg
Now checking also for ambiguous nucleotide codes (n, m) when translating. Changed implementation of translation by using the new Codon class.
Revision
1117 -
Directory Listing
Modified
Thu Jun 17 09:52:30 2010 UTC
(12 years ago)
by
jmduarteg
Now checking codon length before translating
Revision
1116 -
Directory Listing
Modified
Thu Jun 17 09:48:22 2010 UTC
(12 years ago)
by
jmduarteg
Now properly translating all frames and finding the best translation match in checkEmblCDSMatching. New class doing all the matching of protein to CDS.
Revision
1114 -
Directory Listing
Modified
Wed Jun 16 10:26:22 2010 UTC
(12 years ago)
by
jmduarteg
New feature: code for dna to protein translation
Revision
1112 -
Directory Listing
Modified
Tue Jun 15 08:44:52 2010 UTC
(12 years ago)
by
jmduarteg
Now getting also gene encoding organelle from uniprot api
Revision
1106 -
Directory Listing
Modified
Thu Jun 10 09:24:30 2010 UTC
(12 years ago)
by
jmduarteg
Implemented caching of blast in UniprotHomologList
Revision
1104 -
Directory Listing
Modified
Wed Jun 9 17:04:36 2010 UTC
(12 years ago)
by
jmduarteg
Now reading uniprot version from reldate.txt and checking against the uniprot api version
Revision
1103 -
Directory Listing
Modified
Wed Jun 9 16:11:18 2010 UTC
(12 years ago)
by
jmduarteg
Implemented caching for EMBWSDBfetchConnection
Revision
1098 -
Directory Listing
Modified
Tue Jun 8 09:10:32 2010 UTC
(12 years ago)
by
jmduarteg
Fixed bug: was nullpointing at retrieveUniprotKBData because a single uniprot id can have multiple blast hits and we were using the uni ids as unique identifiers => the lookup map was failing. Now the lookup map is from uniIds to lists of homologs (hits)
Revision
1097 -
Directory Listing
Modified
Mon Jun 7 10:17:58 2010 UTC
(12 years, 1 month ago)
by
jmduarteg
Now explicitly specifying output dnd file in tcoffee so it can be removed by the calling program.
Revision
1095 -
Directory Listing
Modified
Wed Jun 2 15:06:46 2010 UTC
(12 years, 1 month ago)
by
jmduarteg
Fixed bug: was nullpointing when the letter encountered in counting the column was unknown.
Revision
1094 -
Directory Listing
Modified
Wed Jun 2 13:19:21 2010 UTC
(12 years, 1 month ago)
by
jmduarteg
Implemented reduced alphabets in AminoAcid class and modified entropy calculation to allow reduced alphabets.
Revision
1085 -
Directory Listing
Modified
Mon May 31 13:45:03 2010 UTC
(12 years, 1 month ago)
by
jmduarteg
Some new methods
Revision
1081 -
Directory Listing
Modified
Fri May 28 14:51:46 2010 UTC
(12 years, 1 month ago)
by
jmduarteg
Fixed bug with '-' embl cds identifiers.
Revision
1074 -
Directory Listing
Modified
Fri May 21 10:02:24 2010 UTC
(12 years, 1 month ago)
by
jmduarteg
Implemented profile counting and entropy calculation for alignment.
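As an illustration of the kind of calculation named here, a sketch of the Shannon entropy of one alignment column. The log does not specify the project's formula, logarithm base, or gap handling; this version assumes base-2 logarithms and ignores gap characters ('-'). The class `ColumnEntropy` is a hypothetical name.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only, not the project's alignment class.
final class ColumnEntropy {

    /**
     * Shannon entropy (in bits) of the residue distribution in one alignment column.
     * Gap characters ('-') are skipped, which is an assumption of this sketch.
     */
    static double entropy(String column) {
        Map<Character, Integer> counts = new HashMap<>();
        int total = 0;
        for (char c : column.toCharArray()) {
            if (c == '-') {
                continue; // gaps do not contribute to the profile
            }
            counts.merge(c, 1, Integer::sum);
            total++;
        }
        if (total == 0) {
            return 0.0; // an all-gap column carries no residue information
        }
        double h = 0.0;
        for (int n : counts.values()) {
            double p = (double) n / total;
            h -= p * Math.log(p) / Math.log(2);
        }
        return h;
    }
}
```

A fully conserved column scores 0 bits, and a column split evenly between two residues scores 1 bit; a reduced alphabet would be applied by mapping each residue to its group before counting.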
Revision
1073 -
Directory Listing
Modified
Thu May 20 15:21:10 2010 UTC
(12 years, 1 month ago)
by
jmduarteg
Also retrieving uniprot sequence in UniprotHomolog.retrieveUniprotKBData()
Revision
1072 -
Directory Listing
Modified
Thu May 20 13:59:25 2010 UTC
(12 years, 1 month ago)
by
jmduarteg
Implemented methods to run a tcoffee alignment. Now also retrieving uniprot sequences via the uniprot api.
Revision
1069 -
Directory Listing
Modified
Tue May 18 14:53:34 2010 UTC
(12 years, 1 month ago)
by
jmduarteg
New methods to retrieve uniprot data and embl cds for a single homolog
Revision
1055 -
Directory Listing
Modified
Wed Apr 28 15:08:53 2010 UTC
(12 years, 2 months ago)
by
jmduarteg
Removed code left over from debugging.
Revision
1054 -
Directory Listing
Modified
Wed Apr 28 13:37:00 2010 UTC
(12 years, 2 months ago)
by
jmduarteg
Minor changes, couple of new methods.
Revision
1051 -
Directory Listing
Modified
Tue Apr 27 15:34:38 2010 UTC
(12 years, 2 months ago)
by
jmduarteg
Minor fix to java docs
Revision
1049 -
Directory Listing
Modified
Tue Apr 27 12:36:16 2010 UTC
(12 years, 2 months ago)
by
hstehr
removing @Override annotations which were causing compilation errors for me
Revision
1048 -
Directory Listing
Modified
Tue Apr 27 08:24:18 2010 UTC
(12 years, 2 months ago)
by
jmduarteg
New classes UniprotHomolog and UniprotHomologList to contain a set of homologs of a given sequence, including the blast hit data, embl cds coding sequences etc.
Revision
1045 -
Directory Listing
Modified
Mon Apr 26 12:55:28 2010 UTC
(12 years, 2 months ago)
by
jmduarteg
Uniprot and embl ws dbfetch connections can now do multiple entries per request. Update to parse also the hit_def tag in blast parser.
Revision
1027 -
Directory Listing
Modified
Fri Apr 16 10:03:57 2010 UTC
(12 years, 2 months ago)
by
jmduarteg
Fixed some docs and indentation
Revision
1010 -
Directory Listing
Modified
Wed Mar 31 16:05:52 2010 UTC
(12 years, 3 months ago)
by
hstehr
refactoring: removed a warning in MultipleSequenceAlignment and deleted obsolete directory src/tests
Revision
1009 -
Directory Listing
Modified
Wed Mar 31 15:50:04 2010 UTC
(12 years, 3 months ago)
by
hstehr
refactoring: moved many many classes to more appropriate packages; created new packages owl.core.sequence.alignment, owl.core.structure.alignment, owl.core.structure.features
Revision
1005 -
Directory Listing
Modified
Wed Mar 31 12:29:26 2010 UTC
(12 years, 3 months ago)
by
hstehr
Copied from:
trunk/src/sequence revision 1002
refactoring: renaming proteinstructure to structure and tools to util; moving connections,features,runners,sequence,structure,util to owl.core
Revision
951 -
Directory Listing
Modified
Fri Jan 29 16:12:34 2010 UTC
(12 years, 5 months ago)
by
duarte
Original Path:
trunk/src/sequence
Copied from:
trunk/sequence revision 949
Reorganised the project with a src folder for java source files.
Added a jars dir with all jars needed for the project.
Added .project and .classpath pointing to relative path of jars.
The project should now work out of the box after a check-out with eclipse. No need to setup external jars or anything.
Revision
876 -
Directory Listing
Modified
Fri Aug 7 13:35:42 2009 UTC
(12 years, 11 months ago)
by
stehr
Original Path:
trunk/sequence
New methods for writing sequence in Fasta format
Revision
801 -
Directory Listing
Modified
Mon Nov 17 13:40:10 2008 UTC
(13 years, 7 months ago)
by
duarte
Original Path:
trunk/sequence
Fixed bug: was not doing correctly the sanity checks in getAlignmentFullSeqs
Revision
763 -
Directory Listing
Modified
Thu Sep 25 15:19:45 2008 UTC
(13 years, 9 months ago)
by
duarte
Original Path:
trunk/sequence
New method in BlastHit to transform the BlastHit local alignment into one that contains the full sequences of subject and query.
Revision
760 -
Directory Listing
Modified
Wed Sep 24 18:18:53 2008 UTC
(13 years, 9 months ago)
by
duarte
Original Path:
trunk/sequence
Rewrote the secondary structure handling in Alignment. Also changed first parameter in writeFasta to be a PrintStream not an OutputStream
Revision
756 -
Directory Listing
Modified
Tue Sep 23 18:07:25 2008 UTC
(13 years, 9 months ago)
by
duarte
Original Path:
trunk/sequence
New static method in BlastHit to get subjectId from templateId
New member source in TemplateList
Revision
755 -
Directory Listing
Modified
Tue Sep 23 15:00:46 2008 UTC
(13 years, 9 months ago)
by
duarte
Original Path:
trunk/sequence
New method getAlignmentWithTemplateIDtag
Revision
752 -
Directory Listing
Modified
Wed Sep 17 13:44:30 2008 UTC
(13 years, 9 months ago)
by
duarte
Original Path:
trunk/sequence
BlastHitList: new method getHit(subjectId), a new lookup HashMap member added for that.
TemplateList: new method get(i)
Sequence: new convenience method writeToFastaFile
Revision
751 -
Directory Listing
Modified
Tue Sep 16 16:35:34 2008 UTC
(13 years, 9 months ago)
by
duarte
Original Path:
trunk/sequence
Now cleaning up correctly paul temp files
Now closing tcoffee's log properly
Revision
750 -
Directory Listing
Modified
Tue Sep 16 13:15:25 2008 UTC
(13 years, 9 months ago)
by
duarte
Original Path:
trunk/sequence
Now we can run blast with multi threads option
Revision
748 -
Directory Listing
Modified
Sun Sep 14 21:47:56 2008 UTC
(13 years, 9 months ago)
by
duarte
Original Path:
trunk/sequence
Initial commit, tcoffee runner for sequence to profile alignment
Revision
745 -
Directory Listing
Modified
Thu Sep 11 12:57:10 2008 UTC
(13 years, 9 months ago)
by
duarte
Original Path:
trunk/sequence
Now BlastHit printing works correctly for query identifier strings of any length
Revision
744 -
Directory Listing
Modified
Thu Sep 11 11:09:27 2008 UTC
(13 years, 9 months ago)
by
duarte
Original Path:
trunk/sequence
Using the new XML parser in TemplateSelection.
New method getQueryCoverage in BlastHit
Revision
743 -
Directory Listing
Modified
Thu Sep 11 10:06:22 2008 UTC
(13 years, 9 months ago)
by
duarte
Original Path:
trunk/sequence
New tests package (should be a folder and under it packages, but for now we can't do that until we move all source code to a src folder).
Moved all existing tests to appropriate tests packages.
New test PdbParsersTest: does cif against pdbase at the moment.
Added equals method to SecStrucElement
Added some getters to Pdb
Revision
742 -
Directory Listing
Modified
Wed Sep 10 17:23:37 2008 UTC
(13 years, 9 months ago)
by
duarte
Original Path:
trunk/sequence
New BlastXMLParser together with BlastParsersTest Junit test for both xml and tabular parsers.
Added some new fields in BlastHit and BlastHitList.
Blast running methods in BlastRunner now take a new parameter: noFiltering.
Revision
741 -
Directory Listing
Modified
Tue Sep 9 10:47:21 2008 UTC
(13 years, 9 months ago)
by
duarte
Original Path:
trunk/sequence
Created getters for missing fields in BlastHit.
New method removeAll in TemplateList.
Revision
711 -
Directory Listing
Modified
Thu Jul 31 09:47:58 2008 UTC
(13 years, 11 months ago)
by
stehr
Original Path:
trunk/sequence
added static method to print a ruler with sequence numbers
Revision
699 -
Directory Listing
Modified
Wed Jul 16 08:38:19 2008 UTC
(13 years, 11 months ago)
by
duarte
Original Path:
trunk/sequence
Fixed bug: now Sequence class considers '.' to be part of the fasta headers (in agreement with what we do in Alignment class). The 2 FASTAHEADER_REGEXes should be always kept in sync.
Revision
651 -
Directory Listing
Modified
Wed May 21 18:11:02 2008 UTC
(14 years, 1 month ago)
by
duarte
Original Path:
trunk/sequence
New class PhiPsiAverager to get consensus of phi/psi angles from a TemplateList and an Alignment. Tested with a few examples and seems to work. The wrapping of angles at 180/-180 is not yet taken into account, i.e. if an interval falls in the region just below 180 and just above -180, no consensus will be found.
Pdb: added some checks to methods getPhi/Psi so that it doesn't fail when there's no coordinates.
Changed yet again the design of TemplateList/Template. Now loading of PDB data happens upon call of the loadPDBdata method. Changed dependencies accordingly.
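The wrap-around problem mentioned for PhiPsiAverager can be illustrated with a circular mean, which averages angles as unit vectors so that 170 and -170 degrees average to 180 rather than 0. This is a standard technique shown as a sketch only: the log describes an interval-based consensus, and nothing here is taken from the project's code. The class `CircularMean` is a hypothetical name.

```java
// Hypothetical helper for illustration only, not the project's PhiPsiAverager.
final class CircularMean {

    /**
     * Mean direction of a set of angles given in degrees.
     * Each angle is treated as a unit vector; the result is the direction
     * of the vector sum, in the range [-180, 180].
     */
    static double meanDegrees(double[] anglesDeg) {
        if (anglesDeg.length == 0) {
            throw new IllegalArgumentException("At least one angle is required");
        }
        double sumSin = 0.0;
        double sumCos = 0.0;
        for (double a : anglesDeg) {
            double r = Math.toRadians(a);
            sumSin += Math.sin(r);
            sumCos += Math.cos(r);
        }
        return Math.toDegrees(Math.atan2(sumSin, sumCos));
    }
}
```

Note that the mean direction is poorly defined when the angles cancel out (for example 0 and 180), so a real consensus method would also need to check how tightly the angles cluster.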
Revision
642 -
Directory Listing
Modified
Fri May 16 17:08:04 2008 UTC
(14 years, 1 month ago)
by
duarte
Original Path:
trunk/sequence
Fixed bug: writeClusterGraph shouldn't try to do anything if there's only one template.
Revision
638 -
Directory Listing
Modified
Thu May 15 14:54:42 2008 UTC
(14 years, 1 month ago)
by
stehr
Original Path:
trunk/sequence
added automatic conversion of matrix file to ps visualization (using script plot_simmatrix.sh)
Revision
637 -
Directory Listing
Modified
Thu May 15 14:26:22 2008 UTC
(14 years, 1 month ago)
by
duarte
Original Path:
trunk/sequence
Now skipping lines with <PRE> tags.
Revision
635 -
Directory Listing
Modified
Thu May 15 08:44:59 2008 UTC
(14 years, 1 month ago)
by
duarte
Original Path:
trunk/sequence
Changed default gravity value and default IDENTITY_SCORE for similarity graph
Revision
634 -
Directory Listing
Modified
Wed May 14 14:46:38 2008 UTC
(14 years, 1 month ago)
by
duarte
Original Path:
trunk/sequence
Fixed bug in BlastUtils.writeClusterGraph: was not working at all. Wrong curly brackets, was still using a GDT cutoff instead of rmsd and was never getting Pdb correctly out of the Template.
Two new members in Template: GTGHit and BlastHit, so we can reference back to the hit if needed (like if we want to get the evalue/score). Got rid of MySQLConnection as a member (not very nice design); it is now a parameter for the specific methods that need to query for PDB data.
Revision
633 -
Directory Listing
Modified
Wed May 14 12:08:52 2008 UTC
(14 years, 1 month ago)
by
stehr
Original Path:
trunk/sequence
put test code for writeClusterGraph in main method of BlastUtils
Revision
628 -
Directory Listing
Modified
Tue May 13 10:34:44 2008 UTC
(14 years, 1 month ago)
by
duarte
Original Path:
trunk/sequence
Now writeClusterGraph uses rmsd for similarity measure instead of GDT (maxcluster was failing almost always with GDT)
Revision
625 -
Directory Listing
Modified
Fri May 9 14:57:37 2008 UTC
(14 years, 1 month ago)
by
duarte
Original Path:
trunk/sequence
Fixed bug: wasn't catching wrong format in GTG output
Revision
623 -
Directory Listing
Modified
Fri May 9 14:10:27 2008 UTC
(14 years, 1 month ago)
by
duarte
Original Path:
trunk/sequence
Fixed bug: now catching when the mapping of observed residue sequence serials to internal residue serials is out of range, which happens in some extremely rare cases (because GTG uses pre-remediation data)
Revision
622 -
Directory Listing
Modified
Fri May 9 12:56:27 2008 UTC
(14 years, 1 month ago)
by
duarte
Original Path:
trunk/sequence
Added code to be able to filter out template ids (pdb codes) by maximum release date.
Some improvements in Hit classes.
Revision
619 -
Directory Listing
Modified
Thu May 8 10:13:10 2008 UTC
(14 years, 2 months ago)
by
duarte
Original Path:
trunk/sequence
Added methods to be able to print "graphical" output for GTGHits, all equivalent to the methods in BlastHit
Added compare() and writeIdsToFile() in TemplateList.
Revision
618 -
Directory Listing
Modified
Wed May 7 12:06:56 2008 UTC
(14 years, 2 months ago)
by
duarte
Original Path:
trunk/sequence
Initial commit of GTG parsing classes.
Added method in PdbasePdb to map observed residue sequence serials to internal (cif) residue serials
Revision
612 -
Directory Listing
Modified
Wed Apr 30 18:20:09 2008 UTC
(14 years, 2 months ago)
by
duarte
Original Path:
trunk/sequence
Now checking whether the read sequence contains spaces and throwing FastaFileFormatError if it does
Revision
602 -
Directory Listing
Modified
Tue Apr 29 08:36:22 2008 UTC
(14 years, 2 months ago)
by
duarte
Original Path:
trunk/sequence
Moved method writeSeqs to Sequence class.
Tidied up the help text in averageGraph
Revision
597 -
Directory Listing
Modified
Fri Apr 25 16:53:47 2008 UTC
(14 years, 2 months ago)
by
duarte
Original Path:
trunk/sequence
Fixed the fasta header regex to take only first token of header as sequence name (same behaviour as in Alignment class)
Revision
593 -
Directory Listing
Modified
Tue Apr 22 09:05:33 2008 UTC
(14 years, 2 months ago)
by
stehr
Original Path:
trunk/sequence
merged in some changes (now writing similarity matrix in blast utils)
Revision
590 -
Directory Listing
Modified
Fri Apr 11 14:43:53 2008 UTC
(14 years, 2 months ago)
by
duarte
Original Path:
trunk/sequence
Added code to run psipred.
Fixed a few minor bugs.
Revision
589 -
Directory Listing
Modified
Wed Apr 9 16:16:08 2008 UTC
(14 years, 2 months ago)
by
duarte
Original Path:
trunk/sequence
New classes Template, TemplateList to be used in homology modelling pipeline.
Modified BlastUtils to use TemplateList class
Revision
584 -
Directory Listing
Modified
Tue Apr 8 15:18:22 2008 UTC
(14 years, 2 months ago)
by
stehr
Original Path:
trunk/sequence
added BlastUtils.writeClusterGraph(String[] templateIds, outFile)
Revision
583 -
Directory Listing
Modified
Tue Apr 8 13:02:28 2008 UTC
(14 years, 2 months ago)
by
duarte
Original Path:
trunk/sequence
New methods to get the template ids and write them to file
Revision
581 -
Directory Listing
Modified
Tue Apr 8 11:00:45 2008 UTC
(14 years, 3 months ago)
by
stehr
Original Path:
trunk/sequence
Made printing hits with graphical overview the default using print() method. Print() takes no parameters, but BlastHitList.setQueryLength() has to be called once to set the length of the query. Introduced new printSome() function to output only the first n hits.
Revision
579 -
Directory Listing
Modified
Mon Apr 7 17:57:54 2008 UTC
(14 years, 3 months ago)
by
stehr
Original Path:
trunk/sequence
added class Sequence, added ascii-art output of blast hits to BlastHitList
Revision
577 -
Directory Listing
Added
Fri Apr 4 16:46:23 2008 UTC
(14 years, 3 months ago)
by
duarte
Original Path:
trunk/sequence
New package sequence. Initial commit of classes to run blast and parse its output.