 This is a very messy and svd-centric TODO.  The items predominantly
 relate to other graph-related applications and helper applications,
 mostly *NOT* to mcl.


+todo____________________________________________________________________________
    |                      general guidelines                                    |
    |  - code as much as possible in high-level routines, do not micro-optimize. |
    |    do not shun matrix allocations etc. Aim for time/memory-efficient       |
    |    algorithms within this context.                                         |
    |____________________________________________________________________________|
$$$$$ an option that does both matrix, vector and ivp transforms: include -tf  -ceil-nb -knn-mutual  
  $$$ proper initialization for streamer. think e.g. of cmax_123 (extend,fail,ignore)
  $$$ mclcm revisit coarsening.
    $ mcxdump --write-tabc --write-tabr --dump-domc --dump-domr are all mcxsuby and suspect.
   $$ only load connected components of size >= X; which app?  mcxload mcxsubs mcx convert
    $ mclxReadx option to ignore loops.
    - catread now can produce a partition. [ test ]
    - orthomcl type transformations. BRH easy. PP & CO harder.
  ??? more logical adapt-inflation behaviour.
    - mcxdump option to output largest edge lists first ..
    - mcx convert: split stack over multiple files.
    - mcx q or mcx equate -sub -equal -project et cetera (cf. mcxsubs language)
   $$ mcxarray: accept '-', 'na', 'nan'
   /> what does adapt-inflation do on satellite type graphs?
   $$ smarter coarsening. clustering vs absorption ? size balancing ?
    f mclvUnionv semaphore
    $ clean up mclcm, modularize, memclean.
    $ clean up memory logic and matrix cache/reread/transform in alg.c.
    $ make mclvaDump static, unify+streamline vector dumping (+ easy macro for dumping)
    $ create stack memory managing code.
    - stress test mcxload (further)
    - analyze dagginess (lattice characteristics)
$$$$$ rewrite mcxsubs in a more extendible and generalized framework -> mini-language, mcxi
    - update mcxdeblast
    - mcxerdos directed mode would be interesting for wikipedia type graphs.
    - check mclvSelectHighest implementation with -R -S and > to >= change.
    ! make a vector-dump-debug routine that I am happy with. perhaps build on the one in mcxdump
    - adapt-local: check what it does when mcl iterands become very directed. Only before?
    - clxdo <new-mode>: <id> <nb-count> <max-weight> <mean-weight> <median-weight>
    $ clm lint (mcl lint options)
    $ mcxdump optionally do not parenthesize singletons.
    - visual unify of 'clm <mode> -h' and 'clm help <mode>' in clm -h
    - check referenceable va_list autoconf macro. necessary on x86_64 + gcc ?
    - tree distance based on average node pair subtree leaf set hamming distance, IYKWIM
    - mcxrand -gen 1000 -add 2000 | mx/mcxdump | mx/mcxload -abc - --stream-mirror | tee ttt | cl/clm close
         mcxdump does not dump all nodes, so mcxload reads in a smaller domain
         with -123 instead of -abc, gaps (except for the last, if present!)
         are filled.
    - read in newick format
    - mcx q: dump nblist sizes directly
    - add env variable for verbosity on non-matching domains
    - taking submatrix with same domains, is that slow?
    $ mclxRead with "w" filehandle does not generate warning.
    $ mclxWrite with "r" filehandle does not generate warning.
    - test mcxarray pearson, e.g. centering. default formula centers data?
    $ mcxdump: upper/lower; do this as -tf transformation
      -> requires new engine for ivp.idx access.
    $ mcxdump skeleton does not work for cat format, as body is not read/skipped.
    $ localized inflation: is dissipation the best measure?
    $ elaborate ENQUIRE_ON_FAIL
    $ clm info implement clceil for flat clusterings
    $ compare mlmfifofum and clmframe
   ?$ mcxdump: --dump-rlines, --dump-lines no-empty option.
    ? ../mclcm small.mci --dispatch -write stack -b2 "" -- "-if 2 -I 2" (check)
    $ standalone generator of shadow matrices mcx x?x
    $ transform to add diagonals in diamonds. (lattice application).
  $$$ true tree representation
    $ integrate skeleton read with matrix read .. ? restructure read code?
   $$ tfparse still half/broken as gq(0), add(10) will not result in 0 -> 10
    $ cleverer read routines (domainequatinglywise): {stack,tab,io}.h
   $$ simplify mcl/alg.c {stream:1/0} vs {cache:1/0} vs {transforms} framework.
    - can scatter distance be fixed?
    - try to cut back size of impala library - unify select routines.
    - logical line based clmformat output
    - move mcxrand code into library.
    - package enstrict domain checks nesting all in a single interface.
    - richer binary format (easier stats gathering)
   $$ readx for domain etc we can use the readDomPart code.
   $$ readx for nonnegative numbers
   -> checked io with EQT domain specifications 
    - slink / fibonacci heap single link clustering / skip lists
   $$ make the mclcm coarsening/shadowing step much more pluggable
    - optimum spanning tree
    - implement interactive mcl in javascript (anyone?)
    - annotate map matrices, validate at IO time
    ? use MCLXICFLAGS to specify dump-type behaviour?
    $  fix mclvMap (error checking)
  !?$ introduce n_alloc in mclv*
   ?$ template ivp with float|void* union ->val would become VAL()
    $ scatter distance is not a distance. ahem.
    $ implement sane log/verbosity/progress framework
    $ clean up taurus
    $ clmmate is probably needlessly inefficient
    $ clean up matrix.h (order, redo, and document callback equipped functions)
   $$ mcxdump (and others): do not construct the entire matrix in memory
    $ mclxicflags (see below)
    ~ mcxquery scripting language
    ~ extend mcxi with data structures|scripting language. ruby/lua/R.
    ~ enable io annotation of matrix header (e.g. creation info)
    ~ general interchange s-expression type input syntax
    ~ framework for functions of virtual vectors (meet, join)
    $ prune vector.h, inline idiosyncratic stuff to place where it is used
    $ smart cattable ascii/binary/123/abc/packed recognition
    $ typedef the largest pnum type (use that rather than long)
    $ embed -tf functionality in read stage
    - in what other scenarios might we want to optimize mclvBinary?
    -  there is currently no way to have lint without lint-k as k=0 has special meaning internally.
    - domain checks in clew/*.c; document/code requirements
    - buffer mcl interchange input (mcxExpectNum,Integer problematic)
    - internally replace tab by hash.
    - clean up and document tab/streamIn implementation
    - try to spot/frame siphoning
    $ look at mcxsubs rand and mcxrand behaviour. mergeable?
    / visualize mcl process dynamically
    - stress/test suite-setup
    - framework for IO domain manipulation
    - framework for overlap
   ## implement betweenness
    # mcxarray enable tab file creation
   ## streaming binary format
    # disprove descending consistency property T Tx Ty Txy
    # concatenated stack binary format
    # implement edge swap randomization
    # why not mclxSubWrite mclxSubCompose, mclxSubBinary ..... (same) demand first
    # focus: large graph problems, not just clustering
    # mcl option to NOT touch loops: exists! --discard-loops=n
    # smarter vector set operations (+ testing framework)
    # optimize lots of set components a la mcxerdos
   @@ mcl libs do not unwind on memory errors. (culprit: vector)
    |_____________________________________|
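
sketch for the 'edge swap randomization' item above. This is only an
illustration, not code from the mcl libraries: edge_t, lcg_next, has_edge and
edge_swap are hypothetical names, the edge list is assumed to be a simple
undirected graph (no loops, no duplicates), and a toy LCG stands in for a real
random number generator. Each accepted swap rewires (a-b, c-d) to (a-d, c-b),
so every node keeps its degree.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical edge type for this sketch; not an mcl structure. */
typedef struct { int u, v; } edge_t;

/* Toy deterministic LCG so the sketch needs no external RNG. */
static unsigned long lcg_next(unsigned long *state)
{  *state = *state * 1103515245UL + 12345UL;
   return (*state >> 16) & 0x7fffUL;
}

/* Return 1 if {a,b} occurs in the edge list, in either orientation. */
static int has_edge(const edge_t *e, size_t n, int a, int b)
{  size_t i;
   for (i = 0; i < n; i++)
      if ((e[i].u == a && e[i].v == b) || (e[i].u == b && e[i].v == a))
         return 1;
   return 0;
}

/* Try n_try swaps (a-b, c-d) -> (a-d, c-b).
 * A swap is rejected if it would create a loop or a duplicate edge.
 * Returns the number of accepted swaps.
*/
size_t edge_swap(edge_t *e, size_t n, size_t n_try, unsigned long seed)
{  size_t t, done = 0;
   assert(e != NULL || n == 0);
   if (n < 2)
      return 0;
   for (t = 0; t < n_try; t++)
   {  size_t i = lcg_next(&seed) % n;
      size_t j = lcg_next(&seed) % n;
      int a = e[i].u, b = e[i].v, c = e[j].u, d = e[j].v;
      if (i == j || a == d || c == b)          /* same edge, or loop */
         continue;
      if (has_edge(e, n, a, d) || has_edge(e, n, c, b))
         continue;                             /* duplicate edge */
      e[i].v = d;
      e[j].v = b;
      done++;
   }
   return done;
}
```

the linear scan in has_edge keeps the sketch short; on large graphs a hash or
a sorted adjacency structure would be the obvious replacement.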

#ifdef IEEE_754
if (ISNAN(x) || ISNAN(r) || ISNAN(b) || ISNAN(n))
   return x + r + b + n;
#endif

overlap
   -> reintroduce transient attractor systems .. ?
   -> when purging dag, consider node-wise   [ 1 / (1+total-#-neighbours) ] cutoff.
   * visualize DAG. piclo
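
sketch of the node-wise cutoff mentioned above. Assumptions, none of them
confirmed by these notes: the cutoff is applied to the weights in a single
node's neighbour list, 'total-#-neighbours' is the length of that list, and
entries below the cutoff are simply dropped. ivp_t and prune_nodewise are
hypothetical names, not mcl library code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical index/value pair for one sparse vector entry. */
typedef struct { long idx; double val; } ivp_t;

/* Keep only entries with val >= 1 / (1 + n), where n is the number of
 * neighbours of the node. Compacts the array in place, preserves order,
 * and returns the number of entries kept.
*/
size_t prune_nodewise(ivp_t *ivps, size_t n)
{  double cutoff;
   size_t i, kept = 0;
   assert(ivps != NULL || n == 0);
   cutoff = 1.0 / (1.0 + (double) n);
   for (i = 0; i < n; i++)
      if (ivps[i].val >= cutoff)
         ivps[kept++] = ivps[i];
   return kept;
}
```

with three neighbours the cutoff is 0.25, so weights (0.7, 0.2, 0.1) reduce
to the single entry 0.7.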

-  if there is no annotation in a part of the tree, what does mlmimpromptu do?
!? coarsening: use efficiency as cluster->node similarity?
-  mcxsubs foo bar nonsense waits for STDIN; it should check specs first?
?? mcxsubs: specify by label: load a list of labels from file, or read from command line.
$$ mcxrand noise interface is cumbersome.
$  clmdist mode where only the chain is done.
?  report n_cls, n_cls - n_meet for both.
-  make clxdo work on multiple file arguments.
-  compare with David-Goliath index:
   -> how about computing the cluster size and rank where the 50/50 split is?
d  mcxdump write-tabc write-tabr
d  mcxdump --dump-{upper,lower}[i]
-  implement mcxdump --write-tabr-shadow
#  residue no longer adapt, s/-mvp/-o/
#  clm close no longer -cc
#  mcxsubs fin(weed) symmetrifies the domains. (introduced weedg)
$  enable stack reads with argv type input. mclxStackReadArgv
   -> clminfo, clmmeet will use this, except when they don't want
   to hold everything in memory of course.
-  check mclcm with input clustering. fully consistent, same flow?
-  allow tab-write with empty tab
-  stress-test clm_split_overlap
-  mcxsubs: read domain from file .. ?
-  mcx ?: reorder cluster sizes.
-  audit mclxSub implementation and usage, especially NULL argument passing.
-  mclvUpdate{Meet,Diff} speed gains are at most 25%.  worth the complexity?
-  mlmfifofum/weave should do sth sane with loops.
-  b 1 matrix, mclcm; does it include the max loops?
!  stress-test subreads from binary format.  make this a unit test.
~  binary/ascii format: in both write number of entries.
-  mcxload -restrict-{nc,nr,nd}, -extend-{nc,nr,nd} for 123 format.
/  readx REMOVE_LOOPS, SET_LOOPS_MAX, FORCE_LOOPS, UPPER, LOWER UPPERINC LOWERINC
-  write ivp size in binary format and have mcxconvert report it.
-  CHECK -pp for clminfo and mcl have changed semantics currently.
   -> both should no longer use mclgMakeSparse
-  progress bar for reading tab.
-  method for adding v1 + fac* v2
-  last taurus dependency: expand.[ch], il_levels_*
-  mcxerdos compute total number of paths (doable in pathmx?)
-  mcxerdos change protocol: cookie character ?q <node> <node>
-  mcl + stdin + lint: set cache to yes.
/  change distances to work with dim. (sj distance done)
?  mclcm allow shared options in between trailing options.
-  some clmapp for custom contraction of matrices (testbed for different strategies)
-  optify assimilation.
-  optify mclTabHash with ON_FAIL (duplicate labels);
-  clxdo: introduce tag which mimics clewCastActors
\  need way to set loops to max with clxdo too.
-  move level_quiet to a global setting in err.[ch]
-  document clewCastActors transformations on input arguments.
-  can the LFLAGS for included libraries be made more fine-grained?
-  mcx max does not work for matrices: it even moans about lt.
-  mcxdump: optify dumping empty vectors/lines.
-  clmformat: option to skip small clusters
?  mclvCascade ?  sum, powsum, max, min
-  check mclvAdd usage; can it be supplanted by mclvUpdateMeet(,,fltAdd) ?
!/ force-connected=y fails with directed graphs. made a quick fix that I believe works with the transpose as well.
? -dir nm option to make mcl output in nm ?
-  is mcxassemble fully capable of doing asymmetric domains?
-  prune usage of ugly mcxResize.
-  clean up all the interface enums in io.h. some are not used.
-  sth to make matrices symmetric; by add or max or mul ..
   -> generalize addTranspose to mclxMergeTranspose
?  transformations that make difficult graphs more amenable.
   -> (large diameter/segmentation)
-  mcxarray: make '\n' endtoken for vector read, adjust whitespace handling so that line-based stuff can be done.
!/ remove exit's from matrix library.
!  check everything that might fail mem-wise (that's a loooottttt).
!  could copy util ON_ALLOC_FAILURE compile option.
-  convert stack code in /shmcx/stack.[ch] to generic code using callbacks.
   -> do better job at type handling.
-  perhaps remove propagation stuff from vectorUnary,
   -> make vectorCascade instead.
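
sketch for the 'v1 + fac* v2' item above: a plain merge of two sparse
vectors. This assumes entries are sorted by strictly increasing index. The
names ivp_t and vec_add_scaled are hypothetical and this is not the mclv
interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical index/value pair for one sparse vector entry. */
typedef struct { long idx; double val; } ivp_t;

/* Write v1 + fac * v2 into dst, which must have room for n1 + n2 entries.
 * Both inputs must be sorted by strictly increasing idx.
 * Returns the number of entries written. Entries that sum to zero are kept.
*/
size_t vec_add_scaled
(  const ivp_t *v1, size_t n1
,  const ivp_t *v2, size_t n2
,  double fac
,  ivp_t *dst
)
{  size_t i = 0, j = 0, k = 0;
   assert(dst != NULL);
   while (i < n1 || j < n2)
   {  if (j == n2 || (i < n1 && v1[i].idx < v2[j].idx))
         dst[k++] = v1[i++];                   /* index only in v1 */
      else if (i == n1 || v2[j].idx < v1[i].idx)
      {  dst[k].idx = v2[j].idx;               /* index only in v2 */
         dst[k].val = fac * v2[j].val;
         k++; j++;
      }
      else
      {  dst[k].idx = v1[i].idx;               /* index in both */
         dst[k].val = v1[i].val + fac * v2[j].val;
         k++; i++; j++;
      }
   }
   return k;
}
```

the merge is linear in n1 + n2. Whether zero-valued results should be
removed afterwards is left open here.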

audit status of source code
vector.[ch]
   mclvReplaceIdx
      mcxRand
      ?   get rid of insanely complicated weight generation.
      ->  option to throw away diagonal entirely or add diagonal (but mcxi can already do it).
      mclxAddto (changed to accommodate domains)
      mclxCatVectors (new)

-  comparing cluster algorithms gotchas
      -  use optimization criterion as comparison criterion
      -  include (parameterized) graph transformations in one cluster algorithm, not the other
      -  use 'default parameters'
      -  algorithm that is trained
      -  granularity-biased ground truth: specialized test case.
      -  selective testing: test only one of weighted/undirected

-  can we establish a way to incorporate a-priori information into mcl,
   i.e. the clustering must respect a coarse a priori partitioning.
   this can be done by using the induced subgraphs, but is it possible
   to consider intra-cluster edges during the mcl process, while
   ensuring a consistent subclustering, in an elegant way?

-  mcxload
      -123-rmax:     extend options:
      -123-cmax         fail |  ignore | extend
      -235-cmax
      -235-rmax (not yet implemented)

-  mcxload          | mcl
   -stream-tf       | -abc-tf
   --stream-neg-log | --abc-neg-log



   orthology

big matrix + species annotation.

   best reciprocal hits: mutual 1-nn, {species A, species B} - wise
   putative paralogs:      s(a,b) > max s(a,x) && s(a,b) > max s(b,x)
   co-orthologs:  for BRH ..
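
sketch of the best reciprocal hit test ('mutual 1-nn, species-pair wise').
Assumptions: a dense row-major n x n similarity matrix s, one species id per
gene, only positive similarities count as hits, and ties go to the lowest
index. best_hit and is_brh are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Best hit of gene a among the genes of species sp.
 * Returns -1 if a has no positive similarity to any gene of that species.
*/
static int best_hit(const double *s, const int *species, int n, int a, int sp)
{  int x, best = -1;
   for (x = 0; x < n; x++)
   {  if (x == a || species[x] != sp || s[a * n + x] <= 0.0)
         continue;
      if (best < 0 || s[a * n + x] > s[a * n + best])
         best = x;
   }
   return best;
}

/* Return 1 if a and b are from different species and each is the other's
 * best hit within the other's species, 0 otherwise.
*/
int is_brh(const double *s, const int *species, int n, int a, int b)
{  assert(s != NULL && species != NULL);
   assert(a >= 0 && a < n && b >= 0 && b < n);
   if (species[a] == species[b])
      return 0;
   return   best_hit(s, species, n, a, species[b]) == b
         && best_hit(s, species, n, b, species[a]) == a;
}
```

a sparse version would walk the neighbour list of a instead of a full row;
the dense matrix is only for brevity.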




