BM 20160120

Thu May 12 22:41:17 CDT 2016 2659296c74bcec23c5dec54ac227ebf4d09dd60a

Gaussian Data, Obsolete

Tree: start simple

A bifurcating tree with \(L=3\) nodes, each a sequence of \(K=1000\) binary features:

\begin{gather*} \mbox{II } \{0,1,0,0, \ldots, 1\}_{K} \begin{cases} \mbox{I:1 } \{0,1,0,0, \ldots, 0\}_{K}\\ \mbox{I:2 } \{0,1,0,1, \ldots, 1\}_{K}\\ \end{cases} \end{gather*}

Simulation of I:1 and I:2 resembles a continuous-time Jukes-Cantor model:

\[ Q = \begin{bmatrix} 1 - \alpha & \alpha \\ \alpha & 1 - \alpha \end{bmatrix} ;\quad P = \exp(Qt) \]

  • Simulation default setup: \(\alpha = 0.2, t = 0.2\)
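A minimal sketch of how the child sequences might be simulated from the root, using only the Python standard library. It assumes the generator is taken in the standard zero-row-sum form \(Q = \begin{bmatrix} -\alpha & \alpha \\ \alpha & -\alpha \end{bmatrix}\), for which \(\exp(Qt)\) has the closed-form off-diagonal entry \(\tfrac{1}{2}(1 - e^{-2\alpha t})\). The function names, the seed, and the uniform root sequence are assumptions made for illustration, not taken from the project code.

```python
import math
import random


def flip_probability(alpha, t):
    """Off-diagonal entry of P = exp(Qt) for Q = [[-alpha, alpha], [alpha, -alpha]]."""
    return 0.5 * (1.0 - math.exp(-2.0 * alpha * t))


def evolve(parent, alpha=0.2, t=0.2, rng=random):
    """Copy `parent`, flipping each binary feature independently with probability P[0][1]."""
    p_flip = flip_probability(alpha, t)
    return [1 - bit if rng.random() < p_flip else bit for bit in parent]


rng = random.Random(3)  # arbitrary seed, for reproducibility only
K = 1000
# Assumption: the root (II) is drawn uniformly at random; the notes do not say how.
root = [rng.randint(0, 1) for _ in range(K)]
child_1 = evolve(root, rng=rng)  # I:1
child_2 = evolve(root, rng=rng)  # I:2
```

With the default \(\alpha = 0.2, t = 0.2\), each feature flips with probability of about 0.038, so each child is expected to differ from the root at a few dozen of the 1000 positions.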

Tree: same setup, more complex structure

Samples

For each of the \(K\) “features” on a node, sample \(Y_1, \ldots, Y_N\), with \(N = 100\), from \[Y|\bar{Y}, \sigma \sim N(\bar{Y}, \sigma^2)\]\[\bar{Y}|\omega_k \sim N(0, \omega_k^2)\] where \(\omega_k = \omega_1\) if the feature differs from the root, and \(\omega_k = \omega_0\) otherwise.

  • Simulation default setup: \(\omega_0 = 1.0\) for small effect, \(\omega_1 = 5.0\) for large effect, \(\sigma = 0.2\) for some noise.
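The two-level sampling scheme above can be sketched as follows, using only the Python standard library. The function name `sample_node` and its argument names are hypothetical; the defaults mirror the simulation setup stated above. For each feature, a mean is drawn once from \(N(0, \omega_k^2)\), and the \(N\) samples then scatter around it with standard deviation \(\sigma\).

```python
import random


def sample_node(node, root, n=100, omega0=1.0, omega1=5.0, sigma=0.2, rng=None):
    """Return an n x K list of Gaussian samples for one node of the tree.

    `node` and `root` are equal-length binary sequences. A feature that
    differs from the root gets the large-effect scale omega1, otherwise
    the small-effect scale omega0.
    """
    rng = rng or random.Random()
    means = []
    for node_bit, root_bit in zip(node, root):
        omega = omega1 if node_bit != root_bit else omega0
        means.append(rng.gauss(0.0, omega))  # Y-bar ~ N(0, omega_k^2)
    # Each row is one sample Y_i; column k is drawn from N(Y-bar_k, sigma^2).
    return [[rng.gauss(mean, sigma) for mean in means] for _ in range(n)]


# Toy usage with K = 4 features; the node differs from the root at feature 3.
toy_root = [0, 1, 0, 0]
toy_node = [0, 1, 0, 1]
samples = sample_node(toy_node, toy_root, rng=random.Random(31))
```

Stacking the sample matrices of all nodes row-wise would give the kind of input a two-dimensional embedding method expects, one row per sample and one column per feature.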
  #! Manifold learning methods
  # 6 methods (or 9 considering variants on LLE)
  * PCA
  * MDS
  * t-SNE
  * Isomap
  * Spectral embedding
  * Locally linear embedding
    * Standard, LTSA, Hessian, Modified
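One way the listed methods could be set up side by side is sketched below. It assumes scikit-learn, which the notes do not name; the estimator classes are real scikit-learn classes, but the choice of library, the helper names `build_embedders` and `embed_all`, and the settings `n_neighbors=10` and `eigen_solver="dense"` are assumptions for illustration. The dictionary keys follow the output file names used in the results; `LLE_ltsa` and `LLE_hessian` are extrapolated from that pattern.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import (
    MDS,
    TSNE,
    Isomap,
    LocallyLinearEmbedding,
    SpectralEmbedding,
)


def build_embedders(n_neighbors=10, seed=0):
    """Return {name: estimator}; every estimator projects to 2 dimensions."""
    embedders = {
        "pca": PCA(n_components=2),
        "mds": MDS(n_components=2, random_state=seed),
        "t_sne": TSNE(n_components=2, random_state=seed),
        "isomap": Isomap(n_neighbors=n_neighbors, n_components=2),
        "spectral_embedding": SpectralEmbedding(
            n_components=2, n_neighbors=n_neighbors, random_state=seed
        ),
    }
    # The four LLE variants; Hessian LLE needs n_neighbors > 5 for 2 components.
    for method in ("standard", "ltsa", "hessian", "modified"):
        embedders["LLE_" + method] = LocallyLinearEmbedding(
            n_neighbors=n_neighbors,
            n_components=2,
            method=method,
            eigen_solver="dense",
            random_state=seed,
        )
    return embedders


def embed_all(X, **kwargs):
    """Fit every embedder on X (samples x features) and return the 2-D coordinates."""
    return {name: est.fit_transform(X) for name, est in build_embedders(**kwargs).items()}
```

Calling `embed_all(X)` on a samples-by-features array returns nine coordinate arrays, matching the count of six methods plus the LLE variants.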
  
  
  #! Results on simple structure
  snakemake project --config seed=3_1 out_dir=3_1
  # # 3_1/pca.pdf 0.25; 3_1/mds.pdf 0.25; 3_1/spectral_embedding.pdf 0.25; 3_1/t_sne.pdf 0.25
  # # 3_1/LLE_standard.pdf 0.25; 3_1/LLE_modified.pdf 0.25; 3_1/isomap.pdf 0.25
  snakemake project --config seed=31_1 out_dir=31_1
  # # 31_1/pca.pdf 0.25; 31_1/mds.pdf 0.25; 31_1/spectral_embedding.pdf 0.25; 31_1/t_sne.pdf 0.25
  # # 31_1/LLE_standard.pdf 0.25; 31_1/LLE_modified.pdf 0.25; 31_1/isomap.pdf 0.25
  snakemake project --config seed=314_1 out_dir=314_1
  # # 314_1/pca.pdf 0.25; 314_1/mds.pdf 0.25; 314_1/spectral_embedding.pdf 0.25; 314_1/t_sne.pdf 0.25
  # # 314_1/LLE_standard.pdf 0.25; 314_1/LLE_modified.pdf 0.25; 314_1/isomap.pdf 0.25
  snakemake project --config seed=3141_1 out_dir=3141_1
  # # 3141_1/pca.pdf 0.25; 3141_1/mds.pdf 0.25; 3141_1/spectral_embedding.pdf 0.25; 3141_1/t_sne.pdf 0.25
  # # 3141_1/LLE_standard.pdf 0.25; 3141_1/LLE_modified.pdf 0.25; 3141_1/isomap.pdf 0.25
  #! Results on complex structure
  snakemake project --config seed=3_2 out_dir=3_2 data_set=2
  # # 3_2/pca.pdf 0.25; 3_2/mds.pdf 0.25; 3_2/spectral_embedding.pdf 0.25; 3_2/t_sne.pdf 0.25
  # # 3_2/LLE_standard.pdf 0.25; 3_2/LLE_modified.pdf 0.25; 3_2/isomap.pdf 0.25
  snakemake project --config seed=31_2 out_dir=31_2 data_set=2
  # # 31_2/pca.pdf 0.25; 31_2/mds.pdf 0.25; 31_2/spectral_embedding.pdf 0.25; 31_2/t_sne.pdf 0.25
  # # 31_2/LLE_standard.pdf 0.25; 31_2/LLE_modified.pdf 0.25; 31_2/isomap.pdf 0.25
  snakemake project --config seed=314_2 out_dir=314_2 data_set=2
  # # 314_2/pca.pdf 0.25; 314_2/mds.pdf 0.25; 314_2/spectral_embedding.pdf 0.25; 314_2/t_sne.pdf 0.25
  # # 314_2/LLE_standard.pdf 0.25; 314_2/LLE_modified.pdf 0.25; 314_2/isomap.pdf 0.25
  snakemake project --config seed=3141_2 out_dir=3141_2 data_set=2
  # # 3141_2/pca.pdf 0.25; 3141_2/mds.pdf 0.25; 3141_2/spectral_embedding.pdf 0.25; 3141_2/t_sne.pdf 0.25
  # # 3141_2/LLE_standard.pdf 0.25; 3141_2/LLE_modified.pdf 0.25; 3141_2/isomap.pdf 0.25