Programmer Guide/Command Reference/EVAL/modclust: Difference between revisions
From STX Wiki
Revision as of 06:30, 21 April 2011
Model-based agglomerative clustering. A cluster hierarchy and BIC (Bayesian Information Criterion) values are calculated for the given data vectors.
- Usage 1
modclust(x, htable, bictable, mflag {, min, max, alpha})
- x
- data matrix NxM, one data vector with length M per row
- htable
- hierarchy table (reference used for output)
- bictable
- BIC table (reference used for output)
- mflag
- method for distance and BIC calculation
mflag   method
0       Single Linkage
1       Complete Linkage (linear distances)
2       Complete Linkage (log. distances)
- min, max
- optional minimum (default=2) and maximum (default=N) number of clusters for BIC calculation; 2 <= min < max <= N
- alpha
- optional factor for BIC calculation (default=1); 0 <= alpha <= 100
- Result 1
- On return the hierarchy information is stored in htable (Nx3 matrix) and the BIC values are stored in bictable ((max-min+1)x3 matrix). The return value ibest is the index of the BIC table entry with the highest BIC value (ibest=imax(bictable[*,2])).
- hierarchy table htable: N rows, 3 columns
column 0   index of min. row (from)
column 1   index of min. column (to)
column 2   agglomeration cost (distance)
- BIC table bictable: max-min+1 rows, 3 columns
column 0   number of clusters
column 1   log. likelihood
column 2   BIC
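The selection of ibest described in Result 1 can be sketched outside STX. The following Python snippet is only an illustration: the table values are made up, and the helper name best_bic_index is hypothetical, not part of STX. It assumes min=2 and max=5, so the table has max-min+1 = 4 rows.

```python
# Hypothetical BIC table for min=2, max=5: one row per cluster count.
# Columns: [number of clusters, log. likelihood, BIC]. Values are invented.
bictable = [
    [2, -120.5, -260.1],
    [3, -101.2, -240.8],
    [4,  -98.7, -255.3],
    [5,  -97.9, -271.0],
]


def best_bic_index(table):
    """Return the row index holding the highest value in column 2 (BIC)."""
    return max(range(len(table)), key=lambda i: table[i][2])


ibest = best_bic_index(bictable)
best_nclust = bictable[ibest][0]  # cluster count stored in column 0
print(ibest, best_nclust)
```

Column 0 of the selected row gives the number of clusters that can then be passed as nclust in Usage 2.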
- Usage 2
modclust(htable, nclust)
- htable
- cluster hierarchy table (Nx3 matrix, see Usage 1).
- nclust
- number of clusters
- Result 2
- The return value is the created partition table ptable, a vector with N elements containing the cluster indices. The value ptable[i] (i=0..N-1) is the index of the cluster containing the data vector i.
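How a hierarchy can be cut into a partition vector may be easier to see in a small sketch. The Python code below is not the STX implementation. It assumes, purely for illustration, that the hierarchy is a list of (from, to, cost) merges in agglomeration order and that applying the first N - nclust merges leaves nclust clusters. The real htable layout may differ, and the function name cut_hierarchy is hypothetical.

```python
def cut_hierarchy(merges, n, nclust):
    """Build a partition vector with n elements from a merge list.

    ASSUMPTION: `merges` holds (from, to, cost) rows in agglomeration order,
    and each row joins the clusters containing items `from` and `to`.
    This layout is illustrative, not the documented htable semantics.
    """
    parent = list(range(n))  # union-find forest, one tree per cluster

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # In a valid hierarchy every merge removes one cluster,
    # so n - nclust merges leave exactly nclust clusters.
    for src, dst, _cost in merges[: n - nclust]:
        parent[find(src)] = find(dst)

    # Relabel the tree roots as consecutive cluster indices 0..nclust-1.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]


# Invented merge list for N=5 data vectors.
merges = [(0, 1, 0.5), (3, 4, 0.7), (2, 1, 1.9), (4, 1, 3.2)]
ptable = cut_hierarchy(merges, n=5, nclust=2)
print(ptable)
```

Here ptable[i] is the cluster index of data vector i, matching the description of Result 2.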
- Usage 3
modclust(ptable, iclust {, x})
- ptable
- partition table (Nx1 matrix, see Usage 2).
- iclust
- index of cluster to be extracted
- x
- input data matrix (NxM matrix, see Usage 1)
- Result 3
- if x is supplied: the data matrix of all data vectors associated with cluster iclust
- otherwise: the index vector containing the indices of the data vectors associated with cluster iclust
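The two return forms of Usage 3 can be mirrored in a few lines of Python. This is a sketch of the described behaviour only; the function name extract_cluster and the sample data are hypothetical, and the data matrix is represented as a list of rows.

```python
def extract_cluster(ptable, iclust, x=None):
    """Return the members of cluster `iclust`.

    Without `x`: the indices of the data vectors in the cluster.
    With `x`: the corresponding rows of the data matrix.
    """
    indices = [i for i, c in enumerate(ptable) if c == iclust]
    if x is None:
        return indices
    return [x[i] for i in indices]


# Invented partition table (N=5) and data matrix (5x2).
ptable = [0, 0, 0, 1, 1]
x = [[1.0, 2.0], [1.1, 2.1], [0.9, 1.8], [5.0, 6.0], [5.2, 6.1]]

idx = extract_cluster(ptable, 1)       # index vector
rows = extract_cluster(ptable, 1, x)   # data vectors of cluster 1
print(idx, rows)
```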