Linear Manifold Clustering
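The linear manifold clustering algorithm (LMCLUS) discovers clusters that are described by the following model:

$$x = \mu + B\phi + \bar{B}\epsilon,$$

where $N$ is the dimension of the dataset, $K$ is the dimension of the manifold, $\mu \in \mathbb{R}^N$ is the linear manifold translation vector, $B \in \mathbb{R}^{N \times K}$ is a matrix whose columns are orthonormal vectors that span the $K$-dimensional manifold subspace, $\bar{B} \in \mathbb{R}^{N \times (N-K)}$ is a matrix whose columns span the subspace orthogonal to the one spanned by the columns of $B$, $\phi \in \mathbb{R}^K$ is a zero-mean random vector whose entries are i.i.d. from the support of the linear manifold, and $\epsilon \in \mathbb{R}^{N-K}$ is a zero-mean random vector with small variance, independent of $\phi$.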
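The model can be made concrete by sampling synthetic points from it: a translation vector, plus a component lying within the manifold, plus small noise orthogonal to it. The following is a minimal sketch that uses only the Julia standard library; the sizes, the uniform distribution for the in-manifold coordinates, the noise scale, and all variable names are assumptions chosen for illustration and are not part of LMCLUS.

```julia
using LinearAlgebra, Random

Random.seed!(1)                      # fixed seed so the sketch is repeatable

N, K, n = 3, 2, 200                  # ambient dimension, manifold dimension, sample count (illustrative)

# Build an orthonormal basis of the whole space from a QR factorization,
# then split it into the manifold basis and its orthogonal complement.
Q    = Matrix(qr(randn(N, N)).Q)
B    = Q[:, 1:K]                     # columns span the manifold
Bbar = Q[:, K+1:end]                 # columns span the orthogonal complement

mu    = randn(N)                     # translation vector (manifold origin)
phi   = 2 .* rand(K, n) .- 1         # zero-mean i.i.d. coordinates within the manifold
noise = 0.01 .* randn(N - K, n)      # zero-mean, small-variance orthogonal noise

# Each column of X is one sample drawn from the model.
X = mu .+ B * phi .+ Bbar * noise
```

A matrix built this way has one sample per column, which is the layout that `lmclus` expects for its sample matrix.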
Clustering
This package implements the LMCLUS algorithm in the lmclus function:
- lmclus(X, p)
  Performs linear manifold clustering over the given dataset.
  Parameters:
  - X – The given sample matrix. Each column of X is a sample.
  - p – The clustering parameters as an instance of LMCLUSParameters.
  This function returns an LMCLUSResult instance.
Results
Let M be an instance of Manifold, n be the number of observations, and d be the dimension of the linear manifold cluster.
- indim(M)
  Returns the dimension of the observation space.
- outdim(M)
  Returns the dimension of the linear manifold cluster, which is the dimension of the subspace.
- size(M)
  Returns the number of points in the cluster, which is the size of the cluster.
- points(M)
  Returns the indexes of the points assigned to the cluster.
- mean(M)
  Returns the translation vector, which contains the coordinates of the linear manifold origin.
- projection(M)
  Returns the basis matrix whose columns are the orthonormal vectors that span the linear manifold.
- separation(M)
  Returns the Separation object associated with the cluster.
Example
using LMCLUS
# Load test data, remove the label column, and transpose so that each column is a sample
X = readdlm(Pkg.dir("LMCLUS", "test", "testData"), ',')[:,1:end-1]'
# Initialize clustering parameters with the
# maximum dimensionality for clusters.
# It should be less than the original space dimension.
params = LMCLUSParameters(5)
# perform clustering; this returns a collection of clusters
clust = lmclus(X, params)
# pick the first cluster
M = manifold(clust, 1)
# obtain indexes of points assigned to the cluster
l = points(M)
# obtain the linear manifold cluster translation vector
mu = mean(M)
# get basis vectors that span manifold as columns of the returned matrix
B = projection(M)
# get separation properties
S = separation(M)
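
The translation vector and the orthonormal basis obtained from mean(M) and projection(M) are enough to measure how far any point lies from the cluster's manifold. The sketch below shows one way to compute that orthogonal distance. The helper distance_to_manifold is hypothetical and not part of LMCLUS, and the toy mu and B stand in for the values a real cluster would return.

```julia
using LinearAlgebra

# Hypothetical helper (not part of LMCLUS): orthogonal distance from point x
# to the linear manifold with translation vector mu and orthonormal basis B.
function distance_to_manifold(x::AbstractVector, mu::AbstractVector, B::AbstractMatrix)
    v = x - mu               # shift so the manifold passes through the origin
    r = v - B * (B' * v)     # remove the component that lies within the manifold
    return norm(r)           # length of what remains is the orthogonal distance
end

# Toy stand-ins for mean(M) and projection(M):
# the manifold is the xy-plane shifted to z = 1.
mu = [0.0, 0.0, 1.0]
B  = [1.0 0.0;
      0.0 1.0;
      0.0 0.0]

d = distance_to_manifold([3.0, -2.0, 4.0], mu, B)   # 3.0
```

The projection step relies on the columns of B being orthonormal, as stated for projection(M); for a basis that is not orthonormal, a least-squares solve would be needed instead.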