Multidimensional partitioning and bi-partitioning: analysis and application to gene expression datasets

Gabriela Kalna, J. Keith Vass, Desmond J. Higham

Research output: Contribution to journal (Article)

7 Citations (Scopus)

Abstract

Eigenvectors and, more generally, singular vectors, have proved to be useful tools for data mining and dimension reduction. Spectral clustering and reordering algorithms have been designed and implemented in many disciplines, and they can be motivated from several different standpoints. Here we give a general, unified derivation from an applied linear algebra perspective. We use a variational approach that has the benefit of (a) naturally introducing an appropriate scaling, (b) allowing for a solution in any desired dimension, and (c) dealing with both the clustering and bi-clustering issues in the same framework. The motivation and analysis are then backed up with examples involving two large data sets from modern, high-throughput, experimental cell biology. Here, the objects of interest are genes and tissue samples, and the experimental data represents gene activity. We show that looking beyond the dominant, or Fiedler, direction reveals important information.
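
As an illustration of the kind of computation the abstract describes, the following is a minimal Python sketch, not the authors' published algorithm. It assumes a nonnegative data matrix (rows as genes, columns as samples) and a scaling by row and column sums; the abstract says only that an appropriate scaling arises naturally, so that choice is an assumption. The function name `spectral_bipartition`, the parameter `k`, and the toy matrix are all hypothetical.

```python
import numpy as np


def spectral_bipartition(W, k=1):
    """Embed rows and columns of a nonnegative matrix W in k dimensions.

    Assumed scaling (not taken from the paper): divide entry (i, j) by
    sqrt(row_sum_i * col_sum_j), take the SVD, skip the dominant singular
    pair, and rescale the next k singular vectors.
    Returns (row_coords, col_coords) with shapes (m, k) and (n, k).
    """
    W = np.asarray(W, dtype=float)
    row_sums = W.sum(axis=1)
    col_sums = W.sum(axis=0)
    if np.any(row_sums <= 0) or np.any(col_sums <= 0):
        raise ValueError("every row and column needs a positive sum")

    r = 1.0 / np.sqrt(row_sums)  # row scaling factors
    c = 1.0 / np.sqrt(col_sums)  # column scaling factors
    scaled = r[:, None] * W * c[None, :]

    # Singular values come back in descending order.
    U, s, Vt = np.linalg.svd(scaled, full_matrices=False)

    # Index 0 is the dominant pair; keep the k directions after it.
    row_coords = r[:, None] * U[:, 1:k + 1]
    col_coords = c[:, None] * Vt[1:k + 1, :].T
    return row_coords, col_coords


# Hypothetical toy data: two row groups that are active on different columns.
W = np.array([
    [5.0, 4.0, 0.1, 0.2],
    [4.0, 5.0, 0.2, 0.1],
    [0.1, 0.2, 5.0, 4.0],
    [0.2, 0.1, 4.0, 5.0],
    [0.1, 0.1, 4.0, 4.0],
])
rows, cols = spectral_bipartition(W, k=1)

# Splitting on the sign of the first retained direction gives a bi-partition
# of rows and columns at the same time.
row_labels = rows[:, 0] > 0
col_labels = cols[:, 0] > 0
```

Setting `k` above 1 returns further directions, which corresponds to the abstract's point about looking beyond the first partitioning direction; how those extra coordinates are then clustered is left open here.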
Language: English
Pages: 475-485
Number of pages: 11
Journal: International Journal of Computer Mathematics
Volume: 85
Issue number: 3/4
DOI: 10.1080/00207160701210158
Publication status: Published - Mar 2008

Fingerprint

Gene expression
Partitioning
Genes
Cytology
Gene
Biclustering
Spectral Clustering
Singular Vectors
Reordering
Variational Approach
Dimension Reduction
Large Data Sets
Eigenvalues and eigenfunctions
Algebra
Eigenvector
High Throughput
Biology
Data mining

Keywords

  • data mining
  • dimension reduction
  • feature selection
  • graph Laplacian
  • Fiedler vector
  • microarray
  • singular value decomposition
  • tumour classification

Cite this

Kalna, Gabriela; Vass, J. Keith; Higham, Desmond J. / Multidimensional partitioning and bi-partitioning: analysis and application to gene expression datasets. In: International Journal of Computer Mathematics. 2008; Vol. 85, No. 3/4. pp. 475-485.
@article{c64486849caa425f800e35e4681171ac,
title = "Multidimensional partitioning and bi-partitioning: analysis and application to gene expression datasets",
abstract = "Eigenvectors and, more generally, singular vectors, have proved to be useful tools for data mining and dimension reduction. Spectral clustering and reordering algorithms have been designed and implemented in many disciplines, and they can be motivated from several different standpoints. Here we give a general, unified derivation from an applied linear algebra perspective. We use a variational approach that has the benefit of (a) naturally introducing an appropriate scaling, (b) allowing for a solution in any desired dimension, and (c) dealing with both the clustering and bi-clustering issues in the same framework. The motivation and analysis are then backed up with examples involving two large data sets from modern, high-throughput, experimental cell biology. Here, the objects of interest are genes and tissue samples, and the experimental data represents gene activity. We show that looking beyond the dominant, or Fiedler, direction reveals important information.",
keywords = "data mining, dimension reduction, feature selection, graph Laplacian, Fiedler vector, microarray, singular value decomposition, tumour classification",
author = "Gabriela Kalna and Vass, {J. Keith} and Higham, {Desmond J.}",
year = "2008",
month = "3",
doi = "10.1080/00207160701210158",
language = "English",
volume = "85",
pages = "475--485",
journal = "International Journal of Computer Mathematics",
issn = "0020-7160",
number = "3/4",

}
