Object Oriented Microarray Library: Principal Components Analysis

The gene-sample-pca module provides the definition of the gene.pc class and the sample.pc class. See the bottom of the page for an example of how these classes can be used.

Class Name: gene.pc

Attributes

scores: A matrix whose rows describe each gene as a weighted sum (linear combination) of the principal components.
variances: A vector describing the amount of the variance explained by each principal component.
components: A matrix whose columns are the principal components.

Methods

gene.pc(data): The constructor takes a data frame as its argument, and then performs a principal components analysis on the rows, which we think of as representing genes. The columns of the data are first centered to have mean zero, and then a singular value decomposition is used to perform the analysis.
plot(object): This simple plot method plots the first two principal components against each other.
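
The computation described for the gene.pc constructor can be sketched in a few lines of R. This is a minimal illustration, not the library's source. The function name gene.pc.sketch, the plain list it returns, and the (rows - 1) divisor used for the variances are all assumptions made for this example.

```r
# Hypothetical sketch of the steps described above; not the library's code.
gene.pc.sketch <- function(data) {
  x <- as.matrix(data)
  # Center each column (sample) to have mean zero.
  centered <- sweep(x, 2, colMeans(x))
  # Singular value decomposition: centered = U %*% diag(d) %*% t(V).
  decomp <- svd(centered)
  list(
    # Rows describe each gene in terms of the principal components (U scaled by d).
    scores = sweep(decomp$u, 2, decomp$d, "*"),
    # Assumed scaling: squared singular values divided by (rows - 1).
    variances = decomp$d^2 / (nrow(x) - 1),
    # Columns are the principal components.
    components = decomp$v
  )
}

set.seed(1)
toy <- matrix(rnorm(20 * 5), ncol = 5)   # 20 genes, 5 samples
fit <- gene.pc.sketch(toy)
dim(fit$scores)       # one row per gene
dim(fit$components)   # one column per principal component
```

With this layout, multiplying the scores by the transpose of the components recovers the centered data, which is a convenient check on the orientation of the two matrices.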

Description

An object of the gene.pc class represents the results of performing a principal components analysis on the genes.

Class Name: sample.pc

Attributes

scores: A matrix whose columns express each of the samples as a weighted sum (linear combination) of the principal components.
variances: A vector describing the amount of the variance explained by each principal component.
components: A matrix whose rows are the principal components.
splitter: The value passed in as the optional argument to the constructor.

Methods

sample.pc(data, splitter, usecor, center): The constructor takes a data frame as its first (required) argument and performs a principal components analysis on the columns, which we think of as representing samples. By default, the rows of the data are first centered to have mean zero, and then a singular value decomposition is used to perform the analysis. Note that this method is much faster than the default method implemented in the standard S-PLUS princomp function. The optional splitter argument, which defaults to 0, should be a logical vector that separates the samples into two types for later plotting purposes. The optional usecor argument, which defaults to false, determines whether to use the correlation matrix instead of the covariance matrix. The optional center argument, which defaults to true, determines whether the data should be centered before performing the analysis.
plot(object, split, name, ...): This method produces three plots: the first principal component against the second, the first against the third, and the second against the third. It usually works best in a 2x2 layout of plots. The optional split argument defaults to the value of the splitter attribute. If supplied, it should be a logical vector that separates the samples into two types, which are plotted with distinct colors.
screeplot(object, ...): Produces a screeplot of the variances explained by the principal components.
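
The steps described for the sample.pc constructor can be sketched in R as follows. This is an illustration under stated assumptions rather than the library's implementation. The name sample.pc.sketch, the list return value, the (samples - 1) divisor, and the reading of usecor as "scale each row to unit standard deviation" are all assumptions.

```r
# Hypothetical sketch; argument names follow the text, the internals are assumed.
sample.pc.sketch <- function(data, usecor = FALSE, center = TRUE) {
  x <- as.matrix(data)
  if (center) {
    # Center each row (gene) to have mean zero.
    x <- sweep(x, 1, rowMeans(x))
  }
  if (usecor) {
    # Assumed meaning of usecor: scale each row to unit standard deviation,
    # so the analysis rests on correlations rather than covariances.
    x <- sweep(x, 1, apply(x, 1, sd), "/")
  }
  # Singular value decomposition: x = U %*% diag(d) %*% t(V).
  decomp <- svd(x)
  list(
    # Columns express each sample in terms of the principal components.
    scores = sweep(t(decomp$v), 1, decomp$d, "*"),
    # Assumed scaling: squared singular values divided by (samples - 1).
    variances = decomp$d^2 / (ncol(x) - 1),
    # Rows are the principal components.
    components = t(decomp$u)
  )
}

set.seed(2)
toy <- matrix(rnorm(50 * 6), ncol = 6)   # 50 genes, 6 samples
fit <- sample.pc.sketch(toy)
# Proportion of variance per component, the quantity a scree plot displays.
round(fit$variances / sum(fit$variances), 3)
```

Because only one small decomposition of the data matrix is needed, this route avoids forming a genes-by-genes covariance matrix, which is one plausible reason for the speed advantage mentioned above.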

Description

An object of the sample.pc class represents the results of a principal components analysis applied to the samples.

Example

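  # Simulated data: one row per gene, one column per sample
  n.genes <- 1000
  n.samples <- 30

  bogus <- matrix(rnorm(n.samples*n.genes, 0, 3), ncol=n.samples)
  # Randomly assign half of the samples to the second type
  splitter <- rep(FALSE, n.samples)
  splitter[sample(1:n.samples, trunc(n.samples/2))] <- TRUE

  # Principal components analysis on the genes and on the samples
  x <- gene.pc(bogus)
  y <- sample.pc(bogus, splitter)

  plot(x)
  screeplot(y)
  # Use a 2x2 layout for the sample plots, then restore the old settings
  opar <- par(mfrow=c(2,2))
  plot(y)
  par(opar)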
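
For readers without the library at hand, the three-panel display that the sample.pc plot method is described as producing can be approximated with base R graphics. The sketch below is a stand-in, not the method itself. The colors, plotting symbol, axis labels, and the smaller data dimensions are arbitrary choices, and the scores are computed directly with svd on the assumption that the rows are centered first.

```r
# Hypothetical stand-in for the sample.pc plot method; internals are assumed.
set.seed(3)
n.genes <- 200
n.samples <- 12
bogus <- matrix(rnorm(n.samples * n.genes, 0, 3), ncol = n.samples)
splitter <- rep(FALSE, n.samples)
splitter[sample(1:n.samples, trunc(n.samples / 2))] <- TRUE

# Row-center and decompose; each column of scores describes one sample.
centered <- sweep(bogus, 1, rowMeans(bogus))
decomp <- svd(centered)
scores <- sweep(t(decomp$v), 1, decomp$d, "*")

# Two colors, chosen arbitrarily here, distinguish the two sample types.
colors <- ifelse(splitter, "red", "blue")
pairs.to.plot <- list(c(1, 2), c(1, 3), c(2, 3))

pdf(NULL)                        # null device: no window opened, no file written
opar <- par(mfrow = c(2, 2))     # 2x2 layout, as suggested for plot(y)
for (p in pairs.to.plot) {
  plot(scores[p[1], ], scores[p[2], ], col = colors, pch = 16,
       xlab = paste("Component", p[1]), ylab = paste("Component", p[2]))
}
par(opar)
dev.off()
```

To view the plots interactively, drop the pdf(NULL) and dev.off() lines so that the default graphics device is used.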