PerturbationClusterTest {ClassDiscovery} | R Documentation |

Performs a parametric bootstrap test (by adding independent Gaussian noise) to determine whether the clusters found by an unsupervised method appear to be robust in a given data set.

PerturbationClusterTest(data, FUN, nTimes = 100, noise = 1, verbose = TRUE, ...)

`data` |
A data matrix, numerical data frame, or
`ExpressionSet` object. |

`FUN` |
A `function` that, given a data matrix,
returns a vector of cluster assignments. Examples of functions
with this behavior are `cutHclust`,
`cutKmeans`, `cutPam`, and
`cutRepeatedKmeans`. |

`...` |
Additional arguments passed to the classifying function, `FUN`. |

`noise` |
An optional numeric argument; the standard deviation of the mean-zero Gaussian noise added to each measurement in each bootstrap sample. Defaults to 1. |

`nTimes` |
The number of bootstrap samples to collect. |

`verbose` |
A logical flag that controls whether progress is reported while the bootstrap runs. |
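To make the contract for `FUN` concrete, here is a minimal sketch of a custom clustering function in base R. The name `cutAverageLinkage` is hypothetical and not part of the package; the sketch assumes that `FUN` receives the data matrix as its first argument, that extra arguments such as `k` arrive through `...`, and that one cluster label is expected per column.

```r
# Hypothetical clustering function with the shape described for FUN:
# it takes a data matrix (columns = samples) and returns one label per column.
cutAverageLinkage <- function(data, k, ...) {
  # Columns are the objects being clustered, so transpose before dist()
  d <- dist(t(data))
  tree <- hclust(d, method = "average")
  # Cut the tree into k groups; the result has length ncol(data)
  cutree(tree, k = k)
}

set.seed(1)
toy <- matrix(rnorm(20 * 6), nrow = 20, ncol = 6)
labels <- cutAverageLinkage(toy, k = 2)
length(labels)  # one label for each of the 6 columns
```

A function of this shape could presumably be passed in place of `cutHclust` or `cutKmeans`, with `k` supplied as an additional argument.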

Objects should be created using the `PerturbationClusterTest` function, which performs the requested bootstrap on the clusters. Following the standard R paradigm, the resulting object can be summarized and plotted to determine the results of the test.
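The perturbation idea can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: `perturbAgreement` and `clusterColumns` are hypothetical names, the input is coerced to a plain numeric matrix, and agreement is tallied as a simple count of co-clustering events.

```r
# Sketch of a perturbation test: add Gaussian noise, re-cluster, and count
# how often each pair of columns ends up in the same cluster.
perturbAgreement <- function(data, FUN, nTimes = 100, noise = 1, ...) {
  data <- as.matrix(data)
  n <- ncol(data)
  agree <- matrix(0, nrow = n, ncol = n)
  for (i in seq_len(nTimes)) {
    # Independent mean-zero Gaussian noise for every measurement
    noisy <- data + matrix(rnorm(length(data), mean = 0, sd = noise),
                           nrow = nrow(data), ncol = n)
    labels <- FUN(noisy, ...)
    # TRUE where two columns share a cluster label in this perturbed sample
    agree <- agree + outer(labels, labels, "==")
  }
  agree
}

# Hypothetical clustering function: complete-linkage hclust on the columns
clusterColumns <- function(data, k) cutree(hclust(dist(t(data))), k = k)

set.seed(42)
# Two groups of five columns each, with clearly different means
toy <- cbind(matrix(rnorm(50 * 5, mean = 0), nrow = 50),
             matrix(rnorm(50 * 5, mean = 3), nrow = 50))
counts <- perturbAgreement(toy, clusterColumns, nTimes = 20, noise = 0.5, k = 2)
counts / 20  # proportion of perturbed data sets in which each pair co-clustered
```

Pairs whose proportion stays near 1 or near 0 across perturbations suggest stable cluster membership, while intermediate values suggest that the assignment is sensitive to noise.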

- `f`: A `function` that, given a data matrix, returns a vector of cluster assignments. Examples of functions with this behavior are `cutHclust`, `cutKmeans`, `cutPam`, and `cutRepeatedKmeans`.
- `noise`: The standard deviation of the Gaussian noise added during each bootstrap sample.
- `nTimes`: An integer, the number of bootstrap samples that were collected.
- `call`: An object of class `call`, which records how the object was produced.
- `result`: Object of class `matrix` containing, for each pair of columns in the original data, the number of times they belonged to the same cluster of a bootstrap sample.
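As an illustration of how a count matrix like the one described for `result` might be read, the sketch below uses made-up counts for four samples. The numbers, the variable names, and the 0.8 threshold are hypothetical choices for the example, not values produced or recommended by the package.

```r
# Made-up agreement counts for 4 samples after nTimes = 10 perturbations
nTimes <- 10
counts <- matrix(c(10,  9,  1,  0,
                    9, 10,  2,  1,
                    1,  2, 10,  8,
                    0,  1,  8, 10), nrow = 4, byrow = TRUE)

# Convert counts to the proportion of samples in which each pair co-clustered
prop <- counts / nTimes

# List each pair once (upper triangle) that co-clustered at least 80% of the time
stable <- which(prop >= 0.8 & upper.tri(prop), arr.ind = TRUE)
stable
```

Each row of `stable` holds the indices of one pair of samples that clustered together consistently; the diagonal is excluded because every sample always agrees with itself.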

Class `ClusterTest`, directly. See that class for descriptions of the inherited methods `image` and `hist`.

- summary `signature(object = PerturbationClusterTest)`: Write out a summary of the object.

Kevin R. Coombes <kcoombes@mdanderson.org>

Kerr MK, Churchill GJ. Bootstrapping cluster analysis: Assessing the reliability of conclusions from microarray experiments. PNAS 2001; 98:8961-8965.

`ClusterTest`, `BootstrapClusterTest`

    library(ClassDiscovery)

    # simulate data from two different groups
    d1 <- matrix(rnorm(100*30, rnorm(100, 0.5)), nrow=100, ncol=30, byrow=FALSE)
    d2 <- matrix(rnorm(100*20, rnorm(100, 0.5)), nrow=100, ncol=20, byrow=FALSE)
    dd <- cbind(d1, d2)
    cols <- rep(c('red', 'green'), times=c(30,20))

    # perform your basic hierarchical clustering...
    hc <- hclust(distanceMatrix(dd, 'pearson'), method='complete')

    # bootstrap the clusters arising from hclust
    bc <- PerturbationClusterTest(dd, cutHclust, nTimes=200, k=3, metric='pearson')
    summary(bc)

    # look at the distribution of agreement scores
    hist(bc, breaks=101)

    # let heatmap compute a new dendrogram from the agreement
    image(bc, col=blueyellow(64), RowSideColors=cols, ColSideColors=cols)

    # plot the agreement matrix with the original dendrogram
    image(bc, dendrogram=hc, col=blueyellow(64), RowSideColors=cols, ColSideColors=cols)

    # bootstrap the results of K-means
    kmc <- PerturbationClusterTest(dd, cutKmeans, nTimes=200, k=3)
    image(kmc, dendrogram=hc, col=blueyellow(64), RowSideColors=cols, ColSideColors=cols)

    # contrast the behavior when all the data comes from the same group
    xx <- matrix(rnorm(100*50, rnorm(100, 0.5)), nrow=100, ncol=50, byrow=FALSE)
    hct <- hclust(distanceMatrix(xx, 'pearson'), method='complete')
    bct <- PerturbationClusterTest(xx, cutHclust, nTimes=200, k=4, metric='pearson')
    summary(bct)
    image(bct, dendrogram=hct, col=blueyellow(64), RowSideColors=cols, ColSideColors=cols)

    # cleanup
    rm(d1, d2, dd, cols, hc, bc, kmc, xx, hct, bct)

[Package *ClassDiscovery* version 2.5.0 Index]