The tnom module provides the definition of the tnom class.
See the bottom of the page for an example of how the class can be used.

The tnom constructor takes a matrix of expression data, zm, whose
rows are genes and whose columns are samples, and a logical vector,
zclass, that classifies the samples into two types. The
constructor performs a TNoM analysis (as defined by Yakhini and
Ben-Dor) to determine the minimum number of misclassifications of the
samples using each possible split along the expression values of
each gene. (Because this process can be time-consuming, the
constructor prints brief progress reports as it goes.)

The simulate.genes.tnom function takes a tnom object
as its argument. On a gene-by-gene basis, it scrambles the
classifying labels on the columns (samples) and recomputes the total
number of misclassifications with these random labels. It returns
the number of genes at each misclassification level, as returned by
the summary method.

The scramble.samples.tnom function takes the same arguments as the
tnom constructor. The
function then scrambles the classifying labels on the columns
(samples) and computes the total number of misclassifications with
these random labels. It returns the number of genes at each
misclassification level, as returned by the summary method.

An object of the tnom class represents
the result of a preliminary TNoM analysis. This method was introduced
by Yakhini and Ben-Dor, and was used by Bittner et al. in their study
of melanoma. The underlying idea is that, for each gene, we can order
the samples in increasing order of expression. At the gap between each
expression level, we can split the data and ask how many samples are
misclassified by such a split. The smallest number of
misclassifications for this gene is a quality measure that describes
how well its expression levels (when converted to ranks) match the
actual classification. A key additional step is to determine how many
genes there are at each misclassification level. In particular, one
would like to know that the number of genes with only a few
misclassifications is greater than would be expected by random
chance.
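
The per-gene score described above can be sketched in a few lines of R.
This is a minimal illustration, not the code inside the tnom module:
the helper name tnom.score, the variables expr and cls, and the use of
table to count genes per level are all assumptions made for the example.
It also assumes there are no tied expression values within a gene.

```r
# Hypothetical helper (not part of the tnom module): the smallest number of
# misclassifications over all splits of one gene's sorted expression values.
tnom.score <- function(x, cls) {
  lab <- cls[order(x)]                # class labels in increasing expression order
  n.true <- sum(lab)
  n.false <- length(lab) - n.true
  # Entry k+1 counts the samples of each class among the k lowest values,
  # for k = 0, 1, ..., n (k = 0 and k = n are the trivial splits).
  true.below <- c(0, cumsum(lab))
  false.below <- c(0, cumsum(!lab))
  # Rule A predicts FALSE below the split and TRUE above it.
  err.a <- true.below + (n.false - false.below)
  # Rule B predicts TRUE below the split and FALSE above it.
  err.b <- false.below + (n.true - true.below)
  min(err.a, err.b)
}

# Illustrative random data: 200 genes (rows) by 10 samples (columns).
set.seed(1)
expr <- matrix(rnorm(200 * 10, 0, 3), ncol = 10)
cls <- rep(c(TRUE, FALSE), each = 5)
levs <- 0:min(sum(cls), sum(!cls))    # a score cannot exceed the smaller class size

# Number of genes at each misclassification level, with the true labels.
scores <- apply(expr, 1, tnom.score, cls = cls)
observed <- table(factor(scores, levels = levs))

# The same counts after one random scramble of the sample labels.
scr.scores <- apply(expr, 1, tnom.score, cls = sample(cls))
scrambled <- table(factor(scr.scores, levels = levs))
```

Comparing cumsum(observed) with cumsum(scrambled) then gives a rough
sense of whether the number of genes with only a few misclassifications
is larger than chance would produce.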

n.genes <- 200
n.samples <- 10
n.sim <- 10
bogus <- matrix(rnorm(n.samples*n.genes, 0, 3), ncol=n.samples)
splitter <- rep(F, n.samples)
splitter[sample(1:n.samples, trunc(n.samples/2))] <- T
tn <- tnom(bogus, splitter)
temp <- matrix(0, n.sim, length(summary(tn)))
for (i in 1:n.sim) {
  print(i)
  temp[i,] <- simulate.genes.tnom(tn)
}
ct <- data.frame(t(apply(temp, 1, cumsum)))
fakir <- apply(ct, 2, mean)
dex <- 0:(length(fakir)-1)
scram <- scramble.samples.tnom(bogus, splitter)
obs <- cumsum(summary(tn))
scr <- cumsum(scram)
plot(dex, fakir, type='n',
     xlab='Maximum Number of Misclassifications',
     ylab='Number of Genes')
points(dex, fakir, type='b', col=6, pch=1)
points(dex, obs, type='b', col=8, pch=16)
points(dex, scr, type='b', col=4, pch=17)
title(paste('TNoM', bogus$name))
legend(3, 50, c('observed', 'expected', 'scrambled'),
       col=c(8, 6, 4), marks=c(16, 1, 17))