tail.rank.test {TailRank} | R Documentation |
Perform a tail-rank test to find candidate biomarkers in a microarray data set.
tail.rank.test(data, split)
tail.rank.test(data, split, direction = "down")
tail.rank.test(data, split, specificity = 0.95, tolerance = 0.9, confidence = 0.95, direction = "up")
data |
A matrix or data.frame containing numerical measurements on which to perform the tail-rank test. |
split |
A logical vector or factor splitting the data into two
parts. The length of this vector should equal the number of columns
in data. The TRUE portion (or the first level of
the factor) represents a "base" or "healthy" group of samples; the
other samples are the "test" or "cancer" group. |
specificity |
a real number between 0 and 1; the desired specificity used in the test to estimate a quantile from the "base" group. This is an optional argument with default value 0.95. |
tolerance |
a real number between 0 and 1; the upper tolerance bound used to estimate the threshold. This is an optional argument with default value 0.90. |
confidence |
a real number between 0 and 1; the confidence level that there are no false positives. This is an optional argument with default value 0.95. |
direction |
a character string representing the direction of the test; can be "up", "down", or "two-sided". The default value is "up". |
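As an illustration of the split argument described above, the following minimal sketch builds a grouping factor and its logical equivalent. The sample labels and the variable names status and is.base are hypothetical. By default R orders factor levels alphabetically, so "cancer" would become the first (base) level unless the levels are given explicitly.

```r
# Hypothetical sample labels for five columns of a data matrix.
# Levels are set explicitly so that "healthy" is the first level,
# i.e. the "base" group.
status <- factor(c("healthy", "healthy", "cancer", "cancer", "cancer"),
                 levels = c("healthy", "cancer"))

# Equivalent logical form: TRUE marks the "base" samples.
is.base <- status == levels(status)[1]
```

Under these assumptions, either status or is.base could be passed as split.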
This function computes the tail-rank statistic for each gene (viewed as one row of the data matrix). The data are split into two groups. The first ("base") group is used to estimate a tolerance bound (90% by default) on a specific quantile (95% by default) of the distribution of each gene. The tail-rank statistic is then defined as the number of samples in the second ("test") group that lie outside the bound. The test can be applied in the "up", "down", or "two-sided" direction, depending on the kinds of markers being sought. The function also computes the cutoff for significance based on a confidence level equal to "1 - FWER" for a desired family-wise error rate (FWER).
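The idea behind the statistic can be sketched in base R. This is a simplified illustration, not the package's implementation. It assumes the "up" direction, uses a plain empirical quantile in place of the tolerance bound, and derives the cutoff from a Bonferroni-adjusted binomial null, which may differ from how tail.rank.test computes it. All variable names are hypothetical.

```r
# Simulated null data: rows are genes, columns are samples.
set.seed(1)
n.genes <- 200
base.n  <- 40
test.n  <- 70
x <- matrix(rnorm(n.genes * (base.n + test.n)), nrow = n.genes)
split <- rep(c(TRUE, FALSE), c(base.n, test.n))

specificity <- 0.95
confidence  <- 0.95

# Per-gene threshold estimated from the "base" columns only.
# ASSUMPTION: an empirical quantile stands in for the tolerance bound.
threshold <- apply(x[, split, drop = FALSE], 1, quantile, probs = specificity)

# Tail-rank statistic ("up"): number of "test" samples above the threshold.
# The comparison recycles threshold down the rows, so row i uses threshold[i].
tail.rank <- rowSums(x[, !split, drop = FALSE] > threshold)

# ASSUMPTION: under the null, each test sample exceeds the threshold with
# probability 1 - specificity, so the count is roughly binomial. A Bonferroni
# adjustment over genes then gives a cutoff controlling FWER at 1 - confidence.
cutoff <- qbinom(1 - (1 - confidence) / n.genes, test.n, 1 - specificity)

# Genes whose statistic exceeds the cutoff would be called candidate markers.
is.candidate <- tail.rank > cutoff
```

Because the threshold here ignores the uncertainty in estimating the quantile from a finite base group, this sketch is less conservative than a test based on a tolerance bound.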
The return value is an object of class tail.rank.test.
Kevin R. Coombes <kcoombes@mdanderson.org>
http://bioinformatics.mdanderson.org
tail.rank.test-class, tail.rank.power, biomarker.power.table, tol.bound
# generate some fake data to use in the example
nr <- 40000
nc <- 110
fake.data <- matrix(rnorm(nr*nc), ncol=nc)
fake.class <- rep(c(TRUE, FALSE), c(40, 70))

# perform the tail-rank test
null.tr <- tail.rank.test(fake.data, fake.class)

# get a summary of the results
summary(null.tr)

# plot a histogram of the statistics
hist(null.tr, overlay=TRUE)

# get the actual statistics
stats <- getStatistic(null.tr)

# get a vector that selects the "positive" calls for the test
is.marker <- as.logical(null.tr)

# the following line should evaluate to the number of rows, nr = 40000
sum( is.marker == (stats > null.tr@cutoff) )