%
% Lectures for Microarray Course
%
% This is the template file for all lectures to ensure consistent
% layout, fonts, and colors.
%
% Remember to update the hypersetup to include individual lecture
% titles, change the leftheader to include the lecture title, and
% then start at \begin{document} ....
% Also, produce a printable (black and white version) according
% to the cues immediately following \begin{document}.
%
% Author: Keith Baggerly and Kevin R. Coombes
% Date : 17 August 2004
%
% Compile this document with:
% pdflatex .tex
% and then:
% ppower4pb .pdf .pdf
%
\documentclass[Screen4to3,25pt,headrule,footrule]{foils}
\usepackage[pdftex]{color} %for colors
\definecolor{Teal}{rgb}{0.0,0.47,0.46}
\definecolor{PaleTeal}{rgb}{0.0,0.8,0.6}
\definecolor{DarkTeal}{rgb}{0.0,0.2157,0.2118}
\definecolor{Lemon}{rgb}{1.0,1.0,0.4}
\definecolor{Orange}{rgb}{1.0,0.6,0.0}
\definecolor{PaleBlue}{rgb}{0.0,0.8,1.0}
\definecolor{Gray7}{rgb}{0.7,0.7,0.7}
\definecolor{DarkBurgundy}{rgb}{0.2745,0.0,0.0}
\definecolor{Burgundy}{rgb}{0.62,0.06,0.41}
\definecolor{DarkGreen}{rgb}{0.5,0.8,0.6}
\definecolor{RGBblack}{rgb}{0.0,0.0,0.0} %black doesn't work (?) for
% {h,v}pagecolor so use this
\usepackage[pdftex]{graphicx} %for importing graphics
\usepackage{times} %for nice fonts
\usepackage[pdftex]{hyperref} %for hyperlinking and other effects
\hypersetup{
pdftitle={Lecture 12: Rank-based tests of differential expression},
pdfsubject={Analysis of Microarray Data},
pdfauthor={Keith A. Baggerly and Kevin R. Coombes,
Department of Bioinformatics and Computational Biology,
UT M.D. Anderson Cancer Center,
,
},
pdfkeywords={microarray,R},
pdfpagemode={None},
linkcolor=PaleBlue,
citecolor=Gray7,
pagecolor=PaleBlue,
urlcolor=PaleBlue
}
\usepackage{pause} %required for ppower4
\usepackage{background} %required for ppower4
\usepackage{pp4slide} %required for ppower4
\usepackage{pp4link}
\usepackage[pdftex]{geometry} %for resizing the header areas
\geometry{hscale=0.85, %make the slides 85% of horizontal
vscale=0.8 % and 80% of vertical
}
\setcounter{page}{0} %don't count the title page
\setlength{\foilheadskip}{-0.5in} %skip less space below the title
%\setlength{\itemsep}{1pt} %set distance between list items
% use different colored bullets in nested lists
% color choices come from MDACC graphics standards
\renewcommand{\labelitemi}{\textcolor{PaleTeal}{$\bullet$}}
\renewcommand{\labelitemii}{\textcolor{Orange}{$\bullet$}}
\renewcommand{\labelitemiii}{\textcolor{PaleBlue}{$\bullet$}}
\renewcommand{\labelitemiv}{\textcolor{Lemon}{$\bullet$}}
% change color of slide titles
%\renewcommand\normalcolor{\color{black}}
%\def\Black#1{\color{black}#1}
% change the default background, again an MDACC standard
\vpagecolor[RGBblack]{Teal}
\raggedright %no right-justify
% put lecture specific items in the header and consistent course
% items in the footer
\leftheader{\hyperlink{start}{\textsc{Introduction to Microarrays}}}
\rightheader{\textsf{\thepage}}
\MyLogo{{\copyright\ Copyright 2004--2007, Kevin R. Coombes and Keith
A. Baggerly}} % left footer
\rightfooter{\textsc{GS01 0163: Analysis of Microarray Data}}
\begin{document}
\hypersetup{pdfpagetransition=Replace}
%\color{black} % for black-and-white version
% Also, run M-x replace-string \textcolor{white} \textcolor{black}.
% Then, using pdflatex only (no ppower4 step) should produce the
% black and white version with pauses shown as colored boxes.
% Remember to run M-x replace-string \textcolor{black} \textcolor{white}
% afterwards to reset!
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Title Page
\title{\textcolor{Lemon}{GS01 0163\\ Analysis of Microarray Data}}
\author{\textcolor{white}{Keith Baggerly and Kevin Coombes}\\
\textcolor{white}{Department of Bioinformatics and Computational Biology}\\
\textcolor{white}{UT M. D. Anderson Cancer Center}\\
\textcolor{Orange}{\tt kabagg@mdanderson.org}\\
\textcolor{Orange}{\tt kcoombes@mdanderson.org}}
\date{\textcolor{Lemon}{9 October 2007}}
\LogoOff
\maketitle
\SweaveOpts{prefix.string=Figs/ma12}
<>=
options(width=55)
options(SweaveHooks=list(fig=function() par(bg="white",lwd=2)))
@
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Individual Lecture Title Page
\foilhead{\hypertarget{start}{Lecture 12}: Rank-based tests of differential expression}
\LogoOn
% In most cases, this slide should be an outline of topics to be
% covered in this lecture, presented as an itemized list. Each item in
% the list should provide a hyperlink to the slide where discussion of
% that topic begins.
\begin{itemize}
\item \hyperlink{wilcox}{Wilcoxon rank-sum test}
\item \hyperlink{ebayes}{Empirical Bayes}
\item \hyperlink{tailrank}{The Tail-Rank Test}
\item \hyperlink{stripchart}{Looking at the results}
\end{itemize}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{Nonparametric tests}
The t-test for differences in mean expression between two groups of
samples assumes that the measurements in each group are normally
distributed. If this assumption is far from the truth, then the
t-statistics and p-values you get may be meaningless. (Actually,
departures from normality tend to increase the Type II error,
especially when the sample size is small.)
Statistically, the dispute over log-transforming microarray data
reduces to whether a normal distribution better describes the data on
the raw scale or on the log scale.
We can avoid this problem entirely by using a statistical test that
does not assume anything about the distributions. These tests are
usually called \textcolor{Orange}{distribution-free} or
\textcolor{Orange}{nonparametric}.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{\hypertarget{wilcox}{Wilcoxon rank-sum test}}
The most common nonparametric test for a difference in mean expression
is the \textcolor{Orange}{Wilcoxon rank-sum test}, which is also known
as the \textcolor{Orange}{Mann-Whitney test}.
We assume that we have sample measurements from two groups:
$$X_1, X_2, \dots, X_{n_X}$$
$$Y_1, Y_2, \dots, Y_{n_Y}$$
We then rank these values from smallest to largest, getting something
like
$$
\begin{array}{ccccccccccc}
X_3 &\le &Y_5 &\le &X_1 &\le &X_{10} &\le &Y_1 &\le \cdots \le &X_2 \\
1 & &2 & &3 & &4 & &5 & &n_X+n_Y\\
\end{array}
$$
\foilhead{Computing rank-sums}
Next, we compute a statistic $W$ by summing the ranks of the
measurements from the first group. In our example,
$$W = 1 + 3 + 4 + \cdots + (n_X+n_Y).$$
$W$ is always an integer, and it is easy to compute its minimum and
maximum values. The minimum occurs when all the $X$ values are smaller
than all the $Y$ values. Thus, all the $X$ values are at the beginning
of the list, and we have
$$W \ge 1 + 2 + \cdots + n_X = \frac{n_X(n_X+1)}{2}.$$
\foilhead{}
The maximum occurs when all the $X$ values come after all the $Y$
values, giving
\begin{eqnarray*}
W & \le & (n_Y+1) + (n_Y+2) + \cdots + (n_Y+n_X) \\
& = & n_Xn_Y + (1 + 2 + \cdots n_X)\\
& = & n_Xn_Y + \frac{n_X(n_X+1)}{2}\\
& = & \frac{n_X(2n_Y + n_X + 1)}{2}.
\end{eqnarray*}
Those formulas are very nice, but let's see what happens when we have
$10$ samples in each group. The range of values for $W$ in this case
is
$$(1+2+\cdots+10)=55 \le W \le (11+12+\cdots+20)=155$$
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{When is a rank-sum significant?}
Intuitively, if we get a value of $W$ near its extreme values, then we
strongly suspect that the two groups are different. If, however, we
get a value near the middle, then we suspect that there is no
difference. How can we make this idea more precise?
First, let's do some exploration. We start by generating an unstructured
random data matrix:
<>=
n.genes <- 40000
n.samples <- 20
type <- rep(c('A', 'B'), each=10)
data <- matrix(rnorm(n.genes*n.samples),
ncol=n.samples)
@
\foilhead{}
Next, we rank the values in each row:
<>=
ranked.data <- apply(data, 1, rank)
dim(data)
dim(ranked.data)
@
\foilhead{}
Notice that the matrix of ranks is transposed when compared to the
original data matrix. So, we can compute the Wilcoxon rank-sum
statistics by summing the correct ranks by column:
<>=
wilstat <- apply(ranked.data[type=='A',], 2, sum)
summary(wilstat)
@
We don't quite get to the extremes... %'
\foilhead{Null distribution of Wilcoxon rank-sum}
\begin{center}
<>=
hist(wilstat, breaks=seq(54.5, 155.5, by=1))
@
\end{center}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{What happens with real data?}
We will return to the prostate cancer data set used in the last
lecture. Recall that this data set contains the log ratios of $42,129$
genes measured using two-color fluorescent microarrays and a common
reference channel. Recall also that we selected a subset of $10$
samples from normal prostate and $10$ samples of prostate cancer.
<>=
if (!exists("clinical.info")) {
load(file.path("TailRankData", "data", "expression.data.rda"))
load(file.path("TailRankData", "data", "clinical.info.rda"))
status <- clinical.info[, "Status"]
choice <- c(sample(which(status=="N"), 10), sample(which(status=="T"), 10))
expression.data <- expression.data[, choice]
clinical.info <- clinical.info[choice,]
clinical.info[, "Status"] <- factor(status[choice])
}
@
We compute the Wilcoxon rank-sum statistics for this data set:
<>=
ranked.data <- apply(expression.data, 1, rank)
dim(ranked.data)
status <- clinical.info[, "Status"]
ranksum <- apply(ranked.data[status=='N',], 2, sum)
@
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{Real data yields extreme statistics}
<>=
summary(ranksum)
@
\begin{center}
<>=
hist(ranksum, breaks=seq(54.5, 155.5, by=1))
@
\end{center}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{Distributions matter}
When we simulated data, we got values of the rank-sum statistic
between \Sexpr{min(wilstat)} and \Sexpr{max(wilstat)}.
When we looked at real data, we got rank-sum
statistics that spanned the full range of possible values from $55$ to
$155$.
One (pessimistic) interpretation of this result is that rank-sum
statistics are only useful in microarray experiments if they find
genes where all the values in one group are less than all the values
in the other group.
We can do better by looking more carefully at the distributions.
\foilhead{R and Wilcoxon}
R contains functions to explore the distribution of the rank-sum
statistics:
\begin{description}
\item[rwilcox]{generate random values from the Wilcoxon distribution}
\item[dwilcox]{probability density function}
\item[pwilcox]{cumulative probability function}
\item[qwilcox]{quantile function}
\end{description}
This set of functions parallels those for other distributions (like
\texttt{rnorm}, \texttt{dnorm}, \texttt{pnorm}, and \texttt{qnorm} for
the normal distribution).
One should note, however, that the rank-sum statistics in R are
shifted so that the smallest value is $0$ instead of $n_X$.
\foilhead{The null distribution}
\begin{center}
<>=
minW <- sum(1:10); maxW <- sum(11:20)
breaker <- seq(minW-0.5, maxW+0.5, by=1)
hist(wilstat, breaks=breaker, prob=TRUE)
lines(minW:maxW, dwilcox(0:(maxW-minW), 10, 10),
col='red', lwd=3)
@
\end{center}
\foilhead{The real distribution}
\begin{center}
<>=
hist(ranksum, breaks=breaker, prob=TRUE, ylim=c(0,0.03))
lines(minW:maxW, dwilcox(0:(maxW-minW), 10, 10),
col='red', lwd=3)
@
\end{center}
\foilhead{Wilcoxon p-values}
<>=
wilp <- sapply(ranksum-minW, function(w, m, n) {
if (w > m*n/2) 2*(1-pwilcox(w, m, n))
else 2*pwilcox(w, m, n)
}, 10, 10)
@
\foilhead{}
\begin{center}
<>=
hist(wilp, breaks=100)
@
\end{center}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{\hypertarget{ebayes}{Empirical Bayes}}
The discreteness of the values of the Wilcoxon statistics makes the
distribution of p-values problematic for the application of something
like BUM to sort out the significance in the face of multiple
testing. Instead, we are going to use a differnt approach.
Reference: Efron and Tibshirani. Empirical Bayes methods and false
discovery rates for microarrays. \textit{Genetic Epidemiology}, 2002;
\textbf{23}: 70--86.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{Basic idea}
Assume that there are two classes of genes,
\textcolor{Lemon}{Different} and \textcolor{Lemon}{Not Different}. We
assume prior probabilities
\begin{itemize}
\item $p_0=$ Prob(Not Different)
\item $p_1= 1 - p_0 =$ Prob(Different)
\end{itemize}
and density functions
\begin{itemize}
\item $f_0(y)$, known Wilcoxon, if Not Different
\item $f_1(y)$, unknown, if Different
\end{itemize}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{Mixtures}
The overall probability density function is a mixture
$$ f(y) = p_0f_0(y) + p_1f_1(y).$$
Bayes Theorem: $ P(H | D) = P(D | H) P(H)/P(D)$
Applying Bayes Theorem gives posterior estimates:
$$ p_1(y) \equiv Prob(Diff | Y = y) = 1 - p_0f_0(y)/f(y) $$
and
$$ p_0(y) \equiv Prob(Not Diff | Y = y) = p_0f_0(y)/f(y) $$
We can use the observed data to estimate the overall density function
by $\hat{f}(y)$ (typically by log-transforming the observed function
and fitting a curve.)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{Empirical Bayes}
The ``empirical'' nature of this Bayesian idea is that we can adjust
the ``prior'' $p_0$ after looking at the data, and thus obtain some
reasonable values for it. First, here is how well we fit the
distribution (mentally swap the labels, since they are wrong):
\begin{center}
\includegraphics[height=4in]{Figs/ebayes1}
\end{center}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{Plot of Posterior Probability of Difference}
\begin{center}
\includegraphics[height=4in]{Figs/ebayes2}
\end{center}
This graph assumes $p_0 = 1$, so no genes are different. The posterior
probability of being different becomes negative in the middle of
the graph. This results from the ``empirical'' nature of the estimate
without imposing a full model. We can, however, adjust $p_0$ to
prevent seeing any negative probabilities.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{Plot of Posterior Probability of Difference}
\begin{center}
\includegraphics[height=4in]{Figs/ebayes3}
\end{center}
This shows posterior probabilities with $p_0 = 0.7, 0.8, 0.9, 1.0$.
Somewhere between $p0=0.7$ and $p_0=0.8$, all the posterior
probabilities become positive.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{Plot of Posterior Probability of Difference}
\begin{center}
\includegraphics[height=4in]{Figs/ebayes4}
\end{center}
This plot uses $p_0=0.75$, which is essentially the largest value we
can use for $p_0$ and ensure that all the posterior probabilities are
positive. The horizontal line indicates a posterior probability of
$90\%$ that a gene is differentially expressed.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{How does this work in R?}
We have implemented this idea in an R package:
\url{http://bioinformatics.mdanderson.org/software.html}
\begin{center}
\includegraphics[height=5in]{Figs/mdasw}
\end{center}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{The OOMPA home page}
\begin{center}
\includegraphics[height=5.5in]{Figs/oompa}
\end{center}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{The ClassComparison package}
\begin{center}
\includegraphics[height=5in]{Figs/cc}
\end{center}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{Empirical Bayes in R}
The ClassComparison package implements the empirical Bayes method with
Wilcoxon statistics. For the computations shown above, do the
following:
\begin{verbatim}
> require(ClassComparison)
> efron <- MultiWilcoxonTest(expression.data,
+ status)
> hist(efron)
> plot(efron, prior=c(0.7, 0.8, 0.9, 1.0),
+ signif=NULL)
> abline(h=0)
> plot(efron, prior=0.75, ylim=c(0,1))
> abline(h=0)
\end{verbatim}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{How many genes are differentially expressed?}
As a crude estimate, the fact that we had to take $p_0 = 0.75$
suggests that $25\%$ of the genes may be different; this is consistent
with the BUM estimate from last time. With a cutoff of $90\%$ on the
posterior probability, we get:
\begin{verbatim}
> cutoffSignificant(efron, prior=0.75, signif=0.9)
$low
[1] 68
$high
[1] 143
> sum(efron@rank.sum.statistics < 68)
[1] 839
> sum(efron@rank.sum.statistics > 143)
[1] 825
\end{verbatim}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{\hypertarget{tailrank}{The Tail-Rank Test}}
The Wilcoxon rank-sum test and the t-test both look at the same
property: is the mean expression the same? When looking for cancer
biomarkers, this may well be the \textcolor{Lemon}{wrong} question.
Cancers that are histologically the same are not identical. Deletion
of part of chromosome 3 (3p14-p23) is found in $50\%$ of
non-small-cell lung cancers; MYC amplification is found in $14\%$ of
stomach cancers; BRCA1 mutations are found in a subset of breast
cancers; a translocation between chromosomes 11 and 14 occurs in
$35\%$ of mantle cell lymphomas. These genetic abnormalities directly
causes specific differences in gene expression that only occur in a
subset of cancers. Statistically, these results suggest that the
distributions of gene expression in cancer patients are likely to
differ from the healthy distributions in much more than the location
of the center.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{Motivation: Subset Biomarkers}
If a biomarker is only present in $20\%$ of the cancer samples, then
the distributions might look something like this.
\begin{center}
\includegraphics[height=5in]{Figs/motivate}
\end{center}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{Outline of The Tail-Rank Test}
\begin{itemize}
\item Collect data on $G$ genes from $n_H$ healthy individuals. Write
$X_{g,i}$ for measurement of gene $g$ on individual $i$. Assume for
fixed $g$ that $X_{g,i} \sim X_g$ are IID.
\item Specify a target value $\psi$ for specificity.
\item Estimate, for each $g$, a threshold $\tau_g$ such that $Prob(X_g <
\tau_g) = \psi$.
\item Collect data from $n_C$ cancer patients. Count the number $Y_g$
of cancer patients for which the measured expression level of gene
$g$ exceeds $\tau_g$; we call $Y_g$ the \textcolor{Orange}{tail-rank
statistic}.
\item Call $g$ a biomarker if $Y_g$ exceeds a certain threshold.
\end{itemize}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{The null distribution}
Null hypothesis: gene $g$ is not a useful biomarker.
More precisely: the measurements on cancer patients have the same
distribution as the measurements from healthy individuals.
Then: all $Y_g$ have identical binomial distributions,
$$Y_g \sim Y = Binom(n_C, 1-\psi).$$
The point here is that the probability of being in the tail is the
same for healthy and cancer, and is given by $1-\psi$, where $\psi$
was the desired specificity.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{}
Even when we perform the same test for $G$ genes, the expected maximum
value of $G$ independent instances of $Y_g$ remains small.
Let $M_G = max_{g=1\dots G} \left(Y_g\right)$ be the maximum over $G$
IID binomial random variables. Also, let
\begin{eqnarray*}
\alpha & = & \alpha(m) = Prob(Y > m)\\
\gamma & = & Prob(M_G > m)
\end{eqnarray*}
Then
\begin{eqnarray*}
1-\gamma &=& Prob(M_G \le m) \\
&=& Prob(Y_1\le m, \dots, Y_G\le m) \\
&=& Prob(Y\le m)^G = (1-\alpha)^G.
\end{eqnarray*}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{The maximum value expected by chance}
Solving,
$$\alpha = 1 - (1-\gamma)^{1/G}.$$
and $m$ is the $(1-\alpha)^{\rm th}$ quantile of a single binomial
distribution:
\bigskip
\hbox to \hsize{\hfill\vbox{\offinterlineskip
\hrule
\halign{\vrule#&\strut\quad\hfil#\quad
&\vrule#\quad&\hfil#\quad&\hfil#\quad&\hfil#\quad &\hfil#\quad
&\vrule#\cr
&&&\multispan4\hfil$\gamma=0.01$, $\psi=0.99$\hfil&\cr
\noalign{\hrule}
&$n_C$& &$G=100$&$1000$&$10000$&$100000$&\cr
\noalign{\hrule}
& 10 && 3& 3& 4& 4&\cr
& 20 && 3& 4& 5& 5&\cr
& 50 && 5& 6& 6& 7&\cr
&100 && 6& 7& 8& 9&\cr
&250 && 10& 12& 13& 14&\cr
&500 && 15& 17& 19& 20&\cr
}\hrule}\hfill}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{Interpretation}
One needs to specify two parameters in order to apply the tail-rank
test.
\begin{enumerate}
\item $\psi$, the desired specificity of the biomarker
\item $\gamma$, the desired bound on the FWER
\end{enumerate}
Then, given the number of genes and the number of cancer samples, the
values $m$ in a table like the previous one represent the largest
value of $Y_g$ that we would expect to see by chance over the entire
microarray. Any gene where we observe $Y_g > m$ is a potential
biomarker.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{Tail-rank and real data}
We return yet again to our prostate cancer data set. We will now start
using the entire data set, which contains $14$ samples from normal
prostate, $61$ prostate cancer samples, and $9$ samples from lymph
node metastases of prostate cancer. With this number of samples,
taking $\gamma=0.95$ and $\psi=0.95$, a gene was called a biomarker if
at least $16$ of the $71$ cancer samples were above the threshold.
We assumed that the log ratios of the \textcolor{Lemon}{normal
prostate} samples were normally distributed. We computed $90\%$
tolerance bounds for the $5^{\rm th}$ and $95^{\rm th}$ percentiles,
and counted the number of combined prostate cancer samples whose log
ratios fell outside these boundaries.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{Tail-rank results}
We identified $1,766$ spots that were ``positive'' biomarkers, since
they were present at higher than normal levels in at least $16$ cancer
samples. We also identified $1,930$ spots that were ``negative''
biomarkers, since they were expressed at lower than normal levels in
at least $16$ samples. In total, we identified
\textcolor{Orange}{3,692} spots as candidate biomarkers.
Although the theory told us the number of false positives should be
close to zero, we decided to test this using both simulations and a
permutaion test. We simulated completely random (IID normal) data 100
times, and we permuted the samples labels on the real data 100 times.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{Using the tail-rank test in R}
We developed the tail-rank test. A preprint, along with an R package
is available on the web at
\url{http://bioinformatics.mdanderson.org/TailRank}
\begin{verbatim}
> require(TailRank)
> tr.test <- TailRankTest(expression.data, status,
+ direction="two")
> summary(tr.test)
A tail-rank test object in the two-sided direction.
Specificity: 0.95 computed with tolerance 0.9.
Significance cutoff: 15 based on a family-wise error
rate less than 0.05.
There are 3692 tail-rank statistics that exceed the
cutoff.
\end{verbatim}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{Reviewing significance}
\begin{center}
\includegraphics[height=5in]{Figs/simHist}
\end{center}
Pretty good, when you consider that the test with the real data
detected a few thousand potential markers!
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{Differential expression results}
We repeated the t-test analysis on the full data set (adjusing for
multiple testing using BUM). With $FDR <0.05$, we used a cutoff at $p
< 0.000045$ or $|t| > 4.25$. We detected \textcolor{Orange}{3,522}
differentially expressed spots. Of these, $1,415$ spots were
overexpressed in prostate cancer and $2,107$ spots were
underexpressed.
We also repeated the Wilcoxon test with the empirical Bayes approach.
In order to get comparable results, we selected a cutoff corresponding
to a posterior probability of $99.9\%$. We detected
\textcolor{Orange}{3,627} differentially expressed spots. Of these,
$1,498$ spots were overexpressed and $2,129$ spots were underexpressed
in prostate cancer.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{Comparing tests}
The number of genes found by the three tests was very similar. Are
they finding the same things?
There was good agreement between the t-test and the Wilcoxon
test. More than $90\%$ ($1,905$) of underexpressed and $88\%$
($1,244$) of overexpressed spots that were found by the t-test were
also detected by the Wilcoxon test. So, we only need to compare one
of these to the tail-rank test.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{Comparing tests}
\begin{center}
\includegraphics[height=5in]{Figs/scissors}
\end{center}
Lower left and right = different by T, not by tail-rank
Upper center = different by tail-rank, not by T.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\foilhead{\hypertarget{stripchart}{Looking at the results}}
Since the tests give different answers, which one should we believe?
Both, since they are giving the answers to different questions!
Whether you perform one test or many, however, it is useful to look at
the expression values for some of the genes that you find, if only to
make sure you believe the results.
A useful R function for this purpose is
\texttt{\textcolor{Orange}{stripchart}}. Here is an example for our
data set. First, we get the clinical status as an \texttt{ordered} factor.
\begin{verbatim}
> x <- ordered(clinical.info$Status,
+ levels=c('N', 'T', 'L'))
\end{verbatim} %$
\foilhead{Selecting interesting genes}
Now we look at genes with small tail-rank statistics ($< 2$) and
significant t-statistics.
\begin{verbatim}
> tr.stats <- getStatistic(tr.test)
> k.weird <- tr.stats < 2 & (abs(t.stats) > 4.25)
> sum(k.weird)
[1] 38
\end{verbatim}
We can select one of the ``weird'' genes and get its expression data.
\begin{verbatim}
> i.k.weird <- (1:length(k.weird))[k.weird]
> i <- sample(i.k.weird, 1)
> y <- as.vector(t(expression.data[i,]))
\end{verbatim}
\foilhead{Outliers can throw off the estimates}
\begin{verbatim}
> label <- as.character(gene.info$Gene.Symbol[i])
> stripchart(y ~ x, xlab='', main=label,
+ method='jitter')
\end{verbatim} %$
\begin{center}
\includegraphics[height=4in]{Figs/lap1b}
\end{center}
\foilhead{Some genes are normally variable}
\begin{center}
\includegraphics[height=5in]{Figs/supt3h}
\end{center}
\foilhead{Selecting interesting biomarkers}
Now we look at genes with significant tail-rank statistics and
small t-statistics ($|t| < 1.25 $).
\begin{verbatim}
> k.weird <- tr.stats > 15 & (abs(t.stats) < 1.25)
> sum(k.weird)
[1] 52
\end{verbatim}
We use the same idea to select some of these genes and plot
stripcharts to see if the values agree with what we tihnk the test
should be doing.
\foilhead{GDF11}
\begin{center}
\includegraphics[height=5in]{Figs/gdf11}
\end{center}
\foilhead{HACE1}
\begin{center}
\includegraphics[height=5in]{Figs/hace1}
\end{center}
\foilhead{CANX}
\begin{center}
\includegraphics[height=5in]{Figs/canx}
\end{center}
\foilhead{GITA}
\begin{center}
\includegraphics[height=5in]{Figs/gita}
\end{center}
\foilhead{When T and Tail-Rank agree}
\begin{center}
\includegraphics[height=5in]{Figs/cav1}
\end{center}
\foilhead{More Examples with ClassComparison}
\begin{verbatim}
> require(ClassComparison}
> # perform t-tests
> tea <- MultiTtest(exprs(prostate), status)
> # beta-uniform mixture model
> bum <- Bum(tea@p.values)
> hist(bum)
> # Dudoit's method
> dudoit <- Dudoit(exprs(prostate), status)
> plot(dudoit)
> # significane analysis of microarrays
> sam <- Sam(exprs(prostate), status)
> plot(sam)
\end{verbatim}
\end{document}