% % Lectures for Microarray Course % % This is the template file for all lectures to ensure consistent % layout, fonts, and colors. % % Remember to update the hypersetup to include individual lecture % titles, change the leftheader to include the lecture title, and % then start at \begin{document} .... % Also, produce a printable (black and white version) according % to the cues immediately following \begin{document}. % % Author: Keith Baggerly and Kevin R. Coombes % Date : 17 August 2004 % % Compile this document with: % pdflatex .tex % and then: % ppower4pb .pdf .pdf % \documentclass[Screen4to3,25pt,headrule,footrule]{foils} \usepackage[pdftex]{color} %for colors \definecolor{Teal}{rgb}{0.0,0.47,0.46} \definecolor{PaleTeal}{rgb}{0.0,0.8,0.6} \definecolor{DarkTeal}{rgb}{0.0,0.2157,0.2118} \definecolor{Lemon}{rgb}{1.0,1.0,0.4} \definecolor{Orange}{rgb}{1.0,0.6,0.0} \definecolor{PaleBlue}{rgb}{0.0,0.8,1.0} \definecolor{Gray7}{rgb}{0.7,0.7,0.7} \definecolor{DarkBurgundy}{rgb}{0.2745,0.0,0.0} \definecolor{Burgundy}{rgb}{0.62,0.06,0.41} \definecolor{DarkGreen}{rgb}{0.5,0.8,0.6} \definecolor{RGBblack}{rgb}{0.0,0.0,0.0} %black doesn't work (?) for % {h,v}pagecolor so use this \usepackage[pdftex]{graphicx} %for importing graphics \usepackage{times} %for nice fonts \usepackage[pdftex]{hyperref} %for hyperlinking and other effects \hypersetup{ pdftitle={Lecture 12: Rank-based tests of differential expression}, pdfsubject={Analysis of Microarray Data}, pdfauthor={Keith A. Baggerly and Kevin R. Coombes, Department of Bioinformatics and Computational Biology, UT M.D. Anderson Cancer Center, , }, pdfkeywords={microarray,R}, pdfpagemode={None}, linkcolor=PaleBlue, citecolor=Gray7, pagecolor=PaleBlue, urlcolor=PaleBlue } \usepackage{pause} %required for ppower4 \usepackage{background} %required for ppower4 \usepackage{pp4slide} %required for ppower4 \usepackage{pp4link} \usepackage[pdftex]{geometry} %for resizing the header areas \geometry{hscale=0.85, %make the slides 85% of horizontal vscale=0.8 % and 80% of vertical } \setcounter{page}{0} %don't count the title page \setlength{\foilheadskip}{-0.5in} %skip less space below the title %\setlength{\itemsep}{1pt} %set distance between list items % use different colored bullets in nested lists % color choices come from MDACC graphics standards \renewcommand{\labelitemi}{\textcolor{PaleTeal}{$\bullet$}} \renewcommand{\labelitemii}{\textcolor{Orange}{$\bullet$}} \renewcommand{\labelitemiii}{\textcolor{PaleBlue}{$\bullet$}} \renewcommand{\labelitemiv}{\textcolor{Lemon}{$\bullet$}} % change color of slide titles %\renewcommand\normalcolor{\color{black}} %\def\Black#1{\color{black}#1} % change the default background, again an MDACC standard \vpagecolor[RGBblack]{Teal} \raggedright %no right-justify % put lecture specific items in the header and consistent course % items in the footer \leftheader{\hyperlink{start}{\textsc{Introduction to Microarrays}}} \rightheader{\textsf{\thepage}} \MyLogo{{\copyright\ Copyright 2004--2007, Kevin R. Coombes and Keith A. Baggerly}} % left footer \rightfooter{\textsc{GS01 0163: Analysis of Microarray Data}} \begin{document} \hypersetup{pdfpagetransition=Replace} %\color{black} % for black-and-white version % Also, run M-x replace-string \textcolor{white} \textcolor{black}. % Then, using pdflatex only (no ppower4 step) should produce the % black and white version with pauses shown as colored boxes. % Remember to run M-x replace-string \textcolor{black} \textcolor{white} % afterwards to reset! %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % Title Page \title{\textcolor{Lemon}{GS01 0163\\ Analysis of Microarray Data}} \author{\textcolor{white}{Keith Baggerly and Kevin Coombes}\\ \textcolor{white}{Department of Bioinformatics and Computational Biology}\\ \textcolor{white}{UT M. D. Anderson Cancer Center}\\ \textcolor{Orange}{\tt kabagg@mdanderson.org}\\ \textcolor{Orange}{\tt kcoombes@mdanderson.org}} \date{\textcolor{Lemon}{9 October 2007}} \LogoOff \maketitle \SweaveOpts{prefix.string=Figs/ma12} <>= options(width=55) options(SweaveHooks=list(fig=function() par(bg="white",lwd=2))) @ %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % Individual Lecture Title Page \foilhead{\hypertarget{start}{Lecture 12}: Rank-based tests of differential expression} \LogoOn % In most cases, this slide should be an outline of topics to be % covered in this lecture, presented as an itemized list. Each item in % the list should provide a hyperlink to the slide where discussion of % that topic begins. \begin{itemize} \item \hyperlink{wilcox}{Wilcoxon rank-sum test} \item \hyperlink{ebayes}{Empirical Bayes} \item \hyperlink{tailrank}{The Tail-Rank Test} \item \hyperlink{stripchart}{Looking at the results} \end{itemize} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{Nonparametric tests} The t-test for differences in mean expression between two groups of samples assumes that the measurements in each group are normally distributed. If this assumption is far from the truth, then the t-statistics and p-values you get may be meaningless. (Actually, departures from normality tend to increase the Type II error, especially when the sample size is small.) Statistically, the dispute over log-transforming microarray data reduces to whether a normal distribution better describes the data on the raw scale or on the log scale. We can avoid this problem entirely by using a statistical test that does not assume anything about the distributions. These tests are usually called \textcolor{Orange}{distribution-free} or \textcolor{Orange}{nonparametric}. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{\hypertarget{wilcox}{Wilcoxon rank-sum test}} The most common nonparametric test for a difference in mean expression is the \textcolor{Orange}{Wilcoxon rank-sum test}, which is also known as the \textcolor{Orange}{Mann-Whitney test}. We assume that we have sample measurements from two groups: $$X_1, X_2, \dots, X_{n_X}$$ $$Y_1, Y_2, \dots, Y_{n_Y}$$ We then rank these values from smallest to largest, getting something like $$ \begin{array}{ccccccccccc} X_3 &\le &Y_5 &\le &X_1 &\le &X_{10} &\le &Y_1 &\le \cdots \le &X_2 \\ 1 & &2 & &3 & &4 & &5 & &n_X+n_Y\\ \end{array} $$ \foilhead{Computing rank-sums} Next, we compute a statistic $W$ by summing the ranks of the measurements from the first group. In our example, $$W = 1 + 3 + 4 + \cdots + (n_X+n_Y).$$ $W$ is always an integer, and it is easy to compute its minimum and maximum values. The minimum occurs when all the $X$ values are smaller than all the $Y$ values. Thus, all the $X$ values are at the beginning of the list, and we have $$W \ge 1 + 2 + \cdots + n_X = \frac{n_X(n_X+1)}{2}.$$ \foilhead{} The maximum occurs when all the $X$ values come after all the $Y$ values, giving \begin{eqnarray*} W & \le & (n_Y+1) + (n_Y+2) + \cdots + (n_Y+n_X) \\ & = & n_Xn_Y + (1 + 2 + \cdots n_X)\\ & = & n_Xn_Y + \frac{n_X(n_X+1)}{2}\\ & = & \frac{n_X(2n_Y + n_X + 1)}{2}. \end{eqnarray*} Those formulas are very nice, but let's see what happens when we have $10$ samples in each group. The range of values for $W$ in this case is $$(1+2+\cdots+10)=55 \le W \le (11+12+\cdots+20)=155$$ %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{When is a rank-sum significant?} Intuitively, if we get a value of $W$ near its extreme values, then we strongly suspect that the two groups are different. If, however, we get a value near the middle, then we suspect that there is no difference. How can we make this idea more precise? First, let's do some exploration. We start by generating an unstructured random data matrix: <>= n.genes <- 40000 n.samples <- 20 type <- rep(c('A', 'B'), each=10) data <- matrix(rnorm(n.genes*n.samples), ncol=n.samples) @ \foilhead{} Next, we rank the values in each row: <>= ranked.data <- apply(data, 1, rank) dim(data) dim(ranked.data) @ \foilhead{} Notice that the matrix of ranks is transposed when compared to the original data matrix. So, we can compute the Wilcoxon rank-sum statistics by summing the correct ranks by column: <>= wilstat <- apply(ranked.data[type=='A',], 2, sum) summary(wilstat) @ We don't quite get to the extremes... %' \foilhead{Null distribution of Wilcoxon rank-sum} \begin{center} <>= hist(wilstat, breaks=seq(54.5, 155.5, by=1)) @ \end{center} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{What happens with real data?} We will return to the prostate cancer data set used in the last lecture. Recall that this data set contains the log ratios of $42,129$ genes measured using two-color fluorescent microarrays and a common reference channel. Recall also that we selected a subset of $10$ samples from normal prostate and $10$ samples of prostate cancer. <>= if (!exists("clinical.info")) { load(file.path("TailRankData", "data", "expression.data.rda")) load(file.path("TailRankData", "data", "clinical.info.rda")) status <- clinical.info[, "Status"] choice <- c(sample(which(status=="N"), 10), sample(which(status=="T"), 10)) expression.data <- expression.data[, choice] clinical.info <- clinical.info[choice,] clinical.info[, "Status"] <- factor(status[choice]) } @ We compute the Wilcoxon rank-sum statistics for this data set: <>= ranked.data <- apply(expression.data, 1, rank) dim(ranked.data) status <- clinical.info[, "Status"] ranksum <- apply(ranked.data[status=='N',], 2, sum) @ %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{Real data yields extreme statistics} <>= summary(ranksum) @ \begin{center} <>= hist(ranksum, breaks=seq(54.5, 155.5, by=1)) @ \end{center} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{Distributions matter} When we simulated data, we got values of the rank-sum statistic between \Sexpr{min(wilstat)} and \Sexpr{max(wilstat)}. When we looked at real data, we got rank-sum statistics that spanned the full range of possible values from $55$ to $155$. One (pessimistic) interpretation of this result is that rank-sum statistics are only useful in microarray experiments if they find genes where all the values in one group are less than all the values in the other group. We can do better by looking more carefully at the distributions. \foilhead{R and Wilcoxon} R contains functions to explore the distribution of the rank-sum statistics: \begin{description} \item[rwilcox]{generate random values from the Wilcoxon distribution} \item[dwilcox]{probability density function} \item[pwilcox]{cumulative probability function} \item[qwilcox]{quantile function} \end{description} This set of functions parallels those for other distributions (like \texttt{rnorm}, \texttt{dnorm}, \texttt{pnorm}, and \texttt{qnorm} for the normal distribution). One should note, however, that the rank-sum statistics in R are shifted so that the smallest value is $0$ instead of $n_X$. \foilhead{The null distribution} \begin{center} <>= minW <- sum(1:10); maxW <- sum(11:20) breaker <- seq(minW-0.5, maxW+0.5, by=1) hist(wilstat, breaks=breaker, prob=TRUE) lines(minW:maxW, dwilcox(0:(maxW-minW), 10, 10), col='red', lwd=3) @ \end{center} \foilhead{The real distribution} \begin{center} <>= hist(ranksum, breaks=breaker, prob=TRUE, ylim=c(0,0.03)) lines(minW:maxW, dwilcox(0:(maxW-minW), 10, 10), col='red', lwd=3) @ \end{center} \foilhead{Wilcoxon p-values} <>= wilp <- sapply(ranksum-minW, function(w, m, n) { if (w > m*n/2) 2*(1-pwilcox(w, m, n)) else 2*pwilcox(w, m, n) }, 10, 10) @ \foilhead{} \begin{center} <>= hist(wilp, breaks=100) @ \end{center} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{\hypertarget{ebayes}{Empirical Bayes}} The discreteness of the values of the Wilcoxon statistics makes the distribution of p-values problematic for the application of something like BUM to sort out the significance in the face of multiple testing. Instead, we are going to use a differnt approach. Reference: Efron and Tibshirani. Empirical Bayes methods and false discovery rates for microarrays. \textit{Genetic Epidemiology}, 2002; \textbf{23}: 70--86. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{Basic idea} Assume that there are two classes of genes, \textcolor{Lemon}{Different} and \textcolor{Lemon}{Not Different}. We assume prior probabilities \begin{itemize} \item $p_0=$ Prob(Not Different) \item $p_1= 1 - p_0 =$ Prob(Different) \end{itemize} and density functions \begin{itemize} \item $f_0(y)$, known Wilcoxon, if Not Different \item $f_1(y)$, unknown, if Different \end{itemize} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{Mixtures} The overall probability density function is a mixture $$ f(y) = p_0f_0(y) + p_1f_1(y).$$ Bayes Theorem: $ P(H | D) = P(D | H) P(H)/P(D)$ Applying Bayes Theorem gives posterior estimates: $$ p_1(y) \equiv Prob(Diff | Y = y) = 1 - p_0f_0(y)/f(y) $$ and $$ p_0(y) \equiv Prob(Not Diff | Y = y) = p_0f_0(y)/f(y) $$ We can use the observed data to estimate the overall density function by $\hat{f}(y)$ (typically by log-transforming the observed function and fitting a curve.) %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{Empirical Bayes} The ``empirical'' nature of this Bayesian idea is that we can adjust the ``prior'' $p_0$ after looking at the data, and thus obtain some reasonable values for it. First, here is how well we fit the distribution (mentally swap the labels, since they are wrong): \begin{center} \includegraphics[height=4in]{Figs/ebayes1} \end{center} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{Plot of Posterior Probability of Difference} \begin{center} \includegraphics[height=4in]{Figs/ebayes2} \end{center} This graph assumes $p_0 = 1$, so no genes are different. The posterior probability of being different becomes negative in the middle of the graph. This results from the ``empirical'' nature of the estimate without imposing a full model. We can, however, adjust $p_0$ to prevent seeing any negative probabilities. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{Plot of Posterior Probability of Difference} \begin{center} \includegraphics[height=4in]{Figs/ebayes3} \end{center} This shows posterior probabilities with $p_0 = 0.7, 0.8, 0.9, 1.0$. Somewhere between $p0=0.7$ and $p_0=0.8$, all the posterior probabilities become positive. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{Plot of Posterior Probability of Difference} \begin{center} \includegraphics[height=4in]{Figs/ebayes4} \end{center} This plot uses $p_0=0.75$, which is essentially the largest value we can use for $p_0$ and ensure that all the posterior probabilities are positive. The horizontal line indicates a posterior probability of $90\%$ that a gene is differentially expressed. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{How does this work in R?} We have implemented this idea in an R package: \url{http://bioinformatics.mdanderson.org/software.html} \begin{center} \includegraphics[height=5in]{Figs/mdasw} \end{center} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{The OOMPA home page} \begin{center} \includegraphics[height=5.5in]{Figs/oompa} \end{center} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{The ClassComparison package} \begin{center} \includegraphics[height=5in]{Figs/cc} \end{center} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{Empirical Bayes in R} The ClassComparison package implements the empirical Bayes method with Wilcoxon statistics. For the computations shown above, do the following: \begin{verbatim} > require(ClassComparison) > efron <- MultiWilcoxonTest(expression.data, + status) > hist(efron) > plot(efron, prior=c(0.7, 0.8, 0.9, 1.0), + signif=NULL) > abline(h=0) > plot(efron, prior=0.75, ylim=c(0,1)) > abline(h=0) \end{verbatim} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{How many genes are differentially expressed?} As a crude estimate, the fact that we had to take $p_0 = 0.75$ suggests that $25\%$ of the genes may be different; this is consistent with the BUM estimate from last time. With a cutoff of $90\%$ on the posterior probability, we get: \begin{verbatim} > cutoffSignificant(efron, prior=0.75, signif=0.9) $low [1] 68 $high [1] 143 > sum(efron@rank.sum.statistics < 68) [1] 839 > sum(efron@rank.sum.statistics > 143) [1] 825 \end{verbatim} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{\hypertarget{tailrank}{The Tail-Rank Test}} The Wilcoxon rank-sum test and the t-test both look at the same property: is the mean expression the same? When looking for cancer biomarkers, this may well be the \textcolor{Lemon}{wrong} question. Cancers that are histologically the same are not identical. Deletion of part of chromosome 3 (3p14-p23) is found in $50\%$ of non-small-cell lung cancers; MYC amplification is found in $14\%$ of stomach cancers; BRCA1 mutations are found in a subset of breast cancers; a translocation between chromosomes 11 and 14 occurs in $35\%$ of mantle cell lymphomas. These genetic abnormalities directly causes specific differences in gene expression that only occur in a subset of cancers. Statistically, these results suggest that the distributions of gene expression in cancer patients are likely to differ from the healthy distributions in much more than the location of the center. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{Motivation: Subset Biomarkers} If a biomarker is only present in $20\%$ of the cancer samples, then the distributions might look something like this. \begin{center} \includegraphics[height=5in]{Figs/motivate} \end{center} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{Outline of The Tail-Rank Test} \begin{itemize} \item Collect data on $G$ genes from $n_H$ healthy individuals. Write $X_{g,i}$ for measurement of gene $g$ on individual $i$. Assume for fixed $g$ that $X_{g,i} \sim X_g$ are IID. \item Specify a target value $\psi$ for specificity. \item Estimate, for each $g$, a threshold $\tau_g$ such that $Prob(X_g < \tau_g) = \psi$. \item Collect data from $n_C$ cancer patients. Count the number $Y_g$ of cancer patients for which the measured expression level of gene $g$ exceeds $\tau_g$; we call $Y_g$ the \textcolor{Orange}{tail-rank statistic}. \item Call $g$ a biomarker if $Y_g$ exceeds a certain threshold. \end{itemize} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{The null distribution} Null hypothesis: gene $g$ is not a useful biomarker. More precisely: the measurements on cancer patients have the same distribution as the measurements from healthy individuals. Then: all $Y_g$ have identical binomial distributions, $$Y_g \sim Y = Binom(n_C, 1-\psi).$$ The point here is that the probability of being in the tail is the same for healthy and cancer, and is given by $1-\psi$, where $\psi$ was the desired specificity. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{} Even when we perform the same test for $G$ genes, the expected maximum value of $G$ independent instances of $Y_g$ remains small. Let $M_G = max_{g=1\dots G} \left(Y_g\right)$ be the maximum over $G$ IID binomial random variables. Also, let \begin{eqnarray*} \alpha & = & \alpha(m) = Prob(Y > m)\\ \gamma & = & Prob(M_G > m) \end{eqnarray*} Then \begin{eqnarray*} 1-\gamma &=& Prob(M_G \le m) \\ &=& Prob(Y_1\le m, \dots, Y_G\le m) \\ &=& Prob(Y\le m)^G = (1-\alpha)^G. \end{eqnarray*} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{The maximum value expected by chance} Solving, $$\alpha = 1 - (1-\gamma)^{1/G}.$$ and $m$ is the $(1-\alpha)^{\rm th}$ quantile of a single binomial distribution: \bigskip \hbox to \hsize{\hfill\vbox{\offinterlineskip \hrule \halign{\vrule#&\strut\quad\hfil#\quad &\vrule#\quad&\hfil#\quad&\hfil#\quad&\hfil#\quad &\hfil#\quad &\vrule#\cr &&&\multispan4\hfil$\gamma=0.01$, $\psi=0.99$\hfil&\cr \noalign{\hrule} &$n_C$& &$G=100$&$1000$&$10000$&$100000$&\cr \noalign{\hrule} & 10 && 3& 3& 4& 4&\cr & 20 && 3& 4& 5& 5&\cr & 50 && 5& 6& 6& 7&\cr &100 && 6& 7& 8& 9&\cr &250 && 10& 12& 13& 14&\cr &500 && 15& 17& 19& 20&\cr }\hrule}\hfill} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{Interpretation} One needs to specify two parameters in order to apply the tail-rank test. \begin{enumerate} \item $\psi$, the desired specificity of the biomarker \item $\gamma$, the desired bound on the FWER \end{enumerate} Then, given the number of genes and the number of cancer samples, the values $m$ in a table like the previous one represent the largest value of $Y_g$ that we would expect to see by chance over the entire microarray. Any gene where we observe $Y_g > m$ is a potential biomarker. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{Tail-rank and real data} We return yet again to our prostate cancer data set. We will now start using the entire data set, which contains $14$ samples from normal prostate, $61$ prostate cancer samples, and $9$ samples from lymph node metastases of prostate cancer. With this number of samples, taking $\gamma=0.95$ and $\psi=0.95$, a gene was called a biomarker if at least $16$ of the $71$ cancer samples were above the threshold. We assumed that the log ratios of the \textcolor{Lemon}{normal prostate} samples were normally distributed. We computed $90\%$ tolerance bounds for the $5^{\rm th}$ and $95^{\rm th}$ percentiles, and counted the number of combined prostate cancer samples whose log ratios fell outside these boundaries. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{Tail-rank results} We identified $1,766$ spots that were ``positive'' biomarkers, since they were present at higher than normal levels in at least $16$ cancer samples. We also identified $1,930$ spots that were ``negative'' biomarkers, since they were expressed at lower than normal levels in at least $16$ samples. In total, we identified \textcolor{Orange}{3,692} spots as candidate biomarkers. Although the theory told us the number of false positives should be close to zero, we decided to test this using both simulations and a permutaion test. We simulated completely random (IID normal) data 100 times, and we permuted the samples labels on the real data 100 times. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{Using the tail-rank test in R} We developed the tail-rank test. A preprint, along with an R package is available on the web at \url{http://bioinformatics.mdanderson.org/TailRank} \begin{verbatim} > require(TailRank) > tr.test <- TailRankTest(expression.data, status, + direction="two") > summary(tr.test) A tail-rank test object in the two-sided direction. Specificity: 0.95 computed with tolerance 0.9. Significance cutoff: 15 based on a family-wise error rate less than 0.05. There are 3692 tail-rank statistics that exceed the cutoff. \end{verbatim} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{Reviewing significance} \begin{center} \includegraphics[height=5in]{Figs/simHist} \end{center} Pretty good, when you consider that the test with the real data detected a few thousand potential markers! %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{Differential expression results} We repeated the t-test analysis on the full data set (adjusing for multiple testing using BUM). With $FDR <0.05$, we used a cutoff at $p < 0.000045$ or $|t| > 4.25$. We detected \textcolor{Orange}{3,522} differentially expressed spots. Of these, $1,415$ spots were overexpressed in prostate cancer and $2,107$ spots were underexpressed. We also repeated the Wilcoxon test with the empirical Bayes approach. In order to get comparable results, we selected a cutoff corresponding to a posterior probability of $99.9\%$. We detected \textcolor{Orange}{3,627} differentially expressed spots. Of these, $1,498$ spots were overexpressed and $2,129$ spots were underexpressed in prostate cancer. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{Comparing tests} The number of genes found by the three tests was very similar. Are they finding the same things? There was good agreement between the t-test and the Wilcoxon test. More than $90\%$ ($1,905$) of underexpressed and $88\%$ ($1,244$) of overexpressed spots that were found by the t-test were also detected by the Wilcoxon test. So, we only need to compare one of these to the tail-rank test. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{Comparing tests} \begin{center} \includegraphics[height=5in]{Figs/scissors} \end{center} Lower left and right = different by T, not by tail-rank Upper center = different by tail-rank, not by T. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \foilhead{\hypertarget{stripchart}{Looking at the results}} Since the tests give different answers, which one should we believe? Both, since they are giving the answers to different questions! Whether you perform one test or many, however, it is useful to look at the expression values for some of the genes that you find, if only to make sure you believe the results. A useful R function for this purpose is \texttt{\textcolor{Orange}{stripchart}}. Here is an example for our data set. First, we get the clinical status as an \texttt{ordered} factor. \begin{verbatim} > x <- ordered(clinical.info$Status, + levels=c('N', 'T', 'L')) \end{verbatim} %$ \foilhead{Selecting interesting genes} Now we look at genes with small tail-rank statistics ($< 2$) and significant t-statistics. \begin{verbatim} > tr.stats <- getStatistic(tr.test) > k.weird <- tr.stats < 2 & (abs(t.stats) > 4.25) > sum(k.weird) [1] 38 \end{verbatim} We can select one of the ``weird'' genes and get its expression data. \begin{verbatim} > i.k.weird <- (1:length(k.weird))[k.weird] > i <- sample(i.k.weird, 1) > y <- as.vector(t(expression.data[i,])) \end{verbatim} \foilhead{Outliers can throw off the estimates} \begin{verbatim} > label <- as.character(gene.info$Gene.Symbol[i]) > stripchart(y ~ x, xlab='', main=label, + method='jitter') \end{verbatim} %$ \begin{center} \includegraphics[height=4in]{Figs/lap1b} \end{center} \foilhead{Some genes are normally variable} \begin{center} \includegraphics[height=5in]{Figs/supt3h} \end{center} \foilhead{Selecting interesting biomarkers} Now we look at genes with significant tail-rank statistics and small t-statistics ($|t| < 1.25 $). \begin{verbatim} > k.weird <- tr.stats > 15 & (abs(t.stats) < 1.25) > sum(k.weird) [1] 52 \end{verbatim} We use the same idea to select some of these genes and plot stripcharts to see if the values agree with what we tihnk the test should be doing. \foilhead{GDF11} \begin{center} \includegraphics[height=5in]{Figs/gdf11} \end{center} \foilhead{HACE1} \begin{center} \includegraphics[height=5in]{Figs/hace1} \end{center} \foilhead{CANX} \begin{center} \includegraphics[height=5in]{Figs/canx} \end{center} \foilhead{GITA} \begin{center} \includegraphics[height=5in]{Figs/gita} \end{center} \foilhead{When T and Tail-Rank agree} \begin{center} \includegraphics[height=5in]{Figs/cav1} \end{center} \foilhead{More Examples with ClassComparison} \begin{verbatim} > require(ClassComparison} > # perform t-tests > tea <- MultiTtest(exprs(prostate), status) > # beta-uniform mixture model > bum <- Bum(tea@p.values) > hist(bum) > # Dudoit's method > dudoit <- Dudoit(exprs(prostate), status) > plot(dudoit) > # significane analysis of microarrays > sam <- Sam(exprs(prostate), status) > plot(sam) \end{verbatim} \end{document}