Residual Disease Paper
========================================================
Comparing RD Results
-------------------------------------------------------

by Shelley Herbrich

```{r options, echo=FALSE}

# Namespace-qualified so the chunk also works when knitr is loaded but not attached.
knitr::opts_chunk$set(tidy=TRUE, message=FALSE, warning=FALSE, fig.path='figure/RDValidation-', cache.path='cache/RDValidation-')

```


## 1 Executive Summary

### 1.1 Introduction

Using the true residual disease (RD) status for the validation cohort, we check our predictions based on FABP4 and ADH1B expression.

### 1.2 Data and Methods

We work with the results dataset, *PCRResults*.

For each target gene, we define the subset of patients with an enriched proportion of residual disease as those in the top 25% of expression (the top 35 of 139 samples).
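
As a quick check of the arithmetic, 25% of 139 samples is 34.75, which rounds up to 35. The sketch below shows one way to flag the top quarter. It uses simulated values, not the real *PCRResults* columns; the vector `expr` and its sample names are hypothetical.

```r
# Simulated stand-in for one log2 expression column; not the study data.
set.seed(1)
n <- 139
expr <- setNames(rnorm(n, mean = -22, sd = 2), paste0("S", seq_len(n)))

# 25% of 139 is 34.75, so the top quarter is taken as 35 samples.
nTop <- ceiling(0.25 * n)

# Names of the nTop samples with the highest expression.
topSamples <- names(sort(expr, decreasing = TRUE))[seq_len(nTop)]

nTop
length(topSamples)
```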

### 1.3 Results

We plot the sorted log2 FABP4 and ADH1B values based on our quantification method. We also plot ADH1B against FABP4.  


## 2 Loading Libraries and Quantification Data

We load the PCR results, containing our quantification summaries and true RD status.

```{r libraries}
library(qpcR)
library(gdata)
```

```{r loadResults}
load(file.path("RDataObjects","PCRResults.RData"))
load(file.path("RDataObjects","rawPCRData.RData"))
```

```{r defineRD}
sampleID <- PCRResults$Sample.Name
rd <- PCRResults$RDStatus
names(rd) <- sampleID
```


## 3 Flagging RD Using FABP4
 
First, we graphically examine our cutoff at the top 25% of FABP4 levels.

```{r plotSortedFABP4}
plot(rev(PCRResults$FABP4), ylab="Initial Amount (log2)",xlab="",pch=21,bg=c("grey","red")[rev(factor(rd))],main="Sorted FABP4 Concentrations")
abline(h=-20.05,lty=2)
mtext("25%",side=4, at=-16.5, las=2,line=0.5, cex=0.8)
legend("topleft",c("Yes","No"),pch=19,col=c("red","grey"),bty="n",title="RD Status")
```

We do see a subgroup with an enriched proportion of residual disease that is associated with high FABP4. In our cohort where the overall percentage of patients with residual disease is 60%, we are able to identify a subgroup with 86% residual disease. 

```{r originalResults}
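# Assumption: PCRResults is sorted by decreasing FABP4, so entries 1:35 are the top 25%.
prop.table(table(rd[1:35]))
prop.table(table(rd[36:139]))

# Columns: top 35 samples, remaining 104; rows: RD, no RD.
fisher.test(matrix(c(30,5,54,50),ncol=2),alternative="greater")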
```

Based on a one-sided Fisher's exact test, the proportion of residual disease is significantly higher among patients with elevated FABP4.
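
The counts passed to `fisher.test` can also be laid out as a labelled 2x2 table, which makes the meaning of each row and column explicit. The sketch below uses only the counts already quoted in this section (30 and 5 in the flagged group, 54 and 50 in the rest). The dimension labels and object names are our own additions.

```r
# Counts from the FABP4 analysis: columns are the flagged group (top 35)
# and the remaining 104 samples; rows are RD and no RD.
counts <- matrix(c(30, 5, 54, 50), ncol = 2,
                 dimnames = list(Status = c("RD", "No RD"),
                                 Group = c("Top 35", "Remaining 104")))

# Proportion with RD within each group (column-wise proportions).
propRD <- prop.table(counts, margin = 2)["RD", ]
round(propRD, 2)

# Overall proportion with RD in the cohort.
overallRD <- sum(counts["RD", ]) / sum(counts)
round(overallRD, 2)

# One-sided test: are the odds of RD greater in the flagged group?
ft <- fisher.test(counts, alternative = "greater")
ft$p.value
```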


## 4 Flagging RD Using ADH1B

Now, we look at the top 25% of samples based on ADH1B.

```{r plotSortedADH1B}
orderADH1B <- order(PCRResults$ADH1B)

plot(PCRResults$ADH1B[orderADH1B], ylab="Initial Amount (log2)",xlab="",pch=21,bg=c("grey","red")[factor(rd[orderADH1B])],main="Sorted ADH1B Concentrations")
abline(h=-19.15,lty=2)
mtext("25%",side=4, at=-16.5, las=2,line=0.5, cex=0.8)
legend("topleft",c("Yes","No"),pch=19,col=c("red","grey"),bty="n",title="RD Status")
```

Using ADH1B alone, we are also able to define a subgroup with an enriched proportion (86%) of residual disease. 

```{r originalResultsADH1B}
# Sample indices ordered by decreasing ADH1B.
decreasingADH1B <- rev(orderADH1B)
prop.table(table(rd[decreasingADH1B[1:35]]))
prop.table(table(rd[decreasingADH1B[36:139]]))

# Columns: top 35 samples, remaining 104; rows: RD, no RD.
fisher.test(matrix(c(30,5,54,50),ncol=2),alternative="greater")
```

Again, the proportion of residual disease is significantly higher among patients with elevated ADH1B.

```{r overlap}
byBoth <- intersect(PCRResults$Sample.Name[1:35], PCRResults$Sample.Name[rev(orderADH1B)[1:35]])
rd[byBoth]
```

Of the 35 samples flagged by each marker, 23 were flagged by both (22 RD, 1 no RD).
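
To make the set arithmetic concrete: two groups of 35 samples that share 23 members cover 35 + 35 - 23 = 47 distinct samples. The sketch below reproduces those sizes with hypothetical sample IDs, not the study's.

```r
# Hypothetical sample IDs, constructed so that exactly 23 are shared.
flaggedFABP4 <- paste0("S", 1:35)
flaggedADH1B <- paste0("S", 13:47)

# Samples flagged by both markers, and by at least one marker.
flaggedByBoth <- intersect(flaggedFABP4, flaggedADH1B)
flaggedByEither <- union(flaggedFABP4, flaggedADH1B)

length(flaggedByBoth)
length(flaggedByEither)
```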

```{r noRDSample}
rawPCRData[which(rawPCRData$Sample.Name=="W20"),1:5]
```

Here, we note that for the single sample without RD, two wells each for ADH1B and FABP4 were removed due to poor PCR quality, leaving only a single replicate to quantify each target gene.

## 5 Flagging RD Using Both ADH1B and FABP4

```{r both}
plot(PCRResults$FABP4, PCRResults$ADH1B, ylab="ADH1B",xlab="FABP4",pch=21,bg=c("grey","red")[factor(rd)],main="")
abline(a=-39.5,b=-1)
sum(-39.5-PCRResults$FABP4<PCRResults$ADH1B, na.rm=TRUE)
byBothSim <- PCRResults$Sample.Name[which(-39.5-PCRResults$FABP4<PCRResults$ADH1B)]
table(rd[byBothSim])
```

By using both markers simultaneously, we increase the proportion of residual disease in the enriched subgroup to 89%.
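
The line drawn in the plot has intercept -39.5 and slope -1, so a sample lies above it exactly when FABP4 + ADH1B > -39.5. The sketch below checks this equivalence on a few made-up log2 values; the data frame `toy` is hypothetical and is not taken from *PCRResults*.

```r
# Made-up log2 values chosen to fall above, on, and below the line.
toy <- data.frame(FABP4 = c(-18, -22, -19.5, -25),
                  ADH1B = c(-19, -20, -20, -17))

# Rule as written in the analysis: the point lies above y = -39.5 - x.
aboveLine <- -39.5 - toy$FABP4 < toy$ADH1B

# Equivalent form: the sum of the two log2 values exceeds -39.5.
sumRule <- toy$FABP4 + toy$ADH1B > -39.5

identical(aboveLine, sumRule)
```

Because the comparison is strict, a sample lying exactly on the line (the third row here) is not flagged.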

## Appendix

```{r getLocation}
getwd()
```

```{r sessionInfo}
sessionInfo()
```