Residual Disease Paper
========================================================
Exploring Default PCR Problems
-------------------------------------------------------


by Shelley Herbrich

*May 13, 2013*






## 1 Executive Summary

### 1.1 Introduction

Currently, PCR vendor software internally chooses a default threshold for each target gene on a plate (in this case, 18S, FABP4, and ADH1B). On revisiting the data obtained from Washington, we noticed that for a few of the plates the threshold chosen for FABP4 is quite low. The problem is that in going from one plate to another, these  cutoffs are largely stable for 18S and ADH1B, but not for FABP4. 

We suspect that the cutoffs for FABP4 may be being dragged out of place by some outlier wells, as might be the case  with one or two problematic samples. 

### 1.2 Methods

For this analysis, we focus primarily on the first plate run on Plate 2. Some of the samples were flagged as problematic (and were later rerun), but the threshold for FABP4 was applied to all samples, including those that were retained. 

We suspect a better threshold for FABP4 would be near the same level as those for ADH1B and 18S. We interpolate the revised Ct values based on this proposed threshold.  

### 1.3 Results

It appears that the software chooses the gene-specific threshold based on the first encountered curve for that particular target. This results in significantly underestimated thresholds when there are poor quality wells on a given plate. The implications of this change aren't subtle. The Ct values we would infer for FABP4 with both thresholds are different, and the mean difference is about 5.5 units (or about 2^5.5 = 46-fold). 

Therefore, we need to develop an alternative quantification method for our PCR data that is independent of the software-defined threshold values. 

## 3 Assessing the Impact of Incorrect Thresholds


```r
library(RColorBrewer)
```


```r
load(file.path("RDataObjects", "rawPCRWash_Unfiltered.RData"))
load(file.path("RDataObjects", "PCRResults.RData"))
```



```r
# extract annotation info
rawPCRPlate2 <- rawPCRWash[which(rawPCRWash$Plate == "Plate.2"), ]
wells <- rawPCRPlate2$Well
genes <- rawPCRPlate2$Target.Name
thresholds <- unique(rawPCRPlate2$Ct.Threshold)
names(thresholds) <- unique(genes)
```



```r
colors <- array(NA, length(genes))
colors[which(genes == "ADH1B")] <- rep(brewer.pal(9, "OrRd")[3:8], 4)[1:23]
colors[which(genes == "FABP4")] <- rep(brewer.pal(9, "Purples")[3:8], 4)[1:23]
colors[which(genes == "18S")] <- rep(brewer.pal(9, "Greens")[3:8], 4)[1:23]
```



```r
par(mar = c(5.1, 5.1, 4.1, 3.1))
plot(NA, xlim = c(1, 40), ylim = c(-5, 1.1), xlab = "Cycle Number", ylab = "", 
    yaxt = "n", main = "Plate 2")
axis(2, at = seq(-5, 2, by = 2), labels = 10^seq(-5, 2, by = 2), las = 2, cex.axis = 0.7)
mtext("Delta Rn", side = 2, at = -2, line = 4, las = 0, cex = 1)

for (i1 in 1:length(wells)) {
    wn <- wells[i1]
    tempRn <- log10(rawPCRPlate2[which(rawPCRPlate2$Well == wn), grep("Cycle", 
        colnames(rawPCRPlate2))])
    missing <- is.na(tempRn)
    points(c(1:40), tempRn, col = colors[i1], type = "l")
}

# add thresholds
abline(h = log10(thresholds))
mtext(names(thresholds), side = 4, at = log10(thresholds) + c(0.1, 0, -0.1), 
    line = 0.5, las = 2, cex = 0.75)

# add proposed threshold
otherFABP4 <- c(1.674, 1.674, 1.631, 1.589)
estimatedThreshold <- mean(otherFABP4)
abline(h = log10(estimatedThreshold), col = "red", lty = 2)
mtext("FABP4", side = 2, at = log10(estimatedThreshold), line = 0.5, las = 2, 
    cex = 0.75, col = "red")
```

![plot of chunk deltaRnPlot](figure/deltaRnPlot.png) 




```r
goi <- "FABP4"
fabp4Wells <- wells[which(genes == goi)]
defaultCt <- as.numeric(rawPCRPlate2$Ct[which(genes == goi)])
mean(defaultCt, na.rm = TRUE)
```

```
## [1] 23.26
```

```r

estimatedCt <- array(NA, length(fabp4Wells))
for (i1 in 1:length(fabp4Wells)) {
    wellID <- fabp4Wells[i1]
    subRn <- as.numeric(rawPCRPlate2[which(rawPCRPlate2$Well == wellID), grep("Cycle", 
        colnames(rawPCRPlate2))])
    ub <- which(subRn > estimatedThreshold)[1]
    m <- (subRn[ub] - subRn[ub - 1])
    b <- subRn[ub - 1]
    estimatedCt[i1] <- (ub - 1) + (estimatedThreshold - b)/m
}
mean(estimatedCt, na.rm = TRUE)
```

```
## [1] 28.79
```



```r
plot(defaultCt, ylim = c(min(c(defaultCt, estimatedCt), na.rm = TRUE), 40), 
    ylab = "Number of Cycles", main = "FABP4 PCR Results")
abline(h = 40, lty = 2)
points(c(1:23)[!is.na(defaultCt)], defaultCt[!is.na(defaultCt)], type = "l")

points(estimatedCt, pch = 8, col = "red")
imputeCt <- estimatedCt
imputeCt[is.na(imputeCt)] <- 40
points(imputeCt, type = "l", col = "red")

legend(15, 17, c("default Ct", "estimated Ct"), cex = 0.8, col = c("black", 
    "red"), pch = c(1, 8), bty = "n")
```

![plot of chunk plotProposedAndCurrent](figure/plotProposedAndCurrent.png) 



```r
# flagging bad rows
goodRow <- rep(TRUE, nrow(rawPCRWash))
## Plate Plate.1 '- 2nd set 021613' all fine Plate Plate.2 '021213', Rerun
## samples
goodRow[(rawPCRWash$Sample.Name == "W37") & (rawPCRWash$Plate == "Plate.2")] <- FALSE
goodRow[(rawPCRWash$Sample.Name == "W32") & (rawPCRWash$Plate == "Plate.2")] <- FALSE
goodRow[(rawPCRWash$Sample.Name == "W20") & (rawPCRWash$Plate == "Plate.2")] <- FALSE
## Plate Plate.3 '021613'
goodRow[(rawPCRWash$Well == "A6") & (rawPCRWash$Plate == "Plate.3")] <- FALSE
goodRow[(rawPCRWash$Well == "B6") & (rawPCRWash$Plate == "Plate.3")] <- FALSE
goodRow[(rawPCRWash$Well == "F3") & (rawPCRWash$Plate == "Plate.3")] <- FALSE
goodRow[(rawPCRWash$Well == "G2") & (rawPCRWash$Plate == "Plate.3")] <- FALSE
## Plate Plate.4 '021813'; all 80-4 (rerun)
goodRow[(rawPCRWash$Well == "C2") & (rawPCRWash$Plate == "Plate.4")] <- FALSE
goodRow[(rawPCRWash$Well == "C3") & (rawPCRWash$Plate == "Plate.4")] <- FALSE
goodRow[(rawPCRWash$Well == "C5") & (rawPCRWash$Plate == "Plate.4")] <- FALSE
goodRow[(rawPCRWash$Well == "C6") & (rawPCRWash$Plate == "Plate.4")] <- FALSE
## Plate Plate.5 '3rd batch 201313'
goodRow[(rawPCRWash$Well == "F6") & (rawPCRWash$Plate == "Plate.5")] <- FALSE
## Plate W6 '4th set 021413' all fine Plate W7 'second machine' row B
## (sample 81-2) may be weak overall
rawPCRWash <- rawPCRWash[goodRow, ]
```






```r
sampleCt <- function(x) {
    geneMeans <- sapply(split(x, factor(x$Target.Name)), function(y) mean(as.numeric(y$Ct), 
        na.rm = TRUE))
    geneThresh <- unique(x$Ct.Threshold)
    names(geneThresh) <- unique(x$Target.Name)
    rv <- geneMeans["18S"] - geneMeans["FABP4"]
    if (geneThresh["FABP4"] < 1) {
        tmp <- x[which(x$Target.Name == "FABP4"), ]
        estimatedCt <- array(NA, nrow(tmp))
        for (i1 in 1:length(estimatedCt)) {
            subRn <- as.numeric(tmp[i1, grep("Cycle", colnames(tmp))])
            ub <- which(subRn > estimatedThreshold)[1]
            m <- (subRn[ub] - subRn[ub - 1])
            b <- subRn[ub - 1]
            estimatedCt[i1] <- (ub - 1) + (estimatedThreshold - b)/m
        }
        rv <- c(rv, geneMeans["18S"] - mean(estimatedCt, na.rm = TRUE))
    } else {
        rv <- c(rv, NA)
    }
    rv
}
```



```r
ct <- NULL
for (i1 in 1:length(unique(rawPCRWash$Plate))) {
    tmp <- rawPCRWash[which(rawPCRWash$Plate == unique(rawPCRWash$Plate)[i1]), 
        ]
    tmpCt <- t(sapply(split(tmp, factor(tmp$Sample.Name)), sampleCt))
    ct <- rbind(ct, tmpCt)
}
```


```r
plot(ct[PCRResults$Sample.Name[which(PCRResults$Source == "Washington")], 1], 
    PCRResults$FABP4[which(PCRResults$Source == "Washington")], main = "FABP4 PCR Results", 
    xlab = "Delta Ct", ylab = "Alternative Quantification", pch = 21, bg = "grey", 
    cex = 0.75)
points(ct[PCRResults$Sample.Name[which(PCRResults$Source == "Washington")], 
    2], PCRResults$FABP4[which(PCRResults$Source == "Washington")], col = "red", 
    pch = 8)
arrows(ct[PCRResults$Sample.Name[which(PCRResults$Source == "Washington")], 
    1], PCRResults$FABP4[which(PCRResults$Source == "Washington")], x1 = ct[PCRResults$Sample.Name[which(PCRResults$Source == 
    "Washington")], 2], col = "pink", length = 0.15)
legend("topleft", c("default Ct", "estimated Ct"), cex = 0.8, col = c("grey", 
    "red"), pch = c(1, 8), bty = "n")
```

![plot of chunk arrowPlot](figure/arrowPlot.png) 


## Appendix


```r
getwd()
```

```
## [1] "\\\\mdadqsfs02/workspace/kabagg/RDPaper/Webpage/ResidualDisease"
```



```r
sessionInfo()
```

```
## R version 2.15.3 (2013-03-01)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## 
## locale:
## [1] LC_COLLATE=English_United States.1252 
## [2] LC_CTYPE=English_United States.1252   
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.1252    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] RColorBrewer_1.0-5 knitr_1.2         
## 
## loaded via a namespace (and not attached):
## [1] digest_0.6.3   evaluate_0.4.3 formatR_0.7    stringr_0.6.2 
## [5] tools_2.15.3
```