Residual Disease Paper ======================================================== Exploring Default PCR Problems ------------------------------------------------------- by Shelley Herbrich *May 13, 2013* ## 1 Executive Summary ### 1.1 Introduction Currently, PCR vendor software internally chooses a default threshold for each target gene on a plate (in this case, 18S, FABP4, and ADH1B). On revisiting the data obtained from Washington, we noticed that for a few of the plates the threshold chosen for FABP4 is quite low. The problem is that in going from one plate to another, these cutoffs are largely stable for 18S and ADH1B, but not for FABP4. We suspect that the cutoffs for FABP4 may be being dragged out of place by some outlier wells, as might be the case with one or two problematic samples. ### 1.2 Methods For this analysis, we focus primarily on the first plate run on Plate 2. Some of the samples were flagged as problematic (and were later rerun), but the threshold for FABP4 was applied to all samples, including those that were retained. We suspect a better threshold for FABP4 would be near the same level as those for ADH1B and 18S. We interpolate the revised Ct values based on this proposed threshold. ### 1.3 Results It appears that the software chooses the gene-specific threshold based on the first encountered curve for that particular target. This results in significantly underestimated thresholds when there are poor quality wells on a given plate. The implications of this change aren't subtle. The Ct values we would infer for FABP4 with both thresholds are different, and the mean difference is about 5.5 units (or about 2^5.5 = 46-fold). Therefore, we need to develop an alternative quantification method for our PCR data that is independent of the software-defined threshold values. ## 3 Assessing the Impact of Incorrect Thresholds ```r library(RColorBrewer) ``` ```r load(file.path("RDataObjects", "rawPCRWash_Unfiltered.RData")) load(file.path("RDataObjects", "PCRResults.RData")) ``` ```r # extract annotation info rawPCRPlate2 <- rawPCRWash[which(rawPCRWash$Plate == "Plate.2"), ] wells <- rawPCRPlate2$Well genes <- rawPCRPlate2$Target.Name thresholds <- unique(rawPCRPlate2$Ct.Threshold) names(thresholds) <- unique(genes) ``` ```r colors <- array(NA, length(genes)) colors[which(genes == "ADH1B")] <- rep(brewer.pal(9, "OrRd")[3:8], 4)[1:23] colors[which(genes == "FABP4")] <- rep(brewer.pal(9, "Purples")[3:8], 4)[1:23] colors[which(genes == "18S")] <- rep(brewer.pal(9, "Greens")[3:8], 4)[1:23] ``` ```r par(mar = c(5.1, 5.1, 4.1, 3.1)) plot(NA, xlim = c(1, 40), ylim = c(-5, 1.1), xlab = "Cycle Number", ylab = "", yaxt = "n", main = "Plate 2") axis(2, at = seq(-5, 2, by = 2), labels = 10^seq(-5, 2, by = 2), las = 2, cex.axis = 0.7) mtext("Delta Rn", side = 2, at = -2, line = 4, las = 0, cex = 1) for (i1 in 1:length(wells)) { wn <- wells[i1] tempRn <- log10(rawPCRPlate2[which(rawPCRPlate2$Well == wn), grep("Cycle", colnames(rawPCRPlate2))]) missing <- is.na(tempRn) points(c(1:40), tempRn, col = colors[i1], type = "l") } # add thresholds abline(h = log10(thresholds)) mtext(names(thresholds), side = 4, at = log10(thresholds) + c(0.1, 0, -0.1), line = 0.5, las = 2, cex = 0.75) # add proposed threshold otherFABP4 <- c(1.674, 1.674, 1.631, 1.589) estimatedThreshold <- mean(otherFABP4) abline(h = log10(estimatedThreshold), col = "red", lty = 2) mtext("FABP4", side = 2, at = log10(estimatedThreshold), line = 0.5, las = 2, cex = 0.75, col = "red") ``` ![plot of chunk deltaRnPlot](figure/deltaRnPlot.png) ```r goi <- "FABP4" fabp4Wells <- wells[which(genes == goi)] defaultCt <- as.numeric(rawPCRPlate2$Ct[which(genes == goi)]) mean(defaultCt, na.rm = TRUE) ``` ``` ## [1] 23.26 ``` ```r estimatedCt <- array(NA, length(fabp4Wells)) for (i1 in 1:length(fabp4Wells)) { wellID <- fabp4Wells[i1] subRn <- as.numeric(rawPCRPlate2[which(rawPCRPlate2$Well == wellID), grep("Cycle", colnames(rawPCRPlate2))]) ub <- which(subRn > estimatedThreshold)[1] m <- (subRn[ub] - subRn[ub - 1]) b <- subRn[ub - 1] estimatedCt[i1] <- (ub - 1) + (estimatedThreshold - b)/m } mean(estimatedCt, na.rm = TRUE) ``` ``` ## [1] 28.79 ``` ```r plot(defaultCt, ylim = c(min(c(defaultCt, estimatedCt), na.rm = TRUE), 40), ylab = "Number of Cycles", main = "FABP4 PCR Results") abline(h = 40, lty = 2) points(c(1:23)[!is.na(defaultCt)], defaultCt[!is.na(defaultCt)], type = "l") points(estimatedCt, pch = 8, col = "red") imputeCt <- estimatedCt imputeCt[is.na(imputeCt)] <- 40 points(imputeCt, type = "l", col = "red") legend(15, 17, c("default Ct", "estimated Ct"), cex = 0.8, col = c("black", "red"), pch = c(1, 8), bty = "n") ``` ![plot of chunk plotProposedAndCurrent](figure/plotProposedAndCurrent.png) ```r # flagging bad rows goodRow <- rep(TRUE, nrow(rawPCRWash)) ## Plate Plate.1 '- 2nd set 021613' all fine Plate Plate.2 '021213', Rerun ## samples goodRow[(rawPCRWash$Sample.Name == "W37") & (rawPCRWash$Plate == "Plate.2")] <- FALSE goodRow[(rawPCRWash$Sample.Name == "W32") & (rawPCRWash$Plate == "Plate.2")] <- FALSE goodRow[(rawPCRWash$Sample.Name == "W20") & (rawPCRWash$Plate == "Plate.2")] <- FALSE ## Plate Plate.3 '021613' goodRow[(rawPCRWash$Well == "A6") & (rawPCRWash$Plate == "Plate.3")] <- FALSE goodRow[(rawPCRWash$Well == "B6") & (rawPCRWash$Plate == "Plate.3")] <- FALSE goodRow[(rawPCRWash$Well == "F3") & (rawPCRWash$Plate == "Plate.3")] <- FALSE goodRow[(rawPCRWash$Well == "G2") & (rawPCRWash$Plate == "Plate.3")] <- FALSE ## Plate Plate.4 '021813'; all 80-4 (rerun) goodRow[(rawPCRWash$Well == "C2") & (rawPCRWash$Plate == "Plate.4")] <- FALSE goodRow[(rawPCRWash$Well == "C3") & (rawPCRWash$Plate == "Plate.4")] <- FALSE goodRow[(rawPCRWash$Well == "C5") & (rawPCRWash$Plate == "Plate.4")] <- FALSE goodRow[(rawPCRWash$Well == "C6") & (rawPCRWash$Plate == "Plate.4")] <- FALSE ## Plate Plate.5 '3rd batch 201313' goodRow[(rawPCRWash$Well == "F6") & (rawPCRWash$Plate == "Plate.5")] <- FALSE ## Plate W6 '4th set 021413' all fine Plate W7 'second machine' row B ## (sample 81-2) may be weak overall rawPCRWash <- rawPCRWash[goodRow, ] ``` ```r sampleCt <- function(x) { geneMeans <- sapply(split(x, factor(x$Target.Name)), function(y) mean(as.numeric(y$Ct), na.rm = TRUE)) geneThresh <- unique(x$Ct.Threshold) names(geneThresh) <- unique(x$Target.Name) rv <- geneMeans["18S"] - geneMeans["FABP4"] if (geneThresh["FABP4"] < 1) { tmp <- x[which(x$Target.Name == "FABP4"), ] estimatedCt <- array(NA, nrow(tmp)) for (i1 in 1:length(estimatedCt)) { subRn <- as.numeric(tmp[i1, grep("Cycle", colnames(tmp))]) ub <- which(subRn > estimatedThreshold)[1] m <- (subRn[ub] - subRn[ub - 1]) b <- subRn[ub - 1] estimatedCt[i1] <- (ub - 1) + (estimatedThreshold - b)/m } rv <- c(rv, geneMeans["18S"] - mean(estimatedCt, na.rm = TRUE)) } else { rv <- c(rv, NA) } rv } ``` ```r ct <- NULL for (i1 in 1:length(unique(rawPCRWash$Plate))) { tmp <- rawPCRWash[which(rawPCRWash$Plate == unique(rawPCRWash$Plate)[i1]), ] tmpCt <- t(sapply(split(tmp, factor(tmp$Sample.Name)), sampleCt)) ct <- rbind(ct, tmpCt) } ``` ```r plot(ct[PCRResults$Sample.Name[which(PCRResults$Source == "Washington")], 1], PCRResults$FABP4[which(PCRResults$Source == "Washington")], main = "FABP4 PCR Results", xlab = "Delta Ct", ylab = "Alternative Quantification", pch = 21, bg = "grey", cex = 0.75) points(ct[PCRResults$Sample.Name[which(PCRResults$Source == "Washington")], 2], PCRResults$FABP4[which(PCRResults$Source == "Washington")], col = "red", pch = 8) arrows(ct[PCRResults$Sample.Name[which(PCRResults$Source == "Washington")], 1], PCRResults$FABP4[which(PCRResults$Source == "Washington")], x1 = ct[PCRResults$Sample.Name[which(PCRResults$Source == "Washington")], 2], col = "pink", length = 0.15) legend("topleft", c("default Ct", "estimated Ct"), cex = 0.8, col = c("grey", "red"), pch = c(1, 8), bty = "n") ``` ![plot of chunk arrowPlot](figure/arrowPlot.png) ## Appendix ```r getwd() ``` ``` ## [1] "\\\\mdadqsfs02/workspace/kabagg/RDPaper/Webpage/ResidualDisease" ``` ```r sessionInfo() ``` ``` ## R version 2.15.3 (2013-03-01) ## Platform: x86_64-w64-mingw32/x64 (64-bit) ## ## locale: ## [1] LC_COLLATE=English_United States.1252 ## [2] LC_CTYPE=English_United States.1252 ## [3] LC_MONETARY=English_United States.1252 ## [4] LC_NUMERIC=C ## [5] LC_TIME=English_United States.1252 ## ## attached base packages: ## [1] stats graphics grDevices utils datasets methods base ## ## other attached packages: ## [1] RColorBrewer_1.0-5 knitr_1.2 ## ## loaded via a namespace (and not attached): ## [1] digest_0.6.3 evaluate_0.4.3 formatR_0.7 stringr_0.6.2 ## [5] tools_2.15.3 ```