Logistic Model of Residual Disease by FABP4 Expression Levels in the Validation Cohort
========================================================

Susan L. Tucker


```r
opts_chunk$set(tidy = TRUE, message = TRUE)
```


## 1 Executive Summary

### 1.1 Introduction

The goal of this analysis is to illustrate the relationship between FABP4 level and incidence of residual disease (RD) in the validation cohort using logistic regression.

### 1.2 Data \& Methods

We load the RData object containing the results of the validation study, including PCR measurements of FABP4 and RD status for each patient.

A logistic regression model is fitted to the data, describing RD as a function of FABP4.

Patients are sorted by FABP4 value and divided into 4 groups of 34-35 patients each. The mean and standard deviation of FABP4 are computed for each group. The incidence of RD per group is computed, and its standard deviation is estimated using binomial statistics.

The grouped data are plotted, with the fit of the logistic model shown for comparison.

### 1.3 Results

The incidence of RD in the 4 groups, in order of increasing FABP4, is 14/34 (41%), 22/35 (63%), 18/35 (51%) and 30/35 (86%), respectively.

### 1.4 Conclusion

The plot indicates a continuous trend toward increasing incidence of RD over the entire range of FABP4 values, with an estimated incidence of about 30% at the lowest values of FABP4 observed.

## 2 Loading \& Processing Data

The data object containing the PCR values and RD information is loaded. 


```r
load(file.path("RDataObjects", "PCRResults.RData"))
```


The FABP4 and RD information is extracted. 


```r
fabp4 <- PCRResults$FABP4
# Code RD as a 0/1 indicator: 1 if RDStatus is "Yes", 0 otherwise
RD <- rep(0, length(fabp4))
RD[PCRResults$RDStatus == "Yes"] <- 1
```


The data are sorted by increasing FABP4 values.


```r
# Compute the sort order once and apply it to both vectors
ord <- order(fabp4)
fabp4Sorted <- fabp4[ord]
RDSorted <- RD[ord]
```


The patients are grouped into 4 groups of 34-35 patients each.


```r
numGp <- 4
nGp <- c(34, 35, 35, 35)
gp <- c(rep(1, 34), rep(2, 35), rep(3, 35), rep(4, 35))
```
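
The group labels can also be derived from the vector of group sizes rather than typed out by hand. The following is a minimal sketch using only the group sizes stated above; the name `gpCheck` is introduced here for illustration and is not part of the original analysis.

```r
# Group sizes as stated in this report: one group of 34, three of 35
nGp <- c(34, 35, 35, 35)
numGp <- length(nGp)

# Repeat each group index as many times as the group has patients
gpCheck <- rep(seq_len(numGp), times = nGp)

# Each label should occur exactly nGp[i] times
table(gpCheck)
```

Deriving the labels this way keeps `gp` and `nGp` consistent if the group sizes are ever changed.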


## 3 Analyses

We compute the mean and standard deviation of FABP4 values per group. We also determine the incidence of RD per group and compute its standard deviation using binomial statistics.


```r
meanPCR <- c()
sdPCR <- c()
kRD <- c()

for (i in 1:numGp) {
    meanPCR <- c(meanPCR, mean(fabp4Sorted[gp == i]))
    sdPCR <- c(sdPCR, sd(fabp4Sorted[gp == i]))
    kRD <- c(kRD, sum(RDSorted[gp == i] == 1))
}

# Binomial SD of the incidence, sqrt(p * (1 - p)/n) with p = kRD/nGp,
# written in terms of counts
sdRD <- sqrt(kRD * (nGp - kRD)/nGp)/(nGp)
incRD <- kRD/nGp
```
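
As a check on the expression used for `sdRD`: it is algebraically the same as the usual binomial standard error of a proportion, `sqrt(p * (1 - p)/n)` with `p = kRD/nGp`. The sketch below confirms this numerically using the group counts reported in Section 1.3; `seRD` is a name introduced here for illustration.

```r
# Counts of patients with RD and group sizes, as reported in Section 1.3
kRD <- c(14, 22, 18, 30)
nGp <- c(34, 35, 35, 35)

# Observed incidence per group
incRD <- kRD/nGp

# Form used in this report
sdRD <- sqrt(kRD * (nGp - kRD)/nGp)/nGp

# Textbook binomial standard error of a proportion
seRD <- sqrt(incRD * (1 - incRD)/nGp)

# The two forms should agree to numerical precision
all.equal(sdRD, seRD)

# Incidence as rounded percentages, for comparison with Section 1.3
round(100 * incRD)
```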


A logistic model is fitted to the data.


```r
fitPCR <- glm(RDSorted ~ fabp4Sorted, family = "binomial")
```
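
To read off a fitted model's estimated incidence at particular predictor values, `predict()` with `type = "response"` returns probabilities rather than log-odds. The sketch below uses simulated data, since the study data are not reproduced here. The names `x`, `y`, `simFit`, `grid`, and `pHat`, and the coefficients used in the simulation, are illustrative assumptions and not estimates from the validation cohort.

```r
set.seed(1)

# Simulated stand-in data (NOT the study data): 139 predictor values and
# a binary outcome whose probability increases with the predictor
x <- rnorm(139)
y <- rbinom(139, size = 1, prob = plogis(0.5 + 0.8 * x))

# Fit a logistic model with the same family as in this report
simFit <- glm(y ~ x, family = "binomial")

# Predicted probabilities on an evenly spaced grid over the observed range
grid <- data.frame(x = seq(min(x), max(x), length.out = 50))
pHat <- predict(simFit, newdata = grid, type = "response")

range(pHat)
```

Evaluating on a grid gives an evenly sampled curve, whereas `fitted.values` are available only at the observed predictor values.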


A plot is produced showing incidence of RD as a function of FABP4 assayed by qRT-PCR.

The points show the observed incidence of RD in each group, plotted at the mean value of FABP4 per group. Horizontal error bars represent +/- 1 standard deviation of the FABP4 values per group. Vertical error bars represent +/- 1 standard deviation of the incidence, computed using binomial statistics.

The dashed curve shows the fit of the logistic model to the ungrouped data.


```r

plot(meanPCR, incRD, pch = 16, ylim = c(0, 1), xlim = c(min(fabp4), max(fabp4)), 
    xlab = "FABP4", ylab = "Incidence of Residual Disease", main = "PCR Data")

points(fabp4Sorted, fitPCR$fitted.values, type = "l", lty = 2)

for (i in 1:numGp) {
    x <- c(meanPCR[i] - sdPCR[i], meanPCR[i] + sdPCR[i])
    y <- c(incRD[i], incRD[i])
    points(x, y, type = "l", lty = 1)
}

for (i in 1:numGp) {
    x <- c(meanPCR[i], meanPCR[i])
    y <- c(incRD[i] - sdRD[i], incRD[i] + sdRD[i])
    points(x, y, type = "l", lty = 1)
}
```

![plot of chunk plotLogitPCR](figure/plotLogitPCR.png) 
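
The error bars drawn in the two loops above could alternatively be drawn with vectorized calls to `segments()`. The following is a sketch only: the incidences come from the counts reported in Section 1.3, but the x positions and horizontal spreads (`xMean`, `xSD`) are placeholders introduced for illustration, not the FABP4 group means or standard deviations.

```r
# Incidence and binomial SD from the counts reported in Section 1.3
kRD <- c(14, 22, 18, 30)
nGp <- c(34, 35, 35, 35)
incRD <- kRD/nGp
sdRD <- sqrt(incRD * (1 - incRD)/nGp)

# Placeholder x positions and spreads (NOT the FABP4 group means or SDs)
xMean <- 1:4
xSD <- rep(0.25, 4)

plot(xMean, incRD, pch = 16, xlim = c(0.5, 4.5), ylim = c(0, 1),
    xlab = "Group (placeholder axis)", ylab = "Incidence of Residual Disease")

# Horizontal and vertical error bars, one vectorized call each
segments(xMean - xSD, incRD, xMean + xSD, incRD)
segments(xMean, incRD - sdRD, xMean, incRD + sdRD)
```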


## 4 Appendix

### 4.1 File Location


```r
getwd()
```

```
## [1] "/Users/slt/SLT WORKSPACE/EXEMPT/OVARIAN/Ovarian residual disease study 2012/RD manuscript/Web page for paper/Webpage"
```


### 4.2 SessionInfo


```r
sessionInfo()
```

```
## R version 3.0.2 (2013-09-25)
## Platform: x86_64-apple-darwin10.8.0 (64-bit)
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] knitr_1.5
## 
## loaded via a namespace (and not attached):
## [1] evaluate_0.5.1 formatR_0.10   stringr_0.6.2  tools_3.0.2
```