Supplementary Appendix

This web page contains supplementary material, including supplementary methods and supplementary reports describing all data analyses, for the manuscript entitled "Molecular Biomarkers of Residual Disease after Surgical Debulking of High-Grade Serous Ovarian Cancer" by the ovarian cancer working group at MD Anderson Cancer Center.

All analyses were performed by Keith A. Baggerly, Shelley M. Herbrich, Susan L. Tucker, or Anna Unruh.


This page was last updated on Tuesday, April 1, 2014. The files posted here will not be changed after publication, allowing the web site to serve as permanent documentation of our analysis. Any changes will be posted on a separate page designed for addenda, errata, corrigenda and other adjustments.


Our analyses make use of raw data (e.g. Affymetrix CEL files) from a variety of sources. These files are not reproduced here; instead, we provide links to where the data can be obtained.


Supplementary Methods:

Data for validation of biomarker datasets. The first of these was from the study of Bonome et al. [11]. We downloaded CEL files (Affymetrix U133A arrays, N=195; 185 tumor samples and 10 normal ovary samples) from the Gene Expression Omnibus (GEO; GSE26712) on September 10, 2012. The samples in this study were laser-capture microdissected, and the surgical outcome was recorded as optimal or suboptimal. These data were used to assess whether qualitative differences in gene expression observed in the first two datasets (TCGA and Tothill et al.) were present here as well. The other dataset was from the Cancer Cell Line Encyclopedia (CCLE) [12]. We downloaded the CEL files (Affymetrix U133+2 arrays, N=917) described in the initial CCLE publication from GEO (GSE36133) on September 14, 2012. These data were used to determine whether differences in gene expression seen in tumor samples are also present in ovarian cancer cell lines.
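For readers who wish to retrieve the same raw data, the following is a minimal sketch of how the download location for each GEO series could be constructed. It assumes GEO's conventional directory layout for series-level supplementary files (a `GSExxnnn` parent directory and a `<accession>_RAW.tar` archive). The function name `geo_raw_tar_url` is ours, and the layout should be checked against the GEO record pages before use. The sketch only builds the URLs and does not download anything.

```python
def geo_raw_tar_url(accession: str) -> str:
    """Build the conventional GEO URL for a series' raw-data archive.

    Assumption: GEO stores series-level supplementary files under
    /geo/series/<stub>/<accession>/suppl/<accession>_RAW.tar.
    """
    if not accession.startswith("GSE") or not accession[3:].isdigit():
        raise ValueError(f"not a GEO series accession: {accession!r}")
    digits = accession[3:]
    # GEO groups series into directories by replacing the last three digits
    # with "nnn", e.g. GSE26712 -> GSE26nnn; short accessions fall under GSEnnn.
    stub = "GSE" + digits[:-3] + "nnn"
    return (
        "https://ftp.ncbi.nlm.nih.gov/geo/series/"
        f"{stub}/{accession}/suppl/{accession}_RAW.tar"
    )


# The two validation series named in the text above.
for acc in ("GSE26712", "GSE36133"):
    print(acc, geo_raw_tar_url(acc))
```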

Quantitative RT-PCR analysis. Total RNA was extracted from the tumor tissues using the TRIzol® extraction method. RNA was then quantified on a NanoDrop instrument, and the 260/280 ratios were checked to assess quality. RNA (1 µg/sample) was reverse transcribed into cDNA using the Verso cDNA kit (Thermo Scientific, West Palm Beach, FL) according to the manufacturer's protocol.

qRT-PCR was performed on a 7500 PCR system (Applied Biosystems, Warrington, UK) using 1 µL of cDNA for each sample. SYBR green (Applied Biosystems) was used to detect the products, and 20 pmol of primer was used for each reaction. All reactions were carried out with 20 µL of reaction mix and were performed in triplicate. We used the following primers: for FABP4, 5'-TGATGATCATGTTAGGTTTGGC-3' (forward) and 5'-TGGAAACTTGTCTCCAGTGAA-3' (reverse); for ADH1B, 5'-AGGGTAGAGGAGGCTGAAGA-3' (forward) and 5'-ACCTGCTTCACTCTGGGAAA-3' (reverse). The PCR reactions were run under the following conditions: 50°C for 2 minutes, 95°C for 15 minutes, followed by 40 cycles at 95°C for 1 minute each. All reactions were analyzed with the 7500 Applied Biosystems PCR software (v.2.0.5). The cycle threshold (Ct) values of the target genes were initially normalized to the Ct values of 18S rRNA, and melt curves were checked to determine the specificity of the reactions.
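The initial normalization of target Ct values to 18S rRNA can be illustrated with a short sketch. The Ct values below are invented for illustration, the function names are ours, and the conversion to a relative amount assumes perfect doubling per cycle (efficiency of 2). That is exactly the kind of efficiency assumption the final quantification was designed to avoid, so this sketch shows the conventional starting point rather than the analysis reported in the manuscript.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Mean target Ct minus mean reference Ct across technical replicates."""
    return mean(target_cts) - mean(reference_cts)


def relative_amount(dct, efficiency=2.0):
    """Relative target amount implied by a delta-Ct.

    Assumption: the same amplification efficiency applies to the target and
    the reference; 2.0 means perfect doubling in every cycle.
    """
    return efficiency ** (-dct)


# Hypothetical triplicate Ct values for one tumor sample.
fabp4_cts = [24.0, 24.5, 23.5]
rrna_18s_cts = [12.0, 12.5, 11.5]

dct = delta_ct(fabp4_cts, rrna_18s_cts)
print("delta Ct:", dct)
print("relative amount:", relative_amount(dct))
```

A larger delta Ct means less target RNA relative to 18S rRNA, so any bias in the Ct values feeds directly into the estimated amount.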

Initial examination of the qRT-PCR results showed that some gene-specific fluorescence thresholds automatically selected by the commercial PCR software were artificially low, resulting in overestimation of Ct values and underestimation of the amount of target RNA (see Supplementary Report: Problems with default PCR quantifications). We therefore quantified the PCR samples using initial concentration estimates obtained with the "window of linearity" method [S1]. This approach provides a simple, well-specific summary of the initial amount that is independent of efficiency assumptions.
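The idea behind a window-of-linearity estimate can be sketched as follows. During the exponential phase, fluorescence in a well follows roughly F_c = N0 * E^c, so log10(F_c) is linear in the cycle number c: the slope gives the efficiency E for that well and the intercept gives the starting amount N0. This is a simplified illustration and not the procedure of [S1], which also estimates the baseline and selects the window automatically. Here the readings are assumed to be baseline-corrected already, the window is chosen by hand, and all names and data are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_n0(fluorescence, window):
    """Estimate starting amount and efficiency for one well.

    fluorescence: baseline-corrected readings; index 0 holds cycle 1.
    window: (first_cycle, last_cycle), inclusive and 1-based, chosen
            (by assumption) to lie within the log-linear phase.
    """
    first, last = window
    cycles = list(range(first, last + 1))
    logs = [math.log10(fluorescence[c - 1]) for c in cycles]
    slope, intercept = fit_line(cycles, logs)
    efficiency = 10 ** slope   # fold increase per cycle in this well
    n0 = 10 ** intercept       # extrapolated fluorescence at cycle 0
    return n0, efficiency


# Synthetic well: exponential growth capped by a plateau.
true_n0, true_eff, plateau = 1e-6, 1.9, 5.0
curve = [min(true_n0 * true_eff ** c, plateau) for c in range(1, 41)]

n0, eff = estimate_n0(curve, window=(15, 20))
print("estimated N0:", n0)
print("estimated efficiency:", eff)
```

Because the efficiency is fitted separately for each well, the estimate of N0 does not depend on a fixed threshold or on assuming perfect doubling.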

Supplementary References:

S1. Ruijter JM, Ramakers C, Hoogaars WM, Karlen Y, Bakker O, van den Hoff MJ, Moorman AF. Amplification efficiency: linking baseline and bias in the analysis of quantitative PCR data. Nucleic Acids Res 2009;37:e45.


Supplementary Table 1: Probesets (N=47) and associated genes (N=38) having consistent differences in expression between residual disease (RD) and No-RD patients in the TCGA and Tothill data sets at a 10% false discovery rate in each data set.

Gene Probeset
ADAM12 213790_at
ADH1B 209612_s_at
ADH1B 209613_s_at
ADIPOQ 207175_at
ALDH1A3 203180_at
ALDH5A1 203609_s_at
AQP1 209047_at
BCHE 205433_at
COL11A1 37892_at
COL16A1 204345_at
COL3A1 201852_x_at
COL5A1 203325_s_at
COL6A2 213290_at
COL8A1 214587_at
CRISPLD2 221541_at
CXCL12 203666_at
CXCL12 209687_at
CYR61 201289_at
DCN 201893_x_at
DCN 209335_at
DCN 211813_x_at
DCN 211896_s_at
ETV1 221911_at
FABP4 203980_at
FAP 209955_s_at
GADD45B 207574_s_at
GADD45B 209304_x_at
GADD45B 209305_s_at
GFPT2 205100_at
GREM1 218468_s_at
GREM1 218469_at
KCNE4 222379_at
LUM 201744_s_at
NBL1 201621_at
NBL1 37005_at
NFYA 204107_at
OMD 205907_s_at
PDGFD 219304_s_at
PDLIM3 209621_s_at
PDPN 221898_at
POLR1C 207515_s_at
PTGIS 208131_s_at
SVEP1 213247_at
TIMP3 201150_s_at
VGLL3 220327_at
VSIG4 204787_at
XYLT1 213725_x_at
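
The selection rule summarized in the caption of Supplementary Table 1 (significant at a 10% false discovery rate in each data set, with differences in the same direction) can be sketched as below. The caption does not state which FDR procedure was used, so the Benjamini-Hochberg step-up rule here is an assumption made purely for illustration; the Supplementary Reports describe the actual analysis. The probeset labels, p-values and effect sizes are invented.

```python
def bh_reject(pvalues, fdr=0.10):
    """Benjamini-Hochberg step-up rule; True where the test is rejected."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= fdr * rank / m:
            cutoff_rank = rank  # largest rank meeting the BH bound
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        reject[i] = rank <= cutoff_rank
    return reject


def consistent_probesets(stats_a, stats_b, fdr=0.10):
    """Probesets significant in both data sets with the same sign of effect.

    stats_a, stats_b: dict mapping probeset -> (p_value, effect), where
    effect > 0 means higher expression in RD than in No-RD patients.
    """
    shared = sorted(set(stats_a) & set(stats_b))
    rej_a = bh_reject([stats_a[p][0] for p in shared], fdr)
    rej_b = bh_reject([stats_b[p][0] for p in shared], fdr)
    return [
        p
        for p, ra, rb in zip(shared, rej_a, rej_b)
        if ra and rb and stats_a[p][1] * stats_b[p][1] > 0
    ]


# Hypothetical per-probeset results from two data sets.
dataset_a = {"probe_1": (0.001, 1.2), "probe_2": (0.02, -0.8),
             "probe_3": (0.03, 0.5), "probe_4": (0.60, 0.1)}
dataset_b = {"probe_1": (0.004, 0.9), "probe_2": (0.01, 0.7),
             "probe_3": (0.04, 0.6), "probe_4": (0.50, -0.2)}

print(consistent_probesets(dataset_a, dataset_b))
```

In this toy example probe_2 passes the FDR cutoff in both data sets but changes in opposite directions, so it is excluded, which is the role of the consistency requirement.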


Supplementary Reports:

Here is a list of the supplementary reports, which are provided in HTML format. These reports were produced using knitr, markdown and RStudio.