This web page contains supplementary material, including supplementary methods and supplementary reports describing all data analyses, for the manuscript entitled "Molecular Biomarkers of Residual Disease after Surgical Debulking of High-Grade Serous Ovarian Cancer" by the ovarian cancer working group at MD Anderson Cancer Center.
All analyses were performed by Keith A. Baggerly, Shelley M. Herbrich, Susan L. Tucker, or Anna Unruh.
This page was last updated on Tuesday, April 1, 2014. The files posted here will not be changed after publication, allowing the web site to serve as permanent documentation of our analysis. Any changes will be posted on a separate page designed for addenda, errata, corrigenda and other adjustments.
Our analyses make use of raw data (e.g., Affymetrix CEL files) from a variety of sources. We do not reproduce these files here; instead, we provide links to where the data can be obtained.
Data for validation of biomarker datasets. The first of these was from the study of Bonome et al. We downloaded CEL files (Affymetrix U133A arrays, N=195; 185 tumor samples and 10 normal ovary) from the Gene Expression Omnibus (GEO; GSE26712) on September 10, 2012. The samples in this study were laser-capture microdissected, and the surgical outcome was recorded as optimal or suboptimal. These data were used to assess whether qualitative differences in gene expression observed in the first two datasets (TCGA and Tothill et al.) were present here as well. The other dataset was from the Cancer Cell Line Encyclopedia (CCLE). We downloaded CEL files (Affymetrix U133 Plus 2.0 arrays, N=917) described in the initial CCLE publication from GEO (GSE36133) on September 14, 2012. These data were used to determine whether differences in gene expression seen in tumor samples are present in ovarian cancer cell lines.
Quantitative RT-PCR analysis. Total RNA was extracted from the tumor tissues using the TRIzol® extraction method. RNA was then quantified on a NanoDrop spectrophotometer, and the 260/280 absorbance ratios were checked to assess quality. RNA (1 µg/sample) was reverse transcribed into cDNA using the Verso cDNA kit (Thermo Scientific, West Palm Beach, FL) according to the manufacturer's protocol.
qRT-PCR was performed on a 7500 PCR system (Applied Biosystems, Warrington, UK) using 1 µL of cDNA for each sample. SYBR green (Applied Biosystems) was used to detect the products, and 20 pmol of primer was used for each reaction. All reactions were carried out with 20 µL of reaction mix and were performed in triplicate. We used the following primers: for FABP4, 5'-TGATGATCATGTTAGGTTTGGC-3' (forward) and 5'-TGGAAACTTGTCTCCAGTGAA-3' (reverse); for ADH1B, 5'-AGGGTAGAGGAGGCTGAAGA-3' (forward) and 5'-ACCTGCTTCACTCTGGGAAA-3' (reverse). The PCR reactions were run under the following conditions: 50°C for 2 minutes, 95°C for 15 minutes, followed by 40 cycles at 95°C for 1 minute each. All reactions were analyzed with the 7500 Applied Biosystems PCR software (v.2.0.5). The cycle threshold (Ct) values of the target genes were initially normalized to the Ct values of 18S rRNA, and melt curves were checked to determine the specificity of the reactions.
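The initial normalization of target Ct values to 18S rRNA can be illustrated with a minimal sketch. It assumes the common delta-Ct convention (mean target Ct minus mean reference Ct across the triplicate wells) and, for the conversion to a relative amount, perfect doubling per cycle. Neither assumption is stated in the text above, and the Ct values and function names are hypothetical.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Mean Ct of the target replicates minus mean Ct of the reference replicates."""
    return mean(target_cts) - mean(reference_cts)


def relative_expression(target_cts, reference_cts):
    """Relative amount of target versus reference as 2^(-delta Ct).

    ASSUMPTION: amplification efficiency is exactly 2 (perfect doubling
    per cycle); the source does not state this.
    """
    return 2.0 ** (-delta_ct(target_cts, reference_cts))


# Hypothetical triplicate Ct values, for illustration only.
fabp4_cts = [24.0, 24.2, 23.8]
rrna_18s_cts = [12.0, 12.1, 11.9]

d_ct = delta_ct(fabp4_cts, rrna_18s_cts)
rel = relative_expression(fabp4_cts, rrna_18s_cts)
print(f"delta Ct = {d_ct:.2f}, relative expression = {rel:.3e}")
```

Because this calculation depends directly on the Ct values reported by the instrument software, any problem with the automatically chosen fluorescence threshold carries through to the normalized result.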
Initial examination of the qRT-PCR results showed that some gene-specific fluorescence thresholds automatically selected by the commercial PCR software were artificially low, resulting in overestimation of Ct values and underestimation of the amount of target RNA (see Supplementary Report: Problems with default PCR quantifications). We therefore quantified the PCR samples with initial concentration estimates obtained using the "window of linearity" method [S1]. This approach provides a simple, well-specific summary of the initial amount that is independent of efficiency assumptions.
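The idea behind a window-of-linearity estimate can be sketched as follows. This is a simplified illustration, not the code used in our reports. It assumes baseline-corrected fluorescence follows F(c) = N0 × E^c during the exponential phase, so that a straight-line fit of log10 fluorescence against cycle number within a chosen window gives log10(E) as the slope and log10(N0) as the intercept. The window here is supplied by hand, whereas the published method selects it from the data. All function names and the synthetic curve are hypothetical.

```python
import math


def fit_log_linear(cycles, fluorescence):
    """Ordinary least squares of log10(fluorescence) on cycle number.

    Returns (intercept, slope).
    """
    logs = [math.log10(f) for f in fluorescence]
    n = len(cycles)
    mean_x = sum(cycles) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in cycles)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cycles, logs))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def estimate_n0(cycles, fluorescence, window):
    """Estimate starting amount N0 and per-cycle efficiency E for one well.

    `window` is an inclusive (first_cycle, last_cycle) pair that the caller
    believes lies in the exponential phase (ASSUMPTION: chosen by hand here).
    """
    first, last = window
    points = [(c, f) for c, f in zip(cycles, fluorescence) if first <= c <= last]
    xs, ys = zip(*points)
    intercept, slope = fit_log_linear(xs, ys)
    return 10 ** intercept, 10 ** slope


# Synthetic, noise-free amplification curve with a plateau (illustration only).
true_n0, true_eff = 1e-6, 1.9
cycles = list(range(1, 41))
curve = [min(true_n0 * true_eff ** c, 5.0) for c in cycles]

n0, eff = estimate_n0(cycles, curve, window=(15, 20))
print(f"estimated N0 = {n0:.3e}, estimated efficiency = {eff:.3f}")
```

In this sketch the efficiency is estimated from each well's own curve rather than fixed in advance, which is why the resulting N0 does not rest on an assumed efficiency.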
S1. Ruijter JM, Ramakers C, Hoogars WM, Karlen Y, Bakker O, van den Hoff MJ, Moorman AF. Amplification efficiency: linking baseline and bias in the analysis of quantitative PCR data. Nucleic Acids Res 2009;37:e45.
Supplementary Table 1: Probesets (N=47) and associated genes (N=38) having consistent differences in expression between residual disease (RD) and No-RD patients in the TCGA and Tothill data sets at a 10% false discovery rate in each data set.
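The selection rule described for Supplementary Table 1 can be illustrated with a small sketch: keep probesets that pass a 10% false discovery rate cutoff in each data set separately and whose differences point in the same direction in both. The FDR procedure used here is Benjamini-Hochberg, which is an assumption made for illustration; the procedure actually used is described in the individual reports. The probeset identifiers, p-values, and effect estimates are hypothetical.

```python
def bh_reject(p_values, fdr=0.10):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans, True where the hypothesis is rejected.
    ASSUMPTION: BH is used only as an example of an FDR-controlling rule.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= fdr * rank / m:
            cutoff_rank = rank  # largest rank meeting the BH condition
    reject = [False] * m
    for i in order[:cutoff_rank]:
        reject[i] = True
    return reject


def consistent_probesets(ids, p_a, effect_a, p_b, effect_b, fdr=0.10):
    """Probesets significant in both data sets with the same sign of effect."""
    sig_a = bh_reject(p_a, fdr)
    sig_b = bh_reject(p_b, fdr)
    return [
        pid
        for pid, a, b, ea, eb in zip(ids, sig_a, sig_b, effect_a, effect_b)
        if a and b and ea * eb > 0
    ]


# Hypothetical inputs: four probesets tested in two data sets.
ids = ["ps1", "ps2", "ps3", "ps4"]
p_a = [0.001, 0.02, 0.04, 0.60]
effect_a = [1.2, -0.8, 0.5, 0.1]   # e.g. RD minus No-RD difference, data set A
p_b = [0.004, 0.03, 0.50, 0.70]
effect_b = [0.9, 0.7, 0.4, -0.2]   # same quantity, data set B

selected = consistent_probesets(ids, p_a, effect_a, p_b, effect_b)
print(selected)
```

In this toy example, "ps2" passes the cutoff in both data sets but is excluded because its difference changes sign, and "ps3" is excluded because it is significant in only one data set.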
Here is a list of the supplementary reports, which are provided in HTML format. These reports were produced using knitr, markdown and RStudio.
Our analysis source code relies on a number of software programs and auxiliary packages; we provide scripts, not stand-alone executables. Detailed descriptions of the packages (with version numbers) are listed in the individual reports. The pieces of software required to execute the source code can be obtained from the following locations: