The consensus molecular subtypes (CMS) were identified by the Colorectal Cancer Subtyping Consortium (CRCSC) by assembling the gene expression measurements of 4151 colorectal cancer (CRC) patients from a collection of 18 international studies. The consensus subtypes were identified primarily based on the biologic characteristics of colorectal cancer. Subsequently, other studies have demonstrated the prognostic and predictive value of CMS in colorectal cancer.
This application implements an FFPE-based CMS classifier on the NanoString platform that has strong accuracy for predicting the CMS of colorectal cancer tumor samples. The gene classifier was discovered and validated in silico using the CRCSC data sets and subsequently optimized based on the degree of correlation across tissue types (formalin-fixed vs. fresh frozen samples) and platform types (NanoString vs. Affymetrix).
Further, this FFPE tissue-based gene classifier was validated in a CLIA-certified molecular diagnostic laboratory, where it demonstrated the prognostic significance of CMS in CRC. Apart from the four CMS subtypes (CMS1, CMS2, CMS3 and CMS4), the classifier may also output an "Indeterminant" type, representing mixed CMS subtypes in the corresponding sample.
Each NanoString experiment contains three sets of controls: a set of positive controls, composed of a linear titration of in vitro transcribed synthetic RNA transcripts and their corresponding probes; a set of negative controls, consisting of probes with no sequence homology to human RNA sequences; and a set of housekeeping genes. Quality control (QC) and normalization of the data are performed according to the nSolver Data Analysis Software guidelines.
Data quality is assessed by checking a set of QC metrics. Samples are removed from further downstream analysis based on how they perform on each of these metrics.
Samples that pass all QC metrics are included in the sample normalization performed by the nSolver software. The normalization involves three steps: a) a normalization with respect to the geometric mean of the positive control spike counts, b) a normalization with respect to the geometric mean of a group of housekeeping reference genes, and c) a background correction that consists of subtracting the mean plus two standard deviations of the negative control counts.
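The three normalization steps can be illustrated with a minimal Python sketch. This is not the nSolver implementation. It assumes that each scaling factor is the cross-sample average of the geometric means divided by the sample's own geometric mean, that the sample standard deviation is used for the background, and that background-corrected counts are clipped at zero. The probe names, sample identifiers, and function names are hypothetical.

```python
import math
import statistics


def geometric_mean(values):
    """Geometric mean of strictly positive counts."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def scale_to_reference(samples, reference_probes):
    """Scale every count in a sample so that the geometric mean of its
    reference probes matches the average geometric mean across samples
    (assumed convention, not taken from the nSolver documentation)."""
    geo = {sid: geometric_mean([counts[p] for p in reference_probes])
           for sid, counts in samples.items()}
    target = statistics.mean(list(geo.values()))
    return {sid: {probe: count * target / geo[sid]
                  for probe, count in counts.items()}
            for sid, counts in samples.items()}


def subtract_background(samples, negative_probes):
    """Subtract mean + 2 * SD of the negative control counts per sample.
    Clipping at zero and using the sample SD are assumptions."""
    corrected = {}
    for sid, counts in samples.items():
        neg = [counts[p] for p in negative_probes]
        background = statistics.mean(neg) + 2 * statistics.stdev(neg)
        corrected[sid] = {probe: max(count - background, 0.0)
                          for probe, count in counts.items()}
    return corrected


def normalize(samples, positive, housekeeping, negative):
    step_a = scale_to_reference(samples, positive)       # a) positive controls
    step_b = scale_to_reference(step_a, housekeeping)    # b) housekeeping genes
    return subtract_background(step_b, negative)         # c) background correction


# Hypothetical raw counts for two samples; s2 is s1 at twice the input amount.
raw_counts = {
    "s1": {"POS_A": 100, "POS_B": 400, "HK_1": 50, "HK_2": 200,
           "NEG_A": 2, "NEG_B": 4, "GENE_X": 1000},
    "s2": {"POS_A": 200, "POS_B": 800, "HK_1": 100, "HK_2": 400,
           "NEG_A": 4, "NEG_B": 8, "GENE_X": 2000},
}
normalized = normalize(raw_counts,
                       positive=["POS_A", "POS_B"],
                       housekeeping=["HK_1", "HK_2"],
                       negative=["NEG_A", "NEG_B"])
```

In this toy example the two samples differ only by a constant factor, so both receive the same normalized value for GENE_X after step a). In practice the normalized CSV exported by nSolver should be used as input to the application rather than a re-implementation such as this one.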
The CSV output file generated by the nSolver software after data normalization is used as input to this application to predict CMS. The sample data provided (Download) shows the input format the classifier expects.
Manuscript in Review
For feedback and questions regarding this application, contact dmaru@mdanderson.org.