Department of Bioinformatics and Computational Biology



The Cancer Proteome Atlas

Description: A comprehensive resource for accessing, visualizing, and analyzing cancer functional proteomics.
Development Information
Language: JavaScript (with some R data processing)
Current Version: 2.10
License: Not required
Status: Active
Last Updated: 2016-11-18
Citations: Li J, Lu Y, Akbani R, Ju Z, Roebuck PL, Liu W, Yang J-Y, Broom BM, Verhaak RGW, Kane DW, Wakefield C, Weinstein JN, Mills GB, Liang H. (2013). TCPA: A Resource for Cancer Functional Proteomics Data. Nature Methods 10(11), 1046-1047.
News: 45 citations (as reported by Web of Science, 2016-12)
Help and Support
Contact: Han Liang
Discussion: Project Forum

Please click this link to enter the official TCPA site.

Functional proteomics represents a powerful approach to rapidly improve our understanding of the pathophysiology and therapy of cancer. To give the broader research community easier access to cancer proteomics datasets, we have developed a user-friendly data portal, TCPA (The Cancer Proteome Atlas). The current data release contains 8167 tumor samples in total, mainly consisting of TCGA tumor tissue sample sets. TCPA currently provides six modules: Summary, My Protein, Download, Visualization, Analysis, and Cell line. Importantly, this resource provides a unique opportunity to validate findings from TCGA data and to identify model cell lines for functional investigation.


Browser Requirements

We recommend HTML5-compliant browsers such as Safari, Chrome, and Firefox; depending on the version, Internet Explorer may not perform optimally. JavaScript must be enabled in the browser.

Frequently Asked Questions

  • What is functional proteomics?

Functional proteomics is the large-scale study of proteins at the level of functional activity, such as expression and modification. Studies of complex diseases such as cancer have shown that genetic alterations do not account for all causes of the disease. Changes in protein levels and structure, which are not reflected by genetic changes, have also been shown to play critical roles in tumor development and progression. In cancers, several genetic and epigenetic changes are often required for the disease to develop. Studying large-scale post-translational changes such as protein phosphorylation or cleavage will greatly aid in understanding the causes of cancers and other complex diseases and in determining effective treatments.

  • What is RPPA?

Reverse phase protein array (RPPA) is a high-throughput antibody-based technique with procedures similar to those of Western blots. Proteins are extracted from tumor tissue or cultured cells, denatured by SDS, printed on nitrocellulose-coated slides, and then probed with antibodies. Our RPPA platform currently allows for the analysis of >1000 samples using at least 130 different antibodies.

  • What are the advantages of RPPA?
    • Inexpensive, high-throughput method utilizing automation for increased quality and reliability
    • Sample preparation requirements are similar to those of Western blots
    • Complete assay requires only 40 microliters of each sample for 150 antibodies
    • Robust quantification due to serial dilution of samples
  • How are the RPPA data processed?
    • Level 1 data

Cellular proteins are first denatured by 1% SDS (with beta-mercaptoethanol) and serially diluted in five 2-fold steps in dilution buffer (lysis buffer containing 1% SDS). Serially diluted lysates are arrayed on nitrocellulose-coated slides (Grace Biolabs) by the Aushon 2470 Arrayer and probed with validated antibodies. Signals are amplified by TSA and visualized by a DAB colorimetric reaction. The slides are then scanned, analyzed, and quantified with ArrayPro Analyzer to generate spot intensities.
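As a small illustration of the dilution series described above, the sketch below lists the relative concentration of each dilution point and its log2 offset. It assumes that the first point is the undiluted lysate and that the series has five points in total; the function name and parameters are hypothetical and are not part of the TCPA pipeline.

```python
import math


def dilution_series(n_points=5, fold=2):
    """Return (relative_concentrations, log2_offsets) for a serial dilution.

    Assumption: point 0 is the undiluted lysate and each later point is
    diluted `fold` times relative to the one before it.
    """
    relative = [1.0 / (fold ** k) for k in range(n_points)]
    offsets = [-k * math.log2(fold) for k in range(n_points)]
    return relative, offsets


relative, offsets = dilution_series()
print(relative)
print(offsets)
```

On a log2 scale the five points are evenly spaced one unit apart, which is why dilution curves are usually examined against log2-concentration.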

    • Level 2 data

Based on Level 1 data, each dilution curve of spot intensities is fitted using the monotone increasing B-spline model in the SuperCurve R package. This fits a single curve using all the samples on a slide, with the signal intensity as the response variable and the dilution step as the independent variable. For diagnostic purposes, the fitted curve is plotted with the signal intensities on the y-axis and the log2-concentration of proteins on the x-axis.

    • Level 3 data

Based on Level 2 data, the data are normalized as follows:

  1. Calculate the median for each protein across all the samples.
  2. Subtract the median (from step 1) from values within each protein.
  3. Calculate the median for each sample across all proteins.
  4. Subtract the median (from step 3) from values within each sample.
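The four normalization steps above amount to median-centering each protein and then each sample. The following is a minimal sketch using only the Python standard library. It assumes the Level 2 values are arranged as a samples-by-proteins matrix with no missing values; the function name `median_center` is hypothetical, and this is not the code used to produce the TCPA Level 3 files.

```python
from statistics import median


def median_center(matrix):
    """Median-center a samples x proteins matrix given as a list of rows.

    Steps 1-2: subtract each protein's median (taken across samples).
    Steps 3-4: subtract each sample's median (taken across proteins).
    """
    n_proteins = len(matrix[0])

    # Steps 1-2: one median per protein (column), subtracted column-wise.
    protein_medians = [median(row[j] for row in matrix) for j in range(n_proteins)]
    by_protein = [[v - m for v, m in zip(row, protein_medians)] for row in matrix]

    # Steps 3-4: one median per sample (row), subtracted row-wise.
    result = []
    for row in by_protein:
        sample_median = median(row)
        result.append([v - sample_median for v in row])
    return result


level2 = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 9.0],
    [3.0, 6.0, 6.0],
]
for row in median_center(level2):
    print(row)
```

After the second centering every sample has a median of zero, but the per-protein medians are in general no longer exactly zero, so the order of the two steps matters.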
  • How do we quantify protein expression and modification?

We use the "SuperCurve Fitting" approach developed by the Department of Bioinformatics and Computational Biology at MD Anderson Cancer Center to quantify protein expression and modification. Briefly, a "standard curve" is constructed from the 5808 spots on each slide (each slide is probed with one antibody). These spots include 5 serial dilutions of each sample plus 528 QC spots of standard lysates at different concentrations. Relative levels of protein expression and modification for each sample are determined by interpolating each dilution curve onto the "standard curve" (supercurve) of the slide (antibody).
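The spot counts above imply (5808 - 528) / 5 = 1056 samples per slide. To make the interpolation idea concrete, here is a simplified sketch. It is not the SuperCurve algorithm, which fits the curve and the sample concentrations jointly with a monotone B-spline. The sketch assumes the standard curve is already available as a strictly increasing table of (log2-concentration, intensity) points and that the dilution points of a sample sit at log2 offsets 0, -1, -2, -3, -4. All function names are hypothetical.

```python
from bisect import bisect_left
from statistics import median


def samples_per_slide(total_spots, qc_spots, dilutions):
    """Number of samples implied by the spot layout of one slide."""
    return (total_spots - qc_spots) // dilutions


def inverse_interpolate(curve, intensity):
    """Map an intensity back to log2-concentration on a tabulated curve.

    `curve` is a list of (log2_conc, intensity) pairs, strictly increasing
    in both coordinates. Values outside the table are clamped to its ends.
    """
    xs = [x for x, _ in curve]
    ys = [y for _, y in curve]
    if intensity <= ys[0]:
        return xs[0]
    if intensity >= ys[-1]:
        return xs[-1]
    i = bisect_left(ys, intensity)  # ys[i - 1] < intensity <= ys[i]
    x0, x1, y0, y1 = xs[i - 1], xs[i], ys[i - 1], ys[i]
    return x0 + (x1 - x0) * (intensity - y0) / (y1 - y0)


def estimate_log2_level(curve, intensities, offsets):
    """Estimate a sample's undiluted log2 level from its dilution series.

    Each spot gives one estimate (position on the curve minus its dilution
    offset); the median of these is returned.
    """
    estimates = [inverse_interpolate(curve, y) - d
                 for y, d in zip(intensities, offsets)]
    return median(estimates)


# Illustrative numbers only, not real RPPA data.
curve = [(-4.0, 100.0), (-2.0, 200.0), (0.0, 400.0), (2.0, 800.0)]
offsets = [0.0, -1.0, -2.0, -3.0, -4.0]
intensities = [600.0, 400.0, 300.0, 200.0, 150.0]

print(samples_per_slide(5808, 528, 5))
print(estimate_log2_level(curve, intensities, offsets))
```

Taking the median over the dilution points is one simple way to limit the influence of spots that fall in the flat, saturated, or near-background parts of the curve, where clamping makes individual estimates unreliable.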

  • Can I combine all RPPA data together or RPPA data from different cancers for analysis?

As with any other biological assay, there are batch variations between RPPA assays. At this time, it is not possible to directly combine the raw or normalized (Level 3) protein values. We have developed a replicate-based normalization (RBN) method to combine RPPA data from different slides, and you should use the RPPA datasets marked with RBN (e.g., pancancer 11 RBN).


For questions or support related to the use of the TCPA web app, please visit the project's forum page.
For questions about how the RPPA data are generated or antibodies used, contact Dr. Yiling Lu.
For other inquiries, contact Dr. Han Liang.


Before using TCGA data, please read the TCGA guidelines on publication and moratoriums.

This web app is for educational and research purposes only.