Department of Bioinformatics and Computational Biology



MD Anderson Cell Lines Project

Description: A comprehensive resource for accessing, visualizing, and analyzing functional proteomics of cancer cell lines.

Development Information
Language: JavaScript (with some R data processing)
Current Version: 1.0.5
License: Not required
Status: Active
Last Updated: 2016-09-27
Citations: Li J, Zhao W, Akbani R, Liu W, Ju Z, Ling S, Vellano CP, Roebuck P, Yu Q, Eterovic AK, Byers LA, Davies MA, Deng W, Gopal YNV, Chen G, von Euw EM, Slamon D, Conklin D, Heymach JV, Gazdar AF, Minna JD, Myers JN, Lu Y, Mills GB, Liang H. (2016). Characterization of Human Cancer Cell Lines by Reverse-Phase Protein Arrays. Cancer Cell (In Press).
News: Coming Soon!

Help and Support
Contact: Han Liang

Please click this link to enter the official MCLP site.

Cancer cell lines are major model systems for mechanistic investigation and drug development. However, protein expression data linked to high-quality DNA, RNA and drug screening data have not been available across a large number of cancer cell lines. Using reverse-phase protein arrays, we measured expression levels of ~230 key cancer-related proteins in >650 independent cell lines, many of which are shared with other public cell line resources. Our dataset recapitulates the effects of mutated pathways on protein expression observed in patient samples, and demonstrates that protein markers and particularly phosphoproteins provide information content for predicting drug sensitivity that is not available from the corresponding mRNAs. We developed an interactive, user-friendly bioinformatic resource named MCLP for analyzing the data in a rich context, providing a valuable resource for the broader biomedical research community.


Browser Requirements

We recommend HTML5-compliant browsers such as Safari, Chrome and Firefox; depending on the version, performance in Internet Explorer may be suboptimal. JavaScript must be enabled in the browser.

Frequently Asked Questions

  • What is functional proteomics?

Functional proteomics is the large-scale study of proteins at the level of functional activity, such as expression and modification. Studies of complex diseases such as cancer have shown that genetic alterations do not account for all causes of the disease. Changes in protein levels and structure, which are not reflected by genetic changes, have also been shown to play critical roles in tumor development and progression. In cancers, several genetic and epigenetic changes are often required for development of the disease. Studying large-scale protein-level changes such as phosphorylation or cleavage will greatly aid in understanding the causes of cancers and other complex diseases and in determining effective treatments.

  • What is RPPA?

Reverse-phase protein array (RPPA) is a high-throughput antibody-based technique with procedures similar to those of Western blotting. Proteins are extracted from tumor tissue or cultured cells, denatured by SDS, printed on nitrocellulose-coated slides, and then probed with antibodies. Our RPPA platform currently allows for the analysis of >1000 samples using at least 130 different antibodies.

  • What are the advantages of RPPA?
    • Inexpensive, high-throughput method utilizing automation for increased quality and reliability
    • Sample preparation requirements are similar to those of Western blots
    • A complete assay requires only 40 microliters of each sample for 150 antibodies
    • Robust quantification due to serial dilution of samples
  • How do we collect cell lines?

We collected cancer cell line samples through the CCSG-supported Cell Line Characterization Core facility (Houston, TX) and from a number of outside collaborations. All cell lines prepared at the MD Anderson Cancer Center were confirmed by short tandem repeat (STR) analysis in the core per institutional policy. The cell lines and their STR profiles are routinely checked against public databases such as the "Database of Cross-Contaminated or Misidentified Cell Lines". Our outside collaborators also routinely confirm cell lines by STR analysis. For details, use the "Data Sets" module to check the numbers of total samples and independent samples for each cell line lineage. In MCLP, the analysis modules use independent cell lines only.

  • How do we validate our antibodies?

We extensively validate our antibodies before applying them to our RPPA analysis: (i) a single or dominant band on Western blotting is required, and dynamic range and specificity are determined using peptides, phosphopeptides, growth factors, inhibitors, RNAi, and cells with a wide range of expression levels; and (ii) we use the Pearson correlation coefficient to quantify how well the RPPA data correlate with Western blotting.

  • How do we define the antibody validation status?
    • Validated as ELISA
      RPPA and Western blotting have Pearson correlation R >= 0.7.
    • Use with Caution
      RPPA and Western blotting have Pearson correlation R between 0.6 and 0.7.
    • Used for QC
      These antibodies are used for tissue sample quality control (QC).
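As a rough illustration of how the correlation-based statuses above could be assigned, the sketch below computes a Pearson correlation between paired RPPA and Western blot measurements and maps it to a status label. The function names are hypothetical, treating R = 0.6 as "Use with Caution" is an assumption, and the page does not say what happens below 0.6, so the sketch returns None in that case. "Used for QC" is not correlation-based and is not covered here.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)

def validation_status(r):
    """Map a correlation R to the status labels listed above.

    Assumption: the lower bound 0.6 is inclusive. Values below 0.6
    are not defined on this page, so None is returned for them.
    """
    if r >= 0.7:
        return "Validated as ELISA"
    if r >= 0.6:
        return "Use with Caution"
    return None

# Hypothetical paired measurements for one antibody.
rppa = [1.0, 2.0, 3.0, 4.0]
western = [2.0, 4.0, 6.0, 8.0]
status = validation_status(pearson(rppa, western))
```

Here the two series are perfectly linearly related, so the antibody would fall in the first category.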
  • How are the RPPA data processed?
    • Level 1 data

Cellular proteins are first denatured by 1% SDS (with beta-mercaptoethanol) and diluted in five 2-fold serial dilutions in dilution buffer (lysis buffer containing 1% SDS). Serially diluted lysates are arrayed on nitrocellulose-coated slides (Grace Bio-Labs) by the Aushon 2470 Arrayer and probed with validated antibodies. Signals are amplified by tyramide signal amplification (TSA) and visualized by DAB colorimetric reaction. The slides are then scanned, analyzed and quantified by ArrayPro Analyzer to generate spot intensities.
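To make the dilution scheme concrete, the sketch below lists the relative concentrations of a five-step, 2-fold serial dilution and their log2 offsets. It assumes the first spot is the undiluted lysate, which the description above does not state explicitly; the function name is hypothetical.

```python
import math

def dilution_series(steps=5, fold=2.0):
    """Relative concentrations of a serial dilution.

    Assumption: step 0 is the undiluted lysate (relative concentration 1).
    """
    return [1.0 / fold ** k for k in range(steps)]

concentrations = dilution_series()
# On a log2 scale, each 2-fold dilution step lowers the concentration by 1.
log2_offsets = [math.log2(c) for c in concentrations]
```

Under that assumption the spots of one sample are evenly spaced on the log2-concentration axis, one unit apart.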

    • Level 2 data

Based on Level 1 data, each dilution curve of spot intensities is fitted using the monotone increasing B-spline model in the SuperCurve R package. This fits a single curve using all the samples on a slide, with the signal intensity as the response variable and the dilution step as the independent variable. For diagnostic purposes, the fitted curve is plotted with the signal intensities on the y-axis and the log2 concentration of proteins on the x-axis.

    • Level 3 data

Based on Level 2 data, the data are normalized as follows:

  1. Calculate the median for each protein across all the samples.
  2. Subtract the median (from step 1) from values within each protein.
  3. Calculate the median for each sample across all proteins.
  4. Subtract the median (from step 3) from values within each sample.
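The four steps above can be sketched as a two-pass median centering. This is a minimal illustration, not the project's own code: the function name and the samples-by-proteins list layout are assumptions, and the sample medians in step 3 are assumed to be computed on the values already centered in step 2.

```python
from statistics import median

def median_center(matrix):
    """Median-center a samples-by-proteins matrix (a list of rows).

    Each row is one sample; each column is one protein.
    """
    n_proteins = len(matrix[0])
    # Steps 1-2: subtract each protein's median across all samples.
    protein_medians = [median(row[j] for row in matrix) for j in range(n_proteins)]
    centered = [[row[j] - protein_medians[j] for j in range(n_proteins)]
                for row in matrix]
    # Steps 3-4: subtract each sample's median across all proteins.
    result = []
    for row in centered:
        sample_median = median(row)
        result.append([value - sample_median for value in row])
    return result

# Hypothetical Level 2 values: 3 samples (rows) x 3 proteins (columns).
level2 = [
    [1.0, 4.0, 2.0],
    [3.0, 6.0, 7.0],
    [5.0, 5.0, 3.0],
]
level3 = median_center(level2)
```

After the second pass every sample has a median of zero across proteins; the protein medians are no longer guaranteed to be exactly zero, because the sample-wise shift is applied last.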
    • Level 4 data

As with any other biological assay, there is batch variation between RPPA runs, so the raw or normalized (Level 3) protein values from different slides cannot be combined directly. We have developed a replicate-based method to combine RPPA data from different slides; the result is what we call Level 4 data.
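This page does not describe the replicate-based method in detail. Purely as an illustration of the general idea, the sketch below aligns one batch to a reference batch by estimating a per-protein offset from samples measured in both batches and subtracting it. It is a simplified stand-in, not the MCLP procedure: the function names, the dictionary layout, and the mean-shift-only adjustment are all assumptions.

```python
from statistics import mean

def replicate_offsets(reference, batch, replicate_ids):
    """Per-protein mean difference (batch minus reference) over replicates.

    `reference` and `batch` map sample id -> list of protein values.
    `replicate_ids` lists samples measured in both batches.
    """
    if not replicate_ids:
        raise ValueError("at least one replicate sample is required")
    n_proteins = len(reference[replicate_ids[0]])
    return [
        mean(batch[s][j] - reference[s][j] for s in replicate_ids)
        for j in range(n_proteins)
    ]

def align_batch(reference, batch, replicate_ids):
    """Shift every sample in `batch` onto the scale of `reference`."""
    offsets = replicate_offsets(reference, batch, replicate_ids)
    return {
        sample: [v - o for v, o in zip(values, offsets)]
        for sample, values in batch.items()
    }

# Hypothetical data: samples A and B were run in both batches.
reference = {"A": [1.0, 2.0], "B": [3.0, 4.0]}
batch = {"A": [1.5, 1.0], "B": [3.5, 3.0], "C": [2.5, 0.0]}
aligned = align_batch(reference, batch, ["A", "B"])
```

In this toy case the replicates agree exactly after the shift, and sample C, which was only run in the second batch, is moved onto the reference scale by the same offsets.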


For questions or support related to the use of the MCLP web app, please visit the project's forum page.
For questions about how the RPPA data are generated or antibodies used, contact Dr. Yiling Lu.
For other inquiries, contact Dr. Han Liang.


This web app is for educational and research purposes only.