Datasets

This site is a repository for selected datasets collected and analyzed by investigators at MD Anderson. We have tried to provide a reasonable amount of explanation for each dataset. Certain tools used to analyze these data are also posted under Software. The pages for the individual datasets are linked below.

Proteomics

Testing Response to Chemotherapy in Breast Cancer, Pusztai et al 2004
This dataset consists of 620 sample and QC SELDI spectra used in Pusztai et al, "Pharmacoproteomic Analysis of Prechemotherapy and Postchemotherapy Plasma Samples from Patients Receiving Neoadjuvant or Adjuvant Chemotherapy for Breast Carcinoma", Cancer 2004; 100:1814-1822.
Summary of Study: NAF plasma samples were taken before and after paclitaxel or FAC (5-fluorouracil, doxorubicin, and cyclophosphamide) chemotherapy from patients with Stage I-III breast carcinoma to measure proteomic changes in response to the chemotherapy. Samples from healthy women were also taken to help identify breast carcinoma-associated protein markers. Full Abstract

An Example Analysis Using Cromwell, Coombes et al 2005
To show how to use Cromwell, one of our current analysis packages, we've created an example using serum quality control (QC) data derived from the Pusztai et al 2004 dataset. The Cromwell package is described in Coombes et al, Proteomics 2005; to appear. An earlier version of this paper is available from the MDACC Bioinformatics page as a Technical Report (UTMDABTR-001-04).

Additional Resources

For analyzing proteomic data, we currently use:

- The Cromwell Package (MATLAB scripts developed here).
- The Rice Wavelet Toolbox, which Cromwell uses (MATLAB; precompiled binaries exist for many platforms).