OOMPA

Object-Oriented Microarray and Proteomic Analysis

Note: As of 20 November 2008, OOMPA has been upgraded to work with R 2.8.0.

Beginning with this release, we have (finally) set up a repository of R code. In order to install the latest-and-greatest versions, you simply fire up your local version of R and use the command

install.packages(repos="http://bioinformatics.mdanderson.org/OOMPA")

Source code and Windows binaries are available. If someone wants to volunteer to put Macintosh binaries together, we will post those as well.
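To install specific packages rather than picking from the interactive list, the package names can be passed explicitly. The following is a minimal sketch: the helper name install_oompa is ours (hypothetical), the package names are those listed on this page, and we assume the repository still serves builds for your version of R.

```r
# Repository URL as given in the command above.
oompa_repo <- "http://bioinformatics.mdanderson.org/OOMPA"

# Package names taken from the list on this page.
oompa_pkgs <- c("oompaBase", "PreProcess", "ClassComparison",
                "ClassDiscovery", "TailRank", "SuperCurve")

# Hypothetical convenience wrapper; nothing is downloaded until it is called.
install_oompa <- function(pkgs = oompa_pkgs, repos = oompa_repo) {
  install.packages(pkgs, repos = repos)
}

# install_oompa("ClassComparison")   # install a single package
# install_oompa()                    # install the whole suite
```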

Descriptions and Older Versions

All binary packages on this page were compiled on a machine running R version 2.5.0, so they will not work with earlier versions. Minor enhancements were made to most of the packages. Major changes were made to the TailRank package, both to make the function naming conventions consistent with other packages (internal capitalization instead of periods to separate words) and to implement a new algorithm relying on a different distributional model. Read the vignettes in the online help to get more information.

For older versions, see the archive.

OOMPA is a suite of R libraries for the analysis of gene expression (RNA) microarray data and of proteomics profiling mass spectrometry data. OOMPA uses S4 classes to construct object-oriented tools with a consistent user interface. All higher level analysis tools in OOMPA work with the expressionSet classes defined in BioConductor. The lower level processing tools offer an alternative to parts of BioConductor, but can also be used to enhance the existing BioConductor packages.
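For readers new to S4, here is a small sketch of the pattern: a class with typed slots, a constructor function, and a method on a shared generic. The class ToyResult is hypothetical and is not part of OOMPA; it only illustrates how a common generic such as summary can give different objects the same user interface.

```r
library(methods)

# Hypothetical class for illustration only; not an OOMPA class.
setClass("ToyResult",
         representation(statistics = "numeric", call = "call"))

# Constructor function that also records the call that created the object.
ToyResult <- function(x) {
  new("ToyResult", statistics = x, call = match.call())
}

# A method on the existing 'summary' generic, so users call the same
# function regardless of which kind of result object they hold.
setMethod("summary", "ToyResult", function(object, ...) {
  cat("ToyResult with", length(object@statistics), "statistics\n")
  summary(object@statistics)
})

res <- ToyResult(c(1.5, -0.3, 2.2))
summary(res)
```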

The packages included in the current release (OOMPA 2.5) are

oompaBase
Class unions and generic functions for OOMPA.
PreProcess
Basic functions for microarray preprocessing, including objects that remember their history.
ClassComparison
Classes and methods for "class comparison" problems using microarray or proteomics data, including tests of differential expression.
ClassDiscovery
Classes and methods for "class discovery" with microarray or proteomics data.
TailRank
Implements the tail-rank statistic for selecting biomarkers from a microarray data set, an efficient nonparametric test focused on the distributional tails.
SuperCurve
Classes and methods for analyzing reverse-phase protein arrays (RPPA), where each sample is spotted on the slide in a dilution series and then assayed to determine the expression levels of specific forms of proteins.

The OOMPA suite of R libraries is the successor to the earlier Object-Oriented Microarray Analysis Library (OOMAL), which was originally written for S-Plus 2000. The incorporation of routines to analyze proteomics profiling data in addition to gene expression microarray data prompted a name change. (It also inspired our new icon. It suggests some possibilities for theme music, but we're pretty sure we don't want to go there.)


Picture of a tuba

oompaBase

Contains definitions that are needed in order for other packages to load properly. Some definitions (like class unions) must be visible before loading a package that uses them, and cannot be defined in the same package.
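As an illustration of what a class union is, the sketch below defines one and then uses it as a slot type in a second class. The names NumericOrNULL and ToyFit are hypothetical; the unions that oompaBase actually defines may be named differently.

```r
library(methods)

# Hypothetical union: a value that is either a numeric vector or NULL.
setClassUnion("NumericOrNULL", c("numeric", "NULL"))

# A class (imagine it lives in another package) that needs the union
# to exist already so it can declare an optional numeric slot.
setClass("ToyFit", representation(weights = "NumericOrNULL"))

a <- new("ToyFit", weights = c(0.2, 0.8))
b <- new("ToyFit", weights = NULL)
is.null(b@weights)
```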

PreProcess

Preliminary library for low-level preprocessing of microarray data. Provides tools for using consistent color schemes in diagnostic and other plots. Also defines the Processor and Pipeline classes, which let objects maintain a history of how they were produced.
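The idea of an object that carries its own processing history can be sketched in plain R. This is not the PreProcess API: the helper apply_step and the "history" attribute are hypothetical stand-ins for what the Processor and Pipeline classes provide.

```r
# Hypothetical helper, not part of PreProcess: apply one processing step
# and append a description of it to a "history" attribute on the result.
apply_step <- function(x, f, description) {
  old <- attr(x, "history")
  y <- f(x)
  attr(y, "history") <- c(old, description)
  y
}

raw <- c(120, 450, 80, 1000)
step1 <- apply_step(raw, function(v) v - min(v), "subtract minimum")
step2 <- apply_step(step1, function(v) log2(v + 1), "log2(x + 1)")

# The final object records, in order, the steps that produced it.
attr(step2, "history")
```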

ClassComparison

The ClassComparison library provides tools to perform "class comparison" analyses of microarray or proteomics data. Class comparison problems start with two or more known groups of samples, and ask the analyst to find genes or proteins that are different in some way between the groups. Methods implemented in this release include

  • Two-sample t-test
  • Fixed-effects linear models with ANOVA
  • Beta-uniform mixture (BUM) model to account for multiple testing by controlling the false discovery rate (FDR)
  • Wilcoxon rank-sum test with empirical Bayes
  • Significance Analysis of Microarrays (SAM)
  • Total Number of Misclassifications (TNoM)
  • Dudoit's adjustment of p-values to control the family-wise error rate (FWER)
  • Smooth t-test

Online help, manuals, source, and binary libraries are available.
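To make the class comparison workflow concrete, here is a sketch using only base R on simulated data: one two-sample t-test per gene, followed by a multiple-testing adjustment. It does not use the ClassComparison classes, and it swaps in the Benjamini-Hochberg adjustment from p.adjust where the package would use the BUM model. The data sizes and the 0.05 cutoff are arbitrary assumptions.

```r
set.seed(1)

# Simulated data: 200 genes (rows) by 20 samples (columns), two groups of 10.
n_genes <- 200
group <- factor(rep(c("A", "B"), each = 10))
expr <- matrix(rnorm(n_genes * 20), nrow = n_genes)

# Make the first 10 genes truly different between the groups.
expr[1:10, group == "B"] <- expr[1:10, group == "B"] + 2

# One two-sample (Welch) t-test per gene.
p_values <- apply(expr, 1, function(y) t.test(y ~ group)$p.value)

# Benjamini-Hochberg adjustment (used here in place of BUM) to control the FDR.
fdr <- p.adjust(p_values, method = "BH")
selected <- which(fdr < 0.05)
length(selected)
```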

ClassDiscovery

The ClassDiscovery library provides tools to perform "class discovery" analyses of microarray or proteomics data. Class discovery methods perform unsupervised analyses to try to "learn" or "discover" group structure in the data. Methods implemented in this release include

  • Nonparametric bootstrap to test the significance of clusters
  • Parametric bootstrap with Gaussian noise to test the significance of clusters
  • Principal components analysis of the biological samples
  • Mosaic plots (i.e., the red-green two-way hierarchical clustering plots introduced into the microarray world by Mike Eisen)
  • PCANOVA, which provides an "analysis of variance" inspired method that uses principal components to test whether putative group structures are really present in the data

Online help, manuals, source, and binary libraries are available.
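Two of the ideas above, principal components analysis of the samples and hierarchical clustering, can be sketched with base R alone. This does not use the ClassDiscovery classes; the simulated data, the correlation-based distance, and the average-linkage choice are all assumptions made for illustration.

```r
set.seed(2)

# Simulated data: 100 genes by 12 samples, with samples 7-12 shifted
# upward in the first 20 genes to create some group structure.
expr <- matrix(rnorm(100 * 12), nrow = 100)
expr[1:20, 7:12] <- expr[1:20, 7:12] + 3
colnames(expr) <- paste0("S", 1:12)

# PCA of the samples: prcomp treats rows as observations, so transpose.
pca <- prcomp(t(expr))
scores <- pca$x[, 1:2]   # coordinates of each sample on the first two PCs

# Hierarchical clustering of the samples using a correlation-based distance.
d <- as.dist(1 - cor(expr))
hc <- hclust(d, method = "average")
clusters <- cutree(hc, k = 2)
table(clusters)
```

Calling plot(hc) draws the dendrogram, and plot(scores) shows the samples in the plane of the first two components.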

TailRank

The Tail Rank test is a new method we have developed for finding biomarkers in microarray or proteomics data sets. The method is essentially non-parametric, focusing on the tails of the distributions in the two classes being compared. The method allows analysts to perform realistic sample size and power computations.
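The focus on distribution tails can be illustrated with a deliberately simplified sketch: set a per-gene threshold from the upper tail of a reference group, then count how many samples in the other group exceed it. This is not the algorithm implemented in the TailRank package, which relies on a distributional model; the empirical 95th percentile and the simulated data are assumptions.

```r
set.seed(3)

# Simulated data: 50 genes, 30 reference samples and 30 comparison samples.
healthy <- matrix(rnorm(50 * 30), nrow = 50)
disease <- matrix(rnorm(50 * 30), nrow = 50)

# Gene 1 is elevated in only a third of the comparison samples, a pattern
# that a test of mean differences may find harder to detect.
disease[1, 1:10] <- disease[1, 1:10] + 4

# Per-gene threshold: empirical 95th percentile of the reference group
# (an assumption made for this sketch).
threshold <- apply(healthy, 1, quantile, probs = 0.95)

# For each gene, count the comparison samples above that gene's threshold.
tail_count <- rowSums(disease > threshold)

# Genes ordered by how many samples fall in the upper tail.
head(order(tail_count, decreasing = TRUE))
```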

SuperCurve

SuperCurve is a package we have developed to analyze reverse-phase protein arrays. The package includes routines to load raw data files quantified by MicroVigene and to fit a four-parameter joint logistic model in order to estimate protein concentrations, along with methods to assess the quality of the fit.
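As a rough illustration of how a logistic response curve links spot intensity to concentration, the sketch below defines one common four-parameter logistic form and its inverse. The parameterization, the function names, and the parameter values are assumptions; the model SuperCurve actually fits, including how it ties the dilution series of all samples to one joint curve, may differ.

```r
# Assumed four-parameter logistic curve (names are illustrative):
#   alpha = lower asymptote (background), beta = range above background,
#   gamma = slope, delta = location on the log-concentration axis.
logistic4 <- function(x, alpha, beta, gamma, delta) {
  alpha + beta / (1 + exp(-gamma * (x - delta)))
}

# Inverse of the curve: map an observed intensity back to a log concentration.
# Only defined for intensities strictly between alpha and alpha + beta.
inverse_logistic4 <- function(y, alpha, beta, gamma, delta) {
  delta - log(beta / (y - alpha) - 1) / gamma
}

# A five-step two-fold dilution series on the log2 scale.
x <- c(0, -1, -2, -3, -4)
y <- logistic4(x, alpha = 100, beta = 5000, gamma = 1, delta = -2)

# Recover the log concentrations from the intensities.
x_back <- inverse_logistic4(y, alpha = 100, beta = 5000, gamma = 1, delta = -2)
all.equal(x_back, x)
```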