From MD Anderson Bioinformatics
| OOMPA |
Object-Oriented Microarray and Proteomic Analysis
|| OOMPA is an object-oriented microarray and proteomics analysis library implemented in R using S4 classes and compatible with BioConductor.
|| Any platform running R
|| Artistic License, 2.0
|| May 2013
|| Two new packages (integIRTy and SIBER) have been added. To avoid conflicts with existing CRAN packages, the GenAlg package was renamed GenAlgo. OOMPA currently works with R-3.0.0.
|| Kevin R. Coombes
|| Project Forum
OOMPA is a suite of R libraries for the analysis of gene expression (RNA) microarray data and of proteomics profiling mass spectrometry data. OOMPA uses S4 classes to construct object-oriented tools with a consistent user interface.
All higher-level analysis tools in OOMPA work with the ExpressionSet classes defined in BioConductor. The lower-level processing tools offer an alternative to parts of BioConductor, but can also be used to enhance the existing BioConductor packages.
OOMPA is the successor to the earlier Object-Oriented Microarray Analysis Library (OOMAL), which was originally written for S-Plus 2000. The incorporation of routines to analyze proteomics profiling data in addition to gene expression microarray data prompted a name change. (It also inspired our icon. It suggests some possibilities for theme music, but we're pretty sure we don't want to go there.)
The packages included in the current release (OOMPA 3.0) are:
- oompaBase contains definitions that are needed in order for other packages to load properly. Some definitions (like class unions) must be visible before loading a package that uses them, and cannot be defined in the same package.
- PreProcess is the basic library for low-level preprocessing of microarray data. It provides tools for using consistent color schemes in diagnostic and other plots. It also defines the Processor and Pipeline classes, which let objects maintain a history of how they were produced.
- The ClassComparison library provides tools to perform "class comparison" analyses of microarray or proteomics data. Class comparison problems start with two or more known groups of samples, and ask the analyst to find genes or proteins that differ in some way between the groups. The tools include:
  - Two-sample t-test
  - Fixed-effects linear models with ANOVA
  - Beta-uniform mixture (BUM) model to account for multiple testing by controlling the false discovery rate (FDR)
  - Wilcoxon rank-sum test with empirical Bayes
  - Significance Analysis of Microarrays (SAM)
  - Total Number of Misclassifications (TNoM)
  - Dudoit's adjustment of p-values to control the family-wise error rate (FWER)
  - Smooth t-test
- The ClassDiscovery library provides tools to perform "class discovery" analyses of microarray or proteomics data. Class discovery methods perform unsupervised analyses to try to "learn" or "discover" group structure in the data. The tools include:
  - Nonparametric bootstrap to test the significance of clusters
  - Parametric bootstrap with Gaussian noise to test the significance of clusters
  - Principal components analysis of the biological samples
  - Mosaic plots (i.e., the red-green two-way hierarchical clustering plots introduced into the microarray world by Mike Eisen)
  - PCANOVA, which provides an "analysis of variance" inspired method that uses principal components to test whether putative group structures are really present in the data
  - New! We now include functions to compute the bimodality index, a tool for ranking genes by how likely they are to follow a "useful" bimodal distribution. This method was introduced by Wang et al. in a manuscript to appear in Cancer Informatics.
- The Tail Rank test is a new method we have developed for finding biomarkers in microarray or proteomics data sets. The method is essentially non-parametric, focusing on the tails of the distributions in the two classes being compared. The method allows analysts to perform realistic sample size and power computations.
- CrossVal is a package to automate cross-validation of omics-based predictive models.
- CRAAC is a package for "consistent, robust, algorithm-agnostic clustering", containing tools for consensus clustering that combines the results of multiple clustering algorithms.
- GenAlgo is a package that implements a genetic algorithm, with a specific focus on performing feature selection from omics datasets to develop predictive models.
- NameNeedle implements the Needleman-Wunsch global alignment algorithm, in a sufficiently general form that it can be used to match variant spellings of cell line names, as described in the paper "Blasted cell line names" by Jing Wang et al. in Cancer Inform. 2010 Oct 14;9:251-5.
- Umpire is a package that implements tools for simulating realistic gene expression data based on a variety of biological principles. An introduction to the package can be found in the paper "UMPIRE: Ultimate Microarray Prediction, Inference, and Reality Engine" by Jiexin Zhang et al. in BIOTECHNO 2011, The Third International Conference on Bioinformatics, Biocomputational Systems and Biotechnologies.
- integIRTy is a package that uses Item Response Theory to integrate data from multiple "omics" platforms, with a goal of identifying genes that are significantly altered in cancer. (To install integIRTy, use oompainstall(groupName="irt").)
- SIBER is a package to identify bimodally expressed genes from RNA sequencing data. (To install SIBER, use oompainstall(groupName="siber").)
- SuperCurve is a package we have developed to analyze reverse-phase protein arrays. The package includes routines to load raw data files quantified by MicroVigene, to fit a four-parameter joint logistic model in order to estimate protein concentrations, and to assess the quality of the fit.
- SuperCurveGUI provides a graphical user interface for the SuperCurve package.
- SlideDesignerGUI is a graphical tool to allow researchers to describe the location and concentration of different positive and negative controls on a reverse-phase protein array.
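To make the "class comparison" idea from the package list concrete, here is a minimal sketch in base R. It does not use the ClassComparison API. It runs one two-sample t-test per gene on simulated data and then adjusts the p-values with the Benjamini-Hochberg procedure, which is a swapped-in stand-in for the beta-uniform mixture approach mentioned above. The matrix size, group labels, effect size, and the 0.05 cutoff are all assumptions chosen for illustration.

```r
# Simulated data: rows are genes, columns are samples (assumed layout).
set.seed(1)
nGenes <- 200
nPerGroup <- 10
expr <- matrix(rnorm(nGenes * 2 * nPerGroup), nrow = nGenes)
group <- factor(rep(c("A", "B"), each = nPerGroup))

# Shift the first 20 genes in group B so there is a real difference to find.
expr[1:20, group == "B"] <- expr[1:20, group == "B"] + 2

# One two-sample (Welch) t-test per gene.
pvals <- apply(expr, 1, function(x) {
  t.test(x[group == "A"], x[group == "B"])$p.value
})

# Benjamini-Hochberg adjustment to control the false discovery rate.
# (Illustrative substitute; not the BUM model.)
fdr <- p.adjust(pvals, method = "BH")

# Genes called different between the groups at the assumed cutoff.
selected <- which(fdr < 0.05)
```

Adjusting the per-gene p-values matters because hundreds or thousands of tests are run at once; without it, many of the "significant" genes would be false positives.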
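The Needleman-Wunsch global alignment mentioned in the package list can be sketched in a few lines of base R. This is a plain illustration of the dynamic programming recurrence, not the NameNeedle implementation; the function name `nwScore` and the scoring values (match, mismatch, gap) are assumptions.

```r
# Global alignment score via Needleman-Wunsch dynamic programming.
# Scoring values are illustrative defaults, not taken from NameNeedle.
nwScore <- function(a, b, match = 1, mismatch = -1, gap = -1) {
  x <- strsplit(a, "")[[1]]
  y <- strsplit(b, "")[[1]]
  n <- length(x)
  m <- length(y)
  # score[i + 1, j + 1] holds the best score for aligning the first i
  # characters of a with the first j characters of b.
  score <- matrix(0, nrow = n + 1, ncol = m + 1)
  score[, 1] <- gap * (0:n)
  score[1, ] <- gap * (0:m)
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      diagonal <- score[i, j] + if (x[i] == y[j]) match else mismatch
      up <- score[i, j + 1] + gap     # character of a aligned to a gap in b
      left <- score[i + 1, j] + gap   # character of b aligned to a gap in a
      score[i + 1, j + 1] <- max(diagonal, up, left)
    }
  }
  score[n + 1, m + 1]
}

# Variant spellings of the same cell line still score close to a perfect match.
nwScore("MCF7", "MCF-7")
nwScore(toupper("HeLa"), toupper("HELA"))
```

Because the alignment is global, every character of both names contributes to the score, so a single inserted hyphen costs only one gap penalty while unrelated names score much lower.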
To use the current version of OOMPA, you must have R version >=2.9.
Since SuperCurve is part of the OOMPA package, please also check the SuperCurve system requirements.
Download and Installation
Beginning with Release 2.8, we set up a proper repository of R code. With Release 2.9, believing in the adage that good programmers write good code, but great programmers steal great code, we blithely stole the repository management scripts from BioConductor and adapted them to work for OOMPA. These adapted scripts automatically install the version of OOMPA that works with your version of R.
So, the simplest way to install the OOMPA packages is now to fire up your local version of R and use the command:
These commands will install the basic OOMPA packages. In order to get a slightly larger set of (default) packages, you can execute the command
If you want to get everything (which may include some experimental packages that are still being developed), then use the command
Here is a complete list of the allowed values of groupName:
Alternatively, if you want more control over which packages get installed, execute the following command and select from the resulting list.
By default, the installation routines assume that you have already installed anything that they depend on from CRAN or BioConductor. In order to get those packages installed automatically, you need more elaborate R code:
myRepos <- c(getOption("repos"),
Here you will find documentation for the individual components of OOMPA, written in the standard R package documentation format. Sample code can also be found within the manuals.
Current Version (v2.14.0)
- Released in conjunction with the new version of R.
- Released in conjunction with the new version of R. Added the integIRTy and SIBER packages, and updated some documentation.
- Released in conjunction with the new version of R. Variety of updates to various packages.
- Released in conjunction with the new version of R. Adds a new package, Umpire, to simulate realistic microarray data.
OOMPA Version 2.12 includes experimental versions of two new packages:
- ArrayCube: builds on fundamental classes from BioConductor to define a structure that generalizes the MINiML format used at the Gene Expression Omnibus. The main enhancement over MINiML format is the inclusion of an annotated data frame containing sample characteristics. The package provides routines to convert an ArrayCube into either an AffyBatch or an RGList, as appropriate.
- MINiML: reads files in the MINiML format, as downloaded from the Gene Expression Omnibus, and stores them in R as ArrayCubes.
You can install these packages by setting groupName="arraycube" in the oompainstall function.
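As a rough sketch of how the groupName argument is used, the helper below wraps the oompainstall call quoted on this page. The helper name `installGroup` and the vector `knownGroups` are hypothetical, and the list holds only group names that appear on this page, so other valid values may exist. Since oompainstall is only defined once the OOMPA installation script has been loaded, the sketch checks for it instead of assuming it is present.

```r
# Group names quoted on this page; this is not the complete list.
knownGroups <- c("irt", "siber", "arraycube")

# Hypothetical helper, not part of OOMPA.
installGroup <- function(groupName) {
  # This sketch only accepts the group names listed above.
  stopifnot(groupName %in% knownGroups)
  if (exists("oompainstall", mode = "function")) {
    oompainstall(groupName = groupName)
    TRUE
  } else {
    message("oompainstall is not defined; load the OOMPA installation script first.")
    FALSE
  }
}

# Example: request the experimental ArrayCube and MINiML packages.
installGroup("arraycube")
```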
- OOMPA Version 2.11 includes the GenAlg package, which provides an R implementation of a genetic algorithm that can be used for feature selection. You can install this package by setting groupName="genalg" in the oompainstall function.
- Moved (duplicated) matrix mean and variance code from the ClassComparison and TailRank packages into the oompaBase package.
- Moved the color functions (e.g., redgreen, jetColors) from the ClassDiscovery package into the oompaBase package, in order to simplify dependencies.
- Added support for namespaces
For Frequently Asked Questions, Bug Reports, and other concerns, please visit the forum at this link