RPPAFit {SuperCurve} | R Documentation |
RPPAFit fits a four-parameter logistic model to the dilution
series in a reverse-phase protein array experiment. Individual sample
concentrations are estimated by matching individual sample dilution
series to the overall logistic response for the slide.
RPPAFit(rppa, design, measure, xform=function(x) x, method = c("pure", "mixed", "quantiles", "rlm"), ci = FALSE, ignoreNegative = TRUE, bayesian = FALSE, trace = FALSE, verbose = FALSE, veryVerbose = FALSE, warnLevel = 0)
rppa |
An RPPA object containing the raw data to be fit. |
design |
An RPPADesign object describing the layout
of the array. |
measure |
A character string identifying the column of the raw RPPA data that should be used to fit to the model. |
xform |
(Experimental) A function that takes a single input
vector and returns a single output vector of the same length. The
measure column is transformed using this function before
fitting the model. NOT YET IMPLEMENTED. |
method |
Optional parameter specifying the method for fitting the
parameters alpha and beta. The default method is
pure, which simply uses the optimal fit based on nonlinear
least squares. Setting method to mixed uses nls
to fit the three general model parameters, but uses rlm to fit
the sample-specific parameters. Setting method to
quantiles uses the 5th and 95th percentiles of the raw
data. Setting method to rlm tries to refit the values
(after an appropriate transformation) with a robust linear model. |
ci |
A logical value: if TRUE, then compute 90% confidence intervals on the concentration estimates. |
ignoreNegative |
A logical value: if TRUE, then negative values are converted to NA before fitting the model. |
bayesian |
A logical value: if TRUE, we use Bayesian methods to estimate per-sample values of the lower bound alpha. |
trace |
this is passed to nls in the bayesian portion of the routine. |
verbose |
a logical value; if TRUE, the function prints updates while it is fitting the data. |
veryVerbose |
a logical value; if TRUE, then the function prints voluminous updates as it is fitting each individual dilution series. |
warnLevel |
used to set the warn option before calling
rlm. Since this is wrapped in a try function, it
won't cause failure but will give us a chance to figure out which
dilution series are failing. Setting warnLevel to two or
greater may change the values returned by the function. |
The basic mathematical model is given by
Y = α + β*g(gamma*(X+delta_i)),
where Y is the observed intensity and X is the designed dilution step. The heart of the model is the logistic function g(x) = e^x/(1+e^x), which is the inverse of the logit function
f(p) = log(p/(1-p)).
By fitting a joint model, we assume that the parameters α, β, and gamma are the same for all dilution series on the array. The real point of the model, however, is to be able to draw inferences on the delta_i, which represent the (log) concentrations of the protein present in different dilution series.
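The model can be illustrated with a short sketch in base R. All parameter values below are hypothetical and chosen only for illustration; they are not defaults of RPPAFit.

```r
# Logistic function g and its inverse, the logit f
g <- function(x) exp(x) / (1 + exp(x))
f <- function(p) log(p / (1 - p))

# Hypothetical parameter values (not taken from any real slide)
alpha <- 100           # lower asymptote (background intensity)
beta  <- 5000          # range of intensities above the background
gamma <- 0.7           # slope, shared by every dilution series
delta <- c(-1, 0, 2)   # log concentrations of three dilution series

steps <- -2:2          # designed dilution steps X within one series

# Expected intensities: one row per series, one column per dilution step
Y <- t(sapply(delta, function(d) alpha + beta * g(gamma * (steps + d))))
```

Each row of Y traces a different segment of the same sigmoid curve; changing delta_i only shifts a series along the X axis.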
As the first step in fitting the model, we get crude estimates of the parameters α and β by computing the minimum and maximum of the observed intensities Y. We then perform a logit transformation, working with the values W = f((Y - α)/β). We then compute an initial estimate of gamma as the median (over all dilution series) of the slope of a robust linear fit to W as a function of the dilution steps X. Initial estimates of the individual delta_i are also computed robustly, conditional on the previously estimated parameters.
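The initialization can be sketched roughly as follows, on simulated data. This is an illustration under stated assumptions, not the package's internal code: the small padding of the minimum and maximum (so that the logit stays finite) and the use of a per-series median for delta_i are assumptions made here.

```r
library(MASS)  # provides rlm

set.seed(1)
g <- function(x) exp(x) / (1 + exp(x))
f <- function(p) log(p / (1 - p))

# Simulated slide (hypothetical values): 20 series with 5 dilution steps each
steps  <- rep(-2:2, times = 20)
series <- rep(1:20, each = 5)
delta  <- rnorm(20, sd = 1.5)
Y <- 100 + 5000 * g(0.7 * (steps + delta[series])) + rnorm(100, sd = 20)

# Crude alpha and beta from the extremes of Y.
# Assumption: pad the range slightly so that f() never sees exactly 0 or 1.
pad    <- 0.01 * diff(range(Y))
alpha0 <- min(Y) - pad
beta0  <- diff(range(Y)) + 2 * pad

# Logit-transformed intensities
W <- f((Y - alpha0) / beta0)

# Robust linear fit of W on X within each series; gamma is the median slope
slopes <- sapply(split(data.frame(W = W, X = steps), series),
                 function(d) coef(rlm(W ~ X, data = d))[["X"]])
gamma0 <- median(slopes)

# Given gamma0, W is roughly gamma0 * (X + delta_i), so a robust estimate
# of delta_i is the per-series median of W / gamma0 - X
delta0 <- tapply(W / gamma0 - steps, series, median)
```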
The next step depends on which method has been specified for model fitting. If method="pure" or method="mixed", then we use the nonlinear least squares function nls. Conditional on the current estimates of the delta_i, we use nls to update the estimates of the other three parameters. Then, conditional on the updated values of α, β, and gamma, we update the estimates of the delta_i one dilution series at a time. The update uses nls when method="pure" and uses rlm when method="mixed".
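One pass of the alternating update described above might look like the following sketch for the nls-only case. The simulated data, the starting values, and the use of try around the per-series fits are assumptions made for illustration; the package's own implementation may differ.

```r
set.seed(2)
g <- function(x) exp(x) / (1 + exp(x))

# Simulated slide (hypothetical values): 20 series with 5 dilution steps each
steps      <- rep(-2:2, times = 20)
series     <- rep(1:20, each = 5)
true_delta <- rnorm(20, sd = 1.5)
Y <- 100 + 5000 * g(0.7 * (steps + true_delta[series])) + rnorm(100, sd = 20)

# Current estimates of delta_i; here the truth plus error stands in for
# the robust initial estimates
delta <- true_delta + rnorm(20, sd = 0.3)

# Step 1: update alpha, beta, and gamma conditional on the delta_i
d <- data.frame(Y = Y, Xd = steps + delta[series])
global <- nls(Y ~ alpha + beta * g(gamma * Xd), data = d,
              start = list(alpha = min(Y), beta = diff(range(Y)), gamma = 1))
p <- coef(global)

# Step 2: update each delta_i conditional on alpha, beta, and gamma,
# one dilution series at a time
for (i in 1:20) {
  di <- data.frame(Y = Y[series == i], X = steps[series == i])
  fit_i <- try(nls(Y ~ p[["alpha"]] + p[["beta"]] * g(p[["gamma"]] * (X + delta_i)),
                   data = di, start = list(delta_i = delta[i])),
               silent = TRUE)
  # Keep the previous estimate if the fit for this series fails
  if (!inherits(fit_i, "try-error")) delta[i] <- coef(fit_i)[["delta_i"]]
}
```

With method="mixed", the second step would use rlm in place of nls; that variant is not shown here.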
If method="quantiles", then we retain the quantile estimates of α and β. In this case, we first use nls to update the value of gamma and then, conditional on that estimate, update the delta_i.
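A minimal sketch of quantile-based bounds on simulated intensities follows. Taking β as the difference between the 95th and 5th percentiles is an assumption made here to match the form Y = α + β*g(...); the text does not spell out the exact convention.

```r
set.seed(3)

# Simulated intensities (hypothetical values); plogis is the logistic function g
Y <- 100 + 5000 * plogis(rnorm(200, sd = 2)) + rnorm(200, sd = 20)

# 5th and 95th percentiles of the raw data
q <- quantile(Y, probs = c(0.05, 0.95), names = FALSE)

alpha_q <- q[1]          # lower bound
beta_q  <- q[2] - q[1]   # assumed: range between the two percentiles
```

Unlike the minimum and maximum, these bounds are insensitive to a few extreme spots.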
If method="rlm", we first follow the procedure described for method="pure". We then try to refit the estimates of α and β using a robust linear model with the rlm function from the MASS package. This computation is performed conditionally on the estimates of gamma and delta_i, in which case the observed intensities Y are linear in the sigmoid-transformed dilution steps, that is, in g(gamma*(X+delta_i)).
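The robust refit can be sketched as follows. Once gamma and the delta_i are held fixed, α and β are simply the intercept and slope of a linear model. The simulated data and the two artificial outliers are assumptions made for illustration.

```r
library(MASS)  # provides rlm

set.seed(4)
g <- function(x) exp(x) / (1 + exp(x))

# Simulated slide (hypothetical values) with gamma and delta_i treated as known
steps  <- rep(-2:2, times = 20)
series <- rep(1:20, each = 5)
delta  <- rnorm(20, sd = 1.5)
gamma  <- 0.7
Y <- 100 + 5000 * g(gamma * (steps + delta[series])) + rnorm(100, sd = 20)
Y[c(7, 42)] <- Y[c(7, 42)] + 3000   # two artificial outlying spots

# Sigmoid-transformed dilution steps; Y is linear in Z
Z <- g(gamma * (steps + delta[series]))

# Robust linear fit: the intercept estimates alpha, the slope estimates beta
fit <- rlm(Y ~ Z)
alpha_hat <- coef(fit)[["(Intercept)"]]
beta_hat  <- coef(fit)[["Z"]]
```

Because rlm downweights large residuals, the two outlying spots have little influence on the refitted α and β.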
The bayesian option alters the model by assuming that the baseline level α can be different for each dilution series. The globally estimated α is used as a strong prior, and the individual estimates of α are shrunk toward the global value. This idea is motivated by the possibility that background levels might be different on different parts of the reverse-phase protein array.
If the ci argument is set to TRUE, then the function also computes confidence intervals around the estimates of the log concentration. Since this step can be time-consuming, it is not performed by default. Moreover, confidence intervals can be computed after the main model is fit and evaluated, using the getConfidenceInterval function; see its documentation for details on how the intervals are estimated.
This function constructs and returns an object of the RPPAFit class.
Kevin R. Coombes <kcoombes@mdanderson.org>
KRC
RPPAFit-class, RPPA, RPPADesign
path <- system.file("rppaTumorData", package="SuperCurve")
erk2 <- RPPA("ERK2.txt", path=path)
design <- RPPADesign(erk2, grouping="blockSample",
                     controls=list("neg con", "pos con"))
fit.nls <- RPPAFit(erk2, design, "Mean.Net")
summary(fit.nls)
coef(fit.nls)