Modifications to "Run Batch Effects and Ovarian Cancer"

This web page is designed to hold updates or changes to our web site supplement for the manuscript "Run batch effects potentially compromise the usefulness of genomic signatures for ovarian cancer," by Keith A. Baggerly, E. Shannon Neeley, and Kevin R. Coombes.

Contents

An archive of the incorrect data we initially examined is here.
Initial Rebuttal
We posted an initial response to the Dressman et al. reply (KRC; 10 March 2008).
Correspondence Published
Our correspondence appeared, along with a reply from Dressman, Potti, Nevins and Lancaster (KRC; 1 March 2008).

Initial Rebuttal

Posted 10 March 2008 (KRC)

The volume of the Journal of Clinical Oncology where our correspondence was published contains a response from Dressman et al. Not surprisingly, we disagree with several of their assertions. Here we provide a point-by-point rebuttal.

  1. We pointed out that the mapping from quantifications to sample identifiers was scrambled. They acknowledge the scrambling. They assert that it happened when assembling the data to post to the web and had no effect on the analysis. The latter assertion is not verifiable. They also assert that the correct data have now been posted, but as of this morning, the errors had not been corrected.
  2. We pointed out that the status of patients (with unchanged survival times) changed both from alive to dead and from dead to alive across multiple studies they have published. They acknowledge the errors. They assert that the errors occurred because they omitted the censoring variables when the data were posted to the web. Since the posted data contained censoring variables, we do not understand this assertion. They further assert that this error did not affect their analysis. This assertion appears to be incorrect, since Figure 2 in their paper can easily be seen to have used the incorrectly censored data.
  3. We said, "We identified 107 Affymetrix probeset IDs corresponding to the 'best' 100 genes reported by Dressman et al.; ambiguities in annotation led to some duplication." In response, they said, "Ambiguity in matching probeset IDs with genes is not relevant since the model identifies a unique set of probeset IDs, not genes." Their reply is disingenuous. We agree that the model identifies probe set IDs. However, they never reported the probe set IDs; instead, they only reported the gene names and symbols. Failure to report the actual results is an impediment to reproducibility.
  4. We said, "The CEL files can be grouped into clearly separated batches on the basis of run date. Response and survival are confounded with run date, particularly with the samples processed earliest." They claim first that this assertion is not true. They follow this with a claim that it does not matter, since their "method of analysis ... corrects for differences due to batch." It is not completely clear why they would correct for an effect that is not present. The full description they provide as to how they did this is: "The remaining RMA data were further processed by applying sparse regression model methods [33] to correct for assay artifacts." We find that this description provides too little detail for us to attempt a full reproduction.

    Their main point, however, is a claim that our own report "clearly stated ... that there is no evident confounding aspect with respect to clinical response." We admit that the summary in our report (ovca04) mistakenly stated "Most clinical data (stage, grade, debulking, response) do not appear to be confounded with run date." At the time we were writing that summary, we were more concerned with confounding with survival. On page 6 of ovca04, we point out that some batches are confounded with response. We apologize for the confusion caused by our poorly worded summary. We still contend that the actual analysis in ovca04 shows that survival is strongly confounded with batch, and that there is clear evidence that response is confounded with at least some batches.

  5. Dressman et al. reject the remaining points in our letter by insisting that they are invalid since we did not use exactly their methods. They repeatedly insist that "to reproduce" means "to repeat, following the original methods". We would very much like to be able to evaluate their claims by running exactly the same computer code on exactly the same data. The analysis that we performed and reported meets this high standard, since we make the actual computer code available. However, Dressman et al. only described their method in words without supplying the complete code. As noted above, they also acknowledged in their response that the data they posted contained several errors. Without the same data and without the same code, it is impossible for anyone to reproduce their analysis in this strong sense.

    Fortunately, the true definition of "reproduce" is not so restrictive. The (online) Merriam-Webster dictionary defines "to reproduce" as "to produce again; to cause to exist again or anew; to imitate closely; to present again; to make a representation of; to revive mentally." Our reported analysis is a good faith effort "to imitate closely" their methods, based on our interpretation of the verbal descriptions they supplied. We have shown people exactly what we did. Dressman et al. have not shown that any of the steps we reported are wrong. They disagree with our conclusions because the methods are different. We will continue to check the data and, as more information becomes available, test how well we can reproduce their methods.
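
As a rough illustration of two of the checks discussed in points 2 and 4 above, the following Python sketch (a) flags patients whose survival time is unchanged between two studies but whose status differs, and (b) cross-tabulates run batch against response and computes a Pearson chi-square statistic. It is not the analysis code from the reports cited above. The function names, patient identifiers, batch labels, and toy data are all hypothetical and chosen only to make the idea concrete.

```python
from collections import Counter


def status_flips(study_a, study_b):
    """Return IDs of patients whose survival time is identical in both
    studies but whose recorded status (alive/dead) differs.

    Each study is a dict: patient ID -> (survival_time, status).
    """
    flips = []
    for pid, (time_a, status_a) in study_a.items():
        if pid in study_b:
            time_b, status_b = study_b[pid]
            if time_a == time_b and status_a != status_b:
                flips.append(pid)
    return sorted(flips)


def crosstab(batches, responses):
    """Count samples for each (batch, response) pair."""
    return Counter(zip(batches, responses))


def chi_square(batches, responses):
    """Pearson chi-square statistic for independence of batch and response."""
    n = len(batches)
    observed = crosstab(batches, responses)
    batch_totals = Counter(batches)
    response_totals = Counter(responses)
    stat = 0.0
    for b, n_b in batch_totals.items():
        for r, n_r in response_totals.items():
            expected = n_b * n_r / n  # count expected if batch and response were unrelated
            stat += (observed[(b, r)] - expected) ** 2 / expected
    return stat


# Hypothetical toy data (NOT taken from any real study).
study_a = {"P1": (12, "dead"), "P2": (30, "alive"), "P3": (45, "alive")}
study_b = {"P1": (12, "alive"), "P2": (30, "alive"), "P3": (45, "dead")}

batches = ["early"] * 10 + ["late"] * 10
responses = ["NR"] * 8 + ["CR"] * 2 + ["NR"] * 2 + ["CR"] * 8

if __name__ == "__main__":
    print("Status flips:", status_flips(study_a, study_b))
    for key, count in sorted(crosstab(batches, responses).items()):
        print(key, count)
    print("Chi-square statistic:", chi_square(batches, responses))
```

A large statistic relative to a chi-square distribution with (rows - 1) x (columns - 1) degrees of freedom suggests that response is associated with batch. With counts as small as those in a typical batch, an exact test would usually be preferable to this approximation.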

Correspondence Published

Posted 1 March 2008 (KRC)

Our letter to the Journal of Clinical Oncology was published on 1 March 2008. The full reference is: Baggerly, KA, Neeley, ES. Run Batch Effects Potentially Compromise the Usefulness of Genomic Signatures for Ovarian Cancer. J Clin Oncol, 2008; 26(7):1186-1187.