Department of Bioinformatics and Computational Biology

Description: Somatic point mutation caller for tumor-normal paired samples in next-generation sequencing data.

Development Information
GitHub: danielfan/MuSE
Current version: 1.0rc
Platforms: Platform independent
License: GNU GPL Version 2
Citation: Fan, Y., Xi, L., Hughes, D. S. T., Zhang, J., Zhang, J., Futreal, P. A., Wheeler, D. A., and Wang, W. Accounting for inter-tumor heterogeneity using a sample-specific error model improves sensitivity and specificity in mutation calling for sequencing data. Genome Biology. 2016.

Help and Support
Contact: Wenyi Wang
Discussion: On GitHub


The detection of somatic point mutations is a key component of cancer genomic research, which has been rapidly developing since next-generation sequencing (NGS) technology revealed its potential for describing genetic alterations in cancer. We present MuSE, a novel approach to mutation calling based on the F81 Markov substitution model for molecular evolution [1], which models the evolution of the reference allele to the allelic composition of the matched tumor and normal tissue at each genomic locus. To improve overall accuracy, we further adopt a sample-specific error model to identify cutoffs, reflecting the variation in tumor heterogeneity among samples.


Source File:

Linux Binary File: MuSEv1.0rc_b, MuSEv1.0rc_c


After downloading the source file, on Unix-like operating systems, run the following commands in sequence on the command line to generate the executable:

cd MuSEv1.0rc

For Windows, please install Cygwin first, which provides functionality similar to a Linux distribution on Windows. The remaining steps are the same as above.

Input Data

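MuSE comprises two steps, which require the following inputs:

(1) the reference genome file;
(2) the BAM files for the paired tumor and matched normal samples;
(3) a dbSNP VCF file.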

The first step, ‘MuSE call’, takes as input (1) and (2). The BAM files must be generated by aligning all sequence reads against the reference genome using the Burrows-Wheeler alignment tool (BWA), with either the backtrack or the maximal exact matches (MEM) algorithm [2]. In addition, the BAM files need to be processed following the Genome Analysis Toolkit (GATK) Best Practices [3,4,5], which include marking duplicates, realigning the paired tumor-normal BAMs jointly, and recalibrating base quality scores.

To speed up ‘MuSE call’, we recommend splitting the WGS data into small blocks (<50Mb) using either the ‘-r’ or the ‘-l’ option, and then concatenating all the output files with the Linux command cat.
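
The block splitting described above can be sketched with a small shell helper. This is a minimal sketch and not part of MuSE: the function name make_regions, the 40 Mb default block size, and the use of a samtools-style .fai index (sequence name in column 1, length in column 2) are assumptions, as is the chr:start-end region syntax passed to ‘-r’.

```shell
# Hypothetical helper (not shipped with MuSE): print chr:start-end blocks
# covering every sequence listed in a .fai index.
#   $1  path to the .fai index (column 1 = name, column 2 = length)
#   $2  maximum block size in bases (default 40,000,000, i.e. under 50 Mb)
make_regions() {
    awk -v size="${2:-40000000}" '{
        for (start = 1; start <= $2; start += size) {
            end = start + size - 1
            if (end > $2) end = $2
            printf "%s:%d-%d\n", $1, start, end
        }
    }' "$1"
}

# Usage sketch (assumes the region syntax above is accepted by -r):
#   make_regions Reference.Genome.fai > regions.txt
#   while read -r region; do
#       ./MuSE call -O "block.$region" -f Reference.Genome -r "$region" \
#           Tumor.bam Matched.Normal.bam
#   done < regions.txt
#   cat block.*.MuSE.txt > Output.Prefix.MuSE.txt
```

A shell glob sorts the block names lexically, so list the files explicitly in the cat command if the concatenated output must follow genomic order.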

The second step, ‘MuSE sump’, takes as input the output file from ‘MuSE call’ and (3). We provide two options for building the sample-specific error model. One is applicable to WES data (option ‘-E’), and the other to WGS data (option ‘-G’).
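
When ‘MuSE sump’ is driven from a pipeline script, the choice between the two error-model options can be made explicit. The following is a minimal sketch under stated assumptions: sump_flag is a hypothetical helper name, and the WES/WGS labels are simply whatever the calling script uses to describe the data type. Only the ‘-E’ and ‘-G’ options themselves come from MuSE.

```shell
# Hypothetical helper: map a data-type label to the 'MuSE sump'
# error-model option described above (-E for WES, -G for WGS).
sump_flag() {
    case "$1" in
        WES|wes) printf '%s\n' "-E" ;;   # printf avoids echo treating -E as an option
        WGS|wgs) printf '%s\n' "-G" ;;
        *) printf 'unknown data type: %s (expected WES or WGS)\n' "$1" >&2
           return 1 ;;
    esac
}

# Usage sketch (file names are placeholders):
#   ./MuSE sump -I Output.Prefix.MuSE.txt "$(sump_flag WGS)" \
#       -O Output.Prefix.vcf -D dbsnp.vcf.gz
```

Failing on an unknown label, rather than silently defaulting to one model, keeps a mislabeled sample from being processed with the wrong error model.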

Example Commands

The following commands briefly illustrate how to use MuSE. For the preparation of BAM files, please refer to the first part, PRE-PROCESSING, of the Genome Analysis Toolkit (GATK) Best Practices.

./MuSE call -O Output.Prefix -f Reference.Genome Tumor.bam Matched.Normal.bam
./MuSE sump -I Output.Prefix.MuSE.txt -G -O Output.Prefix.vcf -D dbsnp.vcf.gz


The final output of MuSE is a VCF file that lists the identified somatic variants.
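
For a quick overview of the resulting VCF, records can be tallied by the FILTER column. The sketch below is an illustration only: filter_counts is a hypothetical helper, and it relies solely on the standard VCF layout (header lines start with '#', FILTER is the seventh tab-separated column), not on any MuSE-specific field.

```shell
# Hypothetical helper: count VCF records per FILTER value.
#   $1  path to an uncompressed VCF file
filter_counts() {
    # Skip header lines; FILTER is the 7th tab-separated column in VCF.
    awk -F'\t' '!/^#/ { n[$7]++ }
        END { for (f in n) printf "%s\t%d\n", f, n[f] }' "$1" | LC_ALL=C sort
}

# Usage sketch (file name is a placeholder):
#   filter_counts Output.Prefix.vcf
```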


  1. Felsenstein, J. Evolutionary trees from DNA sequences: a maximum likelihood approach. Journal of Molecular Evolution 17, 368–376 (1981).
  2. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics (Oxford, England) 25, 1754–1760 (2009).
  3. McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Research 20, 1297–1303 (2010).
  4. DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nature Genetics 43, 491–498 (2011).
  5. Van der Auwera, G. A. et al. From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline. Current Protocols in Bioinformatics 11, 11.10.1–11.10.33 (2013).