GuidePro: An ensemble predictor for prioritizing sgRNAs in CRISPR/Cas9 protein knockouts


CRISPR/Cas9 has emerged as one of the most powerful tools for gene perturbation and is widely used in protein functional analysis. Successful knockout of a protein-coding gene relies on selecting highly efficient sgRNAs. Here, we propose GuidePro, a two-layer ensemble predictor that integrates multiple predictive methods and feature sets to predict sgRNA efficiency for CRISPR/Cas9 protein knockouts.

As shown in the figure at right, GuidePro integrates three sub-predictors, each trained on a different type of feature that contributes to protein knockout: (i) the first predictor (SA) predicts sgRNA activity by combining the outputs of other sgRNA sequence-based methods; (ii) the second predictor (FP) predicts frameshift probability from the indel-type predictions of three machine learning models; (iii) the third predictor (AS) predicts amino acid sensitivity to knockout from annotated protein features. Tested on independent datasets, GuidePro demonstrated consistently superior performance in predicting phenotypes caused by protein loss of function, suggesting that it is robust across a broad spectrum of experimental settings.
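The two-layer structure can be illustrated with a minimal Python sketch. How GuidePro actually combines the SA, FP, and AS outputs in its second layer is not stated here, so the linear combination, the `SubScores` container, the `combine_scores` function, and the weight values below are all hypothetical placeholders, not the published model.

```python
from dataclasses import dataclass


@dataclass
class SubScores:
    """First-layer outputs for one sgRNA (field names are hypothetical)."""
    sa: float   # predicted sgRNA activity
    fp: float   # predicted frameshift probability
    as_: float  # predicted amino acid sensitivity


def combine_scores(scores, weights=(1.0, 1.0, 1.0), bias=0.0):
    """Second layer: combine the three sub-predictor outputs.

    ASSUMPTION: a plain weighted sum stands in for the real second-layer
    model, and the default weights are placeholders, not fitted values.
    """
    w_sa, w_fp, w_as = weights
    return bias + w_sa * scores.sa + w_fp * scores.fp + w_as * scores.as_


# Two made-up sgRNAs; the one with the higher sub-scores gets the higher
# combined score.
strong = SubScores(sa=0.9, fp=0.8, as_=0.7)
weak = SubScores(sa=0.2, fp=0.3, as_=0.1)
strong_score = combine_scores(strong)
weak_score = combine_scores(weak)
```

In a real stacked ensemble the second-layer weights would be learned from training data rather than set by hand.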

Select highly efficient sgRNAs for one or more protein-coding genes here.

The genome-wide top 10 prioritized sgRNAs per gene, and a command-line script to extract sgRNAs for a user-defined gene list, can be obtained here.

To learn more about our research, please visit the Xu lab.


Copyright (C) 2020 @

Select high-efficiency sgRNAs for CRISPR-Cas9-mediated protein knockouts



sgRNA selection guidelines:

1. Hover over a column header to see a detailed description of that column.

2. Select sgRNAs that align uniquely to the genome to reduce potential off-target effects.

3. Select sgRNAs with higher GuidePro scores to obtain higher protein knockout efficiency.

4. When many sgRNAs are needed, GuidePro scores greater than 0 are recommended.

5. Click a genomic locus to explore other genomic features in the UCSC Genome Browser.
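Guidelines 2 to 4 can also be applied programmatically to a downloaded table. The sketch below is only illustrative: the column names (`sgrna_id`, `unique_alignment`, `guidepro_score`), the `select_sgrnas` function, and the example rows are hypothetical and do not reflect the actual download format or real scores.

```python
def select_sgrnas(rows, min_score=0.0, top_n=None):
    """Keep uniquely aligned sgRNAs scoring above min_score, best first.

    ASSUMPTION: each row is a dict with hypothetical keys
    'sgrna_id', 'unique_alignment' (bool), and 'guidepro_score' (float).
    """
    kept = [
        row for row in rows
        if row["unique_alignment"] and row["guidepro_score"] > min_score
    ]
    # Higher GuidePro score first (guideline 3).
    kept.sort(key=lambda row: row["guidepro_score"], reverse=True)
    return kept if top_n is None else kept[:top_n]


# Made-up rows for illustration only.
example_rows = [
    {"sgrna_id": "g1", "unique_alignment": True, "guidepro_score": 0.4},
    {"sgrna_id": "g2", "unique_alignment": False, "guidepro_score": 1.2},
    {"sgrna_id": "g3", "unique_alignment": True, "guidepro_score": 0.9},
    {"sgrna_id": "g4", "unique_alignment": True, "guidepro_score": -0.3},
]

selected = select_sgrnas(example_rows)
print([row["sgrna_id"] for row in selected])  # prints ['g3', 'g1']
```

Here `g2` is dropped because it does not align uniquely, and `g4` is dropped because its score is not greater than 0. Passing `top_n` limits the result to the best few guides per call.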

Feedback and reasonable requests are welcome. Please contact us:


Wei He (First author)
Postdoctoral Fellow
The University of Texas MD Anderson Cancer Center
Department of Epigenetics and Molecular Carcinogenesis
Science Park
1808 Park Road 1C
Smithville, Texas 78957
512-237-6510
whe3@mdanderson.org


Han Xu (Supervisor)
Principal Investigator
The University of Texas MD Anderson Cancer Center
Department of Epigenetics and Molecular Carcinogenesis
Science Park
1808 Park Road 1C
Smithville, Texas 78957
512-237-9474
hxu4@mdanderson.org