
Frequently Asked Questions

What is SpliceSeq?

SpliceSeq is an analytical tool for RNASeq data with an interactive, visual results viewer. It can be used to explore the transcriptome of a single sample or group of samples and to perform comparative analysis that identifies significant changes in splicing patterns (splice events). SpliceSeq analysis is based on composite splice graphs of each gene constructed from Ensembl transcripts. RNASeq reads are aligned to unique locations in a splice graph, and normalized expression levels are calculated for every exon and splice junction. SpliceSeq also detects and quantifies novel splices. Please see the SpliceSeq methods page for more details.


What is TCGA SpliceSeq?

TCGA SpliceSeq is an alternative splicing database constructed by applying SpliceSeq's analysis methods to RNASeq samples from The Cancer Genome Atlas (TCGA) project. It provides search and visualization functions for examining transcript patterns in groups of tumor samples and for identifying splicing pattern differences between tumor types and between tumor and adjacent normal samples.

Analysis is based on splice graphs (one per gene) assembled from all the Ensembl transcripts of a gene. In the splice graph display, thin exon sections represent untranslated regions (UTR) and thick exon sections represent coding regions. Exons are drawn to scale, and the connecting arcs represent splice paths. Numerical values below the exons and above the splices are average normalized read counts for the selected gene/tissue.


What types of Splice Events are detected?

A splice event is a change in splicing patterns between groups of samples. Seven different types of splice events are detected:


How are Splice Events quantified / What is PSI?

For each sample and every possible splice event (e.g. an exon skip event), we calculate a percent-splice-in (PSI) value. PSI is the ratio of normalized read counts indicating inclusion of a transcript element over the total normalized reads for that event (both inclusion and exclusion reads). A PSI value of 0.8 for an exon skip event would indicate that the exon is included in approximately 80% of the transcripts in the sample. Changes in average PSI values when comparing groups of samples indicate a shift in splicing patterns between the groups, that is, a splice event. Scatter plots, standard deviation ranges, and t-test p-values are used to confirm the robustness of a splicing event.
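The PSI ratio described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not SpliceSeq's own code: the function names `psi` and `delta_psi` and all read counts are hypothetical.

```python
from statistics import mean


def psi(inclusion_reads: float, exclusion_reads: float) -> float:
    """Percent-splice-in: inclusion reads over all reads for the event."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        raise ValueError("no reads cover this event, PSI is undefined")
    return inclusion_reads / total


def delta_psi(group_a, group_b) -> float:
    """Difference in average PSI between two groups of samples (b minus a)."""
    return mean(group_b) - mean(group_a)


# Hypothetical normalized read counts (inclusion, exclusion) per sample.
tumor = [psi(80, 20), psi(70, 30), psi(90, 10)]
normal = [psi(40, 60), psi(50, 50), psi(45, 55)]

# A large shift in average PSI suggests a candidate splice event.
shift = delta_psi(normal, tumor)
```

A shift like this is only a starting point; as noted above, the spread of the per-sample values and a t-test p-value should be checked before treating the difference as robust.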


How does the single gene splice events page work?

To examine a gene of interest, follow the path on the home page for a single gene search or select the Single Gene link in the header bar menu. The single gene page displays all the potential splice events for the gene symbol you enter. Click a row in the table to see the average PSI across tumor types for that splicing event. Select a specific PSI value in the table to see the splice graph values for a particular tumor type.


Every time there are divergent paths in a splice graph, there is potential for a splice event. In some graphs the number of potential splice events is quite high. Filters are employed to identify the most robust events that show variation across tumor types or between tumor and adjacent normal samples. Filters allow you to show more or fewer events, focus on specific types of splicing events, and/or limit the tumor types displayed.
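To make the idea of divergent paths concrete, here is a minimal sketch that treats a splice graph as a list of (source exon, target exon) splice paths and reports the exons with more than one outgoing path. The graph representation and the function name `divergent_nodes` are assumptions made for illustration; they are not how SpliceSeq stores its graphs.

```python
from collections import defaultdict


def divergent_nodes(edges):
    """Return exons that have more than one outgoing splice path."""
    outgoing = defaultdict(set)
    for source, target in edges:
        outgoing[source].add(target)
    return sorted(node for node, targets in outgoing.items() if len(targets) > 1)


# Hypothetical gene with exons 1 to 4 in which exon 2 can be skipped (1 -> 3).
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]

# Exon 1 has two outgoing paths, so it marks a potential splice event.
candidates = divergent_nodes(edges)
```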


How do I get information about an exon in a splice event to relate it to other references?

SpliceSeq exon numbers are assigned when we build our splice graphs and may not correspond to exon names in external references. Select the Exon tab in the Gene Info panel to get a description of the exons, including the hg19 coordinates of each exon. These coordinates can be used to match exons to external exon names. Select the sequence column to copy and paste exon sequences. If you would like a full reference to the genomic locations of each gene/exon in our splice graphs, you can download it here: TCGA_SpliceSeq_Grap_Structure.zip


Why include adjacent normal PSI values?

Adjacent normal samples in the TCGA data must be used with caution in splice event analysis because they are often not well matched in tissue composition to the tumor progenitor cells. Tissue type differences can obscure tumor/normal splicing analysis. Further, adjacent normal tissue is subject to field effects from the tumor. Despite these concerns, the adjacent normal values are presented to give some feel for the range of PSI values for a particular event in normal tissues. This can reveal interesting patterns where, despite a wide range of PSI values across tissues, the tumor PSI values are generally increased relative to those of their matched normal tissues. For example, the exon 8 skip event in MBNL1 shows a fairly dramatic increase in PSI for all tissues with adjacent normal samples.
Note: the table can be toggled with the Tumor/Normal drop-down to show adjacent normal rather than tumor PSI values. Toggling to normal and clicking on a cell will also allow you to see the splice graph values for normal tissue.


What does the UniProt tab tell me?

The UniProt tab shows how the exons of the splice graph map to the canonical protein product of the gene. This may be useful in predicting the impact of a splicing event. In addition to the protein sequence, the UniProt tab contains many UniProt annotations to assist with identification of exons with functional elements. The '?' tooltip on the UniProt tab displays a color code and list of the UniProt annotations available.


How and why do I use the Top Events page?

Your genes of interest may not be the genes that show large splicing variation. An alternate path for splicing investigation is to look through the list of genes that do show large splicing differences between tumor types or between tumor and adjacent normal samples. The Top Events list allows you to select all tumor types or a subset of tumor types to investigate. It then displays a sorted list of splicing events. Depending on your selection, the list is sorted to show the events with the largest variation across tumor samples or the largest variation between tumor and normal samples. Scrolling down through this list while looking at the PSI graph and gene description panel is a quick way to locate potentially interesting splicing events.
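The two sort orders described above can be approximated as follows. This is a rough sketch, not the ranking the site actually uses: it assumes variation across tumor types is measured as the standard deviation of average PSI, and tumor/normal variation as the absolute difference in average PSI. The event names and PSI values are hypothetical.

```python
from statistics import pstdev


def rank_by_variation(event_psi):
    """Sort events by spread of average PSI across tumor types, largest first."""
    return sorted(
        event_psi,
        key=lambda event: pstdev(list(event_psi[event].values())),
        reverse=True,
    )


def rank_by_tumor_normal(tumor_psi, normal_psi):
    """Sort events by absolute tumor-minus-normal PSI difference, largest first."""
    shared = set(tumor_psi) & set(normal_psi)
    return sorted(
        shared,
        key=lambda event: abs(tumor_psi[event] - normal_psi[event]),
        reverse=True,
    )


# Hypothetical average PSI per event and tumor type.
event_psi = {
    "event_1": {"BRCA": 0.50, "LUAD": 0.55, "COAD": 0.50},
    "event_2": {"BRCA": 0.20, "LUAD": 0.80, "COAD": 0.50},
}
across_tumors = rank_by_variation(event_psi)

# Hypothetical average PSI per event in tumor and adjacent normal samples.
tumor_vs_normal = rank_by_tumor_normal(
    {"event_1": 0.60, "event_2": 0.30},
    {"event_1": 0.20, "event_2": 0.25},
)
```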



How and why would I download data on the Gene PSI Data Download page?

Many types of data are available for TCGA samples (methylation, gene expression, mutation, copy number, protein expression, etc.). The PSI values that can be downloaded from this page can be used for integrative analysis in which a researcher wishes to investigate the relation between other factors and resultant splicing changes. For example, one could investigate associations between expression levels of a splicing factor and changes in exon inclusion in a metabolism gene of interest.
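As one possible form of such an integrative analysis, the sketch below computes a Pearson correlation between a splicing factor's expression and the PSI of an exon across the same samples. The choice of Pearson correlation, the function name `pearson`, and all values are assumptions for illustration; the expression data would come from a source outside this site.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / (spread_x * spread_y)


# Hypothetical per-sample values, listed in the same sample order.
factor_expression = [1.0, 2.0, 3.0, 4.0]
exon_psi = [0.2, 0.4, 0.6, 0.8]

r = pearson(factor_expression, exon_psi)
```

A strong correlation here is an association only; it does not by itself show that the splicing factor drives the change in exon inclusion.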

Select a tissue type, gene(s), and sample(s) to download splice event PSI data. Leave gene(s) and/or sample(s) blank to get all genes/samples. You can also filter to specific types of splice events. The returned data is a comma-separated file suitable for bioinformatic analysis or inclusion in a spreadsheet.
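A downloaded file could be summarized with Python's standard csv module, as in the sketch below. The column layout shown (`gene`, `event_type`, then one PSI column per sample) is hypothetical and is used only to make the example runnable; check the header row of the actual download and adjust `id_columns` accordingly.

```python
import csv
import io
from statistics import mean

# Hypothetical file contents; the real download may use different columns.
SAMPLE = """gene,event_type,sample_1,sample_2,sample_3
GENE_A,exon_skip,0.8,0.6,
GENE_B,exon_skip,0.1,0.2,0.3
"""


def mean_psi_per_event(csv_text, id_columns=("gene", "event_type")):
    """Average the PSI values in each row, ignoring blank cells."""
    averages = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = tuple(row[column] for column in id_columns)
        values = [
            float(value)
            for column, value in row.items()
            if column not in id_columns and value not in ("", None)
        ]
        averages[key] = mean(values) if values else None
    return averages


averages = mean_psi_per_event(SAMPLE)
```

Blank cells are skipped rather than treated as zero, on the assumption that a missing PSI value means the event could not be measured in that sample.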