
The TCeA portal includes a catalog of H3K27ac ChIP-seq-defined super-enhancers in 86 ENCODE cell and tissue types (Denes Hnisz et al. 2013, Data S2). We studied the 5 Mb of core super-enhancer regions active in >20 of the 86 cell and tissue types and developed a PCA model describing the association between nucleosome positioning and eRNA expression. The PCA model was then applied to the remaining tissue-specific super-enhancer regions (~372 Mb) for eRNA loci discovery. In total, ~300,000 eRNA loci were found using this methodology.
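
The portal does not spell out how the core regions were computed. As a minimal sketch, assuming "core" means bases covered by a super-enhancer in more than 20 samples, the selection could be done with a coverage sweep over interval endpoints. The function name `core_regions`, the input layout, and the toy coordinates are all hypothetical.

```python
from collections import defaultdict

def core_regions(intervals_by_sample, min_samples=21):
    """Return (chrom, start, end) regions covered by at least min_samples samples.

    intervals_by_sample maps a sample name to a list of half-open
    (chrom, start, end) intervals. Intervals within one sample are
    assumed not to overlap each other.
    """
    # Net change in coverage depth at each position, per chromosome.
    deltas = defaultdict(lambda: defaultdict(int))
    for intervals in intervals_by_sample.values():
        for chrom, start, end in intervals:
            deltas[chrom][start] += 1
            deltas[chrom][end] -= 1

    regions = []
    for chrom in sorted(deltas):
        depth = 0
        region_start = None
        for pos in sorted(deltas[chrom]):
            depth += deltas[chrom][pos]
            if depth >= min_samples and region_start is None:
                region_start = pos            # coverage just reached the threshold
            elif depth < min_samples and region_start is not None:
                regions.append((chrom, region_start, pos))
                region_start = None           # coverage just dropped below it
    return regions

# Toy input with three samples and a threshold of two samples.
toy = {
    "s1": [("chr1", 100, 200)],
    "s2": [("chr1", 150, 250)],
    "s3": [("chr1", 180, 300), ("chr2", 10, 20)],
}
print(core_regions(toy, min_samples=2))  # [('chr1', 150, 250)]
```

With the default `min_samples=21`, the same function would express the ">20 of the 86 cell and tissue types" criterion.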

We also expanded the super-enhancer regions used for eRNA loci discovery by collecting a set of >300 high-quality H3K27ac ChIP-seq profiles from the SEdb and Cistrome databases. We applied the ROSE program (Whyte et al. 2013) to annotate ~320 Mb of non-redundant super-enhancer regions outside those found in the 86 samples. TCeA identified ~200,000 eRNA loci in these regions. In addition, TCeA includes ~63,000 eRNAs from typical enhancers annotated by the FANTOM project. This dataset is an update of the one in Chen et al. 2018.
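
The step of keeping only non-redundant regions outside those already annotated can be illustrated with plain interval arithmetic. This is a sketch only: it is not part of ROSE, it works on one chromosome at a time, and the helper names `merge` and `subtract` and the coordinates are hypothetical.

```python
def merge(intervals):
    """Union of half-open (start, end) intervals on one chromosome."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlapping or touching the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]

def subtract(regions, known):
    """Parts of `regions` that do not overlap `known` (same chromosome)."""
    known = merge(known)
    result = []
    for start, end in merge(regions):
        cursor = start
        for k_start, k_end in known:
            if k_end <= cursor:
                continue                      # known interval lies before the cursor
            if k_start >= end:
                break                         # known interval lies after this region
            if k_start > cursor:
                result.append((cursor, k_start))
            cursor = max(cursor, k_end)
            if cursor >= end:
                break
        if cursor < end:
            result.append((cursor, end))
    return result

# Hypothetical super-enhancer calls pooled from new samples, and regions
# already present in the earlier catalog.
new_calls = [(100, 300), (250, 400), (500, 600)]
already_annotated = [(150, 200), (380, 550)]
novel = subtract(new_calls, already_annotated)
print(novel)                                  # [(100, 150), (200, 380), (550, 600)]
print(sum(end - start for start, end in novel))  # 280
```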

All the sample, enhancer, and eRNA annotations can be downloaded below:

Download the 377 Mb super-enhancer regions in 86 human cell and tissue types (from Denes Hnisz et al. 2013, Data S2)
Download the 5 Mb core super-enhancer regions active in >20 cell and tissue types
Download the ~300,000 eRNA loci identified in the ENCODE dataset
Download H3K27ac ChIP-seq sample list from SEdb/cistrome database
Download the 320 Mb putative super-enhancer regions discovered in the SEdb/Cistrome databases
Download the ~200,000 eRNA loci identified in the SEdb/cistrome dataset
Download the ~63,000 eRNA loci from typical enhancers







TCeA includes ~9,300 TCGA tumor samples and ~700 matched normal samples. The TCGA patient IDs are used as tickers in TCeA. Check here for details of TCGA sample barcodes. For example, the TCeA tickers "TCGA-BL-A13J_tumor" and "TCGA-BL-A13J_normal" represent the tumor and normal samples from the BLCA patient with TCGA patient ID "TCGA-BL-A13J". For patients with multiple runs, we used only one tumor (or normal) sample. We chose primary tumor samples over metastatic ones, and when multiple portions of the same sample were available in the TCGA RNA-seq dataset, we chose the one with the smallest portion ID. All samples used in TCeA are on the TCGA consortium whitelist. The details can be downloaded below:


| TCGA cancer type | Disease name | # of tumors | # of normal samples | Sample details |
|---|---|---|---|---|
| All 32 TCGA cancers | All 32 TCGA cancers | 9,284 | 720 | Download |
| ACC | Adrenocortical carcinoma | 79 | 0 | Download |
| BLCA | Bladder Urothelial Carcinoma | 408 | 19 | Download |
| BRCA | Breast invasive carcinoma | 1,095 | 113 | Download |
| CESC | Cervical squamous cell carcinoma and endocervical adenocarcinoma | 304 | 3 | Download |
| CHOL | Cholangiocarcinoma | 36 | 9 | Download |
| COAD/READ | Colon adenocarcinoma / Rectum adenocarcinoma | 374 | 51 | Download |
| DLBC | Lymphoid Neoplasm Diffuse Large B-cell Lymphoma | 48 | 0 | Download |
| ESCA | Esophageal carcinoma | 183 | 11 | Download |
| GBM | Glioblastoma multiforme | 154 | 5 | Download |
| HNSC | Head and Neck squamous cell carcinoma | 520 | 44 | Download |
| KICH | Kidney Chromophobe | 66 | 25 | Download |
| KIRC | Kidney renal clear cell carcinoma | 448 | 67 | Download |
| KIRP | Kidney renal papillary cell carcinoma | 290 | 32 | Download |
| LAML | Acute Myeloid Leukemia | 28 | 0 | Download |
| LGG | Brain Lower Grade Glioma | 516 | 0 | Download |
| LIHC | Liver hepatocellular carcinoma | 371 | 50 | Download |
| LUAD | Lung adenocarcinoma | 516 | 59 | Download |
| LUSC | Lung squamous cell carcinoma | 501 | 51 | Download |
| MESO | Mesothelioma | 87 | 0 | Download |
| OV | Ovarian serous cystadenocarcinoma | 293 | 0 | Download |
| PAAD | Pancreatic adenocarcinoma | 177 | 4 | Download |
| PCPG | Pheochromocytoma and Paraganglioma | 178 | 3 | Download |
| PRAD | Prostate adenocarcinoma | 492 | 52 | Download |
| SARC | Sarcoma | 259 | 2 | Download |
| SKCM | Skin Cutaneous Melanoma | 361 | 1 | Download |
| STAD | Stomach adenocarcinoma | 415 | 34 | Download |
| TGCT | Testicular Germ Cell Tumors | 149 | 0 | Download |
| THCA | Thyroid carcinoma | 505 | 59 | Download |
| THYM | Thymoma | 119 | 2 | Download |
| UCEC | Uterine Corpus Endometrial Carcinoma | 175 | 24 | Download |
| UCS | Uterine Carcinosarcoma | 57 | 0 | Download |
| UVM | Uveal Melanoma | 80 | 0 | Download |
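
The sample-selection rule for TCGA (primary over metastatic, then smallest portion ID) can be sketched as follows. This is not TCeA's actual code. It assumes full TCGA aliquot barcodes as input and the standard TCGA sample type codes (01 primary solid tumor, 03 primary blood-derived cancer, 06 metastatic, 10 to 19 normal). The function name `pick_samples`, the ranking of other tumor codes, and the example barcode suffixes are hypothetical.

```python
import re

# Aliquot barcode layout: TCGA-TSS-Participant-SampleVial-PortionAnalyte-Plate-Center
BARCODE = re.compile(
    r"^(TCGA-\w{2}-\w{4})-(\d{2})([A-Z])-(\d{2})([A-Z])-\w{4}-\d{2}$"
)

# Assumption: primary tumor codes rank ahead of metastatic; other tumor codes last.
TUMOR_RANK = {"01": 0, "03": 0, "06": 1}

def pick_samples(barcodes):
    """Return {ticker: barcode} with at most one tumor and one normal per patient."""
    best = {}
    for bc in barcodes:
        m = BARCODE.match(bc)
        if m is None:
            continue                          # not a full aliquot barcode
        patient, sample_type, _vial, portion, _analyte = m.groups()
        code = int(sample_type)
        if code < 10:
            kind = "tumor"
        elif code < 20:
            kind = "normal"
        else:
            continue                          # control samples are ignored
        ticker = f"{patient}_{kind}"
        rank = TUMOR_RANK.get(sample_type, 2) if kind == "tumor" else 0
        # Lower key wins: sample-type rank, then portion ID, then barcode as tie-break.
        key = (rank, int(portion), bc)
        if ticker not in best or key < best[ticker][0]:
            best[ticker] = (key, bc)
    return {ticker: bc for ticker, (_key, bc) in best.items()}

# Hypothetical barcodes for the patient used as the example above.
example = [
    "TCGA-BL-A13J-06A-11R-A10U-07",   # metastatic
    "TCGA-BL-A13J-01A-21R-A10U-07",   # primary, portion 21
    "TCGA-BL-A13J-01A-11R-A10U-07",   # primary, portion 11
    "TCGA-BL-A13J-11A-11R-A10U-07",   # solid tissue normal
]
chosen = pick_samples(example)
print(chosen["TCGA-BL-A13J_tumor"])   # TCGA-BL-A13J-01A-11R-A10U-07
print(chosen["TCGA-BL-A13J_normal"])  # TCGA-BL-A13J-11A-11R-A10U-07
```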








A total of 934 paired FASTQ files from the Cancer Cell Line Encyclopedia (CCLE) project were downloaded from the legacy UCSC Cancer Genomics Hub (CGHub); the data can now be found at EBI. The CGHub IDs (e.g. G20490.HCT_116.2 for the HCT116 cell line) are used as tickers in the TCeA data portal. The cancer type of each cancer cell line (e.g. COAD for the HCT116 cell line) is matched to the types used in the TCGA or ICGC projects. Nine of the 934 FASTQ files were still truncated or showed low mappability after multiple download attempts and were therefore dropped from the TCeA dataset. Details of the 925 cancer cell lines included in TCeA can be downloaded below:

Download CCLE cell line annotation







TCeA combines the GTEx "submitted_subject_id" and "histological_type" fields to form tickers. For example, the ticker "GTEX-T6MO__Ovary" represents the ovarian sample from the individual with GTEx ID "GTEX-T6MO". Notably, samples with multiple records (or runs) are pooled in the TCeA portal. Details of the GTEx samples included in TCeA can be downloaded below:


| GTEx tissue type | # of primary tissues | Sample details |
|---|---|---|
| All GTEx tissues | 6,453 | Download |
| Adipose Tissue | 435 | Download |
| Adrenal Gland | 159 | Download |
| Bladder | 10 | Download |
| Blood | 432 | Download |
| Blood Vessel | 438 | Download |
| Brain | 201 | Download |
| Breast | 217 | Download |
| Cervix Uteri | 8 | Download |
| Colon | 277 | Download |
| Esophagus | 395 | Download |
| Fallopian Tube | 7 | Download |
| Heart | 319 | Download |
| Kidney | 36 | Download |
| Liver | 135 | Download |
| Lung | 335 | Download |
| Muscle | 451 | Download |
| Nerve | 331 | Download |
| Ovary | 108 | Download |
| Pancreas | 191 | Download |
| Pituitary | 124 | Download |
| Prostate | 118 | Download |
| Salivary Gland | 69 | Download |
| Skin | 507 | Download |
| Small Intestine | 104 | Download |
| Spleen | 118 | Download |
| Stomach | 200 | Download |
| Testis | 198 | Download |
| Thyroid | 344 | Download |
| Uterus | 89 | Download |
| Vagina | 97 | Download |
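
The GTEx ticker scheme and the pooling of multiple runs can be sketched as below. The double-underscore separator is taken from the "GTEX-T6MO__Ovary" example. How TCeA writes multi-word tissue names inside a ticker is not stated, so this sketch leaves them unchanged. The function names and run IDs are hypothetical.

```python
from collections import defaultdict

def gtex_ticker(submitted_subject_id, histological_type):
    """Join subject ID and tissue type with a double underscore."""
    return f"{submitted_subject_id}__{histological_type}"

def pool_runs(records):
    """Group run IDs that share a ticker.

    records is an iterable of (run_id, submitted_subject_id, histological_type).
    """
    pooled = defaultdict(list)
    for run_id, subject, tissue in records:
        pooled[gtex_ticker(subject, tissue)].append(run_id)
    return dict(pooled)

# Hypothetical run records: two runs for the same ovary sample, one for lung.
records = [
    ("run1", "GTEX-T6MO", "Ovary"),
    ("run2", "GTEX-T6MO", "Ovary"),
    ("run3", "GTEX-T6MO", "Lung"),
]
print(pool_runs(records))
# {'GTEX-T6MO__Ovary': ['run1', 'run2'], 'GTEX-T6MO__Lung': ['run3']}
```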