--- title: "Cancer Testis Datasets" author: - name: Julie Devis, Laurent Gatto, Axelle Loriot package: CTdata output: BiocStyle::html_document: toc_float: true vignette: > %\VignetteIndexEntry{Cancer Testis Datasets} %\VignetteEngine{knitr::rmarkdown} %\VignetteKeywords{ExperimentHub, Cancer, Testis, Gene expression, methylation, Homo sapiens} %\VignetteEncoding{UTF-8} --- ```{r style, echo = FALSE, results = 'asis'} BiocStyle::markdown() ``` # Introduction `CTdata` is the companion Package for `CTexploreR` and provides omics data to select and characterise cancer testis genes. Data come from public databases and include expression and methylation values of genes in normal, fetal and tumor samples as well as in tumor cell lines, and expression in cells treated with a demethylating agent is also available. The data are served through the `ExperimentHub` infrastructure, which allows download them only once and cache them for further use. Currently available data are summarised in the table below and details in the next section. ```{r data} library("CTdata") DT::datatable(CTdata()) ``` # Installation To install the package: ```{r install1, eval = FALSE} if (!require("BiocManager")) install.packages("CTdata") BiocManager::install("CTdata") ``` To install the package from GitHub: ```{r install2, eval = FALSE} if (!require("BiocManager")) install.packages("BiocManager") BiocManager::install("UCLouvain-CBIO/CTdata") ``` # Available data ```{r, echo=FALSE, fig.align='center', out.width = '100%'} knitr::include_graphics("Datasets.png") ``` For details about each data, see their respective manual pages. ## Normal adult tissues ### GTEX data A `SummarizedExperiment` object with gene expression data in normal tissues from GTEx database: ```{r, message = FALSE} library("SummarizedExperiment") ``` ```{r} GTEX_data() ``` ### Normal tissue gene expression A `SummarizedExperiment` object with gene expression values in normal tissues with or without allowing multimapping: ```{r} normal_tissues_multimapping_data() ``` ### Methylation in normal adult tissues A `RangedSummarizedExperiment` containing methylation of CpGs located within gene promoters in normal tissues: ```{r} methylation_in_tissues() ``` (`CT_methylation_in_tissues` before v 1.5) A `SummarizedExperiment` with all genes' promoters mean methylation in normal tissues: ```{r} mean_methylation_in_tissues() ``` (`CT_mean_methylation_in_tissues` before v 1.5) ### Testis scRNAseq data A `SingleCellExperiment` object containing gene expression from testis single cell RNAseq experiment (*The adult human testis transcriptional cell atlas* (Guo et al. 2018)): ```{r, message = FALSE} library("SingleCellExperiment") ``` ```{r} testis_sce() ``` ### Oocytes scRNAseq data A `SingleCellExperiment` object containing gene expression from oocytes single cell RNAseq experiment (Yan et al. 2021): ```{r} oocytes_sce() ``` ### Human Protein Atlas scRNAseq data A `SingleCellExperiment` object containing gene expression in different human cell types based on scRNAseq data obtained from the Human Protein Atlas (https://www.proteinatlas.org) ```{r} scRNAseq_HPA() ``` ### Human Protein Atlas cell type specificity data A `tibble` object containing cell type specificities based on scRNAseq data analysis from the Human Protein Atlas (https://www.proteinatlas.org) ```{r} HPA_cell_type_specificities() ``` ## Fetal cells ### Fetal germ cell scRNAseq data A `SingleCellExperiment` object containing gene expression from fetal germ cells single cell RNAseq experiment (*Single-cell roadmap of human gonadal development *(Garcia-Alonso et al. 2022)): ```{r} FGC_sce() ``` ### Methylation in fetal germ cells (scWGBS) A `RangedSummarizedExperiment` containing methylation of CpGs (hg19 based) located within gene promoters in fetal germ cells (*Dissecting the epigenomic dynamics of human fetal germ cell development at single-cell resolution*(Li et al. 2021)): ```{r} methylation_in_FGC() ``` A `SummarizedExperiment` with all genes' promoters mean methylation in fetal germ cells: ```{r} mean_methylation_in_FGC() ``` ### Embryonic stem cells RNA-Seq data A `SummarizedExperiment` object with gene expression data in embryonic stem cells from ENCODE database: ```{r} hESC_data() ``` ### Methylation in embryonic stem cells A `RangedSummarizedExperiment` containing methylation of CpGs located within gene promoters in embryonic stem cells from ENCODE ```{r} methylation_in_hESC() ``` A `SummarizedExperiment` with all genes' promoters mean methylation in embryonic stem cells: ```{r} mean_methylation_in_hESC() ``` ### Early embryo scRNA-seq data A `SingleCellExperiment` object containing gene expression from early embryo single cell RNAseq experiment (Petropulous et al, 2014): ```{r} embryo_sce_Petropoulos() ``` A `SingleCellExperiment` object containing gene expression from early embryo single cell RNAseq experiment (Zhu et al, 2018): ```{r} embryo_sce_Zhu() ``` ### Methylation in early embryo A `RangedSummarizedExperiment` containing methylation of CpGs (hg19 based) located within gene promoters in early embryo (*Single Cell DNA Methylome Sequencing of Human Preimplantation Embryos* (Zhu et al. 2018)): ```{r} methylation_in_embryo() ``` A `RangedSummarizedExperiment ` with all genes' promoters mean methylation in early embryo: ```{r} mean_methylation_in_embryo() ``` ## Demethylated gene expression A `SummarizedExperiment` object containing genes differential expression analysis (with RNAseq expression values) in cell lines treated or not with a demethylating agent (5-Aza-2'-Deoxycytidine). ```{r} DAC_treated_cells() ``` As above, with multimapping: ```{r} DAC_treated_cells_multimapping() ``` ## Tumor cells ### CCLE data A `SummarizedExperiment` object with gene expression data in cancer cell lines from CCLE: ```{r} CCLE_data() ``` Also, a `matrix` with gene expression correlations in CCLE cancer cell lines: ```{r} dim(CCLE_correlation_matrix()) CCLE_correlation_matrix()[1:10, 1:5] ``` ### TCGA data A `SummarizedExperiment` with gene expression data in TCGA samples (tumor and peritumoral samples : SKCM, LUAD, LUSC, COAD, ESCA, BRCA and HNSC): ```{r} TCGA_TPM() ``` Also, a `SummarizedExperiment` with gene promoters methylation data in TCGA samples (tumor and peritumoral samples : SKCM, LUAD, LUSC, COAD, ESCA, BRCA and HNSC): ```{r} TCGA_methylation() ``` (`TCGA_CT_methylation` before v 1.5) ## CT genes determination ### All genes ```{r, echo = FALSE} all_genes <- all_genes() n <- nrow(all_genes) ``` The analysis for all genes can be found in `all_genes`, a tibble like `CT_genes` containing all `r n` genes characterisation. ```{r} all_genes() ``` ### CT genes ```{r, echo = FALSE} ctgenes <- CT_genes() n <- nrow(ctgenes) ``` With the datasets above, we generated a list of `r n` CT and CTP genes (see figure below for details). We used multimapping because many CT genes belong to gene families from which members have identical or nearly identical sequences. This is likely the reason why these genes are not detected in GTEx database, as GTEx processing pipeline specifies that overlapping intervals between genes are excluded from all genes for counting. Some CT genes can thus only be detected in RNAseq data in which multimapping reads are not discarded. ```{r, echo=FALSE, fig.align='center', out.width = '100%'} knitr::include_graphics("Figure_CT.png") ``` A `tibble` with Cancer-Testis (CT) genes and their characteristics: ```{r} CT_genes() ``` # Session information {-} ```{r sessioninfo, echo=FALSE} sessionInfo() ```