[R] Help with DNA Methylation Analysis

Spencer Brackett spbrackett20 at saintjosephhs.com
Mon Aug 27 02:34:13 CEST 2018


Caitlin,

  Forgive me, but I’m not quite sure what you are asking.
The data is originally from TCGA, and I downloaded it in another
R script. I opened a new script to run the functions I posted to this
forum because I could not enter any other commands in the console:
the translated data filled the entire console. Perhaps that overloaded
it? Regardless, I was unable to enter any further commands.
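
For reference, a minimal sketch of inspecting a large object without
printing all of it, assuming the console was flooded because a full table
was echoed. The matrix `big` is a hypothetical stand-in, not the actual
TCGA data:

```r
# Hypothetical stand-in for a large downloaded table (not real TCGA data).
big <- matrix(rnorm(1e6), nrow = 10000, ncol = 100)

# Summaries that print only a few lines, whatever the object's size.
dim(big)              # number of rows and columns
str(big)              # compact one-line structure summary
head(big[, 1:5], 3)   # first 3 rows of the first 5 columns

# Cap how many entries R prints if an object is echoed by accident.
options(max.print = 200)
```

Assigning a result to a name (rather than typing the bare expression)
also keeps it from being printed at all.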

-Spencer Brackett


On Sun, Aug 26, 2018 at 8:27 PM Caitlin <bioprogrammer at gmail.com> wrote:

> You're welcome Spencer :)
>
> The 4th line:
>
> path <– "."
>
> refers to the current directory (the dot in other words). Is the data
> stored in the same directory where the code is being run?
>
>
>
> On Sun, Aug 26, 2018 at 5:22 PM Spencer Brackett <
> spbrackett20 at saintjosephhs.com> wrote:
>
>>  Thank you! I will make note of that. Unfortunately, lines 1 and 4 of the
>> first portion of this analysis appear to be where the error begins, and
>> several subsequent lines are also flagged as errors. Perhaps this is
>> an issue of capitalization and/or spacing (something within the text)?
>> The proposed method for methylation data extraction is based on the first
>> third of the following TCGA workflow:
>> https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5302158/#!po=0.0715308
>>
>> Best,
>>
>> Spencer Brackett
>>
>> On Sun, Aug 26, 2018 at 8:07 PM Caitlin <bioprogrammer at gmail.com> wrote:
>>
>>> Hi Spencer.
>>>
>>> Should you capitalize the following library import?
>>>
>>> library(summarizedExperiment)
>>>
>>> In other words, I think that line should be:
>>>
>>> library(SummarizedExperiment)
>>>
>>> Hope this helps.
>>>
>>> ~Caitlin
>>>
>>>
>>>
>>>
>>> On Sun, Aug 26, 2018 at 2:09 PM Spencer Brackett <
>>> spbrackett20 at saintjosephhs.com> wrote:
>>>
>>>> Good evening,
>>>>
>>>>   I am attempting to run the following analysis on TCGA data; however,
>>>> something is being reported as an error in my arguments. Any ideas as
>>>> to what is incorrect in the following? Thanks!
>>>>
>>>> 1 library(TCGAbiolinks)
>>>> 2
>>>> 3 # Download the DNA methylation data: HumanMethylation450 LGG and GBM.
>>>> 4 path <– "."
>>>> 5
>>>> 6 query.met <– TCGAquery(tumor = c("LGG","GBM"),"HumanMethylation450",
>>>>                         level = 3)
>>>> 7 TCGAdownload(query.met, path = path )
>>>> 8 met <– TCGAprepare(query = query.met,dir = path,
>>>> 9                      add.subtype = TRUE, add.clinical = TRUE,
>>>> 10                    summarizedExperiment = TRUE,
>>>> 11                      save = TRUE, filename = "lgg_gbm_met.rda")
>>>> 12
>>>> 13 # Download the expression data: IlluminaHiSeq_RNASeqV2 LGG and GBM.
>>>> 14 query.exp <– TCGAquery(tumor = c("lgg","gbm"),
>>>>                         platform = "IlluminaHiSeq_RNASeqV2",level = 3)
>>>> 15
>>>> 16 TCGAdownload(query.exp,path = path,
>>>>                 type = "rsem.genes.normalized_results")
>>>> 17
>>>> 18 exp <– TCGAprepare(query = query.exp, dir = path,
>>>> 19                    summarizedExperiment = TRUE,
>>>> 20                      add.subtype = TRUE, add.clinical = TRUE,
>>>> 21                    type = "rsem.genes.normalized_results",
>>>> 22                      save = T,filename = "lgg_gbm_exp.rda")
>>>>
>>>> To download data on DNA methylation and gene expression…
>>>>
>>>> 1 library(summarizedExperiment)
>>>> 2 # get expression matrix
>>>> 3 data <– assay(exp)
>>>> 4
>>>> 5 # get sample information
>>>> 6 sample.info <– colData(exp)
>>>> 7
>>>> 8 # get genes information
>>>> 9 genes.info <– rowRanges(exp)
>>>>
>>>> Following stepwise procedure for obtaining GBM and LGG clinical data…
>>>>
>>>> 1 # get clinical patient data for GBM samples
>>>> 2 gbm_clin <– TCGAquery_clinic("gbm","clinical_patient")
>>>> 3
>>>> 4 # get clinical patient data for LGG samples
>>>> 5 lgg_clin <– TCGAquery_clinic("lgg","clinical_patient")
>>>> 6
>>>> 7 # Bind the results, as the columns might not be the same,
>>>> 8 # we will plyr rbind.fill , to have all columns from both files
>>>> 9 clinical <– plyr::rbind.fill(gbm_clin ,lgg_clin)
>>>> 10
>>>> 11 # Other clinical files can be downloaded,
>>>> 12 # Use ?TCGAquery_clinic for more information
>>>> 13 clin_radiation <– TCGAquery_clinic("lgg","clinical_radiation")
>>>> 14
>>>> 15 # Also, you can get clinical information from different tumor types.
>>>> 16 # For example sample 1 is GBM, sample 2 and 3 are TGCT
>>>> 17 data <– TCGAquery_clinic(clinical_data_type = "clinical_patient",
>>>> 18    samples = c("TCGA-06-5416-01A-01D-1481-05",
>>>> 19  "TCGA-2G-AAEW-01A-11D-A42Z-05",
>>>> 20  "TCGA-2G-AAEX-01A-11D-A42Z-05"))
>>>>
>>>>
>>>> # Searching idat file for DNA methylation
>>>> query <- GDCquery(project = "TCGA-GBM",
>>>>                  data.category = "Raw microarray data",
>>>>                  data.type = "Raw intensities",
>>>>                  experimental.strategy = "Methylation array",
>>>>                  legacy = TRUE,
>>>>                  file.type = ".idat",
>>>>                  platform = "Illumina Human Methylation 450")
>>>>
>>>> **Repeat for LGG**
>>>>
>>>> To access mutational information concerning TMZ methylation…
>>>>
>>>> > mutation <– TCGAquery_maf(tumor = "lgg")
>>>> 2   Getting maf tables
>>>> 3   Source: https://wiki.nci.nih.gov/display/TCGA/TCGA+MAF+Files
>>>> 4   We found these maf files below:
>>>> 5       MAF.File.Name
>>>> 6   2             hgsc.bcm.edu_LGG.IlluminaGA_DNASeq.1.somatic.maf
>>>> 7
>>>> 8   3 LGG_FINAL_ANALYSIS.aggregated.capture.tcga.uuid.curated.somatic.maf
>>>> 9
>>>> 10       Archive.Name Deploy.Date
>>>> 11   2 hgsc.bcm.edu_LGG.IlluminaGA_DNASeq_automated.Level_2.1.0.0  10-DEC-13
>>>> 12   3 broad.mit.edu_LGG.IlluminaGA_DNASeq_curated.Level_2.1.3.0   24-DEC-14
>>>> 13
>>>> 14   Please, select the line that you want to download: 3
>>>>
>>>> **Repeat this for GBM**
>>>>
>>>> Selecting specified lines to download…
>>>>
>>>> 1 gbm.subtypes <− TCGAquery_subtype(tumor = "gbm")
>>>> 2 lgg.subtypes <− TCGAquery_subtype(tumor = "lgg”)
>>>>
>>>>
>>>>
>>>> Downloading data via the Bioconductor package RTCGAToolbox…
>>>>
>>>> library(RTCGAToolbox)
>>>> 2
>>>> 3 # Get the last run dates
>>>> 4 lastRunDate <− getFirehoseRunningDates()[1]
>>>> 5 lastAnalyseDate <− getFirehoseAnalyzeDates(1)
>>>> 6
>>>> 7 # get DNA methylation data, RNAseq2 and clinical data for LGG
>>>> 8 lgg.data <− getFirehoseData(dataset = "LGG",
>>>> 9       gistic2_Date = getFirehoseAnalyzeDates(1),
>>>>         runDate = lastRunDate,
>>>> 10       Methylation = TRUE, RNAseq2_Gene_Norm = TRUE, Clinic = TRUE,
>>>> 11       Mutation = T,
>>>> 12       fileSizeLimit = 10000)
>>>> 13
>>>> 14 # get DNA methylation data, RNAseq2 and clinical data for GBM
>>>> 15 gbm.data <− getFirehoseData(dataset = "GBM",
>>>> 16       runDate = lastDate, gistic2_Date = getFirehoseAnalyzeDates(1),
>>>> 17       Methylation = TRUE, Clinic = TRUE, RNAseq2_Gene_Norm = TRUE,
>>>> 18       fileSizeLimit = 10000)
>>>> 19
>>>> 20 # To access the data you should use the getData function
>>>> 21 # or simply access with @ (for example gbm.data@Clinical)
>>>> 22 gbm.mut <− getData(gbm.data,"Mutations")
>>>> 23 gbm.clin <− getData(gbm.data,"Clinical")
>>>> 24 gbm.gistic <− getData(gbm.data,"GISTIC")
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Genomic Analysis/Final data extraction:
>>>>
>>>> Enable “getData” to access the data
>>>>
>>>> Obtaining GISTIC results…
>>>>
>>>> 1 # Download GISTIC results
>>>> 2 gistic <− getFirehoseData("GBM",gistic2_Date ="20141017" )
>>>> 3
>>>> 4 # get GISTIC results
>>>> 5 gistic.allbygene <− gistic@GISTIC@AllByGene
>>>> 6 gistic.thresholedbygene <− gistic@GISTIC@ThresholedByGene
>>>>
>>>> Repeat this procedure to obtain LGG GISTIC results.
>>>>
>>>> ***Please ignore the 'non-coded' text as they are procedural
>>>> steps/classifications***
>>>>
>>>>         [[alternative HTML version deleted]]
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
