[R] TCGA biolinks, DNA methylation

Peter Dalgaard pdalgd using gmail.com
Fri Aug 31 13:03:00 CEST 2018


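At this point, it seems pretty clear that the issue is in the data file itself. Either it is not a CSV file to begin with, or it is in some exotic encoding (UTF-16?). In UTF-16, every plain ASCII character carries a zero byte, which is exactly what shows up as "embedded nuls" when the file is read as ordinary text.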
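One way to test the encoding theory from within R is to look at the raw bytes. The following is only a minimal sketch: `sniff_encoding` is a hypothetical helper name, and the demo file is a small temporary file written here as UTF-16LE, not the actual GBM file. A large count of zero bytes, or a byte-order mark at the start, would suggest passing `fileEncoding` to `read.csv`.

```r
# Hypothetical helper: report byte-level hints about a file's encoding.
sniff_encoding <- function(path, n = 4096L) {
  bytes <- readBin(path, what = "raw", n = n)
  starts_with <- function(sig) {
    length(bytes) >= length(sig) &&
      identical(bytes[seq_along(sig)], as.raw(sig))
  }
  bom <- if (starts_with(c(0xff, 0xfe))) {
    "UTF-16LE"
  } else if (starts_with(c(0xfe, 0xff))) {
    "UTF-16BE"
  } else if (starts_with(c(0xef, 0xbb, 0xbf))) {
    "UTF-8"
  } else {
    NA_character_          # no byte-order mark found
  }
  list(bom = bom, n_nul = sum(bytes == as.raw(0)))
}

# Demo data (made up): a tiny CSV written as UTF-16LE without a byte-order mark.
demo_path <- tempfile(fileext = ".csv")
payload <- iconv("id,value\n1,a\n", from = "UTF-8", to = "UTF-16LE",
                 toRaw = TRUE)[[1]]
writeBin(payload, demo_path)

hint <- sniff_encoding(demo_path)

# Many zero bytes in an otherwise plain file hint at UTF-16;
# declaring the encoding lets read.csv decode it properly.
demo_data <- read.csv(demo_path, fileEncoding = "UTF-16LE")
```

If the real file does start with a byte-order mark, `fileEncoding = "UTF-16"` may be the better choice, since the mark then tells the converter which byte order to use.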

You probably need to look at the file in a text editor to see whether the contents make sense as comma-separated values.
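The same check can be roughed out in R itself. This is a sketch under assumptions: the file below is a made-up temporary example, not the real download. `readLines` shows the first few lines, and `count.fields` reports how many comma-separated fields each line has; in a well-formed CSV file that number should be the same on every line.

```r
# Made-up demo file: the last line is deliberately one field short.
demo_path <- tempfile(fileext = ".csv")
writeLines(c("id,value,group", "1,a,x", "2,b,y", "3,c"), demo_path)

# Eyeball the first few lines, as one would in a text editor.
first_lines <- head(readLines(demo_path), 3)
print(first_lines)

# Number of comma-separated fields on each line.
n_fields <- count.fields(demo_path, sep = ",", quote = "\"")

# Lines whose field count differs from the header line are suspicious.
suspect_lines <- which(n_fields != n_fields[1])
```

If nearly every line ends up in `suspect_lines`, or the printed lines look like binary junk, the file is probably not a plain CSV file at all.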

Also, perhaps review the download mechanism. Recently I have found several students shooting themselves in the foot by downloading .csv files, having them automatically opened by Excel, and then saving them _in_ Excel, garbling the file in the process.

-pd

> On 31 Aug 2018, at 03:05 , Spencer Brackett <spbrackett20 using saintjosephhs.com> wrote:
> 
> My apologies... the following is what I received after making the correction:
> 
> the_data<-read.csv(file="GBM_clinical_drug.csv",header=TRUE,sep=",")
> Warning messages:
> 1: In read.table(file = file, header = header, sep = sep, quote = quote,  :
>  line 3 appears to contain embedded nulls
> 2: In read.table(file = file, header = header, sep = sep, quote = quote,  :
>  line 4 appears to contain embedded nulls
> 3: In read.table(file = file, header = header, sep = sep, quote = quote,  :
>  line 5 appears to contain embedded nulls
> 4: In scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  :
>  embedded nul(s) found in input
>> 
> 
> 
> On Thu, Aug 30, 2018 at 8:57 PM Patrick Barry <pdbarry using alaska.edu> wrote:
> 
>> You still haven't fixed the first thing both Sarah and I pointed out. You
>> are lacking an = between sep and ","
>> 
>> the_data<-read.csv(file="GBM_clinical_drug.csv",header=TRUE,sep",")
>> 
>> should be
>> 
>> the_data <- read.csv(file = "GBM_clinical_drug.csv", header = TRUE, sep = ",")
>> 
>> As Sarah pointed out, you should use spaces to help make these errors more
>> obvious.
>> 
>> On Thu, Aug 30, 2018 at 4:53 PM, Spencer Brackett <
>> spbrackett20 using saintjosephhs.com> wrote:
>> 
>>> Hello again,
>>> 
>>> My apologies for the delayed response... computer troubles. In reference to
>>> Ms. Goslee's and Mr. Barry's query, the following is the error message I
>>> received after entering my R command:
>>> 
>>> the_data<-read.csv(file="GBM_clinical_drug.csv",header=TRUE,sep",")
>>> Error: unexpected string constant in
>>> "the_data<-read.csv(file="GBM_clinical_drug.csv",header=TRUE,sep",""
>>> 
>>> Given this, should I proceed with path <- getwd(), since I am, as he
>>> suggested, trying to set the variable *path* to my working directory with
>>> path <- "."?
>>> 
>>> Mr. Mittal also recommended importing with RStudio, which I shall try in
>>> the meantime.
>>> 
>>> Many thanks,
>>> 
>>> Spencer Brackett
>>> 
>>> 
>>> On Wed, Aug 29, 2018 at 10:14 PM Amit Mittal <prof.amit.mittal using gmail.com>
>>> wrote:
>>> 
>>>> Use RStudio and import from the menu. Read_csv has changed.
>>>> 
>>>> That way you can also see any format problems.
>>>> 
>>>> On Thu, 30 Aug 2018 3:36 am Spencer Brackett, <
>>>> spbrackett20 using saintjosephhs.com> wrote:
>>>> 
>>>>> Good evening R users,
>>>>> 
>>>>>  I am attempting to carry out DNA methylation analysis on two separate CSV
>>>>> files (LGG and GBM), which I have downloaded onto my R console. To set the
>>>>> path<-"." to be indicative of one or both of the CSV files, I utilized the
>>>>> following functions and received the errors shown. How do I set the "." so
>>>>> that I can begin analysis on my files?
>>>>> 
>>>>>> the_data <-read.csv(file="LGG_clinical_drug.csv",header=T,sep",")
>>>>> Error: unexpected string constant in "the_data
>>>>> <-read.csv(file="LGG_clinical_drug.csv",header=T,sep",""
>>>>>> the_data<-read.csv(file="GBM_clinical_drug.csv",header=T,sep",")
>>>>> Error: unexpected string constant in
>>>>> "the_data<-read.csv(file="GBM_clinical_drug.csv",header=T,sep",""
>>>>> 
>>>>> This is the preliminary portion of the analysis I am trying to run, which I
>>>>> am referring to:
>>>>> 
>>>>> library(TCGAbiolinks)
>>>>> 
>>>>> # Download the DNA methylation data: HumanMethylation450 LGG and GBM.
>>>>> path <- "."
>>>>> 
>>>>> query.met <- TCGAquery(tumor = c("LGG", "GBM"),
>>>>>                        platform = "HumanMethylation450", level = 3)
>>>>> TCGAdownload(query.met, path = path)
>>>>> met <- TCGAprepare(query = query.met, dir = path,
>>>>>                    add.subtype = TRUE, add.clinical = TRUE,
>>>>>                    summarizedExperiment = TRUE,
>>>>>                    save = TRUE, filename = "lgg_gbm_met.rda")
>>>>> 
>>>>> # Download the expression data: IlluminaHiSeq_RNASeqV2 LGG and GBM.
>>>>> query.exp <- TCGAquery(tumor = c("lgg", "gbm"),
>>>>>                        platform = "IlluminaHiSeq_RNASeqV2", level = 3)
>>>>> 
>>>>> TCGAdownload(query.exp, path = path,
>>>>>              type = "rsem.genes.normalized_results")
>>>>> 
>>>>> exp <- TCGAprepare(query = query.exp, dir = path,
>>>>>                    summarizedExperiment = TRUE,
>>>>>                    add.subtype = TRUE, add.clinical = TRUE,
>>>>>                    type = "rsem.genes.normalized_results",
>>>>>                    save = TRUE, filename = "lgg_gbm_exp.rda")
>>>>> 
>>>>> Many thanks,
>>>>> 
>>>>> Spencer Brackett
>>>>> 
>>>>>        [[alternative HTML version deleted]]
>>>>> 
>>>>> ______________________________________________
>>>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes using cbs.dk  Priv: PDalgd using gmail.com



