[R] TCGA biolinks, DNA methylation
peter dalgaard
pdalgd using gmail.com
Fri Aug 31 13:03:00 CEST 2018
At this point, it seems pretty clear that the issue is in the data file itself. Either it is not a CSV file to begin with, or it is in some exotic encoding (UTF-16?). The "embedded nulls" warnings would fit the latter: in UTF-16, every plain ASCII character is stored with an accompanying zero byte.
You probably need to look at the file in a text editor to see whether the content makes sense as comma-separated values.
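The same check can be done from within R by looking at the first bytes of the file. This is only a sketch: sniff_file is a hypothetical helper name (not a base R function), and the temporary file below merely stands in for the real download.

```r
# Hypothetical helper: report byte-level hints about what a file really is.
sniff_file <- function(path, n = 1000L) {
  bytes <- readBin(path, what = "raw", n = n)
  starts_with <- function(sig) {
    length(bytes) >= length(sig) && all(bytes[seq_along(sig)] == as.raw(sig))
  }
  list(
    utf16le_bom = starts_with(c(0xFF, 0xFE)),       # UTF-16, little-endian BOM
    utf16be_bom = starts_with(c(0xFE, 0xFF)),       # UTF-16, big-endian BOM
    utf8_bom    = starts_with(c(0xEF, 0xBB, 0xBF)), # UTF-8 BOM
    zip_magic   = starts_with(c(0x50, 0x4B)),       # "PK": zip container, e.g. .xlsx
    has_nul     = any(bytes == as.raw(0))           # nul bytes in the first n bytes
  )
}

# Stand-in file: ASCII text stored as UTF-16LE (each byte followed by 0x00),
# preceded by a byte-order mark.
demo <- tempfile(fileext = ".csv")
txt <- "id,drug\n1,temozolomide\n"
utf16 <- as.vector(rbind(charToRaw(txt), as.raw(0)))
writeBin(c(as.raw(c(0xFF, 0xFE)), utf16), demo)

str(sniff_file(demo))
```

A UTF-16 byte-order mark or a large number of nul bytes points to an encoding problem; a leading "PK" suggests a zip container such as an .xlsx workbook rather than a text file.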
Also, perhaps review the download mechanism: recently I have found several students shooting themselves in the foot by downloading .csv files, having them automatically opened by Excel, and then saving them _in_ Excel, garbling the file in the process.
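If the file has ended up in UTF-16 somewhere along the way, telling read.csv about the encoding may be enough to get rid of the embedded-nul warnings. A minimal sketch, assuming UTF-16LE with a byte-order mark, which is only one of the possibilities; the small file built here is made up for illustration and is not the TCGA data.

```r
# Build a small UTF-16LE file with a byte-order mark as a stand-in.
demo <- tempfile(fileext = ".csv")
txt <- "id,drug\n1,temozolomide\n2,bevacizumab\n"
utf16 <- as.vector(rbind(charToRaw(txt), as.raw(0)))  # ASCII byte + 0x00
writeBin(c(as.raw(c(0xFF, 0xFE)), utf16), demo)

# Read it back, declaring the encoding via the fileEncoding argument.
the_data <- read.csv(file = demo, header = TRUE, sep = ",",
                     fileEncoding = "UTF-16LE")
print(the_data)
```

If the columns still come out as a single field, the separator may be a tab rather than a comma, in which case sep = "\t" would be the thing to try.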
-pd
> On 31 Aug 2018, at 03:05 , Spencer Brackett <spbrackett20 using saintjosephhs.com> wrote:
>
> My apologies... the following is what I received from the correction
>
> the_data<-read.csv(file="GBM_clinical_drug.csv",header=TRUE,sep=",")
> Warning messages:
> 1: In read.table(file = file, header = header, sep = sep, quote = quote, :
> line 3 appears to contain embedded nulls
> 2: In read.table(file = file, header = header, sep = sep, quote = quote, :
> line 4 appears to contain embedded nulls
> 3: In read.table(file = file, header = header, sep = sep, quote = quote, :
> line 5 appears to contain embedded nulls
> 4: In scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
> embedded nul(s) found in input
>>
>
>
> On Thu, Aug 30, 2018 at 8:57 PM Patrick Barry <pdbarry using alaska.edu> wrote:
>
>> You still haven't fixed the first thing both Sarah and I pointed out. You
>> are lacking an = between sep and ","
>>
>> the_data<-read.csv(file="GBM_clinical_drug.csv",header=TRUE,sep",")
>>
>> should be
>>
>> the_data <- read.csv(file = "GBM_clinical_drug.csv", header = TRUE, sep = ",")
>>
>> as Sarah pointed out, you should use spaces to help make these errors more
>> obvious.
>>
>> On Thu, Aug 30, 2018 at 4:53 PM, Spencer Brackett <
>> spbrackett20 using saintjosephhs.com> wrote:
>>
>>> Hello again,
>>>
>>> My apologies for the delayed response... computer troubles. In reference to
>>> Ms. Goslee's and Mr. Barry's query, the following is the error message I
>>> received after entering my R command:
>>>
>>> the_data<-read.csv(file="GBM_clinical_drug.csv",header=TRUE,sep",")
>>> Error: unexpected string constant in
>>> "the_data<-read.csv(file="GBM_clinical_drug.csv",header=TRUE,sep",""
>>>
>>> Given this, should I proceed with path <- getwd(), since I am, as he
>>> suggested, trying to set the variable *path* to my working directory with
>>> path <- "."?
>>>
>>> Mr. Mittal also recommended importing with r studio, which I shall try in
>>> the meantime.
>>>
>>> Many thanks,
>>>
>>> Spencer Brackett
>>>
>>>
>>> On Wed, Aug 29, 2018 at 10:14 PM Amit Mittal <prof.amit.mittal using gmail.com>
>>> wrote:
>>>
>>>> Use r studio and import from the menu. Read_csv has changed
>>>>
>>>> Also you can see any format problems
>>>>
>>>> On Thu, 30 Aug 2018 3:36 am Spencer Brackett, <
>>>> spbrackett20 using saintjosephhs.com> wrote:
>>>>
>>>>> Good evening R users,
>>>>>
>>>>> I am attempting to carry out DNA methylation analysis on two separate CSV
>>>>> files (LGG and GBM), which I have downloaded onto my R console. To set the
>>>>> path<-"." to be indicative of one or both of the csv files, I utilized the
>>>>> following functions and received the errors shown. How do I set the "." so
>>>>> that I can begin analysis on my files?
>>>>>
>>>>>> the_data <-read.csv(file="LGG_clinical_drug.csv",header=T,sep",")
>>>>> Error: unexpected string constant in "the_data
>>>>> <-read.csv(file="LGG_clinical_drug.csv",header=T,sep",""
>>>>>> the_data<-read.csv(file="GBM_clinical_drug.csv",header=T,sep",")
>>>>> Error: unexpected string constant in
>>>>> "the_data<-read.csv(file="GBM_clinical_drug.csv",header=T,sep",""
>>>>>
>>>>> This is the preliminary portion of the analysis I am trying to run, which I
>>>>> am referring to:
>>>>>
>>>>> library(TCGAbiolinks)
>>>>>
>>>>> # Download the DNA methylation data: HumanMethylation450 LGG and GBM.
>>>>> path <- "."
>>>>>
>>>>> query.met <- TCGAquery(tumor = c("LGG", "GBM"), "HumanMethylation450",
>>>>>                        level = 3)
>>>>> TCGAdownload(query.met, path = path)
>>>>> met <- TCGAprepare(query = query.met, dir = path,
>>>>>                    add.subtype = TRUE, add.clinical = TRUE,
>>>>>                    summarizedExperiment = TRUE,
>>>>>                    save = TRUE, filename = "lgg_gbm_met.rda")
>>>>>
>>>>> # Download the expression data: IlluminaHiSeq_RNASeqV2 LGG and GBM.
>>>>> query.exp <- TCGAquery(tumor = c("lgg", "gbm"),
>>>>>                        platform = "IlluminaHiSeq_RNASeqV2", level = 3)
>>>>>
>>>>> TCGAdownload(query.exp, path = path, type = "rsem.genes.normalized_results")
>>>>>
>>>>> exp <- TCGAprepare(query = query.exp, dir = path,
>>>>>                    summarizedExperiment = TRUE,
>>>>>                    add.subtype = TRUE, add.clinical = TRUE,
>>>>>                    type = "rsem.genes.normalized_results",
>>>>>                    save = TRUE, filename = "lgg_gbm_exp.rda")
>>>>>
>>>>> Many thanks,
>>>>>
>>>>> Spencer Brackett
>>>>>
>>>>> [[alternative HTML version deleted]]
>>>>>
>>>>> ______________________________________________
>>>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>
>>>>
>>>
>>>
>>
>>
>
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes using cbs.dk Priv: PDalgd using gmail.com