[R] Fwd: UPDATE

Spencer Brackett spbrackett20 sending from saintjosephhs.com
Thu Dec 27 06:14:11 CET 2018


Follow up,

Would read.txt also work, as I am certain that I have both datasets in .txt
files? As to a previous user's question concerning the .csv nature of the
supposed Excel file, I am uncertain how it was translated as such.
The file is most certainly in Excel format.
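
A note on the question above: base R has no function named read.txt, but the file extension does not matter to read.table or read.delim; only the separator used inside the file does. A minimal sketch, assuming the .txt file is tab-separated with a header row (the temporary file and its two columns are made up for illustration and are not the ICGC data):

```r
# Hypothetical stand-in for a tab-separated .txt download
txt_path <- tempfile(fileext = ".txt")
writeLines(c("gene_name\texpressed", "EGFR\tTRUE", "TP53\tFALSE"), txt_path)

# read.table needs header and sep spelled out; the .txt extension is irrelevant
from_txt <- read.table(txt_path, header = TRUE, sep = "\t",
                       stringsAsFactors = FALSE)

# Columns holding only TRUE/FALSE are read as logical, not numeric
str(from_txt)
```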


On Thu, Dec 27, 2018 at 12:10 AM Spencer Brackett <
spbrackett20 using saintjosephhs.com> wrote:

> Caitlin,
>
>   I tried your command in both RGui and RStudio, but both produced
> errors. I believe I made a mistake somewhere in labeling/downloading the
> files, which is the source of the confusion in R. I will re-examine the
> files saved on my desktop to determine the error. Regardless, would it be
> better to use read.table or read.csv when attempting to read
> my datasets? I tried using read_excel in RStudio, as this process seemed much
> easier; however, it would seem that my proclivity for error prevents such.
>
> Best,
>
> Spencer
>
> On Wed, Dec 26, 2018 at 11:55 PM Caitlin Gibbons <bioprogrammer using gmail.com>
> wrote:
>
>> Does this help Spencer? The read.delim() function assumes a tab character
>> by default, but I specifically included it using the read.csv function. The
>> downloaded file is NOT an Excel file so this should help.
>>
>> GBM_protein_expression <- read.csv("C:/Users/Spencer/Desktop/GBM protein_expression.tsv", sep="\t")
>>
>> Sent from my iPhone
>>
>> > On Dec 26, 2018, at 9:23 PM, Richard M. Heiberger <rmh using temple.edu>
>> wrote:
>> >
>> > this is wrong because the file is a csv file.  read_excel is designed
>> > for xls files.
>> > GBM_protein_expression <- read_excel("C:/Users/Spencer/Desktop/GBM protein_expression.csv")
>> >
>> > How did you get a csv? it downloads as tsv.
>> >
>> > the statement you should use is in base, no library() statement is
>> > needed.
>> >
>> > GBM_protein_expression <- read.delim("C:/Users/Spencer/Desktop/GBM protein_expression.csv")
>> >
>> > read.delim is the same as read.csv except that it sets the sep
>> > argument to "\t".
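
The equivalence described above can be checked directly. A minimal sketch, using a small made-up tab-separated file rather than the real download (the file contents and object names are assumptions for illustration):

```r
# Made-up tab-separated file standing in for the downloaded TSV
tsv_path <- tempfile(fileext = ".tsv")
writeLines(c("id\tvalue", "a\t1.5", "b\t2.5"), tsv_path)

via_delim <- read.delim(tsv_path)            # sep = "\t" is the default
via_csv   <- read.csv(tsv_path, sep = "\t")  # same reader, tab given explicitly
via_plain <- read.csv(tsv_path)              # default sep = "," finds no commas

identical(via_delim, via_csv)  # the two tab-aware calls should agree
ncol(via_delim)                # columns found when splitting on tabs
ncol(via_plain)                # columns found when splitting on commas
```

Comparing the two column counts is a quick way to tell whether the wrong separator was used: a tab-separated file read with the comma default collapses into a single column.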
>> >
>> >
>> >
>> > On Wed, Dec 26, 2018 at 11:11 PM Spencer Brackett
>> > <spbrackett20 using saintjosephhs.com> wrote:
>> >>
>> >> Sorry, my mistake.
>> >>
>> >> So I could still use read.table and should I try using a .txt version of
>> >> the file to avoid the silent changes you described?
>> >>
>> >> Also, when I tried to simplify this process by downloading the dataset onto
>> >> RStudio as opposed to R (Gui), I received the following...
>> >> library(readxl)
>> >>> GBM_protein_expression <- read_excel("C:/Users/Spencer/Desktop/GBM
>> >> protein_expression.csv")
>> >> Error: Can't establish that the input is either xls or xlsx.
>> >>> View(GBM_protein_expression)
>> >> Error in View : object 'GBM_protein_expression' not found
>> >> Error in gzfile(file, mode) : cannot open the connection
>> >> In addition: Warning message:
>> >> In gzfile(file, mode) :
>> >>  cannot open compressed file
>> >> 'C:/Users/Spencer/AppData/Local/Temp/RtmpQNQrMh/input147c61fc5b52.rds',
>> >> probable reason 'No such file or directory'
>> >>> library(readxl)
>> >>> GBM_protein_expression <-
>> >> read_excel("C:/Users/Spencer/Desktop/GBM_protein_ expression.xlsx")
>> >> readxl works best with a newer version of the tibble package.
>> >> You currently have tibble v1.4.2.
>> >> Falling back to column name repair from tibble <= v1.4.2.
>> >> Message displays once per session.
>> >>> View(GBM_protein_expression)
>> >>
>> >>
>> >> Is this perhaps the result of lack of preview (which I did not complete at
>> >> the time I hit import as the preview failed to load), or the fact that the
>> >> excel file itself contains no numerical data, but only TRUE or FALSE
>> >> entries?
>> >>
>> >> On Wed, Dec 26, 2018 at 10:59 PM Jeff Newmiller <
>> jdnewmil using dcn.davis.ca.us>
>> >> wrote:
>> >>
>> >>> Please always reply-all to keep the list involved.
>> >>>
>> >>> If you used Save As to change the data format to Excel AND the file
>> >>> extension to xlsx, then yes, you should be able to read with readxl. I
>> >>> don't recommend it, though... Excel often changes data silently and in
>> >>> irregularly located places in your file.
>> >>>
>> >>> On December 26, 2018 7:38:16 PM PST, Spencer Brackett <
>> >>> spbrackett20 using saintjosephhs.com> wrote:
>> >>>> So even if I imported the file from ICGC to my desktop as an Excel file,
>> >>>> and can view and save the data as such, it is still a TSV?
>> >>>>
>> >>>> On Wed, Dec 26, 2018 at 10:35 PM Jeff Newmiller
>> >>>> <jdnewmil using dcn.davis.ca.us>
>> >>>> wrote:
>> >>>>
>> >>>>> CSV and TSV are not Excel files. Yes, I know Excel will open them, but
>> >>>>> that does not make them Excel files.
>> >>>>>
>> >>>>> Read a TSV file with read.table or read.csv, setting the sep argument to
>> >>>>> "\t".
>> >>>>>
>> >>>>> On December 26, 2018 7:26:35 PM PST, Spencer Brackett <
>> >>>>> spbrackett20 using saintjosephhs.com> wrote:
>> >>>>>> I tried importing the file without preview and received the
>> >>>>>> following....
>> >>>>>>
>> >>>>>> library(readxl)
>> >>>>>>> GBM_protein_expression <- read_excel("C:/Users/Spencer/Desktop/GBM
>> >>>>>> protein_expression.csv")
>> >>>>>> Error: Can't establish that the input is either xls or xlsx.
>> >>>>>>> View(GBM_protein_expression)
>> >>>>>> Error in View : object 'GBM_protein_expression' not found
>> >>>>>> Error in gzfile(file, mode) : cannot open the connection
>> >>>>>> In addition: Warning message:
>> >>>>>> In gzfile(file, mode) :
>> >>>>>> cannot open compressed file
>> >>>>>> 'C:/Users/Spencer/AppData/Local/Temp/RtmpQNQrMh/input147c61fc5b52.rds',
>> >>>>>> probable reason 'No such file or directory'
>> >>>>>>> library(readxl)
>> >>>>>>> GBM_protein_expression <-
>> >>>>>> read_excel("C:/Users/Spencer/Desktop/GBM_protein_ expression.xlsx")
>> >>>>>> readxl works best with a newer version of the tibble package.
>> >>>>>> You currently have tibble v1.4.2.
>> >>>>>> Falling back to column name repair from tibble <= v1.4.2.
>> >>>>>> Message displays once per session.
>> >>>>>>> View(GBM_protein_expression)
>> >>>>>>
>> >>>>>> Also, the area above my console says that no data is available in the
>> >>>>>> table. Is this perhaps the result of lack of preview or the fact that the
>> >>>>>> excel file itself contains no numerical data, but only TRUE or FALSE
>> >>>>>> entries?
>> >>>>>>
>> >>>>>> On Wed, Dec 26, 2018 at 9:57 PM Spencer Brackett <
>> >>>>>> spbrackett20 using saintjosephhs.com> wrote:
>> >>>>>>
>> >>>>>>> Hello again,
>> >>>>>>>
>> >>>>>>> I worked on directly downloading the file into R as was suggested, but
>> >>>>>>> have thus far been unsuccessful. This is what I generated on my second
>> >>>>>>> attempt...
>> >>>>>>>
>> >>>>>>> GBM protein_expression<-(file.choose(), header=TRUE, sep="\t")
>> >>>>>>> Error: unexpected symbol in "GBM protein_expression"
>> >>>>>>> > GBM protein_expression<-(file.choose(GBM_protein_expression.xlsx),header=TRUE, sep="\t")
>> >>>>>>> Error: unexpected symbol in "GBM protein_expression"
>> >>>>>>>>
>> >>>>>>>
>> >>>>>>> What part of the argument is in error?
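
Regarding the question above: an R object name cannot contain a space, so the parser stops at `GBM protein_expression`, and the parentheses after `<-` are missing a reading function such as read.delim. In addition, file.choose() is normally called without a file name. A minimal sketch of a corrected call, assuming the file is tab-separated; a temporary file with made-up contents is used so the example runs non-interactively:

```r
# Made-up tab-separated file; in an interactive session, file.choose()
# could supply the path instead of this temporary file
path <- tempfile(fileext = ".tsv")
writeLines(c("gene_name\texpressed", "EGFR\tTRUE"), path)

# Underscore instead of a space in the object name, and a reading
# function in front of the parentheses
GBM_protein_expression <- read.delim(path, header = TRUE, sep = "\t")
head(GBM_protein_expression)
```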
>> >>>>>>>
>> >>>>>>> Also I tried importing the dataset as an excel file on RStudio to see if I
>> >>>>>>> could solve my problem that way. However, my imported excel file has been
>> >>>>>>> stuck in the 'retrieving preview data' stage and no data is appearing. Is the
>> >>>>>>> data file perhaps too large or in the wrong format?
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> On Wed, Dec 26, 2018 at 6:42 PM Spencer Brackett <
>> >>>>>>> spbrackett20 using saintjosephhs.com> wrote:
>> >>>>>>>
>> >>>>>>>> Mr. Heiberger,
>> >>>>>>>>
>> >>>>>>>> Thank you for the insight! I will try out your suggestion.
>> >>>>>>>>
>> >>>>>>>> Best,
>> >>>>>>>>
>> >>>>>>>> Spencer Brackett
>> >>>>>>>>
>> >>>>>>>> On Wed, Dec 26, 2018 at 6:34 PM Richard M. Heiberger
>> >>>>>> <rmh using temple.edu>
>> >>>>>>>> wrote:
>> >>>>>>>>
>> >>>>>>>>> I looked at the first file.  It gives an option to download as TSV
>> >>>>>>>>> (tab separated values).
>> >>>>>>>>> That is the same as CSV except with tabs instead of commas.
>> >>>>>>>>> You do not need any external software to read it.  Read the downloaded
>> >>>>>>>>> file directly into R.
>> >>>>>>>>>
>> >>>>>>>>> read.delim looks as if it would work directly on the downloaded file.
>> >>>>>>>>> ?read.delim
>> >>>>>>>>> The notation "\t" means the tab character.
>> >>>>>>>>>
>> >>>>>>>>> As an aside, stay away from Notepad. It is too naive for almost
>> >>>>>>>>> anything interesting.
>> >>>>>>>>> The specific case I often see is people reading Linux-style text files
>> >>>>>>>>> with Notepad, which doesn't understand NL terminated lines.  Nicely
>> >>>>>>>>> formatted text files become illegible.
>> >>>>>>>>>
>> >>>>>>>>> On Wed, Dec 26, 2018 at 6:04 PM Spencer Brackett
>> >>>>>>>>> <spbrackett20 using saintjosephhs.com> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>> Good evening,
>> >>>>>>>>>>
>> >>>>>>>>>> I am attempting to analyze the protein expression data contained within
>> >>>>>>>>>> these two ICGC, TCGA datasets (one for GBM and the other for LGG)
>> >>>>>>>>>>
>> >>>>>>>>>> *File for GBM protein expression*:
>> >>>>>>>>>>
>> >>>>>>>>>> https://dcc.icgc.org/search?filters=%7B%22donor%22:%7B%22projectId%22:%7B%22is%22:%5B%22GBM-US%22%5D%7D,%22availableDataTypes%22:%7B%22is%22:%5B%22pexp%22%5D%7D%7D%7D
>> >>>>>>>>>>
>> >>>>>>>>>> *File for LGG protein expression:*
>> >>>>>>>>>>
>> >>>>>>>>>> https://dcc.icgc.org/search?filters=%7B%22donor%22:%7B%22projectId%22:%7B%22is%22:%5B%22LGG-US%22%5D%7D,%22availableDataTypes%22:%7B%22is%22:%5B%22pexp%22%5D%7D%7D%7D
>> >>>>>>>>>>
>> >>>>>>>>>>  When I tried to transfer the files from .txt (via Notepad) to .csv (via
>> >>>>>>>>>> Excel), the data appeared in the columns as unorganized and random
>> >>>>>>>>>> script... not like how a typical csv should be arranged at all. I need the
>> >>>>>>>>>> dataset to be converted into .csv in order to analyze it in R, which is why
>> >>>>>>>>>> I am hoping someone here might help me in doing that. If not, is there
>> >>>>>>>>>> perhaps some other way that I could analyze the datasets in R, which again
>> >>>>>>>>>> are downloaded from the ICGC data portal?
>> >>>>>>>>>>
>> >>>>>>>>>> Best,
>> >>>>>>>>>>
>> >>>>>>>>>> Spencer Brackett
>> >>>>>>>>>>
>> >>>>>>>>>>        [[alternative HTML version deleted]]
>> >>>>>>>>>>
>> >>>>>>>>>> ______________________________________________
>> >>>>>>>>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more,
>> >>>> see
>> >>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>> >>>>>>>>>> PLEASE do read the posting guide
>> >>>>>>>>> http://www.R-project.org/posting-guide.html
>> >>>>>>>>>> and provide commented, minimal, self-contained, reproducible
>> >>>>>> code.
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>
>> >>>>>
>> >>>>> --
>> >>>>> Sent from my phone. Please excuse my brevity.
>> >>>>>
>> >>>
>> >>>
>> >>
>> >
>>
>
