[R] you are making it far too difficult

Spencer Brackett spbrackett20 using saintjosephhs.com
Thu Dec 27 22:57:33 CET 2018


For future reference,

I unpacked the full gene expression file from ICGC by directly downloading
protein_expression.GBM-US.tsv.gz and then reading it with the approach
Mr. Heiberger laid out.
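
A minimal sketch of the direct-download step described above, for anyone
following along. It assumes only that read.delim() can open a gzip-compressed
file without a separate unpacking step. The temporary file, its column names,
and its values are hypothetical stand-ins so the example runs without the ICGC
download; they are not the real layout of protein_expression.GBM-US.tsv.gz.

```r
# Hypothetical stand-in for the downloaded .tsv.gz: a tiny gzipped,
# tab-separated file written to a temporary location.
# The column names and values below are made up for illustration only.
tmp <- tempfile(fileext = ".tsv.gz")
con <- gzfile(tmp, "w")
writeLines(c("icgc_donor_id\tgene_name\tnormalized_expression_level",
             "DO10892\tEGFR\t0.12",
             "DO12328\tTP53\t-0.40"), con)
close(con)

# read.delim() detects the gzip compression and reads the file directly,
# so the archive does not need to be unpacked by hand first.
pexp <- read.delim(tmp, stringsAsFactors = FALSE)

# Check the dimensions before viewing: a table with 1 column usually means
# the file was read with the wrong separator or was still compressed.
dim(pexp)
str(pexp)
```

With the real file, the same call would take the path of the download in place
of tmp, for example a path under ~/Downloads.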

Best,

Spencer Brackett

On Thu, Dec 27, 2018 at 4:30 PM Spencer Brackett <
spbrackett20 using saintjosephhs.com> wrote:

> Mr. Heiberger,
>
>   I followed your approach and it works; I received the same data. And
> yes, ICGC breaks its datasets into separate files based on data type.
> Thank you for the pointer on selecting 50 rows per page. I had assumed that
> the export you used would download all of the gene expression data in the
> data set, as happens when downloading the tsv.gz file directly.
>
> On Thu, Dec 27, 2018 at 4:05 PM Richard M. Heiberger <rmh using temple.edu>
> wrote:
>
>> I downloaded the Donors dataset
>>
>>
>> https://dcc.icgc.org/search?filters=%7B%22donor%22:%7B%22projectId%22:%7B%22is%22:%5B%22GBM-US%22%5D%7D,%22availableDataTypes%22:%7B%22is%22:%5B%22pexp%22%5D%7D%7D%7D
>>
>> by clicking "Export table as TSV".
>>
>> Then I read it with
>>
>> donors <- read.delim("~/Downloads/donors_2018_12_27_03_52_03.tsv")
>>
>> Here is the transcript.
>>
>> > donors <- read.delim("~/Downloads/donors_2018_12_27_03_52_03.tsv")
>> > donors
>>    Donor.ID Project.Code Primary.Site Gender Age.at.Diagnosis
>> 1   DO10892       GBM-US        Brain Female               45
>> 2   DO12328       GBM-US        Brain   Male               56
>> 3   DO11657       GBM-US        Brain Female               73
>> 4   DO13510       GBM-US        Brain Female               63
>> 5   DO12670       GBM-US        Brain Female               63
>> 6   DO11501       GBM-US        Brain Female               59
>> 7   DO13809       GBM-US        Brain Female               74
>> 8   DO13647       GBM-US        Brain   Male               56
>> 9   DO11645       GBM-US        Brain   Male               73
>> 10  DO14145       GBM-US        Brain Female               85
>>    Tumor.Stage.at.Diagnosis Survival.Time..days.  SSM CNSM  STSM   SGV METH.A
>> 1                        NA                   NA True True False False   True
>> 2                        NA                  154 True True False False   True
>> 3                        NA                   NA True True False False   True
>> 4                        NA                 1448 True True False False   True
>> 5                        NA                  772 True True False False   True
>> 6                        NA                   NA True True False False   True
>> 7                        NA                  213 True True False False   True
>> 8                        NA                  383 True True False False   True
>> 9                        NA                  113 True True False False   True
>> 10                       NA                   94 True True False False   True
>>    METH.S EXP.A EXP.S PEXP miRNA.S   JCN Mutations Mutated.Genes
>> 1   False  True  True True   False False       269           392
>> 2   False  True False True   False False       192           263
>> 3   False  True False True   False False       128           209
>> 4   False  True  True True   False False       130           199
>> 5   False  True  True True   False False       142           194
>> 6   False  True  True True   False False       129           190
>> 7   False  True False True   False False       130           178
>> 8   False  True False True   False False       116           175
>> 9   False  True False True   False False       125           174
>> 10  False  True  True True   False False       108           169
>> >
>>
>> I don't know how to download the whole file.  It looks like
>> you could page through it with the page menu at the bottom of the webpage.
>> If you do that, set it to 50 rows at a time instead of the default 10.
>>
>> For the Genes and the two types of Mutation files, this will be more of a
>> nuisance because there are about 10000 rows in each of those three files,
>> thus about 200 exports and read.delim() statements per dataset.
>>
>> I think it is time to move to the Bioconductor list for specific guidance
>> on this type of dataset.
>>
>>
>> On Thu, Dec 27, 2018 at 3:28 PM Spencer Brackett <
>> spbrackett20 using saintjosephhs.com> wrote:
>>
>>> Mr. Calboli,
>>>
>>> After beginning to unpack the GBM file you sent me by importing it directly
>>> into my console, I received the following:
>>>
>>> View(GBM_PEXP.tsv)
>>>
>>> **Note that I named the file GBM_PEXP.tsv**
>>>
>>>   Upon downloading, my script now contains a 2 by 2 table, with the x
>>> column still containing encoded script. As for my Data summary to the
>>> right, this new file reports that 2 objects are acting upon 1 variable.
>>> How should I proceed?
>>>
>>> -Spencer
>>>
>>> On Thu, Dec 27, 2018 at 3:12 PM Federico Calboli <
>>> federico.calboli using kuleuven.be> wrote:
>>>
>>> > Unpack these files.
>>> >
>>> > F
>>> >
>>> >
>>>
>>
