[BioC] How to create nested data frames

James W. MacDonald jmacdon at med.umich.edu
Fri Oct 26 15:08:18 CEST 2007


Hi Ana,

I don't think you need anything special to use the pls package. All you 
need is a data.frame containing your data, or alternatively to have your 
data in your .GlobalEnv (which IMO is easier to do anyway). Note that 
the pls package expects your data in the conventional format of 
subjects (samples) in rows and variables in columns, so you will have to 
transpose your matrix of expression data.

Does

results <- mvr(medical ~ t(expr))  # add any other arguments you need

not work for you?
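
If you do want both matrices in a single data.frame, one option (a minimal
sketch, not tested against your data) is to wrap each matrix in I(), so that
data.frame() keeps it as one matrix-valued column instead of splitting it
into separate columns. The toy dimensions and random values below are
assumptions for illustration; only the names expr, medical and mydata come
from your message. Both matrices must have the same number of rows, which is
why expr is transposed first.

```r
# Toy data (assumed shapes): expr is genes x samples, medical is samples x variables
set.seed(1)
n_genes   <- 6
n_samples <- 4
expr    <- matrix(rnorm(n_genes * n_samples), nrow = n_genes)
medical <- matrix(rnorm(n_samples * 2), nrow = n_samples)

# I() protects each matrix so it is stored as a single column;
# t(expr) puts samples in rows so the row counts match
mydata <- data.frame(medical = I(medical), expr = I(t(expr)))

dim(mydata)          # 4 rows, 2 (matrix-valued) columns
dim(mydata$expr)     # 4 x 6: samples x genes
dim(mydata$medical)  # 4 x 2: samples x medical variables

# With the pls package installed, the fit would then look something like:
# library(pls)
# results <- mvr(medical ~ expr, data = mydata)
```

With this layout, mydata$expr and mydata$medical each return the full matrix.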

Best,

Jim



Ana Conesa wrote:
> Dear Oleg,
> 
> Thanks for your help. I have tried, but it seems that the method you
> indicated works only when the 2 matrices have the same length, which is
> not my case. I cannot construct the data.frame as you indicated if I
> have different lengths for a and b...
> 
> Ana
>>
>> ---- Original Message ----
>> From: osklyar at ebi.ac.uk
>> To: aconesa at ochoa.fib.es
>> Subject: Re: [BioC] How to create nested data frames
>> Date: Thu, 25 Oct 2007 23:42:51 +0100
>>
>>> AFAIK it should be impossible, at least directly: a data.frame is
>>> essentially a list of vectors of equal length. However, a matrix in R
>>> is essentially a vector with dim attributes set. So what you can do is
>>> something like this:
>>>
>>> # function that uses such a crazy data.frame, x
>>> f = function(x) {
>>>  a = x$a
>>>  dim(a) = attr(x,"matrixdim")
>>>  b = x$b
>>>  dim(b) = dim(a)
>>>  # use matrices, e.g.
>>>  print(dim(b))
>>> }
>>>
>>> m1 = matrix(runif(10),2,5)
>>> m2 = matrix(runif(10),2,5)
>>>
>>> df = data.frame(a=as.numeric(m1), b=as.numeric(m2))
>>> attr(df, "matrixdim") = dim(m1)
>>>
>>> f(df)
>>>
>>> ### should print 2 5 as those are the dimensions of matrices!
>>>> f(df)
>>> [1] 2 5
>>>
>>> indeed!
>>>
>>> Well, you need to consider that when you call as.numeric, and when
>>> you set 'dim' on a vector, you copy the data! But honestly, you can
>>> always find a way to pass another object, and a list would be more
>>> reasonable as you do not need to copy the data. And although the above
>>> example works, I cannot think of a situation where it would be
>>> justified to use it. Also, data.frames are slow!
>>>
>>>
>>> -  
>>> Dr Oleg Sklyar * EMBL-EBI, Cambridge CB10 1SD, UK * +441223494466
>>>
>>>
>>> On Fri, 2007-10-26 at 00:19 +0200, Ana Conesa wrote:
>>>> Dear List,
>>>>
>>>> This is more an R than a Bioconductor question, but I cannot post
>>>> to the R list at the moment, so I apologize for using Bioconductor
>>>> instead.
>>>>
>>>> I am trying to use the pls package to compute a PLS regression of
>>>> gene expression data on medical variables. I should provide my data
>>>> as a data.frame (NOT A LIST) which contains the matrices of X and Y
>>>> variables, i.e. if mydata is such a data frame, then mydata$expr
>>>> gives the expression matrix and mydata$medical is the matrix of
>>>> medical data.
>>>> If I do:
>>>>> mydata <- data.frame(expr=expr, medical=medical)
>>>> I simply obtain a single data.frame combining the two, and I am not
>>>> able to call the matrices independently. I have been searching the R
>>>> help and documentation without success.
>>>> Any help appreciated.
>>>>
>>>> Ana
>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at stat.math.ethz.ch
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> 

-- 
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623


