[R-sig-ME] Gene expression lme

Wed Apr 25 00:25:17 CEST 2012

Angelina Mukherjee <angelina.mukherjee88 at ...> writes:

> I have gene expression data and am working with a subset of it,
> comprising about 20,000 rows (genes) and 3 factors. I am interested in
> how these factors, such as patient, affect the gene expression. An
> outline of my data frame is shown below:
> 
>     Probe patient region subregion expression
> 1  650349       1      1         1  12.969875
> 2 2510494       1      1         1   9.042255
> 3 2940041       1      1         1   7.010943
> 4 3830112       1      1         1   7.520437
> 5 6560392       1      1         1   7.685423
> 6 1450041       1      1         1   6.595077
> 
> I fit the following:
> 
>   fit <- lme(expression ~ Probe + region + subregion,
>              random = ~1 | patient/region/subregion, data = df)
> 
> I get the following error:
> 
>   Error in model.matrix.default(fixed, data = X) :
>     allocMatrix: too many elements specified
> 
> Here I am investigating whether different subregions within multiple
> different regions nested within various patients affect gene
> expression response. I cannot model the response per *probe* as I'd
> have only one observation then.

  Any chance of a reproducible example
(see http://tinyurl.com/reproducible-000)? For example, post the data
or a subset of the data that reproduces the error somewhere, or
simulate some data accordingly (you can probably replace your
"expression" column with random numbers).

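  A simulated data set along the following lines might be a starting
point. This is only a sketch: the numbers of probes, patients, regions
and subregions are made-up placeholders (the original post does not say
how many levels each factor has), and random numbers stand in for the
expression values.

```r
## Sketch only: all sizes below are assumptions, not taken from the real data
set.seed(101)
n_probe <- 50
n_patient <- 4
n_region <- 3
n_subregion <- 2

## one row per probe x patient x region x subregion combination
df <- expand.grid(Probe = seq_len(n_probe),
                  patient = seq_len(n_patient),
                  region = seq_len(n_region),
                  subregion = seq_len(n_subregion))

## random numbers stand in for the real expression values
df$expression <- rnorm(nrow(df), mean = 8, sd = 2)

str(df)
head(df)
```

Increasing n_probe towards the real number of probes should show whether
the error depends on the size of the design.
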
  I'm a little concerned that 'Probe' is being treated as numeric,
although that might not be relevant to the particular error message.
How many distinct probes and patients are there, and how do they
overlap?  (e.g. multiple probes per patient, or patients per probe,
or ... ?)
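
  Something like the following would answer those questions. It is a
sketch on a tiny invented data frame (the name df and the column names
follow the original post; the values are made up), so substitute the
real data.

```r
## Tiny invented data frame with the same column names as in the post
df <- expand.grid(Probe = c(650349, 2510494, 2940041),
                  patient = 1:2, region = 1:2, subregion = 1:2)

## how is each column stored (numeric, integer, factor)?
sapply(df, class)

## number of distinct probes and patients
n_distinct <- sapply(df[c("Probe", "patient")],
                     function(x) length(unique(x)))
n_distinct

## probe-by-patient overlap: number of observations per combination
overlap <- with(df, table(Probe, patient))
overlap

## if Probe is meant to be categorical, convert it explicitly
df$Probe <- factor(df$Probe)
nlevels(df$Probe)
```

Keep in mind that a factor contributes one model-matrix column per level
(minus one), so a fixed effect with many levels makes the matrix very wide.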

  If it turns out that this is due to a limitation of lme and
not to some syntax and/or data issue, it's possible that switching
to lme4/lmer would help.
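
  For reference, the nested random-effects term carries over to lmer
roughly as below. This is a sketch on invented data, assuming the lme4
package is installed; whether it avoids the allocation error on the full
data set is untested.

```r
## Sketch on invented data; assumes the lme4 package is installed
library(lme4)

set.seed(101)
df <- expand.grid(Probe = factor(1:20), patient = factor(1:4),
                  region = factor(1:2), subregion = factor(1:2))
df$expression <- rnorm(nrow(df), mean = 8, sd = 2)

## lme:  random = ~ 1 | patient/region/subregion
## lmer: the same nesting goes into the formula itself
fit <- lmer(expression ~ Probe + region + subregion +
              (1 | patient/region/subregion), data = df)

## pure-noise data will likely give near-zero variance estimates
## (and possibly a "singular fit" message); that is expected here
VarCorr(fit)
```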

  Ben Bolker