[R] Linear models (lme4) - basic question

Thu Sep 2 21:01:37 CEST 2010

Perhaps even more to the point, "covariate adjustment" and
"classification" should not be separate. One should fit the
appropriate model that does both.
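
As a minimal sketch of what fitted() and residuals() return for a model of
the kind discussed below: this is an illustration on simulated data, not
James's actual example. The data frame `d`, the column names `expr`, `age`
and `beadchip`, and all effect sizes are made up, and
predict(..., re.form = NA) assumes a reasonably current version of lme4.

```r
library(lme4)

set.seed(1)

# Simulated data (hypothetical): 6 beadchips, 10 samples per chip
n_chip <- 6
n_per  <- 10
d <- data.frame(
  beadchip = factor(rep(seq_len(n_chip), each = n_per)),
  age      = runif(n_chip * n_per, min = 20, max = 70)
)
chip_shift <- rnorm(n_chip, sd = 1)   # per-chip batch shift
d$expr <- 5 + 0.03 * d$age +
  chip_shift[as.integer(d$beadchip)] +
  rnorm(nrow(d), sd = 0.5)

# age as a fixed effect, beadchip as a random intercept
fit <- lmer(expr ~ age + (1 | beadchip), data = d)

cond_fit <- fitted(fit)                # fixed effects + estimated chip effects
pop_fit  <- predict(fit, re.form = NA) # fixed effects only ("population level")
adj      <- residuals(fit)             # observed minus cond_fit

head(cbind(observed = d$expr, cond_fit, pop_fit, adj))
```

If residuals() is what is wanted, note that it removes the intercept along
with the estimated age and beadchip effects. Whether that is a suitable
input for a classifier is a separate question.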

-- Bert

On Thu, Sep 2, 2010 at 11:34 AM, Ben Bolker <bbolker at gmail.com> wrote:
> On 10-09-02 02:26 PM, James Nead wrote:
>> My apologies - I have made this more confusing than it needs to be.
>>
>> I have microarray gene expression data that I want to use for
>> classification algorithms. However, I want to 'adjust' the data for
>> all confounding factors (such as age, experiment number etc.) before
>> I can use the data as input for the classification algorithms. Since
>> the phenotype is known to be affected by age, I thought that this
>> would be a fixed effect whereas something like 'beadchip' would be a
>> random effect.
>>
>> Should I be looking at something else for this?
>>
>
>  Sounds to me as though you should use residuals() rather than fitted()
> if you want to "adjust for confounding factors".
>
>  But since you've made up a nice small example, I think you should look
> at the results of fitted() and residuals() for your example and see
> whether they do what you want.
>>
>>
>> ------------------------------------------------------------------------
>> *From:* Ben Bolker <bbolker at gmail.com>
>> *To:* r-help at stat.math.ethz.ch
>> *Sent:* Thu, September 2, 2010 2:06:47 PM
>> *Subject:* Re: [R] Linear models (lme4) - basic question
>>
>> James Nead <james_nead <at> yahoo.com> writes:
>>
>> >
>> > Sorry, forgot to mention that the processed data will be used as
>> > input for a classification algorithm. So, I need to adjust for known
>> > effects before I can use the data.
>> >
>> > > I am trying to adjust raw data for both fixed and mixed effects.
>> > > The data that I output should account for these effects, so that I
>> > > can use the adjusted data for further analysis.
>> > >
>> > > For example, if I have the blood sugar levels for 30 patients, and
>> > > I know that 'weight' is a fixed effect and that 'height' is a random
>> > > effect, what I'd want as output is blood sugar levels that have been
>> > > adjusted for these effects.
>>
>>   What's not clear to me is what you mean by 'adjusted for'.
>> fitted(lm.adj) will give predicted values based on the height
>> and weight. I don't really know what the justification for/meaning
>> of the adjustment is, so I don't know whether you want to predict
>> on the basis of the heights, or whether you want to get a
>> 'population-level' prediction, i.e. one with height effects set to
>> zero.  Maybe you want residuals(lm.adj) ...?
>>
>>   I suggest that follow-ups go to r-sig-mixed-models at r-project.org
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>

-- 
Bert Gunter
Genentech Nonclinical Biostatistics
467-7374
http://devo.gene.com/groups/devo/depts/ncb/home.shtml