[R] two questions about regression models and clustering routines

Maura E Monville maura.monville at gmail.com
Thu Jun 5 21:46:01 CEST 2008


I managed to use an example (see attachment) of clever regression
routines, which I customized to suit my needs.
The initial model I try to fit consists of the first 10 powers of time
(the time at which the observation was recorded) and the first 10 powers
of the phase. My files record patients' breathing signals as a sequence
of breathing cycles. Within every cycle, the sampled phase (inhale -
exhale) is mapped to an angle in the range [0, 2*pi].
I have two questions.

1. Surprisingly (for me), for some files the summary of the regular lm
command shows a number of non-significant coefficients (those for
which the value in the "Pr(>|t|)" column is > 0.05).
    But after running the step command on the model output from lm, I
see that all 20 coefficients have become significant. This astonishes
me because I have always thought that step would prune the model,
stripping it of the non-significant coefficients.
    So I was thinking of submitting the model output from "step" to the Cp
test anyway. As implemented right now, the Cp stage is run only
if the model output from "step" still has some non-significant
coefficients.
    Your thoughts?
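
    To make the workflow in question 1 concrete, here is a minimal sketch
on simulated data (not my real files). All variable names are made up for
illustration, and I use orthogonal polynomial columns in place of raw
powers only to keep the fit numerically stable; my actual script is in
the attachment and may differ.

```r
# Simulated stand-in for one breathing file (all names are hypothetical)
set.seed(1)
n      <- 500
time   <- seq(0, 1, length.out = n)                      # rescaled recording time
phase  <- seq(0, 20 * pi, length.out = n) %% (2 * pi)    # phase angle per sample
signal <- sin(phase) + 0.5 * time + rnorm(n, sd = 0.1)

# 10 polynomial terms in time and 10 in phase, as 20 separate predictors
# (assumption: orthogonal polynomials instead of raw powers)
X <- cbind(poly(time, 10), poly(phase, 10))
colnames(X) <- c(paste0("t", 1:10), paste0("p", 1:10))
dat <- data.frame(signal, X)

full    <- lm(signal ~ ., data = dat)
reduced <- step(full, trace = 0)   # default stepwise search, which compares models by AIC

# p-values of the terms that survive step()
pvals <- summary(reduced)$coefficients[, "Pr(>|t|)"]

# Flag that decides whether the Cp stage would be run (intercept excluded)
needs_cp <- any(pvals[-1] > 0.05)
```

    Here needs_cp mirrors the condition under which my current code runs
the Cp stage.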

2. The regression model coefficients, stored in the first 20 columns
of matrix rg, are used to calculate a distance matrix that is then
input to clustering routines.
    I am writing a more sophisticated clustering algorithm that uses PAM.
    The 21st column of matrix rg stores the file ID, which is
obviously not used in the distance evaluation.
    I would like to attach the file IDs as labels visible in
each cluster. The description of the dist command mentions "Labels", but
it does not explain clearly how the observation labels can be saved
in the distance matrix and then displayed in the cluster plot.
    Can you please help me with that?
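
    For reference, a minimal sketch of the setup in question 2, with
simulated numbers in place of my coefficients. It assumes that dist takes
its "Labels" attribute from the row names of its input, and that pam and
hclust carry those labels through; the file IDs 101..112 and k = 3 are
arbitrary choices for illustration.

```r
library(cluster)   # provides pam()

set.seed(2)
# Simulated stand-in for rg: 20 coefficient columns plus a file ID in column 21
n_files <- 12
rg <- cbind(matrix(rnorm(n_files * 20), nrow = n_files), 101:112)

coefs <- rg[, 1:20]              # the ID column is left out of the distances
rownames(coefs) <- rg[, 21]      # file IDs become (character) row names

d <- dist(coefs)
attr(d, "Labels")                # row names carried along by dist()

fit <- pam(d, k = 3)             # k = 3 is an arbitrary choice
split(names(fit$clustering), fit$clustering)   # file IDs grouped by cluster

hc <- hclust(d)
plot(hc)                         # dendrogram leaves labelled with the file IDs
```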

Thank you in advance
Best regards,
--
Maura E.M
Attachment: Build-Regression-Model.txt
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20080605/382ed80d/attachment.txt>