[R] I have a problem
David Winsemius
dwinsemius at comcast.net
Sat Jul 31 17:15:56 CEST 2010
On Jul 31, 2010, at 10:26 AM, 笑啸 wrote:
> dear£ºDavid Winsemius
> in the example£¨nomogram£©£¬I don't understand the meanings
> of the program which have been marked by comment line.And how to
> compile the program.
R is an interpreted language. There is no compilation. You submit
these lines of text to the console by one of:
a) copying and pasting or
b) you bring them in from a text file with source() or
c) in this case you could get them to run automagically by typing:
# require(rms) ... I assume you already have that package loaded
example(nomogram)
> (L <- .4*(sex=='male') + .045*(age-50) +(log(cholesterol -
> 10)-5.2)*(-2*(sex=='female') + 2*(sex=='male'))).
>
>
> example(nomogram)
> n <- 1000 # define sample size
> set.seed(17) # so can reproduce the results
> age <- rnorm(n, 50, 10)
> blood.pressure <- rnorm(n, 120, 15)
> cholesterol <- rnorm(n, 200, 25)
> sex <- factor(sample(c('female','male'), n,TRUE))
>
>
> #---need explanation for code starting here-----
>
>
> # Specify population model for log odds that Y=1
> L <- .4*(sex=='male') + .045*(age-50) +
> (log(cholesterol - 10)-5.2)*(-2*(sex=='female') + 2*(sex=='male'))
This line takes input from all the variables created above and creates
a summary simulation value for each "subject" on the log-odds scale.
It is used as input to a random number function below that returns a 1
or 0 based on whether or not plogis(L) is above or below a uniform
random value between 0 and 1. You can alter the simulation model by
changing the parameters in that L function. If the lrm function is
applied properly it will be able to "recover" or "discover" those
parameter values. For instance there should be an age coefficient of
0.045. The other coefficients might be more difficult to extract
simply, since there is an implied interaction between the cholesterol
and gender variables.
If you run an lrm model with a linear term for age (which is not how
Harrell's examples are set up)
lrm(y ~ age+sex*rcs(cholesterol,4)+blood.pressure)
you get an estimate for age, estimates for an interaction between sex
and cholesterol, and an estimate for blood pressures (which was not
used in construction of the simulated values so the the BP estimate
should be close to zero.)
... and the line for the estimate for the age term is
Coef S.E. Wald Z P
age 0.034953 0.006672 5.24 0.0000
so the "true value of the age effect" (0.045) is within the 95% CI
estimate (0.0349 +/- 1.96*0.0066).
... and the estimate for blood pressure has an estimate whose 95% CI
includes zero, i.e. is not "statistically significant".
Coef S.E. Wald Z P
blood.pressure -0.002083 0.004366 -0.48 0.6333
I suspect you need to read up in your logistic regression texts as
well as reviewing further documents from Harrell's website.:
http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/RmS
>
>
> # ---end----
>
>
> # Simulate binary y to have Prob(y=1) = 1/[1+exp(-L)]
> y <- ifelse(runif(n) < plogis(L), 1, 0)
> ddist <- datadist(age, blood.pressure, cholesterol, sex)
> options(datadist='ddist')
> f <- lrm(y ~ lsp(age,50)+sex*rcs(cholesterol,4)+blood.pressure)
> nom <- nomogram(f, fun=function(x)1/(1+exp(-x)), # or fun=plogis
> fun.at=c(.001,.01,.05,seq(.1,.9,by=.1),.95,.99,.999),
> funlabel="Risk of Death")
> #Instead of fun.at, could have specified fun.lp.at=logit of
> ...
> Thank you for you expain !
>
--
David Winsemius, MD
West Hartford, CT
More information about the R-help
mailing list