[R] simulate binary data from a logistic regression model

Frank E Harrell Jr feh3k at spamcop.net
Thu Oct 9 15:04:18 CEST 2003


On Thu, 9 Oct 2003 11:38:20 +0200 (MEST)
Michele Grassi <grassi at psico.univ.trieste.it> wrote:

> Hi.
> How can I simulate a binary data set from a logistic
> regression model? I need to manipulate the parameters and so
> obtain my data set.
> I want to show the improvement from analyzing binary data
> with a GLM (binomial) model instead of classical ANOVA or
> non-model procedures (relative risk, odds ratio, Pearson
> goodness-of-fit test, ...).
> Can you tell me which function is the right one to use?
> Do you know of any interesting simulations on the web?
> 
> Thank you.
> Michele.

This is from the Overview help file from the Design package:

    n <- 1000    # define sample size
    set.seed(17) # so can reproduce the results
    treat <- factor(sample(c('a','b','c'), n, replace = TRUE))
    num.diseases <- sample(0:4, n, replace = TRUE)
    age <- rnorm(n, 50, 10)
    cholesterol <- rnorm(n, 200, 25)
    weight <- rnorm(n, 150, 20)
    sex <- factor(sample(c('female','male'), n, replace = TRUE))

    # Specify population model for log odds that Y=1
    L <- .1*(num.diseases-2) + .045*(age-50) +
      (log(cholesterol - 10)-5.2)*(-2*(treat=='a') +
          3.5*(treat=='b')+2*(treat=='c'))
    # Simulate binary y to have Prob(y=1) = 1/[1+exp(-L)]
    y <- ifelse(runif(n) < plogis(L), 1, 0)
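To manipulate the parameters directly, the same idea can be reduced to a single predictor. The following is only a minimal sketch in base R, not part of the Design help file: the names b0, b1, x, and fit and the coefficient values are assumptions chosen for illustration. It draws y from a logistic model with known coefficients and then checks that glm() recovers them approximately.

```r
# Minimal sketch: simulate binary y from a logistic model with chosen
# coefficients, then fit a binomial GLM to see whether they are recovered.
# b0, b1, x, and fit are illustrative names, not from the Design package.
set.seed(17)
n  <- 5000
b0 <- -1      # assumed intercept on the log-odds scale
b1 <- 0.8     # assumed slope on the log-odds scale
x  <- rnorm(n)

L <- b0 + b1 * x                             # linear predictor (log odds)
# Bernoulli draws with Prob(y=1) = plogis(L); equivalent in distribution
# to ifelse(runif(n) < plogis(L), 1, 0)
y <- rbinom(n, size = 1, prob = plogis(L))

fit <- glm(y ~ x, family = binomial)
coef(fit)     # estimates should be close to b0 and b1
```

Changing b0 and b1 (or n) and refitting shows how the estimates track the values used to generate the data.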

But note that it's no longer necessary to demonstrate that logistic regression works better than ordinary regression when the response is binary.

---
Frank E Harrell Jr    Professor and Chair            School of Medicine
                      Department of Biostatistics    Vanderbilt University



