[R] how to assign unique ID in a table and do regression

Rui Barradas ruipbarradas at sapo.pt
Sat Aug 4 17:15:57 CEST 2012


Hello,

Try the following.


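# Bin ID into unit-width classes (0,1], (1,2], ...; these become the group IDs.
# Intervals are closed on the right, so an ID of exactly 0 would become NA.
id.groups <- with(dat, cut(ID, breaks=0:ceiling(max(ID))))
sp <- split(dat, id.groups)
# Column indices of the regressors ('en2', 'en3', ...)
regressors <- grep("^en", names(dat))
# For each group, regress S on each regressor separately
models <- lapply(sp, function(.df)
     lapply(regressors, function(x) lm(.df[["S"]] ~ .df[[x]])))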

mod.summ <- lapply(models, function(x) lapply(x, summary))
# First R2
mod.r2 <- lapply(mod.summ, function(x) lapply(x, `[[`, "r.squared"))
mod.r2

# Now p-values
mod.coef <- lapply(mod.summ, function(x) lapply(x, coef))
mod.pvalue <- lapply(mod.coef,  function(x) lapply(x, `[`, , 4))
# p-values in matrix form, columns are 'en2', 'en3', etc.
#lapply(mod.pvalue, function(x) do.call(cbind, x))
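
Since you asked for a single table with R2 and p-values, here is one possible way to collect everything into a data frame. This is only a sketch: the helper name fit_one and the result columns (group, regressor, n, r.squared, p.value) are my own choices, not anything built in. It uses reformulate() together with the data argument so that the coefficient rows are named after the regressor, and it returns NA for the p-value when a regressor is constant within a group, because lm() then drops that slope from the coefficient table. In the posted data this happens for en2 in the (4,5] group.

```r
# Data copied from the original post (dput output rewritten as data.frame()).
dat <- data.frame(
  ID = c(0.1, 0.8, 0.1, 1.5, 1.1, 0.9, 1.8, 2.5, 2, 2.5, 2.8, 3, 3.1, 3.2,
         3.9, 1, 4, 4.7, 4.3, 4.9, 2.1, 2.4),
  S = c(4L, 7L, 9L, 10L, 10L, 8L, 8L, 8L, 17L, 18L, 13L, 13L, 11L, 1L, 10L,
        20L, 22L, 20L, 18L, 16L, 7L, 20L),
  en2 = c(-2.5767, -2.5767, -2.5767, -2.5767, -2.5767, -2.5767, -2.5767,
          -2.5347, -2.5347, -2.5347, -2.5347, -2.5347, -2.5347, -2.4939,
          -2.4939, -2.4939, -2.4939, -2.4939, -2.4939, -2.4939, -2.4543,
          -2.4543),
  en3 = c(-1.1785, -0.6596, -0.6145, -0.6437, -0.6593, -0.7811, -1.1785,
          -1.1785, -1.1785, -0.6596, -0.6145, -0.6437, -0.6593, -1.1785,
          -0.1342, -0.2085, -0.4428, -0.5125, -0.8075, -1.1785, -1.1785,
          -0.1342),
  en4 = c(-1.4445, -1.3645, -1.1634, -0.7735, -0.6931, -1.1105, -1.4127,
          -1.5278, -1.4445, -1.3645, -1.1634, -0.7735, -0.6931, -1.0477,
          -0.8655, -0.1759, 0.1203, -0.2962, -0.4473, -1.0436, -0.9705,
          -0.8953),
  en5 = c(-0.4783, -0.3296, -0.2026, -0.3579, -0.5154, -0.5726, -0.6415,
          -0.3996, -0.4529, -0.5762, -0.561, -0.6891, -0.7408, -0.6287,
          -0.4337, -0.4586, -0.5249, -0.6086, -0.7076, -0.7114, -0.4952,
          0.1091)
)

# Same grouping as above: unit-width, right-closed intervals of ID.
id.groups <- with(dat, cut(ID, breaks = 0:ceiling(max(ID))))
sp <- split(dat, id.groups)
# Regressor names rather than column indices, so they can go into a formula.
regressors <- grep("^en", names(dat), value = TRUE)

# Hypothetical helper: fit S ~ x within one group, return a one-row data frame.
fit_one <- function(.df, group, x) {
  fit <- lm(reformulate(x, response = "S"), data = .df)
  smry <- summary(fit)
  cf <- coef(smry)
  # The slope row is absent when x is constant within the group.
  p <- if (x %in% rownames(cf)) cf[x, "Pr(>|t|)"] else NA_real_
  data.frame(group = group, regressor = x, n = nrow(.df),
             r.squared = smry$r.squared, p.value = p)
}

# One row per (group, regressor) pair, stacked into a single table.
res <- do.call(rbind, lapply(names(sp), function(g)
  do.call(rbind, lapply(regressors, function(x) fit_one(sp[[g]], g, x)))))
rownames(res) <- NULL
res
```

With only 3 to 6 rows per group in the example data, the p-values are not very informative; with the full 5000 rows the same code should apply unchanged, provided the column names match.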

Hope this helps,

Rui Barradas

On 04-08-2012 15:22, Kristi Glover wrote:
> Hi R users,
> I have a very big data set (5000 rows). I want to make classes based on one column of that table, which holds continuous data. After the conversion, each class would serve as a unique ID, and I want to run a regression for each ID.
> For example, I have this data set:
>> dput(dat)
> structure(list(ID = c(0.1, 0.8, 0.1, 1.5, 1.1, 0.9, 1.8, 2.5,
> 2, 2.5, 2.8, 3, 3.1, 3.2, 3.9, 1, 4, 4.7, 4.3, 4.9, 2.1, 2.4),
>      S = c(4L, 7L, 9L, 10L, 10L, 8L, 8L, 8L, 17L, 18L, 13L, 13L,
>      11L, 1L, 10L, 20L, 22L, 20L, 18L, 16L, 7L, 20L), en2 = c(-2.5767,
>      -2.5767, -2.5767, -2.5767, -2.5767, -2.5767, -2.5767, -2.5347,
>      -2.5347, -2.5347, -2.5347, -2.5347, -2.5347, -2.4939, -2.4939,
>      -2.4939, -2.4939, -2.4939, -2.4939, -2.4939, -2.4543, -2.4543
>      ), en3 = c(-1.1785, -0.6596, -0.6145, -0.6437, -0.6593, -0.7811,
>      -1.1785, -1.1785, -1.1785, -0.6596, -0.6145, -0.6437, -0.6593,
>      -1.1785, -0.1342, -0.2085, -0.4428, -0.5125, -0.8075, -1.1785,
>      -1.1785, -0.1342), en4 = c(-1.4445, -1.3645, -1.1634, -0.7735,
>      -0.6931, -1.1105, -1.4127, -1.5278, -1.4445, -1.3645, -1.1634,
>      -0.7735, -0.6931, -1.0477, -0.8655, -0.1759, 0.1203, -0.2962,
>      -0.4473, -1.0436, -0.9705, -0.8953), en5 = c(-0.4783, -0.3296,
>      -0.2026, -0.3579, -0.5154, -0.5726, -0.6415, -0.3996, -0.4529,
>      -0.5762, -0.561, -0.6891, -0.7408, -0.6287, -0.4337, -0.4586,
>      -0.5249, -0.6086, -0.7076, -0.7114, -0.4952, 0.1091)), .Names = c("ID",
> "S", "en2", "en3", "en4", "en5"), class = "data.frame", row.names = c(NA,
> -22L))
>
> Here ID has continuous values. I want to make groups with values 0-1, 1-2, 2-3, 3-4 from the column ID.
> Then I want to run a regression with S (dependent variable) and en2 (independent variable), then again a regression of S on en3, and so on.
> After that, I want a table with the R2 and p-value.
>
> Could you help me with how to do this? I was trying it manually, but it took too much time, so I thought I would write to ask for your help.
>
> Thanks for your help.
> Kristi