[R] how to assign unique ID in a table and do regression

Rui Barradas ruipbarradas at sapo.pt
Mon Aug 6 18:30:43 CEST 2012


Sorry, forgot to Cc the list.

On 06-08-2012 17:29, Rui Barradas wrote:
> Hello,
>
> I'm glad it helped.
>
> The result of cut() is a factor, so you can coerce it to integer, 
> giving more "normal" names. Or, if you want to keep track of the 
> interval each adjusted R2 belongs to, go straight to the last two 
> lines of the following code.
>
>
> #dat1$groups <- as.integer( cut( ...etc... ) )
>
> [...rest of your code... ]
>
> adj <- summary(lin.temp1)$adj.r.squared
> class(adj) <- "list"
>
>
> That's it. The list adj has as names the intervals produced by cut(), 
> the same ones that appear in the output you posted.
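
For the table asked about (group, R2, adjusted R2), a minimal sketch of one possible approach follows. It assumes the lmList fit from nlme can be walked with lapply() as a list of per-group lm fits. dat1 was never posted, so the data here are simulated stand-ins and the numbers themselves mean nothing.

```r
library(nlme)  # recommended package that ships with R

# Simulated stand-in for dat1 (ASSUMPTION: the real data were not posted)
set.seed(1)
dat1 <- data.frame(LATITUDE = runif(90, -56, -48.5),
                   mean_temp = rnorm(90))
dat1$S <- 5 + 2 * dat1$mean_temp + rnorm(90)

# Same grouping idea as in the thread, restricted to three intervals
dat1$groups <- cut(dat1$LATITUDE, seq(-56, -48.5, by = 2.5))
lin.temp1 <- lmList(S ~ mean_temp | groups, data = dat1)

# Summarise each per-group fit, then collect one row per group
fit.summ <- lapply(lin.temp1, summary)
r2.table <- data.frame(
  group         = names(fit.summ),
  r.squared     = sapply(fit.summ, function(s) s$r.squared),
  adj.r.squared = sapply(fit.summ, function(s) s$adj.r.squared),
  row.names     = NULL
)
r2.table
```

With the real data, only the lmList() call and the lines after it would be needed.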
>
> Rui Barradas
>
> On 06-08-2012 17:07, Kristi Glover wrote:
>>
>>
>>
>> Dear Rui,
>> Thanks for the help. I really appreciate it. It helped me out.
>> I modified some of the script you gave me because I found that the 
>> package 'nlme' can also do it. But I do use the script you gave me to 
>> split the data:
>> dat1$groups <- cut(dat1$LATITUDE, seq(-56, 79, by = 2.5))
>> lin.temp1 <- lmList(S ~ mean_temp | groups, data = dat1)
>> Could you please give me an idea of how I can extract the adjusted R2 
>> values and put them in a table?
>> I called summary(), which gave me the adjusted R2 for each group, but 
>> I don't know how to put them in a table (like: group, R2, adjusted 
>> R2).
>>> summary(lin.temp1)$adj.r.squared
>> (-56,-53.5] :
>> [1] 0.2565786
>> (-53.5,-51] :
>> [1] 0.0715485
>> (-51,-48.5] :
>> [1] 0.2265334
>>
>> Thanks
>> Kristi
>>
>>> Date: Sat, 4 Aug 2012 16:15:57 +0100
>>> From: ruipbarradas at sapo.pt
>>> To: kristi.glover at hotmail.com
>>> CC: r-help at r-project.org
>>> Subject: Re: [R] how to assign unique ID in a table and do regression
>>>
>>> Hello,
>>>
>>> Try the following.
>>>
>>>
>>> id.groups <- with(dat, cut(ID, breaks=0:ceiling(max(ID))))
>>> sp <- split(dat, id.groups)
>>> regressors <- grep("en", names(dat))
>>> models <- lapply(sp, function(.df)
>>>       lapply(regressors, function(x) lm(.df[["S"]] ~ .df[[x]])))
>>>
>>> mod.summ <- lapply(models, function(x) lapply(x, summary))
>>> # First R2
>>> mod.r2 <- lapply(mod.summ, function(x) lapply(x, `[[`, "r.squared"))
>>> mod.r2
>>>
>>> # Now p-values
>>> mod.coef <- lapply(mod.summ, function(x) lapply(x, coef))
>>> mod.pvalue <- lapply(mod.coef,  function(x) lapply(x, `[`, , 4))
>>> # p-values in matrix form, columns are 'en2', 'en3', etc
>>> #lapply(mod.pvalue, function(x) do.call(cbind, x))
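
To get from the nested lists above to a single table of R2 and p-values, one possible sketch is shown here. It reuses the ID, S, en2 and en3 columns from the posted data (en4 and en5 are left out only for brevity). The helper fit_one() and the result name res are hypothetical names introduced for illustration. It also assumes the slope p-value is the one wanted, and reports NA where a regressor is constant within a group and lm() cannot estimate a slope.

```r
# Subset of the posted data: ID, S, en2, en3
dat <- data.frame(
  ID = c(0.1, 0.8, 0.1, 1.5, 1.1, 0.9, 1.8, 2.5, 2, 2.5, 2.8, 3, 3.1,
         3.2, 3.9, 1, 4, 4.7, 4.3, 4.9, 2.1, 2.4),
  S = c(4, 7, 9, 10, 10, 8, 8, 8, 17, 18, 13, 13, 11, 1, 10, 20, 22,
        20, 18, 16, 7, 20),
  en2 = rep(c(-2.5767, -2.5347, -2.4939, -2.4543), times = c(7, 6, 7, 2)),
  en3 = c(-1.1785, -0.6596, -0.6145, -0.6437, -0.6593, -0.7811, -1.1785,
          -1.1785, -1.1785, -0.6596, -0.6145, -0.6437, -0.6593, -1.1785,
          -0.1342, -0.2085, -0.4428, -0.5125, -0.8075, -1.1785, -1.1785,
          -0.1342)
)

id.groups  <- cut(dat$ID, breaks = 0:ceiling(max(dat$ID)))
sp         <- split(dat, id.groups)
regressors <- grep("^en", names(dat), value = TRUE)

# Hypothetical helper: fit S ~ x within one group, return a one-row table
fit_one <- function(.df, x) {
  sm <- summary(lm(.df[["S"]] ~ .df[[x]]))
  cf <- coef(sm)
  # A regressor that is constant within the group leaves no slope row
  p <- if (nrow(cf) >= 2) cf[2, 4] else NA_real_
  data.frame(regressor = x, r.squared = sm$r.squared, p.value = p)
}

# One row per (group, regressor) pair
rows <- list()
for (g in names(sp)) {
  for (x in regressors) {
    rows[[length(rows) + 1]] <- cbind(group = g, fit_one(sp[[g]], x))
  }
}
res <- do.call(rbind, rows)
res
```

In the posted data en2 does not vary within the (4,5] group, so that row gets an NA p-value rather than a misleading intercept p-value.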
>>>
>>> Hope this helps,
>>>
>>> Rui Barradas
>>>
>>> On 04-08-2012 15:22, Kristi Glover wrote:
>>>> Hi R users,
>>>> I have a very big data set (5000 rows). I want to make classes 
>>>> based on one column of that table (the column holds continuous 
>>>> data). After conversion, each class would serve as a unique ID, and 
>>>> I want to run a regression for each ID.
>>>> For example, I have this data set:
>>>>> dput(dat)
>>>> structure(list(ID = c(0.1, 0.8, 0.1, 1.5, 1.1, 0.9, 1.8, 2.5,
>>>> 2, 2.5, 2.8, 3, 3.1, 3.2, 3.9, 1, 4, 4.7, 4.3, 4.9, 2.1, 2.4),
>>>>       S = c(4L, 7L, 9L, 10L, 10L, 8L, 8L, 8L, 17L, 18L, 13L, 13L,
>>>>       11L, 1L, 10L, 20L, 22L, 20L, 18L, 16L, 7L, 20L),
>>>>       en2 = c(-2.5767,
>>>>       -2.5767, -2.5767, -2.5767, -2.5767, -2.5767, -2.5767, -2.5347,
>>>>       -2.5347, -2.5347, -2.5347, -2.5347, -2.5347, -2.4939, -2.4939,
>>>>       -2.4939, -2.4939, -2.4939, -2.4939, -2.4939, -2.4543, -2.4543
>>>>       ), en3 = c(-1.1785, -0.6596, -0.6145, -0.6437, -0.6593, -0.7811,
>>>>       -1.1785, -1.1785, -1.1785, -0.6596, -0.6145, -0.6437, -0.6593,
>>>>       -1.1785, -0.1342, -0.2085, -0.4428, -0.5125, -0.8075, -1.1785,
>>>>       -1.1785, -0.1342), en4 = c(-1.4445, -1.3645, -1.1634, -0.7735,
>>>>       -0.6931, -1.1105, -1.4127, -1.5278, -1.4445, -1.3645, -1.1634,
>>>>       -0.7735, -0.6931, -1.0477, -0.8655, -0.1759, 0.1203, -0.2962,
>>>>       -0.4473, -1.0436, -0.9705, -0.8953), en5 = c(-0.4783, -0.3296,
>>>>       -0.2026, -0.3579, -0.5154, -0.5726, -0.6415, -0.3996, -0.4529,
>>>>       -0.5762, -0.561, -0.6891, -0.7408, -0.6287, -0.4337, -0.4586,
>>>>       -0.5249, -0.6086, -0.7076, -0.7114, -0.4952, 0.1091)),
>>>> .Names = c("ID", "S", "en2", "en3", "en4", "en5"),
>>>> class = "data.frame", row.names = c(NA, -22L))
>>>>
>>>> Here ID has continuous values. I want to make groups with values 
>>>> 0-1, 1-2, 2-3, 3-4 from the column ID,
>>>> and then run a regression with S (dependent variable) and en2 
>>>> (independent variable), then a regression of S on en3, and so on.
>>>> After that, I want a table with R2 and p-value.
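
The grouping step described here can be sketched with cut(). The short ID vector below is only a handful of the posted values, and the names grp and grp.left are made up for illustration. Note that by default cut() closes intervals on the right, so an ID of exactly 1 lands in (0,1] and not in the 1-2 group; right = FALSE gives the other convention.

```r
# A few of the ID values from the post
ID <- c(0.1, 0.8, 1, 1.5, 2, 2.5, 3.9, 4)

# Default: right-closed intervals (0,1], (1,2], (2,3], (3,4]
grp <- cut(ID, breaks = 0:4)

# Alternative: left-closed intervals [0,1), [1,2), ..., [4,5)
grp.left <- cut(ID, breaks = 0:5, right = FALSE)

# Integer codes give plain group numbers if interval labels are not needed
data.frame(ID, grp, code = as.integer(grp), grp.left)
```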
>>>>
>>>> Could you help me with how to do it? I was trying it manually, but 
>>>> it took so much time, so I thought I would write to ask for your help.
>>>>
>>>> Thanks for your help.
>>>> Kristi


