[R] How to improve, at all, a simple GLM code
Ben Bolker
bbolker at gmail.com
Fri Mar 30 22:26:56 CEST 2012
On 12-03-30 12:40 PM, Clifton, Abigail J. wrote:
> Hi again!
>
> Thanks very much for the code, it appears to work! Finally, I want
> to extract the coefficients and tried coef(g1), which works.
> However, there only appear to be intercepts/coefficients for 'V22N'
> out of thousands of possibilities, which are all displayed as
> dots/NaN. Is there a way of getting more coefficients - perhaps by
> changing lambda or something like that? Is it also possible to
> print the final 'model'?
I'm afraid I'm out of time right now -- cc'ing to r-help in case
someone else has the time and energy to help. All I can suggest is
that you spend some time reading through all of the documentation for
the package (start with help(package="glmnet"), browse through all
the help pages, run the examples, etc.). Unfortunately there is no
general-purpose vignette for that package ... an entire book on the
subject is available online
http://www-stat.stanford.edu/~tibs/ElemStatLearn/ , but that won't
provide quick answers ...
Ben Bolker
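
For anyone following up on the coefficient question, here is a minimal
sketch of what the glmnet help pages describe. It uses simulated data in
place of Prepared_Data.csv, so x, y, fit and cvfit are illustrative names
rather than objects from this thread. coef() on a glmnet fit returns a
sparse matrix with one column per lambda value, and entries printed as "."
are coefficients that the penalty has shrunk exactly to zero. The sketch
does not address the NaN values reported above.

```r
library(Matrix)
library(glmnet)

set.seed(1)
# Simulated stand-in data (an assumption; not the poster's data set)
n <- 200; p <- 50
x <- matrix(rnorm(n * p), n, p)
y <- rbinom(n, 1, plogis(x[, 1] - x[, 2]))

fit <- glmnet(x, y, family = "binomial")

# "Printing the model": one row per step on the lambda path,
# with Df (number of non-zero terms), %Dev and Lambda
print(fit)

# Coefficients at a single penalty value, chosen via s;
# a smaller s usually leaves more terms non-zero
b_small <- as.matrix(coef(fit, s = min(fit$lambda)))
b_large <- as.matrix(coef(fit, s = max(fit$lambda)))

# Let cross-validation choose lambda, then keep only the non-zero terms
cvfit <- cv.glmnet(x, y, family = "binomial")
b <- as.matrix(coef(cvfit, s = "lambda.min"))
nonzero <- b[b[, 1] != 0, , drop = FALSE]
print(nonzero)
```

With the objects from the thread, the analogous calls would be
coef(g1, s = some_lambda) and cv.glmnet(m, Prepared_Data$C3,
family="binomial"), where some_lambda is a placeholder for a value taken
from g1$lambda.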
> Kind regards,
>
> Abigail
>
>
> -----Original Message-----
> From: Ben Bolker <bbolker at gmail.com>
> Sender: r-help-bounces at r-project.org
> Date: Fri, 30 Mar 2012 02:58:04
> To: <r-help at stat.math.ethz.ch>
> Subject: Re: [R] How to improve, at all, a simple GLM code
>
> Abigail Clifton <abigailclifton <at> me.com> writes:
>
>> I am wanting to find a good predictive model, yes. It's part of a
>> project so if I have time after finding the model I may want to
>> find some patterns but it's not a priority. I just want the
>> model for now (I need the coefficients above all).
>
>> It's all categorical data, I categorised any continuous data
>> before I started trying to fit the glm.
>
> That's not necessarily a good idea (categorising often loses power
> relative to fitting something like an additive model), but OK.
>
>
>> I was unsure of how to get the csv file to you; however, I have
>> uploaded it and it should be available for download from here:
>> http://www.filedropper.com/prepareddata
>
> Here's how far I got:
>
> Prepared_Data <- na.omit(read.csv("Prepared_Data.csv", header=TRUE))
> pd <- Prepared_Data[,-3]  ## data minus response variable
>
> ## how many levels per variable?
> lev <- sapply(pd, function(x) length(unique(x)))
>
> ## total parameters for n variables
> par(las=1, bty="l")
> plot(cumprod(lev), log="y")
>
> library(Matrix)
> m <- sparse.model.matrix(~.^2, data=pd)  ## slower than model.matrix
> ncol(m)  ## 8352 columns (!!)
>
> library(glmnet)
> g1 <- glmnet(m, Prepared_Data$C3, family="binomial")
>
> This doesn't appear to work properly, yet (I get funny values),
> but it's the direction I would go ...
>