[R] Help : glm p-values for a factor predictor
Benoît PELE
benoit.pele at acoss.fr
Thu Jun 29 15:00:18 CEST 2017
Thank you for your answer.
The code I used is as follows:
champ_model<-c("y","categ_juridique","Indic_CTRLAUTRE_RPOS","Indic_CTRLAUTRE_RNEG","Indic_CTRLCCA_RPOS",
"Indic_CTRLCCA_RNEG","Indic_CTRLCPAP_RPOS","Indic_CTRLCPAP_RNEG","Indic_CTRLLCTI_RPOS",
"Indic_Changement_NomLogiciel","Indic_Changement_NomEditeur","Changt_NomEditeurPaie",
"Changt_NomLogicielPaie","Infoabs_NomEditeurPaie","Infoabs_NomLogicielPaie",
"Indic_Decla_comple","Indic_Decla_AnnuRempl","class_ape","class_Logiciel","class_Editeur",
"moda_delai_soldeN_1","moda_delai_soldeN_2","moda_delai_soldeN_3","moda_delai_soldeN_4",
"moda_delai_soldeN_5",
"moda_anciennete_debitN_1","moda_anciennete_debitN_2","moda_anciennete_debitN_3",
"moda_anciennete_debitN_4","moda_anciennete_debitN_5",
"moda_moy_anciennete_debit","moda_std_anciennete_debit",
"moda_moy_delai_solde","moda_std_delai_solde",
var_cluster_Arome,var_cluster_BRC,var_cluster_Cedre,var_cluster_cntx2,var_cluster_ctrl,
var_cluster_DADS_assiette2,var_cluster_DADS_avantage2,var_cluster_DADS_contrat2,
var_cluster_DADS_salarie2,var_cluster_Sequoia)
--> The predictors in quotes (except y) are qualitative; the others
are groups of continuous predictors.
Var_model<-paste0("y ~ ", paste(champ_model_cont[-1],collapse=" + "))
Logit_appr<-glm(formula=Var_model,family=binomial(link="logit"),data=pop_ctrl_siren_cca2017_appr)
--> The results of this glm do not provide an overall p-value for each
qualitative predictor, only one p-value per level. To select the
qualitative predictors, I need the overall p-value that SAS, for example,
provides with PROC LOGISTIC.
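
One way to get a single p-value per term in base R is drop1() with test = "Chisq", which refits the model without each term in turn and reports a likelihood-ratio test for it. The following is only a minimal sketch on simulated data: the names toy, grp, x and z and the stopping rule are hypothetical and are not taken from the model above.

```r
# Simulated data (hypothetical): 'grp' is a 4-level factor, 'x' and 'z' are continuous.
set.seed(1)
n   <- 500
grp <- factor(sample(c("a", "b", "c", "d"), n, replace = TRUE))
x   <- rnorm(n)
z   <- rnorm(n)
eta <- -0.5 + 0.8 * x + c(a = 0, b = 0.6, c = -0.4, d = 0.2)[as.character(grp)]
toy <- data.frame(y = rbinom(n, size = 1, prob = plogis(eta)),
                  grp = grp, x = x, z = z)

fit <- glm(y ~ grp + x + z, family = binomial(link = "logit"), data = toy)

# summary(fit) gives one Wald p-value per non-reference level of 'grp'.
# drop1() gives one likelihood-ratio test per term, so 'grp' gets a
# single p-value on 3 degrees of freedom.
lrt <- drop1(fit, test = "Chisq")
print(lrt)

# Hand-rolled backward elimination: repeatedly drop the term with the
# largest p-value until 'n_keep' terms remain (assumed stopping rule).
n_keep <- 2
cur    <- fit
while (length(attr(terms(cur), "term.labels")) > n_keep) {
  tab   <- drop1(cur, test = "Chisq")
  worst <- rownames(tab)[which.max(tab[["Pr(>Chi)"]])]  # the NA in the <none> row is ignored
  cur   <- update(cur, as.formula(paste(". ~ . -", worst)))
}
kept <- attr(terms(cur), "term.labels")
print(kept)
```

Each call to drop1() refits one model per remaining term, so on a dataset with many rows and predictors every elimination step can still be expensive.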
Benoit Pelé.
From: "Bob O'Hara" <rni.boh at gmail.com>
To: Benoît PELE <benoit.pele at acoss.fr>,
Cc: r-help <r-help at r-project.org>
Date: 29/06/2017 11:46
Subject: Re: [R] Help : glm p-values for a factor predictor
It might help if you provided the code you used. It's possible that
you didn't use direction="backward" in stepAIC(). Or if you did, it
was still running, so whatever else you try will still be slow. The
statement "R provides only the pvalues for each level" is wrong: look
at the anova() function.
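
As an illustration of the anova() route, the sketch below compares a model with and without a factor using a likelihood-ratio (chi-squared) test, which yields one p-value for the factor as a whole. The data are simulated and the names toy, f, x1 and x2 are hypothetical.

```r
# Simulated data (hypothetical): 'f' is a 3-level factor, 'x1' and 'x2' are continuous.
set.seed(2)
n   <- 400
toy <- data.frame(
  f  = factor(sample(c("u", "v", "w"), n, replace = TRUE)),
  x1 = rnorm(n),
  x2 = rnorm(n)
)
toy$y <- rbinom(n, size = 1,
                prob = plogis(0.3 * toy$x1 + ifelse(toy$f == "w", 0.7, 0)))

full    <- glm(y ~ f + x1 + x2, family = binomial(link = "logit"), data = toy)
reduced <- update(full, . ~ . - f)   # same model without the factor

# Row 2 of the table tests the whole factor: 2 degrees of freedom
# for 3 levels, with a single p-value.
cmp <- anova(reduced, full, test = "Chisq")
print(cmp)
p_f <- cmp[2, "Pr(>Chi)"]
```

Calling anova(full, test = "Chisq") on the single fitted model instead gives sequential tests, where each term is tested after the terms listed before it in the formula, so the result depends on term order.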
Bob
On 29 June 2017 at 11:13, Benoît PELE <benoit.pele at acoss.fr> wrote:
> Hello,
>
> I am a newbie with R and I am trying to do a backward selection on a
> binomial-logit glm on a large dataset (69000 rows and 145 predictors).
>
> After 3 days of running, the stepAIC function had not terminated. I do not
> know if that is normal, but I would like to try computing a "homemade"
> backward selection with repeated glm fits; at each step, the predictor
> with the largest p-value would be excluded until reaching a set of,
> for example, 20 predictors.
>
> My question is about the factor predictors with several levels. R
> provides only the p-values for each level, whereas I need an overall
> p-value for testing the predictor.
>
> On the internet, the only solution I found suggests computing a
> chi-squared likelihood-ratio test between the complete model and the
> model without the factor predictor to assess its relevance.
>
> Do you know of other ways, or of another R package that handles this kind of issue?
>
> Thank you and best regards, Benoit.
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
Bob O'Hara
NOTE NEW ADDRESS!!!
Institutt for matematiske fag
NTNU
7491 Trondheim
Norway
Mobile: +49 1515 888 5440
Journal of Negative Results - EEB: www.jnr-eeb.org