[R] Pseudo R squared for quantile regression with replicates

Donia Smaali Bouhlila sma.ali at fsjegj.rnu.tn
Fri Sep 19 09:20:02 CEST 2014


Thank you for considering my email and replying.
Well, I am working with TIMSS 2007  survey data ( Trends in 
international mathematics and science study).

TIMSS is a stratified DATA, where the primary sampling unit is the 
school
I want to test the effect of some independent variables reflecting the 
socio-economic status of students on math achievement



I have 18 independent variables :Female+Age+calculator+computer+desk+
dictionary+internet+work+Book2+Book3+Book4+Book5+Pedu1+Pedu2+Pedu3+Pedu4+Born1+Born2


All my variables are binary variables except for age, and of course the 
dependent variable wich is the achievement score of the student (Math1 
in my regression)

I have a sample of 3169 observations (i.e students, 3169 achievement 
scores)

Data is missing at 23%.

The idea is to test the effect of these variables (known to reflect the 
socio-economic status of students) at the lower and upper quantiles 
taking into account the survey design features (sampling weight (which 
is in my case the studnt weight) clustering and stratification)

Why do I need th psudo-R squared? Because I want to calculate the 
increase in the pseudo r squared after introducing the school variales: 
I explain
I will do first regressions with the socio-economic variables (SES), and 
I calculate the pseudo R squared
In a second stage, I will introduce the school variables ( which are 
also binary variables), after controlling for socio-economic status, and 
calculate the pseudo R squared. The difference between the two pseudo R 
squared will tell me if SES variables account more or less  than school 
variables for students' achievement.


Why with replicates? To eliminate the intra-cluster correlations between 
schools.

I ran the following commands successfully:


mydesign
<-svydesign(ids=~IDSCHOOL,strata=~IDSTRATE,data=TUN,nest=TRUE,weights=~TOTWGT)

> bootdesign <- as.svrepdesign(mydesign,type="auto",replicates=150)

>   
> fit<-withReplicates(bootdesign,quote(coef(rq(Math1~Female+Age+calculator+computer+desk+
+dictionary+internet+work+Book2+Book3+Book4+Book5+Pedu1+Pedu2+Pedu3+Pedu4+Born1+Born2,tau=0.9,weights=.weights,
>> method="fn"))))



>> I want get the pseudo R squared but I failed. I read a query dating 
>> from
  August 2006, [R] Pseudo R for Quant Reg and the answer to it:

# from https://stat.ethz.ch/pipermail/r-help/2006-August/110386.html

>> rho <- function(u,tau=.5)u*(tau - (u < 0))
>>   V <- sum(rho(f$resid, f$tau))


The structure of my fit object is the following:


>> str (fit)
>> Class 'svrepstat'  atomic [1:19] 713.24 -24.01 -18.37 9.05 7.71
>> ...
>>    ..- attr(*, "var")= num [1:19, 1:19] 2839.3 10.2 -122.1 -332.4
>> -42.3
>> ...
>>    .. ..- attr(*, "dimnames")=List of 2
>>    .. .. ..$ : chr [1:19] "(Intercept)" "Female" "Age"
>> "calculator" ...
>>    .. .. ..$ : chr [1:19] "(Intercept)" "Female" "Age"
>> "calculator" ...
>>    .. ..- attr(*, "means")= Named num [1:19] 710.97 -24.03 -18.3
>> 9.39
>> 7.58 ...
>>    .. .. ..- attr(*, "names")= chr [1:19] "(Intercept)" "Female"
>> "Age"
>> "calculator" ...
>>    ..- attr(*, "statistic")= chr "theta"



Thank you in advance for replying and for your help







Le 2014-09-19 04:19, Anthony Damico a écrit :
> here is a reproducible example, mostly from ?withReplicates.  i think
> something would have to be done using return.replicates=TRUE to
> manually compute survey-adjusted residuals, but i'm not really sure
> what nor whether the pseudo r^2 would be meaningful  :/
> 
> library(survey)
> library(quantreg)
> 
> data(api)
> 
> ## one-stage cluster sample
> dclus1<-svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc)
> 
> ## convert to bootstrap
> bclus1<-as.svrepdesign(dclus1,type="bootstrap", replicates=100)
>  
> ## median regression
> fit <- withReplicates(bclus1, quote(coef(rq(api00~api99, tau=0.5,
> weights=.weights,method="fn"))))
> 
> # # # no longer from ?withReplicates # # #
> # from https://stat.ethz.ch/pipermail/r-help/2006-August/110386.html
> [3]
> 
> rho <- function(u,tau=.5)u*(tau - (u < 0))
> 
> V <- sum(rho(fit$resid, fit$tau)) # # breaks
> 
> On Thu, Sep 18, 2014 at 1:55 PM, David L Carlson <dcarlson at tamu.edu>
> wrote:
> 
>> It is hard to say because we do not have enough information. R has
>> approximately 6,000 packages and you have not told us which ones you
>> are using. You have not told us much about your data and you have
>> not told us where to find the query from August 2006. The basic
>> problem is that your "fit" is not the same as the "f" in the query.
>> Your fit object is not very complicated. If you look at the output
>> from str(fit) you will see that fit is an "atomic" vector (note the
>> wording in your error message) with a series of attributes that are
>> probably documented in the help pages for the functions you are
>> using. There is nothing called resid inside fit. It is likely that
>> the post you are looking at refers to the output from rq(...) or
>> perhaps predict(rq(...)), but not the output from
>> withReplicates(..., quote(coef(rq(...)))) which is what fit is.
>> 
>> -------------------------------------
>> David L Carlson
>> Department of Anthropology
>> Texas A&M University
>> College Station, TX 77840-4352
>> 
>> -----Original Message-----
>> From: r-help-bounces at r-project.org
>> [mailto:r-help-bounces at r-project.org] On Behalf Of Donia Smaali
>> Bouhlila
>> Sent: Thursday, September 18, 2014 9:54 AM
>> To: r-help at r-project.org
>> Subject: [R] Pseudo R squared for quantile regression with
>> replicates
>> 
>> Hi,
>> 
>> I am a new user of r software. I intend to do quantile regressions
>> with
>> complex survey data using replicate method. I have ran the
>> following
>> commands successfully:
>> 
>>   mydesign
>> 
> <-svydesign(ids=~IDSCHOOL,strata=~IDSTRATE,data=TUN,nest=TRUE,weights=~TOTWGT)
>> bootdesign <- as.svrepdesign(mydesign,type="auto",replicates=150)
>> 
>>   fit<-
>> 
> withReplicates(bootdesign,quote(coef(rq(Math1~Female+Age+calculator+computer+desk+
>> +
>> 
> dictionary+internet+work+Book2+Book3+Book4+Book5+Pedu1+Pedu2+Pedu3+Pedu4+Born1+Born2,tau=0.5,weights=.weights,
>> method="fn"))))
>> 
>> I want get the pseudo R squared but I failed. I read a query dating
>> from
>> August 2006, [R] Pseudo R for Quant Reg and the answer to it:
>> 
>> rho <- function(u,tau=.5)u*(tau - (u < 0))
>>   V <- sum(rho(f$resid, f$tau))
>> 
>>   I copied it and paste it , replacing f by fit I get this error
>> message:
>> Error in fit$resid : $ operator is invalid for atomic vectors, I
>> don't
>> know what it means
>> 
>> The fit object is likely to be quite complicated  I used str() to
>> see
>> what it looks like:
>> 
>> str (fit)
>> Class 'svrepstat'  atomic [1:19] 713.24 -24.01 -18.37 9.05 7.71
>> ...
>>    ..- attr(*, "var")= num [1:19, 1:19] 2839.3 10.2 -122.1 -332.4
>> -42.3
>> ...
>>    .. ..- attr(*, "dimnames")=List of 2
>>    .. .. ..$ : chr [1:19] "(Intercept)" "Female" "Age"
>> "calculator" ...
>>    .. .. ..$ : chr [1:19] "(Intercept)" "Female" "Age"
>> "calculator" ...
>>    .. ..- attr(*, "means")= Named num [1:19] 710.97 -24.03 -18.3
>> 9.39
>> 7.58 ...
>>    .. .. ..- attr(*, "names")= chr [1:19] "(Intercept)" "Female"
>> "Age"
>> "calculator" ...
>>    ..- attr(*, "statistic")= chr "theta"
>> 
>> How can I retrieve the residuals?? and calculate the pseudo R
>> squared??
>> 
>> Any help please
>> 
>> --
>> Dr. Donia Smaali Bouhlila
>> Associate-Professor
>> Department of Economics
>> Faculté des Sciences Economiques et de Gestion de Tunis
>> 
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help [1]
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html [2]
>> and provide commented, minimal, self-contained, reproducible code.
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help [1]
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html [2]
>> and provide commented, minimal, self-contained, reproducible code.
> 
> 
> 
> Links:
> ------
> [1] https://stat.ethz.ch/mailman/listinfo/r-help
> [2] http://www.R-project.org/posting-guide.html
> [3] https://stat.ethz.ch/pipermail/r-help/2006-August/110386.html

-- 
Dr. Donia Smaali Bouhlila
Associate-Professor
Department of Economics
Faculté des Sciences Economiques et de Gestion de Tunis



More information about the R-help mailing list