[R] Singular design matrix in rq
William Dunlap
wdunlap at tibco.com
Fri Apr 19 18:51:40 CEST 2013
I believe that those repeated values (more than half of your x values are 0.0)
are causing problems for bs(): by default it places its knots at quantiles of the
data at equally spaced probabilities, so several of those knots coincide at 0 and
the resulting basis is rank deficient. The following may be the same problem:
> library(quantreg) # for rq()
> library(splines) # for bs()
> set.seed(1)
> x <- c(rep(0, 20), 1:15)
> y <- sort(rnorm(length(x)))
> rq(y~bs(x, df=15), tau=.5)
Error in rq.fit.br(x, y, tau = tau, ...) : Singular design matrix
> # lm deals with a singular design matrix by dropping columns from the model
> lm(y~bs(x, df=15))

Call:
lm(formula = y ~ bs(x, df = 15))

Coefficients:
     (Intercept)   bs(x, df = 15)1   bs(x, df = 15)2   bs(x, df = 15)3
         1.59024                NA                NA                NA
 bs(x, df = 15)4   bs(x, df = 15)5   bs(x, df = 15)6   bs(x, df = 15)7
              NA                NA                NA          -2.09983
 bs(x, df = 15)8   bs(x, df = 15)9  bs(x, df = 15)10  bs(x, df = 15)11
        -1.06874          -1.20798          -0.99340          -0.87365
bs(x, df = 15)12  bs(x, df = 15)13  bs(x, df = 15)14  bs(x, df = 15)15
        -0.71927          -0.50564           0.06184                NA

> svd(cbind(1, bs(x, df=15)))$d # design matrix is not full rank
[1] 7.029298e+00 2.773759e+00 1.286165e+00 1.160239e+00 9.992134e-01 8.102012e-01
[7] 6.334326e-01 4.098332e-01 3.185013e-01 4.476983e-16 1.643202e-16 8.614772e-17
[13] 7.597613e-17 5.575475e-17 1.760443e-17 1.727013e-18
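
As a rough check on the toy example above, one can look at the interior knots that bs() actually chose and at the rank of the design matrix. This is only a sketch: it relies on the "knots" attribute of the object returned by bs(), and the names knots_default, n_dup, X and rank_X are illustrative.

```r
library(splines)

x <- c(rep(0, 20), 1:15)
b <- bs(x, df = 15)

# Interior knots chosen by bs(): quantiles of x at equally spaced probabilities
knots_default <- attr(b, "knots")
print(knots_default)

# How many interior knots repeat an earlier one
n_dup <- sum(duplicated(knots_default))

# Numerical rank of the design matrix (intercept plus 15 basis columns)
X <- cbind(1, b)
rank_X <- qr(X)$rank
cat("duplicated knots:", n_dup, " rank:", rank_X, "of", ncol(X), "columns\n")
```

If several knots are reported as duplicated and the rank is below the number of columns, the singular design is coming from the knot placement rather than from rq() itself.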
Try using equally spaced knots or removing repeated quantiles when you call bs().
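
A minimal sketch of both suggestions on the toy x from above, assuming that dropping duplicated quantiles (and any knot sitting on the boundary) is acceptable for your purposes. The helper rank_of() and the choice of 5 equally spaced interior knots are illustrative, not part of bs() or quantreg.

```r
library(splines)

x <- c(rep(0, 20), 1:15)

# Option 1: quantile-based knots with repeated values removed.
# The probabilities mirror what bs() uses by default for df = 15
# (12 interior knots at equally spaced probabilities).
probs <- seq(0, 1, length.out = 14)[-c(1, 14)]
q <- unique(quantile(x, probs, names = FALSE))
knots_q <- q[q > min(x) & q < max(x)]   # keep only strictly interior knots

# Option 2: equally spaced interior knots over the range of x
# (5 knots here is an arbitrary choice for illustration)
knots_eq <- seq(min(x), max(x), length.out = 7)[-c(1, 7)]

# Hypothetical helper: rank and column count of the design matrix
rank_of <- function(knots) {
  X <- cbind(1, bs(x, knots = knots))
  c(rank = qr(X)$rank, columns = ncol(X))
}
print(rank_of(knots_q))
print(rank_of(knots_eq))

# With quantreg loaded and y as in the example above, the fit would then
# be requested as, e.g., rq(y ~ bs(x, knots = knots_q), tau = 0.5)
```

Note that supplying knots explicitly means the basis has fewer columns than df=15 would ask for, so the fit will be somewhat less flexible than originally intended.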
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of Jonathan Greenberg
> Sent: Friday, April 19, 2013 6:29 AM
> To: Koenker, Roger W
> Cc: r-help
> Subject: Re: [R] Singular design matrix in rq
>
> Roger:
>
> Doh! Just realized I had that error in the code -- raw_data is the same as
> mydata, so it should be:
>
> mydata <- read.csv("singular.csv")
> plot(mydata$predictor,mydata$response)
> # A big cloud of points, nothing too weird
> summary(mydata)
> # No NAs:
>
> #        X            response         predictor
> #  Min.   :    1   Min.   :    0.0   Min.   : 0.000
> #  1st Qu.:12726   1st Qu.:  851.2   1st Qu.: 0.000
> #  Median :25452   Median : 2737.0   Median : 0.000
> #  Mean   :25452   Mean   : 3478.0   Mean   : 5.532
> #  3rd Qu.:38178   3rd Qu.: 5111.6   3rd Qu.: 5.652
> #  Max.   :50903   Max.   :26677.8   Max.   :69.342
>
> fit_spl <- rq(response ~ bs(predictor,df=15),tau=1,data=mydata)
> # Error in rq.fit.br(x, y, tau = tau, ...) : Singular design matrix
>
> --j
>
>
>
> On Fri, Apr 19, 2013 at 8:15 AM, Koenker, Roger W <rkoenker at illinois.edu> wrote:
>
> > Jonathan,
> >
> > This is not what we call a reproducible example... what is raw_data? Does
> > it have something to do with mydata?
> > what is i?
> >
> > Roger
> >
> > url: www.econ.uiuc.edu/~roger Roger Koenker
> > email rkoenker at uiuc.edu Department of Economics
> > vox: 217-333-4558 University of Illinois
> > fax: 217-244-6678 Urbana, IL 61801
> >
> > On Apr 16, 2013, at 2:58 PM, Greenberg, Jonathan wrote:
> >
> > > Quantreggers:
> > >
> > > I'm trying to run rq() on a dataset I posted at:
> > >
> > > https://docs.google.com/file/d/0B8Kij67bij_ASUpfcmJ4LTFEUUk/edit?usp=sharing
> > > (it's a 1500kb csv file named "singular.csv") and am getting the
> > > following error:
> > >
> > > mydata <- read.csv("singular.csv")
> > > fit_spl <- rq(raw_data[,1] ~ bs(raw_data[,i],df=15),tau=1)
> > > > Error in rq.fit.br(x, y, tau = tau, ...) : Singular design matrix
> > >
> > > Any ideas what might be causing this or, more importantly, suggestions
> > > for how to solve this? I'm just trying to fit a smoothed hull to the top
> > > of the data cloud (hence the large df).
> > >
> > > Thanks!
> > >
> > > --jonathan
> >
> >
>
>
> --
> Jonathan A. Greenberg, PhD
> Assistant Professor
> Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
> Department of Geography and Geographic Information Science
> University of Illinois at Urbana-Champaign
> 607 South Mathews Avenue, MC 150
> Urbana, IL 61801
> Phone: 217-300-1924
> http://www.geog.illinois.edu/~jgrn/
> AIM: jgrn307, MSN: jgrn307 at hotmail.com, Gchat: jgrn307, Skype: jgrn3007
>