[R-sig-ME] MCMCglmm predictions with fixed effects Splines

Joshua Wiley jwiley.psych at gmail.com
Wed Mar 27 08:10:35 CET 2013


I do not think the ifelse statements you have are what you want
either.  Look at how this simple example is expanded with only three
knots:

> spl(1:3, k = 3)
           [,1]       [,2]       [,3]
[1,] 3.80521304 -2.7748166 0.55521304
[2,] 0.09910059  0.1982012 0.09910059
[3,] 0.55521304 -2.7748166 3.80521304


simply cutting the data is not sufficient (FYI a more efficient way to
do that if you were would be with ?cut).  Have you looked at the
source code for spl?  It is short and readable.  Essentially taking
the outer product of the data  and knot arrays (determined as you
said), by cubing the absolute deviations, and then post multiplies
that using the singular value decomposition of it (X = UDV'), V
sqrt(D^-1) U'

Anyway, to make the new appropriate matrix for prediction, rather than
rolling your own, probably easiest to pass in your own knot values.
So:

knots <- quantile(1:10, 1:3/(3+1))
> knots
 25%  50%  75%
3.25 5.50 7.75

## imaginary "real" use
> spl(1:10, knots = knots)
            [,1]        [,2]       [,3]
 [1,] 36.3243413 -26.4882371  5.3000313
 [2,] 21.3916729  -8.9747525  1.6810714
 [3,] 11.3613809  -0.1816752  0.1360608
 [4,]  5.2276907   2.4526145 -0.2523869
 [5,]  1.9051290   2.0996457  0.2879496
 [6,]  0.2879496   2.0996457  1.9051290
 [7,] -0.2523869   2.4526145  5.2276907
 [8,]  0.1360608  -0.1816752 11.3613809
 [9,]  1.6810714  -8.9747525 21.3916729
[10,]  5.3000313 -26.4882371 36.3243413

## now for the prediction data
> spl(1:3, knots = knots)
         [,1]        [,2]      [,3]
[1,] 36.32434 -26.4882371 5.3000313
[2,] 21.39167  -8.9747525 1.6810714
[3,] 11.36138  -0.1816752 0.1360608

## what happens if you do not use the same knots

> spl(1:3, k=3)
           [,1]       [,2]       [,3]
[1,] 3.80521304 -2.7748166 0.55521304
[2,] 0.09910059  0.1982012 0.09910059
[3,] 0.55521304 -2.7748166 3.80521304


Agree with Jarrod that a penalized spline model is more typical.
Since you're Bayesian, you could also shrink them using stronger
priors on those terms in the fixed effects region.  Not really the
same thing, but could help reign in the effects if you want to leave
them as fixed effects.

Cheers,

Josh





On Tue, Mar 26, 2013 at 10:38 AM, Jarrod Hadfield <j.hadfield at ed.ac.uk> wrote:
> Hi Antonio,
>
> The penalised bit of a penalised spline is achieved by having the spline
> coefficients as random effects rather than fixed.
>
> Cheers,
>
> Jarrod
>
>
> Quoting "Antonio P. Ramos" <ramos.grad.student at gmail.com> on Tue, 26 Mar
> 2013 10:06:57 -0700:
>
>> Hi Jarrod,
>>
>> Thanks for your reply.
>>
>> I think I need a spline as a fixed effect only - time doesn't vary by
>> CASEID and all observations are subjects for the same time trends. I would
>> be nice to have a cubic spline though. Thanks
>>
>>
>> On Tue, Mar 26, 2013 at 2:38 AM, Jarrod Hadfield
>> <j.hadfield at ed.ac.uk>wrote:
>>
>>> Hi,
>>>
>>> Check out the examples in ?spl: you haven't fitted a penalised spline.
>>> You
>>> probably want something like:
>>>
>>> glm.MC.2 <- MCMCglmm(mortality.under.2 ~ maternal_age_c +
>>> I(maternal_age_c^2)  + birth_year + residence + maternal_educ +
>>> sex + wealth, nitt=20000, thin=10, burnin=1000,random=
>>> ~idv(spl(birdth_year))+CASEID,
>>> prior=prior.2,data=rwanda2,**family='categorical',
>>> pr=TRUE)
>>>
>>> Note that I have saved the random effects (pr=TRUE) because the first k
>>> random effects are the spline coefficients. You will need to associate
>>> these with the relevant columns of Z (rather than X) to get predictions.
>>> Remember to include the fixed birth_year effect too.
>>>
>>> Cheers,
>>>
>>> Jarrod
>>>
>>>
>>>
>>> Quoting "Antonio P. Ramos" <ramos.grad.student at gmail.com> on Mon, 25 Mar
>>> 2013 19:57:51 -0700:
>>>
>>>  I think I get what is going on a little better now. Does it make sense?
>>>>
>>>>
>>>> # getting the predictions
>>>> # where the nots are?
>>>> k <- 10
>>>> x <- quantile(rwanda2$birth_year, 1:k/(k + 1), na.rm = T)
>>>> y <- unique(rwanda2$birth_year)
>>>>
>>>> # creating new data for the poor
>>>> pred.data <- data.frame(maternal_age_c=rep(**0,28),wealth=rep("Lowest
>>>> quintile",28),
>>>>
>>>> sex=rep("Female",28),**residence=rep("Rural",28),
>>>>                         "spl(birth_year)1"=ifelse(y<**1976,1,0),
>>>>                         "spl(birth_year)2"=ifelse(y>**1975&y<1979,1,0),
>>>>                         "spl(birth_year)3"=ifelse(y>**1978&y<1981,1,0),
>>>>                         "spl(birth_year)4"=ifelse(y>**1980&y<1983,1,0),
>>>>                         "spl(birth_year)5"=ifelse(y>**1982&y<1985,1,0),
>>>>                         "spl(birth_year)6"=ifelse(y>=**1984&y<1988,1,0),
>>>>
>>>> "spl(birth_year)7"=ifelse(y>=**1987&y<=1990,1,0),
>>>>                         "spl(birth_year)8"=ifelse(y>**1989&y<1993,1,0),
>>>>                         "spl(birth_year)9"=ifelse(y>**1993&y<1996,1,0),
>>>>                         "spl(birth_year)10"=ifelse(y>**1995,1,0),
>>>>                         birth_order=rep(1,28),
>>>>                         maternal_educ=rep("No education",28))
>>>>
>>>> pred.data$wealth <- factor(pred.data$wealth,
>>>>                            levels=c("Lowest quintile", "Second
>>>> quintile","Middle quintile","Fourth quintile","Highest quintile"))
>>>> pred.data$sex <- factor(pred.data$sex, levels=c("Male","Female"))
>>>> pred.data$residence <- factor(pred.data$residence,**
>>>> levels=c("Rural","Urban"))
>>>> pred.data$maternal_educ <- factor(pred.data$maternal_**educ,
>>>>                                   levels=c("No education", "Primary",
>>>> "Secondary","Higher"))
>>>>
>>>>
>>>> # design matrix
>>>> X <- model.matrix(~ maternal_age_c + I(maternal_age_c^2) + residence +
>>>>                      + maternal_educ +
>>>>                     birth_order +  wealth +
>>>>                     spl.birth_year.1  +
>>>>                     spl.birth_year.2  +
>>>>                     spl.birth_year.3  +
>>>>                     spl.birth_year.4  +
>>>>                     spl.birth_year.5  +
>>>>                     spl.birth_year.6  +
>>>>                     spl.birth_year.7  +
>>>>                     spl.birth_year.8  +
>>>>                     spl.birth_year.9  +
>>>>                     spl.birth_year.10, data=pred.data)
>>>>
>>>>
>>>> V <- rowSums(glm.MC.2$VCV) # marginalizing over random effects
>>>> beta <- glm.MC.2$Sol # fixed effects
>>>> c2 <- (16*sqrt(3)/(15*pi))^2 # alterative parametrization for the
>>>> logistic
>>>> distribution
>>>> pred<- t(plogis(t(beta%*%t(X)/sqrt(1+**c2*V))))
>>>> pred <- as.data.frame(pred)
>>>> colnames(pred) <- 1970:1997 # predictions for the poor for every year
>>>> colSums(pred)
>>>> pred.poor <- pred
>>>>
>>>>
>>>>
>>>> On Mon, Mar 25, 2013 at 5:14 PM, Antonio P. Ramos <
>>>> ramos.grad.student at gmail.com> wrote:
>>>>
>>>>  maybe the model's summary would also help:
>>>>>
>>>>>
>>>>> > summary(glm.MC.2)
>>>>>
>>>>>  Iterations = 1001:19991
>>>>>  Thinning interval  = 10
>>>>>  Sample size  = 1900
>>>>>
>>>>>  DIC: 23202.78
>>>>>
>>>>>  G-structure:  ~CASEID
>>>>>
>>>>>        post.mean l-95% CI u-95% CI eff.samp
>>>>> CASEID     1.008   0.8508    1.139    73.88
>>>>>
>>>>>  R-structure:  ~units
>>>>>
>>>>>       post.mean l-95% CI u-95% CI eff.samp
>>>>> units         1        1        1        0
>>>>>
>>>>>  Location effects: mortality.under.2 ~ maternal_age_c +
>>>>> I(maternal_age_c^2) + spl(birth_year) + residence + maternal_educ + sex
>>>>> +
>>>>> wealth
>>>>>
>>>>>                            post.mean   l-95% CI   u-95% CI eff.samp
>>>>> pMCMC
>>>>> (Intercept)               -2.2844882 -4.0378822 -0.5243228    270.8
>>>>> 0.00947 **
>>>>> maternal_age_c            -0.0278874 -0.0396679 -0.0169772    409.5 <
>>>>> 5e-04 ***
>>>>> I(maternal_age_c^2)        0.0067512  0.0040534  0.0096366    369.2 <
>>>>> 5e-04 ***
>>>>> spl(birth_year)1          -0.0069841 -0.0172375  0.0024631    387.8
>>>>> 0.15789
>>>>> spl(birth_year)2           0.0257588  0.0003497  0.0511228    425.9
>>>>> 0.05474 .
>>>>> spl(birth_year)3          -0.0251424 -0.0871379  0.0381827    376.8
>>>>> 0.41368
>>>>> spl(birth_year)4          -0.0451816 -0.1068456  0.0188489    315.1
>>>>> 0.17895
>>>>> spl(birth_year)5          -0.0506369 -0.1256510  0.0256118    325.8
>>>>> 0.20737
>>>>> spl(birth_year)6           0.0571108 -0.0450529  0.1564138    229.7
>>>>> 0.25368
>>>>> spl(birth_year)7          -0.0993668 -0.1830175 -0.0135886    275.6
>>>>> 0.01474 *
>>>>> spl(birth_year)8          -0.0812237 -0.1417047 -0.0322527    273.6
>>>>> 0.00316 **
>>>>> spl(birth_year)9           0.0626604  0.0448635  0.0797381    300.8 <
>>>>> 5e-04 ***
>>>>> spl(birth_year)10         -0.0148629 -0.0234434 -0.0054245    387.5
>>>>> 0.00105 **
>>>>> residenceUrban            -0.2141484 -0.3716601 -0.0511390    280.1
>>>>> 0.00947 **
>>>>> maternal_educNo education  1.0334332  0.1228220  2.0524150    207.8
>>>>> 0.03263 *
>>>>> maternal_educPrimary       0.7954471 -0.1554284  1.7616937    206.3
>>>>> 0.10316
>>>>> maternal_educSecondary    -0.0083354 -0.9622503  1.0182426    195.0
>>>>> 0.98000
>>>>> sexMale                    0.1893753  0.1072094  0.2710264    275.2 <
>>>>> 5e-04 ***
>>>>> wealthSecond quintile      0.1773074  0.0444755  0.3283123    398.4
>>>>> 0.01368 *
>>>>> wealthMiddle quintile     -0.0667648 -0.2060366  0.0676636    376.9
>>>>> 0.34316
>>>>> wealthFourth quintile      0.0485797 -0.0913787  0.1942995    416.2
>>>>> 0.47579
>>>>> wealthHighest quintile    -0.1339028 -0.3017483  0.0077947    338.2
>>>>> 0.08000 .
>>>>> ---
>>>>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Mar 25, 2013 at 5:12 PM, Antonio P. Ramos <
>>>>> ramos.grad.student at gmail.com> wrote:
>>>>>
>>>>>  Hi all,
>>>>>>
>>>>>>
>>>>>> I am trying to get some predictions from a MCMCglmm model but it is
>>>>>> not
>>>>>> working. I guess I don't really following what the model is doing with
>>>>>> tje
>>>>>> spl() command. Here is an example of the issue.
>>>>>>
>>>>>> Thanks a bunch
>>>>>>
>>>>>>
>>>>>>
>>>>>> # inve.wishart(V=1,nu=4) is equivalent to inv-gamma(shape=2,scale=2)
>>>>>> for
>>>>>> mothers random effects
>>>>>> prior.2 <- list(R = list(V = 1, fix = 1), G = list(G1 = list(V = 1,nu
>>>>>> =
>>>>>> 4)))
>>>>>> glm.MC.2 <- MCMCglmm(mortality.under.2 ~ maternal_age_c +
>>>>>> I(maternal_age_c^2)  +
>>>>>>                        spl(birth_year) + residence + maternal_educ +
>>>>>>                        sex + wealth,
>>>>>>                      nitt=20000, thin=10, burnin=1000,
>>>>>>                      random= ~CASEID, prior=prior.2,data=rwanda2,
>>>>>> family='categorical')
>>>>>>
>>>>>>
>>>>>> > # creating new data for the poor
>>>>>> > pred.data <- data.frame(maternal_age_c=rep(**
>>>>>> 18,25),wealth=rep("Lowest
>>>>>> quintile",25),
>>>>>> +                         sex=rep("Female",25),**
>>>>>> residence=rep("Rural",25),
>>>>>> +                         "spl(birth_year)1"=1971:1995,
>>>>>> +                         "spl(birth_year)2"=1971:1995,
>>>>>> +                         "spl(birth_year)3"=1971:1995,
>>>>>> +                         "spl(birth_year)4"=1971:1995,
>>>>>> +                         "spl(birth_year)5"=1971:1995,
>>>>>> +                         "spl(birth_year)6"=1971:1995,
>>>>>> +                         "spl(birth_year)7"=1971:1995,
>>>>>> +                         "spl(birth_year)8"=1971:1995,
>>>>>> +                         "spl(birth_year)9"=1971:1995,
>>>>>> +                         "spl(birth_year)10"=1971:1995,
>>>>>> +                         birth_order=rep(1,25),
>>>>>> +                         maternal_educ=rep("No education",25))
>>>>>> >
>>>>>> > pred.data$wealth <- factor(pred.data$wealth,
>>>>>> +                            levels=c("Lowest quintile", "Second
>>>>>> quintile","Middle quintile","Fourth quintile","Highest quintile"))
>>>>>> > pred.data$sex <- factor(pred.data$sex, levels=c("Male","Female"))
>>>>>> > pred.data$residence <-
>>>>>> factor(pred.data$residence,**levels=c("Rural","Urban"))
>>>>>> > # pred.data$birth_year <- factor(pred.data$birth_year,
>>>>>> levels=1970:1997)
>>>>>> > pred.data$maternal_educ <- factor(pred.data$maternal_**educ,
>>>>>> +                                   levels=c("No education",
>>>>>> "Primary",
>>>>>> "Secondary","Higher"))
>>>>>> >
>>>>>> >
>>>>>> > # design matrix
>>>>>> > X <- model.matrix(~ maternal_age_c + I(maternal_age_c^2) + residence
>>>>>> > +
>>>>>> +                       + maternal_educ +
>>>>>> +                       birth_order +  wealth +
>>>>>> +                       spl.birth_year.1  +
>>>>>> +                       spl.birth_year.2  +
>>>>>> +                       spl.birth_year.3  +
>>>>>> +                       spl.birth_year.4  +
>>>>>> +                       spl.birth_year.5  +
>>>>>> +                       spl.birth_year.6  +
>>>>>> +                       spl.birth_year.7  +
>>>>>> +                       spl.birth_year.8  +
>>>>>> +                       spl.birth_year.9  +
>>>>>> +                       spl.birth_year.10, data=pred.data)
>>>>>> >
>>>>>> >
>>>>>> > V <- rowSums(glm.MC.2$VCV) # marginalizing over random effects
>>>>>> > beta <- glm.MC.2$Sol # fixed effects
>>>>>> > c2 <- (16*sqrt(3)/(15*pi))^2 # alterative parametrization for the
>>>>>> logistic distribution
>>>>>> > pred<- t(plogis(t(beta%*%t(X)/sqrt(1+**c2*V))))
>>>>>> > pred <- as.data.frame(pred)
>>>>>> > colnames(pred) <- 1971:1995 # predictions for the poor for every
>>>>>> > year
>>>>>> > colSums(pred)
>>>>>>     1971     1972     1973     1974     1975     1976     1977
>>>>>> 1978
>>>>>>   1979     1980     1981     1982     1983
>>>>>> 1677.094 1677.093 1677.093 1677.093 1677.093 1677.093 1677.093
>>>>>> 1677.093
>>>>>> 1677.093 1677.093 1677.093 1677.093 1677.093
>>>>>>     1984     1985     1986     1987     1988     1989     1990
>>>>>> 1991
>>>>>>   1992     1993     1994     1995
>>>>>> 1677.092 1677.092 1677.092 1677.092 1677.092 1677.092 1677.092
>>>>>> 1677.092
>>>>>> 1677.092 1677.092 1677.092 1677.092
>>>>>> >
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>         [[alternative HTML version deleted]]
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> The University of Edinburgh is a charitable body, registered in
>>> Scotland, with registration number SC005336.
>>>
>>>
>>>
>>
>
>
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://joshuawiley.com/
Senior Analyst - Elkhart Group Ltd.
http://elkhartgroup.com



More information about the R-sig-mixed-models mailing list