[R] problem with predict(mboost,...)

Benjamin Hofner Benjamin.Hofner at imbe.med.uni-erlangen.de
Wed Oct 20 16:35:57 CEST 2010


Hi Tim,

you have two "problems" at the same time:

1.) The warning you get means that your predictor (e.g. predictor1) has 
a different range in the training set than in the test set. In this case 
your test set contains values that lie outside the range of the training 
set (for predictor1). This is only a problem if the ranges are REALLY 
different. However, this doesn't lead to your second problem! So I think 
you can just ignore the warning (especially as you write that both 
training and test set have the same range).
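
To see how far the test set actually leaves the training range, a quick 
check in base R could look like the sketch below. The data frames `train` 
and `test` and their values are made up for illustration; only the 
predictor names are borrowed from the original question.

```r
# Hypothetical toy data standing in for the real training and test sets.
train <- data.frame(predictor1 = c(1.0, 2.5, 4.0), predictor2 = c(10, 20, 30))
test  <- data.frame(predictor1 = c(0.5, 3.0, 4.5), predictor2 = c(12, 18, 25))

# For each predictor, flag test observations outside the training range.
outside <- sapply(names(train), function(v) {
  r <- range(train[[v]], na.rm = TRUE)
  test[[v]] < r[1] | test[[v]] > r[2]
})

# Number of out-of-range test values per predictor.
n_outside <- colSums(outside)
print(n_outside)
```

A count of zero for every predictor would suggest the warning has another 
cause; large counts, or values far beyond the range, deserve a closer look.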

2.) The second problem you describe (negative predictions for a positive 
outcome) has nothing to do with boosting or mboost. It results from 
the fact that you estimate a model for a positive outcome, but nothing 
in the model restricts the prediction, which can be ANY real number.
You can avoid this by, for example, considering log-transformed outcomes 
and / or using another family (depending on the type of your outcome). 
Please consult the literature on generalized linear models (GLMs) for 
further help.
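
As a rough illustration of both suggestions, here is a sketch in base R 
(lm and glm rather than mboost, to keep it self-contained). The simulated 
data, the variable names and the choice of a Gamma family with log link 
are assumptions made for the example, not a recommendation for your data. 
mboost() also takes a `family` argument, so the same idea carries over; 
which family is appropriate depends on your outcome.

```r
# Simulated, strictly positive outcome (roughly between 3 and 8).
set.seed(1)
x   <- seq(0, 10, length.out = 50)
y   <- exp(1 + 0.1 * x + rnorm(50, sd = 0.1))
dat <- data.frame(x = x, y = y)

# New data far outside the observed range of x, to provoke extrapolation.
new <- data.frame(x = c(-20, 30))

# Model on the original scale: predictions are unrestricted and can be negative.
fit_id  <- lm(y ~ x, data = dat)
pred_id <- predict(fit_id, newdata = new)

# Model for log(y): back-transformed predictions are always positive.
fit_log  <- lm(log(y) ~ x, data = dat)
pred_log <- exp(predict(fit_log, newdata = new))

# GLM with a log link: predictions on the response scale are always positive.
fit_gamma  <- glm(y ~ x, family = Gamma(link = "log"), data = dat)
pred_gamma <- predict(fit_gamma, newdata = new, type = "response")

print(rbind(identity = pred_id, log_lm = pred_log, gamma_glm = pred_gamma))
```

Note that exp() of a prediction for log(y) estimates something closer to 
the median than the mean of y, whereas the GLM models the mean directly.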

Hope that helps
   Benjamin


On 20.10.2010 12:00, r-help-request at r-project.org wrote:
> Message: 129
> Date: Wed, 20 Oct 2010 11:08:44 +0200
> From: Häring, Tim (LWF)<Tim.Haering at lwf.bayern.de>
> To:<r-help at r-project.org>
> Subject: [R] problem with predict(mboost,...)
>
> Hi,
>
> I use a mboost model to predict my dependent variable on new data. I get the following warning message:
> In bs(mf[[i]], knots = args$knots[[i]]$knots, degree = args$degree,  :
>     some 'x' values beyond boundary knots may cause ill-conditioned bases
>
> The new predicted values are partly negative although the variable in the training data ranges from 3 to 8 on a numeric scale. In order to restrict the predicted values to the value range from 3 to 8 I limit the feature space of the prediction data on the minima and maxima of the training data for every predictor variable before applying the model on the new data.
> As baselearner in mboost I use splines ("bbs"):
>
> mod<- mboost(MF ~ bbs(predictor1) + bbs(predictor2) + bbs(...), data = train)
>
> I wonder why there are negative values when applying the model on new data, because both, training and prediction data have the same value ranges in the predictor variables.
>
> Did somebody get the same warning message? Can someone help me please?
>
> TIM
>
> ------------------------------------------
> Tim Häring
> Bavarian State Institute of Forestry
> Department of Forest Ecology
> Hans-Carl-von-Carlowitz-Platz 1
> D-85354 Freising
>
> E-Mail:tim.haering at lwf.bayern.de
> http://www.lwf.bayern.de

-- 
******************************************************************************
Dipl.-Stat. Benjamin Hofner

Institut für Medizininformatik, Biometrie und Epidemiologie
Friedrich-Alexander-Universität Erlangen-Nürnberg
Waldstr. 6 - 91054 Erlangen - Germany

Tel: +49-9131-85-22707
Fax: +49-9131-85-25740

Office:
   Room 3.036
   Universitätsstraße 22
   (Entrance at the left side of the building)

benjamin.hofner at imbe.med.uni-erlangen.de

http://www.imbe.med.uni-erlangen.de/~hofnerb/
http://www.benjaminhofner.de


