[R] problem with predict(mboost,...)
Benjamin Hofner
Benjamin.Hofner at imbe.med.uni-erlangen.de
Wed Oct 20 16:35:57 CEST 2010
Hi Tim,
you have two "problems" at the same time:
1.) The warning you get means that a predictor (e.g. predictor1) has a
different range in the test set than in the training set: some values in
your test set lie outside the boundary knots that were fixed when the
spline was fitted to the training data. This is only a problem if the
ranges are REALLY different. It does not lead to your second problem,
though! So I think you can just ignore the warning (especially as you
write that training and test set have the same range).
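If you want to see how different the ranges really are, something along these lines would do. It is only a sketch on made-up toy data: the data frames `train` and `test` and all their values are my assumptions, not your data.

```r
# Toy stand-ins for the training and prediction data (hypothetical values)
train <- data.frame(predictor1 = seq(0, 10, length.out = 50),
                    predictor2 = seq(5, 15, length.out = 50))
test  <- data.frame(predictor1 = c(-1, 5, 11),
                    predictor2 = c(6, 10, 14))

# For every predictor, count the test values outside the training range
outside <- sapply(names(train), function(v) {
  r <- range(train[[v]])
  sum(test[[v]] < r[1] | test[[v]] > r[2])
})
print(outside)
```

A count of zero for every predictor means that no test value lies beyond the range seen in training.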
2.) The second problem you describe (negative predictions for a positive
outcome) has nothing to do with boosting or mboost. It results from
the fact that you fit a model to a positive outcome without restricting
the predictions, so a prediction might be ANY real number.
You can avoid this by, for example, modelling the log-transformed outcome
and / or using another family (depending on the type of your outcome).
Please consult the literature on generalized linear models (GLMs) for
further help.
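A minimal sketch of the log-transform route, assuming simulated data: the data frames, the helper column `logMF` and the simulated relationship are all invented for illustration. The model is fitted on the log scale and the predictions are transformed back with exp(), so they cannot be negative.

```r
library(mboost)

# Simulated positive outcome (hypothetical data, not the poster's)
set.seed(1)
n <- 200
train <- data.frame(predictor1 = runif(n, 0, 10))
train$MF <- exp(1 + 0.1 * train$predictor1 + rnorm(n, sd = 0.2))

# Fit on the log scale; logMF is a helper column introduced here
train$logMF <- log(train$MF)
mod <- mboost(logMF ~ bbs(predictor1), data = train)

# Predict on new data and back-transform to the original scale
test <- data.frame(predictor1 = seq(1, 9, by = 0.5))
pred <- exp(predict(mod, newdata = test))
print(range(pred))
```

Keep in mind that exp() of a prediction on the log scale is closer to a median than to a mean of the outcome. This only guarantees positive predictions; it does not by itself keep them inside a fixed interval such as 3 to 8.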
Hope that helps
Benjamin
On 20.10.2010 12:00, r-help-request at r-project.org wrote:
> Message: 129
> Date: Wed, 20 Oct 2010 11:08:44 +0200
> From: Häring, Tim (LWF)<Tim.Haering at lwf.bayern.de>
> To:<r-help at r-project.org>
> Subject: [R] problem with predict(mboost,...)
>
> Hi,
>
> I use an mboost model to predict my dependent variable on new data. I get the following warning message:
> In bs(mf[[i]], knots = args$knots[[i]]$knots, degree = args$degree, :
> some 'x' values beyond boundary knots may cause ill-conditioned bases
>
> The new predicted values are partly negative although the variable in the training data ranges from 3 to 8 on a numeric scale. In order to restrict the predicted values to the range from 3 to 8, I limit the feature space of the prediction data to the minima and maxima of the training data for every predictor variable before applying the model to the new data.
> As base-learner in mboost I use splines ("bbs"):
>
> mod <- mboost(MF ~ bbs(predictor1) + bbs(predictor2) + bbs(...), data = train)
>
> I wonder why there are negative values when applying the model to new data, because both training and prediction data have the same value ranges in the predictor variables.
>
> Did somebody get the same warning message? Can someone help me please?
>
> TIM
>
> ------------------------------------------
> Tim Häring
> Bavarian State Institute of Forestry
> Department of Forest Ecology
> Hans-Carl-von-Carlowitz-Platz 1
> D-85354 Freising
>
> E-Mail:tim.haering at lwf.bayern.de
> http://www.lwf.bayern.de
--
******************************************************************************
Dipl.-Stat. Benjamin Hofner
Institut für Medizininformatik, Biometrie und Epidemiologie
Friedrich-Alexander-Universität Erlangen-Nürnberg
Waldstr. 6 - 91054 Erlangen - Germany
Tel: +49-9131-85-22707
Fax: +49-9131-85-25740
Office:
Room 3.036
Universitätsstraße 22
(Entrance at the left side of the building)
benjamin.hofner at imbe.med.uni-erlangen.de
http://www.imbe.med.uni-erlangen.de/~hofnerb/
http://www.benjaminhofner.de