[R] (no subject)

David Winsemius dwinsemius at comcast.net
Tue May 18 18:36:49 CEST 2010


On May 18, 2010, at 12:07 PM, Arantzazu Blanco Bernardeau wrote:

>
> Hello
> Well, the problem is that arcilla is the percentage of clay in the
> soil sample, so for the linear model I need to work with that number
> as a value. Right now R thinks that arcilla (arcilla means clay in
> Spanish) is a factor and treats each value as a factor level, so the
> output of the linear model is
> Call:
> lm(formula = formula, data = caperf)

It would help if you also displayed the value of "formula", so we might
have an idea what you are calling your "y"-variable ... and it would
be wise not to continue naming your formulas "formula."

require(fortunes)
fortune("dog")

What happens when you create a new variable in caperf with the numeric
equivalent of the arcilla levels?

caperf$claynum <- as.numeric(as.character(caperf$arcilla))

lm(y ~ claynum + limo + CO_gkg1 + C03Ca  , data=caperf)
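
A minimal sketch of that conversion, using an invented toy data frame
(caperf_toy and its values are made up for illustration; only the column
names arcilla and limo and the "#N/A" level come from the output you
posted). It assumes the "#N/A" strings are what turned the column into a
factor, and that the data came from a CSV-like text file:

```r
# Toy stand-in for caperf; the values are invented for illustration only.
caperf_toy <- data.frame(
  arcilla = factor(c("10.3", "0.9", "#N/A", "10.3", "9.9", "10.45")),
  limo    = c(20.1, 35.4, 28.0, 22.7, 31.2, 25.5),
  y       = c(12.0, 5.1, 9.8, 13.2, 8.7, 11.9)
)

# as.numeric() applied directly to a factor returns the internal level
# codes, not the measured percentages.
codes <- as.numeric(caperf_toy$arcilla)

# Going through the level labels recovers the numbers. "#N/A" cannot be
# parsed, so it becomes NA (R warns about it, hence suppressWarnings()).
caperf_toy$claynum <- suppressWarnings(
  as.numeric(as.character(caperf_toy$arcilla))
)

# The other idiom from this thread gives the same result.
claynum2 <- suppressWarnings(
  as.numeric(levels(caperf_toy$arcilla))[caperf_toy$arcilla]
)

# With a numeric predictor, lm() estimates a single slope for clay.
fit <- lm(y ~ claynum + limo, data = caperf_toy)
coef(fit)

# Assumption: if the data are read from a text file, declaring "#N/A" as a
# missing-value string at read time keeps the column numeric from the start.
csv_text <- "arcilla,limo\n10.3,20.1\n#N/A,28.0\n0.9,35.4\n"
reread <- read.csv(text = csv_text, na.strings = "#N/A")
str(reread)
```

Rows where claynum is NA are dropped by lm() by default, the same way
the missing observations were dropped in your original fit.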

-- 
David.


>
> Residuals:
>        Min         1Q     Median         3Q        Max
> -1.466e+01 -1.376e-15  1.780e-16  2.038e-15  1.279e+01
>
> Coefficients:
>               Estimate Std. Error t value Pr(>|t|)
> (Intercept)    1.68964    6.33889   0.267 0.790221
> arcilla0.9     1.90228    8.90888   0.214 0.831239
> arcilla10      1.26371    7.96734   0.159 0.874212
> arcilla10.3   15.70081    9.05141   1.735 0.085090 .
> arcilla10.4    7.27517    7.72806   0.941 0.348183
> arcilla10.45   7.03879    9.02600   0.780 0.436853
> arcilla10.5    2.41241    8.90827   0.271 0.786954
> arcilla10.65  15.44298    9.03879   1.709 0.089838 .
> arcilla10.7   19.35651    9.04675   2.140 0.034185 *
> arcilla10.9    3.55947    9.18501   0.388 0.698974
>
> [...]
>
> arcilla9.9     6.31949    7.35724   0.859 0.391892
> arcilla#N/A   24.17959    8.87201   2.725 0.007274 **
> limo           0.24920    0.04605   5.412 2.76e-07 ***
> CO_gkg1        0.21015    0.03931   5.346 3.73e-07 ***
> C03Ca          0.01711    0.02727   0.628 0.531337
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 6.249 on 135 degrees of freedom
>   (50 observations deleted due to missingness)
> Multiple R-squared: 0.9736,    Adjusted R-squared: 0.9014
> F-statistic: 13.47 on 370 and 135 DF,  p-value: < 2.2e-16
>
> So, in the desired linear model, arcilla should be just one line, with
> the values from the linear model.
> I hope you understand it better now. If not, I could make an English
> version of the file to send, so you can try the commands.
> Thanks a lot for your help!
>
>
>
> Arantzazu Blanco Bernardeau
> Dpto de Química Agrícola, Geología y Edafología
> Universidad de Murcia-Campus de Espinardo
>
> ----------------------------------------
>> Date: Tue, 18 May 2010 11:54:20 -0400
>> Subject: Re: [R] (no subject)
>> From: mailinglist.honeypot at gmail.com
>> To: aramucia at hotmail.com
>> CC: r-help at r-project.org
>>
>> Hi,
>>
>> Sorry, I'm not really getting what's going on here ... perhaps having
>> more domain knowledge would help me make better sense of your
>> question.
>>
>> In particular:
>>
>> On Tue, May 18, 2010 at 11:35 AM, Arantzazu Blanco Bernardeau
>> wrote:
>>>
>>> Hello
>>> I have a data array with soil variables (caperf), in which the
>>> variable "clay" is a factor (as I see when entering str(caperf)). I
>>> need to do a regression model, so I need to have arcilla (=clay) as
>>> a numeric variable. For that I have entered
>>>
>>> as.numeric(as.character(arcilla))
>>>
>>> and even entering
>>> 'as.numeric(levels(arcilla))[arcilla]'
>>
>> The above code doesn't make sense to me ...
>>
>> Perhaps you could clean up your question and provide a reproducible
>> example we can use to help show you the light (just describing what a
>> variable has isn't enough -- give us minimal code we can paste into R
>> that reproduces your problem).
>>
>> Alternatively, depending on what your "levels" mean, you might want
>> to recode your data using "dummy variables" (I'm not sure if that's
>> the official term) .. this is what I mean:
>>
>> http://dss.princeton.edu/online_help/analysis/dummy_variables.htm
>>
>> In your example, let's say you have four levels for "clay" ... maybe
>> "soft", "hard", "smooth", "red"
>>
>> Instead of only using 1 variable with values 1-4, you would recode
>> this into 4 variables with values 0,1
>>
>> So, if one example has a value of "smooth" for clay, instead of
>> coding it like:
>> clay: 3
>>
>> You would do:
>> soft: 0
>> hard: 0
>> smooth: 1
>> red : 0
>>
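
For what it's worth, R builds this kind of 0/1 coding itself whenever a
factor appears in a model formula, which is why the summary above lists
one coefficient per arcilla level. A small sketch with the four
hypothetical levels from the example (the clay vector here is invented):

```r
# Hypothetical clay types, with the level order fixed explicitly.
clay <- factor(c("soft", "hard", "smooth", "red", "smooth"),
               levels = c("soft", "hard", "smooth", "red"))

# Dropping the intercept ("- 1") gives one 0/1 column per level.
dummies <- model.matrix(~ clay - 1)
dummies[3, ]   # the third observation is "smooth"

# With the default intercept, the first level ("soft") is the baseline
# and only the remaining levels get their own column.
with_intercept <- model.matrix(~ clay)
colnames(with_intercept)
```

This coding suits genuinely categorical variables; a measured percentage
is usually better kept as a single numeric column.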
>> -steve
>> --
>> Steve Lianoglou
>> Graduate Student: Computational Systems Biology
>> | Memorial Sloan-Kettering Cancer Center
>> | Weill Medical College of Cornell University
>> Contact Info: http://cbio.mskcc.org/~lianos/contact
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT


