[R] Request for functions to calculate correlated factors influencing an outcome.

Michael Dewey lists at dewey.myzen.co.uk
Sun May 3 13:24:36 CEST 2015


Dear Lalitha, see inline below

On 03/05/2015 10:19, Lalitha Viswanathan wrote:
> Hi
> I have a dataset of the type attached.
> Here's my code thus far.
> dataset <-data.frame(read.delim("data", sep="\t", header=TRUE));
> newData<-subset(dataset, select = c(Price, Reliability, Mileage, Weight,
> Disp, HP));

In fact, in the file the variable seems to be called "Disp." (with a trailing dot), not Disp.
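
A quick way to check is to look at names() before subsetting. The sketch below is only an illustration: the three rows are made-up stand-ins for the attached file, and it assumes the displacement column really is named "Disp." with a trailing dot.

```r
# Made-up, tab-separated stand-in for the attached file (values are not real).
txt <- "Price\tReliability\tMileage\tWeight\tDisp.\tHP\n9000\t4\t33\t2500\t100\t110\n7500\t2\t33\t2300\t115\t90\n6300\t4\t37\t1850\t80\t65"

# read.delim() already returns a data frame and already uses sep = "\t".
dataset <- read.delim(text = txt, header = TRUE)

# Inspect the column names that were actually read in.
names(dataset)

# Select by quoted name so the trailing dot is explicit.
wanted <- c("Price", "Reliability", "Mileage", "Weight", "Disp.", "HP")
missing_cols <- setdiff(wanted, names(dataset))  # empty if every name matches
newData <- dataset[, wanted]
```

If missing_cols is not empty, the names in the select do not match the file.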

> cor(newData, method="pearson");
> Results are
>                  Price Reliability    Mileage     Weight       Disp         HP
> Price        1.0000000          NA -0.6537541  0.7017999  0.4856769  0.6536433
> Reliability         NA           1         NA         NA         NA         NA
> Mileage     -0.6537541          NA  1.0000000 -0.8478541 -0.6931928 -0.6667146
> Weight       0.7017999          NA -0.8478541  1.0000000  0.8032804  0.7629322
> Disp         0.4856769          NA -0.6931928  0.8032804  1.0000000  0.8181881
> HP           0.6536433          NA -0.6667146  0.7629322  0.8181881  1.0000000
>
> It appears that Wt and Price, Wt and Disp, Wt and HP, Disp and HP, HP and
> Price are strongly correlated.
> To find the statistical significance,
> I am trying  sample.correln<-cor.test(newData$Disp, newData$HP,
> method="kendall", exact=NULL)
> Kendall's rank correlation tau
>
> data:  newx$Disp and newx$HP
> z = 7.2192, p-value = 5.229e-13
> alternative hypothesis: true tau is not equal to 0
> sample estimates:
>        tau
> 0.6563871
>
> If I try the same with
> sample.correln<-cor.test(newData$Disp, newData$HP, method="pearson",
> exact=NULL)

When I try that it works fine.
The real question is why, when you asked for the Pearson coefficient,
it gave you the Spearman one, as the warning message below points out.
The call quoted in that warning uses method = "spearman" and an object
called newx rather than newData, so I suspect you have done something
else which you did not tell us about.
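
To confirm which test was actually run, you can look at the method component of the object cor.test() returns. This is only a sketch: mtcars (shipped with R) stands in for your data, and exact = FALSE is my choice to avoid the ties warning for the rank-based test.

```r
# mtcars is a stand-in here; substitute your own columns.
x <- mtcars$disp
y <- mtcars$hp

# The method argument decides which coefficient is tested.
pearson  <- cor.test(x, y, method = "pearson")
spearman <- cor.test(x, y, method = "spearman", exact = FALSE)

# The returned object records the test that was actually carried out.
pearson$method    # "Pearson's product-moment correlation"
spearman$method   # "Spearman's rank correlation rho"

# The estimate is labelled accordingly: "cor" versus "rho".
names(pearson$estimate)
names(spearman$estimate)
```

If the method reported does not match the one you asked for, the stored result came from a different call.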

> I get Warning message:
> In cor.test.default(newx$Disp, newx$HP, method = "spearman", exact = NULL) :
>    Cannot compute exact p-value with ties
>> sample.correln
>
> Spearman's rank correlation rho
>
> data:  newx$Disp and newx$HP
> S = 5716.8, p-value < 2.2e-16
> alternative hypothesis: true rho is not equal to 0
> sample estimates:
>        rho
> 0.8411566
>
> I am not sure how to interpret these values.
> Basically, I am trying to figure out which combination of factors
> influences efficiency.
>
> Thanks
> Lalitha

-- 
Michael
http://www.dewey.myzen.co.uk/home.html