[R] Correlation code not working but not sure why

John C Frain frainj at gmail.com
Thu Mar 23 00:03:47 CET 2017


I can't see anything wrong with your code. You should read the posting
guide and produce a minimal example showing the problem and any other
details requested there.
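
One thing that may be worth ruling out first: "object 'AllTemps' not
found" means the assignment to AllTemps never completed, and a column
name that does not exist in data would cause exactly that. The sketch
below is only an illustration. It rebuilds the three sample rows from
your message as a stand-in for the real file, and the name "BHCS207AL"
is taken from your longer c(...) call; whether your real file uses
"BHCS207" or "BHCS207AL" is something I cannot tell from here.

```r
# Stand-in for the real data, using the sample rows from the post
data <- data.frame(
  BHCS306  = c(12.2, 12.2, 12.3),
  BH9OB1U  = c(12.4, 12.5, 12.5),
  BHCS276  = c(12.2, 12.3, 12.5),
  BHCS207  = c(12.7, 12.7, 12.8),
  AirTempC = c(15.3, 16.2, 16.1)
)

# Names as typed in the c(...) call (first four only, for illustration)
wanted <- c("BHCS306", "BH9OB1U", "BHCS276", "BHCS207AL")

# Names that are not columns of data; any entry here would stop the
# assignment to AllTemps with "undefined columns selected"
missing_cols <- setdiff(wanted, names(data))
print(missing_cols)

# Stack every site column without typing the names one by one
site_cols   <- setdiff(names(data), "AirTempC")
AllTemps    <- unlist(data[, site_cols], use.names = FALSE)
airTempsAll <- rep(data[, "AirTempC"], times = length(site_cols))
```

Running names(data) on the real data frame and comparing it with the
names in the c(...) call should show quickly whether this is the cause.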

If I take just the small sample of data that you have provided and run a
slightly adapted version of your code:

data = read.table("query.txt",header = TRUE )
data
AllTemps <- c(data[,"BHCS306"],data[,"BH9OB1U"],data[,"BHCS276"],data[,"BHCS207"])
AirTempC <- data[,"AirTempC"]
airTemps53 <- c(rep(AirTempC, times = 4))
cor.test(AllTemps, airTemps53, alternative = "two.sided", method = "pearson")

I get the following output, which is what is required. (I would not
endorse the idea of looking at this correlation. Perhaps some kind of
stacked regression with dummy variables for the sites might be more
appropriate.)


> data = read.table("query.txt",header = TRUE )
> data
  BHCS306 BH9OB1U BHCS276 BHCS207 AirTempC
1    12.2    12.4    12.2    12.7     15.3
2    12.2    12.5    12.3    12.7     16.2
3    12.3    12.5    12.5    12.8     16.1
> AllTemps <- c(data[,"BHCS306"],data[,"BH9OB1U"],data[,"BHCS276"],data[,"BHCS207"])
> AirTempC <- data[,"AirTempC"]
> airTemps53 <- c(rep(AirTempC, times = 4))
> cor.test(AllTemps, airTemps53, alternative = "two.sided", method = "pearson")

Pearson's product-moment correlation

data:  AllTemps and airTemps53
t = 0.68527, df = 10, p-value = 0.5087
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 -0.4122189  0.7005406
sample estimates:
      cor
0.2117855
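
As for the stacked regression mentioned above, a minimal sketch of what
I have in mind follows. It again uses the three sample rows as a
stand-in for the real file; the names long, temp, air, site and fit are
made up for the illustration, and the model (one common slope on air
temperature plus a separate intercept for each site) is only one
possible specification.

```r
# Stand-in for the real data, using the sample rows from the post
data <- data.frame(
  BHCS306  = c(12.2, 12.2, 12.3),
  BH9OB1U  = c(12.4, 12.5, 12.5),
  BHCS276  = c(12.2, 12.3, 12.5),
  BHCS207  = c(12.7, 12.7, 12.8),
  AirTempC = c(15.3, 16.2, 16.1)
)
site_cols <- setdiff(names(data), "AirTempC")

# Long format: one row per (time, site) pair
long <- data.frame(
  temp = unlist(data[, site_cols], use.names = FALSE),
  air  = rep(data[, "AirTempC"], times = length(site_cols)),
  site = factor(rep(site_cols, each = nrow(data)))
)

# Common slope on air temperature, separate intercept for each site
# (lm() expands the factor 'site' into dummy variables)
fit <- lm(temp ~ air + site, data = long)
print(coef(fit))
```

With half-hourly data the residuals will almost certainly be serially
correlated, so the standard errors from a plain lm() fit should be
treated with caution.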


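On the lag question in the quoted message, the offsetting can probably
be done in R rather than in Excel: ccf() in the stats package computes
the cross-correlation of two series over a range of lags. The sketch
below uses simulated data, not yours. The series air and site are
invented, and site is built to follow air by 4 half-hour steps so that
the lag found can be compared with a known offset.

```r
set.seed(1)

# Simulated half-hourly series with a daily cycle (48 steps per day)
n     <- 480
shift <- 4
base  <- sin(2 * pi * seq_len(n + shift) / 48) + rnorm(n + shift, sd = 0.3)

air  <- base[(shift + 1):(n + shift)]  # air[t]  = base[t + 4]
site <- base[1:n]                      # site[t] = base[t], so site[t + 4] = air[t]

# ccf(x, y) at lag k estimates the correlation between x[t + k] and y[t]
cc <- ccf(site, air, lag.max = 12, plot = FALSE)

# Lag with the largest cross-correlation
best_lag <- cc$lag[which.max(cc$acf)]
print(best_lag)
```

On real data this would be run one site at a time, for example
ccf(data[,"BHCS306"], data[,"AirTempC"]). Note that ccf() stops on
missing values by default, so gaps in the record need handling first.
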
John C Frain
3 Aranleigh Park
Rathfarnham
Dublin 14
Ireland
www.tcd.ie/Economics/staff/frainj/home.html
mailto:frainj at tcd.ie
mailto:frainj at gmail.com

On 22 March 2017 at 14:05, Ashley Patton via R-help <r-help at r-project.org>
wrote:

> Good afternoon,
>
> I was wondering if someone could help me with what I am sure is likely to
> be a really simple problem but I cannot work out what I have done wrong. I
> have tried searching the forums/Google etc but can't find anything quite
> like the code I am using other than things that do not differ from what I
> have done. I suspect then that the problem is in my naming of things but I
> don't know what is causing the issue.
>
> I have data that comprises 53 columns containing temperature data for 53
> sites recording continuously for a year, 48 times a day (half hourly). I
> also have one column that contains average air temperature for a city
> during the same time period. I would like to see if my collective site
> temperature data shows any correlation with the city air temperature data
> and so I have attempted to combine the data from the 53 site columns using
> the code below and then repeat the air temperature 53 times to correlate it
> against and then perform a Pearson's correlation. My data looks something
> like this:
>
> Site   BHCS306   BH9OB1U   BHCS276   BHCS207...   AirTempC
>        12.2      12.4      12.2      12.7         15.3
>        12.2      12.5      12.3      12.7         16.2
>        12.3      12.5      12.5      12.8         16.1...
> repeating for 53 sites recording every half hour for a year
>
> The code I used was this:
>
>
> #String together data from all 53 sites into one column
> AllTemps <- c(data[,"BHCS306"],data[,"BH9OB1U"],data[,"BHCS276"],
> data[,"BHCS207AL"],data[,"BHCS178AL"],data[,"BHCS159AL"]
> ,data[,"BHCS318"],data[,"BHCS211"],data[,"BH7OB1L"],
> data[,"BHCS274B"],data[,"BHCS337"],data[,"BH2PB1"],
> data[,"BHCS038"],data[,"BHCS074AL"],data[,"BH9OB1L"],
> data[,"Site 5"],data[,"BH6PB4"],data[,"BH6PB1"],data[,"BHCS329"],
> data[,"BH5PB1T"],data[,"BH4PB1T"],data[,"BHCS233T"],
> data[,"BHCS229"],data[,"BHCS272T"],data[,"BHCS217T"],
> data[,"BHCS283"],data[,"BHCS248"],data[,"BHCS002A"],
> data[,"BHCS245B"],data[,"BH4PB2T"],data[,"BH6PB2"],
> data[,"BH5PB1B"],data[,"BH4PB1B"],data[,"BHCS233B"],
> data[,"BHCS313L"],data[,"BHCS272B"],data[,"BHCS266"],
> data[,"BHCS217B"],data[,"BHCS241"],data[,"BH4PB2B"],
> data[,"BHCS116AL"],data[,"BHCS067A"],data[,"BHCS304L"],
> data[,"BH1OB1L"],data[,"BHCS307L"],data[,"BHCS037C"],
> data[,"BHCS301L"],data[,"BHCS238A"],data[,"BH3OB1"],
> data[,"BHCS308L"],data[,"BHCS278"],data[,"BHCS285"],
> data[,"BHCS133CL"],data[,"BHCS332L"])
>
> #Copy air temp data 53 times
> airTemps53 <- c(rep(AirTempC, times = 53))
>
> #Run correlation between site temps and air temps
> cor.test(AllTemps, airTemps53, alternative = "two.sided", method =
> "pearson")
>
> The error it returned was this:
>
> > #Copy air temp data 53 times
> > airTemps53 <- c(rep(AirTempC, times = 53))
> >
> > #Run correlation between site temps and air temps
> > cor.test(AllTemps, airTemps53, alternative = "two.sided", method =
> "pearson")
> Error in cor.test(AllTemps, airTemps53, alternative = "two.sided", method
> = "pearson") :
>   object 'AllTemps' not found
>
> Can anyone spot my mistake? I am very new to this so I am sure I have done
> something obvious and silly so please forgive me.
>
> Additionally, I was wondering if there is an easy way to offset the data
> to see whether, for example, there is a lag between changes in air
> temperature and changes in temperature at my sites, or do I need to do
> this by manually offsetting the data in Excel first?
>
> Many thanks,
> Ashley