[R] Dates as headers causing confusion but needed to convert to Julian days for ANOVA

Daniel Nordlund djnord|und @end|ng |rom gm@||@com
Tue Oct 26 00:15:03 CEST 2021


On 10/25/2021 7:09 AM, Philip Monk wrote:
> Hello,
>
> First post - apologies if I get anything wrong - either in describing the
> question (I've only been using R for a week) or etiquette.
>
> I have CSV files of Land Surface Temperature (LST) data derived from
> Landsat 8 data and exported from Google Earth Engine.  I am investigating
> whether the construction of utility-scale solar power plants affects the
> local climate.
>
> I need to tidy the CSV files so that I can use Two-way ANOVA w/repeated
> measures but am having problems due to column headers (necessarily, I
> think) being dates.
>
> Each CSV currently has the following columns:
>
> Buffer
> Values 100-2000 in 100 increments.  Buffers are 100m wide and extend
> outwards from each site boundary.
>
> 24 columns of monthly data.
> Column headers are in date format (currently dd/mm/yyyy in Excel) and
> relate to the date on which the original Landsat 8 image from which the LST
> data are derived was captured.
> I need these dates to calculate the 'Julian day' (1-365.25) for each month,
> and also to extract the Year.
>
> Time
> Currently 1 = pre-construction and 2 = post-construction.
>
> The data frame created when importing one of these CSV's into R looks like
> this:
>
> 'data.frame':	20 obs. of  14 variables:
>   $ Buffer     : int  100 200 300 400 500 600 700 800 900 1000 ...
>   $ X15.01.2010: num  6.09 5.27 4.45 3.39 2.9 ...
>   $ X16.02.2010: num  6.41 5.99 5.61 4.78 4.31 ...
>   $ X20.03.2010: num  8.93 7.38 6.12 5.61 5.61 ...
>   $ X24.04.2011: num  6.28 5.81 5.15 4.54 4.32 ...
>   $ X07.05.2010: num  6.13 5.54 5.35 4.82 4.52 ...
>   $ X08.06.2010: num  7.71 7.4 6.82 6.14 5.82 ...
>   $ X13.07.2011: num  4.07 2.93 2.69 2.47 2.53 ...
>   $ X11.08.2010: num  5.96 5.68 5.38 4.96 4.57 ...
>   $ X12.09.2010: num  5.76 5.15 4.54 3.87 3.46 ...
>   $ X17.10.2011: num  3.16 2.51 2.51 2.06 2.01 ...
>   $ X15.11.2010: num  4.72 3.77 3.24 2.74 2.49 ...
>   $ X01.12.2010: num  4.26 3.516 2.154 1.056 0.315 ...
>   $ Time       : int  1 1 1 1 1 1 1 1 1 1 ...
>
>
> Importing a CSV into R that has a date as a column header (in whatever
> format) causes problems!  R adds the 'X', and converts the separator.
>
> I was using 'gather' and 'pivot_longer' (see below) but the date issue has
> wrecked that approach.  I've tried reformating the date, trying to remove
> the X, and going away to learn more about data frames, dplyr, and readr.
> I'm not making any progress, though, and I'm just getting more confused.
>
> Helped requested
> ~~~~~~~~~~~~~~
>
> How should I proceed to tidy the data such that I can:
>
> *) extract the year and Julian day for each date, then convert the date to
> the name of the month?
> *) create a tidy table with columns for Buffer, Month, Year, Julian day,
> LST (the values), and Time (1 = pre-construction, 2 = post-construction of
> a solar farm).
>
> Prior to deciding I needed to calculate the Julian day for use in ANOVA I
> was doing this (with month names rather than dates - please remember I'm a
> newbie!):
>
> data <- read.csv(...
> attach(data)
> # data_long <- data %>% pivot_longer(!Buffer, names_to = "month", values_to
> = "LST")
> # data_long <- data %>% pivot_longer(!Buffer, names_to = c("month",
> "Time"), names_sep = 13, values_to = "LST")
> data_long <- gather(data, Month, LST, January:December, factor_key=TRUE)
> data_long$Time <- as.factor(data$Time)
> str(data_long)
>
> 'pivot_longer' didn't work, but 'gather' did to create the long data needed
> for ANOVA.
>
> For example:
>
> 'data.frame': 480 obs. of  4 variables:
>   $ Buffer: int  100 200 300 400 500 600 700 800 900 1000 ...
>   $ Time  : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
>   $ Month : Factor w/ 12 levels "January","February",..: 1 1 1 1 1 1 1 1 1 1
> ...
>   $ LST   : num  NA 0.803 0.803 1.044 0.475 ...
>
> Suggestions/hints/solutions would be most welcome.  :)
>
> Thanks for your time,
>
> Philip
>
> Part-time PhD Student (Environmental Science)
> Lancaster University, UK.
>
> 	[[alternative HTML version deleted]]
>
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Using the data that you read in, you can use "pivot_longer", and various 
"date" functions to get what you want. For example

# pivot your data
data_long <- data %>% pivot_longer(starts_with("X"), names_to = 
"chr_date", values_to = "LST")

# now you can use various data functions to get your month, day, and year
# for example
data_long$month <- month(as.Date(data_long$chr_date,"X%d.%m.%Y"))

You may want to read up on the various date functions built in to R such 
as as.POSIXct, as.POSIXlt, as.Date, and maybe look at the contributed 
package, lubridate.

Hope this is helpful,

Dan

-- 
Daniel Nordlund
Port Townsend, WA  USA


-- 
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus



More information about the R-help mailing list