[R] Help with read.csv.sql()
H
@gent@ @end|ng |rom medd@t@|nc@com
Mon Jul 20 23:54:56 CEST 2020
On 07/18/2020 11:42 PM, Rasmus Liland wrote:
> On 2020-07-18 18:09 +0100, Rui Barradas wrote:
> | �s 17:59 de 18/07/2020, H escreveu:
> | | On Fri, Jul 17, 2020 at 6:28 PM H <agents using meddatainc.com> wrote:
> | | |
> | | | The problem I am having is that
> | | | the csv files have header rows
> | | | with column names that are
> | | | slightly different from the column
> | | | names I have assigned in the
> | | | dataframe and it seems that when I
> | | | read the csv data into the
> | | | dataframe, the column names from
> | | | the csv file replace the column
> | | | names I chose when creating the
> | | | dataframe.
> | | |
> | | | A secondary issue is that the csv
> | | | files have a column with a date in
> | | | mm/dd/yyyy format that I would
> | | | like to make into a Date type
> | | | column in my dataframe. Again, I
> | | | have been unable to find a way -
> | | | if at all possible - to force a
> | | | conversion into a Date format when
> | | | importing into the dataframe. The
> | | | best I have so far is to import is
> | | | a character column and then use
> | | | as.Date() to later force the
> | | | conversion of the dataframe
> | | | column.
> | |
> | | The documentation for read.csv.sql()
> | | suggests that colClasses() and/or
> | | field.types() should work but I may
> | | well have misunderstood the
> | | documentation, hence my question in
> | | this group.
> |
> | As for colClasses, those are R class
> | names.
>
> Ok Mister H, I might have hit the nail
> on the head this time with this badass
> example for your usecase:
>
> # Make a csv with %d/%m/%Y dates in it ...
> Lines <- "STM05-1 2005/02/28 17:35 Good -35.562 177.158
> STM05-1 2005/02/28 19:44 Good -35.487 177.129
> STM05-1 2005/02/28 23:01 Unknown -35.399 177.064
> STM05-1 2005/03/01 07:28 Unknown -34.978 177.268
> STM05-1 2005/03/01 18:06 Poor -34.799 177.027
> STM05-1 2005/03/01 18:47 Poor -34.85 177.059
> STM05-2 2005/02/28 12:49 Good -35.928 177.328
> STM05-2 2005/02/28 21:23 Poor -35.926 177.314
> "
> DF <- read.table(textConnection(Lines), as.is = TRUE,
> col.names = c("Id", "Date", "Time", "Quality", "Lat", "Long"))
> DF$Date <- format(as.Date(DF$Date, "%Y/%m/%d"), "%d/%m/%Y")
> write.csv(DF, file="df.csv", row.names=FALSE)
>
> colClasses <-
> c("character",
> "Date",
> "character",
> "character",
> "numeric",
> "numeric")
> sql <- paste0(
> "select ",
> "date(", # [2]
> "substr(Date, 8, 4) || '-' || ", # [1]
> "substr(Date, 5, 2) || '-' || ",
> "substr(Date, 2, 2)), Long, Lat, Quality ",
> "from ff where Quality like '%oo%' and Long>177.129")
> ff <- file(description="df.csv", open="r")
> dat <- sqldf::read.csv.sql(
> sql=sql, colClasses=colClasses)
> close(ff)
>
> str(dat)
>
> as.Date(dat[,1])
> dat[,3]
>
> Both sqlite and Postgres has a function
> substr you can call on strings like
> this.[5] I have a hunch this has always
> been possible in sql from way back ...
>
> The warning from sqldf about unused
> connections, might suggest file
> descriptor handling to be a bit crusty
> ... [3]
>
> The thing is, defining the second column
> as of type Date in colClasses happens to
> work, but it's still character when you
> check with str(dat) ... perhaps it has
> something to do with this info from [4]:
>
> as_tibble_row() converts a vector to
> a tibble with one row. The input
> must be a bare vector, e.g. vectors
> of dates are not supported yet. If
> the input is a list, all elements
> must have length one.
>
> [1] https://stackoverflow.com/questions/15563656/convert-string-to-date-in-sqlite
> [2] https://www.sqlite.org/lang_datefunc.html
> [3] https://groups.google.com/forum/#!topic/sqldf/mcQ_K_E--q8
> [4] https://tibble.tidyverse.org/reference/as_tibble.html
> [5] https://www.sqlite.org/lang_corefunc.html#substr,
> https://www.postgresql.org/docs/9.1/functions-string.html,
> http://www.h2database.com/html/functions.html#substring
>
>
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Thank you for your extensive example. However, I have decided to simply convert column types as necessary and rename columns as desired after importing the data since that seems the simplest solution.
[[alternative HTML version deleted]]
More information about the R-help
mailing list