[R] Lahman Baseball Data Using R DBI Package
Bill Dunlap
w||||@mwdun|@p @end|ng |rom gm@||@com
Thu Oct 8 18:26:29 CEST 2020
This is really a feature of SQL, not R. SQL requires that you double quote
column names that start with numbers, include spaces, etc., or that are SQL
key words. E.g.,
> d <- data.frame(Order=c("sit","stay","heel"),
Where=c("here","there","there"), From=c("me","me","you"))
> sqldf::sqldf("select Order,Where,From from d WHERE From=\"me\"")
Error: near "Order": syntax error
> sqldf::sqldf("select \"Order\",\"Where\",\"From\" from d Where
\"From\"=\"me\"")
Order Where From
1 sit here me
2 stay there me
You may as well double quote all column names in SQL queries.
-Bill
On Wed, Oct 7, 2020 at 9:57 PM William Michels <wjm1 using caa.columbia.edu>
wrote:
> Hi Philip,
>
> You've probably realized by now that R doesn't like column names that
> start with a number. If you try to access an R-dataframe column named
> 2B or 3B with the familiar "$" notation, you'll get an error:
>
> > library(DBI)
> > library(RSQLite)
> > con2 <- dbConnect(SQLite(), "~/R_Dir/lahmansbaseballdb.sqlite")
> > Hack12Batting <- dbGetQuery(con2,"SELECT * FROM batting WHERE yearID =
> 2018 AND AB >600 ORDER BY AB DESC")
> > Hack12Batting$AB
> [1] 664 661 639 632 632 632 626 623 620 618 617 613 606 605 602
> > Hack12Batting$3B
> Error: unexpected numeric constant in "Hack12Batting$3"
>
> How to handle? You can rename columns on-the-fly by piping. See
> reference [1] and use either library(magrittr) or library(dplyr) or a
> combination thereof:
>
> library(magrittr)
> dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600
> ORDER BY AB DESC") %>% set_colnames(make.names(colnames(.)))
>
> #OR one of the following:
>
> library(dplyr)
> dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600
> ORDER BY AB DESC") %>% rename(X2B = `2B`, X3B = `3B`)
>
> library(dplyr)
> dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600
> ORDER BY AB DESC") %>% `colnames<-`(make.names(colnames(.)))
>
> library(dplyr)
> dbGetQuery(con2,"SELECT * FROM batting WHERE yearID = 2018 AND AB >600
> ORDER BY AB DESC") %>% magrittr::set_colnames(make.names(colnames(.)))
>
> Best, Bill.
>
> W. Michels, Ph.D.
>
> [1]
> https://stackoverflow.com/questions/28100780/use-with-replacement-functions-like-colnames
>
>
>
>
>
>
>
>
>
>
> On Fri, Oct 2, 2020 at 7:34 PM Bill Dunlap <williamwdunlap using gmail.com>
> wrote:
> >
> > The double quotes are required by SQL if a name is not of the form
> > letter-followed-by-any-number-of-letters-or-numbers or if the name is a
> SQL
> > keyword like 'where' or 'select'. If you are doing this from a function,
> > you may as well quote all the names.
> >
> > -Bill
> >
> > On Fri, Oct 2, 2020 at 6:18 PM Philip <herd_dog using cox.net> wrote:
> >
> > > The \”2B\” worked. Have no idea why. Can you point me somewhere that
> can
> > > explain this to me.
> > >
> > > Thanks,
> > > Philip
> > >
> > > *From:* Bill Dunlap
> > > *Sent:* Friday, October 2, 2020 3:54 PM
> > > *To:* Philip
> > > *Cc:* r-help
> > > *Subject:* Re: [R] Lahman Baseball Data Using R DBI Package
> > >
> > > Have you tried putting double quotes around 2B and 3B: "...2B, 3B,
> ..."
> > > -> "...\"2B\",\"3B\",..."?
> > >
> > > -Bill
> > >
> > > On Fri, Oct 2, 2020 at 3:49 PM Philip <herd_dog using cox.net> wrote:
> > >
> > >> I’m trying to pull data from one table (batting) in the Lahman
> Baseball
> > >> database. Notice X2B for doubles and X3B for triples – fourth and
> fifth
> > >> from the right.
> > >>
> > >> The dbGetQuery function runs fine when I leave there two out but I get
> > >> error messages (in red) when I include 2B/3B or X2B/X3B.
> > >>
> > >> Can anyone give me some direction?
> > >>
> > >> Thanks,
> > >> Philip Heinrich
> > >>
> > >>
> ***************************************************************************************************************************************************
> > >> tail(dbReadTable(Lahman,"batting"))
> > >>
> > >> ID playerID yearID stint teamID team_ID
> > >> lgID G G_batting AB R H X2B X3B HR
> RBI SB
> > >> 107414 107414 yastrmi01 2019 1 SFN 2920
> > >> NL 107 NA 371 64 101 22 3 21
> > >> 55 2
> > >> 107416 107416 yelicch01 2019 1 MIL 2911
> > >> NL 130 NA 489 100 161 29 3 44
> 97 30
> > >> 107419 107419 youngal01 2019 1 ARI 2896
> > >> NL 17 NA 25 1 1 0 0
> > >> 0 0 0
> > >> 107420 107420 zagunma01 2019 1 CHN 2901 NL
> > >> 30 NA 36 2 9 3 0 0
> > >> 5 0
> > >> 107422 107422 zavalse01 2019 1 CHA 2900
> > >> AL 5 NA 12 1 1 0 0
> > >> 0 0 0
> > >> 107427 107427 zimmery01 2019 1 WAS 2925 NL
> > >> 52 NA 171 20 44 9 0 6 27
> 0
> > >> 107428 107428 zobribe01 2019 1 CHN 2901
> > >> NL 47 NA 150 24 39 5 0 1
> > >> 17 0
> > >> 107429 107429 zuninmi01 2019 1 TBA 2922
> > >> AL 90 NA 266 30 44 10 1 9
> > >> 32 0
> > >>
> > >>
> > >> Hack11Batting <- dbGetQuery(Lahman,"SELECT
> > >> playerID,yearID,AB,R,H,2B,3B,HR,
> > >> RBI,SB,CS,BB,SO,IBB,HBP,SH,SF,GIDP FROM
> > >> batting
> > >> WHERE yearID = 2018 AND AB >99")
> > >> Error: unrecognized token: "2B"
> > >>
> > >> Hack11Batting <- dbGetQuery(Lahman,"SELECT
> > >> playerID,yearID,AB,R,H,X2B,X3B,HR,
> > >> RBI,SB,CS,BB,SO,IBB,HBP,SH,SF,GIDP FROM
> > >> batting
> > >> WHERE yearID = 2018 AND AB >99")
> > >> Error: no such column: X2B
> > >>
> > >> [[alternative HTML version deleted]]
> > >>
> > >> ______________________________________________
> > >> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > >> https://stat.ethz.ch/mailman/listinfo/r-help
> > >> PLEASE do read the posting guide
> > >> http://www.R-project.org/posting-guide.html
> > >> and provide commented, minimal, self-contained, reproducible code.
> > >>
> > >
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
[[alternative HTML version deleted]]
More information about the R-help
mailing list