[R] converting zipcodes to latitude/longitude
Nicola Ruggiero
nicola.ruggiero.unt sending from gmail.com
Wed May 15 22:29:51 CEST 2019
Hi Jim,
I ended up collaborating with someone, and, on the basis of looking at
your code (we did take it into consideration and talk about it), we
came up with this:
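library(stringr)

# Extract the first number found in each string (NA if there is none)
numextract <- function(string) {
  str_extract(string, "\\-*\\d+\\,*\\d*")
}

# Store the extracted zip code in a new column, then join on it
myDataSet$zip <- numextract(myDataSet$state)
combineddata <- merge(zipcode, myDataSet, by.x = "zip", by.y = "zip")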
So, as I understand it, we built a function to extract the numeric
value from a string, stored the result in a new column, and then
merged the two data frames on that column. It worked!
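For anyone following along without the original data, the extract-and-merge step can be sketched on toy data roughly as below. This is only a sketch under assumptions: `addresses`, `zip_lookup`, and `extract_zip` are hypothetical names standing in for myDataSet, the zipcode data frame, and numextract, and the lookup table holds just the one Waltham entry quoted in the original question. It uses base R only, matches exactly five digits, and keeps the zip as character so a leading zero is not lost.

```r
# Toy stand-ins (hypothetical names) for myDataSet and the zipcode lookup table
addresses <- data.frame(
  state = c("waltham, Massachusetts 02451", "Columbia, SC 29209", ""),
  stringsAsFactors = FALSE
)
zip_lookup <- data.frame(
  zip = "02451",
  latitude = 42.3765,    # Waltham coordinates as quoted in the original question
  longitude = -71.2356,  # west longitude written as a negative number
  stringsAsFactors = FALSE
)

# Pull out the first five-digit run; the result stays character,
# so the leading zero in "02451" survives
extract_zip <- function(x) {
  m <- regexpr("[0-9]{5}", x)
  out <- rep(NA_character_, length(x))
  out[m > 0] <- regmatches(x, m)
  out
}

addresses$zip <- extract_zip(addresses$state)

# all.x = TRUE keeps rows with no zip or no match; their coordinates are NA
combined <- merge(addresses, zip_lookup, by = "zip", all.x = TRUE)
print(combined)
```

Without all.x = TRUE, merge() silently drops every row whose zip is missing or not in the lookup table, which may or may not be what is wanted with 100,000 observations and some holes.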
Now I just need to figure out this thing called shape data. Basically,
I need to figure out how to place an outline of the United States
underneath my data points so that I can see each point over the
location to which it corresponds.
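One possible starting point, offered as a sketch and not as the only way to do it, is the maps package, which can draw state outlines without any separate shapefile. The code below assumes maps is installed and uses a hypothetical data frame `pts` with columns named longitude and latitude; the real merged data frame may use different column names.

```r
library(maps)  # assumed to be installed: install.packages("maps")

# Hypothetical stand-in for the merged data: the one point quoted in this thread
pts <- data.frame(longitude = -71.2356, latitude = 42.3765)

# Draw outlines of the contiguous states, then overlay the points on top
map("state")
points(pts$longitude, pts$latitude, pch = 19, col = "red", cex = 0.6)
```

The "state" database covers only the contiguous 48 states, so points in Alaska or Hawaii (the Fairbanks row, for instance) would fall outside the plotted region.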
Nicola
On Mon, May 13, 2019 at 9:09 PM Jim Lemon <drjimlemon using gmail.com> wrote:
>
> Hi Nicola,
> Getting the blank rows will be a bit more difficult and I don't see
> why they should be in the final data frame, so:
>
> townzip<-read.table(text="waltham, Massachusetts 02451
> Columbia, SC 29209
>
> Wheat Ridge , Colorado 80033
> Charlottesville, Virginia 22902
> Fairbanks, AK 99709
> Montpelier, VT 05602
> Dobbs Ferry, New York 10522
>
> Henderson , Kentucky 42420",
> sep="\t",stringsAsFactors=FALSE)
> zip_split<-function(x) {
> commasplit<-unlist(strsplit(x,","))
> state<-trimws(gsub("[[:digit:]]","",commasplit[2]))
> zip<-trimws(gsub("[[:alpha:]]","",commasplit[2]))
> return(c(commasplit[1],state,zip))
> }
> townzipsplit<-as.data.frame(t(sapply(townzip$V1,zip_split)))
> rownames(townzipsplit)<-NULL
> names(townzipsplit)<-c("town","state","zip")
> townzipsplit$latlon<-NA
> # I don't know the name of the zipcode column in the "zipcode" data frame
> newzipdf<-merge(townzipsplit,zipcodedf,by.x="zip",by.y="zip")
>
> Jim
>
> On Tue, May 14, 2019 at 5:57 AM Nicola Ruggiero
> <nicola.ruggiero.unt using gmail.com> wrote:
> >
> > Hello everyone,
> >
> > I've downloaded Jeffrey Breen's R package "zipcode," which has the
> > latitude and longitude for all of the US zip codes. So, this is a
> > data.frame with 43,191 observations. That's one data frame in my
> > environment.
> >
> > Then, I have another data.frame with over 100,000 observations that
> > look like this:
> >
> > waltham, Massachusetts 02451
> > Columbia, SC 29209
> >
> > Wheat Ridge , Colorado 80033
> > Charlottesville, Virginia 22902
> > Fairbanks, AK 99709
> > Montpelier, VT 05602
> > Dobbs Ferry, New York 10522
> >
> > Henderson , Kentucky 42420
> >
> > The spaces represent absences in the column. Regardless,
> > I need to figure out how to write a code that would, presumably, match
> > the zipcodes and produce another column to the data frame with the
> > latitude and longitude. So, for example, the code would recognize
> > 02451 above and, in the column next to it, write
> > 42.3765° N, 71.2356° W, since that's the
> > latitude and longitude for Waltham, Massachusetts.
> >
> > Any idea of how to begin a code that would perform such an operation?
> >
> > Again, I have a data.frame with the zipcodes linked to the
> > latitudes and longitudes, on the one hand, and another data.frame with
> > only zipcodes (and some holes). I need to produce the corresponding
> > latitude/longitudes in the latter data.frame.
> >
> > Nicola
> >
> > ______________________________________________
> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.