[R] POSIX and ecdf()
Doran, Harold
HDoran at air.org
Mon Mar 30 11:40:56 CEST 2015
Below is some working code that, generally speaking, accomplishes what I want, but I am looking for a necessary improvement in the final step. The code below scrapes data from a website (thousands of pages, actually) and organizes athletes' scores in a data frame. The final variable, called Workout05 in the original data, is a timed event. So I use strsplit() to pull out the data I want in that column and format it using as.POSIXct(), as you can see in the code below (I'm sure a regular expression would be a better way to pull those data out of the column, but that is not my primary question).
After I have all the data, I want to find the empirical CDF, so I use ecdf() on those data just as I would on other variables. The main issue I'm interested in is the final step, where you plug in a specific time to find its percentile:
## These are below in context of the real problem as well
fn <- ecdf(dat$score5)
fn(dat$score5[1])
This works, but not in the way I want. What I want is for a user to easily be able to enter their time in lay terms such as 5:35 and from that it would return the percentile rank.
So, I'd like something like the following to work:
fn(5:35)
The larger context for why I want this can be seen if you visit my web app built using shiny. I've built a site where athletes can build customized reports based on their performance on certain events by entering their data. This specific issue arises on the "get my percentile" tab, where a user types their time into a text input box the way humans typically write it; that value is then passed to the R fn() function that runs in the background and builds the plot for them.
https://hdoran.shinyapps.io/openAnalysis/
So, my question is: how can I structure this so that a time can be entered simply as minutes:seconds (e.g., 4:52) in a text box and still return a percentile rank as I've described here?
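To make the goal concrete, here is a minimal sketch of one possible direction, not a tested solution for the app: convert the "m:ss" text to total seconds and build the ECDF on seconds instead of POSIXct values. The helper name mmss_to_seconds() and the example times are made up for illustration.

```r
# Hypothetical helper: convert "m:ss" text (e.g. "5:35") to total seconds.
# Input that does not look like minutes:seconds becomes NA.
mmss_to_seconds <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts, function(p) {
    if (length(p) != 2L) return(NA_real_)
    p <- suppressWarnings(as.numeric(p))
    p[1] * 60 + p[2]
  }, numeric(1))
}

# Made-up example times standing in for the scraped column
times <- c("4:52", "5:35", "6:10", "7:03", "5:35")
secs <- mmss_to_seconds(times)

# ECDF on plain seconds, so no date or time zone is involved
fn_secs <- ecdf(secs)

# Proportion of athletes with a time at or below the entered one
fn_secs(mmss_to_seconds("5:35"))
```

With these made-up times the last call returns 0.6. Since a lower time is better in a timed event, 1 minus that value may be closer to what a user expects as a percentile rank. Parsing the text box input with as.POSIXct(x, format="%M:%S") should also be comparable to the existing fn(), but an unspecified date is filled in with the current one, so the data and the input would need to be parsed on the same day and in the same time zone.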
Thanks
library(XML)
i = 1; j = 0; division = 1
url <-
paste(paste('http://games.crossfit.com/scores/leaderboard.php?stage=5&sort=0&page=', i, sep=''), paste('&division=1&region=', j, sep=''), '&numberperpage=100&competition=0&frontpage=0&expanded=1&year=15&full=1&showtoggles=0&hidedropdowns=0&showathleteac=1&=&is_mobile=0', sep='')
tmp <- try(readHTMLTable(readLines(url), which=1, header=TRUE))
if(!is.null(dim(tmp))){ # new part here
names(tmp) <- gsub("\\n", "", names(tmp))
names(tmp) <- gsub(" +", "", names(tmp))
tmp[] <- lapply(tmp, function(x) gsub("\\n", "", x))
tmp$region <- j
}
dat <- tmp
# Keep only the text between '(' and ')' in Workout05
aa <- strsplit(dat$Workout05, split = '\\(')
bb <- sapply(aa, function(x) x[2])
dat$score5 <- sapply(strsplit(bb, split = '\\)'), function(x) x[1])
# Parse as minutes:seconds; as.POSIXct() fills in the current date
dat$score5 <- as.POSIXct(dat$score5, format="%M:%S")
fn <- ecdf(dat$score5)
fn(dat$score5[1])
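As for the regular expression mentioned at the top, a rough sketch of how the two strsplit() steps might be collapsed into one pattern is below. It assumes each Workout05 entry looks like "rank (m:ss)"; the example strings are invented, and the real column may contain other layouts.

```r
# Made-up strings in the assumed "rank (m:ss)" layout, plus one non-matching entry
workout <- c("1 (4:52)", "27 (5:35)", "103 (12:07)", "--")

# Capture the m:ss text inside the parentheses
pattern <- "^.*\\(([0-9]+:[0-9]{2})\\).*$"

# Entries that do not match the pattern become NA instead of passing through
score <- ifelse(grepl(pattern, workout),
                sub(pattern, "\\1", workout),
                NA_character_)
score
```

The grepl() check is there because sub() returns its input unchanged when nothing matches, which would otherwise leave strings like "--" in the result.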