[R] trouble calculating rates--sometimes the denominator is missing
Christopher W. Ryan
cryan at binghamton.edu
Wed Mar 10 16:30:19 CET 2010
Every day I get a csv file containing the names of the 64 schools in our
county, the number of students sent home ill, and the number of students
absent (plus lots of other variables). The file is cumulative since fall
of 2009. It is in "long" format: one line per school per day.
Each line is also supposed to contain the total number of students
enrolled in the school. That number doesn't change often or much, so the
same value is usually repeated on each line for each school. Thus
calculating the proportion of students absent or sent home ill is easy
(see the lines between the #### markers); here is the beginning of my
code (my apologies for the word-wrapping, I use some long variable names):
setwd("C:/data/bchd/schoolsurveillance")
library(ggplot2)
library(doBy)
library(reshape)
data <- read.csv("C:/DATA/BCHD/schoolsurveillance/Broome_02MAR10.csv",
header=TRUE, sep=",", fill=TRUE)
# convert the reporting date (read in as text or factor) to a Date
data$date <- as.Date(as.character(data$ReportingDate), format="%d%b%y")
####
data$PercentStudentsAbsent <-
data$StudentsAbsentTotal/data$TotalStudentsEnrolled
data$PercentSentHome <- data$SentHomeTotal/data$TotalStudentsEnrolled
####
attach(data)
The problem is that sometimes, in some of the daily files, the
TotalStudentsEnrolled field is left entirely blank--in every record.
Unfortunately the data collection system is out of my hands, and still a
little rough around the edges. The powers-that-be can put those numbers
back in on the subsequent day, then my code runs fine. But if possible,
I want to make my code less susceptible to this external "threat."
What would be a good way to "store up" the names of the 64 schools and
their total enrollments (which are basically static), and then use those
values for the denominators for the rates as calculated above (####),
rather than relying on always having a complete, rectangular, data file,
every line containing the necessary value for a denominator?
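To make the idea concrete, here is a rough sketch of what such a stored lookup might look like. It is only an illustration under assumptions: the column name SchoolName, the toy data frames, and the helper names make_enrollment_lookup and fill_enrollment are all hypothetical and not taken from the real file. It relies on read.csv() importing an entirely blank column as NA.

```r
# Toy stand-in for a complete daily file. SchoolName is an ASSUMED column
# name; the real school identifier column may be called something else.
good_day <- data.frame(
  SchoolName            = c("A", "B", "A", "B"),
  TotalStudentsEnrolled = c(500, 300, 500, 300),
  StudentsAbsentTotal   = c(25, 30, 50, 15),
  stringsAsFactors      = FALSE
)

# Toy stand-in for a file whose enrollment column is blank in every record.
bad_day <- data.frame(
  SchoolName            = c("A", "B"),
  TotalStudentsEnrolled = NA,
  StudentsAbsentTotal   = c(10, 6),
  stringsAsFactors      = FALSE
)

# Hypothetical helper: one row per school, keeping the last non-missing
# enrollment seen in the file.
make_enrollment_lookup <- function(d) {
  ok <- d[!is.na(d$TotalStudentsEnrolled),
          c("SchoolName", "TotalStudentsEnrolled")]
  ok <- ok[!duplicated(ok$SchoolName, fromLast = TRUE), ]
  names(ok)[2] <- "StoredEnrollment"
  ok
}

# Hypothetical helper: fill only the missing denominators from the lookup,
# leaving any enrollment values that are present untouched.
fill_enrollment <- function(d, lookup) {
  stored  <- lookup$StoredEnrollment[match(d$SchoolName, lookup$SchoolName)]
  missing <- is.na(d$TotalStudentsEnrolled)
  d$TotalStudentsEnrolled[missing] <- stored[missing]
  d
}

lookup <- make_enrollment_lookup(good_day)
fixed  <- fill_enrollment(bad_day, lookup)
fixed$PercentStudentsAbsent <-
  fixed$StudentsAbsentTotal / fixed$TotalStudentsEnrolled
```

The lookup could be written out once with write.csv() from a day when the file is complete and read back in on later days, so that the rate calculation no longer depends on the denominator being present in every daily file.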
Thanks.
--
Christopher W. Ryan, MD
SUNY Upstate Medical University Clinical Campus at Binghamton
425 Robinson Street, Binghamton, NY 13904
cryanatbinghamtondotedu
"If you want to build a ship, don't drum up the men to gather wood,
divide the work and give orders. Instead, teach them to yearn for the
vast and endless sea." [Antoine de St. Exupery]