[R] data frames, na.omit, and sums
Petr Pikal
petr.pikal at precheza.cz
Mon Dec 5 14:05:39 CET 2005
Hi
try to
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
I guess you probably need the aggregate function, something like
aggregate(your.df[,-(1:2)], list(semestr = your.df$sem, year =
your.df$year), sum, na.rm=TRUE)
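As a minimal sketch of that kind of call, here it is on the three sample rows from the quoted message below. The toy values and the name by.year are only for illustration, and I group by year alone to keep the output small:

```r
# Toy data copied from the sample rows in the quoted message
your.df <- data.frame(
  sem  = c("Fall", "Spring", "Summer"),
  year = c(1991, 1992, 1992),
  C1e  = c(10, 3, NA),
  C1s  = c(2, 1, NA),
  C2e  = c(NA, 8, 100),
  C2s  = c(NA, 1, 10)
)

# Sum every course column within each year;
# na.rm = TRUE is passed on to sum(), so NA entries are skipped
by.year <- aggregate(your.df[, -(1:2)], list(year = your.df$year),
                     sum, na.rm = TRUE)
print(by.year)
```

Adding semestr = your.df$sem to the list, as in the call above, gives one row per semester/year combination instead.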
A simple working example showing what you did, what the response was,
and how it failed your expectations would be helpful.
HTH
Petr
On 4 Dec 2005 at 18:55, Jason Miller wrote:
To: r-help at stat.math.ethz.ch
From: Jason Miller <millerj at truman.edu>
Date sent: Sun, 4 Dec 2005 18:55:06 -0600
Subject: [R] data frames, na.omit, and sums
> Dear R-helpers,
>
> New to R, I'm in the middle of a project that I'm using to force me
> to learn R. I'm running into some behavior that I don't understand, and
> I need some advice. In the last week I've gotten some great advice
> from the list on visualizing my data, and I was hoping people could
> help me get over another barrier I've encountered to my progress.
>
> Before I describe what I'm trying to do and where I'm stuck with R,
> let me quickly outline what I need help with: (1) summing over the
> non-NA entries in each row of a data frame, and (2) using na.omit()
> and na.action() with rows of data from a frame.
>
> I have a data frame that contains information about when my academic
> department offered courses and their enrollments. The data frame
> looks something like
>
> sem     year  C1e  C1s  C2e  C2s
> Fall    1991   10    2   NA   NA
> Spring  1992    3    1    8    1
> Summer  1992   NA   NA  100   10
>
> where C?e represents a specific course's enrollment that semester and
> C?s represents the number of sections of that course offered. The
> frame is filled with integers and NAs. The data frame is of medium
> size, with about 180 columns and 45 rows.
>
> I need to cull some basic information from this dataset such as:
> (1) total number of sections offered each semester (and each year),
> (2) total number of credit hours generated each semester (and each
> year), and (3) the student-to-faculty ratio of the department each
> semester (and each year).
>
> From a mathematical standpoint, how to do each of these is obvious
> to me. But having to negotiate working within data frames and with
> matrices that have NA entries has really gotten me confused
> and frustrated. (I have no programming background.)
>
> To calculate (1) above for semester (rows), I know how to select the
> "sections" columns using grep(). What I'd like to do is sum the
> selected frame's non-NA entries row-by-row. For some reason, I was
> able to do this earlier today using the rowsum() function with
> na.rm=TRUE, but now it's not working. It complains of non-numeric
> entries. (In fact, I was able to use the rowsum() function to
> calculate (1) for each year.) When I try to convert the data frame
> (or a sub-frame) to a matrix, my integers turn into
> strings/characters, and I have no idea what to do with that!
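A possible sketch for this part, using the toy rows from above. Note that rowSums() (with a final "s") sums each row, while rowsum() adds up rows that share a group label, which is why it suits the per-year totals. The grep() pattern assumes column names like C1s and C2s, and the character matrix most likely comes from converting the sem column together with the numbers:

```r
# Toy data copied from the sample rows quoted above
df <- data.frame(
  sem  = c("Fall", "Spring", "Summer"),
  year = c(1991, 1992, 1992),
  C1e  = c(10, 3, NA), C1s = c(2, 1, NA),
  C2e  = c(NA, 8, 100), C2s = c(NA, 1, 10)
)

# "Sections" columns; the pattern assumes names such as C1s, C2s
sec.cols <- grep("^C[0-9]+s$", names(df))

# Per-semester totals: sum each row, skipping NA entries
sections.per.sem <- rowSums(df[, sec.cols], na.rm = TRUE)

# Per-year totals: rowsum() adds up the rows within each year
sections.per.year <- rowsum(as.matrix(df[, sec.cols]),
                            group = df$year, na.rm = TRUE)

# Converting the whole frame drags in the text column sem,
# so everything becomes character; numeric columns alone stay numeric
is.character(as.matrix(df))            # TRUE
is.numeric(as.matrix(df[, sec.cols]))  # TRUE
```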
>
> To calculate (2) above for a semester, I know how to select the
> enrollment columns using grep(). What I'd like to do is calculate
> the total credits generated by taking the dot product of each row
> with a vector whose components are the credit hour values of each
> course in my data frame. Of course, I'd have to account for the NA
> values in my data frame, but in the past I've had decent luck with
> using na.omit() and na.action() to select the non-NA components of a
> vector. Unfortunately, na.omit is absolutely not working with my
> dataframe; it just returns the names of all the columns!
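A rough sketch of the credit-hour calculation, again on the toy rows. The credit values in credits are made up for illustration, and treating NA as "course not offered, zero enrollment" is my assumption. On a data frame, na.omit() drops every row that contains any NA, so if each of your rows has at least one NA you are probably left with a zero-row frame, which prints as just the column names:

```r
# Toy data copied from the sample rows quoted above
df <- data.frame(
  sem  = c("Fall", "Spring", "Summer"),
  year = c(1991, 1992, 1992),
  C1e  = c(10, 3, NA), C1s = c(2, 1, NA),
  C2e  = c(NA, 8, 100), C2s = c(NA, 1, 10)
)

# Enrollment columns; the pattern assumes names such as C1e, C2e
enr.cols <- grep("^C[0-9]+e$", names(df))

# Hypothetical credit-hour value of each course
credits <- c(C1e = 3, C2e = 4)

# Numeric matrix of enrollments, with NA counted as zero enrollment
enr <- as.matrix(df[, enr.cols])
enr[is.na(enr)] <- 0

# Dot product of each row with the credit vector
credit.hours <- drop(enr %*% credits[colnames(enr)])

# na.omit() on a data frame removes whole rows containing an NA
nrow(na.omit(df))              # only the Spring row is complete
nrow(na.omit(df[c(1, 3), ]))   # no complete rows are left
```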
>
> Until I get (1) and (2) figured out, I have no hope of figuring out
> (3).
>
> Thank you for reading this far into this post. If you have any
> suggestions for how I can get na.omit() and summing to work for me,
> I'd appreciate hearing from you.
>
> Jason Miller
>
>
> ================================================================
> Jason E. Miller, Ph.D.
> Associate Professor of Mathematics
> Truman State University
> Kirksville, MO
> http://pyrite.truman.edu/~millerj/
> 660.785.7430
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help