[R] Calculating total observations based on combinations of variable values

Dylan Beaudette dylan.beaudette at gmail.com
Wed Aug 27 19:20:15 CEST 2008


On Wednesday 27 August 2008, Josip Dasovic wrote:
> Hello:
>
> As someone making the move from STATA to R, I'm finding it difficult at
> times to perform basic tasks in R, so forgive me if I've missed an obvious
> and easily obtained solution to my problem.   I've searched the help guides
> and the archives and have not been able to find a solution that works.
>
> I have a data frame with thousands of observations that looks something
> like this:
>
> YEAR MONTH DAY   COUNTRY     REGION            PROVINCE         CITY
> 1994     1  22 Sri Lanka South Asia Northern (Province)  Pungudutivu
> 1994     1  25 Sri Lanka South Asia  Central (Province)        Kandy
> 1994     2  26 Sri Lanka South Asia  Central (Province)        Kandy
> 1994     2  28 Sri Lanka South Asia  Eastern (Province)    Wakianeri
> 1994     6  28 Sri Lanka South Asia  Eastern (Province)   Valachenai
> 1994     6  31 Sri Lanka South Asia  Central (Province)        Kandy
> 1995     3   1 Sri Lanka South Asia    North (Province)  Kilinochchi
> 1995     3   6 Sri Lanka South Asia  Western (Province)      Colombo
> 1995     7  15 Sri Lanka South Asia Northern (Province)     Mankulam
> 1995     7  23 Sri Lanka South Asia Northern (Province)  Point Pedro
> 1995     9  25 Sri Lanka South Asia Northern (Province)       Kilali
> ...
>
> What I would like to do is to calculate the total number of observations by
> unique combinations of the values of (some of the) variables above.
>
> For example, I would like to know how many observations (i.e. rows) have
> the values YEAR==1994 and MONTH==1.
>
> In the end, I'd like a table that looks like this:
>
> YEAR MONTH #OBS
> 1994     1  2
> 1994     2  2
> 1994     3  0
> 1994     4  0
> 1994     5  0
> 1994     6  2
> 1994     7  0
> 1994     8  0
> 1994     9  0
> 1994     10  0
> 1994     11  0
> 1994     12  0
> 1995     1  0
> 1995     2  0
> 1995     3  2
> 1995     4  0
> ...
>
> I do need to fill out the table with all the possible combinations, even
> where there are no observations with that combination in the data set. At
> first, it seemed like this would not be difficult. I think that aggregate
> is probably the way to go, but there doesn't seem to be an appropriate
> summary function (FUN) available.  Thanks in advance for any help in this
> matter,
>
> Josip
>

?table
?xtabs
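
A minimal sketch of how those two functions might be applied here, in case
it helps. The data frame name `d` and the output column name `OBS` are
assumptions made for illustration, and the toy data only mimic the YEAR and
MONTH columns from the sample rows. The zero-count rows come from declaring
MONTH as a factor with all 12 levels before tabulating:

```r
# Toy data standing in for the real data frame (the name `d` is hypothetical)
d <- data.frame(
  YEAR  = c(1994, 1994, 1994, 1994, 1994, 1994, 1995, 1995, 1995, 1995, 1995),
  MONTH = c(1, 1, 2, 2, 6, 6, 3, 3, 7, 7, 9)
)

# Declare all 12 months as factor levels so that months with no
# observations still show up in the table with a count of zero
d$MONTH <- factor(d$MONTH, levels = 1:12)

# Cross-tabulate YEAR by MONTH, then convert the table to long format;
# responseName sets the name of the count column (default is "Freq")
counts <- as.data.frame(xtabs(~ YEAR + MONTH, data = d),
                        responseName = "OBS")

# Sort by year, then by month, to match the layout asked for
counts <- counts[order(counts$YEAR, counts$MONTH), ]
head(counts)
```

Note that YEAR and MONTH come back as factors in `counts`; something like
as.numeric(as.character(counts$YEAR)) should convert them back if numbers
are needed. table(d$YEAR, d$MONTH) would give the same counts in wide form.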

-- 
Dylan Beaudette
Soil Resource Laboratory
http://casoilresource.lawr.ucdavis.edu/
University of California at Davis
530.754.7341


