[R] simple subset question

David Winsemius dwinsemius at comcast.net
Sun Dec 2 20:54:52 CET 2012


I suggested the alternative because your code can fail when the
max(Total) value is not in the subset where Year==2012. max(Total) is
computed over all years, so in that case no row satisfies both
conditions and subset() returns zero rows.

-- 
David
On Dec 2, 2012, at 11:34 AM, Felipe Carrillo wrote:

>
> Using my whole dataset I get:
> library(plyr)
> ddply(winter,"Year",summarise,maxTotal=max(Total))
>
>  fish <- structure(list(Year = 2002:2012, maxTotal = c(1464311L, 1071051L,
> 714837L, 2115018L, 850491L, 207537L, 321195L, 935599L, 194429L,
> 157260L, 303259L)), .Names = c("Year", "maxTotal"), row.names = c(NA,
> -11L), class = "data.frame")
>
> I only want to extract the max Total for 2012 and want the whole row, like this:
>  IDWeek  Total   Fry  Smolt  FryEq Year
> 21     47 303259 34008 269248 491733 2012
>
> My whole dataset is too big to post, so thanks for your help. I will try
> to figure out why subset returns an empty row.
>
> Felipe D. Carrillo
> Supervisory Fishery Biologist
> Department of the Interior
> US Fish & Wildlife Service
> California, USA
> http://www.fws.gov/redbluff/rbdd_jsmp.aspx
>
>
> From: William Dunlap <wdunlap at tibco.com>
>> To: Felipe Carrillo <mazatlanmexico at yahoo.com>; arun <smartpink111 at yahoo.com>
>> Cc: R help <r-help at r-project.org>
>> Sent: Sunday, December 2, 2012 11:00 AM
>> Subject: RE: [R] simple subset question
>>
>>> I am
>>> still getting an error message
>>>> with :
>>>>   x <- subset(fish,Year==2012 & Total==max(Total));x
>>>> I get:
>>>> [1] IDWeek Total  Fry    Smolt  FryEq  Year
>>>> <0 rows> (or 0-length row.names)
>>
>> The above is not an error message.  It says that there
>> are no rows satisfying your criteria.  Note that Total==max(Total)
>> returns a TRUE for each row in which the Total value
>> equals the maximum Total value over all the years in
>> the data.  Are you looking for the maximum value of Total
>> in each year?
>>
>>> tmp <- transform(fish, YearlyMaxTotal = ave(Total, Year, FUN=max))
>>> subset(tmp, Total==YearlyMaxTotal)
>>   IDWeek  Total    Fry  Smolt  FryEq Year YearlyMaxTotal
>> 21    47 303259  34008 269248 491733 2012        303259
>> 39    39 157260 156909    351 157506 2011        157260
>>> subset(tmp, Total==YearlyMaxTotal & Year==2012)
>>   IDWeek  Total   Fry  Smolt  FryEq Year YearlyMaxTotal
>> 21    47 303259 34008 269248 491733 2012        303259
>>
>> Bill Dunlap
>> Spotfire, TIBCO Software
>> wdunlap tibco.com
>>
>>
>>> -----Original Message-----
>>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Felipe Carrillo
>>> Sent: Sunday, December 02, 2012 10:47 AM
>>> To: arun
>>> Cc: R help
>>> Subject: Re: [R] simple subset question
>>>
>>> Works with the small dataset (2 years) but I get the error message with the
>>> whole dataset (12 years of data). I am going to have to check what's wrong
>>> with it...Thanks
>>>
>>> Felipe D. Carrillo
>>> Supervisory Fishery Biologist
>>> Department of the Interior
>>> US Fish & Wildlife Service
>>> California, USA
>>> http://www.fws.gov/redbluff/rbdd_jsmp.aspx
>>>
>>>
>>> From: arun <smartpink111 at yahoo.com>
>>>> To: Felipe Carrillo <mazatlanmexico at yahoo.com>
>>>> Cc: R help <r-help at r-project.org>; R. Michael Weylandt <michael.weylandt at gmail.com>
>>>> Sent: Sunday, December 2, 2012 10:29 AM
>>>> Subject: Re: [R] simple subset question
>>>>
>>>> Hi,
>>>> I am getting this:
>>>> x<-subset(fish,Year==2012 & Total==max(Total))
>>>>  x
>>>> #   IDWeek  Total   Fry  Smolt  FryEq Year
>>>> #21     47 303259 34008 269248 491733 2012
>>>> A.K.
>>>>
>>>>
>>>>
>>>>
>>>> ----- Original Message -----
>>>> From: Felipe Carrillo <mazatlanmexico at yahoo.com>
>>>> To: R. Michael Weylandt <michael.weylandt at gmail.com>
>>>> Cc: "r-help at r-project.org" <r-help at r-project.org>
>>>> Sent: Sunday, December 2, 2012 1:25 PM
>>>> Subject: Re: [R] simple subset question
>>>>
>>>> Sorry, I was trying to subset from a bigger dataset called 'winter' and
>>>> forgot to change the variable names when I asked the question. David W's
>>>> suggestion works, but the strange part is that I am still getting an
>>>> error message with:
>>>>   x <- subset(fish,Year==2012 & Total==max(Total));x
>>>> I get:
>>>> [1] IDWeek Total  Fry    Smolt  FryEq  Year
>>>> <0 rows> (or 0-length row.names)
>>>>
>>>> I will start a fresh session to see if that helps...Thank you all
>>>>
>>>> Felipe D. Carrillo
>>>> Supervisory Fishery Biologist
>>>> Department of the Interior
>>>> US Fish & Wildlife Service
>>>> California, USA
>>>> http://www.fws.gov/redbluff/rbdd_jsmp.aspx
>>>>
>>>>
>>>> From: R. Michael Weylandt <michael.weylandt at gmail.com>
>>>>> To: Felipe Carrillo <mazatlanmexico at yahoo.com>
>>>>> Cc: "r-help at r-project.org" <r-help at r-project.org>
>>>>> Sent: Sunday, December 2, 2012 9:42 AM
>>>>> Subject: Re: [R] simple subset question
>>>>>
>>>>> On Sun, Dec 2, 2012 at 5:21 PM, Felipe Carrillo
>>>>> <mazatlanmexico at yahoo.com> wrote:
>>>>>>   Hi,
>>>>>> Consider the small dataset below. I want to subset by two variables in
>>>>>> one line but it won't work...it works, though, if I subset separately.
>>>>>> I must be missing something obvious that I did not realize before
>>>>>> while using subset.
>>>>>>
>>>>>> fish <- structure(list(IDWeek = c(27L, 28L, 29L, 30L, 31L, 32L,  
>>>>>> 33L,
>>>>>> 34L, 35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 45L, 46L,
>>>>>> 47L, 48L, 49L, 50L, 51L, 52L, 27L, 28L, 29L, 30L, 31L, 32L, 33L,
>>>>>> 34L, 35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 45L, 46L,
>>>>>> 47L, 48L, 49L, 50L, 51L, 52L), Total = c(0L, 0L, 326L, 1735L,
>>>>>> 1807L, 2208L, 3883L, 8820L, 6060L, 19326L, 63158L, 100718L,  
>>>>>> 53015L,
>>>>>> 91689L, 152629L, 122708L, 61293L, 15574L, 86538L, 75365L,  
>>>>>> 303259L,
>>>>>> 19691L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 161L, 321L, 1000L, 4425L,
>>>>>> 13202L, 19726L, 30518L, 84949L, 157260L, 145691L, 85801L, 62044L,
>>>>>> 44439L, 23272L, 22391L, 20159L, 14854L, 35379L, 31142L, 7736L,
>>>>>> 13221L, 4894L), Fry = c(0L, 0L, 326L, 1735L, 1807L, 2208L, 3883L,
>>>>>> 8759L, 6060L, 19326L, 63119L, 100524L, 52582L, 88170L, 145564L,
>>>>>> 111416L, 38233L, 5248L, 17826L, 11038L, 34008L, 215L, 0L, 0L,
>>>>>> 0L, 0L, 0L, 0L, 0L, 0L, 161L, 321L, 1000L, 4425L, 13055L, 19488L,
>>>>>> 30518L, 84818L, 156909L, 144786L, 84207L, 57720L, 31049L, 6858L,
>>>>>> 1616L, 719L, 364L, 49L, 0L, 0L, 0L, 0L), Smolt = c(0L, 0L, 0L,
>>>>>> 0L, 0L, 0L, 0L, 62L, 0L, 0L, 38L, 195L, 433L, 3518L, 7067L,  
>>>>>> 11290L,
>>>>>> 23058L, 10327L, 68712L, 64328L, 269248L, 19479L, 0L, 0L, 0L,
>>>>>> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 147L, 238L, 0L, 131L, 351L,
>>>>>> 905L, 1592L, 4324L, 13391L, 16414L, 20774L, 19444L, 14491L,  
>>>>>> 35330L,
>>>>>> 31142L, 7736L, 13221L, 4894L), FryEq = c(0L, 0L, 326L, 1735L,
>>>>>> 1807L, 2208L, 3883L, 8864L, 6060L, 19326L, 63185L, 100854L,  
>>>>>> 53318L,
>>>>>> 94151L, 157576L, 130610L, 77432L, 22805L, 134639L, 120393L,  
>>>>>> 491733L,
>>>>>> 33327L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 161L, 321L, 1000L, 4425L,
>>>>>> 13306L, 19894L, 30518L, 85042L, 157506L, 146328L, 86914L, 65073L,
>>>>>> 53812L, 34763L, 36931L, 33769L, 24998L, 60110L, 52938L, 13149L,
>>>>>> 22476L, 8319L), Year = c(2012L, 2012L, 2012L, 2012L, 2012L,  
>>>>>> 2012L,
>>>>>> 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L,
>>>>>> 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L,
>>>>>> 2012L, 2012L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L,
>>>>>> 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L,
>>>>>> 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L,
>>>>>> 2011L)), .Names = c("IDWeek", "Total", "Fry", "Smolt", "FryEq",
>>>>>> "Year"), row.names = c(NA, 52L), class = "data.frame")
>>>>>> fish
>>>>>> #  Subset to get the max Total for 2012
>>>>>>   x <- subset(winter,Year==2012 & Total==max(Total));x  # How come one line doesn't work?
>>>>>
>>>>> Works fine for me if I change "winter" to fish here.
>>>>>
>>>>> subset(fish,Year==2012 & Total==max(Total))
>>>>>   IDWeek  Total   Fry  Smolt  FryEq Year
>>>>> 21    47 303259 34008 269248 491733 2012
>>>>>
>>>>>>
>>>>>>   # It works if I subset the year first and then get the Total max from it
>>>>>>   xx <- subset(winter,Year==2012)
>>>>>> xxx <- subset(xx,Total==max(Total));xxx
>>>>>> xxx
>>>>>>
>>>>>> Felipe D. Carrillo
>>>>>> Supervisory Fishery Biologist
>>>>>> Department of the Interior
>>>>>> US Fish & Wildlife Service
>>>>>> California, USA
>>>>>> http://www.fws.gov/redbluff/rbdd_jsmp.aspx
>>>>>>
>>>>>>         [[alternative HTML version deleted]]
>>>>>>
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-help at r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>>>> and provide commented, minimal, self-contained, reproducible  
>>>>>> code.
>>>>>>

David Winsemius, MD
Alameda, CA, USA



