[R] incorrect number of levels
David Winsemius
dwinsemius at comcast.net
Fri Oct 8 22:51:04 CEST 2010
On Oct 8, 2010, at 4:37 PM, David Winsemius wrote:
>
> On Oct 8, 2010, at 3:04 PM, Chagaris, Dave wrote:
>
>> I have a data set 382 rows and 63 columns. One of the columns is
>> bay, and there are 6 bays. But, the number of levels for this
>> factor is 7 when it should be six because there is some 'blank'
>> level "". When I subset for the blank level "", I get 0 rows.
>
> How did you do the subset?
>
>> What in my data could be causing this? Thanks.
>>
>>> dim(datmtx)
>> [1] 382 63
>>
>>
>>> datmtx$bay
>> [1] TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB
>> TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB
>> TB TB TB TB TB TB TB
>> [51] TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB
>> TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB
>> TB TB TB TB TB TB TB
>> [101] TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB
>> TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB TB
>> TB TB TB HI TB HI TB TB
>> [151] TB TB TB TB TB HI TB HI HI HI TB HI HI HI TB HI HI HI HI HI
>> HI HI HI TB TB TB TB CH CH TB CH CH CH CH CH CH CH CH CH CH TB TB
>> CH CH CH CH CH CH CH CH
>> [201] CH CH CH CH CH CH CH CH CH CH CH CH CH CH CH CH CH CH CH CH
>> TB HI HI HI TB HI HI TB TB TB TB TB TB TB TB TB TB HI TB TB TB TB
>> TB TB TB TB TB TB TB TB
>> [251] TB HI HI HI CH CH CH CH CH CH CH CH CH CH HI HI CH CH CH CH
>> CH CH CH CH CH CH CH CH TB TB TB TB TB TB TB TB TB TB CH CH AP AP
>> AP AP AP AP HI HI HI CH
>> [301] CH CH CH AP AP TB TB AP AP AP AP AP AP SA BB BB TB TB TB TB
>> AP HI AP SA AP HI AP AP HI HI TB HI AP SA AP AP AP AP AP AP AP AP
>> SA AP AP SA AP AP AP SA
>> [351] SA SA AP AP AP CH CH CH CH CH AP BB BB BB BB BB TB CH CH CH
>> CH CH CH CH CH CH CH CH CH CH CH CH
>> Levels: AP BB CH HI SA TB
>>
>>> levels(datmtx$bay)
>> [1] "" "AP" "BB" "CH" "HI" "SA" "TB"
>
> What do you get with:
>
> which(!datmtx$bay %in% c("AP", "BB", "CH", "HI", "SA", "TB"))
It occurs to me that you should also report:

which(!levels(datmtx$bay) %in% c("AP", "BB", "CH", "HI", "SA", "TB"))

A factor will not necessarily have every one of its levels represented
by an existing value. If you created the factor and then filled in a
blank instance, the earlier blank level would persist. If you want to
collapse the levels in your factor vector so that only the levels
actually present remain, you can do:

datmtx$bay <- factor(datmtx$bay)
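
To make the point concrete, here is a minimal sketch on toy data. The
vectors bay, bay2 and bay3 and their values are made up for illustration
and are not taken from the poster's data set. It shows a blank level
surviving after the blank value is gone, how to spot it, and how
re-creating the factor drops it:

```r
# Toy data (hypothetical): one blank entry among the bay codes
bay <- factor(c("TB", "HI", "", "CH"))
levels(bay)                       # the blank "" is one of the levels

# Remove the blank observation; subsetting keeps the full level set
bay2 <- bay[bay != ""]
nlevels(bay2)                     # one more than the distinct values left
table(bay2)                       # the "" level appears with a count of 0

# Levels that are not in the expected set of bay codes
expected <- c("AP", "BB", "CH", "HI", "SA", "TB")
extra <- setdiff(levels(bay2), expected)

# Re-create the factor so that only levels present in the data remain
bay3 <- factor(bay2)
levels(bay3)
```

Calling factor() on an existing factor rebuilds the level set from the
values that are present, which is why the unused level goes away. Where
it is available, droplevels() is an alternative that should give the
same result.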
>
> --
> David.
>>
>>> nlevels(datmtx$bay)
>> [1] 7
>>
>> David Chagaris
>> Associate Research Scientist
>> Florida Fish and Wildlife Conservation Commission
>> Florida Fish and Wildlife Research Institute
>> 100 8th Ave SE
>> St. Petersburg, FL 33701
>> (727) 896-8626 ext. 4305
>> (727) 893-1374 fax
>>
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius, MD
> West Hartford, CT
>
David Winsemius, MD
West Hartford, CT