[R] issue with levels of a factor after subsetting

marcos carvajalino maancafe240 at gmail.com
Mon Oct 26 21:39:29 CET 2009


Hi

Second question in a day, I'm beginning to feel incompetent...

This time I'm having a weird problem. I'm importing the following dataset:

> car <- read.csv2("Historicos.csv")
> str(car)
'data.frame':   1818 obs. of  6 variables:
 $ Dpto  : Factor w/ 11 levels "ANTIOQUIA","ATLÁNTICO",..: 2 2 2 2 2 1 1 1 1 5 ...
 $ Rio   : Factor w/ 43 levels "Acandí","Anchicayá",..: 26 26 26 26 26 4 4 4 4 39 ...
 $ Var   : Factor w/ 13 levels "CAUDAL","CD",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ Valor : num  7150 7150 7121 7121 7121 ...
 $ Año   : int  2002 2003 2004 2009 2005 2002 2003 2004 2005 2009 ...
 $ Región: Factor w/ 2 levels "CARIBE","PACIFICO": 1 1 1 1 1 1 1 1 1 2 ...

The variable "Rio" contains the names of 43 rivers in Colombia. Now my
boss wants me to show just 4 of them in one graph and the other 39 in
another, so I subsetted them using the following code:

# The first 4 rivers
> car4 <- car[car$Rio %in% c("Magdalena", "Atrato", "San Juan", "Mira"), ]

# The other 39
> car5 <- car[!car$Rio %in% c("Magdalena", "Sinú", "Atrato", "San Juan", "Mira", "Micay",
+                             "Patia", "Canal del Dique", "Iscuandé", "Guapi"), ]
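
In case it helps to reproduce the subsetting without my CSV, here is a minimal sketch of the same %in% pattern on made-up data. The data frame `toy`, its values, and the name `main_rivers` are invented for illustration; only the river names come from my data.

```r
# Toy stand-in for car; the values are invented, not taken from Historicos.csv.
toy <- data.frame(
  Rio   = factor(c("Magdalena", "Atrato", "Micay", "Guapi", "Mira", "Patia")),
  Valor = c(7150, 4900, 270, 350, 870, 330)
)

# Hypothetical helper vector holding the rivers that go in the first graph.
main_rivers <- c("Magdalena", "Atrato", "San Juan", "Mira")

# Rows whose river is in the list, and rows whose river is not.
toy_main  <- toy[toy$Rio %in% main_rivers, ]
toy_other <- toy[!toy$Rio %in% main_rivers, ]

nrow(toy_main)   # 3 (Magdalena, Atrato, Mira)
nrow(toy_other)  # 3 (Micay, Guapi, Patia)
```

When the same vector is used for both subsets, as here, every row ends up in exactly one of the two data frames.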

And I plot the two graphs using:

library(lattice)  # provides xyplot()

xyplot(Valor ~ Año | Var, groups = Rio,
       data = car4[car4$Var %in% c("NT", "PO4", "HDD", "CTE", "SST", "OCT"), ],
       layout = c(2, 3), subscripts = TRUE, scales = list(y = list(relation = "free")), type = "b")

xyplot(Valor ~ Año | Var, groups = Rio,
       data = car5[car5$Var %in% c("NT", "PO4", "HDD", "CTE", "SST", "OCT"), ],
       layout = c(2, 3), subscripts = TRUE, scales = list(y = list(relation = "free")), type = "b")

Until then everything was going smoothly, but when I tried to add a custom
key using key=list(corner=c(1,1),border=T,lines=T,text=list(levels(car4$Rio)))
I was very surprised to get all 43 river names in the legend instead of
the expected 4.

I thought it was my fault and that I had missed something in the key
instruction, but when I checked the structure of the car4 data frame
(the one with just the 4 selected rivers) I found this:

> str(car4)
'data.frame':   230 obs. of  6 variables:
 $ Dpto  : Factor w/ 11 levels "ANTIOQUIA","ATLÁNTICO",..: 2 2 2 2 2 1 1 1 1 5 ...
 $ Rio   : Factor w/ 43 levels "Acandí","Anchicayá",..: 26 26 26 26 26 4 4 4 4 39 ...
 $ Var   : Factor w/ 13 levels "CAUDAL","CD",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ Valor : num  7150 7150 7121 7121 7121 ...
 $ Año   : num  2002 2003 2004 2009 2005 ...
 $ Región: Factor w/ 2 levels "CARIBE","PACIFICO": 1 1 1 1 1 1 1 1 1 2 ...
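
The same thing seems to happen on a small made-up example, so I don't think it is specific to my file. This is a minimal sketch, assuming an invented data frame `toy` (the values are not from Historicos.csv):

```r
# Toy data frame with a 6-level factor; values are invented for illustration.
toy <- data.frame(
  Rio   = factor(c("Magdalena", "Atrato", "Micay", "Guapi", "Mira", "Patia")),
  Valor = c(7150, 4900, 270, 350, 870, 330)
)

# Keep only three of the six rivers.
toy4 <- toy[toy$Rio %in% c("Magdalena", "Atrato", "Mira"), ]

nrow(toy4)                       # 3 rows are left...
nlevels(toy4$Rio)                # ...but the factor still reports 6 levels
levels(toy4$Rio)                 # all six names, including the rivers I removed
as.character(unique(toy4$Rio))   # "Magdalena" "Atrato" "Mira" (values actually present)
```

So the rows are filtered correctly, but levels() still returns the full set of names, which is what ends up in my legend.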

The new data frame (car4) kept the factor levels of the old data
frame (car). How can I drop the unused levels from the new data frame
and keep just the 4 selected ones?

Thanks in advance...
--
Marcos Antonio Carvajalino Fernández
Estudiante de Ingeniería Ambiental y Sanitaria
Universidad del Magdalena, Colombia



