[R-sig-eco] subsetting data in R

Gustavo Carvalho gustavo.bio+R at gmail.com
Sun Apr 24 16:30:38 CEST 2011


pa2 <- subset(pa, influencia=="AP")
pa2$influencia <- factor(pa2$influencia)
levels(pa2$influencia)
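
To spell out why the attempt quoted below failed: factor(subset(...)$influencia)
returns a bare factor (an atomic vector), not a data frame, so there is no
$influencia column to extract from it. Below is a minimal sketch on made-up
data. The data frame `pa` here is a toy stand-in, not the real
espec_indic.csv, and the `especies` column is hypothetical.
stringsAsFactors = TRUE is set explicitly so the column is a factor whatever
the default is. It also shows droplevels(), which to my knowledge is
available from R 2.12.0 onwards, as an alternative that drops unused levels
from every factor column at once.

```r
# Toy stand-in for the real data: the influencia column follows the thread,
# the especies column and all values are made up for illustration.
pa <- data.frame(
  influencia = c("AID", "AII", "AP", "AP", "AID"),
  especies   = c(3, 5, 2, 7, 4),
  stringsAsFactors = TRUE   # force a factor column, whatever the default is
)

# Option 1: subset, then rebuild the factor in place (as in the lines above).
pa2 <- subset(pa, influencia == "AP")
pa2$influencia <- factor(pa2$influencia)
levels(pa2$influencia)

# Option 2: droplevels() removes unused levels from all factor columns.
pa3 <- droplevels(subset(pa, influencia == "AP"))
levels(pa3$influencia)

# The quoted attempt: the result is a factor, not a data frame,
# so ask for its levels directly, without the $ operator.
f <- factor(subset(pa, influencia == "AP")$influencia)
is.data.frame(f)
levels(f)
```

Option 1 touches only the one column. Option 2 is shorter when the data
frame has several factor columns that should all lose their unused levels.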

On Sun, Apr 24, 2011 at 11:24 AM, Manuel Spínola <mspinola10 at gmail.com> wrote:
> Thank you very much for your response, Christian, Roman, and Sarah.
>
> Sarah,
>
> I am trying your suggestion but I cannot see the levels:
>
>  > pa2 = factor(subset(pa, influencia=="AP")$influencia)
>  > levels(pa2$influencia)
> Error in pa2$influencia : $ operator is invalid for atomic vectors
>
> Best,
>
> Manuel
>
>
>
> On 24/04/2011 07:51 a.m., Sarah Goslee wrote:
>> By default, read.csv() turns character variables into factors, using all the
>> unique values as the levels.
>>
>> subset() retains those levels by default, as they are a vital element of the
>> data. If you are studying some attribute of men and women, say height,
>> even if you are only looking at the heights for women it's important to remember
>> that men still exist.
>>
>> If you don't want influencia to be a factor, you can change that at import
>> by passing stringsAsFactors=FALSE to read.csv().
>>
>> If you do want influencia to be a factor, but want the unused levels to be
>> removed, you can use factor() to do that.
>>
>>> testdata<- data.frame(group=c("A", "B", "C", "A", "B", "C"), value=1:6)
>>> testdata
>>    group value
>> 1     A     1
>> 2     B     2
>> 3     C     3
>> 4     A     4
>> 5     B     5
>> 6     C     6
>>> str(testdata)
>> 'data.frame': 6 obs. of  2 variables:
>>   $ group: Factor w/ 3 levels "A","B","C": 1 2 3 1 2 3
>>   $ value: int  1 2 3 4 5 6
>>> subset(testdata, group=="A")
>>    group value
>> 1     A     1
>> 4     A     4
>>> subset(testdata, group=="A")$group
>> [1] A A
>> Levels: A B C
>>> ?subset
>>> factor(subset(testdata, group=="A")$group)
>> [1] A A
>> Levels: A
>>
>> Sarah
>>
>> On Sun, Apr 24, 2011 at 9:04 AM, Manuel Spínola <mspinola10 at gmail.com> wrote:
>>> Dear list members,
>>>
>>> I have a question regarding subsetting a data set in R.
>>>
>>> I created an object for my data:
>>>
>>>   > pa = read.csv("espec_indic.csv", header = TRUE, sep = ",", check.names = FALSE)
>>>
>>>   >  levels(pa$influencia)
>>> [1] "AID" "AII" "AP"
>>>
>>> The object has 3 levels for influencia (AP, AID, AII)
>>>
>>> Now I subset only observations with influencia = "AID"
>>>
>>>   >pa2 = subset(pa, influencia=="AID")
>>>
>>> but if I ask for the levels of influencia, it still shows me all 3 levels,
>>> AP, AID, AII.
>>>
>>>   >  levels(pa2$influencia)
>>> [1] "AID" "AII" "AP"
>>>
>>> Why is that?
>>>
>>> I was thinking that I was creating a new data frame with only AID as a
>>> level for influencia.
>>>
>>> How can I make a completely new object with only the observations for
>>> "AID", in which the only level for influencia is indeed "AID"?
>>>
>>> Best,
>>>
>>> Manuel
>>>
>>>
>
>
> --
> *Manuel Spínola, Ph.D.*
> Instituto Internacional en Conservación y Manejo de Vida Silvestre
> Universidad Nacional
> Apartado 1350-3000
> Heredia
> COSTA RICA
> mspinola at una.ac.cr
> mspinola10 at gmail.com
> Teléfono: (506) 2277-3598
> Fax: (506) 2237-7036
> Personal website: Lobito de río
> <https://sites.google.com/site/lobitoderio/>
> Institutional website: ICOMVIS <http://www.icomvis.una.ac.cr/>
>
>
> _______________________________________________
> R-sig-ecology mailing list
> R-sig-ecology at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>


