[R] problems with which

Nicola Spotorno nicola.spotorno at isc.cnrs.fr
Sun Aug 15 17:02:46 CEST 2010


You got it exactly right.

Thanks a lot,

Nicola

David Winsemius wrote:
>
> On Aug 15, 2010, at 4:32 AM, Nicola Spotorno wrote:
>
>> Dear all,
>> I'm quite new to R and I have a problem with the function which(). When 
>> I use it to select a subset of a data frame it works well, but 
>> R somehow keeps track of the original data frame, and this creates 
>> problems in subsequent operations.
>> For example:
>>
>> sentences <- read.xls("frasi.tot.march.3.xls", header=TRUE)
>>
>> head(sentences)
>> fam subjID Cond  Code reg     total     first    second
>> 1   f     30   an fDan1   1 0.2812500 0.2812500 0.0000000
>> 2   f     30   an fDan1   2 1.7851562 0.5390625 1.2460938
>> 3   f     30   an fDan1   3 1.2304688 0.6679688 0.5625000
>> 4   f     30   an fDan1   4 0.6289062 0.4375000 0.1914062
>> 5   f     30   an fDan2   1 0.1367188 0.1367188 0.0000000
>> 6   f     30   an fDan2   2 0.8632812 0.6679688 0.1953125
>>
>> str(sentences)
>> 'data.frame':    4799 obs. of  8 variables:
>> $ fam   : Factor w/ 2 levels "f","uf": 1 1 1 1 1 1 1 1 1 1 ...
>> $ subjID: int  30 30 30 30 30 30 30 30 30 30 ...
>> $ Cond  : Factor w/ 4 levels "an","fi","le",..: 1 1 1 1 1 1 1 1 1 1 ...
>> $ Code  : Factor w/ 126 levels "fAan1","fAan2",..: 72 72 72 72 73 73 73 73 74 74 ...
>> $ reg   : int  1 2 3 4 1 2 3 4 1 2 ...
>> $ total : num  0.281 1.785 1.23 0.629 0.137 ...
>> $ first : num  0.281 0.539 0.668 0.438 0.137 ...
>> $ second: num  0 1.246 0.562 0.191 0 ...
>>
>> # If you look at the variable "Cond" you see that it has 4 levels
>>
>> sentences_trial <- sentences[which(sentences$Cond!= "an"),]
>>
>> > str(sentences)
>> 'data.frame':    4799 obs. of  8 variables:
>> $ fam   : Factor w/ 2 levels "f","uf": 1 1 1 1 1 1 1 1 1 1 ...
>> $ subjID: int  30 30 30 30 30 30 30 30 30 30 ...
>> $ Cond  : Factor w/ 4 levels "an","fi","le",..: 1 1 1 1 1 1 1 1 1 1 ...
>> $ Code  : Factor w/ 126 levels "fAan1","fAan2",..: 72 72 72 72 73 73 73 73 74 74 ...
>> $ reg   : int  1 2 3 4 1 2 3 4 1 2 ...
>> $ total : num  0.281 1.785 1.23 0.629 0.137 ...
>> $ first : num  0.281 0.539 0.668 0.438 0.137 ...
>> $ second: num  0 1.246 0.562 0.191 0 ...
>>
>> # Now the variable "Cond" still has 4 levels, even though I have 
>> excluded one level with which()!
>
> You showed us two copies of str(sentences). How can we possibly know 
> what sentences_trial looks like?
>
>> # If I call interaction.plot at this point, the graph still shows 
>> all 4 levels of Cond.
>
> If you want to remove factor levels from a column just use factor() on 
> it again:
>
> sentences_trial$Cond <- factor(sentences_trial$Cond)
>
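> Or do the row selection with subset() and then re-factor the column 
> (note that drop=TRUE in subset() on a data frame is passed on to the 
> indexing step and does not remove unused factor levels):
>
> sentences_trial <- subset( sentences, Cond != "an" )
> sentences_trial$Cond <- factor(sentences_trial$Cond)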
>
>>
>> attach(sentences_trial)
>> x11()
>> interaction.plot(Cond,fam,total)
>>
>> # Where is the problem?
>>
>
> I think I identified it, but it was without a reproducible example so 
> it remains only an attractive theory.
>
> David Winsemius, MD
> West Hartford, CT
>
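
A minimal reproducible sketch of the behaviour discussed in this thread, in case it helps later readers. The data frame `toy` and its values are made up for illustration (the real data come from frasi.tot.march.3.xls, and the fourth Cond level is not shown in the output above, so "other" is a placeholder). It assumes a version of R that has droplevels() (R 2.12.0 or later, as far as I know); the factor() approach works in older versions too.

```r
# Hypothetical stand-in for `sentences`; values are invented.
toy <- data.frame(
  Cond  = factor(c("an", "fi", "le", "other", "an", "fi", "le", "other")),
  fam   = factor(c("f", "f", "f", "f", "uf", "uf", "uf", "uf")),
  total = c(0.28, 1.79, 1.23, 0.63, 0.14, 0.86, 0.55, 0.47)
)

# Row subsetting keeps the full set of factor levels,
# even for values that no longer occur in the data.
toy_trial <- toy[which(toy$Cond != "an"), ]
nlevels(toy_trial$Cond)   # 4
table(toy_trial$Cond)     # "an" is still listed, with a count of 0

# subset() with drop = TRUE does not remove the unused level either.
toy_sub <- subset(toy, Cond != "an", drop = TRUE)
nlevels(toy_sub$Cond)     # 4

# Re-creating the factor drops the unused level.
toy_trial$Cond <- factor(toy_trial$Cond)
nlevels(toy_trial$Cond)   # 3

# droplevels() does the same for every factor column of a data frame.
toy_clean <- droplevels(toy[which(toy$Cond != "an"), ])
nlevels(toy_clean$Cond)   # 3
```

With the unused level removed, a call such as with(toy_clean, interaction.plot(Cond, fam, total)) should only show the three remaining conditions on the x axis; using with() here also avoids the need for attach().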


