[R] Help with IF command strings

arun smartpink111 at yahoo.com
Fri Jul 12 17:53:28 CEST 2013


Hi,
Regarding the 2nd issue, mean=3.8 being "too high": could you explain why you think it is too high?
#Using the same example:
 dat1$V21[dat1$V2==1|dat1$V2==0]
#[1]  6  2  1 10  0
 (6+2+1+10+0)/5
#[1] 3.8
 mean(dat1$V21[dat1$V2==1|dat1$V2==0])
#[1] 3.8

About missing data:
set.seed(55)
dat2<- as.data.frame(matrix(sample(c(NA,0:4),26*10,replace=TRUE),ncol=26))  ####new example dataset
 dat2$V2
 #[1]  4 NA  0  0  1  3  2  4  2  1
dat2$V21
 #[1] NA  3  0  0  2  0  4  0  3 NA
(dat2$V2==1|dat2$V2==0) &!is.na(dat2$V2)
# [1] FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE  TRUE
 dat2$V21[(dat2$V2==1|dat2$V2==0) &!is.na(dat2$V2)]
#[1]  0  0  2 NA
mean(dat2$V21[(dat2$V2==1|dat2$V2==0) &!is.na(dat2$V2)],na.rm=TRUE)
#[1] 0.6666667
 (0+0+2)/3
#[1] 0.6666667


If this doesn't solve the problem, please provide a reproducible example using dput() (see ?dput),
e.g.:
dput(head(dataset,20))
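
To illustrate why dput() output makes an example reproducible, here is a minimal sketch. The data frame `dataset` below is a hypothetical stand-in for the real data; the point is only that the text printed by dput() can be pasted into another R session to rebuild the same object.

```r
# Hypothetical small data frame standing in for the real data
dataset <- data.frame(V2 = c(1, 0, NA), V21 = c(5, 3, 2))

# Capture the text that dput() would print to the console
txt <- capture.output(dput(head(dataset, 20)))

# Evaluating that text recreates the data frame, which is what a reader
# of the list does after copying it from the email
restored <- eval(parse(text = txt))
all.equal(restored, dataset)
```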

A.K.



When I enter that formula I get "NA" or "NaN" as an answer.  I have some 
missing data, which was entered in as NA, so I'm not sure if that is the
 problem.  Originally I thought I would need to do the entire set of 
equations you posted, but that gave me 3.8 as a mean, which I know is 
too high to be the mean for this data set. 

Thanks 



----- Original Message -----
From: arun <smartpink111 at yahoo.com>
To: R help <r-help at r-project.org>
Cc: 
Sent: Friday, July 12, 2013 8:21 AM
Subject: Re: Help with IF command strings

Hi,

Not sure I understand your question.
Suppose `data1` is your real data. If its column names are different, replace "V21" and "V2" with the names used in the real data. Based on your initial post, the column names seemed to be the same.
mean(data1$V21[data1$V2==1|data1$V2==0])

A.K.  


What values would I substitute with the real data?  I did everything the way 
you posted, and I got 3.8 as well.  So I'm curious what values I would 
change to get the mean for the actual data? 


----- Original Message -----
From: arun <smartpink111 at yahoo.com>
To: R help <r-help at r-project.org>
Cc: 
Sent: Thursday, July 11, 2013 9:21 PM
Subject: Re: Help with IF command strings

Hi,
Try this:
set.seed(485)
dat1<- as.data.frame(matrix(sample(0:10,26*10,replace=TRUE),ncol=26))
mean(dat1$V21[dat1$V2==1|dat1$V2==0])
#[1] 3.8
#or
with(dat1,mean(V21[V2==1|V2==0]))
#[1] 3.8
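
For the "other forms of analysis" mentioned in the question, one possible approach is to subset the data once and reuse the subset. This is only a sketch: the data frame `toy` and its values are made up for illustration, and only the column names V2 and V21 come from the thread.

```r
# Hypothetical miniature data set; only the names V2 and V21 come from the thread
toy <- data.frame(V2  = c(0, 1, 2, 0, 1, 3),
                  V21 = c(6, 2, 9, 1, 10, 4))

# Keep only participants whose condition (V2) is 0 or 1
sel <- subset(toy, V2 %in% c(0, 1))

# Pooled mean over both conditions
mean(sel$V21)
#[1] 4.75

# Any other summary can be run on the same subset
sd(sel$V21)

# Separate mean for each condition (names of the result are the V2 values)
tapply(sel$V21, sel$V2, mean)
#  0   1 
#3.5 6.0 
```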


A.K.


I have data in 26 columns. I'm trying to get a mean for column 21 only for the participants that are either 0 or 1 in column 2. 

One of the commands I tried looked something like this 

mean(data1$V21, if(V2 = 1))   

So basically I need to have the program run a mean (and later 
other forms of analysis) on participants based on their condition, 
either 0 or 1. 

Help is greatly appreciated. 

Thanks


