[R] Selecting names with regard to visit frequency
arun
smartpink111 at yahoo.com
Wed Jul 24 00:39:00 CEST 2013
Hi Michael,
The error could be due to some extra whitespace in the pasted text. If you use read.table(..., fill=TRUE), it should read in, but the result would then contain missing values. Sharing the data with dput() (see ?dput) is more reliable.
dput(df1)
structure(list(x = c(2L, 5L, 4L, 6L, 24L, 7L, 12L, 3L, 5L)), .Names = "x", class = "data.frame", row.names = c("A1",
"A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9"))
Now, try the code by assigning:
df1<- structure(list(x.....
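If you would rather read the sample from pasted text, something along these lines may work. This is only a sketch: it assumes the data are comma-separated with a header line of "","x", as the text in the error message suggests, and the name txt is just for illustration.

```r
# Assumed layout: comma-separated, first column holds the row names.
txt <- '"","x"
"A1",2
"A2",5
"A3",4
"A4",6
"A5",24
"A6",7
"A7",12
"A8",3
"A9",5'

# row.names = 1 turns the first column into row names instead of data.
df1 <- read.csv(text = txt, row.names = 1)
str(df1)

# dput() prints a structure(...) call that others can paste to rebuild df1.
dput(df1)
```

Pasting the dput() output avoids any dependence on how whitespace survives the e-mail.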
It wouldn't work with decimals because 3:5 contains only whole numbers:
3:5
#[1] 3 4 5 #%in% matches only values that are exactly 3, 4, or 5
Trying this on another dataset:
df2<- structure(list(x = c(2, 5, 4.4, 6, 24, 7, 12, 3.6, 5)), .Names = "x", class = "data.frame", row.names = c("A1",
"A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9"))
vec2<- unlist(df2)
names(vec2)<- row.names(df2)
vec2
# A1 A2 A3 A4 A5 A6 A7 A8 A9
# 2.0 5.0 4.4 6.0 24.0 7.0 12.0 3.6 5.0
names(vec2)[vec2%in% 3:5] #incorrect
#[1] "A2" "A9"
names(vec2)[vec2%in% seq(3,5,by=0.1)]
#[1] "A2" "A3" "A8" "A9"
#If I change the third value so that it is no longer a multiple of 0.1, A3 drops out
vec2[3]<- 4.46
names(vec2)[vec2%in% seq(3,5,by=0.1)]
#[1] "A2" "A8" "A9"
names(vec2)[round(vec2,1)%in% seq(3,5,by=0.1)]
#[1] "A2" "A3" "A8" "A9"
names(vec2)[vec2>=3 & vec2<=5] #should be better in such cases
#[1] "A2" "A3" "A8" "A9"
It is also worth checking R FAQ 7.31, which explains why floating-point numbers do not always compare as exactly equal.
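A short illustration of that point, and of why the range comparison is the safer choice. The helper in_range() and its tol argument are made up for this example and are not part of base R.

```r
# Decimal fractions are stored as binary approximations, so exact
# equality tests (==, %in%, match) can fail unexpectedly.
0.1 + 0.2 == 0.3                     # FALSE
isTRUE(all.equal(0.1 + 0.2, 0.3))    # TRUE: all.equal compares within a tolerance
0.3 %in% seq(0, 1, by = 0.1)         # FALSE: the sequence holds 3 * 0.1, not 0.3

# Hypothetical helper: closed-interval test with a small tolerance at the ends.
in_range <- function(x, lower, upper, tol = 1e-8) {
  x >= lower - tol & x <= upper + tol
}

vec2 <- c(A1 = 2, A2 = 5, A3 = 4.46, A4 = 6, A5 = 24,
          A6 = 7, A7 = 12, A8 = 3.6, A9 = 5)
names(vec2)[in_range(vec2, 3, 5)]
#[1] "A2" "A3" "A8" "A9"
```

The same call works for the percentage data, e.g. in_range(vec, 0.3, 0.5), with no need to enumerate every possible value.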
A.K.
Hi Arun,
Perhaps these are dataframes I am working with, and have mistaken
them for vectors (I am still very new at this and learning the data
structures).
I tried to read the text in as you have it here (copied and pasted), but it did not work.
Error in read.table(text = " \n\"\",\"x\" \n\"A1\",2 \n\"A2\",5
\n\"A3\",4 \n\"A4\",6 \n\"A5\",24 \n\"A6\",7 \n\"A7\",12 \n\"A8\",3
\n\"A9\",5 \n", :
more columns than column names
I retried both:
names(vec1)[vec1%in% 3:5]
&
names(vec1)[!is.na(match(vec1,3:5))]
before and after processing my current dataframe to a vector but
I get a NULL return. I also get a NULL return if I unlist the dataframe
and try to execute:
names(vec1)[vec1>=3 & vec1<=5]
All 3 do work if I keep the dataframe in its original form, instead of using:
vec1<-unlist(df1)
names(vec1)<- row.names(df1)
I discovered another issue, however. I am working with a couple of
datasets: one has whole numbers, and the other has percentages in
place of visits, such as:
"A1",0.2
"A2",0.5
...
the two options:
names(vec1)[vec1%in% 3:5]
names(vec1)[!is.na(match(vec1,3:5))]
do not seem to work with ranges given in decimals (and that is
probably what I originally tested them on) but are fine with whole
numbers.
Thanks,
steele