[R] grep won't work finding one column

Tue Oct 14 22:37:05 CEST 2014

On 15/10/14 04:09, Kate Ignatius wrote:
> In the sense - it does not work.  it works when there are 50 samples
> in the file, but it does not work when there is one.
>
> The usual headings are:  sample1.at sample1.dp
> sample1.fg sample2.at sample2.dp sample2.fg.... and so on to a max of
> sample50.at sample50.dp sample50.fg
>
> using this greps out all the .at columns perfectly:
>
> df[,grep(".at",colnames(df))]
>
> When I come across a file when there is one sample:
>
> sample1.at sample1.dp sample1.fg
>
> Using this:
>
> df[,grep(".at",colnames(df))]
>
> returns nothing.
>
> Oh - AT/at was just an example... thats not my problem...

You are being (deliberately?) obtuse.

It's *all* your problem.  You have to be precise when working with 
computers and when providing examples.  Don't build examples with 
confusing red herrings.

Your assertion that "df[,grep(".at",colnames(df))] returns nothing" is 
simply ***INCORRECT***.  It works just fine.  See the (tidy, completely 
reproducible) example in the attached file "kate.txt".

Note that, with a single ".at" column in your data frame, what is 
returned is ***NOT*** a data frame but rather a vector.  If you want a 
(one-column) data frame you need to use "drop=FALSE" in your 
subscripting call.
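
A minimal sketch of the difference, assuming a made-up one-sample data 
frame (the names "d", "idx", "v" and "s" are illustrative only and do not 
come from your data):

```r
# Hypothetical one-sample data frame; names and values are for illustration only.
d <- data.frame(sample1.at = 1:3, sample1.dp = 4:6, sample1.fg = 7:9)

# Same style of column selection as in the original question.
idx <- grep(".at", colnames(d))

v <- d[, idx]                # only one column matches, so the result is a plain vector
s <- d[, idx, drop = FALSE]  # drop = FALSE keeps it as a one-column data frame

print(is.data.frame(v))
print(is.data.frame(s))
print(colnames(s))
```

With many matching columns both forms give a data frame, so adding 
drop=FALSE costs nothing and makes the result type the same whatever 
the number of samples.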

You need to study up on R and learn how it works (read the Introduction 
to R) and stop going off half-cocked.

cheers,

Rolf Turner

P.S.  It is a ***bad*** idea to use "df" as the name of a data frame. 
The string "df" is the name of a *function* in base R (it is the 
probability density function for the F distribution).  Although R is 
clever enough to distinguish functions from data objects in *most* 
circumstances, at the very least confusion could arise.
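
A rough illustration of the sort of confusion I mean, assuming a fresh 
session in which no data frame called "df" has been created (the names 
"f", "msg" and "dat" are made up for this sketch):

```r
# With no user object named "df", the name refers to the F-density function
# (it lives in the stats package, which is loaded by default).
f <- stats::df
print(is.function(f))

# Subscripting the function as if it were a data frame fails with an error;
# tryCatch() captures the error message instead of stopping the script.
msg <- tryCatch(f[, 1], error = function(e) conditionMessage(e))
print(msg)

# A distinct name for the data avoids the clash altogether.
dat <- data.frame(sample1.at = 1:3)
print(is.data.frame(dat))
```

So if the line that creates your data frame fails or is skipped, later 
code that refers to "df" finds the function instead and the resulting 
error message can be puzzling.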

R. T.

-- 
Rolf Turner
Technical Editor ANZJS
-------------- next part --------------
#
# Check it out.
#

# Data frame with one ".at" column.
d1 <- as.data.frame(matrix(1,ncol=3,nrow=10))
n1 <- c("sample1.at","sample1.dp","sample1.fg")
names(d1) <- n1

# Data frame with many ".at" columns.
d2 <- as.data.frame(matrix(1,ncol=50,nrow=10))
set.seed(42)
n2 <- paste("sample",1:50,sample(c(".at",".dp",".fg"),50,TRUE),sep="")
names(d2) <- n2

# Extract the ".at" columns.
print(d1[,grep(".at",colnames(d1))])
print(d2[,grep(".at",colnames(d2))])