[R] problem with grep under loop

arun smartpink111 at yahoo.com
Sat Sep 14 06:46:52 CEST 2013


Hi,
dat1<- read.table("gao.txt",sep="",header=FALSE,stringsAsFactors=FALSE)
 dat1
#           V1     V2         V3           V4      V5  V6       V7           V8
#1 ref_gene_id ref_id class_code cuff_gene_id cuff_id FMI     FPKM FPKM_conf_lo
#2           -      -          u          C.3   C.3.1 100 1.000000     0.000000
#3           -      -          u          C.2   C.2.1 100 1.000000     0.000000
#4           -      -          u          C.4   C.4.1 100 1.000000     0.000000
#5           -      -          u          C.1   C.1.1 100 1.000000     0.000000
#6           -      -          u          C.5   C.5.1 100 1.000000     0.000000
#            V9      V10 V11          V12           V13
#1 FPKM_conf_hi      cov len major_iso_id ref_match_len
#2     0.000000 0.056682  96        C.3.1             -
#3     0.000000 0.058453  99        C.2.1             -
#4     0.000000 0.059634 101        C.4.1             -
#5     0.000000 0.059634 101        C.2.1             -
#6     0.000000 0.059634 101        C.5.1             -

Your file already contains the column names in its first line, so you should read it with read.table(...., header=TRUE). Otherwise the names end up as row 1 of the data, as shown above.
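As a rough illustration of the header=TRUE advice, here is a minimal sketch that reads a small inline copy of the data instead of "gao.txt". The object names txt and dat2 and the reduced set of columns are assumptions made only for this example.

```r
# Inline stand-in for the file: a few columns and rows copied from the output above.
txt <- "ref_gene_id ref_id class_code cuff_gene_id cuff_id major_iso_id
- - u C.3 C.3.1 C.3.1
- - u C.2 C.2.1 C.2.1
- - u C.1 C.1.1 C.2.1"

# header = TRUE turns the first line into column names instead of data row 1.
dat2 <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)

names(dat2)  # the real column names, not V1, V2, ...
nrow(dat2)   # only the data rows are counted
```

With the names in place, columns can be addressed as dat2$cuff_gene_id and dat2$major_iso_id rather than by position.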

for(i in 1:nrow(dat1)){if(length(grep(dat1[i,4],dat1[i,12]))!=0) print("Y")}
[1] "Y"
[1] "Y"
[1] "Y"
[1] "Y"
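
The loop above could also be written without an explicit for loop. The following is only a sketch: gene_id and major_iso are hypothetical stand-ins for columns 4 and 12 of the data rows shown above. It uses grepl(), which returns TRUE or FALSE rather than a possibly empty index, and fixed=TRUE so that "." is matched literally instead of as "any character".

```r
# Hypothetical stand-ins for columns 4 and 12 (data rows only, no header row).
gene_id   <- c("C.3", "C.2", "C.4", "C.1", "C.5")
major_iso <- c("C.3.1", "C.2.1", "C.4.1", "C.2.1", "C.5.1")

# Test each pattern against the element in the same position.
hit <- mapply(function(p, x) grepl(p, x, fixed = TRUE),
              gene_id, major_iso, USE.NAMES = FALSE)

hit          # one TRUE/FALSE per row
which(!hit)  # position(s) where the gene id does not occur in the isoform id
```

Note that a plain substring test would also accept "C.1" inside "C.10.1", so a stricter prefix check may be needed on the full data.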

A.K.



Sorry about that. I will try to reformat my question. 

I have a dataset with format like: 
---------------------------------- 
> head(data) 
           V1     V2         V3           V4      V5  V6       V7           V8
1 ref_gene_id ref_id class_code cuff_gene_id cuff_id FMI     FPKM FPKM_conf_lo
2           -      -          u          C.3   C.3.1 100 1.000000     0.000000
3           -      -          u          C.2   C.2.1 100 1.000000     0.000000
4           -      -          u          C.4   C.4.1 100 1.000000     0.000000
5           -      -          u          C.1   C.1.1 100 1.000000     0.000000
6           -      -          u          C.5   C.5.1 100 1.000000     0.000000
            V9      V10 V11          V12           V13
1 FPKM_conf_hi      cov len major_iso_id ref_match_len
2     0.000000 0.056682  96        C.3.1             -
3     0.000000 0.058453  99        C.2.1             -
4     0.000000 0.059634 101        C.4.1             -
5     0.000000 0.059634 101        C.7.1             -
6     0.000000 0.059634 101        C.5.1             -
--------------------- 
Here column 5 is column 4 with an extra ".1" appended, and column 12
has the same format as column 5. In most rows columns 5 and 12 are
identical (as in the rest of the rows shown), but in some they differ,
for example row 5. I am trying to find the rows that differ by using "grep".
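
If the goal is only to find rows where column 5 and column 12 differ, a direct comparison may be enough and no pattern matching is needed. This is a sketch under that assumption; cuff_id and major_iso are hypothetical vectors holding the data rows of columns 5 and 12 from the table above.

```r
# Hypothetical stand-ins for columns 5 and 12 (data rows only).
cuff_id   <- c("C.3.1", "C.2.1", "C.4.1", "C.1.1", "C.5.1")
major_iso <- c("C.3.1", "C.2.1", "C.4.1", "C.7.1", "C.5.1")

# Element-wise comparison: TRUE where the two ids are not identical.
differs <- cuff_id != major_iso

which(differs)        # positions of the differing rows
cuff_id[differs]      # the column 5 values in those rows
major_iso[differs]    # the corresponding column 12 values
```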

this data has a dimension of  52086  by 13 

so I ran the following code: 
------------ 
> dim(data) 
[1] 52086    13 
>  if(grep(data[3,4],data[3,12])==1) print("Y") 
[1] "Y" 
> for(i in 1:52086){if(grep(data[i,4],data[i,12])==1) print ("Y")} 
Error in if (grep(data[i, 4], data[i, 12]) == 1) print("Y") : 
  argument is of length zero 

----------- 

Here I tested the grep command on a single row first and it looked OK. However, when I put it in a for loop, I got this error message: 
argument is of length zero 
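
A likely explanation, sketched below: grep() returns the positions of the matches, and when nothing matches it returns integer(0). Comparing that with 1 gives an empty logical vector, which if() cannot use. In the data shown, row 1 holds the column names ("cuff_gene_id" versus "major_iso_id"), so the very first iteration would already have no match. The names no_match and msg are only for this illustration.

```r
no_match <- grep("C.1", "C.7.1")  # pattern does not occur in the string
no_match                          # integer(0)
length(no_match == 1)             # 0: the comparison result is empty too

# if() needs exactly one TRUE or FALSE, so an empty condition is an error.
msg <- tryCatch(
  {
    if (no_match == 1) print("Y")
    "no error"
  },
  error = function(e) conditionMessage(e)
)
msg  # the captured error message
```

Testing length(grep(...)) > 0, or using grepl(), avoids the empty condition.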

Could you please help me figure out what happened here? 

Thanks a lot for your help. 




----- Original Message -----
From: capricy gao <capricyg at yahoo.com>
To: "r-help at r-project.org" <r-help at r-project.org>
Cc: 
Sent: Friday, September 13, 2013 9:29 PM
Subject: [R] problem with grep under loop



I am just testing the possibility of using grep inside a for loop:

>for(i in 1:10){grep("a",letters)}


Nothing was printed.

when I ran: 


>grep("a",letters)


I got "1"

so in my for loop, I expected to see ten "1"s, but I did not.
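
For context, a value is auto-printed only when an expression is evaluated at the top level of the console. Inside a for loop the result of grep() is computed and then discarded unless it is printed or stored explicitly. A small sketch (the name res is just for illustration):

```r
# Inside a loop the value is computed but not shown.
for (i in 1:3) {
  grep("a", letters)
}

# Wrapping the call in print() displays it on every pass.
for (i in 1:3) {
  print(grep("a", letters))
}

# Alternatively, collect the results and inspect them afterwards.
res <- vapply(1:3, function(i) grep("a", letters), integer(1))
res
```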

Could anybody help me to figure out why? Thanks a lot for your help.

Capricy

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.