[R] strsplit help

David Winsemius dwinsemius at comcast.net
Wed Apr 11 20:17:07 CEST 2012


On Apr 11, 2012, at 2:01 PM, Jean V Adams wrote:

> Alison,
>
> Your code works fine on the first six lines of the data that you provided.
>
> Rumino_Reps_agreeWalign <- data.frame(
>        geneid = c("657313.locus_tag:RTO_08940",
>                "457412.251848018",
>                "657314.locus_tag:CK5_20630",
>                "657323.locus_tag:CK1_33060",
>                "657313.locus_tag:RTO_09690",
>                "471875.197297106"),
>        count_Conser = c(7, 1, 2, 1, 3, 0),
>        count_NonCons = c(5, 4, 4, 0, 0, 2),
>        count_ConsSubst = c(5, 3, 1, 1, 3, 1),
>        count_NCSubst = c(1, 0, 0, 0, 1, 1))
> gene.list <- strsplit(as.character(Rumino_Reps_agreeWalign$geneid), "\\.")
> Rumino_Reps_agreeWalignTR <- transform(Rumino_Reps_agreeWalign,
>        taxid=do.call(rbind, gene.list))
>
> Perhaps in later rows of the data there are cases where there is no "."
> in geneid?  If not, can you provide a subset of your data that results
> in the warning?  Use the dput() function.
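
A rough sketch of how one might locate the offending rows. The ids below are made up for illustration, and `pieces`, `n.pieces` and `bad.rows` are names assumed here, not taken from the original code. The idea is to count the pieces each id splits into and flag any row that does not give exactly two. The "(arg 1)" in the warning may point to an id with more than one "." rather than none, since rbind() warns when the widest row is not a multiple of a shorter row's length.

```r
# Hypothetical ids: the third contains two dots, the fourth contains none.
geneid <- c("657313.locus_tag:RTO_08940",
            "457412.251848018",
            "657313.locus_tag:RTO_08940.1",
            "471875")

pieces <- strsplit(geneid, "\\.")

# Number of pieces per id; anything other than 2 will not fit a
# two-column matrix cleanly when the list is passed to rbind().
n.pieces <- vapply(pieces, length, integer(1))

# Row numbers and ids that need a closer look.
bad.rows <- which(n.pieces != 2)
geneid[bad.rows]
```

If only the first "." should act as the separator, something like sub("\\..*$", "", geneid) for the prefix is one alternative worth considering.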
>
> It's not a good idea to create an object named "strsplit".  That will
> only mask the function strsplit() in later runs.

Masking the function is not a problem unless the new object is itself a
function (which wasn't the case here). The potential confusion is in the
minds of users. When R evaluates a call such as strsplit(...), it looks
only for a function of that name and skips any non-function objects it
finds along the way, so you can have a data object named 'strsplit' and
it will not mask the function 'strsplit'.
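
A small sketch of that lookup behaviour, with made-up values (`res` and `masked` are names assumed here for illustration). A character vector called strsplit leaves the function usable, while assigning a function to the same name does mask it:

```r
# A data object stored under the same name as the base function.
strsplit <- c("a.b", "c.d")

# In call position R skips the non-function binding and still finds
# base::strsplit; the argument refers to the character vector.
res <- strsplit(strsplit, "\\.")

# Only a function assigned to the name masks base::strsplit.
strsplit <- function(...) "masked"
masked <- strsplit("a.b", "\\.")

# Remove the local binding so later calls reach base::strsplit again.
rm(strsplit)
```

Even though it works, a distinct name such as gene.list is easier on the reader.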

-- 
David.
>
> If time is an issue, a slightly faster way to do this, after the
> strsplit() call, is:
> Rumino_Reps_agreeWalign$geneid.prefix <- sapply(gene.list, "[", 1)
> Rumino_Reps_agreeWalign$geneid.suffix <- sapply(gene.list, "[", 2)
>
> Jean
>
>
> alison waller wrote on 04/11/2012 08:23:29 AM:
>
>> Dear all,
>>
>> I want to use strsplit to parse column names; however, I am getting
>> some errors that I don't understand.
>> I see a problem when I try to rbind the output from strsplit.
>>
>> please let me know if I'm missing something obvious,
>>
>> thanks,
>> alison
>>
>> here are my commands:
>>> strsplit<-strsplit(as.character(Rumino_Reps_agreeWalign$geneid),"\\.")
>>>
>>> Rumino_Reps_agreeWalignTR<-transform(Rumino_Reps_agreeWalign,taxid=do.call(rbind,strsplit))
>> Warning message:
>> In function (..., deparse.level = 1)  :
>>   number of columns of result is not a multiple of vector length (arg 1)
>>
>>
>> here is my data:
>>
>>> head(Rumino_Reps_agreeWalign)
>>                       geneid count_Conser count_NonCons count_ConsSubst count_NCSubst
>> 1 657313.locus_tag:RTO_08940            7             5               5             1
>> 2           457412.251848018            1             4               3             0
>> 3 657314.locus_tag:CK5_20630            2             4               1             0
>> 4 657323.locus_tag:CK1_33060            1             0               1             0
>> 5 657313.locus_tag:RTO_09690            3             0               3             1
>> 6           471875.197297106            0             2               1             1
>>
>> here are the results from strsplit:
>>> head(strsplit)
>> [[1]]
>> [1] "657313"              "locus_tag:RTO_08940"
>>
>> [[2]]
>> [1] "457412"    "251848018"
>>
>> [[3]]
>> [1] "657314"              "locus_tag:CK5_20630"
>>
>> [[4]]
>> [1] "657323"              "locus_tag:CK1_33060"
>>
>> [[5]]
>> [1] "657313"              "locus_tag:RTO_09690"
>>
>> [[6]]
>> [1] "471875"    "197297106"
>

David Winsemius, MD
West Hartford, CT


