[R] Split Strings

Miluji Sb milujisb at gmail.com
Mon Jan 18 09:46:10 CET 2016


Thank you, everyone, for the code and the link. They work well!

Mr. Lemon, thank you for the detailed code and the explanations; I
appreciate it. One question about the last line:

sapply(split_strings,fill_strings,list(max_length,element_sets))

Should it be unlist instead of list? I get this error: "Error in
FUN(X[[i]], ...) : (list) object cannot be coerced to type 'integer'".
Thanks again!
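
A note on the error above, offered as a sketch rather than a confirmed fix: sapply forwards any arguments after FUN straight to the function, so wrapping them in list() hands fill_strings a single list where it expects an integer max_length, which would be consistent with the coercion error. Assuming the second (left-padding) fill_strings from the quoted message is the one in effect, neither list() nor unlist() should be needed. The name `result` is only illustrative.

```r
strings <- list("pc_m2_45_ssp3_wheat", "pc_m2_45_ssp3_wheat",
                "ssp3_maize", "m2_wheat", "pc_m2_45_ssp3_maize")
split_strings <- strsplit(unlist(strings), "_")
max_length <- max(sapply(split_strings, length))

# left-pad short vectors with NA (second fill_strings from the quoted message)
fill_strings <- function(split_string, max_length) {
  lenstring <- length(split_string)
  if (lenstring < max_length)
    split_string <- c(rep(NA, max_length - lenstring), split_string)
  split_string
}

# max_length is passed directly; sapply forwards it to fill_strings
padded <- sapply(split_strings, fill_strings, max_length)

# sapply returns one column per string, so transpose to get one row per string
result <- t(padded)
```

With the example strings, result[4, ] holds NA NA NA "m2" "wheat". For the three-argument version of fill_strings, the corresponding call would pass element_sets as a further separate argument after max_length.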



On Mon, Jan 18, 2016 at 9:19 AM, Jim Lemon <drjimlemon at gmail.com> wrote:

> Hi Miluji,
> While the other answers are correct in general, I noticed that your
> request was for the elements of an incomplete string to be placed in the
> same positions as in the complete strings. Perhaps this will help:
>
> strings<-list("pc_m2_45_ssp3_wheat","pc_m2_45_ssp3_wheat",
>  "ssp3_maize","m2_wheat","pc_m2_45_ssp3_maize")
> split_strings<-strsplit(unlist(strings),"_")
> max_length <- max(sapply(split_strings,length))
> complete_sets<-split_strings[sapply(split_strings,length)==max_length]
> element_sets<-list()
>
> # build a list with the unique elements of each complete string
> for(i in 1:max_length)
>  element_sets[[i]]<-unique(sapply(complete_sets,"[",i))
>
> # function to guess the position of the elements in a partial string
> # and return them in the hopefully correct positions
> fill_strings<-function(split_string,max_length,element_sets) {
>  if(length(split_string) < max_length) {
>   new_split_string<-rep(NA,max_length)
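>   for(i in seq_along(split_string)) {
>    # check every position, not every complete string
>    for(j in seq_len(max_length)) {
>     # exact match against the known elements for position j
>     if(split_string[i] %in% element_sets[[j]])
>      new_split_string[j]<-split_string[i]
>    }
>   }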
>   return(new_split_string)
>  }
>  return(split_string)
> }
>
> # however, if you know that the incomplete strings will always
> # be composed of the last elements in the complete strings
> fill_strings<-function(split_string,max_length) {
>  lenstring<-length(split_string)
>  if(lenstring < max_length)
>   split_string<-c(rep(NA,max_length-lenstring),split_string)
>  return(split_string)
> }
>
> sapply(split_strings,fill_strings,list(max_length,element_sets))
>
> Jim
>
> On Mon, Jan 18, 2016 at 7:56 AM, Miluji Sb <milujisb at gmail.com> wrote:
>
>> I have a list of strings of different lengths and would like to split each
>> string by underscore "_"
>>
>> pc_m2_45_ssp3_wheat
>> pc_m2_45_ssp3_wheat
>> ssp3_maize
>> m2_wheat
>>
>> I would like to separate each part of the string into different columns
>> such as
>>
>> pc m2 45 ssp3 wheat
>>
>> But because of the different lengths, I would like NA in the columns for
>> the variables that have fewer parts, such as
>>
>> NA NA NA m2 wheat
>>
>> I have tried unlist(strsplit(x, "_")) to split; it works for one variable
>> but not for the list, where it gives a "non-character argument" error. I
>> would highly appreciate any help. Thank you!
>>
>
>
