[R] About Creating a List by Parsing Text

Jim Holtman jholtman at gmail.com
Tue Aug 5 13:34:24 CEST 2008


You need to read the line in with read.table to parse the string. The
solution assumes a data frame, which your character string p is not.
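
A minimal sketch of that approach, assuming comp.ll is the character vector printed in the original question (retyped here) and that every line carries the same eight whitespace-separated fields. textConnection lets read.table treat the vector as if it were a file:

```r
# comp.ll retyped from the original question (leading tab, trailing space kept)
comp.ll <- c(
  "\tGene  11340 211952_at RANBP5  k= 1  LL= -970.692 ",
  "\tGene  11340 211952_at RANBP5  k= 2  LL= -965.35 ",
  "\tGene  11340 211952_at RANBP5  k= 3  LL= -963.669 ",
  "\tGene  12682 213301_x_at TRIM24  k= 1  LL= -948.527 ",
  "\tGene  12682 213301_x_at TRIM24  k= 2  LL= -947.275 ",
  "\tGene  12682 213301_x_at TRIM24  k= 3  LL= -947.379 ",
  "\tGene  13764 214385_s_at AI521646  k= 1  LL= -827.86 ",
  "\tGene  13764 214385_s_at AI521646  k= 2  LL= -777.756 ",
  "\tGene  13764 214385_s_at AI521646  k= 3  LL= -812.083 "
)

# Parse the strings into a data frame; the default separator is any whitespace,
# so the leading tab and the double spaces need no special handling.
con <- textConnection(comp.ll)
x <- read.table(con, stringsAsFactors = FALSE)
close(con)

# V3 holds the probe set ID and V8 the log-likelihood.
# Split the rows by ID, then pull column 8 out of each piece.
y <- lapply(split(x, x$V3), "[[", 8)
y
```

If the lines ever differ in their number of fields, read.table will stop with an error, which is a useful check in its own right.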

On Aug 5, 2008, at 7:16, "Henrique Dallazuanna" <wwwhsd at gmail.com>  
wrote:

> I think that you need:
>
> p <- scan(textConnection(p), what = "")
>
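A small sketch of what the scan suggestion does to a single line, assuming the string p from the failed attempt. It splits the string into whitespace-separated tokens, so positions 3 and 8 become reachable, but the result is still a character vector rather than a data frame:

```r
p <- "\tGene  11340 211952_at RANBP5  k= 1  LL= -970.692 "

# what = "" asks scan for character tokens; quiet = TRUE suppresses the
# "Read N items" message.
con <- textConnection(p)
tokens <- scan(con, what = "", quiet = TRUE)
close(con)

tokens[3]               # probe set ID, still a string
as.numeric(tokens[8])   # log-likelihood, converted by hand
```

With several lines, scan returns one long vector of tokens, so the row structure that split(x, x$V3) relies on would have to be rebuilt separately.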
> On Tue, Aug 5, 2008 at 7:41 AM, Gundala Viswanath  
> <gundalav at gmail.com> wrote:
>> Thanks Jim,
>>
>> But how can I modify this line of yours
>>
>> y <- lapply(split(x, x$V3), "[[", 8)
>>
>> to suit my 'comp.ll'
>>
>> I tried this, but it fails:
>>> p <-  "\tGene  11340 211952_at RANBP5  k= 1  LL= -970.692 "
>>> y <- lapply(split(p, p[3]), "[[", 8)
>>> y
>>> list()
>>
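A short sketch of why that attempt comes back empty, assuming p is the single string shown above. p has length 1, so p[3] is NA rather than the third field, and split has no groups to form:

```r
p <- "\tGene  11340 211952_at RANBP5  k= 1  LL= -970.692 "

length(p)               # p is one string, so its length is 1
third <- p[3]           # indexing past the end yields NA, not the third field
res <- split(p, third)  # an NA grouping value defines no groups
length(res)             # the result is an empty list

# Splitting the string on whitespace is what exposes the individual fields.
fields <- strsplit(p, "[[:space:]]+")[[1]]
fields <- fields[nzchar(fields)]   # drop the empty piece left by the leading tab
fields[3]
```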
>>
>> - Gundala Viswanath
>> Jakarta - Indonesia
>>
>>
>>
>> On Tue, Aug 5, 2008 at 7:14 PM, jim holtman <jholtman at gmail.com>  
>> wrote:
>>> Does this get you close to what you want:
>>>
>>>> x <- read.table(textConnection("Gene  11340 211952_at RANBP5  k= 1  LL= -970.692
>>> + Gene  11340 211952_at RANBP5  k= 2  LL= -965.35
>>> + Gene  11340 211952_at RANBP5  k= 3  LL= -963.669
>>> + Gene  12682 213301_x_at TRIM24  k= 1  LL= -948.527
>>> + Gene  12682 213301_x_at TRIM24  k= 2  LL= -947.275
>>> + Gene  12682 213301_x_at TRIM24  k= 3  LL= -947.379
>>> + Gene  13764 214385_s_at AI521646  k= 1  LL= -827.86
>>> + Gene  13764 214385_s_at AI521646  k= 2  LL= -777.756
>>> + Gene  13764 214385_s_at AI521646  k= 3  LL= -812.083 "))
>>>> y <- lapply(split(x, x$V3), "[[", 8)
>>>>
>>>> y
>>> $`211952_at`
>>> [1] -970.692 -965.350 -963.669
>>>
>>> $`213301_x_at`
>>> [1] -948.527 -947.275 -947.379
>>>
>>> $`214385_s_at`
>>> [1] -827.860 -777.756 -812.083
>>>
>>>
>>>
>>> On Tue, Aug 5, 2008 at 3:09 AM, Gundala Viswanath <gundalav at gmail.com> wrote:
>>>> Hi all,
>>>>
>>>> I have the following data, which I want to parse and
>>>> store in a list:
>>>>
>>>> __DATA__
>>>>> print(comp.ll)
>>>>  [1] "\tGene  11340 211952_at RANBP5  k= 1  LL= -970.692 "
>>>>  [2] "\tGene  11340 211952_at RANBP5  k= 2  LL= -965.35 "
>>>>  [3] "\tGene  11340 211952_at RANBP5  k= 3  LL= -963.669 "
>>>>  [4] "\tGene  12682 213301_x_at TRIM24  k= 1  LL= -948.527 "
>>>>  [5] "\tGene  12682 213301_x_at TRIM24  k= 2  LL= -947.275 "
>>>>  [6] "\tGene  12682 213301_x_at TRIM24  k= 3  LL= -947.379 "
>>>>  [7] "\tGene  13764 214385_s_at AI521646  k= 1  LL= -827.86 "
>>>>  [8] "\tGene  13764 214385_s_at AI521646  k= 2  LL= -777.756 "
>>>>  [9] "\tGene  13764 214385_s_at AI521646  k= 3  LL= -812.083 "
>>>> __END__
>>>>
>>>> I expect to get this kind of data structure:
>>>>
>>>>> wanted_output
>>>>
>>>> [['211952_at']]
>>>> $ll.list
>>>> [1] -970.692 -965.35 -963.669
>>>>
>>>> [['213301_x_at']]
>>>> $ll.list
>>>> [1] -948.527 -947.275 -947.379
>>>>
>>>> etc.
>>>>
>>>> How can I achieve that?
>>>>
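The requested structure nests each vector under a component named ll.list. A sketch of one way to get that shape, assuming a named list of numeric vectors keyed by probe set ID (y here is typed in by hand for two of the IDs):

```r
# Hand-typed stand-in for a list of log-likelihoods keyed by probe set ID
y <- list(
  `211952_at`   = c(-970.692, -965.35, -963.669),
  `213301_x_at` = c(-948.527, -947.275, -947.379)
)

# Wrap each numeric vector in a one-element list named ll.list
wanted_output <- lapply(y, function(v) list(ll.list = v))

wanted_output[["211952_at"]]$ll.list
```

The extra level only pays off if more components per probe set are going to be added later; otherwise the flat list is simpler to work with.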
>>>> I am stuck with the following construct
>>>>
>>>> __BEGIN__
>>>> comp.ll <- model_all[grep("Gene .* k=.*", model_all)]
>>>> print(comp.ll)
>>>>
>>>> patt <- "Gene  \\d+ ([\\w-/]+) [\\w-]+  k= (\\d)  LL= ([-]\\d+\\.\\d+)"
>>>> nresk <- unlist(strsplit(sub(patt, "\\1 \\2 \\3",comp.ll,perl=TRUE)," "))
>>>> __END__
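For comparison, a sketch of how the regular-expression route could be carried through to a list. The pattern is rewritten for illustration and is not the poster's original: it anchors on the whole line so the leading tab and trailing space are consumed, and it assumes the probe set ID and gene symbol contain no whitespace. comp.ll is a shortened, hand-typed sample.

```r
comp.ll <- c(
  "\tGene  11340 211952_at RANBP5  k= 1  LL= -970.692 ",
  "\tGene  11340 211952_at RANBP5  k= 2  LL= -965.35 ",
  "\tGene  12682 213301_x_at TRIM24  k= 1  LL= -948.527 ",
  "\tGene  12682 213301_x_at TRIM24  k= 2  LL= -947.275 "
)

# Capture groups: (1) probe set ID, (2) k, (3) log-likelihood
patt <- "^\\s*Gene\\s+\\d+\\s+(\\S+)\\s+\\S+\\s+k=\\s*(\\d+)\\s+LL=\\s*(-?\\d+\\.?\\d*)\\s*$"

# Reduce each line to "ID k LL", then split into a three-column matrix
fields <- sub(patt, "\\1 \\2 \\3", comp.ll, perl = TRUE)
parts <- do.call(rbind, strsplit(fields, " ", fixed = TRUE))

# Group the numeric log-likelihoods by probe set ID
ll <- split(as.numeric(parts[, 3]), parts[, 1])
ll
```

A line that does not match the pattern is returned unchanged by sub, so it is worth checking that every row of parts has three columns before trusting the result.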
>>>>
>>>>
>>>> - Gundala Viswanath
>>>> Jakarta - Indonesia
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>>
>>>
>>> --
>>> Jim Holtman
>>> Cincinnati, OH
>>> +1 513 646 9390
>>>
>>> What is the problem that you are trying to solve?
>>>
>>
>
>
>
> -- 
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O


