[R] About Creating a List by Parsing Text

jim holtman jholtman at gmail.com
Tue Aug 5 12:14:51 CEST 2008


Does this get you close to what you want?

> x <- read.table(textConnection("Gene  11340 211952_at RANBP5  k= 1  LL= -970.692
+ Gene  11340 211952_at RANBP5  k= 2  LL= -965.35
+ Gene  11340 211952_at RANBP5  k= 3  LL= -963.669
+ Gene  12682 213301_x_at TRIM24  k= 1  LL= -948.527
+ Gene  12682 213301_x_at TRIM24  k= 2  LL= -947.275
+ Gene  12682 213301_x_at TRIM24  k= 3  LL= -947.379
+ Gene  13764 214385_s_at AI521646  k= 1  LL= -827.86
+ Gene  13764 214385_s_at AI521646  k= 2  LL= -777.756
+ Gene  13764 214385_s_at AI521646  k= 3  LL= -812.083 "))
> y <- lapply(split(x, x$V3), "[[", 8)
>
> y
$`211952_at`
[1] -970.692 -965.350 -963.669

$`213301_x_at`
[1] -948.527 -947.275 -947.379

$`214385_s_at`
[1] -827.860 -777.756 -812.083
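
If the lines are already in a character vector such as comp.ll from the quoted message, the same idea can probably be applied without pasting the text. The sketch below assumes a version of R where read.table() accepts the text= argument and that every line has the same eight whitespace-separated fields. The ll.list element name comes from the wanted output in the question; x and y are only illustrative names.

```r
# comp.ll as printed in the question (leading tab, trailing space)
comp.ll <- c(
  "\tGene  11340 211952_at RANBP5  k= 1  LL= -970.692 ",
  "\tGene  11340 211952_at RANBP5  k= 2  LL= -965.35 ",
  "\tGene  11340 211952_at RANBP5  k= 3  LL= -963.669 ",
  "\tGene  12682 213301_x_at TRIM24  k= 1  LL= -948.527 ",
  "\tGene  12682 213301_x_at TRIM24  k= 2  LL= -947.275 ",
  "\tGene  12682 213301_x_at TRIM24  k= 3  LL= -947.379 ",
  "\tGene  13764 214385_s_at AI521646  k= 1  LL= -827.86 ",
  "\tGene  13764 214385_s_at AI521646  k= 2  LL= -777.756 ",
  "\tGene  13764 214385_s_at AI521646  k= 3  LL= -812.083 "
)

# With the default sep = "", fields are split on runs of whitespace,
# so the doubled spaces do not create empty columns
x <- read.table(text = comp.ll, stringsAsFactors = FALSE)

# V3 holds the probe id, V8 the log-likelihood; wrap each group's
# values in a one-element list named ll.list, as in the wanted output
y <- lapply(split(x$V8, x$V3), function(ll) list(ll.list = ll))
```

Each element can then be reached with, for example, y[["211952_at"]]$ll.list.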



On Tue, Aug 5, 2008 at 3:09 AM, Gundala Viswanath <gundalav at gmail.com> wrote:
> Hi all,
>
> I have the following data, which I want to parse and
> store in a list:
>
> __DATA__
>> print(comp.ll)
>   [1] "\tGene  11340 211952_at RANBP5  k= 1  LL= -970.692 "
>   [2] "\tGene  11340 211952_at RANBP5  k= 2  LL= -965.35 "
>   [3] "\tGene  11340 211952_at RANBP5  k= 3  LL= -963.669 "
>   [4] "\tGene  12682 213301_x_at TRIM24  k= 1  LL= -948.527 "
>   [5] "\tGene  12682 213301_x_at TRIM24  k= 2  LL= -947.275 "
>   [6] "\tGene  12682 213301_x_at TRIM24  k= 3  LL= -947.379 "
>   [7] "\tGene  13764 214385_s_at AI521646  k= 1  LL= -827.86 "
>   [8] "\tGene  13764 214385_s_at AI521646  k= 2  LL= -777.756 "
>   [9] "\tGene  13764 214385_s_at AI521646  k= 3  LL= -812.083 "
> __END__
>
> I expect to get this kind of data structure:
>
>> wanted_output
>
> [['211952_at']]
> $ll.list
> [1] -970.692 -965.35 -963.669
>
> [['213301_x_at']]
> $ll.list
> [1] -948.527 -947.275 -947.379
>
> etc.
>
> How can I achieve that?
>
> I am stuck with the following construct
>
> __BEGIN__
> comp.ll <- model_all[grep("Gene .* k=.*", model_all)]
> print(comp.ll)
>
> patt <- "Gene  \\d+ ([\\w-/]+) [\\w-]+  k= (\\d)  LL= ([-]\\d+\\.\\d+)"
> nresk <- unlist(strsplit(sub(patt, "\\1 \\2 \\3",comp.ll,perl=TRUE)," "))
> __END__
>
>
> - Gundala Viswanath
> Jakarta - Indonesia
>

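If you would rather stay with the regular-expression route from the quoted construct, a rough sketch using regexec() and regmatches() from base R could look like the following. The pattern is my own loosened version, not the one from the question: it uses plain character classes instead of the Perl-style ones and assumes that every line matches. The names patt, m, fields and y are only for illustration.

```r
comp.ll <- c(
  "\tGene  11340 211952_at RANBP5  k= 1  LL= -970.692 ",
  "\tGene  11340 211952_at RANBP5  k= 2  LL= -965.35 ",
  "\tGene  11340 211952_at RANBP5  k= 3  LL= -963.669 ",
  "\tGene  12682 213301_x_at TRIM24  k= 1  LL= -948.527 ",
  "\tGene  12682 213301_x_at TRIM24  k= 2  LL= -947.275 ",
  "\tGene  12682 213301_x_at TRIM24  k= 3  LL= -947.379 ",
  "\tGene  13764 214385_s_at AI521646  k= 1  LL= -827.86 ",
  "\tGene  13764 214385_s_at AI521646  k= 2  LL= -777.756 ",
  "\tGene  13764 214385_s_at AI521646  k= 3  LL= -812.083 "
)

# Three capture groups: probe id, k, and the log-likelihood
patt <- "Gene +[0-9]+ +([^ ]+) +[^ ]+ +k= *([0-9]+) +LL= *(-?[0-9.]+)"

# regmatches() returns, per line, the full match followed by the groups
m <- regmatches(comp.ll, regexec(patt, comp.ll))

# One row per line: column 1 full match, 2 probe id, 3 k, 4 LL
# (this assumes no line failed to match)
fields <- do.call(rbind, m)

# Group the numeric log-likelihoods by probe id
y <- split(as.numeric(fields[, 4]), fields[, 2])
```

This gives the same grouping as the read.table() approach, at the cost of having to maintain the pattern.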


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


