[R] simplest way (set of functions) to parse a file

arun smartpink111 at yahoo.com
Mon Aug 27 15:28:43 CEST 2012


Hi,

I think this should help. It reads one block of the log, pasted in as text:
library(reshape)
dat1<-read.table(text="
Iteration    Jmin    Error    Elapsed    Corral    Duality_Gap    Step1    Step2    Step3    Step4
10    32    3.133476e-03    6.075853e-03    8    4.057531e-01    1.613035e-03    3.956920e-03    3.077200e-05    4.390900e-05
20    28    5.597685e-04    4.376530e-03    16    4.711146e-03    0.000000e+00    4.390998e-03    2.229600e-05    2.517100e-05
30    27    1.148159e-04    4.357923e-03    22    8.408166e-06    0.000000e+00    4.326610e-03    2.697700e-05    3.233200e-05
40    27    4.036778e-05    4.388260e-03    29    2.529294e-07    0.000000e+00    4.329726e-03    2.713000e-05    3.558400e-05
50    27    1.840383e-05    4.357373e-03    36    1.341111e-07    0.000000e+00    4.327526e-03    3.097000e-05    3.255700e-05
53    27    0.000000e+00    1.322382e-03    36    -2.842171e-14    0.000000e+00    1.327239e-03    1.420400e-05    2.043500e-05
",sep="",header=TRUE,stringsAsFactors=FALSE,fill=TRUE)
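# Stack the four Step columns; the result has columns Iteration, variable, value
df2<-melt(dat1,"Iteration",c("Step1","Step2","Step3","Step4"))
# Keep the step name (column 2) and its value (column 3)
df3<-df2[,c(2,3)]
df4<-data.frame(n=rep(c(1:6),times=4),iter=rep(1:4,each=6),df3)
colnames(df4)[c(3,4)]<-c("step","value")
head(df4)
  n iter  step       value
1 1    1 Step1 0.001613035
2 2    1 Step1 0.000000000
3 3    1 Step1 0.000000000
4 4    1 Step1 0.000000000
5 5    1 Step1 0.000000000
6 6    1 Step1 0.000000000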
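
The code above handles a single block pasted in as text. For the whole log, a rough sketch in base R could look like the following. It assumes that each summary line (the one containing the file path) closes the block above it, and that n can be taken from the "_n<digits>.txt" part of that path. parse_log and log_lines are names made up for illustration; in practice log_lines would come from readLines() on your file.

```r
# Sketch only: parse_log() and log_lines are hypothetical names, not from the thread.
parse_log <- function(lines) {
  cols  <- c("Iteration", "Jmin", "Error", "Elapsed", "Corral",
             "Duality_Gap", "Step1", "Step2", "Step3", "Step4")
  steps <- paste0("Step", 1:4)

  lines <- trimws(lines)
  lines <- lines[nzchar(lines)]

  is_header  <- grepl("^Iteration", lines)
  is_summary <- grepl("_n[0-9]+\\.txt", lines)
  is_data    <- !is_header & !is_summary

  # n from each summary line, e.g. "..._n10.txt,..." gives 10
  n_summary <- as.numeric(sub(".*_n([0-9]+)\\.txt.*", "\\1", lines[is_summary]))

  # Assumption: a summary line closes the block above it, so a data row
  # belongs to the first summary line that follows it (NA if none follows).
  block_id <- cumsum(is_summary)[is_data] + 1

  # Header lines are skipped and the names set by hand, because "Duality Gap"
  # contains a space and would otherwise count as two columns.
  wide <- read.table(text = lines[is_data], col.names = cols)
  wide$n <- n_summary[block_id]

  # One row per (n, iteration, step), as asked for in the original message
  long <- do.call(rbind, lapply(steps, function(s) {
    data.frame(n = wide$n, iter = wide$Iteration,
               step = factor(s, levels = steps), value = wide[[s]])
  }))
  long <- long[order(long$n, long$iter, long$step), ]
  rownames(long) <- NULL
  long
}

# First two blocks of the posted file
log_lines <- c(
  "Iteration    Jmin    Error    Elapsed    Corral    Duality Gap    Step1    Step2    Step3    Step4",
  "2    2    0.000000e+00    1.912976e-03    1    0.000000e+00    1.779780e-03    7.214600e-05    1.243600e-05    2.246700e-05",
  "../test/genmoons_data/logdet_two_moons_n10.txt,2,2,1.754115e-02,0.000000e+00,9.799000e+03,0.000000e+00,5.586293e-01,0.000000e+00",
  "Iteration    Jmin    Error    Elapsed    Corral    Duality Gap    Step1    Step2    Step3    Step4",
  "4    9    0.000000e+00    1.280841e-03    2    -7.105427e-15    9.557570e-04    2.301610e-04    1.571100e-05    2.177300e-05",
  "../test/genmoons_data/logdet_two_moons_n20.txt,4,5,6.062756e-03,0.000000e+00,1.365970e+05,0.000000e+00,2.253051e+01,0.000000e+00"
)

df <- parse_log(log_lines)
str(df)
```

Rows of a block that is not followed by a summary line get n = NA, so a log that is cut off before its final summary line still parses.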


A.K.



----- Original Message -----
From: Giovanni Azua <bravegag at gmail.com>
To: r-help at r-project.org
Cc: 
Sent: Monday, August 27, 2012 6:24 AM
Subject: [R] simplest way (set of functions) to parse a file

Hello,

What would be the best set of R functions to parse and transform a file?

My file looks as shown below. I would like to plot this data, so I need to parse it into a single data frame that sort of "transposes the data", with the following structure:

> df <- data.frame(n=c(1,1,2,2),iter=c(1,2,1,2),step=as.factor(c('Step 1', 'Step 2', 'Step 1', 'Step 2')),value=c(10, 10, 10, 10))
> str(df)
'data.frame':    4 obs. of  4 variables:
$ n    : num  1 1 2 2
$ iter : num  1 2 1 2
$ step : Factor w/ 2 levels "Step 1","Step 2": 1 2 1 2
$ value: num  10 10 10 10

n=extracted from the file name "logdet_two_moons_n>>>>10<<<<.txt"
iter=iteration (column Iteration)
step=column Step1, Step2, Step3, Step4
value=value of the specific Step column 

This is one possible data frame layout that would let me plot the time proportions for the different steps of my algorithm.

TIA,
Best regards,
Giovanni

Iteration    Jmin    Error    Elapsed    Corral    Duality Gap    Step1    Step2    Step3    Step4
2    2    0.000000e+00    1.912976e-03    1    0.000000e+00    1.779780e-03    7.214600e-05    1.243600e-05    2.246700e-05
../test/genmoons_data/logdet_two_moons_n10.txt,2,2,1.754115e-02,0.000000e+00,9.799000e+03,0.000000e+00,5.586293e-01,0.000000e+00 
Iteration    Jmin    Error    Elapsed    Corral    Duality Gap    Step1    Step2    Step3    Step4
4    9    0.000000e+00    1.280841e-03    2    -7.105427e-15    9.557570e-04    2.301610e-04    1.571100e-05    2.177300e-05
../test/genmoons_data/logdet_two_moons_n20.txt,4,5,6.062756e-03,0.000000e+00,1.365970e+05,0.000000e+00,2.253051e+01,0.000000e+00 
Iteration    Jmin    Error    Elapsed    Corral    Duality Gap    Step1    Step2    Step3    Step4
10    32    3.133476e-03    6.075853e-03    8    4.057531e-01    1.613035e-03    3.956920e-03    3.077200e-05    4.390900e-05
20    28    5.597685e-04    4.376530e-03    16    4.711146e-03    0.000000e+00    4.390998e-03    2.229600e-05    2.517100e-05
30    27    1.148159e-04    4.357923e-03    22    8.408166e-06    0.000000e+00    4.326610e-03    2.697700e-05    3.233200e-05
40    27    4.036778e-05    4.388260e-03    29    2.529294e-07    0.000000e+00    4.329726e-03    2.713000e-05    3.558400e-05
50    27    1.840383e-05    4.357373e-03    36    1.341111e-07    0.000000e+00    4.327526e-03    3.097000e-05    3.255700e-05
53    27    0.000000e+00    1.322382e-03    36    -2.842171e-14    0.000000e+00    1.327239e-03    1.420400e-05    2.043500e-05
../test/genmoons_data/logdet_two_moons_n64.txt,53,69,3.330987e-02,0.000000e+00,2.229830e+07,0.000000e+00,6.694201e+02,0.000000e+00 
Iteration    Jmin    Error    Elapsed    Corral    Duality Gap    Step1    Step2    Step3    Step4
10    70    7.739525e-03    2.389529e-02    8    1.494829e+00    2.975209e-03    1.873082e-02    4.713600e-05    5.837200e-05
20    74    3.379192e-03    2.084753e-02    15    3.372041e-01    0.000000e+00    2.084637e-02    4.302400e-05    3.907800e-05
30    76    1.322821e-03    2.093204e-02    21    1.018845e-01    0.000000e+00    2.083170e-02    4.704100e-05    5.707100e-05
40    78    1.176950e-03    2.095179e-02    28    2.447970e-02    0.000000e+00    2.088284e-02    4.890700e-05    4.955100e-05
50    78    2.233669e-04    2.050571e-02    35    1.573952e-02    0.000000e+00    2.045954e-02    4.046600e-05    3.899000e-05
60    78    2.167956e-04    2.095130e-02    39    8.362982e-03    0.000000e+00    2.082586e-02    6.699700e-05    8.506400e-05
70    78    2.085968e-04    2.085355e-02    46    5.135190e-03    0.000000e+00    2.083204e-02    5.432900e-05    4.078600e-05
80    78    2.570800e-04    2.044932e-02    51    5.470225e-04    0.000000e+00    2.033571e-02    5.334200e-05    5.318400e-05
81    78    0.000000e+00    2.099610e-03    51    1.421085e-14    0.000000e+00    2.100072e-03    9.147000e-06    2.324800e-05