[R] How to scan df from a specific word?

Sat Oct 30 01:07:28 CEST 2010

Sorry, this isn't really an R solution, but here it goes anyway. You
can isolate the block from Source to the first following blank line by
the following unix/linux/cygwin command, assuming inFile is your input
file and outFile is the output file:

grep -A 100 Source inFile | grep -m 1 -B 100 '^$' > outFile

(In this command the Source block is limited to about 100 lines by the
-A and -B values; you can safely increase both numbers if necessary.)
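
If the fixed line limit is a concern, one alternative is an awk
one-liner that starts printing at the first line containing Source and
stops at the next empty line. This is only a sketch of a different
technique, not the grep command above; the sample file contents and the
variable name "found" are made up for illustration, and unlike the grep
version the blank line itself is not written to outFile.

```shell
# Hypothetical sample input, shaped like the log in the question
cat > inFile <<'EOF'
 12 LogL=-938.795     S2=  1.0000       8367 df

 Source                Model  terms     Gamma     Component    Comp/SE   % C
 Residual               8383   8367
 rep.iblk                768    768   63.2919       63.2919      10.94   0 P

 Analysis of Variance              NumDF              F_inc
EOF

# Set a flag at the first "Source" line, quit at the next empty line,
# and print every line seen while the flag is set
awk '/Source/ {found = 1} found && /^$/ {exit} found' inFile > outFile
```

This assumes the separating line is truly empty; if it may contain
spaces, the /^$/ pattern would need to be loosened accordingly.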

You will probably still have the problem of successive spaces in the
block you are interested in. This can also be handled on the command
line, for example by adding

| sed 's/   */\t/g'

(note there are 3 spaces between the first / and the *) just before the
> outFile redirection in the command above. This replaces every run of
2 or more spaces with a tab (the \t escape assumes GNU sed), so you can
read the file as tab-separated.
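
Put together, the two steps might look like the sketch below. The
sample file is invented for illustration and only mimics the layout in
the question; GNU grep and GNU sed are assumed.

```shell
# Hypothetical sample input, shaped like the log in the question
cat > inFile <<'EOF'
 12 LogL=-938.795     S2=  1.0000       8367 df

 Source                Model  terms     Gamma     Component    Comp/SE   % C
 Residual               8383   8367
 rep.iblk                768    768   63.2919       63.2919      10.94   0 P

 Analysis of Variance              NumDF              F_inc
EOF

# 1. keep the Source line and up to 100 lines after it
# 2. cut at the first empty line, keeping up to 100 lines before it
# 3. turn every run of 2 or more spaces into a single tab
grep -A 100 Source inFile | grep -m 1 -B 100 '^$' | sed 's/   */\t/g' > outFile
```

Single spaces are left alone, so "% C" and "0 P" each stay in one
column. The Residual row ends up with only three fields, so whatever
reads outFile has to tolerate short rows (in R, read.delim fills them
with NA by default).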

Equivalently, you could do the line filtering in R as Phil suggested.

Peter

On Fri, Oct 29, 2010 at 3:39 PM, M.Ribeiro <mresendeufv at yahoo.com.br> wrote:
>
> Sorry, the explanation wasn't very good...just to explain better.
>
> I am writing a loop that reads and processes a different file on each
> iteration of the same script.
> What I want to load into a variable each time is a data.frame that is
> below the word Source in all of my files.
>
> So I would like to recognize the word Source in the text file and read
> the table below Source until the next blank line (the file has more
> content below the data frame that I want to read, too).
>
> Here is an example of the file. I want the df to read from Source until the
> blank line right above the words "Analysis of Variance".
>
> Notice:     37 singularities detected in design matrix.
>   1 LogL=-2664.01     S2=  1.0000       8367 df    :   2 components constrained
>   2 LogL=-2269.45     S2=  1.0000       8367 df
>   3 LogL=-1698.47     S2=  1.0000       8367 df
>   4 LogL=-1252.72     S2=  1.0000       8367 df
>   5 LogL=-1013.52     S2=  1.0000       8367 df
>   6 LogL=-957.409     S2=  1.0000       8367 df
>   7 LogL=-944.252     S2=  1.0000       8367 df
>   8 LogL=-939.976     S2=  1.0000       8367 df
>   9 LogL=-938.908     S2=  1.0000       8367 df
>  10 LogL=-938.798     S2=  1.0000       8367 df
>  11 LogL=-938.795     S2=  1.0000       8367 df
>  12 LogL=-938.795     S2=  1.0000       8367 df
>
>  Source                Model  terms     Gamma     Component    Comp/SE   % C
>  Residual               8383   8367
>  at(type,1).Nfam          62     62   10.1131       10.1131       1.81   0 P
>  at(type,2).Nfam          62     62   28.1153       28.1153       2.16   0 P
>  rep.iblk                768    768   63.2919       63.2919      10.94   0 P
>  at(type,1).Nfemale       44     44   29.9049       29.9049       2.93   0 P
>  at(type,1).Nclone      2689   2689   109.560       109.560      12.66   0 P
>  at(type,2).Nfemale       44     44   14.0305       14.0305       1.68   0 P
>  Variance                  0      0   479.040       479.040      36.23   0 P
>  Variance                  0      0   490.580       490.580      17.51   0 P
>  Variance                  0      0   469.932       469.932      36.51   0 P
>  Variance                  0      0   544.654       544.654      17.86   0 P
>
>  Analysis of Variance              NumDF              F_inc
>  27 mu                                1            5860.84
>
>  12 culture                           1               0.07
>  10 type                              1              29.59
>  28 culture.rep                       6              14.06
>  30 culture.rep.type                  7               2.17
>  36 at(type,1).Nfam                      62 effects fitted