[R] read.table and variable length of tables

David Winsemius dwinsemius at comcast.net
Thu Jun 14 18:38:00 CEST 2012


On Jun 14, 2012, at 12:18 PM, Halldór Björnsson wrote:

> Thanks and with
>
> datlines <- as.data.frame(inp[( grep("<PRE>", inp)[1]+5 ):(grep("</PRE>", inp)[1]-1)]);
>

I suggest this instead.

 > read.fwf(textConnection(datlines), widths=rep(7,11))

        V1    V2    V3    V4  V5   V6  V7 V8    V9   V10   V11
1  1008.0    54   8.6   6.5  87 6.06   0  0 281.1 297.9 282.1
2  1000.0   103   8.4   6.9  90 6.28   0  1 281.6 299.0 282.6
3   925.0   742   3.4   3.0  97 5.16 345  5 282.8 297.3 283.7
4   885.0  1100   1.2   1.2 100 4.74  14  6 284.1 297.6 284.9
5   850.0  1425  -0.1  -0.1 100 4.49  40  7 286.0 298.9 286.8
6   795.0  1955  -3.2  -3.2 100 3.83  90 11 288.3 299.5 288.9
7   744.0  2479  -6.2  -6.2 100 3.25  85 20 290.5 300.2 291.1
8   736.0  2565  -6.7  -6.7 100 3.16  81 19 290.8 300.3 291.4
9   723.0  2704  -8.0 -10.8  80 2.34  75 16 290.9 298.0 291.3
10  722.0  2715  -8.1 -11.1  79 2.28  76 16 290.9 297.9 291.3
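
To show the whole approach end to end without hitting the server, here is a minimal sketch. The `inp` vector is a made-up stand-in for what readLines() returns (three rows copied from the output above, with one field blanked to mimic a missing value), and the `row` helper and `snd` name are only for illustration. It assumes the layout seen in this file: data starting 5 lines after the first <PRE> and 11 columns of width 7.

```r
# Hypothetical helper: right-justify each field to 7 characters, as in the page.
row  <- function(...) paste(sprintf("%7s", c(...)), collapse = "")
rule <- strrep("-", 77)

# Made-up stand-in for readLines() on the sounding page.
inp <- c(
  "<HTML>",
  "<PRE>",
  rule,
  row("PRES", "HGHT", "TEMP", "DWPT", "RELH", "MIXR", "DRCT", "SKNT", "THTA", "THTE", "THTV"),
  row("hPa", "m", "C", "C", "%", "g/kg", "deg", "knot", "K", "K", "K"),
  rule,
  row("1008.0", "54", "8.6", "6.5", "87", "6.06", "0", "0", "281.1", "297.9", "282.1"),
  row("1000.0", "103", "8.4", "6.9", "90", "6.28", "", "1", "281.6", "299.0", "282.6"),
  row("925.0", "742", "3.4", "3.0", "97", "5.16", "345", "5", "282.8", "297.3", "283.7"),
  "</PRE>",
  "<PRE>",
  "station information (trailing section)",
  "</PRE>"
)

# The data block runs from 5 lines after the first <PRE> to the line
# before the first </PRE>, however many rows there are that day.
first <- grep("<PRE>", inp, fixed = TRUE)[1] + 5
last  <- grep("</PRE>", inp, fixed = TRUE)[1] - 1
datlines <- inp[first:last]

# Fixed-width parsing keeps the columns aligned when a field is blank.
con <- textConnection(datlines)
snd <- read.fwf(con, widths = rep(7, 11))
close(con)

print(snd)
```

The blanked DRCT field in the second row should come back as NA rather than shifting the later columns left, which is the reason to prefer read.fwf over read.table here.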

-- 
David.

> I get the data as needed.
>
> Thanks again
>
> H.
>
>
> On Jun 14, 2012, at 10:23 AM, Halldór Björnsson wrote:
>
> > Hi,
> >
> > I am trying to read in weather balloon data, where each file has a
> > header of fixed length and
> > a trailing section of a fixed length. The data section (the table)
> > is of variable length.
> >
> > An example of the data is on:
> >
> > http://weather.uwyo.edu/cgi-bin/sounding?region=naconf&TYPE=TEXT%3ALIST&YEAR=2011&MONTH=06&FROM=1400&TO=1400&STNM=04018
> >
> > This data has 97 rows and can be read as:
> > read.table("http://weather.uwyo.edu/cgi-bin/sounding?region=naconf&TYPE=TEXT%3ALIST&YEAR=2011&MONTH=06&FROM=1400&TO=1400&STNM=04018",skip=10,nrows=97)
> >
> > If I set nrows=98 I run into the trailing section.
> >
> > From day to day the table length changes. Is there a way to get
> > read.table to always read in the correct
> > length and just stop when it hits the trailing section?
>
> Looks to be fairly straightforward HTML
>
> inp <- readLines(con=url("http://weather.uwyo.edu/cgi-bin/sounding?region=naconf&TYPE=TEXT%3ALIST&YEAR=2011&MONTH=06&FROM=1400&TO=1400&STNM=04018"))
>
>  > grep("<PRE>", inp)
> [1]   6 109
>
> That is followed by a multi-line header, which ends at this line:
>  > inp[grep("<PRE>", inp)[1]+4]
> [1] "-----------------------------------------------------------------------------"
>
>
> The ending can be found similarly:
>
>  > grep("</PRE>", inp)
> [1] 109 140
>
> datlines <- inp[( grep("<PRE>", inp)[1]+5 ):(grep("</PRE>", inp)[1]-1)]
>
> You may need to use read.fwf for input since the table has missing
> values.
>
> -- 
>
> David Winsemius, MD
> West Hartford, CT
>
>
>

David Winsemius, MD
West Hartford, CT


