[R] read.table and variable length of tables
David Winsemius
dwinsemius at comcast.net
Thu Jun 14 18:33:28 CEST 2012
On Jun 14, 2012, at 12:18 PM, Halldór Björnsson wrote:
> Thanks and with
>
> datlines <- as.data.frame(inp[( grep("<PRE>", inp)[1]+5 ):(grep("</PRE>", inp)[1]-1)]);
>
Er, ... are you sure? I got a factorized mess when I did that.
> str(datlines)
'data.frame': 98 obs. of 1 variable:
 $ inp[(grep("<PRE>", inp)[1] + 5):(grep("</PRE>", inp)[1] - 1)]: Factor w/ 98 levels " 20.9 26726 -43.7 -79.7 1 0.03 692.9 693.2 692.9",..: 98 97 96 95 94 93 92 91 90 89 ...
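The str() output shows the problem: as.data.frame() on a character vector gives one column in which each whole line is a single value (and a factor level in R versions before 4.0.0, where stringsAsFactors defaulted to TRUE). A rough sketch of parsing the lines instead, using a made-up three-line stand-in for datlines and assumed column names (PRES, HGHT, TEMP, DWPT are illustrative, not taken from the file):

```r
# Made-up stand-in for the extracted lines; the real `datlines` comes from readLines().
datlines <- c(
  " 1000.0     93",
  "  925.0    742    8.6    3.6",
  "  850.0   1430    4.2   -0.8"
)

# One column of strings: each line is a single value, not a row of numbers.
one_col <- as.data.frame(datlines, stringsAsFactors = TRUE)
str(one_col)

# Parsing the strings gives numeric columns; fill = TRUE pads short rows with NA.
# The column names here are assumptions made for this illustration.
dat <- read.table(text = datlines, fill = TRUE,
                  col.names = c("PRES", "HGHT", "TEMP", "DWPT"))
str(dat)
```

Note that fill = TRUE only pads on the right, so this works when the missing values are at the end of a row. If a value is missing in the middle of a row, the remaining fields shift left, and a fixed-width reader such as read.fwf is the safer choice.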
--
David.
> I get the data as needed.
>
> Thanks again
>
> H.
>
>
> On Jun 14, 2012, at 10:23 AM, Halldór Björnsson wrote:
>
> > Hi,
> >
> > I am trying to read in weather balloon data, where each file has a
> > header of fixed length and
> > a trailing section of a fixed length. The data section (the table)
> > is of variable length.
> >
> > An example of the data is on:
> >
> > http://weather.uwyo.edu/cgi-bin/sounding?region=naconf&TYPE=TEXT%3ALIST&YEAR=2011&MONTH=06&FROM=1400&TO=1400&STNM=04018
> >
> > This data has 97 rows and can be read as:
> > read.table("http://weather.uwyo.edu/cgi-bin/sounding?region=naconf&TYPE=TEXT%3ALIST&YEAR=2011&MONTH=06&FROM=1400&TO=1400&STNM=04018",skip=10,nrows=97)
> >
> > If I set nrows=98 I run into the trailing section.
> >
> > From day to day the table length changes. Is there a way to get
> > read.table to always read in the correct
> > length and just stop when it hits the trailing section?
>
> Looks to be fairly straightforward HTML
>
> inp <- readLines(con=url("http://weather.uwyo.edu/cgi-bin/sounding?region=naconf&TYPE=TEXT%3ALIST&YEAR=2011&MONTH=06&FROM=1400&TO=1400&STNM=04018"))
>
> > grep("<PRE>", inp)
> [1] 6 109
>
> That was followed by a multi-line header.
> > inp[grep("<PRE>", inp)[1]+4]
> [1] "-----------------------------------------------------------------------------"
>
>
> The ending can be found similarly:
>
> > grep("</PRE>", inp)
> [1] 109 140
>
> datlines <- inp[( grep("<PRE>", inp)[1]+5 ):(grep("</PRE>", inp)[1]-1)]
>
> You may need to use read.fwf for input since the table has missing
> values.
>
> --
>
> David Winsemius, MD
> West Hartford, CT
>
>
>
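Putting the quoted steps together (find the first <PRE>, skip the header, stop just before the first </PRE>, then read fixed-width fields), a minimal sketch might look like the following. The helper name extract_table, the toy page in inp, the 7-character column widths, and the column names are all assumptions made for this illustration; the widths in particular would need checking against the real file.

```r
# Hypothetical helper: read the table between the first <PRE> and the first </PRE>.
# header_lines = 4 mirrors the "+5" offset above (first data row is 5 lines after <PRE>).
extract_table <- function(inp, widths, col.names, header_lines = 4) {
  first <- grep("<PRE>", inp, fixed = TRUE)[1] + header_lines + 1
  last  <- grep("</PRE>", inp, fixed = TRUE)[1] - 1
  con <- textConnection(inp[first:last])
  on.exit(close(con))
  # Fixed-width parsing keeps columns aligned even when a field is blank.
  read.fwf(con, widths = widths, col.names = col.names)
}

# Toy stand-in for the page (not the real data): header, 3 data rows, trailing section.
inp <- c(
  "<HTML>",
  "<PRE>",
  "----------------------------",
  "   PRES   HGHT   TEMP   DWPT",
  "    hPa      m      C      C",
  "----------------------------",
  " 1000.0     93",
  paste0("  925.0", "    742", "       ", "    3.6"),  # TEMP missing mid-row
  paste0("  850.0", "   1430", "    4.2", "   -0.8"),
  "</PRE><H3>Station information</H3><PRE>",
  "trailing section",
  "</PRE>"
)

dat <- extract_table(inp, widths = rep(7, 4),
                     col.names = c("PRES", "HGHT", "TEMP", "DWPT"))
print(dat)
```

Because the start and end are located with grep() rather than hard-coded skip and nrows values, the same call should cope with a table whose length changes from day to day, as long as the page layout stays the same.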
David Winsemius, MD
West Hartford, CT