[R] download/retain text file structure with RCurl/getURL()
David Winsemius
dwinsemius at comcast.net
Mon Jan 19 20:52:13 CET 2009
It's a fixed-width format with irregular entries, so perhaps something
along the lines of:

read.fwf(textConnection(txtfile), skip = 8,   # skips the header
         widths = <column widths vector>,
         col.names = <colnames>,
         nrows = 48)                           # drops the trailing summary text

with perhaps:

widths = c(2, -1, 1, -1, 4, -1, 3, .... the rest   # negative widths skip
                                                   # the white-space columns
col.names = c("year", "card", "Jan.date", "Jan.dep", ..... the rest
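
To make the negative-width idea concrete, here is a minimal sketch on a made-up string. The string `txt` and its two data lines are my own invention and only imitate the first four fields; they are not the real file layout, and `skip` and `nrows` are adjusted to this toy input.

```r
# Hypothetical stand-in for the downloaded string: one header line,
# two data lines, one trailing summary line.
txt <- paste("header line to skip",
             "61 1 E/ST    ",
             "63 1 K/31  15",
             "trailing summary text",
             sep = "\n")

con <- textConnection(txt)
dat <- read.fwf(con,
                skip = 1,                            # skip the one header line
                widths = c(2, -1, 1, -1, 4, -1, 3),  # negative widths are skipped columns
                col.names = c("year", "card", "Jan.date", "Jan.dep"),
                nrows = 2)                           # stop before the summary line
close(con)  # read.fwf leaves an already-open connection open

print(dat)
```

Each positive width becomes one column and each negative width is read and thrown away, so the single blanks between fields never reach the data frame.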
Just the first few columns seem to come in acceptably, although the
lines that are all NA will need to be deleted:

> read.fwf(textConnection(txtfile), skip = 8,  # skips the header
+   widths = c(2, -1, 1, -1, 4, -1, 3),        # negative widths drop the white-space
+   col.names = c("year", "card", "Jan.date", "Jan.dep"),
+   nrows = 48)
   year card Jan.date Jan.dep
1    61    1     E/ST      NA
2    62    1     E/ST      NA
3    63    1     K/31      15
4    64    1     K/30      12
5    NA   NA     <NA>      NA
6    65    1     E/ST      NA
7    66    1     1/07      17
8    67    1     E/ST      NA
9    68    1     K/28      12
10   69    1     K/31      22
11   NA   NA     <NA>      NA
12   70    1     K/30      16
13   71    1     K/29      28
14   72    1     K/28      32
15   73    1     1/02      16
snip
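
One way to delete the all-NA lines afterwards, as a rough sketch. The small data frame `dat` below is typed in by hand to mimic a few rows of the output above; with the real data you would use the result of read.fwf() instead. The names `all_na` and `cleaned` are just illustrative.

```r
# Hand-built stand-in for the read.fwf() result, including one all-NA row.
dat <- data.frame(year     = c(61, 63, NA, 65),
                  card     = c(1, 1, NA, 1),
                  Jan.date = c("E/ST", "K/31", NA, "E/ST"),
                  Jan.dep  = c(NA, 15, NA, NA),
                  stringsAsFactors = FALSE)

# A row is dropped only if every column in it is NA; rows that merely
# lack a depth (Jan.dep is NA) are kept.
all_na  <- rowSums(is.na(dat)) == ncol(dat)
cleaned <- dat[!all_na, ]

print(cleaned)
```

Using complete.cases() or na.omit() here would be too aggressive, since those would also drop the "E/ST" rows that have a missing Jan.dep.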
--
David Winsemius
On Jan 19, 2009, at 1:26 PM, zack holden wrote:
>
> Dear list,
>
> I'm trying to download a text file directly from the internet using
> the RCurl package and the command getURL. Duncan Lang graciously
> helped me solve the first step in this problem using the following
> command:
>
> #################
> txtfile <- getURL('ftp://ftp.wcc.nrcs.usda.gov/data/snow/snow_course/table/history/idaho/13e19.txt',
>                   ftp.use.epsv = FALSE)
> #################
>
> This brings the text file into R as a single long character string.
> I've spent many hours now trying to get this text into a sensible
> form in R. I've tried every variant of the commands in the getURL
> help file, as well as different strsplit() commands, to try to break
> this character string into sensible rows and columns, to no avail.
>
> Can anyone suggest a solution for doing this? I suspect there is a
> getURL command I'm missing. Alternatively, do I really have to break
> this long character string into rows and columns that I can then
> assemble into a table?
>
> I'd be grateful for any advice.
>
> Thanks in advance,
>
> Zack
>
>