[R] extracting information from txt file
David Winsemius
dwinsemius at comcast.net
Wed Oct 31 19:11:38 CET 2012
On Oct 31, 2012, at 9:46 AM, chuck.01 wrote:
> Hello,
>
> Here is a link to some data:
> http://www.epa.gov/emap/html/data/surfwatr/data/mastreams/9396/wchem/chmval.txt
>
> I am trying to read this in, and want to use:
> chmval <-
> read.table("http://www.epa.gov/emap/html/data/surfwatr/data/mastreams/9396/wchem/chmval.txt",
> sep=",", skip= 84, header=T)
>
> the # 84, for 84 lines skipped needs to be derived from the 5th line of the
> txt file
> # Header Records: 85
>
> so, I need that # (-1) for input into the read.table statement above
That "# (-1)" is fairly cryptic to my reading, but it appears you are seeing the behavior of the "#" character, which read.table treats by default as the start of a comment, so the rest of that line is discarded. Changing the comment character in the call to read.table (comment.char="") will allow input from that line.
?read.table
You will need to read only the first 5 or 6 lines first, then execute a separate read.table call that skips those lines as well as the variable list that forms a secondary header.
> headfrm <- read.table( file=url( "http://www.epa.gov/emap/html/data/surfwatr/data/mastreams/9396/wchem/chmval.txt"), nrows=6, sep=":", comment.char="")
> headfrm
                V1                         V2
1          Dataset EMAP Stream Chemistry Data
2        File Name                     chmval
3     Date Created                   02/22/99
4      # Variables                         75
5 # Header Records                         85
6   # Data Records                        711
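The two-step read can be sketched as follows. This is only an illustration under stated assumptions: the temporary file is a small hypothetical stand-in for chmval.txt (not the real EPA data), and it assumes, as your skip=84 with header=T implies, that the "# Header Records" count includes the line of column names. The names tmp, n_header and the toy columns ID, PH, TEMP are made up for the example.

```r
# Hypothetical miniature of chmval.txt: 6 metadata lines, one
# variable-list line, then the column names on line 8.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "Dataset: EMAP Stream Chemistry Data",
  "File Name: chmval",
  "Date Created: 02/22/99",
  "# Variables: 3",
  "# Header Records: 8",
  "# Data Records: 2",
  "Variable list line (secondary header)",
  "ID,PH,TEMP",
  "1,7.1,12.5",
  "2,6.8,13.0"
), tmp)

# Step 1: read only the metadata block; comment.char = "" keeps the
# lines that begin with "#".
headfrm <- read.table(tmp, nrows = 6, sep = ":", comment.char = "",
                      stringsAsFactors = FALSE, strip.white = TRUE)

# Pull the header-record count out of the second column.
n_header <- as.integer(headfrm$V2[headfrm$V1 == "# Header Records"])

# Step 2: skip everything before the column-name line
# (assumed to be the last header record).
chmval <- read.table(tmp, sep = ",", skip = n_header - 1, header = TRUE)
```

With the real file you would replace tmp by the URL (or a downloaded copy) in both calls; whether n_header - 1 is the right offset depends on the assumption above holding for that file.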
>
> I've tried grep but that didn't work:
> (for this I downloaded the txt file and manually removed that hash mark!)
>
> grep("Header Records:", read.table("chmval.txt", header=T))
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
> :
> line 1 did not have 5 elements
>
> Any ideas?
> Can I just extract the 5th line?
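The grep call failed because read.table was evaluated first and stopped on the metadata lines, which do not all have the same number of fields, so grep never ran. If you only want the number from that one line, readLines is a possible alternative to read.table: it returns raw text and does no comment processing, so the "#" does not need to be removed by hand. A minimal sketch, again using a hypothetical stand-in file rather than the real download (tmp, top, hdr_line and skip_n are names invented for the example):

```r
# Hypothetical stand-in for the first six lines of chmval.txt.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "Dataset: EMAP Stream Chemistry Data",
  "File Name: chmval",
  "Date Created: 02/22/99",
  "# Variables: 75",
  "# Header Records: 85",
  "# Data Records: 711"
), tmp)

# Read the top of the file as plain character strings.
top <- readLines(tmp, n = 6)

# Find the line by its label rather than by position, then drop
# everything up to and including the colon and any spaces after it.
hdr_line <- grep("Header Records:", top, value = TRUE, fixed = TRUE)
n_header <- as.integer(sub(".*:\\s*", "", hdr_line))

# Number of lines to skip so that the column names are read as the header.
skip_n <- n_header - 1L
```

Matching on the label instead of taking top[5] is a design choice: it still works if another file in the same series has the metadata lines in a different order.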
David Winsemius, MD
Alameda, CA, USA