[R] extracting information from txt file
Taimur Sajid
tsajid at primaticsfinancial.com
Wed Oct 31 18:56:39 CET 2012
This worked for the example you provided. It assumes the header count is the only numeric value on the 5th line.
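epa_extract <- function(address){
  # Read only the first five lines and keep the 5th, e.g. "# Header Records: 85"
  doc <- readLines(address, n = 5)[5]
  # Strip every non-digit character, leaving just the header count
  head_count <- as.numeric(gsub("\\D", "", doc))
  # Skip one line fewer than the count (84 for a count of 85, as in the
  # question) so that the last header record is read as the column names
  read.table(address, sep = ",", header = TRUE, skip = head_count - 1)
}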
foo <- epa_extract("http://www.epa.gov/emap/html/data/surfwatr/data/mastreams/9396/wchem/chmval.txt")
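If the count is not always on the 5th line, or that line might carry other digits, a variant that looks for the "Header Records:" label may be more robust. The following is only a rough sketch: read_after_header, its arguments and the small temporary file are made up for illustration, and it assumes (as in the question) that the last header record is the column-name row, so one line fewer than the count is skipped.

```r
# Hypothetical helper: locate the header count by its label instead of
# by line position, then read the comma-separated table that follows.
read_after_header <- function(path, label = "Header Records:", n_scan = 20) {
  top <- readLines(path, n = n_scan)          # look only near the top of the file
  hit <- grep(label, top, fixed = TRUE, value = TRUE)
  if (length(hit) == 0) {
    stop("no line containing '", label, "' in the first ", n_scan, " lines")
  }
  # Keep the first run of digits that follows the label
  head_count <- as.numeric(sub(paste0(".*", label, "\\D*(\\d+).*"), "\\1", hit[1]))
  # Assumption: the last header record holds the column names
  read.table(path, sep = ",", header = TRUE, skip = head_count - 1)
}

# Made-up file mimicking the layout described in the question
tmp <- tempfile(fileext = ".txt")
writeLines(c("# File: demo.txt",
             "# Source: made-up values",
             "# Year: 1993",
             "# Rows: 2",
             "# Header Records: 6",
             "STRM_ID,PH,ANC",
             "A1,7.1,120",
             "B2,6.4,85"), tmp)
demo <- read_after_header(tmp)
```

The same function should accept the URL in place of tmp, since readLines and read.table both take URLs, but I have only tried it on a small local file like the one above.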
Taimur Sajid
Research & Development Analyst
Primatics Financial
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of chuck.01
Sent: Wednesday, October 31, 2012 12:47 PM
To: r-help at r-project.org
Subject: [R] extracting information from txt file
Hello,
Here is a link to some data:
http://www.epa.gov/emap/html/data/surfwatr/data/mastreams/9396/wchem/chmval.txt
I am trying to read this in, and want to use:
chmval <-
read.table("http://www.epa.gov/emap/html/data/surfwatr/data/mastreams/9396/wchem/chmval.txt",
sep=",", skip= 84, header=T)
The number 84 (the lines to skip) needs to be derived from the 5th line of the txt file, which reads:
# Header Records: 85
So I need that number minus 1 as input to the read.table statement above.
I've tried grep but that didn't work:
(for this I downloaded the txt file and manually removed that hash mark!)
grep("Header Records:", read.table("chmval.txt", header=T))
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
  line 1 did not have 5 elements
Any ideas?
Can I just extract the 5th line?
--
View this message in context: http://r.789695.n4.nabble.com/extracting-information-from-txt-file-tp4648033.html
Sent from the R help mailing list archive at Nabble.com.