[R] how to drop fields by name when reading in data?
David Winsemius
dwinsemius at comcast.net
Fri Mar 19 21:34:44 CET 2010
On Mar 19, 2010, at 3:03 PM, Peter Keller wrote:
>
> I have a number of space separated files of weather data, with some
> equivalent column names, and differing number of fields in each file.
> Some of the files have 40 or more vars, but I only want a subset of
> the fields. I can use colClasses with read.table to drop some of the
> fields, but only if I know where those columns are in the first place,
> and they're not always in the same place. So I would like to be able
> to drop all unwanted columns on import, by name.
>
> In addition, most fields have a "Q" (quality) field next to them, and
> I need to read those as well, each "Q" next to its relevant field,
> such as "Temp", and rename it to e.g., "Temp.Q".
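With the default check.names=TRUE, read.table() will pass the header
through make.names(unique=TRUE), so the repeated names will come out as
Q, Q.1, Q.2, etc. (and likewise I, I.1, I.2, ...).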
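A quick way to see which names result, sketched with base R only. The
header string is copied from the example data in this thread; the
variable names hdr, raw and nms are just for illustration.

```r
# Header line copied from the example data in this thread
hdr <- "Date HrMn I Type Dir Q I Spd Q Visby Q I Q Temp Q Dewpt Q Slp Q Pr Amt I Q"
raw <- strsplit(hdr, " ", fixed = TRUE)[[1]]

# read.table(header = TRUE, check.names = TRUE) runs the header through
# make.names(unique = TRUE), which adds numeric suffixes to duplicates
nms <- make.names(raw, unique = TRUE)

# Name given to the column that sits immediately after "Temp"
temp_q <- nms[which(nms == "Temp") + 1]
print(temp_q)
```

So the quality flag belonging to a field can be looked up by position
relative to that field's name, whatever suffix it ends up with.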
>
> Some example data:
> Date HrMn I Type Dir Q I Spd Q Visby Q I Q Temp Q Dewpt Q Slp Q Pr Amt I Q
> 19450101 0900 4 SAO 315 1 N 1.0 1 024000 1 N 1 -37.0 1 -45.9 1 1031.8 1 99 999.9 9 9
> 19450101 1000 4 SAO 315 1 N 1.0 1 024000 1 N 1 -35.9 1 -43.1 1 1032.2 1 99 999.9 9 9
> 19450101 1100 4 SAO 360 1 N 1.0 1 024000 1 N 1 -35.9 1 -43.1 1 1032.5 1 99 999.9 9 9
> 19450101 1200 4 SAO 315 1 N 1.0 1 024000 1 N 1 -36.4 1 -50.9 1 1032.9 1 99 999.9 9 9
> 19450101 1300 4 SAO 360 1 N 1.0 1 024000 1 N 1 -36.4 1 -43.1 1 1032.9 1 99 999.9 9 9
> 19450101 1400 4 SAO 315 1 N 1.0 1 016000 1 N 1 -36.4 1 -42.0 1 1032.5 1 99 999.9 9 9
> 19450101 1500 4 SAO 180 1 N 1.0 1 016000 1 N 1 -36.4 1 -45.3 1 1032.5 1 99 999.9 9 9
> 19450101 1600 4 SAO 360 1 N 1.0 1 024000 1 N 1 -37.5 1 -45.9 1 1032.9 1 99 999.9 9 9
>
> So if I want to extract Date, HrMn, Temp, and the Q following Temp:
> tmp1 <- read.table("ex.dat", sep=" ", strip.white=TRUE,
>     colClasses=c("character", "character", rep("NULL", 11),
>                  "numeric", "factor", rep("NULL", 8)),
>     na.strings="999.9", header=TRUE)
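One way to avoid counting positions, offered as a sketch rather than a
solution tested on every file layout: read only the header line first,
then build colClasses from the names. read_named is a hypothetical
helper (not a base R function), the temporary file stands in for
"ex.dat", and the two data rows are copied from the example above. The
Q after Temp is addressed as Q.4, the name make.names() gives it.

```r
# Hypothetical helper: keep only the named columns of a
# whitespace-separated file that has a one-line header.
read_named <- function(file, want, classes = NULL, ...) {
  # Column names as read.table() would produce them (duplicates suffixed)
  hdr <- scan(file, what = "", nlines = 1, quiet = TRUE)
  nms <- make.names(hdr, unique = TRUE)

  want <- union(want, names(classes))
  absent <- setdiff(want, nms)
  if (length(absent) > 0)
    stop("columns not found: ", paste(absent, collapse = ", "))

  # "NULL" skips a column; NA lets read.table() guess the type
  cc <- rep("NULL", length(nms))
  cc[nms %in% want] <- NA
  if (!is.null(classes)) cc[match(names(classes), nms)] <- classes

  read.table(file, header = TRUE, colClasses = cc, ...)
}

# Stand-in for "ex.dat": header plus two rows from the example data
tf <- tempfile(fileext = ".dat")
writeLines(c(
  "Date HrMn I Type Dir Q I Spd Q Visby Q I Q Temp Q Dewpt Q Slp Q Pr Amt I Q",
  "19450101 0900 4 SAO 315 1 N 1.0 1 024000 1 N 1 -37.0 1 -45.9 1 1031.8 1 99 999.9 9 9",
  "19450101 1000 4 SAO 315 1 N 1.0 1 024000 1 N 1 -35.9 1 -43.1 1 1032.2 1 99 999.9 9 9"
), tf)

tmp1 <- read_named(tf, want = c("Date", "HrMn", "Temp", "Q.4"),
                   classes = c(Date = "character", HrMn = "character",
                               Q.4 = "factor"),
                   na.strings = "999.9")
str(tmp1)
```

Because the classes are matched by name, the same call should keep
working if the columns move around in next year's file, as long as the
names and their order of repetition stay the same.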
>
> But having to alter colClasses for every file, the fields of which may
> change when next year's data is retrieved, is no fun. And is there a
> way to specify na.strings per column?
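There might be, if you were willing to define a new class and write an
as() method for it (setAs() in the methods package) that converts from
"character"; colClasses accepts the name of such a class. There was a
recent answer to an r-help currency conversion question that
illustrated this approach.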
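A minimal sketch of that idea, under some loudly stated assumptions:
the class name num999 is made up for this example, the two-column file
is invented for illustration (it is not the weather data), and the
sentinel "999.9" is treated as missing only in the column given that
class.

```r
library(methods)

# Hypothetical class: a numeric column in which "999.9" means missing
setClass("num999")
setAs("character", "num999", function(from) {
  x <- as.numeric(from)
  x[from %in% "999.9"] <- NA   # only this column treats 999.9 as NA
  x
})

# Invented two-column file: 999.9 is a real value in Slp,
# but a missing-value code in Amt
tf <- tempfile()
writeLines(c("Slp Amt",
             "999.9 999.9",
             "1031.8 1.5"), tf)

d <- read.table(tf, header = TRUE,
                colClasses = c("numeric", "num999"))
print(d)
```

The global na.strings argument is left at its default here, so the
per-column rule lives entirely in the as() method.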
>
> --
> View this message in context: http://n4.nabble.com/how-to-drop-fields-by-name-when-reading-in-data-tp1601166p1601166.html
> Sent from the R help mailing list archive at Nabble.com.
David Winsemius, MD
West Hartford, CT