[R] how to drop fields by name when reading in data?

David Winsemius dwinsemius at comcast.net
Fri Mar 19 21:34:44 CET 2010


On Mar 19, 2010, at 3:03 PM, Peter Keller wrote:

>
> I have a number of space separated files of weather data, with some
> equivalent column names, and differing number of fields in each file.
> Some of the files have 40 or more vars, but I only want a subset of the
> fields. I can use colClasses with read.table to drop some of the fields,
> but only if I know where those columns are in the first place, and
> they're not always in the same place. So I would like to be able to drop
> all unwanted columns on import, by name.
>
> In addition, most fields have a "Q" (quality) field next to them, and I
> need to read those as well, each "Q" next to its relevant field, such as
> "Temp", and rename it to e.g. "Temp.Q".

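With the default check.names=TRUE, read.table() passes the header
through make.names(unique=TRUE), so the repeated names will come in as
Q, Q.1, Q.2, etc. rather than as Temp.Q and friends.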
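
A rough sketch of that renaming, and of one way to get the paired names
you describe. The helper label_q() is made up for illustration, and it
assumes each Q belongs to the column immediately before it, which may
not hold for the "I Q" pairs in your header:

```r
# Header fields copied from the example data in the original post
hdr <- c("Date", "HrMn", "I", "Type", "Dir", "Q", "I", "Spd", "Q", "Visby",
         "Q", "I", "Q", "Temp", "Q", "Dewpt", "Q", "Slp", "Q", "Pr", "Amt",
         "I", "Q")

# What read.table(check.names = TRUE) does to the duplicated names
default_names <- make.names(hdr, unique = TRUE)

# Hypothetical helper: name each "Q" after the column just before it,
# then make any remaining duplicates (the repeated "I" flags) unique
label_q <- function(nms) {
  is_q <- nms == "Q"
  prev <- c(NA_character_, nms[-length(nms)])
  nms[is_q] <- paste0(prev[is_q], ".Q")
  make.unique(nms)
}

paired_names <- label_q(hdr)
```

The resulting vector could be passed as col.names to read.table(), or
assigned with names(dat) <- paired_names after reading.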

>
> Some example data:
> Date HrMn I Type Dir Q I Spd Q Visby Q I Q Temp Q Dewpt Q Slp Q Pr Amt I Q
> 19450101 0900 4 SAO 315 1 N 1.0 1 024000 1 N 1 -37.0 1 -45.9 1 1031.8 1 99 999.9 9 9
> 19450101 1000 4 SAO 315 1 N 1.0 1 024000 1 N 1 -35.9 1 -43.1 1 1032.2 1 99 999.9 9 9
> 19450101 1100 4 SAO 360 1 N 1.0 1 024000 1 N 1 -35.9 1 -43.1 1 1032.5 1 99 999.9 9 9
> 19450101 1200 4 SAO 315 1 N 1.0 1 024000 1 N 1 -36.4 1 -50.9 1 1032.9 1 99 999.9 9 9
> 19450101 1300 4 SAO 360 1 N 1.0 1 024000 1 N 1 -36.4 1 -43.1 1 1032.9 1 99 999.9 9 9
> 19450101 1400 4 SAO 315 1 N 1.0 1 016000 1 N 1 -36.4 1 -42.0 1 1032.5 1 99 999.9 9 9
> 19450101 1500 4 SAO 180 1 N 1.0 1 016000 1 N 1 -36.4 1 -45.3 1 1032.5 1 99 999.9 9 9
> 19450101 1600 4 SAO 360 1 N 1.0 1 024000 1 N 1 -37.5 1 -45.9 1 1032.9 1 99 999.9 9 9
>
> So if I want to extract Date, HrMn, Temp, and the Q following Temp:
> tmp1 <- read.table("ex.dat", sep=" ", strip.white=TRUE,
>     colClasses=c("character", "character",
>                  rep("NULL", 11), "numeric", "factor", rep("NULL", 8)),
>     na.strings="999.9", header=TRUE)
>
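
One way to avoid counting positions is to read the header line first and
build colClasses from the names. This is only a sketch: read_by_name()
and its arguments are invented for illustration, it relies on the default
whitespace separator rather than sep=" ", and it again assumes each Q
belongs to the column just before it.

```r
# Hypothetical helper: keep only the named columns, drop all the rest.
# `classes` is a named character vector, e.g. c(Temp = "numeric").
read_by_name <- function(file, classes, ...) {
  # Read just the first line to get the raw field names
  hdr <- scan(file, what = "", nlines = 1, quiet = TRUE)

  # Rename each "Q" after the preceding field, then de-duplicate the rest
  is_q <- hdr == "Q"
  prev <- c(NA_character_, hdr[-length(hdr)])
  hdr[is_q] <- paste0(prev[is_q], ".Q")
  hdr <- make.unique(hdr)

  not_found <- setdiff(names(classes), hdr)
  if (length(not_found) > 0)
    stop("columns not found: ", paste(not_found, collapse = ", "))

  # "NULL" tells read.table to skip a column; fill in the wanted ones by name
  cc <- rep("NULL", length(hdr))
  cc[match(names(classes), hdr)] <- classes

  read.table(file, skip = 1, header = FALSE, col.names = hdr,
             colClasses = cc, ...)
}

wanted <- c(Date = "character", HrMn = "character",
            Temp = "numeric", Temp.Q = "factor")
# tmp1 <- read_by_name("ex.dat", wanted, na.strings = "999.9")
```

The same `wanted` vector should then work for any file that has those
four fields, wherever they sit in the header.
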
> But having to alter colClasses for every file, the fields of which may
> change when next year's data is retrieved, is no fun.  And is there a
> way to specify na.strings per column?

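Not directly: na.strings applies to every column. There might be a way
if you wanted to write an "as" method (setAs() in the methods package)
for a new class and then name that class in colClasses, so that the
conversion from character handles that column's missing-value code.
There was a recent answer to an r-help currency conversion question
that illustrated this approach.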


David Winsemius, MD
West Hartford, CT


