[R] Extracting data by row

William Dunlap wdunlap at tibco.com
Sat Oct 29 22:33:27 CEST 2011


You didn't show the details of what you did before
   Ten<-dataTable1[(dataTable1$sensor_depth_m=="10"),]
but that line makes me suspicious that you did some
experimentation with syntax before coming up with that
line.  In particular, why did you use parentheses around
   (dataTable1$sensor_depth_m=="10")
and why did you use quotes around the 10?

Here is an example of why you might have started using
the unnecessary (and misleading) parentheses.  I'll
use a narrow dataset so its R printout does not get
line-wrapped by a rogue mailer.

  > d <- data.frame(one=c(3,1,1,2), ten=c(3,1,1,2)*10)
  > # first, use = instead of == for comparison
  > d[ d$one = 1, ]
  Error: unexpected '=' in "d[ d$one ="
  > # react to error message by adding parentheses
  > d[ (d$one = 1), ]
    one ten
  1   3  30
  > # That is definitely the wrong answer.
  > # The parentheses hid the problem that you were using =
  > # instead of ==: R did d$one<-1, then d[1,] returned row 1.
  > # Note that now d has been changed!
  > d
    one ten
  1   1  30
  2   1  10
  3   1  10
  4   1  20

That may explain how you got a column of 10's (after a ...=10),
but I don't know what you may have done to get a second column
called sensor_depth_m.  Nor do I have a guess as to why you
put quotes around the number 10.
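
For comparison, here is a minimal sketch of the subset written without
the parentheses and without the quotes, using the same toy data as
above.  The names keep, sel and sel_q are only for illustration.  On a
numeric column the quoted "10" happens to select the same rows, because
R coerces the numbers to character before comparing, so the quotes are
unnecessary rather than fatal.

```r
d <- data.frame(one = c(3, 1, 1, 2), ten = c(3, 1, 1, 2) * 10)

# == is a comparison: it returns a logical vector and does not modify d
keep <- d$ten == 10
sel <- d[keep, ]

# Quoting the number also works on a numeric column (10 is coerced to "10"),
# but it would not match character data such as " 10" with a leading space
sel_q <- d[d$ten == "10", ]

print(sel)
```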

This is why R-help asks to see exactly what you did in R,
not just a synopsis.  (When I copied what you showed into
R, adding a read.csv(header=TRUE, textConnection("...data...")),
I got the correct 2 lines of output, not the 3 lines you
indicated that you wanted.  Your sample has only two rows at
10 m; the 13:00:00 row is recorded at 2 m.)
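
A sketch of that check, assuming the sample was pasted in as one
string.  csv_text is my name for that string; dataTable1 and Ten are
the names from your post.  This only reproduces the sample you showed,
so it says nothing about what else happened in your session.

```r
# The nine sample rows from the post, pasted verbatim
csv_text <- "site,time_local,time_utc,reef_type_code,sensor_type,sensor_depth_m,temperature_c
06,2006-04-09 10:20:00,2006-04-09 20:20:00,BAK,sb39, 2, 29.63
06,2006-04-09 10:40:00,2006-04-09 20:40:00,BAK,sb39, 2, 29.56
06,2006-04-09 11:00:00,2006-04-09 21:00:00,BAK,sb39, 2, 29.51
06,2006-04-09 11:20:00,2006-04-09 21:20:00,BAK,sb39, 2, 29.53
06,2006-04-09 11:40:00,2006-04-09 21:40:00,BAK,sb39, 10, 29.57
06,2006-04-09 12:00:00,2006-04-09 22:00:00,BAK,sb39, 2, 29.60
06,2006-04-09 12:20:00,2006-04-09 22:20:00,BAK,sb39, 2, 29.66
06,2006-04-09 12:40:00,2006-04-09 22:40:00,BAK,sb39, 10, 29.68
06,2006-04-09 13:00:00,2006-04-09 23:00:00,BAK,sb39, 2, 29.68"

# Read the text as if it were a file; the depth column comes in as numeric
dataTable1 <- read.csv(header = TRUE, textConnection(csv_text))

# Keep only the rows recorded at 10 m: no parentheses, no quotes
Ten <- dataTable1[dataTable1$sensor_depth_m == 10, ]
print(Ten)
```

Note that the result has the same seven columns as the input; a plain
comparison like this does not add a column.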

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Vinny Moriarty
> Sent: Friday, October 28, 2011 6:05 PM
> To: r-help at r-project.org
> Subject: [R] Extracting data by row
> 
> Thanks everyone for your help with my last question, and now I have one last
> one...
> 
> 
> Here is a sample of my data in .csv format
> 
> site,time_local,time_utc,reef_type_code,sensor_type,sensor_depth_m,temperature_c
> 06,2006-04-09 10:20:00,2006-04-09 20:20:00,BAK,sb39, 2, 29.63
> 06,2006-04-09 10:40:00,2006-04-09 20:40:00,BAK,sb39, 2, 29.56
> 06,2006-04-09 11:00:00,2006-04-09 21:00:00,BAK,sb39, 2, 29.51
> 06,2006-04-09 11:20:00,2006-04-09 21:20:00,BAK,sb39, 2, 29.53
> 06,2006-04-09 11:40:00,2006-04-09 21:40:00,BAK,sb39, 10, 29.57
> 06,2006-04-09 12:00:00,2006-04-09 22:00:00,BAK,sb39, 2, 29.60
> 06,2006-04-09 12:20:00,2006-04-09 22:20:00,BAK,sb39, 2, 29.66
> 06,2006-04-09 12:40:00,2006-04-09 22:40:00,BAK,sb39, 10, 29.68
> 06,2006-04-09 13:00:00,2006-04-09 23:00:00,BAK,sb39, 2, 29.68
> 
> 
> My goal was to extract all of the rows from a certain depth. Using the
> column "sensor_depth_m" to order my data by, I wanted all of the data from
> 10m. So this is what I wanted when I finished
> 
> site,time_local,time_utc,reef_type_code,sensor_type,sensor_depth_m,temperature_c
> 06,2006-04-09 11:40:00,2006-04-09 21:40:00,BAK,sb39, 10, 29.57
> 06,2006-04-09 12:40:00,2006-04-09 22:40:00,BAK,sb39, 10, 29.68
> 06,2006-04-09 13:00:00,2006-04-09 23:00:00,BAK,sb39, 10, 29.68
> 
> 
> 
> 
> To pull out all of the data from a 10m sensor depth I came up with the code:
> 
> Ten<-dataTable1[(dataTable1$sensor_depth_m=="10"),]
> 
> 
> But when I run it I just get an extra column tacked onto the end like this
> 
> 
> site,time_local,time_utc,reef_type_code,sensor_type,sensor_depth_m,temperature_c,
> sensor_depth_m
> 06,2006-04-09 10:20:00,2006-04-09 20:20:00,BAK,sb39, 2, 29.63,10
> 06,2006-04-09 10:40:00,2006-04-09 20:40:00,BAK,sb39, 2, 29.56,10
> 06,2006-04-09 11:00:00,2006-04-09 21:00:00,BAK,sb39, 2, 29.51,10
> 06,2006-04-09 11:20:00,2006-04-09 21:20:00,BAK,sb39, 2, 29.53,10
> 06,2006-04-09 11:40:00,2006-04-09 21:40:00,BAK,sb39, 10, 29.57,10
> 06,2006-04-09 12:00:00,2006-04-09 22:00:00,BAK,sb39, 2, 29.60,10
> 06,2006-04-09 12:20:00,2006-04-09 22:20:00,BAK,sb39, 2, 29.66,10
> 06,2006-04-09 12:40:00,2006-04-09 22:40:00,BAK,sb39, 10, 29.68,10
> 06,2006-04-09 13:00:00,2006-04-09 23:00:00,BAK,sb39, 10, 29.68,10
> 
> 
> It seems pretty straightforward, I'm not sure what I am missing.
> 
> 
> Thanks
> 
> V