[R] How would I analyse data like this?

laurent.duperval at microcell.ca
Wed Mar 19 17:45:01 CET 2003


I'm a new R user and I'm having a little trouble getting started. I'm hoping
someone can help me out.

I have data that looks like this:

15555551234|3|983|1000|266|IN|2003-03-16 23:57:21-05|C
15555552345|3|983|3000|0|IN|2003-03-16 23:58:16-05|C
15555552346|3|983|1000|40|IN|2003-03-16 23:58:24-05|C

I've read it using scan():

data <- scan(file = "data.dat", what = list("",0,0,0,0,"","",""), sep = "|", skip = 1)

Now, I want to do things like this:

- A histogram for the 5th column for every 50 units. I can generate the
  histogram but most of my values are between 0-500. A few are above that.
  I'd like to bundle them all in a generic 500+ category. I can't figure out
  how. This is what I'm doing:

hist(data[[5]], br=c(0,50,100,150,200,250,300,350,400,450,500,1000))
Error in hist.default(data[[5]], br = c(0, 50, 100, 150, 200, 250, 300,  : 
	some `x' not counted; maybe `breaks' do not span range of `x'

- How do I count the number of times channel "IN" occurs with code = 983?
  What if I want to count IN together with any of the codes 983, 982 or 981?

- Finally (for today at least), how do I count the number of times code=983
  and date=2003-03-16 (without the time) occur together? I'm hoping this will
  also help me build histograms for days of the week and for hours of the day.
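
For reference, here is a rough sketch of one way the three tasks above might be approached. It is not a definitive answer. The column names (number, type, code, amount, value, channel, stamp, status) are invented for illustration, and the fourth sample row is made up so that there is a value above 500. The hist() error suggests that some values fall outside the 0-1000 range of the breaks, so the sketch bins with cut() and an open-ended last interval instead.

```r
# Hypothetical sample in the same pipe-delimited layout as the data above.
# The fourth row is invented so that one value falls above 500.
lines <- c(
  "15555551234|3|983|1000|266|IN|2003-03-16 23:57:21-05|C",
  "15555552345|3|983|3000|0|IN|2003-03-16 23:58:16-05|C",
  "15555552346|3|983|1000|40|IN|2003-03-16 23:58:24-05|C",
  "15555552347|3|981|1000|1250|OUT|2003-03-17 08:02:10-05|C"
)

# Column names are assumptions; the original data does not name its fields.
d <- read.table(text = lines, sep = "|", quote = "", comment.char = "",
                col.names = c("number", "type", "code", "amount", "value",
                              "channel", "stamp", "status"),
                colClasses = c("character", "numeric", "numeric", "numeric",
                               "numeric", "character", "character",
                               "character"))

# 1. Bins of width 50 up to 500, plus one open-ended bin for everything else.
#    right = FALSE makes the intervals [0,50), [50,100), ..., [500,Inf).
bins <- cut(d$value, breaks = c(seq(0, 500, by = 50), Inf), right = FALSE)
levels(bins)[nlevels(bins)] <- "500+"   # rename the open-ended bin
counts <- table(bins)
if (interactive()) barplot(counts, las = 2)  # draw only in an interactive session

# 2. Count rows matching a channel and one or several codes.
n_in_983 <- sum(d$channel == "IN" & d$code == 983)
n_in_98x <- sum(d$channel == "IN" & d$code %in% c(981, 982, 983))

# 3. Split the timestamp into day and hour by character position.
day  <- substr(d$stamp, 1, 10)               # "YYYY-MM-DD"
hour <- as.integer(substr(d$stamp, 12, 13))  # hour of day, 0-23
n_983_day <- sum(d$code == 983 & day == "2003-03-16")

# Tabulations that could feed barplot() for day-of-week and hour-of-day plots.
by_weekday <- table(weekdays(as.Date(day)))  # weekday names depend on locale
by_hour    <- table(hour)
```

The same logical-vector idiom, sum(condition1 & condition2), covers both counting questions; table(d$channel, d$code) would give all combinations at once.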


Laurent Duperval <laurent.duperval at microcell.ca>

    Once you open a can of worms, the only way to recan them is to use
    a larger can.
