[R] How would I analyse data like this?
Jason Turner
jasont at indigoindustrial.co.nz
Wed Mar 19 20:13:18 CET 2003
On Wed, Mar 19, 2003 at 12:40:20PM -0500, laurent.duperval at microcell.ca wrote:
> On 19 Mar, james.holtman at convergys.com wrote:
> > Have you tried:
> > data <- read.table("data.dat", header=TRUE, sep="|", as.is=TRUE)
> >
>
> Yes I did. However, it takes a LOT more time because of the date/time
> string. The result looks like this:
>
>
> str(data)
> `data.frame': 317437 obs. of 8 variables:
> $ phone : num 1.52e+10 1.42e+10 1.82e+10 1.65e+10 1.65e+10 ...
> $ state : int 3 3 3 3 3 3 3 3 3 3 ...
> $ code : int 983 983 983 983 3000 983 983 983 983 5203 ...
> $ amount : int 1000 1000 2500 2500 2500 1000 1000 2500 2500 2500 ...
> $ left : int 260 0 0 25 0 1260 273 0 0 0 ...
> $ channel : Factor w/ 5 levels "CSR","IN","IVR",..: 2 5 4 2 3 2 2 3 4 3 ...
> $ time : Factor w/ 312198 levels "2002-10-16 ..",..: 1 2 3 4 5 6 7 8 9 10 ...
> $ mtd : Factor w/ 2 levels "C","D": 1 1 1 1 1 1 1 1 1 1 ...
>
> I think the 312198 factor level is wrong. Also, the phone column is a string,
> not a number. I didn't see how to specify that with read.table(). (In my
> original post, I think I forgot to mention that I had over 300,000 entries in
> my file).
Check out the colClasses argument to read.table. Something like...
library(methods)  # necessary for colClasses
data <- read.table("data.dat", header=TRUE, sep="|",
                   colClasses=c("character","integer","integer",
                                "integer","integer","character","character",
                                "character"))
You can convert the items you need to be factors after they're loaded,
like this...
data$mtd <- factor(data$mtd)
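
The same idea should extend to the other columns. Here is a rough sketch, not tested against your real file: the two sample rows are made up for illustration, and the timestamp format "%Y-%m-%d %H:%M:%S" and the UTC time zone are assumptions, since only the start of one time value is visible in your str() output.

```r
# Made-up sample rows in the same pipe-separated layout (illustrative only).
sample_text <- "phone|state|code|amount|left|channel|time|mtd
15145550001|3|983|1000|260|IN|2002-10-16 08:15:00|C
15145550002|3|3000|2500|0|IVR|2002-10-16 08:16:30|D"

# Read phone, channel, time and mtd as plain strings; the rest as integers.
data <- read.table(text = sample_text, header = TRUE, sep = "|",
                   colClasses = c("character", rep("integer", 4),
                                  rep("character", 3)))

# Convert after loading: categorical columns become factors.
data$channel <- factor(data$channel)
data$mtd     <- factor(data$mtd)

# Assumed timestamp format and time zone; adjust both to match the real file.
data$time <- as.POSIXct(data$time, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

str(data)
```

Reading the time column as character first means read.table never has to build a factor with hundreds of thousands of levels, and the conversion to a date-time class happens in a single vectorised call afterwards.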
Hope it helps
Jason
--
Indigo Industrial Controls Ltd.
64-21-343-545
jasont at indigoindustrial.co.nz