[R] How to do indexing after splitting my data-frame?
Oliver Bandel
oliver at first.in-berlin.de
Sat Dec 20 21:35:08 CET 2008
Hello,
after splitting a data-frame I want to access the results.
Maybe the problem is that the factor/index is a string...
...or am I missing some detail of how indexing works?
Please look and help:
=======================================
> weblog <- read_weblog("web.log")
>
>
> str(weblog)
'data.frame': 2247 obs. of 18 variables:
$ host : Factor w/ 77 levels "124.0.210.117",..: 23 44 44 23 46 46 26 26 42 32 ...
$ lname : Factor w/ 1 level "-": 1 1 1 1 1 1 1 1 1 1 ...
$ user : Factor w/ 1 level "-": 1 1 1 1 1 1 1 1 1 1 ...
$ date_time : chr "29/Nov/2008:00:09:52" "29/Nov/2008:01:08:37" "29/Nov/2008:01:08:37" "29/Nov/2008:03:39:45" ...
$ timezone : chr "+0100" "+0100" "+0100" "+0100" ...
$ status : int 404 200 304 403 301 200 200 404 304 200 ...
$ size : num 307 32 0 314 333 ...
$ referrer : Factor w/ 19 levels "-","http://messenger.su/",..: 1 1 1 1 1 1 11 1 1 1 ...
$ client : Factor w/ 45 levels "digsby-asynchttp/0.1",..: 30 4 4 30 28 28 20 20 27 41 ...
$ req_file : chr "/software/tools/newfileaction/pftdbns/" "/robots.txt" "/kurama_2007/tn_kurama_fire_festival_hpim4496.jpg" "/software/libraries/mboxlib/mbox.mli.html" ...
$ req_method: chr "GET" "GET" "GET" "GET" ...
$ req_prot : chr "HTTP/1.0" "HTTP/1.1" "HTTP/1.1" "HTTP/1.0" ...
$ date : chr "29-Nov-2008" "29-Nov-2008" "29-Nov-2008" "29-Nov-2008" ...
$ hour : chr "00" "01" "01" "03" ...
$ day : chr "29" "29" "29" "29" ...
$ month : chr "Nov" "Nov" "Nov" "Nov" ...
$ year : chr "2008" "2008" "2008" "2008" ...
$ t_sec : atomic 1.23e+09 1.23e+09 1.23e+09 1.23e+09 1.23e+09 ...
..- attr(*, "tzone")= chr ""
>
>
> weblog_by_date <- split(weblog, weblog$date)
>
> weblog_by_date$"01-Dec-2008"$host
[1] 74.6.22.164 74.6.22.164 74.6.22.164 67.195.37.169
[5] 67.195.37.169 74.6.22.164 174.36.196.98 174.36.196.98
[9] 67.195.37.169 72.30.65.23 72.30.65.23 65.55.210.177
[13] 65.55.210.177 74.6.22.160 74.6.22.160 74.6.22.121
[17] 74.6.22.121 208.80.194.30 66.249.71.141 66.249.71.141
[21] 66.249.71.141 216.34.181.101 216.34.181.101 65.55.210.182
[25] 65.55.210.182 38.99.44.101 217.212.224.183 217.212.224.186
[29] 89.111.176.102 89.111.176.102 66.249.71.141 65.55.210.180
[33] 65.55.210.180 65.55.210.179 65.55.210.179
77 Levels: 124.0.210.117 145.253.3.244 160.91.44.155 ... 94.23.3.220
>
> myindex <- "01-Dec-2008"
>
> weblog_by_date$myindex$host
NULL
> weblog_by_date[myindex]$host
NULL
>
=======================================
How can I reach into the resulting data structure, indexing first by
the date string and then by column names such as "host"?
In other words: is it possible to use split in such a way that the original
column names ("host", "status" and so on) can still be used?
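For illustration, here is a minimal sketch of the access pattern in question,
run on a small made-up data frame. The names toy and toy_by_date and all of
their values are hypothetical stand-ins, not the real weblog. It relies on the
fact that `$` takes the name after it literally, `[` on a list returns a
sub-list, and `[[` evaluates its argument and returns the element itself:

```r
# Toy stand-in for weblog; the values are made up for illustration only.
toy <- data.frame(
  host   = c("74.6.22.164", "67.195.37.169", "66.249.71.141"),
  status = c(200L, 404L, 200L),
  date   = c("29-Nov-2008", "01-Dec-2008", "01-Dec-2008"),
  stringsAsFactors = FALSE
)

# split() returns a named list with one data frame per date.
toy_by_date <- split(toy, toy$date)
myindex <- "01-Dec-2008"

# `$` does not evaluate myindex; it looks for an element literally
# named "myindex", which does not exist.
is.null(toy_by_date$myindex)

# `[` keeps the list wrapper: the result is a list of length one,
# so a following $host finds nothing.
class(toy_by_date[myindex])

# `[[` evaluates myindex and extracts the data frame itself,
# so the original column names work again.
toy_by_date[[myindex]]$host
toy_by_date[[myindex]][["status"]]
```

If this holds for the real data as well, weblog_by_date[[myindex]]$host would
be the form to try.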
Ciao,
Oliver