[R] cut.POSIXt misconception/feature/bug?
Petr PIKAL
petr.pikal at precheza.cz
Thu Mar 11 08:26:32 CET 2010
Hi
Thanks for the clarification. Actually, I knew that in the first case I get
some data with NAs at the beginning and at the end. Maybe my English is not
good enough to understand that, to split a vector of dates into several
chunks, I also need to include the first date and the last date among the
breaks to cover the whole vector.
This is what the help page for cut.POSIXt says about breaks, and although I
read it carefully I did not find any mention that at least 2 values are
necessary. I did not connect it with the information from the cut help page,
sorry
breaks: a vector of cut points _or_ number giving the number of
^^^^^^^^^^^^^^^^^^^^^^
which can be a vector of length one. However, thinking about it more
thoroughly, a vector of length one is probably interpreted in the same way
as a single number.
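A minimal sketch of why a single date-time break could end up being read as a count, assuming (as the traceback quoted further down suggests) that cut.POSIXt hands unclass(breaks) on to cut.default. Only the numeric value is inspected here; the failing cut() call is deliberately not run, and the name secs is just for illustration:

```r
# Same test data as in the thread: 60 daily date-times starting 2000-03-20 (GMT).
dat <- seq(ISOdate(2000, 3, 20), by = "day", length.out = 60)

# A single POSIXct break is, underneath, one number: seconds since 1970-01-01.
br <- dat[42]
secs <- as.numeric(br)

length(secs)       # one value, so it looks like "a single number" to cut()
is.numeric(secs)   # and it is numeric
secs > 9e8         # far too large to be a sensible number of intervals
```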
Therefore I also missed the clue that I need not only
Details:
Using both ‘right = TRUE’ and ‘include.lowest = TRUE’ will
include both ends of the range of dates.
but also
br<-dat[c(1, 23, 42,60)]
to get the whole vector of cut dates without NAs at both ends.
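A minimal sketch of the difference, assuming the same dat as in the quoted example. Using min(dat) and max(dat) for the outer breaks is my own shorthand for dat[1] and dat[60], not something taken from the help page:

```r
# Same test data as in the thread: 60 daily date-times starting 2000-03-20 (GMT).
dat <- seq(ISOdate(2000, 3, 20), by = "day", length.out = 60)

# Inner cut points only: every date outside [dat[23], dat[42]] becomes NA.
inner <- cut(dat, breaks = dat[c(23, 42)], right = TRUE, include.lowest = TRUE)
sum(is.na(inner))   # number of dates that fall outside the two breaks

# Adding the first and last dates as breaks covers the whole vector.
br <- c(min(dat), dat[c(23, 42)], max(dat))
full <- cut(dat, breaks = br, right = TRUE, include.lowest = TRUE)
anyNA(full)         # no NAs are expected once both ends are included
table(full)         # sizes of the three chunks
```

With right = TRUE the break dates themselves belong to the chunk that ends at them, so dat[23] falls into the first chunk and dat[42] into the second.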
Maybe
breaks: a vector of 2 or more cut points _or_ number giving the number
of...
Using both ‘right = TRUE’ and ‘include.lowest = TRUE’ together with
starting and ending date will include both ends of the range of dates.
could make the help page more digestible.
Thank you.
Petr
jim holtman <jholtman at gmail.com> wrote on 10.03.2010 16:10:43:
> In the first case you did not look far enough into the data:
>
> > dat <- seq(c(ISOdate(2000,3,20)), by = "day", length.out = 60)
> > br<-dat[c(23, 42)]
> > cut(dat, breaks=br, right=T, include.lowest=T)
>  [1] <NA>                <NA>                <NA>                <NA>                <NA>                <NA>
>  [7] <NA>                <NA>                <NA>                <NA>                <NA>                <NA>
> [13] <NA>                <NA>                <NA>                <NA>                <NA>                <NA>
> [19] <NA>                <NA>                <NA>                <NA>                2000-04-11 08:00:00 2000-04-11 08:00:00
> [25] 2000-04-11 08:00:00 2000-04-11 08:00:00 2000-04-11 08:00:00 2000-04-11 08:00:00 2000-04-11 08:00:00 2000-04-11 08:00:00
> [31] 2000-04-11 08:00:00 2000-04-11 08:00:00 2000-04-11 08:00:00 2000-04-11 08:00:00 2000-04-11 08:00:00 2000-04-11 08:00:00
> [37] 2000-04-11 08:00:00 2000-04-11 08:00:00 2000-04-11 08:00:00 2000-04-11 08:00:00 2000-04-11 08:00:00 2000-04-11 08:00:00
> [43] <NA>                <NA>                <NA>                <NA>                <NA>                <NA>
> [49] <NA>                <NA>                <NA>                <NA>                <NA>                <NA>
> [55] <NA>                <NA>                <NA>                <NA>                <NA>                <NA>
> Levels: 2000-04-11 08:00:00
> >
> In the second case you did not read the documentation closely enough:
>
>
> breaks
>
> either a numeric vector of two or more cut points or a single number
> (greater than or equal to 2) giving the number of intervals into which
> x is to be cut.
>
>
> Need at least a vector of length 2 for the breaks.
> Try this:
>
> > br<-dat[42]
> > cut(dat, breaks=c(dat[1], br), right=T, include.lowest=T)
>  [1] 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00
>  [7] 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00
> [13] 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00
> [19] 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00
> [25] 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00
> [31] 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00
> [37] 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00
> [43] <NA>                <NA>                <NA>                <NA>                <NA>                <NA>
> [49] <NA>                <NA>                <NA>                <NA>                <NA>                <NA>
> [55] <NA>                <NA>                <NA>                <NA>                <NA>                <NA>
> Levels: 2000-03-20 07:00:00
> >
>
>
> On Wed, Mar 10, 2010 at 4:01 AM, Petr PIKAL <petr.pikal at precheza.cz> wrote:
> Dear all
> recently I tried to split a vector of dates at some particular date into
> 2 (or more) chunks, but I was not able to find the correct setting.
>
> When I want to split into 3 chunks it partially works; however, from the
> help page I expected to get a result without NAs.
>
> Details:
>
> Using both ‘right = TRUE’ and ‘include.lowest = TRUE’ will
> include both ends of the range of dates.
>
> dat <- seq(c(ISOdate(2000,3,20)), by = "day", length.out = 60)
> br<-dat[c(23, 42)]
> head(cut(dat, breaks=br, right=T, include.lowest=T))
>
> [1] <NA> <NA> <NA> <NA> <NA> <NA>
> Levels: 2000-04-11 14:00:00
>
> which apparently is not the output I would like to have.
>
> When trying to split into 2 chunks there is a strange error
>
> br<-dat[42]
> cut(dat, breaks=br, right=T, include.lowest=T)
> Error in cut.default(unclass(x), unclass(breaks), labels = labels, right = right, :
>   cannot allocate vector of length 955454401
>
> I traced it back to
>
> Browse[5]> nb
> [1] 955454401
> ^^^^^^^^^^^^^^^^^^^^^^
> Browse[5]>
> debug: NULL
> Browse[5]>
> debug: breaks <- seq.int(rx[1L] - dx/1000, rx[2L] + dx/1000, length.out = nb)
> Browse[5]>
> Error in cut.default(unclass(x), unclass(breaks), labels = labels, right = right, :
>   cannot allocate vector of length 955454401
>
> which is probably not correct.
>
> Can somebody put me on the right track?
>
>
> > version
> _
> platform i386-pc-mingw32
> arch i386
> os mingw32
> system i386, mingw32
> status Under development (unstable)
> major 2
> minor 11.0
> year 2010
> month 03
> day 09
> svn rev 51229
> language R
> version.string R version 2.11.0 Under development (unstable) (2010-03-09 r51229)
>
> Regards
> Petr
>
>
>
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem that you are trying to solve?