[R] cut.POSIXt misconception/feature/bug?

Petr PIKAL petr.pikal at precheza.cz
Thu Mar 11 08:26:32 CET 2010


Hi

Thanks for the clarification. Actually I knew that in the first case I get 
data with NAs at the beginning and at the end. Maybe my English is not 
good enough to understand that, to split a vector of dates into several 
chunks, I also need to include the first date and the last date in the 
breaks to cover the whole vector.

This is what the help page for cut.POSIXt says about breaks, and although I 
read it carefully I did not find any mention that at least 2 values are 
necessary. I did not connect it with the information from the cut help page, 
sorry

breaks: a vector of cut points _or_ number giving the number of
        ^^^^^^^^^^^^^^^^^^^^^^
which can be a vector of length one. However, thinking about it more 
thoroughly, a vector of length one is probably interpreted the same way 
as a single number.
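
A small sketch of why a length-one break behaves like a number. The call in 
the error message quoted below shows that cut.POSIXt hands unclass(breaks) to 
cut.default, so I assume a single date-time arrives there as a plain count of 
seconds and is then read as "number of intervals". The names br and secs are 
only for illustration.

```r
# Same test vector as in the thread (c() around ISOdate dropped; not needed here)
dat <- seq(ISOdate(2000, 3, 20), by = "day", length.out = 60)

br <- dat[42]            # a single cut point
length(br)               # one element only

# Stripping the class leaves seconds since 1970-01-01,
# i.e. a very large single number
secs <- as.numeric(br)
secs > 1e8

# Assumption: a simple guard before calling cut() avoids the problem
if (length(br) < 2) message("need at least 2 cut points for date-time breaks")
```

That would explain the very large nb seen in the traceback.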

Therefore I also missed the clue that I need not only

Details:

     Using both ‘right = TRUE’ and ‘include.lowest = TRUE’ will
     include both ends of the range of dates.

but also

 br<-dat[c(1, 23, 42,60)]

to get the whole vector of cut dates without NAs at either end.
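
For completeness, a minimal sketch of the full call with the first and last 
date included in the breaks. It uses the same dat as in the thread; res is just 
an illustrative name, and the check at the end is my own addition.

```r
# 60 consecutive days starting 2000-03-20
dat <- seq(ISOdate(2000, 3, 20), by = "day", length.out = 60)

# Inner cut points plus the first and last date of the vector
br <- dat[c(1, 23, 42, 60)]

# right = TRUE closes each interval on the right,
# include.lowest = TRUE also keeps the very first date
res <- cut(dat, breaks = br, right = TRUE, include.lowest = TRUE)

any(is.na(res))   # FALSE: every date falls into one of the chunks
nlevels(res)      # 3 chunks
table(res)        # how many dates ended up in each chunk
```

Using range(dat) together with the inner cut points should give the same 
breaks without hard-coding the positions 1 and 60.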

Maybe

breaks: a vector of 2 or more cut points _or_ number giving the number 
of...

Using both ‘right = TRUE’ and ‘include.lowest = TRUE’ together with 
starting and ending date will include both ends of the range of dates.

could make the help page more digestible.

Thank you.

Petr

jim holtman <jholtman at gmail.com> wrote on 10.03.2010 16:10:43:

> In the first case you did not look far enough into the data:
>  
> > dat <- seq(c(ISOdate(2000,3,20)), by = "day", length.out = 60)
> > br<-dat[c(23, 42)]
> > cut(dat, breaks=br, right=T, include.lowest=T)
>  [1] <NA>                <NA>                <NA>                <NA>                <NA>                <NA>
>  [7] <NA>                <NA>                <NA>                <NA>                <NA>                <NA>
> [13] <NA>                <NA>                <NA>                <NA>                <NA>                <NA>
> [19] <NA>                <NA>                <NA>                <NA>                2000-04-11 08:00:00 2000-04-11 08:00:00
> [25] 2000-04-11 08:00:00 2000-04-11 08:00:00 2000-04-11 08:00:00 2000-04-11 08:00:00 2000-04-11 08:00:00 2000-04-11 08:00:00
> [31] 2000-04-11 08:00:00 2000-04-11 08:00:00 2000-04-11 08:00:00 2000-04-11 08:00:00 2000-04-11 08:00:00 2000-04-11 08:00:00
> [37] 2000-04-11 08:00:00 2000-04-11 08:00:00 2000-04-11 08:00:00 2000-04-11 08:00:00 2000-04-11 08:00:00 2000-04-11 08:00:00
> [43] <NA>                <NA>                <NA>                <NA>                <NA>                <NA>
> [49] <NA>                <NA>                <NA>                <NA>                <NA>                <NA>
> [55] <NA>                <NA>                <NA>                <NA>                <NA>                <NA>
> Levels: 2000-04-11 08:00:00
> > 
> In the second case you did not read the documentation closely enough:
>  
> 
> breaks
> 
> either a numeric vector of two or more cut points or a single number (greater
> than or equal to 2) giving the number of intervals into which x is to be cut.
> 
>  
> Need at least a vector of length 2 for the breaks.
>   Try this:
>  
> > br<-dat[42]
> > cut(dat, breaks=c(dat[1], br), right=T, include.lowest=T)
>  [1] 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00
>  [7] 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00
> [13] 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00
> [19] 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00
> [25] 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00
> [31] 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00
> [37] 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00 2000-03-20 07:00:00
> [43] <NA>                <NA>                <NA>                <NA>                <NA>                <NA>
> [49] <NA>                <NA>                <NA>                <NA>                <NA>                <NA>
> [55] <NA>                <NA>                <NA>                <NA>                <NA>                <NA>
> Levels: 2000-03-20 07:00:00
> > 
>  
> 

> On Wed, Mar 10, 2010 at 4:01 AM, Petr PIKAL <petr.pikal at precheza.cz> wrote:
> Dear all
> recently I tried to split a vector of dates at some particular
> date into 2 (or more) chunks, but I was not able to find the correct setting.
> 
> When I want to split into 3 chunks it partially works; however, from the help
> page I expected to get a result without NAs.
> 
> Details:
> 
>     Using both ‘right = TRUE’ and ‘include.lowest = TRUE’ will
>     include both ends of the range of dates.
> 
> dat <- seq(c(ISOdate(2000,3,20)), by = "day", length.out = 60)
> br<-dat[c(23, 42)]
> head(cut(dat, breaks=br, right=T, include.lowest=T))
> 
> [1] <NA> <NA> <NA> <NA> <NA> <NA>
> Levels: 2000-04-11 14:00:00
> 
> which apparently is not the output I would like to have.
> 
> When trying to split into 2 chunks there is a strange error
> 
> br<-dat[42]
> cut(dat, breaks=br, right=T, include.lowest=T)
> Error in cut.default(unclass(x), unclass(breaks), labels = labels, right =
> right,  :  cannot allocate vector of length 955454401
> 
> I traced it back to
> 
> Browse[5]> nb
> [1] 955454401
> ^^^^^^^^^^^^^^^^^^^^^^
> Browse[5]>
> debug: NULL
> Browse[5]>
> debug: breaks <- seq.int(rx[1L] - dx/1000, rx[2L] + dx/1000, length.out =
> nb)
> Browse[5]>
> Error in cut.default(unclass(x), unclass(breaks), labels = labels, right =
> right,  :
>  cannot allocate vector of length 955454401
> 
> which is probably not correct.
> 
> Can somebody help me get on the right track?
> 
> 
> > version
>               _
> platform       i386-pc-mingw32
> arch           i386
> os             mingw32
> system         i386, mingw32
> status         Under development (unstable)
> major          2
> minor          11.0
> year           2010
> month          03
> day            09
> svn rev        51229
> language       R
> version.string R version 2.11.0 Under development (unstable) (2010-03-09
> r51229)
> 
> Regards
> Petr
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 
> 
> -- 
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
> 
> What is the problem that you are trying to solve?
