[Rd] Regression in strptime

peter dalgaard pdalgd at gmail.com
Sat Mar 12 19:11:40 CET 2016


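OK, .Internal is not necessary to reproduce the oddity in this area. I also see things like the following (notice 1980 at [59], which comes out as CEST, and 1942, which is CEST at [1] but CET at [64] within the same call):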

> strptime(paste0(sample(1900:1999,80,replace=TRUE),"/01/01"), "%Y/%m/%d", tz="CET")
 [1] "1942-01-01 CEST" "1902-01-01 CET"  "1956-01-01 CET"  "1972-01-01 CET" 
 [5] "1962-01-01 CET"  "1900-01-01 CET"  "1921-01-01 CET"  "1972-01-01 CET" 
 [9] "1918-01-01 CET"  "1989-01-01 CET"  "1900-01-01 CET"  "1970-01-01 CET" 
[13] "1971-01-01 CET"  "1910-01-01 CET"  "1956-01-01 CET"  "1953-01-01 CET" 
[17] "1964-01-01 CET"  "1932-01-01 CET"  "1968-01-01 CET"  "1990-01-01 CET" 
[21] "1961-01-01 CET"  "1920-01-01 CET"  "1961-01-01 CET"  "1941-01-01 CEST"
[25] "1947-01-01 CET"  "1979-01-01 CET"  "1943-01-01 CET"  "1976-01-01 CET" 
[29] "1951-01-01 CET"  "1912-01-01 CET"  "1983-01-01 CET"  "1985-01-01 CET" 
[33] "1970-01-01 CET"  "1917-01-01 CET"  "1930-01-01 CET"  "1966-01-01 CET" 
[37] "1953-01-01 CET"  "1938-01-01 CET"  "1974-01-01 CET"  "1959-01-01 CET" 
[41] "1984-01-01 CET"  "1928-01-01 CET"  "1970-01-01 CET"  "1959-01-01 CET" 
[45] "1935-01-01 CET"  "1934-01-01 CET"  "1935-01-01 CET"  "1951-01-01 CET" 
[49] "1907-01-01 CET"  "1985-01-01 CET"  "1906-01-01 CET"  "1912-01-01 CET" 
[53] "1966-01-01 CET"  "1944-01-01 CET"  "1952-01-01 CET"  "1936-01-01 CET" 
[57] "1967-01-01 CET"  "1925-01-01 CET"  "1980-01-01 CEST" "1930-01-01 CET" 
[61] "1999-01-01 CET"  "1965-01-01 CET"  "1903-01-01 CET"  "1942-01-01 CET" 
[65] "1917-01-01 CET"  "1995-01-01 CET"  "1939-01-01 CET"  "1949-01-01 CET" 
[69] "1950-01-01 CET"  "1966-01-01 CET"  "1996-01-01 CET"  "1966-01-01 CET" 
[73] "1999-01-01 CET"  "1961-01-01 CET"  "1946-01-01 CET"  "1902-01-01 CET" 
[77] "1983-01-01 CET"  "1981-01-01 CET"  "1949-01-01 CET"  "1977-01-01 CET" 

The issue seems to be present in R-devel but not in (CRAN) 3.2.0.
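
A minimal sketch of one way to narrow this down, assuming the oddity is that an element's zone abbreviation depends on what else is in the vector: parse each string on its own and compare with the vectorized result. The helper name compare_zones is hypothetical, and what it reports will depend on the R version and on the time zone database installed on the machine.

```r
# Hypothetical helper: compare the zone abbreviation obtained from one
# vectorized strptime() call with the one obtained by parsing each
# element separately.
compare_zones <- function(x, fmt = "%Y/%m/%d", tz = "CET") {
  # Zone abbreviations from a single call on the whole vector
  vec <- format(strptime(x, fmt, tz = tz), "%Z")
  # Zone abbreviations from one call per element
  one <- vapply(x, function(s) format(strptime(s, fmt, tz = tz), "%Z"),
                character(1), USE.NAMES = FALSE)
  data.frame(input = x, vectorized = vec, single = one,
             differs = vec != one, stringsAsFactors = FALSE)
}

x <- paste0(1900:1999, "/01/01")
res <- compare_zones(x)
# Rows where the two approaches disagree; no rows means they agree
res[res$differs, ]
```

If the vectorized and single-element results agree everywhere, the data frame printed at the end has no rows.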

-pd


> On 12 Mar 2016, at 17:43 , Mick Jordan <mick.jordan at oracle.com> wrote:
> 
> On 3/12/16 12:33 AM, peter dalgaard wrote:
>>> On 12 Mar 2016, at 00:05 , Mick Jordan <mick.jordan at oracle.com> wrote:
>>> 
>>> This is definitely obscure but we had a unit test that called .Internal(strptime, "1942/01/01", %Y/%m/%d") with timezone (TZ) set to CET.
>> Umm, that doesn't even parse. And fixing the typo, it doesn't run:
>> 
>>> .Internal(strptime, "1942/01/01", %Y/%m/%d")
>> Error: unexpected SPECIAL in ".Internal(strptime, "1942/01/01", %Y/%"
>>> .Internal(strptime, "1942/01/01", "%Y/%m/%d")
>> Error in .Internal(strptime, "1942/01/01", "%Y/%m/%d") :
>>   3 arguments passed to '.Internal' which requires 1
>> 
>> 
>> 
>>> In R-3.1.3 that returned "1942-01-01 CEST" which, paradoxically, is correct as they evidently did strange things in Germany during the war period. Java also returns the same. However, R-3.2.4 returns "1942-01-01 CET".
>> Did you mean:
>> 
>> pd$ r-release-branch/BUILD-dist/bin/R
>> 
>> R version 3.2.4 Patched (2016-03-10 r70319) -- "Very Secure Dishes"
>> Copyright (C) 2016 The R Foundation for Statistical Computing
>> Platform: x86_64-apple-darwin13.4.0/x86_64 (64-bit)
>> [...]
>>> strptime("1942/01/01", "%Y/%m/%d", tz="CET")
>> [1] "1942-01-01 CEST"
>> 
>> But then as you see, it does have DST on New Years Day.
>> 
>> All in all, there is something you are not telling us.
>> 
>> Notice that all DST information is OS dependent as it depends on which version of the "Olson database" is installed.
>> 
>> 
> You are correct that I was sloppy with syntax for the example. We are, for better or worse, calling the .Internal, but actually with a large vector of arguments, of which the 1942 entry is element 82. I can confirm that for the vector of length 1 example that I didn't test but just assumed would also fail, the answer is correct. However, it is not for the full vector:
> 
> > .Internal(strptime(argv[[1]], argv[[2]], "CET"))
>  [1] "1937-01-01 CET" "1916-01-01 CET" "1913-01-01 CET" "1927-01-01 CET"
>  [5] "1947-01-01 CET" "1913-01-01 CET" "1917-01-01 CET" "1923-01-01 CET"
>  [9] "1921-01-01 CET" "1926-01-01 CET" "1920-01-01 CET" "1915-01-01 CET"
> [13] "1914-01-01 CET" "1914-01-01 CET" "1914-01-01 CET" "1919-01-01 CET"
> [17] "1948-01-01 CET" "1911-01-01 CET" "1909-01-01 CET" "1913-01-01 CET"
> [21] "1925-01-01 CET" "1926-01-01 CET" "1910-01-01 CET" "1917-01-01 CET"
> [25] "1936-01-01 CET" "1938-01-01 CET" "1960-01-01 CET" "1915-01-01 CET"
> [29] "1919-01-01 CET" "1924-01-01 CET" "1914-01-01 CET" "1905-01-01 CET"
> [33] "1921-01-01 CET" "1929-01-01 CET" "1926-01-01 CET" "1921-01-01 CET"
> [37] "1908-01-01 CET" "1928-01-01 CET" "1919-01-01 CET" "1921-01-01 CET"
> [41] "1925-01-01 CET" "1934-01-01 CET" "1927-01-01 CET" "1928-01-01 CET"
> [45] "1934-01-01 CET" "1922-01-01 CET" "1923-01-01 CET" "1915-01-01 CET"
> [49] "1934-01-01 CET" "1925-01-01 CET" "1922-01-01 CET" "1930-01-01 CET"
> [53] "1924-01-01 CET" "1923-01-01 CET" "1919-01-01 CET" "1932-01-01 CET"
> [57] "1930-01-01 CET" "1923-01-01 CET" "1930-01-01 CET" "1922-01-01 CET"
> [61] "1919-01-01 CET" "1932-01-01 CET" "1939-01-01 CET" "1923-01-01 CET"
> [65] "1920-01-01 CET" "1919-01-01 CET" "1952-01-01 CET" "1927-01-01 CET"
> [69] "1924-01-01 CET" "1919-01-01 CET" "1925-01-01 CET" "1945-01-01 CET"
> [73] "1916-01-01 CET" "1943-01-01 CET" "1920-01-01 CET" "1920-01-01 CET"
> [77] "1931-01-01 CET" "1924-01-01 CET" "1919-01-01 CET" "1926-01-01 CET"
> [81] "1920-01-01 CET" "1942-01-01 CET" "1919-01-01 CET" "1930-01-01 CET"
> [85] "1925-01-01 CET" "1924-01-01 CET" "1926-01-01 CET" "1918-01-01 CET"
> [89] "1922-01-01 CET" "1921-01-01 CET" "1925-01-01 CET" "1928-01-01 CET"
> [93] "1925-01-01 CET" "1929-01-01 CET" "1933-01-01 CET" "1947-01-01 CET"
> [97] "1950-01-01 CET" "1945-01-01 CET" "1924-01-01 CET" "1939-01-01 CET"
> [101] "1924-01-01 CET" "1933-01-01 CET" "1928-01-01 CET"
> > .Internal( strptime("1942/01/01", "%Y/%m/%d", ''))
> [1] "1942-01-01 CEST"
> > argv[[1]][[82]]
> [1] "1942/01/01"
> 
> We actually pass "" as the timezone, having set TZ=CET in the shell.
> 
> I am attaching a file that defines the large vector for sourcing.
> 
> Mick
> 
> <pbug.r>

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com


