[R] Help in running Stata dataset in R

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Aug 6 17:21:18 CEST 2008


On Wed, 6 Aug 2008, Prof Brian Ripley wrote:

> Your examples work for me via read.dta() and via use(), including the Date 
> columns, with the current foreign_0.8-28.  They also work with R 2.7.1 on 
> Windows with foreign_0.8-26.
>
> As far as I can see the only relevant part of read.dta is 
> as.Date("1960-1-1"): you might want to try that to see if it malfunctions.
> If it does, it is possible that the problem is in your timezone setting -- 
> the CHANGES file says
>
>    o   An attempt is made (once per session) to identify the current
>        timezone from the Windows' Registry.  If this does not work or
>        is incorrect, set the 'TZ' environment variable appropriately:
>        a list of known timezones is given in
>        R_HOME/share/zoneinfo/zones.tab ...
>
> So you could start R with TZ="Africa/Nairobi" at the end of the command line 
> (assuming you are physically in Kenya and your machine is set to Kenyan 
> time).

I've been able to reproduce this by putting my machine in 
TZ="Africa/Nairobi".  It looks like midnight 1960-01-01 did not exist in 
that time zone, and so the conversion is being rejected.  This happens on 
Linux too; timezone transitions at midnight were previously unheard of.
For now, try TZ="Europe/Rome" when you convert those files - there will be 
a fix in foreign 0.8-29.
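
A minimal sketch of the workaround, for anyone who wants to try it without the attached files. It assumes the date columns were read with convert.dates=FALSE, so they are plain day counts since 1960-01-01 (Stata's origin); the values in stata_days are made up for illustration and are not taken from maltreat.dta. Setting TZ with Sys.setenv() inside the session is used here in place of setting it on the command line.

```r
# Remember the current TZ setting so it can be restored afterwards
old_tz <- Sys.getenv("TZ", unset = NA)

# Pin the session to a zone in which midnight 1960-01-01 exists
Sys.setenv(TZ = "Europe/Rome")

# Hypothetical raw values, standing in for a column returned by
# read.dta(..., convert.dates = FALSE): days since 1960-01-01
stata_days <- c(0, 17750)

# Convert by hand with an explicit origin;
# 17750 days after 1960-01-01 is 2008-08-06
dates <- as.Date(stata_days, origin = "1960-01-01")
print(dates)

# Restore the previous TZ setting
if (is.na(old_tz)) Sys.unsetenv("TZ") else Sys.setenv(TZ = old_tz)
```

With a real file, the same as.Date() call would be applied to tmp$dob and tmp$todaydate after reading them unconverted.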

>
>
> On Wed, 6 Aug 2008, Lazarus Mramba wrote:
>
>> Dear Prof Brian,
>> 
>> It is true that I used the epicalc package. The functions use() and
>> des() are from epicalc.
>> The problem does not occur if I use library(foreign):
>> tmp <- read.dta("maltreat.dta", convert.dates=FALSE)
>>
>>>  sessionInfo()
>> R version 2.7.1 (2008-06-23)
>> i386-pc-mingw32
>> 
>> locale:
>> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
>> States.1252;LC_MONETARY=English_United
>> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>> 
>> attached base packages:
>> [1] splines   stats     graphics  grDevices utils     datasets  methods
>> 
>> [8] base
>> 
>> other attached packages:
>> [1] epicalc_2.7.1.2 survival_2.34-1 foreign_0.8-26
>> 
>> 
>> library(foreign)
>>> tmp <- read.dta("maltreat.dta", convert.dates=FALSE)
>>> str(tmp)
>> 'data.frame':   670 obs. of  43 variables:
>> 
>> ## Thanks, I am able to read the dataset.
>> However, the epicalc function use() cannot read the data.
>> 
>> library(epicalc)
>>> use("maltreat.dta")
>> Error in fromchar(x) :
>>  character string is not in a standard unambiguous format
>> 
>> 
>> use("malvac.dta")
>> Error in fromchar(x) :
>>  character string is not in a standard unambiguous format
>> 
>> 
>> ## I have attached the two datasets herein
>> 
>> 
>> 
>> 
>> 
>> Kind regards,
>> Lazarus Mramba
>> Junior Statistician
>> P.O Box 986, 80108,
>> Kilifi, Kenya
>> Mobile No. +254721292370
>> Tel: +254 41 522063
>> Tel: +254 41 522390
>> (office extension : 419)
>> 
>>>>> Prof Brian Ripley <ripley at stats.ox.ac.uk> 08/06 12:38 PM >>>
>> What are use() and des()?  Please note the footer of this message.
>> (Are you using package epicalc without telling us?)
>> 
>> I suspect that foreign::read.dta is being used.  That has argument
>> 'convert.dates', and you could try setting it to FALSE, as the message
>> is from as.Date.character() complaining about the date format.
>> 
>> It is likely that the difference is in the version of 'foreign' and not
>> in the version of R: the posting guide asked for the output of
>> sessionInfo(), which would have told us which versions these were.
>> 
>> In short, try
>> 
>> library(foreign)
>> tmp <- read.dta("maltreat.dta", convert.dates=FALSE)
>> str(tmp)
>> tmp$dob
>> tmp$todaydate
>> 
>> and if the latter two are numbers, try converting them by e.g.
>> 
>> as.Date(tmp$dob, origin="1960-01-01")
>> 
>> It would help to make the dataset available for the developers to
>> investigate.
>> 
>> On Wed, 6 Aug 2008, Lazarus Mramba wrote:
>> 
>>> Dear All,
>>> 
>>> I installed R 2.7.0 and tried to call a dataset I had earlier called
>>> on R 2.6.2, but I keep getting an error:
>>> use("maltreat.dta")
>>> 
>>> Error in fromchar(x) :
>>>  character string is not in a standard unambiguous format
>>> 
>>> I tried the same with R 2.7.1 but I get the same error.
>>> 
>>> However, if I call the same on R 2.6.2, there is no error:
>>> 
>>> use("maltreat.dta")
>>>> des()
>>> 
>>> No. of observations =  670
>>>   Variable          Class           Description
>>> 1  scrno             integer
>>> 2  todaydate         Date
>>> 3  ethnic            character
>>> 4  othtribe          character
>>> 5  dob               Date
>>> 6  ageyrs            integer
>>> 7  agemths           integer
>>> 8  sex               character
>>> 
>>> I cannot figure out what the problem is.
>>> 
>>> Please help me.
>>> 
>>> 
>>> 
>>> 
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>> 
>> 
>> -- 
>> Brian D. Ripley,                  ripley at stats.ox.ac.uk
>> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>> University of Oxford,             Tel:  +44 1865 272861 (self)
>> 1 South Parks Road,                     +44 1865 272866 (PA)
>> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>> 

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


