[R] "Missing value representation in Excel before extraction to R with RODBC"

Fredrik Lundgren fredrik.bg.lundgren at bredband.net
Mon Jan 9 18:06:49 CET 2006


Dear list,

Well, those columns in Excel that starts with NA (actually 8 NA's in my 
case) is imported as all NA in R but if the columns starts with at least 
3 cells with values (i.e not NA) the are imported correctly to R. When 
as.is=TRUE is used a simular conversion takes place but now as all <NA> 
and dates are represented as date-and-time.
Is there any way to get this correct even when the Excel columns start 
with several NA's?

Sincerely
Fredrik


----- Original Message ----- 
From: "Prof Brian Ripley" <ripley at stats.ox.ac.uk>
To: "Petr Pikal" <petr.pikal at precheza.cz>
Cc: "Fredrik Lundgren" <fredrik.bg.lundgren at bredband.net>; "R-help" 
<r-help at stat.math.ethz.ch>
Sent: Monday, January 09, 2006 9:36 AM
Subject: Re: [R] "Missing value representation in Excel before 
extraction to R with RODBC"


> On Mon, 9 Jan 2006, Petr Pikal wrote:
>
>> Hi
>>
>> I believe it has something to do with the column identification
>> decision. When R decides what is in a column it uses only some values
>> from the beginning of a file.
>
> Not R, Excel.  Excel tells ODBC what the column types are.
>
>> I do not use RODBC as read.delim("clipboard", ...) is usually more
>> convenient but probably there is a way how to tell RODBC what is in
>> the column instead of let R decide from the top of the file.
>
> Using as.is=TRUE stops RODBC doing any conversion.
>
>> But I may be completely mistaken.
>>
>> HTH
>> Petr
>>
>>
>> On 6 Jan 2006 at 20:47, Fredrik Lundgren wrote:
>>
>> From:           "Fredrik Lundgren" <fredrik.bg.lundgren at bredband.net>
>> To:             "R-help" <r-help at stat.math.ethz.ch>
>> Date sent:      Fri, 6 Jan 2006 20:47:29 +0100
>> Subject:        [R] "Missing value representation in Excel before 
>> extraction to R
>> with RODBC"
>>
>>> Dear list,
>>>
>>> How should missing values be expressed in Excel before extraction to 
>>> R
>>> via RODBC. I'm bewildered. Sometimes the representation with NA in
>>> Excel appears to work and shows up in R as <NA> but sometimes the 
>>> use
>>> of NA in Excel changes the whole vector to NA's. Blank or nothing or
>>> NA as representation for missing values in Excel with dateformat 
>>> gives
>>> NA's of the whole vector in R but with  general format in Excel 
>>> gives
>>> blanks for missing values in R. How should I represent missing 
>>> values
>>> in Excel?
>>>
>>>
>>> Best wishes and thanks for any help
>>> Fredrik Lundgren
>
> -- 
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>




More information about the R-help mailing list