[R] "Missing value representation in Excel before extraction to R with RODBC"

Fredrik Lundgren fredrik.bg.lundgren at bredband.net
Tue Jan 10 14:37:30 CET 2006


Dear Petr,

Thank you for your help. I have tried (and succeded) to import myfile 
after conversion to text and with the help of read.table (also with the 
file = 'clipboard' alternative). Both methods give correct results, 
albeit dateformat turns up as a factor (minor problem). Also the 
read.xls from library gdata has been successful, albeit with some 
different representation of dateformat( minor problem). At least my 
Excelfile isn't corrupted in such a way to make this three ways 
impossible. No, my problem appears to be connected to the use of RODBC 
and that was what I wanted to get working. The first 8 rows are excluded 
from the file and columns with many NA's at the start are tranformed to 
all NA. If the NA's at the beginning of a column are given values (i. e. 
not NA) the tranformation of the column doesn't take place but the first 
8 rows are still excluded. I have tried - not necessarily in a correct 
way - to use as.is (keeps the dateformat correct) and  colClasses 
(doesn't apply?) but haven't been able to sort the problem out with 
these options.

Best wishes

Fredrik
----- Original Message ----- 
From: "Petr Pikal" <petr.pikal at precheza.cz>
To: "Fredrik Lundgren" <fredrik.bg.lundgren at bredband.net>; "R-help" 
<r-help at stat.math.ethz.ch>
Sent: Tuesday, January 10, 2006 10:08 AM
Subject: Re: [R] "Missing value representation in Excel before 
extraction to R with RODBC"


> Hi
>
> I tried to reproduce what you have told us by copy and paste
>
> read.delim("clipboard")
>
> but was not successful.
>
> Even with several blank values in each column in Excel i got correct
> import to R by this process. As I do not use RODBC I do not know all
> possible settings and features. If colClasses is available you can
> force the columns to by character, numeric, factor, Date or some
> other class.
>
> BTW Excel can be quite tricky and hides e.g. spaces in cells so you
> see them as empty even if they are not. So if I get some weird
> conversions of numeric columns there is often something hidden in
> Excel.
>
> HTH
> Petr
>
>
> On 9 Jan 2006 at 18:06, Fredrik Lundgren wrote:
>
> From:           "Fredrik Lundgren" <fredrik.bg.lundgren at bredband.net>
> To:             "Prof Brian Ripley" <ripley at stats.ox.ac.uk>,
> "Petr Pikal" <petr.pikal at precheza.cz>
> Copies to:      "R-help" <r-help at stat.math.ethz.ch>
> Subject:        Re: [R] "Missing value representation in Excel before 
> extraction to R with RODBC"
> Date sent:      Mon, 9 Jan 2006 18:06:49 +0100
>
>> Dear list,
>>
>> Well, those columns in Excel that starts with NA (actually 8 NA's in
>> my case) is imported as all NA in R but if the columns starts with at
>> least 3 cells with values (i.e not NA) the are imported correctly to
>> R. When as.is=TRUE is used a simular conversion takes place but now 
>> as
>> all <NA> and dates are represented as date-and-time. Is there any way
>> to get this correct even when the Excel columns start with several
>> NA's?
>>
>> Sincerely
>> Fredrik
>>
>>
>> ----- Original Message ----- 
>> From: "Prof Brian Ripley" <ripley at stats.ox.ac.uk>
>> To: "Petr Pikal" <petr.pikal at precheza.cz>
>> Cc: "Fredrik Lundgren" <fredrik.bg.lundgren at bredband.net>; "R-help"
>> <r-help at stat.math.ethz.ch> Sent: Monday, January 09, 2006 9:36 AM
>> Subject: Re: [R] "Missing value representation in Excel before
>> extraction to R with RODBC"
>>
>>
>> > On Mon, 9 Jan 2006, Petr Pikal wrote:
>> >
>> >> Hi
>> >>
>> >> I believe it has something to do with the column identification
>> >> decision. When R decides what is in a column it uses only some
>> >> values from the beginning of a file.
>> >
>> > Not R, Excel.  Excel tells ODBC what the column types are.
>> >
>> >> I do not use RODBC as read.delim("clipboard", ...) is usually more
>> >> convenient but probably there is a way how to tell RODBC what is 
>> >> in
>> >> the column instead of let R decide from the top of the file.
>> >
>> > Using as.is=TRUE stops RODBC doing any conversion.
>> >
>> >> But I may be completely mistaken.
>> >>
>> >> HTH
>> >> Petr
>> >>
>> >>
>> >> On 6 Jan 2006 at 20:47, Fredrik Lundgren wrote:
>> >>
>> >> From:           "Fredrik Lundgren"
>> >> <fredrik.bg.lundgren at bredband.net> To:             "R-help"
>> >> <r-help at stat.math.ethz.ch> Date sent:      Fri, 6 Jan 2006 
>> >> 20:47:29
>> >> +0100 Subject:        [R] "Missing value representation in Excel
>> >> before extraction to R with RODBC"
>> >>
>> >>> Dear list,
>> >>>
>> >>> How should missing values be expressed in Excel before extraction
>> >>> to R via RODBC. I'm bewildered. Sometimes the representation with
>> >>> NA in Excel appears to work and shows up in R as <NA> but
>> >>> sometimes the use of NA in Excel changes the whole vector to 
>> >>> NA's.
>> >>> Blank or nothing or NA as representation for missing values in
>> >>> Excel with dateformat gives NA's of the whole vector in R but 
>> >>> with
>> >>>  general format in Excel gives blanks for missing values in R. 
>> >>> How
>> >>> should I represent missing values in Excel?
>> >>>
>> >>>
>> >>> Best wishes and thanks for any help
>> >>> Fredrik Lundgren
>> >
>> > -- 
>> > Brian D. Ripley,                  ripley at stats.ox.ac.uk
>> > Professor of Applied Statistics, 
>> > http://www.stats.ox.ac.uk/~ripley/
>> > University of Oxford,             Tel:  +44 1865 272861 (self) 1
>> > South Parks Road,                     +44 1865 272866 (PA) Oxford
>> > OX1 3TG, UK                Fax:  +44 1865 272595
>> >
>>
>>
>
> Petr Pikal
> petr.pikal at precheza.cz
>
>




More information about the R-help mailing list