[R] Importing Large Dataset from Excel
Patrick Connolly
p_connolly at slingshot.co.nz
Sun Dec 16 10:02:29 CET 2007
On Wed, 12-Dec-2007 at 11:35AM +0100, Peter Dalgaard wrote:
|> Philippe Grosjean wrote:
|> > The problem is often a misspecification of the comment.char argument.
|> > For read.table(), it defaults to '#'. This means that everywhere you
|> > have a '#' char in your Excel sheet, the rest of the line is ignored.
|> > This results in a different number of items per line.
|> >
|> > You would do better to use read.csv(), which provides better default
|> > arguments for your particular problem.
|> > Best,
|> >
|> >
|> Or read.delim/read.delim2, which should be even better at TAB-separated
|> files.
|>
|> In general, be very suspicious of read.table() with such files, not only
|> because of the '#' but also because it expects columns separated by
|> _arbitrary_ amounts of whitespace. I.e., n TABs count as one, so empty
|> fields are skipped over.
I don't recall that happening with TABs, but a problem can arise when
the last (rightmost) column has more than a few empty cells.
Occasionally, I've had to resort to adding a dummy column on the
right, but as Peter suggests, read.delim is usually less involved.
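
For anyone wanting to check this on their own files, here is a minimal
sketch of the points raised in this thread. The four-line file and all
variable names are made up for illustration; count.fields() is used only
to show how many fields each line is seen to have under each set of
conventions.

```r
# Hypothetical TAB-separated export; contents are made up for illustration.
lines <- c("id\tname\tnote",
           "1\tA#1\tok",   # '#' inside a field
           "2\t\tx",       # empty middle field
           "3\tC")         # trailing empty cell, final TAB dropped
tmp <- tempfile(fileext = ".txt")
writeLines(lines, tmp)

# Fields per line under read.table()'s defaults
# (sep = "", i.e. runs of whitespace, and comment.char = "#").
n_default <- count.fields(tmp)

# Fields per line when a single TAB separates fields and '#' is data.
n_tab <- count.fields(tmp, sep = "\t", quote = "\"", comment.char = "")

# read.table() with its defaults is expected to stop on the uneven line
# lengths; the message is captured here rather than assumed.
msg <- tryCatch({ read.table(tmp, header = TRUE); "no error" },
                error = function(e) conditionMessage(e))

# read.delim() uses sep = "\t", comment.char = "" and fill = TRUE,
# so short lines are padded instead of causing an error.
d <- read.delim(tmp)

print(n_default)
print(n_tab)
print(msg)
print(d)
unlink(tmp)
```

Comparing n_default with n_tab shows which lines lose fields to '#' or to
collapsed TABs, and fill = TRUE (which read.delim sets) plays the same
role as the dummy column on the right.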
--
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
___ Patrick Connolly
{~._.~} Great minds discuss ideas
_( Y )_ Middle minds discuss events
(:_~*~_:) Small minds discuss people
(_)-(_) ..... Anon
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.