[R] rewrite a data file use write.table(), count.fields() show different pattern, any suggestion appreciated.
Yong Wang
wangyong1 at gmail.com
Tue May 22 16:30:06 CEST 2007
Thank you for the suggestion, Dr. Ripley.
However, I am a little confused. My understanding is that you
suspect the fields that should be quoted (factor or character fields)
contain tabs.
If that is the case, count.fields() should detect the tabs,
read.table(sep="\t") should read with the same awareness, and if
write.table(sep="\t") writes those fields separated by tabs as
recognized by read.table(sep="\t"), the two field counts should be
the same.
Anyway, I will try to redo it per your suggestion.
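
To check the suggestion on something small, here is a minimal sketch of the round trip. The two-row data frame and the temporary file names are made up for illustration and are not my real data; the only thing it is meant to show is how count.fields() reacts when a character field containing a tab is written with quote=FALSE versus quote=TRUE.

```r
# Hypothetical data: the 'note' field in row 1 contains an embedded tab.
df <- data.frame(id = c("a", "b"),
                 note = c("has\ttab", "plain"),
                 stringsAsFactors = FALSE)

# Temporary files standing in for "newdata.txt".
unquoted <- tempfile(fileext = ".txt")
quoted   <- tempfile(fileext = ".txt")

# Same data written twice; only the quote argument differs.
write.table(df, file = unquoted, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
write.table(df, file = quoted, sep = "\t", quote = TRUE,
            row.names = FALSE, col.names = FALSE)

# Without quotes the embedded tab is counted as an extra separator in row 1.
cf_unquoted <- count.fields(unquoted, sep = "\t")   # 3 2
# With quotes the tab stays inside its field, so both rows have two fields.
cf_quoted   <- count.fields(quoted, sep = "\t")     # 2 2

# Reading the quoted file back should recover the original values.
back <- read.table(quoted, header = FALSE, sep = "\t",
                   colClasses = "character")
```

If I read the help page correctly, count.fields() also has a quote argument of its own (default "\"'"), so a stray quote or apostrophe in unquoted data might shift the counts as well; I have not checked whether that applies to my file.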
Regards
yong
On 5/22/07, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
> If you write out unquoted fields, how do you know they do not contain
> tabs?
>
> The default is quote=TRUE for a good reason.
>
> On Tue, 22 May 2007, Yong Wang wrote:
>
> > Dear all:
> >
> > I read in a tab delimited dataset, and then write it out as another
> > file as following: I did this simply to make sure I understand the
> > behavior of this command.
> >
> > data<-read.table(file,header=F,sep="\t",fill=T,colClasses="character");
> > write.table(data,file="newdata.txt",eol="\n",sep="\t",quote=F,row.names=F);
> >
> >
> > cf1 <- count.fields("newdata.txt", sep="\t")
> > table(cf1)
> >   13   17   23
> >   10  126 5445
> >
> > # is different to
> >
> > cf2 <- count.fields(file,sep="\t")
> > table(cf2)
> >   13   17   23   33
> >   10  106 5433   32
> >
> > the worst problem is the maximal value of cf1 (33) is larger than the
> > maximal value of cf2 (23) which is the right number of fields for most
> > rows in the original file.
> >
> > I need to use write.table for some important data manipulation work,
> > your suggestion is
> > highly appreciated.
> >
> > Best Regards
> >
>
> --
> Brian D. Ripley, ripley at stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
>