[R] Help: read a proportion of high through-put data

Tue Jan 24 04:26:31 CET 2012

It's pretty hard to answer this without the file in hand, but I'd
guess something like the following is at play:

Each column of a data.frame has to have a single type, so if R sees
anything in a column that it cannot read as a number, it will coerce
the whole column to character. Since you have not told R that the first
row is a header, it's probably reading each name as the first element
of its column and so treating every column as character. read.table()
sometimes rectifies this automatically: if the first line has one fewer
field than the others, it assumes a header plus a column of row names
(read.csv() assumes a header by default). Your first line has as many
fields as the data lines, so that doesn't happen here.
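
To make the header rule concrete, here is a small sketch on toy input.
The strings below are made up for illustration and are not from your
file; the text= argument is just a way to read from a character vector
instead of a file.

```r
# Toy input (hypothetical): the first line has one fewer field than the
# data lines, so read.table() assumes a header and uses the first
# column as row names.
short_header <- c("a\tb", "r1\t1\t2", "r2\t3\t4")
auto <- read.table(text = short_header, sep = "\t")
sapply(auto, class)   # both columns come in as numbers
rownames(auto)

# Toy input shaped like the posted file: the first line has as many
# fields as the data lines, so it is read as an ordinary data row and
# every column ends up as character.
full_header <- c("id\ta\tb", "r1\t1\t2", "r2\t3\t4")
manual <- read.table(text = full_header, sep = "\t", as.is = TRUE)
sapply(manual, class)
```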

What happens if you try the following?

read.table("sample.txt", header = TRUE)
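
As a rough check of what to expect, the sketch below rebuilds the two
sample rows from the quoted message as tab-delimited text (my
reconstruction, not the real file) and reads them with header = TRUE.
The sep and as.is settings are assumptions on my part.

```r
# Reconstruction of the sample rows from the post, tab-delimited.
lines <- c(
  "ID_REF\t382\tGC_Score\tTheta\tR\tB_Allele_Freq\tLog_R_Ratio",
  "200003\tBB\t0.9101527\t0.9734979\t0.8788951\t1\t0",
  "200006\tAB\t0.6003323\t0.4385073\t2.033364\t0.4850979\t0.01553433"
)

# With header = TRUE the first line supplies the column names and the
# remaining lines are typed column by column.
dat <- read.table(text = lines, sep = "\t", header = TRUE, as.is = TRUE)

# Inspect the resulting types; the genotype column (BB/AB) should be
# the only character column.
str(dat)
sapply(dat, class)
```

Note that "382" is not a syntactically valid R name, so with the default
check.names = TRUE that column is renamed to X382.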

An alternative route, if the header names are still being read in as
data, would be to manually coerce the columns -- drop the name row,
then apply as.numeric() to each column that should be numeric (it works
on one column at a time, not on the whole data.frame).
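
A minimal sketch of the manual route, again on my reconstruction of the
sample rows rather than the real file. as.numeric() takes a vector, so
it is applied column by column here; the choice of which columns are
numeric is an assumption based on the two rows shown.

```r
# Reconstruction of the sample rows from the post, tab-delimited.
lines <- c(
  "ID_REF\t382\tGC_Score\tTheta\tR\tB_Allele_Freq\tLog_R_Ratio",
  "200003\tBB\t0.9101527\t0.9734979\t0.8788951\t1\t0",
  "200006\tAB\t0.6003323\t0.4385073\t2.033364\t0.4850979\t0.01553433"
)

# Read everything as character, with the name row arriving as data.
raw <- read.table(text = lines, sep = "\t", as.is = TRUE)

# Promote the first row to column names, then drop it from the data.
names(raw) <- as.character(unlist(raw[1, ]))
dat <- raw[-1, ]

# Coerce every column except the genotype column ("382") to numeric.
num_cols <- setdiff(names(dat), "382")
dat[num_cols] <- lapply(dat[num_cols], as.numeric)

sapply(dat, class)
```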

Michael

On Mon, Jan 23, 2012 at 10:18 PM, Chee Chen <chee.chen at yahoo.com> wrote:
> Dear All,
> I have a tab-delimited text file called "sample.txt", as follows:
> ID_REF    382    GC_Score    Theta    R    B_Allele_Freq    Log_R_Ratio
> 200003    BB    0.9101527    0.9734979    0.8788951    1    0
> 200006    AB    0.6003323    0.4385073    2.033364    0.4850979    0.01553433
>
> I have explored various options of read.table, for example:
> read.table("sample.txt", na.strings = "NA", as.is = TRUE)
>
> However, every column that it reads in becomes character.
>
> Could you please help me with this?
> Best regards,
> Chee