[Rd] read.table with ":" in column names (PR#8511)
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Fri Jan 20 12:22:53 CET 2006
peverlorenvanthemaat at amc.uva.nl writes:
> Full_Name: emiel ver loren
> Version: 2.2.0
> OS: Windows XP
> Submission from: (NULL) (145.117.31.248)
>
>
> Dear R-community and developers,
>
> I have been trying to read in a tab delimited file where the column names and
> the row names are of the form "GO:0000051" (gene ontology IDs). When using:
>
> > gomat<-read.table("test.txt")
> > colnames(gomat)[1]
> [1] "GO.0000051"
> > rownames(gomat)[1]
> [1] "GO:0000002"
>
> Which means that ":" is transformed into a "." !! This seems like Excel when it
> is trying to guess what I really meant (and turning 1/1/1 into 1-1-2001).
This is what check.names=FALSE is for... (and NOT a bug, please don't
abuse the bug repository, use the mailing lists)
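
As a minimal sketch of the check.names point, assuming a small tab-delimited input shaped like the poster's (the contents of txt below are made up for illustration and only stand in for test.txt):

```r
# Made-up stand-in for test.txt: the header line has one field fewer than
# the data lines, so read.table treats the first column as row names.
txt <- paste(
  "GO:0000051\tGO:0000280",
  "GO:0000002\t1\t0",
  "GO:0000003\t0\t1",
  sep = "\n"
)

# Default: column names go through make.names(), which replaces ":" with "."
default <- read.table(text = txt)
colnames(default)

# check.names = FALSE keeps the column names exactly as they appear in the file
kept <- read.table(text = txt, check.names = FALSE)
colnames(kept)
rownames(kept)
```

Names containing ":" are not syntactic R names, so such columns have to be accessed with quoting, e.g. kept[["GO:0000051"]], rather than with a bare kept$GO:0000051.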
> Furthermore, I found the following quite strange as well:
>
> > gomat2<-read.delim2("test.txt",header=FALSE)
> > gomat2[1,1:2]
> V1 V2
> 1 GO:0000051 GO:0000280
> > as.character(gomat2[1,1:2])
> [1] "8" "2"
> > as.character(gomat2[1,1])
> [1] "GO:0000051"
>
> I have found a way to work around it, but I am wondering what's happening....
Yes, this is a bit nasty, but... What is happening is similar to this:
> d <- data.frame(a=factor(LETTERS), b=factor(letters))
> d[1,]
a b
1 A a
> as.character(d[1,])
[1] "1" "1"
> as.character(d[1,1])
[1] "A"
> as.character(d[1,1,drop=FALSE])
[1] "1"
or this:
> l <- list(a=factor("x"),b=factor("y"))
> l
$a
[1] x
Levels: x
$b
[1] y
Levels: y
> as.character(l)
[1] "1" "1"
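The thing is that as.character on a list will first coerce each factor to
its underlying integer codes, then the codes to character. That is where
your "8" and "2" come from: they are the level codes of the two factor
columns, not their labels. I'm not sure whether there is a rationale for
it, but it isn't S-PLUS compatible (not 6.2.1 anyway, which is the most
recent one that I have access to).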
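
One way to get the labels rather than the internal codes is to convert each element separately. A minimal sketch, reusing the d and l from the examples above (the names labels_l and labels_d are only illustrative; vapply is base R but newer than 2.2.0, where sapply(l, as.character) would be the equivalent):

```r
d <- data.frame(a = factor(LETTERS), b = factor(letters))
l <- list(a = factor("x"), b = factor("y"))

# Convert each element on its own, so as.character() sees a factor
# (and returns its label) rather than a list.
labels_l <- vapply(l, as.character, character(1))
labels_d <- vapply(d[1, ], as.character, character(1))

labels_l  # named character vector: a = "x", b = "y"
labels_d  # named character vector: a = "A", b = "a"
```

Alternatively, reading the file with as.is = TRUE keeps such columns as character vectors, so no factors are created in the first place.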
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907