[R] more woes trying to convert a data.frame to a numerical matrix

Liaw, Andy andy_liaw at merck.com
Wed May 16 17:29:59 CEST 2007


I think this might be a bit more straightforward:

R> mat <- do.call(cbind, scan("clipboard", what=list(NULL, 0, 0, 0),
sep=",", skip=2))
Read 3 records
R> mat
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9

Andy
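
A minimal sketch of the same scan() approach reading from a file instead of
the clipboard, for cases where "clipboard" is not available as a connection.
The temporary file and the names csv_path and cols are assumptions made for
illustration; the data are the five lines quoted later in this thread.

```r
# Write the sample data from the thread to a temporary file
# (csv_path is a hypothetical name, not from the original post).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name,x,y,z",
             "category,delta,gamma,epsilon",
             "a,1,2,3",
             "b,4,5,6",
             "c,7,8,9"), csv_path)

# Skip the two header lines. In 'what', NULL drops the first field
# and each 0 marks a numeric field.
cols <- scan(csv_path, what = list(NULL, 0, 0, 0), sep = ",",
             skip = 2, quiet = TRUE)

# cbind() ignores the NULL placeholder and binds the numeric columns.
mat <- do.call(cbind, cols)
print(mat)
```

Because the header rows are never parsed as data, no character-to-number
conversion is needed afterwards.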


From: Andrew Yee
> 
> Thanks again to everyone for all your help.
> 
> I think I've figured out the solution to my dilemma.  Instead of using
> data.matrix or sapply, this works for me:
> 
> sample.data<-read.csv("sample.csv")
> sample.matrix.raw<-as.matrix(sample.data[-1,-1])
> sample.matrix <- matrix(as.numeric(sample.matrix.raw),
>     nrow=attributes(sample.matrix.raw)$dim[1],
>     ncol=attributes(sample.matrix.raw)$dim[2])
> 
> With the above code, I get the desired matrix of:
> 
> 1 2 3
> 4 5 6
> 7 8 9
> 
> (I'd like to be able to import the whole csv and then subset the
> relevant header and data sections, rather than creating a separate csv
> for the header and another csv for the data.)
> 
> Of course, the above code seems kind of clunky, and I welcome any
> suggestions for improvement.
> 
> Thanks,
> Andrew
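
One possible way to shorten the conversion above, offered as a sketch rather
than as the poster's method: change the storage mode of the character matrix
in place, which keeps its dimensions and dimnames. The temporary file and the
name csv_path are assumptions for illustration.

```r
# Recreate the sample file from the thread (csv_path is hypothetical).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name,x,y,z",
             "category,delta,gamma,epsilon",
             "a,1,2,3",
             "b,4,5,6",
             "c,7,8,9"), csv_path)

sample.data <- read.csv(csv_path)

# Drop the "category" row and the "name" column; the result is a
# character matrix because every column contained a text entry.
sample.matrix <- as.matrix(sample.data[-1, -1])

# Convert the strings to doubles without losing dim or dimnames.
storage.mode(sample.matrix) <- "double"
print(sample.matrix)
```

This avoids rebuilding the matrix with explicit nrow and ncol arguments.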
> 
> 
> On 5/16/07, Andrew Yee <andrewjyee at gmail.com> wrote:
> >
> > Thanks for the suggestion.
> >
> > However, I've tried sapply and data.matrix.
> >
> > The problem is that, while it returns a numeric matrix, it gives back:
> >
> > 1 1 1
> > 2 2 2
> > 3 3 3
> >
> > instead of
> >
> > 1 2 3
> > 4 5 6
> > 7 8 9
> >
> > The latter matrix is the desired result
> >
> > Thanks,
> > Andrew
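
A small sketch of why the result above is 1, 2, 3 in every column, assuming
the columns were read as factors (the read.csv() default before R 4.0.0).
Converting a factor with as.numeric() returns its integer level codes, not
the numbers spelled out in its labels. The vector x below is a hand-built
stand-in for column x of the sample file.

```r
# Column x as read from the file: one text entry followed by numbers.
x <- factor(c("delta", "1", "4", "7"))

# Dropping the first row keeps all four levels: "1" "4" "7" "delta".
x <- x[-1]

# Level codes: the position of each value among the sorted levels.
codes <- as.numeric(x)
print(codes)    # 1 2 3

# Going through character first recovers the numbers in the labels.
values <- as.numeric(as.character(x))
print(values)   # 1 4 7
```

The same thing happens in columns y and z, which is how every row of the
matrix ends up holding a single repeated code.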
> >
> > On 5/16/07, Marc Schwartz < marc_schwartz at comcast.net> wrote:
> > >
> > > On Wed, 2007-05-16 at 08:40 -0400, Andrew Yee wrote:
> > > > Thanks for the suggestion and the explanation for why I was
> > > > running into these troubles.
> > > >
> > > > I've tried:
> > > >
> > > > as.numeric(as.matrix(sample.data[-1, -1]))
> > > >
> > > > However, this creates another vector rather than a matrix.
> > >
> > > Right. That's because I'm an idiot and need more caffeine... :-)
> > >
> > > >  Is there a straightforward way to convert this directly into a
> > > > numeric matrix rather than a vector?
> > >
> > > Yeah, Dimitris' approach below of using data.matrix().
> > >
> > > You could also use:
> > >
> > > mat <- sapply(sample.data[-1, -1], as.numeric)
> > > rownames(mat) <- rownames(sample.data[-1, -1])
> > >
> > > > mat
> > >   x y z
> > > 2 1 1 1
> > > 3 2 2 2
> > > 4 3 3 3
> > >
> > > Though, this is essentially what data.matrix() does internally.
> > >
> > > > Additionally, I've also considered:
> > > >
> > > > data.matrix(sample.data[-1,-1])
> > > >
> > > > but bizarrely, it returns:
> > > >
> > > >   x y z
> > > > 2 1 1 1
> > > > 3 2 2 2
> > > > 4 3 3 3
> > >
> > > That is a numeric matrix:
> > >
> > > > str(data.matrix(sample.data[-1, -1]))
> > > int [1:3, 1:3] 1 2 3 1 2 3 1 2 3
> > > - attr(*, "dimnames")=List of 2
> > >   ..$ : chr [1:3] "2" "3" "4"
> > >   ..$ : chr [1:3] "x" "y" "z"
> > >
> > > HTH,
> > >
> > > Marc
> > >
> > > >
> > > > Thanks,
> > > > Andrew
> > > >
> > > >
> > > > On 5/16/07, Marc Schwartz < marc_schwartz at comcast.net> wrote:
> > > >         On Wed, 2007-05-16 at 08:10 -0400, Andrew Yee wrote:
> > > >         > I have the following csv file:
> > > >         >
> > > >         > name,x,y,z
> > > >         > category,delta,gamma,epsilon
> > > >         > a,1,2,3
> > > >         > b,4,5,6
> > > >         > c,7,8,9
> > > >         >
> > > >         > I'd like to create a numeric matrix of just the numbers
> > > >         > in this csv dataset.
> > > >         >
> > > >         > I've tried the following program:
> > > >         >
> > > >         > sample.data <- read.csv("sample.csv")
> > > >         > numerical.data <- as.matrix (sample.data[-1,-1])
> > > >         >
> > > >         > However, print(numerical.data) returns what appears to
> > > >         > be a matrix of characters:
> > > >         >
> > > >         >   x   y   z
> > > >         > 2 "1" "2" "3"
> > > >         > 3 "4" "5" "6"
> > > >         > 4 "7" "8" "9"
> > > >         >
> > > >         > How do I force it to be numbers rather than characters?
> > > >         >
> > > >         > Thanks,
> > > >         > Andrew
> > > >
> > > >         The problem is that you have two rows which contain alpha
> > > >         entries.
> > > >
> > > >         The first row is treated as the header, but the second row
> > > >         is treated as actual data, thus overriding the numeric
> > > >         values in the subsequent rows.
> > > >
> > > >         You could use:
> > > >
> > > >           as.numeric(as.matrix(sample.data [-1, -1]))
> > > >
> > > >         to coerce the matrix to numeric, or if you don't need the
> > > >         alpha entries, you could modify the read.csv() call to
> > > >         something like:
> > > >
> > > >           read.csv("sample.csv", header = FALSE, skip = 2,
> > > >                    row.names = 1,
> > > >                    col.names = c("name", "x", "y", "z"))
> > > >
> > > >         This will skip the first two rows, set the first column to
> > > >         the row names and give you a data frame with numeric
> > > >         columns, which in most cases can be treated as a numeric
> > > >         matrix and/or you could explicitly coerce it to one.
> > > >
> > > >         HTH,
> > > >
> > > >         Marc Schwartz
> > > >
> > > >
> > > >
> > >
> > >
> >
> 

