[R] more woes trying to convert a data.frame to a numerical matrix
Liaw, Andy
andy_liaw at merck.com
Wed May 16 17:29:59 CEST 2007
I think this might be a bit more straightforward:
R> mat <- do.call(cbind, scan("clipboard", what=list(NULL, 0, 0, 0),
sep=",", skip=2))
Read 3 records
R> mat
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
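
Reading from "clipboard" only works where the clipboard is available as a connection (for example on Windows). As a rough, file-based sketch of the same idea, the code below first writes the sample data from this thread to a temporary file; the temporary file and the names csv_file, cols and mat are only for illustration.

```r
# Recreate the sample csv from this thread in a temporary file (illustrative).
csv_file <- tempfile(fileext = ".csv")
writeLines(c("name,x,y,z",
             "category,delta,gamma,epsilon",
             "a,1,2,3",
             "b,4,5,6",
             "c,7,8,9"), csv_file)

# skip = 2 drops both header rows; the NULL in 'what' skips the first field
# of each record, and each 0 asks for a numeric field.
cols <- scan(csv_file, what = list(NULL, 0, 0, 0), sep = ",", skip = 2,
             quiet = TRUE)

# The skipped field comes back as NULL, which cbind() ignores.
mat <- do.call(cbind, cols)
```

With a real file, csv_file would simply be the path to sample.csv.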
Andy
From: Andrew Yee
>
> Thanks again to everyone for all your help.
>
> I think I've figured out the solution to my dilemma. Instead of using
> data.matrix or sapply, this works for me:
>
> sample.data<-read.csv("sample.csv")
> sample.matrix.raw<-as.matrix(sample.data[-1,-1])
> sample.matrix <- matrix(as.numeric(sample.matrix.raw),
> nrow=attributes(sample.matrix.raw)$dim[1], ncol=attributes(
> sample.matrix.raw)$dim[2])
>
> With the above code, I get the desired matrix of:
>
> 1 2 3
> 4 5 6
> 7 8 9
>
> (I'd like to be able to import the whole csv and then subset the relevant
> header and data sections, rather than creating one csv for the header
> and another csv for the data.)
>
> Of course, the above code seems kind of clunky, and I welcome any
> suggestions for improvement.
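
One possible tidy-up, offered as a sketch rather than a definitive answer: changing the storage mode of the character matrix converts the entries to numbers while keeping the dimensions and dimnames, so the matrix() rebuild is not needed. Here the text= argument of read.csv stands in for sample.csv, and stringsAsFactors is set explicitly because its default is an assumption about your R version.

```r
# The sample data from this thread, supplied inline instead of sample.csv.
csv_text <- "name,x,y,z
category,delta,gamma,epsilon
a,1,2,3
b,4,5,6
c,7,8,9"

sample.data <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Drop the "category" row and the name column; this is a character matrix.
sample.matrix <- as.matrix(sample.data[-1, -1])

# Coerce the entries to double in place; dim and dimnames are preserved.
storage.mode(sample.matrix) <- "double"
```

This still reads the whole csv once, so the header rows remain available in sample.data for separate subsetting.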
>
> Thanks,
> Andrew
>
>
> On 5/16/07, Andrew Yee <andrewjyee at gmail.com> wrote:
> >
> > Thanks for the suggestion.
> >
> > However, I've tried sapply and data.matrix.
> >
> > > The problem is that while it returns a numeric matrix, it gives back:
> >
> > 1 1 1
> > 2 2 2
> > 3 3 3
> >
> > instead of
> >
> > 1 2 3
> > 4 5 6
> > 7 8 9
> >
> > The latter matrix is the desired result
> >
> > Thanks,
> > Andrew
> >
> > On 5/16/07, Marc Schwartz < marc_schwartz at comcast.net> wrote:
> > >
> > > On Wed, 2007-05-16 at 08:40 -0400, Andrew Yee wrote:
> > > > > Thanks for the suggestion and the explanation for why I was running
> > > > > into these troubles.
> > > >
> > > > I've tried:
> > > >
> > > > as.numeric(as.matrix(sample.data[-1, -1]))
> > > >
> > > > However, this creates another vector rather than a matrix.
> > >
> > > Right. That's because I'm an idiot and need more caffeine... :-)
> > >
> > > > > Is there a straightforward way to convert this directly into a
> > > > numeric matrix rather than a vector?
> > >
> > > Yeah, Dimitris' approach below of using data.matrix().
> > >
> > > You could also use:
> > >
> > > mat <- sapply(sample.data[-1, -1], as.numeric)
> > > rownames(mat) <- rownames(sample.data[-1, -1])
> > >
> > > > mat
> > > x y z
> > > 2 1 1 1
> > > 3 2 2 2
> > > 4 3 3 3
> > >
> > > Though, this is essentially what data.matrix() does internally.
> > >
> > > > Additionally, I've also considered:
> > > >
> > > > > data.matrix(sample.data[-1,-1])
> > > >
> > > > but bizarrely, it returns:
> > > >
> > > > x y z
> > > > 2 1 1 1
> > > > 3 2 2 2
> > > > 4 3 3 3
> > >
> > > That is a numeric matrix:
> > >
> > > > str(data.matrix(sample.data[-1, -1]))
> > > int [1:3, 1:3] 1 2 3 1 2 3 1 2 3
> > > - attr(*, "dimnames")=List of 2
> > > ..$ : chr [1:3] "2" "3" "4"
> > > ..$ : chr [1:3] "x" "y" "z"
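
The 1, 2, 3 in every column are most likely factor level codes rather than the values in the file: each column was read as a factor (the second header row made it non-numeric), and data.matrix() returns the internal integer codes. A small sketch, assuming a factor column like x after the "delta" row has been dropped (the names x, codes and values are illustrative):

```r
# Column x as read from the file: one text entry plus three numbers.
x <- factor(c("delta", "1", "4", "7"))
x <- x[-1]   # drop the "category" row; the unused level "delta" remains

# Internal level codes: the position of each value among the sorted levels.
codes <- as.integer(x)

# The level labels themselves, converted to numbers.
values <- as.numeric(as.character(x))
```

Here codes holds the positions within the levels, while values holds the numbers that were actually in the file.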
> > >
> > > HTH,
> > >
> > > Marc
> > >
> > > >
> > > > Thanks,
> > > > Andrew
> > > >
> > > >
> > > > On 5/16/07, Marc Schwartz < marc_schwartz at comcast.net> wrote:
> > > > On Wed, 2007-05-16 at 08:10 -0400, Andrew Yee wrote:
> > > > > I have the following csv file:
> > > > >
> > > > > name,x,y,z
> > > > > category,delta,gamma,epsilon
> > > > > a,1,2,3
> > > > > b,4,5,6
> > > > > c,7,8,9
> > > > >
> > > > > > I'd like to create a numeric matrix of just the numbers in
> > > > > > this csv dataset.
> > > > >
> > > > > I've tried the following program:
> > > > >
> > > > > sample.data <- read.csv("sample.csv")
> > > > > numerical.data <- as.matrix (sample.data[-1,-1])
> > > > >
> > > > > > However, print(numerical.data) returns what appears to be a matrix of
> > > > > > characters:
> > > > >
> > > > > x y z
> > > > > 2 "1" "2" "3"
> > > > > 3 "4" "5" "6"
> > > > > 4 "7" "8" "9"
> > > > >
> > > > > > How do I force it to be numbers rather than characters?
> > > > >
> > > > > Thanks,
> > > > > Andrew
> > > >
> > > > > The problem is that you have two rows which contain alpha entries.
> > > > >
> > > > > The first row is treated as the header, but the second row is treated as
> > > > > actual data, so each column is read as non-numeric, overriding the
> > > > > numeric values in the subsequent rows.
> > > >
> > > > You could use:
> > > >
> > > > > as.numeric(as.matrix(sample.data[-1, -1]))
> > > >
> > > > > to coerce the matrix to numeric, or if you don't need the alpha entries,
> > > > > you could modify the read.csv() call to something like:
> > > > >
> > > > > read.csv("sample.csv", header = FALSE, skip = 2, row.names = 1,
> > > > > col.names = c("name", "x", "y", "z"))
> > > >
> > > > > This will skip the first two rows, set the first column to the
> > > > > row names and give you a data frame with numeric columns, which in most
> > > > > cases can be treated as a numeric matrix and/or you could explicitly
> > > > > coerce it to one.
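
As a hedged, self-contained sketch of that suggestion, the code below writes the sample data from this thread to a temporary file (standing in for sample.csv), reads it while skipping both header rows, and then coerces the result with as.matrix(). The temporary file and the names csv_file and sample.matrix are only for illustration.

```r
# Recreate the sample csv from this thread in a temporary file (illustrative).
csv_file <- tempfile(fileext = ".csv")
writeLines(c("name,x,y,z",
             "category,delta,gamma,epsilon",
             "a,1,2,3",
             "b,4,5,6",
             "c,7,8,9"), csv_file)

# Skip both header rows, name the columns by hand, and use the first
# column as row names, so only numeric columns remain.
sample.data <- read.csv(csv_file, header = FALSE, skip = 2, row.names = 1,
                        col.names = c("name", "x", "y", "z"))

# Every remaining column is numeric, so this is a numeric matrix.
sample.matrix <- as.matrix(sample.data)
```

The trade-off is that the header rows are discarded, so this fits only when the alpha entries are not needed.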
> > > >
> > > > HTH,
> > > >
> > > > Marc Schwartz
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help