[R] more woes trying to convert a data.frame to a numerical matrix
Marc Schwartz
marc_schwartz at comcast.net
Wed May 16 14:58:51 CEST 2007
On Wed, 2007-05-16 at 08:40 -0400, Andrew Yee wrote:
> Thanks for the suggestion and the explanation for why I was running
> into these troubles.
>
> I've tried:
>
> as.numeric(as.matrix(sample.data[-1, -1]))
>
> However, this creates another vector rather than a matrix.
Right. That's because I'm an idiot and need more caffeine... :-)
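A minimal sketch of why that call returns a vector, assuming the CSV from the original post (inlined here through read.csv(text = ...) so the snippet runs on its own). as.numeric() drops the dim attribute, while assigning the storage mode converts the values in place and keeps the matrix shape. The names csv_text, vec and m are illustrative only.

```r
# CSV from the original post, inlined so no file is needed
csv_text <- "name,x,y,z
category,delta,gamma,epsilon
a,1,2,3
b,4,5,6
c,7,8,9"

sample.data <- read.csv(text = csv_text)

# Character matrix, because row 2 of the file forces non-numeric columns
m <- as.matrix(sample.data[-1, -1])

# as.numeric() strips attributes, so the result is a plain vector
vec <- as.numeric(m)

# Changing the storage mode keeps dim and dimnames
storage.mode(m) <- "double"
```

Afterwards is.null(dim(vec)) should be TRUE, while dim(m) is still 3 by 3.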
> Is there a straightforward way to convert this directly into a
> numeric matrix rather than a vector?
Yeah, Dimitris' approach below of using data.matrix().
You could also use:
mat <- sapply(sample.data[-1, -1], as.numeric)
rownames(mat) <- rownames(sample.data[-1, -1])
> mat
  x y z
2 1 1 1
3 2 2 2
4 3 3 3
Though, this is essentially what data.matrix() does internally.
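One caveat worth checking, sketched under the assumption that read.csv() has turned the columns into factors (the default at the time of this thread, set explicitly here with stringsAsFactors = TRUE). as.numeric() on a factor returns the internal level codes, which would explain why the output above shows 1, 2, 3 in every column rather than the values in the file. Going through as.character() first appears to recover the original numbers. The names csv_text, sub and codes are illustrative only.

```r
# CSV from the original post, inlined so no file is needed
csv_text <- "name,x,y,z
category,delta,gamma,epsilon
a,1,2,3
b,4,5,6
c,7,8,9"

# stringsAsFactors = TRUE reproduces the 2007-era default (an assumption here)
sample.data <- read.csv(text = csv_text, stringsAsFactors = TRUE)
sub <- sample.data[-1, -1]

# as.numeric() on a factor yields level codes, not the labels
codes <- sapply(sub, as.numeric)

# Converting to character first recovers the values from the file
mat <- sapply(sub, function(col) as.numeric(as.character(col)))
rownames(mat) <- rownames(sub)
```

With this input, codes holds 1, 2, 3 in each column, while mat holds 1 to 9 laid out as in the CSV.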
> Additionally, I've also considered:
>
> data.matrix(sample.data[-1, -1])
>
> but bizarrely, it returns:
>
>   x y z
> 2 1 1 1
> 3 2 2 2
> 4 3 3 3
That is a numeric matrix:
> str(data.matrix(sample.data[-1, -1]))
 int [1:3, 1:3] 1 2 3 1 2 3 1 2 3
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:3] "2" "3" "4"
  ..$ : chr [1:3] "x" "y" "z"
HTH,
Marc
>
> Thanks,
> Andrew
>
>
> On 5/16/07, Marc Schwartz <marc_schwartz at comcast.net> wrote:
> On Wed, 2007-05-16 at 08:10 -0400, Andrew Yee wrote:
> > I have the following csv file:
> >
> > name,x,y,z
> > category,delta,gamma,epsilon
> > a,1,2,3
> > b,4,5,6
> > c,7,8,9
> >
> > I'd like to create a numeric matrix of just the numbers in this csv
> > dataset.
> >
> > I've tried the following program:
> >
> > sample.data <- read.csv("sample.csv")
> > numerical.data <- as.matrix(sample.data[-1,-1])
> >
> > However, print(numerical.data) returns what appears to be a matrix of
> > characters:
> >
> >   x   y   z
> > 2 "1" "2" "3"
> > 3 "4" "5" "6"
> > 4 "7" "8" "9"
> >
> > How do I force it to be numbers rather than characters?
> >
> > Thanks,
> > Andrew
>
> The problem is that you have two rows which contain alpha entries.
>
> The first row is treated as the header, but the second row is treated as
> actual data, thus overriding the numeric values in the subsequent rows.
>
> You could use:
>
> as.numeric(as.matrix(sample.data[-1, -1]))
>
> to coerce the matrix to numeric, or if you don't need the alpha entries,
> you could modify the read.csv() call to something like:
>
> read.csv("sample.csv", header = FALSE, skip = 2, row.names = 1,
>          col.names = c("name", "x", "y", "z"))
>
> This will skip the first two rows, set the first column to the row names
> and give you a data frame with numeric columns, which in most cases can
> be treated as a numeric matrix and/or you could explicitly coerce it to
> one.
>
> HTH,
>
> Marc Schwartz
>
>
>
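The read.csv() suggestion quoted above can be sketched as follows, again assuming the CSV from the original post and inlining it with read.csv(text = ...) in place of "sample.csv" so the snippet is self-contained. The names csv_text, dat and mat are illustrative only.

```r
# CSV from the original post, inlined so no file is needed
csv_text <- "name,x,y,z
category,delta,gamma,epsilon
a,1,2,3
b,4,5,6
c,7,8,9"

# Skip both non-numeric rows, supply the column names by hand,
# and use the first column as row names
dat <- read.csv(text = csv_text, header = FALSE, skip = 2, row.names = 1,
                col.names = c("name", "x", "y", "z"))

# All remaining columns are numeric, so as.matrix() gives a numeric matrix
mat <- as.matrix(dat)
```

Because the columns are read as numbers from the start, no factor or character conversion is involved, and mat["b", "y"] should be 5.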