[R] Multiply
avi.e.gross using gmail.com
Sat Aug 5 01:45:22 CEST 2023
[See the end for an interesting twist on moving a column to row.names.]
Yes, many ways to do things exist, but it may make sense to ask what the user/OP really wants. Sometimes the effort to make a brief example obscures things.
Was there actually any need to read in a file containing comma-separated values? Did it have to include one, or perhaps more, non-numeric columns? Was the ID column guaranteed to exist and be named ID or just be the first column? Is any error checking needed?
So assume we read the two data structures into data.frames called A and B, to be unoriginal. A wider approach might be to split A into A.text and A.numeric by checking which columns test as numeric (meaning is.numeric() returns TRUE) and which are text, logical or anything else. Note that complex is not considered numeric, if that matters.
You might then count the number of rows and columns of A.numeric and set A.text aside for now.
You then get B and it seems you can throw away any non-numeric columns. The resulting numeric columns can be in B.numeric.
The number of rows and columns of what remains needs to conform to the dimensions of A.numeric, in the sense that if A.numeric is M rows by N columns, then B.numeric must be N x anything and the result of the multiplication is M x anything. If the condition is not met, you need to fail gracefully.
You may also want to decide what to do with data that came with things like NA content. Or, if your design allows content that can be converted to numeric, check and make any conversions.
Then you can convert the data into matrices, perform the matrix multiplication and optionally restore any column names you want, along with any of the non-numeric columns you held back. Note there could possibly be more than one. Obviously, getting multiple ones back in the original order is harder.
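The steps above can be sketched roughly as follows. This is only one way to do it: the function name multiply_frames() is made up for illustration, refusing NA values is just one possible policy, and the held-back text columns are simply placed first rather than restored to their original positions.

```r
# Hypothetical helper (not from the thread): multiply the numeric parts of
# two data.frames and re-attach the non-numeric columns of the first one.
multiply_frames <- function(A, B) {
  stopifnot(is.data.frame(A), is.data.frame(B))

  # is.numeric() is FALSE for character, logical and complex columns.
  a_is_num <- vapply(A, is.numeric, logical(1))
  b_is_num <- vapply(B, is.numeric, logical(1))

  A.text    <- A[, !a_is_num, drop = FALSE]             # set aside for later
  A.numeric <- as.matrix(A[, a_is_num, drop = FALSE])
  B.numeric <- as.matrix(B[, b_is_num, drop = FALSE])   # non-numeric dropped

  # (M x N) %*% (N x K) is only defined when the inner dimensions agree.
  if (ncol(A.numeric) != nrow(B.numeric)) {
    stop("non-conformable: A has ", ncol(A.numeric),
         " numeric columns but B has ", nrow(B.numeric), " rows")
  }

  # Assumed NA policy: stop instead of silently propagating NA.
  if (anyNA(A.numeric) || anyNA(B.numeric)) {
    stop("NA values found; clean or impute them first")
  }

  product <- A.numeric %*% B.numeric

  # Text columns first, then the products.
  data.frame(A.text, product, stringsAsFactors = FALSE)
}

# The data from the original question, built directly.
dat1 <- data.frame(ID = c("A", "B", "C"),
                   x = c(10, 25, 14), y = c(34, 42, 20), z = c(12, 18, 8),
                   stringsAsFactors = FALSE)
dat2 <- data.frame(ID = c("A", "B", "C"),
                   weight = c(0.25, 0.42, 0.65),
                   weiht2 = c(0.35, 0.52, 0.75),
                   stringsAsFactors = FALSE)

res <- multiply_frames(dat1, dat2)
print(res)
```

The conformance check comes before the multiplication so the caller gets a message naming both dimensions rather than the terser error from %*% itself.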
I am not sure if you are interested in another tweak. For some purposes, rownames() and colnames() make sense instead of additional rows or columns.
A line of code like this applied to a data.frame will copy your id column as a rowname then remove the actual ID column.
> dat1
ID x y z
1 A 10 34 12
2 B 25 42 18
3 C 14 20 8
> rownames(dat1) <- dat1$ID
> dat1$ID <- NULL
> dat1
x y z
A 10 34 12
B 25 42 18
C 14 20 8
> result <- as.matrix(dat1) %*% mat2
> result
weight weiht2
A 24.58 30.18
B 35.59 44.09
C 17.10 21.30
There are functions (perhaps in packages, with names like column_to_rownames() in the tidyverse packages) that can be used, and you can also reverse the process.
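As a small illustration of reversing the process without any package, here is a base R sketch that moves row names back into an ordinary column. The numbers are the first result column from the example above; the names m and back are only for illustration.

```r
# A matrix whose IDs live in the row names.
m <- matrix(c(24.58, 35.59, 17.10), ncol = 1,
            dimnames = list(c("A", "B", "C"), "index1"))

# Copy the row names into a real column; row.names = NULL resets the
# row names of the new data.frame to the automatic 1, 2, 3.
back <- data.frame(ID = rownames(m), m,
                   row.names = NULL, stringsAsFactors = FALSE)
print(back)
```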
Just some thoughts. The point is that it is often wiser to not mix text with numeric and rownames and colnames provide a way to include the text for the purposes you want and not for others where they are in the way. And here is an oddity I found:
> dat2
ID weight weiht2
1 A 0.25 0.35
2 B 0.42 0.52
3 C 0.65 0.75
> temp <- data.frame(dat2, row.names=1)
> temp
weight weiht2
A 0.25 0.35
B 0.42 0.52
C 0.65 0.75
As shown, when you create a data.frame you can move any column by NUMBER into rownames. So consider your early code and note that read.table supports the same option, row.names=1, so in one step:
> ?read.table
> dat1 <-read.table(text="ID, x, y, z
+ A, 10, 34, 12
+ B, 25, 42, 18
+ C, 14, 20, 8 ",sep=",",header=TRUE,stringsAsFactors=F, row.names=1)
> dat1
x y z
A 10 34 12
B 25 42 18
C 14 20 8
You can make it a matrix immediately (here the variable text is assumed to hold the comma-separated text shown above):
mat1 <- as.matrix(read.table(
text = text,
sep = ",",
header = TRUE,
stringsAsFactors = F,
row.names = 1
))
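Putting that together for both inputs, a sketch might look like this. The helper read_matrix() and the variables txt1 and txt2 are names invented here; the data are the ones from the original question.

```r
txt1 <- "ID, x, y, z
A, 10, 34, 12
B, 25, 42, 18
C, 14, 20, 8"

txt2 <- "ID, weight, weiht2
A, 0.25, 0.35
B, 0.42, 0.52
C, 0.65, 0.75"

# Hypothetical helper: read comma-separated text straight into a matrix,
# using the first column as row names instead of as data.
read_matrix <- function(txt) {
  as.matrix(read.table(text = txt, sep = ",", header = TRUE, row.names = 1))
}

mat1 <- read_matrix(txt1)
mat2 <- read_matrix(txt2)

# The row names of mat1 carry through the multiplication.
result <- mat1 %*% mat2
colnames(result) <- c("index1", "index2")
print(result)
```

Because the IDs never enter the matrices as data, nothing needs to be stripped before multiplying or glued back on afterwards.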
-----Original Message-----
From: Val <valkremk using gmail.com>
Sent: Friday, August 4, 2023 2:03 PM
To: avi.e.gross using gmail.com
Cc: r-help using r-project.org
Subject: Re: [R] Multiply
Thank you, Avi and Ivan. Worked for this particular Example.
Yes, I am looking for something with a more general purpose.
I think Ivan's suggestion works for this.
multiplication=as.matrix(dat1[,-1]) %*% as.matrix(dat2[match(dat1[,1],
dat2[,1]),-1])
Res=data.frame(ID = dat1[,1], Index = multiplication)
On Fri, Aug 4, 2023 at 10:59 AM <avi.e.gross using gmail.com> wrote:
>
> Val,
>
> A data.frame is not quite the same thing as a matrix.
>
> But as long as everything is numeric, you can convert both data.frames to
> matrices, perform the computations needed and, if you want, convert it back
> into a data.frame.
>
> BUT it must be all numeric and you violate that requirement by having a
> character column for ID. You need to eliminate that temporarily:
>
> dat1 <- read.table(text="ID, x, y, z
> A, 10, 34, 12
> B, 25, 42, 18
> C, 14, 20, 8 ",sep=",",header=TRUE,stringsAsFactors=F)
>
> mat1 <- as.matrix(dat1[,2:4])
>
> The result is:
>
> > mat1
> x y z
> [1,] 10 34 12
> [2,] 25 42 18
> [3,] 14 20 8
>
> Now do the second matrix, perhaps in one step:
>
> mat2 <- as.matrix(read.table(text="ID, weight, weiht2
> A, 0.25, 0.35
> B, 0.42, 0.52
> C, 0.65, 0.75",sep=",",header=TRUE,stringsAsFactors=F)[,2:3])
>
>
> Do note some people use read.csv() instead of read.table, albeit it simply
> calls read.table after setting some parameters like the comma.
>
> The result is what you asked for, including spelling weight wrong once:
>
> > mat2
> weight weiht2
> [1,] 0.25 0.35
> [2,] 0.42 0.52
> [3,] 0.65 0.75
>
> Now you wanted to multiply as in matrix multiplication.
>
> > mat1 %*% mat2
> weight weiht2
> [1,] 24.58 30.18
> [2,] 35.59 44.09
> [3,] 17.10 21.30
>
> Of course, you wanted different names for the columns and you can do that
> easily enough:
>
> result <- mat1 %*% mat2
>
> colnames(result) <- c("index1", "index2")
>
>
> But this is missing something:
>
> > result
> index1 index2
> [1,] 24.58 30.18
> [2,] 35.59 44.09
> [3,] 17.10 21.30
>
> Do you want a column of ID numbers on the left? If numeric, you can keep it
> in a matrix in one of many ways but if you want to go back to the data.frame
> format and re-use the ID numbers, there are again MANY ways. But note mixing
> characters and numbers can inadvertently convert everything to characters.
>
> Here is one solution. Not the only one nor the best one but reasonable:
>
> recombined <- data.frame(index=dat1$ID,
> index1=result[,1],
> index2=result[,2])
>
>
> > recombined
> index index1 index2
> 1 A 24.58 30.18
> 2 B 35.59 44.09
> 3 C 17.10 21.30
>
> If for some reason you need a more general purpose way to do this for
> arbitrary conformant matrices, you can write a function that does this in a
> more general way but perhaps a better idea might be a way to store your
> matrices in files in a way that can be read back in directly or to not
> include indices as character columns but as row names.
>
>
>
>
>
>
> -----Original Message-----
> From: R-help <r-help-bounces using r-project.org> On Behalf Of Val
> Sent: Friday, August 4, 2023 10:54 AM
> To: r-help using R-project.org (r-help using r-project.org) <r-help using r-project.org>
> Subject: [R] Multiply
>
> Hi all,
>
> I want to multiply two data frames as shown below,
>
> dat1 <-read.table(text="ID, x, y, z
> A, 10, 34, 12
> B, 25, 42, 18
> C, 14, 20, 8 ",sep=",",header=TRUE,stringsAsFactors=F)
>
> dat2 <-read.table(text="ID, weight, weiht2
> A, 0.25, 0.35
> B, 0.42, 0.52
> C, 0.65, 0.75",sep=",",header=TRUE,stringsAsFactors=F)
>
> Desired result
>
> ID Index1 Index2
> 1 A 24.58 30.18
> 2 B 35.59 44.09
> 3 C 17.10 21.30
>
> Here is my attempt, but did not work
>
> dat3 <- data.frame(ID = dat1[,1], Index = apply(dat1[,-1], 1, FUN=
> function(x) {sum(x*dat2[,2:ncol(dat2)])} ), stringsAsFactors=F)
>
>
> Any help?
>
> Thank you,
>
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>