[R] Problems with merge
vikas at mail.jnu.ac.in
Wed Oct 6 07:14:05 CEST 2004
This issue has been discussed on this list before, but the solutions
offered are not satisfactory, so I thought I would raise it again.
I want to merge two datasets which have three common variables. These
variables DO NOT have the same names in both files. In addition,
there are two variables with the same name which do not necessarily
hold exactly the same data. That is, there could be some discrepancy
between the two datasets when it comes to these variables. I do not
want them to be used when I merge the datasets.
The problem is that R seems to allow by.x and by.y to specify only one
variable in the x dataset and one variable in the y dataset to merge
on. Otherwise, if you do not specify anything, it matches on all the
variables that have common names. This is very problematic. In my case,
the variables I want to match on do not have the same names in the two
datasets, and the ones that do have the same names must not be used for
matching.
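For what it is worth, the help page for merge describes by.x and by.y
as specifications of the columns used for merging, and in current
versions of R each of them can be a vector of column names. The sketch
below shows that form under that assumption. The data frames and every
column name in it (state, district, year, st, dist, yr, income, area,
pop) are invented purely for illustration and do not come from the
datasets described above.

```r
# Hypothetical data: the three key columns are named differently in x and y.
# 'income' and 'area' share a name in both but must NOT be used for matching.
x <- data.frame(state    = c("A", "A", "B"),
                district = c(1, 2, 1),
                year     = c(2001, 2001, 2001),
                income   = c(10, 20, 30),
                area     = c(5, 6, 7),
                stringsAsFactors = FALSE)
y <- data.frame(st     = c("A", "A", "B"),
                dist   = c(1, 2, 1),
                yr     = c(2001, 2001, 2001),
                income = c(11, 20, 29),   # deliberately differs from x
                area   = c(5, 6, 8),      # deliberately differs from x
                pop    = c(100, 200, 300),
                stringsAsFactors = FALSE)

# Match on the three key columns only; the i-th name in by.x is paired
# with the i-th name in by.y.
merged <- merge(x, y,
                by.x = c("state", "district", "year"),
                by.y = c("st", "dist", "yr"))

# The key columns keep their names from x. Columns that share a name but
# were not used as keys are kept from both sides with .x / .y suffixes.
print(names(merged))
```

With this form the same-named columns are not used for matching; they
come through as income.x, income.y, area.x and area.y, so any
discrepancies between the two files can be inspected afterwards.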
One approach would be to change the names of the variables and then
merge. But that is not elegant, to say the least.
If nothing else works, that is what I shall have to do. There again we
have a problem: how do I change the name of a particular column? One
solution suggested somewhere in the archives of the list is to use
names(data.frame)=c(list of column names)
But this requires you to list all the variable names. That can obviously
be cumbersome when you have a large number of variables. What would be
the syntax if I want to change just one column name?
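A minimal sketch of renaming a single column, assuming a data frame
called dat with columns a, b and c (all hypothetical names chosen for
illustration): names() can be indexed and assigned to, so only the
entry that needs changing has to be touched.

```r
# Hypothetical data frame with three columns.
dat <- data.frame(a = 1:3, b = 4:6, c = 7:9)

# Rename one column by its current name; the other names are untouched.
names(dat)[names(dat) == "b"] <- "beta"

# Rename one column by its position instead.
names(dat)[3] <- "gamma"

print(names(dat))
```

Selecting by name is usually the safer of the two, since it does not
depend on the order of the columns.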