[R] inner join sqldf
Newbie19_02
nvanzuydam at gmail.com
Mon Mar 15 13:19:17 CET 2010
Hi,
I have two dataframes that have some common columns. I would like to join
them by the common columns prochi and prescribed_date as there are duplicate
prochis but they will be made unique by date. I tried doing an inner join
but that just duplicated the columns whereas I would like the information
from the test_sql_tsf to fill the NAs in the test_sql_psd common columns.
require(sqldf)
test_sql_psd <- http://n4.nabble.com/file/n1593282/test_sql_psd.txt
test_sql_psd.txt , header=TRUE, sep="\t", dec=".", na.strings="NA",
check.names=TRUE, quote= "\"'")
test_sql_tsf <-read.table(file=
http://n4.nabble.com/file/n1593282/test_sql_tsf.txt test_sql_tsf.txt , ,
header=TRUE, sep="\t", dec=".", na.strings="NA", check.names=TRUE, quote=
"\"'")
test_sql_innerjoin <- sqldf("select * test_sql_psd inner join test_sql_tsf
on test_sql_psd.prochi=test_sql_tsf.prochi")
colnames(test_sql_psd)
[1] "prochi" "prescribed_date" "dataMonth" "item_code"
[5] "res_seqno" "quantity" "directions" "no_of_packs"
[9] "datasource" "scan_ref_no" "name"
"approved_name"
[13] "formulation_code" "strength" "measure_code" "bnf_code"
[17] "bnf_description"
colnames(test_sql_tsf)
[1] "prochi" "prescribed_date" "dataMonth" "item_code"
[5] "res_seqno" "quantity" "directions" "no_of_packs"
[9] "datasource" "scan_ref_no" "name"
"formulation_code"
[13] "strength" "bnf_code"
with the result of:
colnames(test_sql_innerjoin)
[1] "prochi" "prescribed_date" "dataMonth" "item_code"
[5] "res_seqno" "quantity" "directions" "no_of_packs"
[9] "datasource" "scan_ref_no" "name"
"formulation_code"
[13] "strength" "bnf_code" "prochi"
"prescribed_date"
[17] "dataMonth" "item_code" "res_seqno" "quantity"
[21] "directions" "no_of_packs" "datasource" "scan_ref_no"
[25] "name" "approved_name" "formulation_code" "strength"
[29] "measure_code" "bnf_code" "bnf_description"
I'm not sure if I am using the correct sqldf command or if there is an
easier way to do this from the start. I also tried
test_sql_union<-sqldf("select * test_sql_tsf union select * test_sql_psd")
which gave me the same result as inner join. I'm not sure if I am using the
correct commands for what I want to do?
Thanks for your help.
Natalie
--
View this message in context: http://n4.nabble.com/inner-join-sqldf-tp1593282p1593282.html
Sent from the R help mailing list archive at Nabble.com.
More information about the R-help
mailing list