[R] Help with merge function

arun smartpink111 at yahoo.com
Fri Apr 26 22:43:26 CEST 2013


Hi,

Check whether this works.

Lines1 <- readLines("NS_update.txt")
# Strip the double quotes before parsing the comma-separated text
x1 <- read.table(text=gsub('"',"",Lines1),sep=",",header=TRUE,stringsAsFactors=FALSE)
x2 <- read.table("data.txt",sep="",header=TRUE,stringsAsFactors=FALSE,fill=TRUE)
dim(x2)
#[1] 34577   189
library(plyr)
# Right join: keep every row of x2
res <- join(x1,x2,type="right")
#Joining by: State_Prov, Shape_name, bob2009, bob2010, red2009, red2010, coy2009, coy2010, lyn2009, lyn2010
dim(res)
#[1] 34577   193

# Same dimensions with base R
res2 <- merge(x1,x2,all.y=TRUE)
dim(res2)
#[1] 34577   193
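
If the shared count columns ever disagree between the two files (for example NA in one and 0 in the other), matching on every common column leaves those rows unmatched. Here is a minimal sketch of an alternative. It assumes that State_prov and Shape_name together identify a row uniquely (an assumption, please check it on the real files), and it uses small toy data frames built from the example in this thread rather than the real data. The names keys, new_cols and res3 are only illustrative.

```r
# Toy stand-ins for the real files (values from the example in this thread)
x1 <- data.frame(State_prov = "Nova Scotia",
                 Shape_name = c("Annapolis", "Antigonish", "Gly"),
                 bob2009 = c(0, 0, NA), bob2010 = c(0, 0, NA),
                 bob2011 = c(1, 0, NA), stringsAsFactors = FALSE)
x2 <- data.frame(FID = 0:2, State_prov = "Nova Scotia",
                 Shape_name = c("Annapolis", "Antigonish", "Gly"),
                 bob2009 = 0, bob2010 = 0, coy2009 = c(10, 1, 1),
                 stringsAsFactors = FALSE)

keys <- c("State_prov", "Shape_name")       # assumed unique row identifier
new_cols <- setdiff(names(x1), names(x2))   # columns that only x1 has

# Keep every row of x2 and bring in only the key and new columns from x1,
# so disagreements in the shared count columns cannot block a match
res3 <- merge(x2, x1[, c(keys, new_cols)], by = keys, all.x = TRUE)
dim(res3)
```

With this approach the values of the shared columns always come from x2; whether that is the right choice depends on which file you trust more.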
A.K.





________________________________
From: Catarina Ferreira <catferreira at gmail.com>
To: arun <smartpink111 at yahoo.com> 
Sent: Friday, April 26, 2013 4:20 PM
Subject: Re: [R] Help with merge function



Here they are. As you can see, NS_update contains data for only one province, and I want to add these data to the bigger file (data), merging the common columns and adding the new columns. Instead, it duplicates the rows in the bigger file that correspond to NS_update, as well as creating the new columns (which is fine).
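
One way to see why rows are duplicated rather than merged is to check, row by row, which shared columns disagree between the two tables. The following is only a rough sketch on toy data, not the real files; it assumes State_prov and Shape_name identify a row, and the names keys, shared and both are made up for illustration.

```r
# Toy data: the shared column bob2009 is NA in x1 but 0 in x2 for "Gly"
x1 <- data.frame(State_prov = "Nova Scotia",
                 Shape_name = c("Annapolis", "Gly"),
                 bob2009 = c(0, NA), bob2011 = c(1, NA),
                 stringsAsFactors = FALSE)
x2 <- data.frame(FID = 0:1, State_prov = "Nova Scotia",
                 Shape_name = c("Annapolis", "Gly"),
                 bob2009 = c(0, 0), stringsAsFactors = FALSE)

keys <- c("State_prov", "Shape_name")                     # assumed identifier
shared <- setdiff(intersect(names(x1), names(x2)), keys)  # here: "bob2009"

# Match on the keys only; shared columns get a suffix per source table
both <- merge(x1, x2, by = keys, suffixes = c(".x1", ".x2"))

for (col in shared) {
  a <- both[[paste0(col, ".x1")]]
  b <- both[[paste0(col, ".x2")]]
  # TRUE when exactly one side is NA, or both are present and unequal
  both[[paste0(col, ".differs")]] <-
    is.na(a) != is.na(b) | (!is.na(a) & !is.na(b) & a != b)
}
both
```

Rows flagged as differing are the ones that merge() cannot match when every common column is used in `by`, so with all=TRUE they appear twice in the result.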




On Fri, Apr 26, 2013 at 4:16 PM, arun <smartpink111 at yahoo.com> wrote:

You can send the files. 
>
>
>
>
>
>
>
>________________________________
>From: Catarina Ferreira <catferreira at gmail.com>
>To: arun <smartpink111 at yahoo.com>
>Sent: Friday, April 26, 2013 4:15 PM
>
>Subject: Re: [R] Help with merge function
>
>
>
>Is it OK if I send you the files? That will probably make it easier for you to understand the problem. It didn't work on my files.
>
>
>
>
>On Fri, Apr 26, 2013 at 4:12 PM, arun <smartpink111 at yahoo.com> wrote:
>
>Hi,
>>
>> I am not sure what the problem is.  I used the datasets you provided, "x1" and "x2", and got the result shown in your desired output.
>>Are you saying that this didn't work on your original dataset, or on the one you provided?  In that case, could you send the output of dput(head(dataset,20))?
>>
>>
>>
>>
>>
>>
>>
>>
>>________________________________
>>From: Catarina Ferreira <catferreira at gmail.com>
>>To: arun <smartpink111 at yahoo.com>
>>Sent: Friday, April 26, 2013 4:01 PM
>>
>>Subject: Re: [R] Help with merge function
>>
>>
>>
>>Thank you. It still isn't working. Thank you in any case.
>>
>>
>>
>>
>>On Fri, Apr 26, 2013 at 2:31 PM, arun <smartpink111 at yahoo.com> wrote:
>>
>>Hi,
>>>From the output you wanted, it looks like:
>>>library(plyr)
>>>join(x1,x2,type="right")
>>>#Joining by: State_prov, Shape_name, bob2009, bob2010
>>>
>>>#   State_prov Shape_name bob2009 bob2010 bob2011 FID coy2009
>>>#1 Nova Scotia  Annapolis       0       0       1   0      10
>>>#2 Nova Scotia Antigonish       0       0       0   1       1
>>>#3 Nova Scotia        Gly       0       0      NA   2       1
>>>
>>>merge(x1,x2,all.y=TRUE)
>>>
>>>#   State_prov Shape_name bob2009 bob2010 bob2011 FID coy2009
>>>#1 Nova Scotia  Annapolis       0       0       1   0      10
>>>#2 Nova Scotia Antigonish       0       0       0   1       1
>>>#3 Nova Scotia        Gly       0       0      NA   2       1
>>>
>>>A.K.
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>________________________________
>>>From: Catarina Ferreira <catferreira at gmail.com>
>>>To: arun <smartpink111 at yahoo.com>
>>>Sent: Friday, April 26, 2013 2:23 PM
>>>Subject: Re: [R] Help with merge function
>>>
>>>
>>>
>>>
>>>Hello,
>>>
>>>I didn't realize that the format had been changed after I sent the email. I'm attaching the original message as a Word document with the correct formatting, since I don't think your answer is the one I'm looking for, probably because of the garbled format.
>>>
>>>Thank you again for your help.
>>>
>>>
>>>
>>>
>>>On Fri, Apr 26, 2013 at 2:11 PM, arun <smartpink111 at yahoo.com> wrote:
>>>
>>>Hi,
>>>>
>>>>The format is a bit messed up, so I'm not sure this is what you wanted.
>>>>
>>>>x1<- read.table(text="State_prov,Shape_name,bob2009,bob2010,bob2011
>>>>Nova Scotia,Annapolis,0,0,1
>>>>Nova Scotia,Antigonish,0,0,0
>>>>Nova Scotia,Gly,NA,NA,NA
>>>>",sep=",",header=TRUE,stringsAsFactors=FALSE)
>>>>
>>>>x2<- read.table(text="
>>>>FID,State_prov,Shape_name,bob2009,bob2010,coy2009
>>>>0,Nova Scotia,Annapolis,0,0,10
>>>>1,Nova Scotia,Antigonish,0,0,1
>>>>2,Nova Scotia,Gly,0,0,1
>>>>",sep=",",header=TRUE,stringsAsFactors=FALSE)
>>>> merge(x1,x2,all=TRUE)
>>>>#   State_prov Shape_name bob2009 bob2010 bob2011 FID coy2009
>>>>#1 Nova Scotia  Annapolis       0       0       1   0      10
>>>>#2 Nova Scotia Antigonish       0       0       0   1       1
>>>>#3 Nova Scotia        Gly       0       0      NA   2       1
>>>>#4 Nova Scotia        Gly      NA      NA      NA  NA      NA
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>----- Original Message -----
>>>>From: Catarina Ferreira <catferreira at gmail.com>
>>>>To: r-help at r-project.org
>>>>Cc:
>>>>Sent: Friday, April 26, 2013 1:10 PM
>>>>Subject: [R] Help with merge function
>>>>
>>>>Dear all,
>>>>
>>>>I'm trying to merge 2 dataframes, but I'm not being entirely successful and
>>>>I can't understand why.
>>>>
>>>>Dataframe x1
>>>>
>>>>State_prov     Shape_name   bob2009   bob2010   bob2011
>>>>Nova Scotia    Annapolis    0         0         1
>>>>Nova Scotia    Antigonish   0         0         0
>>>>Nova Scotia    Gly          NA        NA        NA
>>>>
>>>>Dataframe x2 - has 20000 rows and 193 variables, contains one important
>>>>field which is "FID" that is a link to a shapefile (this is not in x1) and
>>>>shares common columns with x1, like this:
>>>>
>>>>FID   State_prov     Shape_name   bob2009   bob2010   coy2009
>>>>0     Nova Scotia    Annapolis    0         0         10
>>>>1     Nova Scotia    Antigonish   0         0         1
>>>>2     Nova Scotia    Gly          0         0         1
>>>>
>>>>So when I do
>>>>
>>>>x3  <- merge(x1, x2, by=intersect(names(x1), names(x2)), all=TRUE)
>>>>
>>>>it should do the trick. The thing is that it works for the columns (it adds
>>>>all the new columns not common to both dataframes), but it also adds the
>>>>rows. This is what I get (x3):
>>>>
>>>>FID   State_prov     Shape_name   bob2009   bob2010   coy2009   bob2011
>>>>0     Nova Scotia    Annapolis    0         0         10        NA
>>>>NA    Nova Scotia    Annapolis    NA        NA        NA        1
>>>>1     Nova Scotia    Antigonish   0         0         1         NA
>>>>NA    Nova Scotia    Antigonish   NA        NA        NA        0
>>>>2     Nova Scotia    Gly          0         0         1         NA
>>>>NA    Nova Scotia    Gly          NA        NA        NA        NA
>>>>
>>>>What I want to get is a true merge, like this:
>>>>
>>>>FID   State_prov     Shape_name   bob2009   bob2010   coy2009   bob2011
>>>>0     Nova Scotia    Annapolis    0         0         10        1
>>>>1     Nova Scotia    Antigonish   0         0         1         0
>>>>2     Nova Scotia    Gly          0         0         1         NA
>>>>
>>>>Can anybody please help me understand what I'm doing wrong?
>>>>Any help will be much appreciated!!
>>>>
>>>>
>>>>--
>>>>Catarina C. Ferreira, PhD
>>>>
>>>>    [[alternative HTML version deleted]]
>>>>
>>>>______________________________________________
>>>>R-help at r-project.org mailing list
>>>>https://stat.ethz.ch/mailman/listinfo/r-help
>>>>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>>and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>>
>>>
>>>
>>>--
>>>Catarina C. Ferreira, PhD
>>>Post-doctoral Research Fellow
>>>Department of Biology
>>>Trent University
>>>Peterborough, ON Canada
>>>URL: http://www.researcherid.com/rid/A-3898-2011
>>>
>>
>>
>>
>
>
>




