[R] problems with merge() - the output has many repeated lines
Cecilia Carmo
cecilia.carmo at ua.pt
Sun Aug 22 19:23:56 CEST 2010
I ran
intersect(names(df1), names(df2))
[1] "firm" "year"
This is the key I used to merge:
merge(df1,df2,by=c("firm","year"))
Each firm/year row in df1 matches just one firm/year row
in df2. df1 has more firm/year rows than df2, and the
extra ones don't match any row in df2.
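
One way to test the assumption that the firm/year key is unique in each data frame is to count duplicated keys before merging. The following is only a minimal sketch on made-up toy data: the values and the columns x and y are invented for illustration, and only the firm/year key comes from this thread. It shows how a single repeated key in one data frame yields rows in the merged result that are equal in every column.

```r
# Toy data (hypothetical): df2 holds the key firm=1/year=2000 twice.
df1 <- data.frame(firm = c(1, 1, 2),
                  year = c(2000, 2001, 2000),
                  x    = c(10, 20, 30))
df2 <- data.frame(firm = c(1, 1, 2),
                  year = c(2000, 2000, 2000),
                  y    = c(5, 5, 7))

key <- c("firm", "year")

# Number of rows whose key already appeared earlier in the same data frame.
dup1 <- sum(duplicated(df1[, key]))
dup2 <- sum(duplicated(df2[, key]))

# Inner join on the key; every matching pair contributes one row.
merged <- merge(df1, df2, by = key)

# Number of merged rows that are exact copies of an earlier merged row.
dup_merged <- sum(duplicated(merged))

print(c(dup1 = dup1, dup2 = dup2,
        rows = nrow(merged), dup_merged = dup_merged))
```

If both duplicate counts are zero, each row of one data frame can match at most one row of the other, so repeated rows in the result would have to come from somewhere else. If either count is non-zero, looking at the rows flagged by duplicated() (or calling unique() on the offending data frame, if the repeats are exact copies) is a reasonable next step.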
Cecília
On Sun, 22 Aug 2010 12:09:57 -0500
Erik Iverson <eriki at ccbr.umn.edu> wrote:
> Cecilia -
>
> Find what columns you're matching on:
>
> intersect(names(df1), names(df2))
>
> Maybe that will shed some light on the issue.
>
> On 08/22/2010 12:02 PM, Cecilia Carmo wrote:
>> Thanks, but I don't have multiple matches, and the repeated lines in
>> the final dataframe are exactly equal in all columns.
>>
>> Cecília
>>
>> On Sat, 21 Aug 2010 10:58:53 -0500
>> Hadley Wickham <hadley at rice.edu> wrote:
>>> You may find a close reading of ?merge helpful, particularly this
>>> sentence: "If there is more than one match, all possible
>>> matches contribute one row each" (so check that you don't have
>>> multiple matches).
>>>
>>> Hadley
>>>
>>> On Sat, Aug 21, 2010 at 10:45 AM, Cecilia Carmo
>>> <cecilia.carmo at ua.pt> wrote:
>>>> Hi everyone,
>>>>
>>>> I have been merging many big dataframes (about 80000 rows each) and
>>>> I never had this problem, but now it has happened to me and I want
>>>> to know if someone knows what could be happening.
>>>> The final dataframe has an impossibly large number of rows! I ran
>>>> edit(dataframe) and saw that there are many repeated rows (all
>>>> equal).
>>>>
>>>> Thanks for any help,
>>>>
>>>> Cecília Carmo
>>>> Universidade de Aveiro
>>>> Portugal
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>>
>>>
>>> --
>>> Assistant Professor / Dobelman Family Junior Chair
>>> Department of Statistics / Rice University
>>> http://had.co.nz/
>>
>