[R] Merging data frames, or one column/vector with a data frame filling out empty rows with NA's
Sarah Goslee
sarah.goslee at gmail.com
Wed Apr 22 15:37:35 CEST 2009
Hi,
How about this:
> SNP5 <- merge(SNP4, SNP1[,2:3], all.x=TRUE)
> SNP5
  Marker    Animal        Y x
1  P1001 194073197 0.021088 2
2  P1002 194073197 0.021088 1
3  P1004 194073197 0.021088 2
4  P1005 194073197 0.021088 0
5  P1006 194073197 0.021088 2
6  P1007 194073197 0.021088 0
This matches on Marker only and ignores SNP1's Animal column, which may
or may not be what you want; it wasn't clear from your question.
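
If Animal does matter, one option is to match on both columns. The following is only a sketch on made-up toy data (the values are not from the original post), assuming each Animal/Marker pair appears at most once in SNP1:

```r
# Toy data, invented for illustration only
SNP4 <- data.frame(Animal = c(1, 1, 2),
                   Marker = c("P1001", "P1002", "P1001"),
                   Y = c(0.5, 0.5, 0.7))
SNP1 <- data.frame(Animal = c(1, 2),
                   Marker = c("P1001", "P1001"),
                   x = c(2, 0))

# Match on both keys; all.x = TRUE keeps every SNP4 row and
# fills x with NA where SNP1 has no matching Animal/Marker pair
both <- merge(SNP4, SNP1, by = c("Animal", "Marker"), all.x = TRUE)
print(both)
```

Here the row for animal 1, marker P1002 has no partner in SNP1, so its x is NA.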
But your error is due to memory limitations. That could come from
specifying the wrong merge, or from having files larger than your
computer can handle. This is a good job for a proper database.
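
One way a "wrong merge" can exhaust memory is a many-to-many match. This is a guess about the real data, since the post shows only one animal: if each Marker occurs once per animal in both data frames, merging on Marker alone pairs every animal with every other animal for each marker. A small sketch with hypothetical sizes:

```r
# Toy data: 3 animals x 2 markers in each frame (sizes are hypothetical)
animals <- 1:3
markers <- c("P1001", "P1002")
SNP4 <- expand.grid(Animal = animals, Marker = markers,
                    stringsAsFactors = FALSE)
SNP4$Y <- 0.5
SNP1 <- expand.grid(Animal = animals, Marker = markers,
                    stringsAsFactors = FALSE)
SNP1$x <- 1

# Merging on Marker alone pairs every SNP4 row with every SNP1 row
# that shares the marker: 3 * 3 rows per marker
by_marker <- merge(SNP4, SNP1, by = "Marker", all = TRUE)

# Merging on both keys gives one row per Animal/Marker pair
by_both <- merge(SNP4, SNP1, by = c("Animal", "Marker"), all = TRUE)

nrow(by_marker)  # 18
nrow(by_both)    # 6
```

With thousands of animals the first result grows with the square of the number of animals per marker, which could plausibly account for a multi-gigabyte allocation.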
>> SNP5 <- merge(SNP4, SNP1$x, by.x = 'Marker', by.y = 'Marker', all = TRUE)
> Error in fix.by(by.y, y) : 'by' must specify valid column(s)
If you pass just SNP1$x, which is a bare vector, there is no Marker
column to merge on. You need to include at least two columns: Marker and x.
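
To illustrate the difference, here is a minimal sketch on invented toy data; the name key_and_x is just a placeholder:

```r
# Toy data, invented for illustration only
SNP4 <- data.frame(Animal = c(1, 1, 1),
                   Marker = c("P1001", "P1002", "P1003"),
                   Y = 0.5)
SNP1 <- data.frame(Animal = c(1, 1),
                   Marker = c("P1001", "P1002"),
                   x = c(2, 1))

# SNP1$x is a plain vector: it carries no Marker column to match on
is.data.frame(SNP1$x)

# Selecting columns by name keeps a data frame with the key and the value
key_and_x <- SNP1[, c("Marker", "x")]

# Keep every SNP4 row; markers missing from SNP1 get NA in x
SNP5 <- merge(SNP4, key_and_x, by = "Marker", all.x = TRUE)
print(SNP5)
```

Selecting the columns by name rather than by position (2:3) also guards against the column order changing.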
On Wed, Apr 22, 2009 at 3:30 AM, joe1985 <johannes at dsr.life.ku.dk> wrote:
>
> Hello
>
> I have two data frames, SNP4 and SNP1:
>
>> head(SNP4)
>         Animal Marker        Y
> 3213 194073197  P1001 0.021088
> 1295 194073197  P1002 0.021088
>  915 194073197  P1004 0.021088
> 2833 194073197  P1005 0.021088
> 1487 194073197  P1006 0.021088
> 1885 194073197  P1007 0.021088
>
>> head(SNP1)
>         Animal Marker x
> 3213 194073197  P1001 2
> 1295 194073197  P1002 1
>  915 194073197  P1004 2
> 2833 194073197  P1005 0
> 1487 194073197  P1006 2
> 1885 194073197  P1007 0
>
> I want these two data frames merged by 'Marker', but when I try
>
>> SNP5 <- merge(SNP4, SNP1, by = 'Marker', all = TRUE)
> Error: cannot allocate vector of size 2.4 Gb
> In addition: Warning messages:
> 1: In merge.data.frame(SNP4, SNP1, by = "Marker", all = TRUE) :
> Reached total allocation of 1535Mb: see help(memory.size)
> 2: In merge.data.frame(SNP4, SNP1, by = "Marker", all = TRUE) :
> Reached total allocation of 1535Mb: see help(memory.size)
> 3: In merge.data.frame(SNP4, SNP1, by = "Marker", all = TRUE) :
> Reached total allocation of 1535Mb: see help(memory.size)
> 4: In merge.data.frame(SNP4, SNP1, by = "Marker", all = TRUE) :
> Reached total allocation of 1535Mb: see help(memory.size)
>
> an error occurs.
>
> What I want is the column SNP1$x merged together with SNP4 by Marker, so
> some markers will have NA's in the 'x'-column in the SNP5 dataset.
>
> I also tried this
>
>> SNP5 <- merge(SNP4, SNP1$x, by.x = 'Marker', by.y = 'Marker', all = TRUE)
> Error in fix.by(by.y, y) : 'by' must specify valid column(s)
>
> It won't work either.
>
> Does anyone have any idea how to solve this?
>
> Regards,
>
> Johannes.
>
--
Sarah Goslee
http://www.functionaldiversity.org