[R-sig-Geo] R: merge 2 txt file (PROBLEM OF MEMORY)

Roger Bivand Roger.Bivand at nhh.no
Fri Nov 7 22:17:04 CET 2008


On Fri, 7 Nov 2008, Alessandro wrote:

> Hi,
>
> The problem is this: when I merge ground1 and ground2 into one txt file
> (ground), I lose many rows.
>
> Example: files ground1.txt and ground2.txt
> File format: X,Y,Z columns, with a header row and sep=","
>
> ******************************
>
>> ground1 <- read.delim("ground_Filtered_268000_4149000.txt",
> sep=",",header=TRUE)
>> ground2 <- read.delim("ground_Filtered_269000_4149000.txt",
> sep=",",header=TRUE)
>> summary(ground1)
>       X                Y                 Z
> Min.   :267980   Min.   :4148980   Min.   :1399
> 1st Qu.:268256   1st Qu.:4149238   1st Qu.:1505
> Median :268528   Median :4149490   Median :1587
> Mean   :268515   Mean   :4149491   Mean   :1595
> 3rd Qu.:268777   3rd Qu.:4149743   3rd Qu.:1683
> Max.   :269020   Max.   :4150020   Max.   :1823
>> summary(ground2)
>       X                Y                 Z
> Min.   :268980   Min.   :4148980   Min.   :1628
> 1st Qu.:269265   1st Qu.:4149268   1st Qu.:1720
> Median :269512   Median :4149543   Median :1753
> Mean   :269509   Mean   :4149521   Mean   :1753
> 3rd Qu.:269753   3rd Qu.:4149768   3rd Qu.:1788
> Max.   :270020   Max.   :4150020   Max.   :1903
>> str(ground1)
> 'data.frame':   2356617 obs. of  3 variables:
> $ X: num  268000 268000 268001 268002 268002 ...
> $ Y: num  4149984 4149982 4149983 4149983 4149983 ...
> $ Z: num  1543 1543 1543 1543 1543 ...
>> str(ground2)
> 'data.frame':   3235340 obs. of  3 variables:
> $ X: num  270000 269999 269999 270000 270000 ...
> $ Y: num  4149873 4149873 4149873 4149874 4149876 ...
> $ Z: num  1744 1745 1744 1744 1744 ...
>> ground <- merge(ground1,ground2)
>> str(ground)
> 'data.frame':   89819 obs. of  3 variables:
> $ X: num  268980 268980 268980 268980 268980 ...
> $ Y: num  4148981 4149013 4149090 4149097 4149110 ...
> $ Z: num  1628 1640 1668 1670 1673 ...
>>
>
> The result: ground1 has 2356617 rows and ground2 has 3235340 rows, but
> ground has only 89819.
>
> I think it is a memory problem.

No, an evident problem of not reading and understanding the documentation 
of merge. You have to try it out on small data sets first to understand 
how it works, not because of memory problems, but because they run faster. 
Once you know how to tell merge() what you want to do, it will do it, but 
it cannot guess, and you are not telling it. You have a habit of 
"thinking" too much, and this isn't helping you make progress.
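
For illustration, a minimal sketch of the default behaviour of merge() on 
two tiny made-up data frames (a and b are hypothetical, not the poster's 
data). With no "by" argument, merge() joins on every column name the two 
inputs share, here X, Y and Z, and keeps only the rows found in both. That 
would be consistent with getting back only the points that occur in both 
tiles, although this has not been checked against the actual files.

```r
# Two tiny, made-up X,Y,Z tables that share exactly one row (2, 20, 200)
a <- data.frame(X = c(1, 2), Y = c(10, 20), Z = c(100, 200))
b <- data.frame(X = c(2, 3), Y = c(20, 30), Z = c(200, 300))

# Default: join on all common columns (X, Y, Z), keep only matching rows
inner <- merge(a, b)
nrow(inner)   # 1

# all = TRUE also keeps the rows that have no match in the other input
outer <- merge(a, b, all = TRUE)
nrow(outer)   # 3
```

With inputs this small, both results can be printed and compared with a 
and b row by row.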

I'm not even sure that you don't mean rbind(), as Ashton pointed out. Try 
to draw a workflow on a sheet of paper, and try out the functions that 
might help one by one on a small data set, so small that you can inspect 
the two input objects and the output object visually. It'll be great when 
you have learned enough to offer advice to repay the efforts of this list 
to help you help yourself.
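
A sketch of such a small test for the rbind() route, again with made-up 
data frames (a and b are hypothetical). It assumes the goal is simply to 
stack the two point sets; the unique() step is an optional extra, only 
relevant if points that appear in both inputs should be kept once.

```r
# Two tiny, made-up X,Y,Z tables that share exactly one row (2, 20, 200)
a <- data.frame(X = c(1, 2), Y = c(10, 20), Z = c(100, 200))
b <- data.frame(X = c(2, 3), Y = c(20, 30), Z = c(200, 300))

# Append the rows of b below the rows of a; nothing is matched or dropped
stacked <- rbind(a, b)
nrow(stacked)   # 4, that is nrow(a) + nrow(b)

# Optional: drop rows that are exact duplicates across all columns
deduped <- unique(stacked)
nrow(deduped)   # 3
```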

Have a nice weekend,

Roger

>
> Ale
>
>
>
>
>
> -----Original Message-----
> From: Ashton Shortridge [mailto:ashton at msu.edu]
> Sent: Wednesday, 5 November 2008 10:15
> To: r-sig-geo at stat.math.ethz.ch; Alessandro
> Subject: Re: [R-sig-Geo] merge 2 txt file
>
> Hi Ale,
>
> When you say merge, do you mean you want to append one onto the other, or
> you really want to merge elements that have common x and y values?
>
> If it's the first case, then I'd probably read both in with read.table() and
> then rbind them:
> dat1 <- read.table("file1.txt")
> dat2 <- read.table("file2.txt")
> dat3 <- rbind(dat1,dat2)
>
> If it's the second, I can't help much. Did you try it? Did that work? If you
> can't tell (if, say, the text files are really big) then make two small test
> files and try that.
>
> Ashton
>
> On Tuesday 04 November 2008 06:52:20 pm Alessandro wrote:
>> Hi all,
>>
>>
>>
>> I have two txt files with X,Y,Z columns and I need to merge them together
>>
>>
>>
>> I tried
>>
>> file_all <- merge("file1.txt","file2.txt")
>>
>> but I'm not sure about the result. Is this code correct?
>>
>> Thanks Ale
>>
>>
>>
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>
>
>
>

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no

