[R-sig-Geo] R: merge 2 txt file (PROBLEM OF MEMORY)
Roger Bivand
Roger.Bivand at nhh.no
Fri Nov 7 22:17:04 CET 2008
On Fri, 7 Nov 2008, Alessandro wrote:
> Hi,
>
> The problem is this: when I merge ground1 and ground2 into one txt file
> (ground), I lose many rows.
>
> Example: files ground1.txt and ground2.txt
> File format: X,Y,Z with a header row and sep=","
>
> ******************************
>
>> ground1 <- read.delim("ground_Filtered_268000_4149000.txt",
> sep=",",header=TRUE)
>> ground2 <- read.delim("ground_Filtered_269000_4149000.txt",
> sep=",",header=TRUE)
>> summary(ground1)
> X Y Z
> Min. :267980 Min. :4148980 Min. :1399
> 1st Qu.:268256 1st Qu.:4149238 1st Qu.:1505
> Median :268528 Median :4149490 Median :1587
> Mean :268515 Mean :4149491 Mean :1595
> 3rd Qu.:268777 3rd Qu.:4149743 3rd Qu.:1683
> Max. :269020 Max. :4150020 Max. :1823
>> summary(ground2)
> X Y Z
> Min. :268980 Min. :4148980 Min. :1628
> 1st Qu.:269265 1st Qu.:4149268 1st Qu.:1720
> Median :269512 Median :4149543 Median :1753
> Mean :269509 Mean :4149521 Mean :1753
> 3rd Qu.:269753 3rd Qu.:4149768 3rd Qu.:1788
> Max. :270020 Max. :4150020 Max. :1903
>> str(ground1)
> 'data.frame': 2356617 obs. of 3 variables:
> $ X: num 268000 268000 268001 268002 268002 ...
> $ Y: num 4149984 4149982 4149983 4149983 4149983 ...
> $ Z: num 1543 1543 1543 1543 1543 ...
>> str(ground2)
> 'data.frame': 3235340 obs. of 3 variables:
> $ X: num 270000 269999 269999 270000 270000 ...
> $ Y: num 4149873 4149873 4149873 4149874 4149876 ...
> $ Z: num 1744 1745 1744 1744 1744 ...
>> ground <- merge(ground1,ground2)
>> str(ground)
> 'data.frame': 89819 obs. of 3 variables:
> $ X: num 268980 268980 268980 268980 268980 ...
> $ Y: num 4148981 4149013 4149090 4149097 4149110 ...
> $ Z: num 1628 1640 1668 1670 1673 ...
>>
>
> The result: ground1 has 2356617 rows and ground2 has 3235340 rows, but
> ground has only 89819.
>
> I think there is a memory problem.
No, an evident problem of not reading and understanding the documentation
of merge. You have to try it out on small data sets first to understand
how it works, not because of memory problems, but because they run faster.
Once you know how to tell merge() what you want to do, it will do it, but
it cannot guess, and you are not telling it. You have a habit of
"thinking" too much, and this isn't helping you make progress.
I'm not even sure that you don't mean rbind(), as Ashton pointed out. Try
to draw a workflow on a sheet of paper, and try out the functions that
might help one by one on a small data set, so small that you can inspect
the two input objects and the output object visually. It'll be great when
you have learned enough to offer advice to repay the efforts of this list
to help you help yourself.
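
To make the advice above concrete, here is a minimal sketch on two tiny data frames that can be inspected by eye. The names a and b and all the values are hypothetical stand-ins for ground1 and ground2. It assumes the intent was to append one file to the other, which the thread does not confirm. With no by argument, merge() joins on every column name the two inputs share (here X, Y and Z) and keeps only rows that match in both, which would be consistent with the small row count reported.

```r
# Two tiny stand-ins for ground1 and ground2 (hypothetical values).
a <- data.frame(X = c(1, 2, 3), Y = c(10, 20, 30), Z = c(100, 200, 300))
b <- data.frame(X = c(3, 4),    Y = c(30, 40),     Z = c(300, 400))

# Default merge(): by = intersect(names(a), names(b)) and all = FALSE,
# so only rows with identical X, Y and Z in both inputs are kept.
m_default <- merge(a, b)

# all = TRUE also keeps the rows that have no partner in the other input.
m_all <- merge(a, b, all = TRUE)

# rbind() simply stacks the rows of b under the rows of a.
stacked <- rbind(a, b)

# unique() drops exact duplicate rows, e.g. points in an overlap strip.
deduped <- unique(stacked)

print(c(default = nrow(m_default), all = nrow(m_all),
        stacked = nrow(stacked), deduped = nrow(deduped)))
```

If appending is what is wanted, the row count of the result should equal the sum of the two inputs (for the files in this thread, 2356617 + 3235340 = 5591957), which is an easy check to run before and after any duplicate removal.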
Have a nice weekend,
Roger
>
> Ale
>
> -----Original Message-----
> From: Ashton Shortridge [mailto:ashton at msu.edu]
> Sent: Wednesday, 5 November 2008 10:15
> To: r-sig-geo at stat.math.ethz.ch; Alessandro
> Subject: Re: [R-sig-Geo] merge 2 txt file
>
> Hi Ale,
>
> When you say merge, do you mean you want to append one onto the other, or
> you really want to merge elements that have common x and y values?
>
> If it's the first case, then I'd probably read both in with read.table() and
> then rbind them:
> dat1 <- read.table("file1.txt")
> dat2 <- read.table("file2.txt")
> dat3 <- rbind(dat1,dat2)
>
> If it's the second, I can't help much. Did you try it? Did that work? If you
> can't tell (if, say, the text files are really big) then make two small test
> files and try that.
>
> Ashton
>
> On Tuesday 04 November 2008 06:52:20 pm Alessandro wrote:
>> Hi all,
>>
>> I have two txt files with X,Y,Z columns and I need to merge them together.
>>
>> I tried
>>
>> file_all <- merge("file1.txt","file2.txt")
>>
>> but I am not sure about the result. Is this code correct?
>>
>> Thanks, Ale
--
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no