[R] Difficulty with 'merge'

Michael Kubovy kubovy at virginia.edu
Wed Jan 4 17:38:37 CET 2006


Dear R-helpers,

Happy New Year to all the helpful members of the list.

Here is the behavior I'm looking for:
 > v1 <- c("a","b","c")
 > n1 <- c(0, 1, 2)
 > v2 <- c("c", "a", "b")
 > n2 <- c(0, 1 , 2)
 > (f1  <- data.frame(v1, n1))
   v1 n1
1  a  0
2  b  1
3  c  2
 > (f2 <- data.frame(v2, n2))
   v2 n2
1  c  0
2  a  1
3  b  2
 > (m12 <- merge(f1, f2, by.x = "v1", by.y = "v2", sort = F))
   v1 n1 n2
1  c  2  0
2  a  0  1
3  b  1  2

Now to my data:
 > summary(pL)
         pairL
a fondo   :  41
alto      :  41
ampio     :  41
angoloso  :  41
aperto    :  41
appoggiato:  41
(Other)   :1271

 > pL$pairL[c(1,42)]
[1] appoggiato dentro
37 Levels: a fondo alto ampio angoloso aperto appoggiato asimmetrico  
complicato convesso davanti dentro destra ... verticale

 > summary(oppN)
     pairL             pairR           subject            L                LL                RR               M
a fondo   :  41   a galla    :  41   S1     :  37   Min.   :0.3646   Min.   :0.02083   Min.   :0.0010   Min.   :0.0000
alto      :  41   acuto      :  41   S10    :  37   1st Qu.:0.5521   1st Qu.:0.37500   1st Qu.:0.1771   1st Qu.:0.1042
ampio     :  41   arrotondato:  41   S11    :  37   Median :0.6354   Median :0.47917   Median :0.2708   Median :0.2292
angoloso  :  41   basso      :  41   S12    :  37   Mean   :0.6403   Mean   :0.46452   Mean   :0.2760   Mean   :0.2598
aperto    :  41   chiuso     :  41   S13    :  37   3rd Qu.:0.7188   3rd Qu.:0.55208   3rd Qu.:0.3750   3rd Qu.:0.3854
appoggiato:  41   compl      :  41   S14    :  37   Max.   :0.9375   Max.   :0.92708   Max.   :0.6042   Max.   :0.7812
(Other)   :1271   (Other)    :1271   (Other):1295                                      NA's   :3.0000   NA's   :3.0000
       asym             polar            polar_a1          clust
Min.   :-0.5555   Min.   :-1.2410   Min.   :-2.949e+00   c1:492
1st Qu.: 0.2091   1st Qu.: 0.4571   1st Qu.:-1.902e-01   c2:287
Median : 0.5555   Median : 1.1832   Median :-1.110e-16   c3: 82
Mean   : 0.6265   Mean   : 1.3428   Mean   :-5.745e-02   c4:246
3rd Qu.: 0.9383   3rd Qu.: 2.0712   3rd Qu.: 1.168e-01   c5: 82
Max.   : 2.7081   Max.   : 4.6151   Max.   : 4.218e+00   c6:328
                    NA's   : 3.0000   NA's   : 3.000e+00

 > oppN$pairL[c(1,42)]
[1] spesso fine
37 Levels: a fondo alto ampio angoloso aperto appoggiato asimmetrico  
complicato convesso davanti dentro destra ... verticale

 > unique(sort(oppM$pairL)) == unique(sort(pL$pairL))
 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[26] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
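
The comparison above only shows that the two columns share the same set of names; it does not show how often each name occurs. A minimal sketch of a count check, using toy stand-ins (x and y are made-up factors, not my data, with 3 names repeated 4 times instead of 37 names repeated 41 times):

```r
# Toy stand-ins: the same 3 names, each repeated 4 times, in different orders
x <- factor(rep(c("a", "b", "c"), each = 4))
y <- factor(rep(c("c", "a", "b"), times = 4))

# Tabulate how often each level occurs in each factor
tx <- table(x)
ty <- table(y)

# TRUE only if both factors have the same levels with the same counts
same_counts <- identical(names(tx), names(ty)) && all(tx == ty)
print(same_counts)
```

The same idea applied to the real columns would be table(pL$pairL) against table(oppN$pairL).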

In other words, I think that pL$pairL and oppN$pairL each consist of 37
blocks of 41 repetitions of names, and that these blocks are
permutations of each other.

However:

 > summary(m1 <- merge(oppM, pairL, by.x = "pairL", by.y = "pairL", sort = F))
     pairL               pairR            subject            L                LL                RR               M
a fondo   : 1681   a galla    : 1681   S1     : 1517   Min.   :0.3646   Min.   :0.02083   Min.   :0.0010   Min.   :0.0000
alto      : 1681   acuto      : 1681   S10    : 1517   1st Qu.:0.5521   1st Qu.:0.37500   1st Qu.:0.1771   1st Qu.:0.1042
ampio     : 1681   arrotondato: 1681   S11    : 1517   Median :0.6354   Median :0.47917   Median :0.2708   Median :0.2292
angoloso  : 1681   basso      : 1681   S12    : 1517   Mean   :0.6398   Mean   :0.46402   Mean   :0.2760   Mean   :0.2598
aperto    : 1681   chiuso     : 1681   S13    : 1517   3rd Qu.:0.7188   3rd Qu.:0.55208   3rd Qu.:0.3750   3rd Qu.:0.3854
appoggiato: 1681   compl      : 1681   S14    : 1517   Max.   :0.9375   Max.   :0.92708   Max.   :0.6042   Max.   :0.7812
(Other)   :51988   (Other)    :51988   (Other):52972
       asym             polar            polar_a1          clust
Min.   :-0.5555   Min.   :-1.2410   Min.   :-2.949e+00   c1:20172
1st Qu.: 0.2091   1st Qu.: 0.4571   1st Qu.:-1.904e-01   c2:11644
Median : 0.5555   Median : 1.1832   Median :-1.110e-16   c3: 3362
Mean   : 0.6234   Mean   : 1.3428   Mean   :-5.745e-02   c4:10086
3rd Qu.: 0.9383   3rd Qu.: 2.0712   3rd Qu.: 1.169e-01   c5: 3362
Max.   : 2.7081   Max.   : 4.6151   Max.   : 4.218e+00   c6:13448

I was expecting each level of pairL to occur 41 times in the result, not
1681 = 41^2 times.
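
A smaller sketch of the same effect, assuming the cause is simply that the key is repeated in both data frames (a, b, k, va and vb are made-up names, not my data). Here each key value occurs 3 times on each side, and merge returns every pairing of matching rows:

```r
# Hypothetical toy frames: key "k" takes 2 values, each repeated 3 times in both
a <- data.frame(k = rep(c("x", "y"), each = 3), va = 1:6)
b <- data.frame(k = rep(c("y", "x"), each = 3), vb = 11:16)

m <- merge(a, b, by = "k", sort = FALSE)

# merge pairs every row of a with every row of b that has the same key,
# so each key contributes 3 * 3 = 9 rows instead of 3
print(table(m$k))
print(nrow(m))
```

By the same arithmetic, 41 repetitions of a name on each side would give 41 * 41 = 1681 rows for that name. The first example in this message avoids this only because each key occurs exactly once in f1 and in f2.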
_____________________________
Professor Michael Kubovy
University of Virginia
Department of Psychology
USPS:     P.O.Box 400400    Charlottesville, VA 22904-4400
Parcels:    Room 102        Gilmer Hall
         McCormick Road    Charlottesville, VA 22903
Office:    B011    +1-434-982-4729
Lab:        B019    +1-434-982-4751
Fax:        +1-434-982-4766
WWW:    http://www.people.virginia.edu/~mk9y/



