[R] plotting principal components vs individual variables
Brett Stansfield
brett at hbrc.govt.nz
Mon Apr 11 02:04:14 CEST 2005
Dear R,
I'm trying to plot the first principal component of an analysis against the
first variable. The initial plot works, but I run into trouble when I try to
highlight a subset of the points.
First, I want to highlight some points from the following data set:
list(running)
[[1]]
X100m X200m X400m X800m X1500m X5K X10K Marathon
Argentina 10.39 20.81 46.84 1.81 3.70 14.04 29.36 137.72
Australia 10.31 20.06 44.84 1.74 3.57 13.28 27.66 128.30
Austria 10.44 20.81 46.82 1.79 3.60 13.26 27.72 135.90
Belgium 10.34 20.68 45.04 1.73 3.60 13.22 27.45 129.95
Bermuda 10.28 20.58 45.91 1.80 3.75 14.68 30.55 146.62
Brazil 10.22 20.43 45.21 1.73 3.66 13.62 28.62 133.13
Burma 10.64 21.52 48.30 1.80 3.85 14.45 30.28 139.95
Canada 10.17 20.22 45.68 1.76 3.63 13.55 28.09 130.15
Chile 10.34 20.80 46.20 1.79 3.71 13.61 29.30 134.03
China 10.51 21.04 47.30 1.81 3.73 13.90 29.13 133.53
Columbia 10.43 21.05 46.10 1.82 3.74 13.49 27.88 131.35
Cook Islands 12.18 23.20 52.94 2.02 4.24 16.70 35.38 164.70
Costa Rica 10.94 21.90 48.66 1.87 3.84 14.03 28.81 136.58
Czechoslovakia 10.35 20.65 45.64 1.76 3.58 13.42 28.19 134.32
Denmark 10.56 20.52 45.89 1.78 3.61 13.50 28.11 130.78
Dominican Republic 10.14 20.65 46.80 1.82 3.82 14.91 31.45 154.12
Finland 10.43 20.69 45.49 1.74 3.61 13.27 27.52 130.87
France 10.11 20.38 45.28 1.73 3.57 13.34 27.97 132.30
East Germany 10.12 20.33 44.87 1.73 3.56 13.17 27.42 129.92
West Germany 10.16 20.37 44.50 1.73 3.53 13.21 27.61 132.23
United Kingdom 10.11 20.21 44.93 1.70 3.51 13.01 27.51 129.13
Greece 10.22 20.71 46.56 1.78 3.64 14.59 28.45 134.60
Guatemala 10.98 21.82 48.40 1.89 3.80 14.16 30.11 139.33
Hungary 10.26 20.62 46.02 1.77 3.62 13.49 28.44 132.58
India 10.60 21.42 45.73 1.76 3.73 13.77 28.81 131.98
Indonesia 10.59 21.49 47.80 1.84 3.92 14.73 30.79 148.83
Ireland 10.61 20.96 46.30 1.79 3.56 13.32 27.81 132.35
Israel 10.71 21.00 47.80 1.77 3.72 13.66 28.93 137.55
Italy 10.01 19.72 45.26 1.73 3.60 13.23 27.52 131.08
Japan 10.34 20.81 45.86 1.79 3.64 13.41 27.72 128.63
Kenya 10.46 20.66 44.92 1.73 3.55 13.10 27.38 129.75
South Korea 10.34 20.89 46.90 1.79 3.77 13.96 29.23 136.25
North Korea 10.91 21.94 47.30 1.85 3.77 14.13 29.67 130.87
Luxembourg 10.35 20.77 47.40 1.82 3.67 13.64 29.08 141.27
Malaysia 10.40 20.92 46.30 1.82 3.80 14.64 31.01 154.10
Mauritius 11.19 22.45 47.70 1.88 3.83 15.06 31.77 152.23
Mexico 10.42 21.30 46.10 1.80 3.65 13.46 27.95 129.20
Netherlands 10.52 20.95 45.10 1.74 3.62 13.36 27.61 129.02
New Zealand 10.51 20.88 46.10 1.74 3.54 13.21 27.70 128.98
Norway 10.55 21.16 46.71 1.76 3.62 13.34 27.69 131.48
Papua New Guinea 10.96 21.78 47.90 1.90 4.01 14.72 31.36 148.22
Philippines 10.78 21.64 46.24 1.81 3.83 14.74 30.64 145.27
Poland 10.16 20.24 45.36 1.76 3.60 13.29 27.89 131.58
Portugal 10.53 21.17 46.70 1.79 3.62 13.13 27.38 128.65
Rumania 10.41 20.98 45.87 1.76 3.64 13.25 27.67 132.50
Singapore 10.38 21.28 47.40 1.88 3.89 15.11 31.32 157.77
Spain 10.42 20.77 45.98 1.76 3.55 13.31 27.73 131.57
Sweden 10.25 20.61 45.63 1.77 3.61 13.29 27.94 130.63
Switzerland 10.37 20.46 45.78 1.78 3.55 13.22 27.91 131.20
Taiwan 10.59 21.29 46.80 1.79 3.77 14.07 30.07 139.27
Thailand 10.39 21.09 47.91 1.83 3.84 15.23 32.56 149.90
Turkey 10.71 21.43 47.60 1.79 3.67 13.56 28.58 131.50
USA 9.93 19.75 43.86 1.73 3.53 13.20 27.43 128.22
USSR 10.07 20.00 44.60 1.75 3.59 13.20 27.53 130.55
Western Samoa 10.82 21.86 49.00 2.02 4.24 16.28 34.71 161.83
So I do the following:
running2 <- running[c("USA", "New Zealand", "Dominican Republic", "Western Samoa", "Cook Islands"), ]
I check running2 and it looks like this:
list(running2)
[[1]]
X100m X200m X400m X800m X1500m X5K X10K Marathon
USA 9.93 19.75 43.86 1.73 3.53 13.20 27.43 128.22
New Zealand 10.51 20.88 46.10 1.74 3.54 13.21 27.70 128.98
Dominican Republic 10.14 20.65 46.80 1.82 3.82 14.91 31.45 154.12
Western Samoa 10.82 21.86 49.00 2.02 4.24 16.28 34.71 161.83
Cook Islands 12.18 23.20 52.94 2.02 4.24 16.70 35.38 164.70
I then ask to plot the first component vs X100m as follows:
plot(running$X100m, running.pca$scores[,1])
This works without problems, but when I try to highlight the running2 points
I get the following:
points(running2$X100m, running.pca$scores[,1], col="red")
Error in xy.coords(x, y) : x and y lengths differ
How can I get the programme to highlight the 5 countries in red, with the
remainder in black?
I have checked the running.pca$scores data:
Comp.1 Comp.2 Comp.3 Comp.4
Argentina -0.04924775 -0.465091996 -0.1569462564 -0.005810845
Australia 1.90192176 0.101049166 -0.0120464104 0.651816682
Austria 0.04010907 -0.163884583 -0.3055014673 0.016136753
Belgium 1.37253647 0.587803868 0.1488158699 -0.019595422
Bermuda 0.69426608 -0.493030587 0.1593950774 0.120074355
Brazil 1.68418949 0.214898184 -0.0733240991 -0.003162787
Burma -1.40707421 0.188937600 -0.6623424285 -0.527215158
Canada 1.52735698 -0.404836611 -0.2142964494 0.176168719
Chile 0.40862535 -0.212618765 0.0408667861 -0.078681885
China -0.56552586 -0.223429359 -0.2928094868 -0.086596928
Columbia -0.11564898 -0.226977793 0.4589401052 -0.048022658
Cook Islands -8.27371262 0.384947623 -0.7357421902 0.801461946
Costa Rica -2.80544713 0.066593276 -0.1383029607 -0.147987785
Czechoslovakia 0.94084659 0.146421855 0.0317629603 0.063414890
Denmark 0.50127535 0.141213477 -0.0481367885 0.642995929
Dominican Republic 0.37381694 -1.061799496 -0.1213552291 -0.266094008
Finland 1.00082548 0.544083222 -0.0258548880 0.117235365
France 1.85734404 -0.004344636 -0.1282195202 -0.168095452
East Germany 2.02630668 0.059699745 0.0679807810 -0.039807436
West Germany 2.06659223 0.217657596 0.2898976618 0.043722768
United Kingdom 2.34631901 0.287007920 -0.2631249772 -0.052175961
Greece 0.60350129 -0.417937938 -0.2722470809 -0.297147859
Guatemala -2.86229181 -0.084782032 0.1092307154 0.124678037
Hungary 0.88380469 -0.196275223 -0.1074953145 -0.089263972
India -0.06219546 0.987010331 0.3912982329 -0.297843161
Indonesia -1.44463975 -0.248819611 -0.0905110184 -0.370562155
Ireland -0.14120447 0.296136604 0.0471511543 0.249279625
Israel -0.68753510 0.413206648 -0.9205920726 0.116497574
Italy 2.53306911 -0.552500188 -0.4802520770 0.350811975
Japan 0.51968949 -0.142260162 0.2336252451 -0.043102377
Kenya 1.25823679 0.791008473 0.1904824736 0.246021878
South Korea 0.09196164 -0.291847998 -0.2933387263 -0.270244999
North Korea -2.16401990 0.517221912 0.4818608282 -0.138369205
Luxembourg -0.23310298 -0.767690670 -0.4063507475 -0.077338384
Malaysia -0.03915668 -0.390104541 0.2784969875 0.006861838
Mauritius -3.34291107 0.866944770 0.7514511384 -0.093101352
Mexico -0.14612335 0.122630574 0.4475266976 -0.410199703
Netherlands 0.80128457 0.916525709 0.3262696094 0.063077066
New Zealand 0.52127592 0.669474301 -0.2625746371 -0.017136425
Norway -0.12659992 0.566884546 -0.2898334704 -0.248028150
Papua New Guinea -2.70378288 -0.154472681 0.4409443971 0.235409736
Philippines -1.05941793 0.766237007 0.6010607266 -0.071473736
Poland 1.63781408 -0.348361281 -0.0257919744 0.179337045
Portugal -0.33352564 0.216740757 -0.0452993297 -0.181755129
Rumania 0.51161806 0.394841278 0.0856325451 -0.206992718
Singapore -1.14434377 -1.068790135 0.3417011592 -0.338799596
Spain 0.62586977 0.265405133 -0.0949525813 0.021691027
Sweden 1.04263923 -0.144339064 0.1025328504 -0.044472795
Switzerland 0.86021420 -0.178080230 -0.0009573433 0.361336836
Taiwan -0.55013518 0.365172050 -0.0388626904 -0.209802871
Thailand -0.80074552 -0.718955765 -0.4329859758 -0.375276203
Turkey -1.11381623 0.489001441 -0.4128057565 -0.240581646
USA 3.11410100 -0.397644307 0.3158182132 0.357347990
USSR 2.30107089 -0.382390458 0.1891648810 0.330762873
Western Samoa -3.87627808 -1.843488955 0.8209468511 0.188597852
I think the problem is that running2 has only 5 rows while
running.pca$scores has 55, so the x and y vectors passed to points() differ
in length.
Can anyone help here?
Brett Stansfield
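
One possible fix, sketched under stated assumptions: the error arises because
points() receives 5 x values and 55 y values, so the scores need to be
subset by the same row names as the data. The sketch below assumes that
running.pca was created with princomp() on running (the post does not show
that call, and cor = TRUE is a guess), so that the rows of
running.pca$scores carry the country names. It uses only an eight-country,
three-event excerpt of the data so that it runs on its own; highlight, x.sub
and y.sub are illustrative names.

```r
# Small excerpt of the data from the post (three events, eight countries),
# used here only so the sketch is self-contained.
running <- data.frame(
  X100m = c(10.39, 10.31, 12.18, 10.14, 10.51, 9.93, 10.07, 10.82),
  X200m = c(20.81, 20.06, 23.20, 20.65, 20.88, 19.75, 20.00, 21.86),
  X400m = c(46.84, 44.84, 52.94, 46.80, 46.10, 43.86, 44.60, 49.00),
  row.names = c("Argentina", "Australia", "Cook Islands",
                "Dominican Republic", "New Zealand", "USA", "USSR",
                "Western Samoa")
)

# Assumption: the PCA was fitted with princomp(); cor = TRUE is a guess.
running.pca <- princomp(running, cor = TRUE)

# Countries to highlight, as listed in the post.
highlight <- c("USA", "New Zealand", "Dominican Republic",
               "Western Samoa", "Cook Islands")

# Subset x and y by the same row names so both have length 5.
x.sub <- running[highlight, "X100m"]
y.sub <- running.pca$scores[highlight, 1]

# All countries in black, then the selected ones overplotted in red.
plot(running$X100m, running.pca$scores[, 1],
     xlab = "X100m", ylab = "Comp.1")
points(x.sub, y.sub, col = "red", pch = 19)
```

An alternative that avoids the second call is to pass a colour vector to
plot() directly, for example
col = ifelse(rownames(running) %in% highlight, "red", "black").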