[R] by gives no results, gives warning that data are non-numeric, but the data appears to be numeric.
William Dunlap
wdunlap at tibco.com
Mon Dec 28 06:54:50 CET 2015
by(dataFrame, groupId, FUN) applies FUN to a bunch of data.frames (row subsets
of the dataFrame input), not to the individual columns. mean() has no
data.frame method, so mean.default() warns and returns NA when it is handed a
data.frame, even one whose columns are all numeric. You could use
FUN=colMeans if you wanted column means, or FUN=function(x)mean(colMeans(x))
or FUN=function(x)mean(unlist(x)) if you wanted some version of a grand mean
over all the columns.
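
As a rough illustration of the difference, here is a minimal sketch using the
built-in mtcars data as a stand-in for the poster's data. The names d, g, m,
col_means and grand are made up for this example; d plays the role of 'hold'
and g the role of 'Arm'.

```r
# Stand-ins (assumptions): three numeric columns and a grouping factor.
d <- mtcars[, c("disp", "hp", "drat")]
g <- factor(mtcars$cyl)

# mean() on a whole data.frame dispatches to mean.default(), which warns
# and returns NA; the warning is suppressed here only to keep output quiet.
m <- suppressWarnings(mean(d))

# colMeans() accepts a data.frame and returns one mean per column,
# so by() yields a named numeric vector for each group.
col_means <- by(d, g, colMeans)

# One grand mean per group, pooling the values of all columns.
grand <- by(d, g, function(x) mean(unlist(x)))

print(col_means)
print(grand)
```

Extra arguments to by() are passed on to FUN, so by(d, g, colMeans, na.rm=TRUE)
should behave the way the original mean(..., na.rm=TRUE) call was meant to.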
If you want column means, you may find aggregate() more suited to the job,
as it applies FUN to each column in each row subset of the data and returns
a data.frame instead of a list of outputs of FUN.
> aggregate(mtcars[,3:5], mtcars[,2,drop=FALSE], mean)
  cyl     disp        hp     drat
1   4 105.1364  82.63636 4.070909
2   6 183.3143 122.28571 3.585714
3   8 353.1000 209.21429 3.229286
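
The same table can also be requested by column name instead of by position.
The sketch below shows two equivalent spellings on mtcars; by_cyl and by_cyl2
are names invented for this example, and the commented-out last line is only
a guess at how the call would look for the objects in the question.

```r
# Formula interface: aggregate the columns on the left, grouped by cyl.
by_cyl <- aggregate(cbind(disp, hp, drat) ~ cyl, data = mtcars, FUN = mean)

# Data-frame interface with a named list, so the grouping column
# is labelled "cyl" in the result.
by_cyl2 <- aggregate(mtcars[, c("disp", "hp", "drat")],
                     by = list(cyl = mtcars$cyl), FUN = mean)

print(by_cyl)

# For the data in the question the analogous call would presumably be
# (untested here, since 'hold' and 'Arm' are not available):
# aggregate(hold, list(Arm = Arm), mean)
```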
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Sun, Dec 27, 2015 at 6:55 PM, John Sorkin <jsorkin at grecc.umaryland.edu>
wrote:
> When I run by, I get an error message and no results. Any help in
> understanding what is wrong would be appreciated.
>
> Error message:
> Warning messages:
> 1: In mean.default(data[x, , drop = FALSE], ...) :
> argument is not numeric or logical: returning NA
> 2: In mean.default(data[x, , drop = FALSE], ...) :
> argument is not numeric or logical: returning NA
>
>
> Results:
> Arm: MUFA
> [1] NA
>
> -----------------------------------------------------------------------------------------------------------------------
> Arm: PUFA
> [1] NA
>
> Code:
> by(hold,Arm,mean,na.rm=TRUE)
>
> I don't understand why I am getting the error message, and why I am not
> getting any results. I don't believe my data are non-numeric.
>
> BY str works fine and confirms that the data are numeric
> > by(hold,Arm,str)
> 'data.frame': 23 obs. of 3 variables:
> $ Wtscr: num 97.2 103.9 58.2 130.9 135 ...
> $ Wt0 : num 96.2 106.1 56.7 127.4 133.1 ...
> $ Wt6 : num 93.8 101.7 55.5 127.6 130.9 ...
> 'data.frame': 16 obs. of 3 variables:
> $ Wtscr: num 120.2 104.6 100.1 74.8 112.6 ...
> $ Wt0 : num 117.2 105.3 99.5 75.7 110.7 ...
> $ Wt6 : num 114.6 104.8 84.5 77.7 107.4 ...
> Here is a listing of my data:
> > hold
> Wtscr Wt0 Wt6
> 1 120.2 117.2 114.60
> 2 104.6 105.3 104.80
> 3 97.2 96.2 93.80
> 4 103.9 106.1 101.70
> 5 58.2 56.7 55.50
> 6 130.9 127.4 127.60
> 7 135.0 133.1 130.90
> 8 100.1 99.5 84.50
> 9 130.3 115.3 115.80
> 10 150.5 148.7 133.40
> 11 74.8 75.7 77.70
> 12 112.6 110.7 107.40
> 13 90.0 91.0 83.40
> 14 139.1 138.5 126.70
> 15 99.1 96.4 95.70
> 16 108.3 107.5 109.30
> 17 75.1 72.9 72.20
> 18 97.5 102.1 98.50
> 19 202.2 90.1 90.60
> 20 91.7 89.4 93.40
> 21 102.1 102.2 100.80
> 22 116.9 118.9 118.00
> 23 94.6 95.3 90.30
> 24 122.2 117.0 117.00
> 25 105.6 103.3 103.60
> 26 96.9 96.8 98.80
> 27 102.9 100.3 89.00
> 28 115.8 118.5 117.30
> 29 95.7 96.2 95.40
> 30 88.2 86.9 88.30
> 31 108.7 108.8 108.80
> 32 89.2 88.6 81.20
> 33 86.8 86.5 82.70
> 34 135.5 130.1 125.40
> 35 112.5 113.9 111.45
> 36 111.0 105.3 109.50
> 37 103.4 100.5 95.50
> 38 117.6 117.4 101.40
> 39 116.7 118.5 101.80
>
> The INDEX is clearly a factor:
> > Arm
> [1] PUFA PUFA MUFA MUFA MUFA MUFA MUFA PUFA MUFA MUFA PUFA PUFA PUFA MUFA
> MUFA PUFA PUFA PUFA MUFA MUFA MUFA MUFA MUFA PUFA MUFA MUFA PUFA MUFA PUFA
> MUFA PUFA
> [32] MUFA MUFA MUFA MUFA MUFA PUFA PUFA PUFA
> Levels: MUFA PUFA
>
> The data and the index have the same length:
> > cbind(hold,Arm)
> Wtscr Wt0 Wt6 Arm
> 1 120.2 117.2 114.60 PUFA
> 2 104.6 105.3 104.80 PUFA
> 3 97.2 96.2 93.80 MUFA
> 4 103.9 106.1 101.70 MUFA
> 5 58.2 56.7 55.50 MUFA
> 6 130.9 127.4 127.60 MUFA
> 7 135.0 133.1 130.90 MUFA
> 8 100.1 99.5 84.50 PUFA
> 9 130.3 115.3 115.80 MUFA
> 10 150.5 148.7 133.40 MUFA
> 11 74.8 75.7 77.70 PUFA
> 12 112.6 110.7 107.40 PUFA
> 13 90.0 91.0 83.40 PUFA
> 14 139.1 138.5 126.70 MUFA
> 15 99.1 96.4 95.70 MUFA
> 16 108.3 107.5 109.30 PUFA
> 17 75.1 72.9 72.20 PUFA
> 18 97.5 102.1 98.50 PUFA
> 19 202.2 90.1 90.60 MUFA
> 20 91.7 89.4 93.40 MUFA
> 21 102.1 102.2 100.80 MUFA
> 22 116.9 118.9 118.00 MUFA
> 23 94.6 95.3 90.30 MUFA
> 24 122.2 117.0 117.00 PUFA
> 25 105.6 103.3 103.60 MUFA
> 26 96.9 96.8 98.80 MUFA
> 27 102.9 100.3 89.00 PUFA
> 28 115.8 118.5 117.30 MUFA
> 29 95.7 96.2 95.40 PUFA
> 30 88.2 86.9 88.30 MUFA
> 31 108.7 108.8 108.80 PUFA
> 32 89.2 88.6 81.20 MUFA
> 33 86.8 86.5 82.70 MUFA
> 34 135.5 130.1 125.40 MUFA
> 35 112.5 113.9 111.45 MUFA
> 36 111.0 105.3 109.50 MUFA
> 37 103.4 100.5 95.50 PUFA
> 38 117.6 117.4 101.40 PUFA
> 39 116.7 118.5 101.80 PUFA
>
> But the by function does not work!
> > by(hold,Arm,mean,na.rm=TRUE)
> Arm: MUFA
> [1] NA
>
> -----------------------------------------------------------------------------------------------------------------------
> Arm: PUFA
> [1] NA
> Warning messages:
> 1: In mean.default(data[x, , drop = FALSE], ...) :
> argument is not numeric or logical: returning NA
> 2: In mean.default(data[x, , drop = FALSE], ...) :
> argument is not numeric or logical: returning NA
>
>
> Perhaps this is a hint, print does not give two separate groups:
> > by(hold,Arm,print)
> Wtscr Wt0 Wt6
> 3 97.2 96.2 93.80
> 4 103.9 106.1 101.70
> 5 58.2 56.7 55.50
> 6 130.9 127.4 127.60
> 7 135.0 133.1 130.90
> 9 130.3 115.3 115.80
> 10 150.5 148.7 133.40
> 14 139.1 138.5 126.70
> 15 99.1 96.4 95.70
> 19 202.2 90.1 90.60
> 20 91.7 89.4 93.40
> 21 102.1 102.2 100.80
> 22 116.9 118.9 118.00
> 23 94.6 95.3 90.30
> 25 105.6 103.3 103.60
> 26 96.9 96.8 98.80
> 28 115.8 118.5 117.30
> 30 88.2 86.9 88.30
> 32 89.2 88.6 81.20
> 33 86.8 86.5 82.70
> 34 135.5 130.1 125.40
> 35 112.5 113.9 111.45
> 36 111.0 105.3 109.50
> Wtscr Wt0 Wt6
> 1 120.2 117.2 114.6
> 2 104.6 105.3 104.8
> 8 100.1 99.5 84.5
> 11 74.8 75.7 77.7
> 12 112.6 110.7 107.4
> 13 90.0 91.0 83.4
> 16 108.3 107.5 109.3
> 17 75.1 72.9 72.2
> 18 97.5 102.1 98.5
> 24 122.2 117.0 117.0
> 27 102.9 100.3 89.0
> 29 95.7 96.2 95.4
> 31 108.7 108.8 108.8
> 37 103.4 100.5 95.5
> 38 117.6 117.4 101.4
> 39 116.7 118.5 101.8
> Arm: MUFA
> Wtscr Wt0 Wt6
> 3 97.2 96.2 93.80
> 4 103.9 106.1 101.70
> 5 58.2 56.7 55.50
> 6 130.9 127.4 127.60
> 7 135.0 133.1 130.90
> 9 130.3 115.3 115.80
> 10 150.5 148.7 133.40
> 14 139.1 138.5 126.70
> 15 99.1 96.4 95.70
> 19 202.2 90.1 90.60
> 20 91.7 89.4 93.40
> 21 102.1 102.2 100.80
> 22 116.9 118.9 118.00
> 23 94.6 95.3 90.30
> 25 105.6 103.3 103.60
> 26 96.9 96.8 98.80
> 28 115.8 118.5 117.30
> 30 88.2 86.9 88.30
> 32 89.2 88.6 81.20
> 33 86.8 86.5 82.70
> 34 135.5 130.1 125.40
> 35 112.5 113.9 111.45
> 36 111.0 105.3 109.50
>
> -----------------------------------------------------------------------------------------------------------------------
> Arm: PUFA
> Wtscr Wt0 Wt6
> 1 120.2 117.2 114.6
> 2 104.6 105.3 104.8
> 8 100.1 99.5 84.5
> 11 74.8 75.7 77.7
> 12 112.6 110.7 107.4
> 13 90.0 91.0 83.4
> 16 108.3 107.5 109.3
> 17 75.1 72.9 72.2
> 18 97.5 102.1 98.5
> 24 122.2 117.0 117.0
> 27 102.9 100.3 89.0
> 29 95.7 96.2 95.4
> 31 108.7 108.8 108.8
> 37 103.4 100.5 95.5
> 38 117.6 117.4 101.4
> 39 116.7 118.5 101.8
>
> But summary works as expected, giving two groups of results!
> > by(hold,Arm,summary)
> Arm: MUFA
> Wtscr Wt0 Wt6
> Min. : 58.20 Min. : 56.7 Min. : 55.5
> 1st Qu.: 95.75 1st Qu.: 92.7 1st Qu.: 92.0
> Median :105.60 Median :103.3 Median :101.7
> Mean :112.75 Mean :106.3 Mean :104.0
> 3rd Qu.:130.60 3rd Qu.:118.7 3rd Qu.:117.7
> Max. :202.20 Max. :148.7 Max. :133.4
>
> -----------------------------------------------------------------------------------------------------------------------
> Arm: PUFA
> Wtscr Wt0 Wt6
> Min. : 74.80 Min. : 72.90 Min. : 72.20
> 1st Qu.: 97.05 1st Qu.: 98.67 1st Qu.: 87.88
> Median :104.00 Median :103.70 Median : 99.95
> Mean :103.15 Mean :102.54 Mean : 97.58
> 3rd Qu.:113.62 3rd Qu.:112.28 3rd Qu.:107.75
> Max. :122.20 Max. :118.50 Max. :117.00
>
> BY also shows that there are no NAs in the data, and the BY works properly.
> > by(hold,Arm,is.na)
> Arm: MUFA
> Wtscr
>
> John David Sorkin M.D., Ph.D.
> Professor of Medicine
> Chief, Biostatistics and Informatics
> University of Maryland School of Medicine Division of Gerontology and
> Geriatric Medicine
> Baltimore VA Medical Center
> 10 North Greene Street
> GRECC (BT/18/GR)
> Baltimore, MD 21201-1524
> (Phone) 410-605-7119
> (Fax) 410-605-7913 (Please call phone number above prior to faxing)