Avi Gross
Sun Sep 5 01:58:53 CEST 2021
Abou,
I believe I addressed this issue in a private message the other day.
As a general rule, truncating can leave a remainder. If
M = length(whatever)/3
then M is not necessarily an integer. Its fractional part can be .333... or .666... as well as 0.
Now R may silently truncate something like 100/3, which you seem to use, when it appears as an index or as the end of a range such as 1:M, and treat it as if you had typed 33. The same goes for 2*M. In your code you used integer division, and that is a truncation too!
m1 <- n1 %/% 3
s1 <- sample(1:n1, n1)
group1.IDs <- sample1.IDs[s1[1:m1]]
group2.IDs <- sample1.IDs[s1[(m1+1):(2*m1)]]
group3.IDs <- sample1.IDs[s1[(m1*2+1):(3*m1)]]
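To make the truncation concrete, here is a minimal sketch using the three sample sizes from the question. The variable names n, m and r are only for illustration; they are not in the original code.

```r
# Sample sizes taken from the question (n1, n2, n3)
n <- c(204, 112, 284)

m <- n %/% 3   # integer division: the common group size, 68 37 94
r <- n %% 3    # remainder: items that indexing up to 3*m never reaches, 0 1 2

# One row per sample: how many items are used and how many are dropped
data.frame(n = n, m = m, used = 3 * m, leftover = r)

# A non-integer upper bound in a range is truncated silently as well
length(1:(100/3))   # 33
```

So the first sample loses nothing, the second loses one item, and the third loses two, which matches the group sizes reported in the question.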
A proper solution accounts for any leftover items. One method is to leave all extra items till the end and have:
MAX <- length(s1)  # the full sample size, n1 here
group3.IDs <- sample1.IDs[s1[(m1*2+1):MAX]]
The last group might then have one or two extra items. Another method is a second sweep: take the leftover items and move one each into whichever groups you wish, for some balance.
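Both ideas can be sketched in a few lines. This is only one possible way to write it, not a definitive implementation: split_ids and its balance argument are hypothetical names made up for this example, and handing the leftovers to the last groups is an arbitrary choice.

```r
# Hypothetical helper: split IDs at random into k groups without dropping any.
split_ids <- function(ids, k = 3, balance = TRUE) {
  n <- length(ids)
  m <- n %/% k                     # base group size
  r <- n %% k                      # leftover items (0 to k - 1)
  if (balance) {
    # second sweep: one leftover each for the last r groups
    sizes <- rep(m, k) + rep(c(0, 1), times = c(k - r, r))
  } else {
    # leave all leftovers to the final group
    sizes <- c(rep(m, k - 1), m + r)
  }
  shuffled <- ids[sample.int(n)]   # random permutation of the IDs
  labels <- factor(rep(seq_len(k), times = sizes), levels = seq_len(k))
  # Return a list: cbind() would recycle values to pad unequal groups
  unname(split(shuffled, labels))
}

lengths(split_ids(1:112))                    # 37 37 38
lengths(split_ids(1:284))                    # 94 95 95
lengths(split_ids(1:284, balance = FALSE))   # 94 94 96
```

The groups come back as a list rather than a matrix because their lengths can differ. Calling cbind() on unequal vectors would repeat some IDs to fill the shorter columns.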
Or, as discussed, there are packages available that let you specify percentages you want and handle these edge cases too.
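If you would rather not depend on a package, the percentage idea can be approximated in base R. The sketch below is an assumption about how one might do it, not what any particular package does; sizes_from_props is a hypothetical name. It rounds the cumulative cut points rather than the individual sizes, so the sizes always add up to n.

```r
# Hypothetical helper: turn target proportions into group sizes that sum to n.
sizes_from_props <- function(n, props) {
  stopifnot(abs(sum(props) - 1) < 1e-8)   # proportions must sum to 1
  bounds <- round(cumsum(props) * n)      # cumulative cut points, the last is n
  diff(c(0, bounds))                      # sizes are the gaps between cut points
}

ids <- sample(1:266, 112)                 # a sample like the second one in the question
sizes <- sizes_from_props(length(ids), c(1/3, 1/3, 1/3))
groups <- split(sample(ids), rep(1:3, times = sizes))

sum(sizes) == length(ids)                 # TRUE: no item is dropped
```

Which group receives the extra item depends on where the rounding falls, so treat the exact sizes as an outcome of this particular rounding rule.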
Dear Thomas:
Thank you very much for your input in this matter.
The core part of this R code(s) (please see below) was written by *Richard O'Keefe*. I had three examples with different sample sizes.
*First sample of size n1 = 204* divided randomly into three groups of sizes 68. *No problems with this one*.
*The second sample of size n2 = 112* divided randomly into three groups of sizes 37, 37, and 38. BUT this R code generated three groups of equal sizes (37, 37, and 37). *How to fix the code to make sure that the output will be three groups of sizes 37, 37, and 38*.
*The third sample of size n3 = 284* divided randomly into three groups of sizes 94, 95, and 95. BUT this R code generated three groups of equal sizes (94, 94, and 94). *Again, how to fix the code to make sure that the output will be three groups of sizes 94, 95, and 95*.
With many thanks
abou
########### ------------------------ #############
N1 <- 485
population1.IDs <- seq(1, N1, by = 1)
#### population1.IDs
n1<-204 ##### in this case the size of each group of the three groups = 68
sample1.IDs <- sample(population1.IDs,n1) #### sample1.IDs
#### n1 <- length(sample1.IDs)
m1 <- n1 %/% 3
s1 <- sample(1:n1, n1)
group1.IDs <- sample1.IDs[s1[1:m1]]
group2.IDs <- sample1.IDs[s1[(m1+1):(2*m1)]]
group3.IDs <- sample1.IDs[s1[(m1*2+1):(3*m1)]]
groups.IDs <-cbind(group1.IDs,group2.IDs,group3.IDs)
groups.IDs
####### --------------------------
N2 <- 266
population2.IDs <- seq(1, N2, by = 1)
#### population2.IDs
n2<-112 ##### in this case the sizes of the three groups are (37, 37, and 38)
##### BUT this code generates three groups of equal sizes (37, 37, and 37)
sample2.IDs <- sample(population2.IDs,n2) #### sample2.IDs
#### n2 <- length(sample2.IDs)
m2 <- n2 %/% 3
s2 <- sample(1:n2, n2)
group1.IDs <- sample2.IDs[s2[1:m2]]
group2.IDs <- sample2.IDs[s2[(m2+1):(2*m2)]]
group3.IDs <- sample2.IDs[s2[(m2*2+1):(3*m2)]]
groups.IDs <-cbind(group1.IDs,group2.IDs,group3.IDs)
groups.IDs
####### --------------------------
N3 <- 674
population3.IDs <- seq(1, N3, by = 1)
#### population3.IDs
n3<-284 ##### in this case the sizes of the three groups are (94, 95, and 95)
##### BUT this code generates three groups of equal sizes (94, 94, and 94)
sample3.IDs <- sample(population3.IDs,n3) #### sample3.IDs
#### n3 <- length(sample3.IDs)
m3 <- n3 %/% 3
s3 <- sample(1:n3, n3)
group1.IDs <- sample3.IDs[s3[1:m3]]
group2.IDs <- sample3.IDs[s3[(m3+1):(2*m3)]]
group3.IDs <- sample3.IDs[s3[(m3*2+1):(3*m3)]]
groups.IDs <-cbind(group1.IDs,group2.IDs,group3.IDs)
groups.IDs
