[R] Selecting elements
PIKAL Petr
petr.pikal using precheza.cz
Mon Aug 23 10:05:30 CEST 2021
Hi
Only I got your HTML-formatted mail; the rest of the world got a complete mess. Do not use HTML formatting.
If I got it right, I wonder why in your second example you did not follow
3A - 3B - 2C - 2D
as D was positioned 1st and 4th.
I hope that you could use something like
sss <- split(data$Var.2, data$Var.1)
lapply(sss, cumsum)
$A
[1] 38 73 105 136 166 188 199 207 209 210
$B
[1] 39 67 92 115 131 146 153 159 164 168
$C
[1] 40 76 105 131 152 171 189 203 213 222
$D
[1] 37 71 104 131 155 175 192 205 217 220
Now you need to evaluate this result according to your sets. Here the highest value (76) is in C, so the set with 2C is the one you should choose, and you select your values according to this set.
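One way to evaluate the cumulative sums against the sets is to read off, for each group, the cumulative sum at the count a set asks for and add these up. The sketch below does that for the set.seed(123) values quoted further down in this thread. The three sets are taken from the labels in the quoted tables, and picking the set with the largest total is only an assumed criterion, not something stated in the question; `sets` and `set_total` are names made up for this illustration.

```r
# Per-group values in decreasing order, copied from the split() output
# for set.seed(123) quoted later in this thread.
sss <- list(
  A = c(38, 35, 32, 31, 30, 22, 11, 8, 2, 1),
  B = c(39, 28, 25, 23, 16, 15, 7, 6, 5, 4),
  C = c(40, 36, 29, 26, 21, 19, 18, 14, 10, 9),
  D = c(37, 34, 33, 27, 24, 20, 17, 13, 12, 3)
)

# Candidate sets (assumption: the three labels shown in the quoted tables).
sets <- list(
  "3A-3B-2C-2D" = c(A = 3, B = 3, C = 2, D = 2),
  "3A-3B-1C-3D" = c(A = 3, B = 3, C = 1, D = 3),
  "2A-5B-0C-3D" = c(A = 2, B = 5, C = 0, D = 3)
)

cs <- lapply(sss, cumsum)

# Total of the n highest values in each group; a count of 0 contributes 0.
set_total <- function(quota) {
  sum(mapply(function(g, n) if (n > 0) cs[[g]][n] else 0,
             names(quota), quota))
}

totals <- sapply(sets, set_total)
totals
names(which.max(totals))
```

For these values the totals are 344, 341 and 308, so under this assumed criterion the 3A - 3B - 2C - 2D set comes out first.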
With
> set.seed(666)
> Var.1 = rep(LETTERS[1:4], 10)
> Var.2 = sample(1:40, replace=FALSE)
> data = data.frame(Var.1, Var.2)
> data <- data[order(data$Var.2, decreasing=TRUE), ]
> sss <- split(data$Var.2, data$Var.1)
> lapply(sss, cumsum)
$A
[1] 36 70 102 133 163 182 200 207 212 213
$B
[1] 35 57 78 95 108 120 131 140 148 150
$C
[1] 40 73 102 130 156 180 196 211 221 225
$D
[1] 39 77 114 141 166 189 209 223 229 232
The highest value is in D, so either 3A - 3B - 2C - 2D or 3A - 3B - 2C - 2D should be appropriate. And here I am lost again, as both sets are the same. Maybe you need to reconsider your statements.
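Another possible reading of the rule, inferred from the two tables in the quoted message below rather than stated there, is a greedy walk: go down the sorted values and keep a row whenever the counts so far still fit inside at least one of the sets. The sketch below tries this on the first 20-row table. The sets are assumed to be the three labels shown in that table, and `greedy_select` is a hypothetical helper written for this illustration.

```r
# First 20-row example from the quoted message, sorted by Var.2 (decreasing).
d <- data.frame(
  Var.1 = c("C", "B", "A", "D", "C", "A", "D", "D", "A", "A",
            "A", "C", "B", "D", "C", "B", "D", "B", "A", "C"),
  Var.2 = 40:21
)

# Allowed sets, one per row (assumed from the labels in the quoted tables).
sets <- matrix(c(3, 3, 2, 2,
                 3, 3, 1, 3,
                 2, 5, 0, 3),
               ncol = 4, byrow = TRUE,
               dimnames = list(NULL, c("A", "B", "C", "D")))

greedy_select <- function(d, sets, n = 10) {
  counts <- setNames(numeric(ncol(sets)), colnames(sets))
  keep <- logical(nrow(d))
  for (i in seq_len(nrow(d))) {
    g <- as.character(d$Var.1[i])
    trial <- counts
    trial[g] <- trial[g] + 1
    # Accept the row only if at least one set still has room for these counts.
    if (any(apply(sets, 1, function(s) all(trial <= s)))) {
      counts <- trial
      keep[i] <- TRUE
    }
    if (sum(keep) == n) break
  }
  d[keep, ]
}

sel <- greedy_select(d, sets)
sel$Var.2
table(sel$Var.1)
```

On this table the walk keeps 40 39 38 37 36 35 34 32 28 25, that is 3A - 3B - 2C - 2D, which agrees with the rows marked as selected there. Worked by hand on the second table, the same walk ends at 3A - 3B - 1C - 3D.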
Cheers
Petr
From: Silvano Cesar da Costa <silvano using uel.br>
Sent: Friday, August 20, 2021 9:28 PM
To: PIKAL Petr <petr.pikal using precheza.cz>
Cc: r-help using r-project.org
Subject: Re: [R] Selecting elements
Hi, thank you for the answer.
Sorry, English is not my native language.
But you got it right.
> As C is first and fourth biggest value, you follow third option and select 3 highest A, 3B 2C and 2D?
I must select the 10 (not 15) highest values, but they must follow one of these configurations:
3A - 3B - 2C - 2D or
2A - 5B - 0C - 3D or
3A - 3B - 2C - 2D
I'll put the example in Excel for a better understanding (with 20 elements only).
I must select 10 elements (the highest values of variable Var.2), which fit one of the 3 options above.
Number  Position  Var.1  Var.2
     1        27  C         40
     2        30  B         39
     3         5  A         38
     4        16  D         37
     5        23  C         36
     6        13  A         35
     7        20  D         34
     8        12  D         33
     9         9  A         32
    10         1  A         31
    11        21  A         30
    12        35  C         29
    13        14  B         28
    14         8  D         27
    15         7  C         26
    16         6  B         25
    17        40  D         24
    18        26  B         23
    19        29  A         22
    20        31  C         21

Selected:

Number  Position  Var.1  Var.2
     1        27  C         40
     2        30  B         39
     3         5  A         38
     4        16  D         37
     5        23  C         36
     6        13  A         35
     7        20  D         34
    10         9  A         32
    13        14  B         28
    17         6  B         25

3A - 3B - 2C - 2D
3A - 3B - 1C - 3D
2A - 5B - 0C - 3D
Second option (other data set):
Number  Position  Var.1  Var.2
     1        36  D         20
     2        11  B         19
     3        39  A         18
     4        24  D         17
     5        34  B         16
     6         2  B         15
     7         3  A         14
     8        32  D         13
     9        28  D         12
    10        25  A         11
    11        19  B         10
    12        15  B          9
    13        17  A          8
    14        18  C          7
    15        38  B          6
    16        10  B          5
    17        22  B          4
    18         4  D          3
    19        33  A          2
    20        37  A          1

Selected:

Number  Position  Var.1  Var.2
     1        36  D         20
     2        11  B         19
     3        39  A         18
     4        24  D         17
     5        34  B         16
     6         2  B         15
     7         3  A         14
     8        32  D         13
     9        25  A         11
    10        18  C          7

3A - 3B - 2C - 2D
3A - 3B - 1C - 3D
2A - 5B - 0C - 3D
How can I select these 10 elements so that they fit one of the 3 options, using R?
Thanks,
Prof. Dr. Silvano Cesar da Costa
Universidade Estadual de Londrina
Centro de Ciências Exatas
Departamento de Estatística
Fone: (43) 3371-4346
On Fri, Aug 20, 2021 at 03:28, PIKAL Petr <mailto:petr.pikal using precheza.cz> wrote:
Hello
I am confused. Maybe others know what you want, but could you be more specific?
Let's say you have data such as
set.seed(123)
Var.1 = rep(LETTERS[1:4], 10)
Var.2 = sample(1:40, replace=FALSE)
data = data.frame(Var.1, Var.2)
What should be the desired outcome?
You can sort
data <- data[order(data$Var.2, decreasing=TRUE), ]
and split the data
> split(data$Var.2, data$Var.1)
$A
[1] 38 35 32 31 30 22 11 8 2 1
$B
[1] 39 28 25 23 16 15 7 6 5 4
$C
[1] 40 36 29 26 21 19 18 14 10 9
$D
[1] 37 34 33 27 24 20 17 13 12 3
to inspect the highest values. But here I am lost. As C is first and fourth biggest value, you follow third option and select 3 highest A, 3B 2C and 2D?
Or I do not understand at all what you really want to achieve.
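If a set such as 3A - 3B - 2C - 2D has already been chosen, taking the highest values per group is a matter of applying head() to each element of the split. A minimal sketch using the values printed above; `quota`, `picked` and `top10` are names chosen for illustration:

```r
# Per-group values in decreasing order, as printed by split() above.
sss <- list(
  A = c(38, 35, 32, 31, 30, 22, 11, 8, 2, 1),
  B = c(39, 28, 25, 23, 16, 15, 7, 6, 5, 4),
  C = c(40, 36, 29, 26, 21, 19, 18, 14, 10, 9),
  D = c(37, 34, 33, 27, 24, 20, 17, 13, 12, 3)
)

# How many values to take from each group (here: 3A - 3B - 2C - 2D).
quota <- c(A = 3, B = 3, C = 2, D = 2)

# head(x, n) keeps the first n values of each already-sorted group.
picked <- Map(head, sss[names(quota)], quota)
picked

# All selected values together, highest first.
top10 <- sort(unlist(picked), decreasing = TRUE)
top10
```

For the example data this gives 40 39 38 37 36 35 34 32 28 25.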
Cheers
Petr
> -----Original Message-----
> From: R-help <mailto:r-help-bounces using r-project.org> On Behalf Of Silvano Cesar da
> Costa
> Sent: Thursday, August 19, 2021 10:40 PM
> To: mailto:r-help using r-project.org
> Subject: [R] Selecting elements
>
> Hi,
>
> I need to select 15 elements, always considering the highest values
> (descending order) but obeying the following configuration:
>
> 3A - 4B - 0C - 3D or
> 2A - 5B - 0C - 3D or
> 3A - 3B - 2C - 2D
>
> If I have, for example, 5 A elements as the highest values, I can only choose
> 3 (first and third choice) or 2 (second choice) elements.
>
> how to make this selection?
>
>
> library(dplyr)
>
> Var.1 = rep(LETTERS[1:4], 10)
> Var.2 = sample(1:40, replace=FALSE)
>
> data = data.frame(Var.1, Var.2)
> (data = data[order(data$Var.2, decreasing=TRUE), ])
>
> Elements = data %>%
>   arrange(desc(Var.2))
>
> Thanks,
>
> Prof. Dr. Silvano Cesar da Costa
> Universidade Estadual de Londrina
> Centro de Ciências Exatas
> Departamento de Estatística
>
> Fone: (43) 3371-4346
>
> [[alternative HTML version deleted]]