[R] Update dataframe based on some conditions

Jorge I Velez jorgeivanvelez at gmail.com
Wed Oct 21 22:07:17 CEST 2015


Dear R-help,

I am working on what it seems to be a simple problem, but after several
hours trying to come up with a solution, unfortunately I have not been able
to.

I would like to go from "datain" to "dataout", that is, create the NEWREF
variable according with some restrictions, and update the values for the
remaining variables in the original data set (which is way more bigger than
this example). The problem can be described as having products (coded as
REF) in stock. Here, the total nomber of units in stock are named TOENDREF
and those required for the customer are given by TIMEREF. The idea is to
use as many units of the previous REF as possible before using a new REF.

## input
datain <- structure(list(REF = c("999", "999", "999", "1099", "731", "731",
"731", "731", "1442", "1442", "1442", "1442"), TIMEREF = c(120,
240, 360, 30, 30, 60, 90, 120, 30, 60, 90, 120), TOENDREF = c(390,
270, 150, 480, 480, 450, 420, 390, 480, 450, 420, 390)), .Names = c("REF",
"TIMEREF", "TOENDREF"), row.names = c(NA, 12L), class = "data.frame")
datain

## output
dataout <- structure(list(REF = c(999L, 999L, 999L, 1099L, 731L, 731L,
731L,
731L, 1442L, 1442L, 1442L, 1442L), TIMEREF = c(120L, 240L, 360L,
30L, 30L, 60L, 90L, 120L, 30L, 60L, 90L, 120L), TOENDREF = c(390L,
270L, 150L, 120L, 90L, 30L, 420L, 300L, 270L, 210L, 120L, 0L),
    NEWREF = c(999L, 999L, 999L, 999L, 999L, 999L, 731L, 731L,
    731L, 731L, 731L, 731L)), .Names = c("REF", "TIMEREF", "TOENDREF",
"NEWREF"), row.names = c(NA, 12L), class = "data.frame")
dataout


I what follows I will try to explain what I want to accomplish:

* Example 1
Take rows 3 and 4 of "datain"

#REF TIMEREF TOENDREF
#3   999     360      150
#4  1099      30      480

As 150 units of REF 999 are available, we could substitute the 30 units of
REF 1099 with them. Hence, the 4th row of the _updated_ "datain" becomes

#REF TIMEREF TOENDREF NEWREF
#3   999     360      150      999
#4  1099      30      120      999

* Example 2
Now, let's take rows 3 to 8 of the _updated_ "datain":

#REF TIMEREF TOENDREF
#3   999     360      150
#4   999      30      120
#5   731      30      480
#6   731      60      450
#7   731      90      420
#8   731     120      390

In row 4, there 120 units available to be used. The number of units
required of REF 731 is 30, which can be easily covered by the remaining 120
units of REF 999. By doing so, the remaining units of REF 999 would then be
90.  Hence, the newly _updated_ "datain" becomes

#REF TIMEREF TOENDREF
#3   999     360      150
#4   999      30      120
#5   999      30       90
#6   999      60       30
#7   731      90      420
#8   731     120      300

Finally, the updated "datain" file after processing the remaining REF would
be

#REF TIMEREF TOENDREF
#9  731      30      270
#10 731      60      210
#11 731      90      120
#12 731     120        0

Hopefully I have explained well what I would like to end up with.  If this
is not the case, I will be more than happy to provide more information.

Any help would be very much appreciated.  Thanks in advance.

Best regards,
Jorge Velez.-

	[[alternative HTML version deleted]]



More information about the R-help mailing list