[R] [FORGED] Newbie Question on R versus Matlab/Octave versus C
Jeff Newmiller
jdnewm|| @end|ng |rom dcn@d@v|@@c@@u@
Mon Feb 4 03:09:57 CET 2019
Your code seems to be attempting to modify global variables from within
functions... R purposely makes this hard to do. Don't fight it. Instead,
use function arguments and produce function outputs with your functions.
Also, the ifelse function does not control flow of execution of code... it
selects values between two vectors according to the state of the logical
input vector. Note that all values in both possible input values must be
computed when using ifelse before it can do its magic, so ifelse can be
significantly slower than assigning into an indexed vector if a small
fraction of the vector will be changing.
Below is some proof-of-concept code. It mostly modifies values in-place
within the data frame rather than using ifelse.
You might want to read the Intro to R document available through the R
console via:
RShowDoc("R-intro")
to look up numeric indexing and logical indexing syntax while reading
through this.
#####################
makeNewWomen <- function( nWomen ) {
data.frame( isAlive = rep_len( TRUE, nWomen )
, isPregnant = rep_len( FALSE, nWomen )
, nChildren = rep_len( 0L, nWomen )
, age = rep_len( 0, nWomen )
, dateOfPregnancy = rep_len( 0, nWomen )
, endDateLastPregnancy = rep_len( 0.0, nWomen )
)
}
updateWomen <- function( DF
, jd
, maxAge
, timeStep
, pregProb
, gestation
, minBirthAge
, maxBirthAge
) {
DF$isAlive[ maxAge <= DF$age ] <- FALSE
fertileIdx <- with( DF, isAlive & !isPregnant & minBirthAge <= age & age <= maxBirthAge )
conceiveIdx <- fertileIdx
conceiveIdx[ conceiveIdx ] <- sample( c( FALSE, TRUE )
, size = sum( fertileIdx )
, replace = TRUE
, prob = c( 1-pregProb, pregProb )
)
DF$isPregnant[ conceiveIdx ] <- TRUE
DF$dateOfPregnancy[ conceiveIdx ] <- jd
birthIdx <- with( DF, isAlive & isPregnant & ( dateOfPregnancy + gestation ) <= jd )
femalechild <- sample( c( FALSE, TRUE )
, size = sum( birthIdx ) # random within birthing group
, replace = TRUE
, prob = c( 0.5, 0.5 )
)
DF$isPregnant[ birthIdx ] <- FALSE # pregnancy over
birthIdx[ birthIdx ] <- femalechild # track births further only where female
# DF$age <- ifelse( DF$isAlive
# , DF$age + timeStep
# , DF$age
# )
DF$age[ DF$isAlive ] <- DF$age[ DF$isAlive ] + timeStep
numNotAlive <- sum( !DF$isAlive )
numBirths <- sum( birthIdx )
if ( 0 < numBirths ) { # if needed, start female babies in existing or new rows
if ( 0 < numNotAlive ) {
reuseidx <- which( !DF$isAlive )
if ( numBirths <= numNotAlive ) {
# can fit all new births into existing DF
reuseidx <- reuseidx[ seq.int( numBirths ) ]
DF[ reuseidx, ] <- makeNewWomen( numBirths )
} else {
DF[ reuseidx, ] <- makeNewWomen( length( reuseidx ) )
DF <- rbind( DF
, makeNewWomen( numBirths - length( reuseidx ) )
)
}
} else { # no empty rows in DF
DF <- rbind( DF
, makeNewWomen( numBirths )
)
}
}
DF # return the updated data frame to the caller
}
calculatePopulation <- function( nWomen
, maxDate
, dpy
, pregProb
, maxAge
, timeStep
, gestation
, minBirthAge
, maxBirthAge
, prealloc
) {
jd <- 0
nextSampleJd <- jd + dpy
numSamples <- maxDate %/% dpy
result <- data.frame( jd = rep( NA, numSamples )
, NAlive = rep( NA, numSamples )
, NPreg = rep( NA, numSamples )
, NNotAlive = rep( NA, numSamples )
)
i <- 1L
DF <- makeNewWomen( prealloc )
DF$isAlive <- seq.int( prealloc ) <= nWomen # leave most entries "dead"
while( jd < maxDate ) {
DF <- updateWomen( DF
, jd
, maxAge
, timeStep
, pregProb
, gestation
, minBirthAge
, maxBirthAge
)
if ( nextSampleJd <= jd ) {
result$jd[ i ] <- jd
result$NAlive[ i ] <- sum( DF$isAlive )
result$NPreg[ i ] <- sum( DF$isPregnant )
result$NNotAlive <- sum( !DF$isAlive )
nextSampleJd <- nextSampleJd + dpy
i <- i + 1L
}
# Do various other things
jd <- jd + timeStep
}
result
}
nWomen <- 5
numberOfYears <- 30
maxDate <- 300 * 365
dpy <- 365
pregProb <- 0.01
maxAge <- 50 * 365
minBirthAge <- 18 * 365
maxBirthAge <- 45 * 365
timeStep <- 30
gestation <- 30 * 9
prealloc <- 10000
set.seed(42)
simresult <- calculatePopulation( nWomen
, maxDate
, dpy
, pregProb
, maxAge
, timeStep
, gestation
, minBirthAge
, maxBirthAge
, prealloc
)
plot( simresult$jd/365, simresult$NAlive )
plot( simresult$jd/365, simresult$NNotAlive )
plot( simresult$jd/365, simresult$NPreg )
#####################
On Sat, 2 Feb 2019, Alan Feuerbacher wrote:
> On 1/29/2019 11:50 PM, Jeff Newmiller wrote:
>> On Tue, 29 Jan 2019, Alan Feuerbacher wrote:
>>
>>> After my failed attempt at using Octave, I realized that most likely the
>>> main contributing factor was that I was not able to figure out an
>>> efficient data structure to model one person. But C lent itself perfectly
>>> to my idea of how to go about programming my simulation. So here's a
>>> simplified pseudocode sort of example of what I did:
>>
>> Don't model one person... model an array of people.
>>
>>> To model a single reproducing woman I used this C construct:
>>>
>>> typedef struct woman {
>>> int isAlive;
>>> int isPregnant;
>>> double age;
>>> . . .
>>> } WOMAN;
>>
>> # e.g.
>> Nwomen <- 100
>> women <- data.frame( isAlive = rep( TRUE, Nwomen )
>> , isPregnant = rep( FALSE, Nwomen )
>> , age = rep( 20, Nwomen )
>> )
>>
>>> Then I allocated memory for a big array of these things, using the C
>>> malloc() function, which gave me the equivalent of this statement:
>>>
>>> WOMAN women[NWOMEN]; /* An array of NWOMEN woman-structs */
>>>
>>> After some initialization I set up two loops:
>>>
>>> for( j=0; j<numberOfYears; j++) {
>>> for(i=1; i< numberOfWomen; i++) {
>>> updateWomen();
>>> }
>>> }
>>
>> for ( j in seq.int( numberOfYears ) {
>> # let vectorized data storage automatically handle the other for loop
>> women <- updateWomen( women )
>> }
>>
>>> The function updateWomen() figures out things like whether the woman
>>> becomes pregnant or gives birth on a given day, dies, etc.
>>
>> You can use your "fixed size" allocation strategy with flags indicating
>> whether specific rows are in use, or you can only work with valid rows and
>> add rows as needed for children... best to compute a logical vector that
>> identifies all of the birthing mothers as a subset of the data frame, and
>> build a set of children rows using the birthing mothers data frame as
>> input, and then rbind the new rows to the updated women dataframe as
>> appropriate. The most clear approach for individual decision calculations
>> is the use of the vectorized "ifelse" function, though under certain
>> circumstances putting an indexed subset on the left side of an assignment
>> can modify memory "in place" (the functional-programming restriction
>> against this is probably a foreign idea to a dyed-in-the-wool C programmer,
>> but R usually prevents you from modifying the variable that was input to a
>> function, automatically making a local copy of the input as needed in order
>> to prevent such backwash into the caller's context).
>
> Hi Jeff,
>
> I'm well along in implementing your suggestions, but I don't understand the
> last paragraph. Here is part of the experimenting I've done so far:
>
> *=======*=======*=======*=======*=======*=======*
> updatePerson <- function() {
> ifelse( women$isAlive,
> {
> # Check whether to kill off this person, if she's pregnant whether
> # to give birth, whether to make her pregnant again.
> women$age = women$age + timeStep
> # Check if the person has reached maxAge
> }
> )
> }
>
> calculatePopulation <- function() {
> lastDate = 0
> jd = 0
> while( jd < maxDate ) {
> for( i in seq_len( nWomen ) ) {
> updatePerson();
> }
> todaysDateInt = floor(jd/dpy)
> NAlive[todaysDateInt] = nWomen - nDead
> # Do various other things
> todaysDate = todaysDate + timeStep
> jd = jd + timeStep
> }
> }
>
> nWomen <- 5
> numberOfYears <- 30
> women <- data.frame( isAlive = rep_len( TRUE, nWomen )
> , isPregnant = rep_len( FALSE, nWomen )
> , nChildren = rep_len( 0L, nWomen )
> , ageInt = rep_len( 0L, nWomen )
> , age = rep_len( 0, nWomen )
> , dateOfPregnancy = rep_len( 0, nWomen )
> , endDateLastPregnancy = rep_len( 0.0, nWomen )
> , minBirthAge = rep_len( 0, nWomen )
> , maxBirthAge = rep_len( 0, nWomen )
> )
>
> # . . .
>
> calculatePopulation()
>
> *=======*=======*=======*=======*=======*=======*
>
> The above code (in its complete form) executes without errors. I don't
> understand at least two things:
>
> In the updatePerson function, in the ifelse statement, how do I change the
> appropriate values in the women dataframe?
>
> I don't understand most of your last paragraph at all.
>
> Thanks so much for your help in learning R!
>
> Alan
>
> ---
> This email has been checked for viruses by Avast antivirus software.
> https://www.avast.com/antivirus
>
>
---------------------------------------------------------------------------
Jeff Newmiller The ..... ..... Go Live...
DCN:<jdnewmil using dcn.davis.ca.us> Basics: ##.#. ##.#. Live Go...
Live: OO#.. Dead: OO#.. Playing
Research Engineer (Solar/Batteries O.O#. #.O#. with
/Software/Embedded Controllers) .OO#. .OO#. rocks...1k
---------------------------------------------------------------------------
More information about the R-help
mailing list