[R] runtime on ising model

David Winsemius dwinsemius at comcast.net
Thu Oct 28 18:20:03 CEST 2010


On Oct 28, 2010, at 11:52 AM, Michael D wrote:

> Mike, I'm not sure what you mean about removing foo, but I think the
> method is sound in diagnosing a program issue and the results speak for
> themselves.
>
> I did invert my if statement at the suggestion of a CS professor (who
> also suggested recoding in C, but I'm in an applied math program and
> haven't had the time to take programming courses, which I know would be
> helpful).
>
> Anyway, with the statement as:
>
> if( !(k %in% c(10^4,10^5,10^6,10^7)) ){
> #do nothing
> } else {
> q <- q+1
> Out[[q]] <- M
> }
>
> run times were back to around 20 minutes.

Have you tried replacing all of those 10^x operations with their
integer equivalents, c(10000L, 100000L, 1000000L)? Each time through
the loop you are unnecessarily calling the "^" function 4 times. You
could also omit the last one, 10^7, during testing, since M at the
last iteration (k=10^7) would be the final value and you could just
assign the state of M at the end. That would eliminate 4*10^7
unnecessary "^" calls and 10^7 unnecessary comparisons. (The CS
professor is perhaps used to having the C compiler do all the thinking
of this sort for him.)
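
A minimal sketch of that idea, not your actual code: the loop length
n_steps, the shortened checkpoint list, and the line "M[1, 1] <- k"
(a stand-in for the real spin update) are all assumptions made so the
example runs in a second or two. The point is only that the lookup
vector is built once, outside the loop, and that the final state is
stored after the loop instead of being tested for on every pass.

```r
# Hypothetical, shortened setup: 2e5 steps instead of 10^7.
n_steps     <- 200000L
checkpoints <- c(10000L, 100000L)      # built once, as integers, outside the loop
Out <- vector("list", length(checkpoints) + 1L)
q   <- 0L
M   <- matrix(1L, nrow = 2, ncol = 2)  # stand-in for the spin lattice

for (k in seq_len(n_steps)) {
  M[1, 1] <- k                         # placeholder for the real update of M
  if (k %in% checkpoints) {            # no "^" calls inside the loop
    q <- q + 1L
    Out[[q]] <- M                      # snapshot of the state at this checkpoint
  }
}
Out[[q + 1L]] <- M                     # final state, stored once after the loop
```

With the real run you would use c(10000L, 100000L, 1000000L) as the
checkpoints and 10000000L steps.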

-- 
David

> So as best I can tell, something happens in the if statement causing
> the computer to work ahead, as the professor suggests. I'm no expert
> on R (and have no desire to try looking at the R source code; it would
> only confuse me), but if anyone can offer guidance on how the if
> statement works (Does R try to work ahead? Under what conditions does
> it try to "work ahead", so I can try to exploit this behavior?) I
> would greatly appreciate it.
> If it would require too much knowledge of the computer system to
> understand, I doubt I would be able to make use of it, but maybe
> someone else could benefit.
>
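
On the "work ahead" question: R's "if" evaluates its condition first
and then evaluates only the branch that was selected, so the body is
never run speculatively and thrown away. A small sketch that makes this
visible by counting how often the body runs; the counter body_runs and
the helper record_run() are hypothetical names used only for this
illustration.

```r
body_runs <- 0L

# Hypothetical helper: its only job is to record that the body was entered.
record_run <- function() {
  body_runs <<- body_runs + 1L
  invisible(NULL)
}

for (k in 1:1000) {
  if (k %in% c(10L, 100L, 1000L)) {
    record_run()   # reached only when the condition is TRUE
  }
}

body_runs          # number of times the body actually ran
```

The counter ends up equal to the number of matching values of k, not
the number of loop iterations, which is what you would expect if the
body is skipped whenever the test is FALSE.
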
> On Tue, Oct 26, 2010 at 3:24 PM, Mike Marchywka
> <marchywka at hotmail.com> wrote:
>
>> ----------------------------------------
>>> Date: Tue, 26 Oct 2010 12:53:14 -0400
>>> From: mike409 at gmail.com
>>> To: jim at bitwrit.com.au
>>> CC: r-help at r-project.org
>>> Subject: Re: [R] runtime on ising model
>>>
>>> I have an update on where the issue is coming from.
>>>
>>> I commented out the code for "pos[k+1] <- M[i,j]" and the if
>>> statement for time = 10^4, 10^5, 10^6, 10^7 and the storage, and
>>> everything ran fast(er).
>>> Next I added back in the "pos" statements and still runtimes were
>>> good (around 20 minutes).
>>>
>>> So I'm left with something causing problems in:
>>
>> I haven't looked at this since some passing interest in magnetics
>> decades ago, something about 8-tracks and cassettes, but you have
>> to be careful with conclusions like "I removed foo and problem
>> went away therefore problem was foo." Performance issues are often
>> caused by memory, not CPU limitations. Removing anything with a big
>> memory footprint could speed things up. IO can be a real bottleneck.
>> If you are talking about things on minute timescales, look at task
>> manager and see if you are even CPU limited. Look for page faults
>> or IO etc. If you really need performance and have a task which
>> is relatively simple, don't ignore C++ as a way to generate data
>> points and then import these into R for analysis.
>>
>> In short, just because you are focusing on math it doesn't mean
>> the computer is limited by that.
>>
>>
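
One way to check this from inside R, as a rough sketch: system.time()
reports elapsed time for an expression and object.size() reports how
much memory an object occupies. The vector pos below is a shortened,
hypothetical stand-in for the 10^7-element vector discussed in this
thread, and the loop body is a placeholder, not the model.

```r
n_steps <- 100000L                     # shortened; the thread uses 10^7
pos <- numeric(n_steps + 1L)           # preallocated, as "pos[k+1] <- ..." implies

# Memory footprint of the vector, in bytes (8 bytes per double plus a header).
size_bytes <- as.numeric(object.size(pos))

# Time a loop that only fills pos, to see what that piece costs on its own.
timing <- system.time(
  for (k in seq_len(n_steps)) pos[k + 1L] <- k / n_steps
)

size_bytes
timing[["elapsed"]]
```

Timing each piece separately like this gives a firmer basis than
commenting lines in and out and comparing whole runs.
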
>>>
>>> ## Store state at time 10^4, 10^5, 10^6, 10^7
>>> if( k %in% c(10^4,10^5,10^6,10^7) ){
>>> q <- q+1
>>> Out[[q]] <- M
>>> }
>>>
>>> Would there be any reason R is executing the statements inside the
>>> "if" before getting to the logical check?
>>> Maybe R is written to hope for the best outcome (TRUE) and will just
>>> throw out its work if the logic comes up FALSE?
>>> I guess I can always break the for loop up into four parts and
>>> store the state at the end of each, but that's an unsatisfying
>>> solution to me.
>>>
>>>
>>> Jim, I like the suggestion of just pulling one big sample, but
>>> since I can get the runtimes under 30 minutes just by removing the
>>> storage piece, I doubt I would see any noticeable changes by pulling
>>> large sample vectors.
>>>
>>> Thanks,
>>> Michael
>>>
>>> On Tue, Oct 26, 2010 at 6:22 AM, Jim Lemon  wrote:
>>>
>>>> On 10/26/2010 04:50 PM, Michael D wrote:
>>>>
>>>>> So I'm in a stochastic simulations class and I'm having issues with
>>>>> the amount of time it takes to run the Ising model.
>>>>>
>>>>> I usually don't like to attach the code I'm running, since it will
>>>>> probably make me look like a fool, but I figure it's the best way I
>>>>> can find any bits where I can speed up run time.
>>>>>
>>>>> As for the goals of the exercise:
>>>>> I need the state of the system at time=1, 10k, 100k, 1mill, and
>>>>> 10mill, and the percentage of vertices with positive spin at all t.
>>>>>
>>>>> Just to be clear, I'm not expecting anyone to tell me how to program
>>>>> this model, because I know what I have works for this exercise, but
>>>>> it takes far too long to run and I'd like to speed it up by
>>>>> replacing slow operations wherever possible.
>>>>>
>>>> Hi Michael,
>>>> One bottleneck is probably the sampling. If it doesn't grab too much
>>>> memory, setting up a vector of the samples (maybe a million at a
>>>> time if 10 million is too big; you might be able to rewrite your
>>>> sample vector when you store the state) and using k (and an offset
>>>> if you don't have one big vector) to index it will give you some
>>>> speed.
>>>>
>>>> Jim
>>>>
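
A rough sketch of Jim's suggestion, under stated assumptions: the
original sampling code is not shown in this thread, so the lattice side
n, the index vectors i_all and j_all, the uniform draws u_all, and the
"flip with probability 1/2" rule are all hypothetical placeholders and
not the Ising acceptance rule. What it illustrates is drawing every
random number up front and indexing by k inside the loop.

```r
set.seed(1)
n       <- 10L                         # hypothetical lattice side
n_steps <- 1000L                       # shortened; the thread uses 10^7

# Draw all random quantities once, before the loop.
i_all <- sample.int(n, n_steps, replace = TRUE)   # row index for each step
j_all <- sample.int(n, n_steps, replace = TRUE)   # column index for each step
u_all <- runif(n_steps)                           # uniform draw for each step

M <- matrix(1L, nrow = n, ncol = n)    # all spins start at +1

for (k in seq_len(n_steps)) {
  i <- i_all[k]
  j <- j_all[k]
  # Placeholder rule, NOT the Ising acceptance step: flip with probability 1/2.
  if (u_all[k] < 0.5) M[i, j] <- -M[i, j]
}
```

If three vectors of length 10^7 are too much memory, the same indexing
works on chunks of, say, a million draws that are regenerated when the
chunk is used up, with k minus an offset as the index.
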
>>>>
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>

David Winsemius, MD
West Hartford, CT


