[Rd] iterative list rm [Was: R-devel Digest, Vol 83, Issue 2]

Sun Jan 3 08:36:52 CET 2010

On 01/03/2010 02:04 AM, Simon Urbanek wrote:
>
>
> On Jan 2, 2010, at 5:41 PM, Romain Francois wrote:
>
>> On 01/02/2010 11:12 PM, Duncan Murdoch wrote:
>>>
>>> On 02/01/2010 3:16 PM, Laurent Gautier wrote:
>>>> On 1/2/10 8:53 PM, Duncan Murdoch wrote:
>>>>> Simon Urbanek wrote:
>>>>>> On Jan 2, 2010, at 12:17 PM, Laurent Gautier wrote:
>>>>>>
>>>>>>> On 1/2/10 5:56 PM, Duncan Murdoch wrote:
>>>>>>>> On 02/01/2010 11:36 AM, Laurent Gautier wrote:
>>>>>>>>> [Disclaimer: what is below reflects my understanding from reading
>>>>>>>>> the
>>>>>>>>> R source, others will correct where deemed necessary]
>>>>>>>>>
>>>>>>>>> On 1/2/10 12:00 PM, r-devel-request at r-project.org wrote:
>>>>>>> (...)
>>>>>>>
>>>>>>>>>> I'd also be interested if there is some ideas on the relative
>>>>>>>>>> efficiency
>>>>>>>>>> of the preserve/release mechanism compared to PROTECT/UNPROTECT.
>>>>>>>>> PROTECT/UNPROTECT is trading granularity for speed. It is a stack
>>>>>>>>> with
>>>>>>>>> only tow operations possible:
>>>>>>>>> - push 1 object into the stack
>>>>>>>>> - pull (unprotect) N last objects from the stack
>>>>>>>> UNPROTECT_PTR is also possible, which does a linear search through
>>>>>>>> the
>>>>>>>> stack and unprotects something possibly deep within it. There is also
>>>>>>>> REPROTECT which allows you to replace an entry within the stack.
>>>>>>>>
>>>>>>>> I would guess that UNPROTECT_PTR is more efficient than
>>>>>>>> RecursiveRelease
>>>>>>>> because it doesn't use so much stack space when it needs to go deep
>>>>>>>> into
>>>>>>>> the stack to release, but it is possible the compiler recognizes the
>>>>>>>> tail recursion and RecursiveRelease is implemented efficiently. In
>>>>>>>> that
>>>>>>>> case it could be more efficient than UNPROTECT_PTR, which has to move
>>>>>>>> all the other entries down to fill the newly vacated space. Really
>>>>>>>> the
>>>>>>>> only reliable way to answer efficiency questions like this is to try
>>>>>>>> both ways and see which works better in your application.
>>>>>>> Thanks. I did not know about UNPROTECT_PTR.
>>>>>>> I had concerns over the stack usage, but so far it did not prove too
>>>>>>> much of a problem. Still, why isn't the tail recursion explicitly
>>>>>>> made an iteration then ? This would take the "may be the compiler
>>>>>>> figures it out, may be not" variable out of the equation.
>>>>>>>
>>>>>> Careful - the protection stack (bookkeeping by R) has nothing to do
>>>>>> with the C function call stack hence it has nothing to do with the
>>>>>> compiler. The protection stack is global so usually you don't run out
>>>>>> of it unless something goes horribly wrong (=infinite loop).
>>>>> I think Laurent was referring to RecursiveRelease, which could use a lot
>>>>> of C stack if you choose to release something that is deep in the list,
>>>>> since it checks the head, and if that doesn't match, calls itself again
>>>>> on the rest of the list. (I checked, and at least one version of gcc
>>>>> doesn't recognize the tail recursion: it really does generate a
>>>>> recursive call.)
>>>>>
>>>>> Laurent asked why it isn't optimized to avoid the recursion: I think the
>>>>> answer is simply because it is so rarely used that nobody has bothered.
>>>>
>>>> Yes, I was referring to RecursiveRelease. Sorry if this was not clear.
>>>>
>>>> What are the chances for a patch to be accepted ? At first sight(*),
>>>> making that tail recursion an iterative function is not a major
>>>> undertaking, and reviewing the patch be fairly straightforward... but
>>>> I can always use that time otherwise if the answer to the question is
>>>> "nil".
>>>
>>> I don't think I would want to review such a patch (I don't know the
>>> memory manager well, I don't know that there is really a case where it
>>> matters enough to be worth doing), so I'd say if you don't get a message
>>> from a core member volunteering to do so, you should assume it won't be
>>> accepted. But that doesn't mean you shouldn't write the code for your
>>> own internal use and edification, and if you can put together a demo
>>> that shows it really makes a big difference in a realistic situation,
>>> you might get a different response.
>>>
>>> Duncan Murdoch
>>
>>  From what I understand, this has little to do with the memory manager, and resumes to the simple problem "how to remove an object from a list".
>>
>> Something like this, using the amazing inline and inspect packages:
>>
>> require( inline )
>> require( inspect )
>>
>> remover<- cfunction(signature( list = "language", object = "environment" ), '
>> 	if( !isNull( list ) ){
>> 		SEXP x = list ;
>> 		SEXP y ;
>> 		while( CAR(x) != object&&  CADR(x) != R_NilValue ){
>> 			y = x ;
>> 			x = CDR(x) ;
>> 		}
>> 		if( CAR(x) == object ) SETCDR(y, CDR(x) ) ;
>> 	}
>> 	return list ;
>
> That blows up if the object is the first in the list BTW. You probably meant more something like
>
> static SEXP ReleaseObj(SEXP object, SEXP list) {
> 	if (!isNull(list)) {
> 		if (CAR(list) != object) {
> 			SEXP c = list;
> 			while (!isNull(CDR(c))) {
> 				if (CADR(c) == object) {
> 					CDR(c) = CDDR(c);
> 					break;
> 				}
> 				c = CDR(c);
> 			}
> 		} else return CDR(list);
> 	}
> 	return list;
> }
> 		
> (For non-core replace CDR(c) with SETCDR(c) although that goes through the write barrier).
>
> Cheers,
> Simon

Thanks. And sorry for not testing this case. (oops)

Romain

-- 
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://tr.im/IW9B : C++ exceptions at the R level
|- http://tr.im/IlMh : CPP package: exposing C++ objects
`- http://tr.im/HlX9 : new package : bibtex