[R] The end of Matlab
hadley wickham
h.wickham at gmail.com
Fri Dec 12 18:23:34 CET 2008
On Fri, Dec 12, 2008 at 11:18 AM, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
> On 12/12/2008 11:38 AM, hadley wickham wrote:
>>
>> On Fri, Dec 12, 2008 at 8:41 AM, Duncan Murdoch <murdoch at stats.uwo.ca>
>> wrote:
>>>
>>> On 12/12/2008 8:25 AM, hadley wickham wrote:
>>>>>
>>>>> From which you might conclude that I don't like the design of subset,
>>>>> and you'd be right. However, I don't think this is a counterexample to
>>>>> my general rule. In the subset function, the select argument is treated
>>>>> as an unevaluated expression, and then there are rules about what to do
>>>>> with it. (I.e. try to look up name `a` in the data frame, if that
>>>>> fails, ...)
>>>>>
>>>>> For the requested behaviour to similarly fall within the general rule,
>>>>> we'd have to treat all indices to all kinds of things (vectors,
>>>>> matrices, dataframes, etc.) as unevaluated expressions, with special
>>>>> handling for the particular symbol `end`.
>>>>
>>>> Except you wouldn't have to necessarily change indexing - you could
>>>> change seq instead. Then 5:end could produce some kind of special
>>>> data structure (maybe an iterator) that was recognised by the various
>>>> indexing functions.
>>>
>>> Ummm, doesn't that require changes to *both* indexing and seq?
>>
>> Oops, yes. I meant it wouldn't require indexing to use unevaluated
>> expressions.
>>
>>>> This would still be a lot of work for not a lot
>>>> of payoff, but it would be a logically consistent way of adding this
>>>> behaviour to indexing, and the basic work would make it possible to
>>>> develop other sorts of indexing, eg df[evens(), ], or df[last(5),
>>>> last(3)].
>>>
>>> I agree: it would be a nice addition, but a fair bit of work. I think
>>> it would be quite doable for the indexable things in the base packages,
>>> but there are a lot of contributed packages that define [ methods, and
>>> those methods would all need to be modified too.
>>
>> That's true, although I suspect many contributed [.methods eventually
>> delegate to base methods and might work without further modification.
>>
>>> (Just to be clear, when I say doable, I'm thinking that your iterators
>>> return functions that compute subsets of index ranges. For example,
>>> evens() might be implemented as
>>>
>>> evens <- function() {
>>>   result <- function(indices) {
>>>     indices[indices %% 2 == 0]
>>>   }
>>>   class(result) <- "iterator"
>>>   return(result)
>>> }
>>>
>>> and then `[` in v[evens()] would recognize that it had been passed an
>>> iterator, and would pass 1:length(v) to the iterator to get the subset of
>>> even indices. Is that what you had in mind?)
>>
>> Yes, that's exactly what I was thinking, although you'd have to put
>> some thought into the conventions - would it be better to pass in the
>> length of the vector instead of a vector of indices? Should all
>> iterators return logical vectors? That way you could do x[evens() &
>> last(5)] to get the even indices out of the last 5, as opposed to
>> x[evens()][last(5)] which would return the last 5 even indices.
>
> Actually, I don't think so. "evens() & last(5)" would fail to evaluate,
> because you're trying to do a logical combination of two functions, not of
> two logical vectors. Or are we going to extend the logical operators to
> work on iterators/selectors too?
Oh yes, that's a good point. But wouldn't the following do the job?

"&.selector" <- function(a, b) {
  function(n) a(n) & b(n)
}

or

"&.selector" <- function(a, b) {
  function(n) intersect(a(n), b(n))
}

depending on whether selectors return logical or numeric vectors.
Writing functions for | and ! would be similarly easy. Or am I
missing something?
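
A minimal sketch of how the pieces might fit together, assuming
selectors are functions of the vector length n that return logical
vectors. The names selector() and pick() are hypothetical helpers, not
existing R functions: pick() stands in for a modified `[`, and each
operator re-wraps its result so that a combined selector keeps its
class and can be combined again.

```r
# Hypothetical constructor: tag a function of n (the vector length)
# with class "selector" so the S3 operator methods below dispatch on it.
selector <- function(f) structure(f, class = "selector")

# Assumed convention: a selector returns a logical vector of length n.
evens <- function() selector(function(n) seq_len(n) %% 2 == 0)
last  <- function(k) selector(function(n) seq_len(n) > n - k)

# Logical operators on selectors; re-wrapping keeps results composable.
"&.selector" <- function(e1, e2) selector(function(n) e1(n) & e2(n))
"|.selector" <- function(e1, e2) selector(function(n) e1(n) | e2(n))
"!.selector" <- function(x) selector(function(n) !x(n))

# `[` itself is left untouched; pick() is a hypothetical stand-in that
# evaluates the selector against the length of x.
pick <- function(x, sel) x[sel(length(x))]

x <- 1:20
pick(x, evens() & last(5))        # 16 18 20
pick(pick(x, evens()), last(5))   # 12 14 16 18 20
```

The two calls at the end show the distinction discussed above: the
combined selector gives the even indices among the last 5, while
chaining gives the last 5 even indices.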
Hadley
--
http://had.co.nz/