[R] Variable Wildcard Value

Romain Francois romain.francois at dbmail.com
Wed Apr 1 11:05:25 CEST 2009


Francis Smart wrote:
>  Sure thing.  I realize that it is an unusual request and not the type
>  of thing that I have seen used in any other language that I know of.
>
> So right now I am using some of the statistical functions of R to get
> some summary statistics and visual output from this historic data set.
> I have a lot of functions that look something like this:
>
>  summary(lm(SLV_DIE_PER[tontype==TONNAGE_TYPE]~SLV_PER_TON[tontype==TONNAGE_TYPE]))
>
>  SLV_DIE_PER - being the percent of slaves that died between purchase
>  and delivery
>  SLV_PER_TON - being the number of slaves per standardized ton (ship capacity)
>  tontype - being the type of ton that the ship capacity was recorded
>  as.  There are various factors mostly in the form of 1,2,3,4,5...
>  representing Spanish Ton, British Ton, etc.
>
>  Now I want to run a simple linear model and graphs and other things by
>  specifying TONNAGE_TYPE=1 or 2 etc., giving me a regression that
>  only looks at one particular type of tonnage rather than
>  another.
>
>  All of that works fine.  But, it gets a little ugly when I want to
>  generalize the linear model to include all values irrespective of
>  tontype.  Of course I could duplicate and trim the statement as such:
>
>  summary(lm(SLV_DIE_PER~SLV_PER_TON))
>
>  I am sure you can see how a wildcard could be more important given a series
>  of similar expressions.  Perhaps something that looks like this:
>
>  summary(SLV_DIE_PER[(SLV_DIE_PER>=SLV_Min_Value)&(Nationality==Select_Nationality)&(SLV_PER_TON<=SLV_Max_Value)&(tontype==TONNAGE_TYPE)])
>
>  Well thanks for your interest.  Any suggestions that can help clean up
>  my code could be extremely helpful right now.
>
>  Btw thanks Dieter for that hint.  Not exactly what I was looking for
>  but I am sure to use it in the future.
>
> Romain.  Thanks for your code, though I don't see readily how to fit
> "w" into your functions.  Perhaps you could add an additional line
> between:
>
> "is.na.wildcard <- function( x ) FALSE"
>   

w <- wildcard()

Note that it answers your previous question, but not this one.
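
For readers without the earlier message, here is a minimal sketch of how such a wildcard object could be built with S3 classes. Only the line "is.na.wildcard <- function( x ) FALSE" and the call wildcard() appear in this thread; the constructor body and the Ops.wildcard method below are assumptions made for illustration, not necessarily the code that was originally posted.

```r
# Assumed constructor: an empty object that only carries the S3 class "wildcard"
wildcard <- function() structure(list(), class = "wildcard")

# Assumed group-generic method: only == and != are given a meaning
Ops.wildcard <- function(e1, e2) {
  # the operand that is not the wildcard decides the length of the result
  other <- if (inherits(e1, "wildcard")) e2 else e1
  n <- if (inherits(other, "wildcard")) 1L else length(other)
  switch(.Generic,
    "==" = rep(TRUE, n),
    "!=" = rep(FALSE, n),
    stop("operator ", .Generic, " is not defined for a wildcard"))
}

# The method quoted in this thread
is.na.wildcard <- function(x) FALSE

w <- wildcard()
w == 1                # TRUE
w == "peanut butter"  # TRUE
is.na(w)              # FALSE
c(1, 2, 3) == w       # TRUE TRUE TRUE, so x[tontype == w] would keep every element
```

One caveat: if the other operand has its own Ops method (a factor, for example), R finds two incompatible methods and does not use this one, so the sketch is only expected to work against plain numeric or character vectors.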

This is untested because we do not have your data, but you could try 
something like the following, assuming the variables "tontype", 
"SLV_DIE_PER" and "SLV_PER_TON" are in the data frame "data". That way, 
you separate your model from the data it is applied to:

with( subset( data, tontype == TONNAGE_TYPE ), lm( SLV_DIE_PER ~ 
SLV_PER_TON ) )

Not sure how the wildcard you asked for in your first email fits into this.
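
One possible way to get the "all tonnage types" case without a wildcard is to make the filter optional. The sketch below is only an illustration: the helper name fit_by_tontype, its "type" argument, and the toy data frame are all invented here, and it assumes the data frame has no column called "type".

```r
# Hypothetical helper: fit the model on one tonnage type,
# or on every row when no type is given
fit_by_tontype <- function(data, type = NULL) {
  rows <- if (is.null(type)) data else subset(data, tontype == type)
  lm(SLV_DIE_PER ~ SLV_PER_TON, data = rows)
}

# Invented toy data, only there to make the example run
toy <- data.frame(
  tontype     = c(1, 1, 1, 2, 2, 2),
  SLV_PER_TON = c(1, 2, 3, 1, 2, 3),
  SLV_DIE_PER = c(2, 5, 6, 1, 2, 4)
)

summary(fit_by_tontype(toy, type = 1))  # tonnage type 1 only
summary(fit_by_tontype(toy))            # all tonnage types
```

The same pattern extends to the other conditions (nationality, minimum and maximum values): each becomes an argument that defaults to NULL and is only applied when it is supplied.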

Romain

> and
> "> w == 1"
>
> Thanks!
> Francis
>
>
>   
>> On Wed, Apr 1, 2009 at 2:02 AM, Patrick Burns <pburns at pburns.seanet.com> wrote:
>>     
>>> I would be truly amazed if the answer were "yes".
>>>
>>> I find this the most fascinating question on R-help
>>> for a long time, maybe ever.  Can you tell us what
>>> you have in mind and what your ultimate purpose is?
>>>
>>> Patrick Burns
>>> patrick at burns-stat.com
>>> +44 (0)20 8525 0696
>>> http://www.burns-stat.com
>>> (home of "The R Inferno" and "A Guide for the Unwilling S User")
>>>
>>> Francis Smart wrote:
>>>       
>>>> Is there a wildcard value for vector values in R?
>>>>
>>>> For instance:
>>>>
>>>>
>>>>         
>>>>> M <- *wildcard
>>>>>
>>>>>           
>>>>         
>>>>> (M==1)
>>>>>
>>>>>           
>>>> TRUE
>>>>
>>>>
>>>>         
>>>>> (M=="peanut butter")
>>>>>
>>>>>           
>>>> TRUE
>>>>
>>>>
>>>>         
>>>>> is.na(M)
>>>>>
>>>>>           
>>>> FALSE
>>>>
>>>> thanks,
>>>> Francis
>>>>
>>>>
>>>>         
>>
>> --
>> Francis Smart
>> (406) 223-8108 cell
>>
>>     
>
>
>
>   


-- 
Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr



