[R] Writing a function to return column position XXXX

David Winsemius dwinsemius at comcast.net
Tue Jan 24 17:29:21 CET 2012


On Jan 24, 2012, at 11:07 AM, Dan Abner wrote:

> Hi everyone,
>
> I am using Michael's approach (grepl()) to identify which columns
> contain $ signs. I was hoping to incorporate this into a line of
> code that would automatically 1) find which columns contain $ signs,
> 2) strip the $ and commas, and 3) convert the result to a numeric
> vector.
>
> I have the following:
>
> col.id<-function(x) any(grepl("\\$",x))
>

No data to test so ... no testing: Perhaps ...

cand2[which(sapply(cand2,col.id))] <-
	sapply( cand2[which(sapply(cand2,col.id))],
	        function(cols) as.numeric(gsub("[$,]", "", cols)) )
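
For anyone wanting to try the idea, here is a minimal sketch, assuming a small data frame rebuilt from a few of the rows quoted in the original post. The names cand2 and col.id come from the thread; the stringsAsFactors = FALSE setting, the helper name dollar.cols, and the use of lapply() in place of sapply() are assumptions of this sketch rather than anything tested against the real data.

```r
# Toy data rebuilt from rows quoted in the original post (assumed layout)
cand2 <- data.frame(
  can_sta     = c("AL", "AL", "AR"),
  can_zip     = c("36106", "35802", "72404"),
  ind_ite_con = c("$251,895.80", "$141,373.60", "$186,918.00"),
  ind_uni_con = c("$22,874.43", "$7,100.00", "$25,391.00"),
  stringsAsFactors = FALSE
)

# TRUE for a column if any element contains a literal dollar sign
col.id <- function(x) any(grepl("\\$", x))

# Logical index of the columns to convert (hypothetical helper name)
dollar.cols <- sapply(cand2, col.id)

# Strip "$" and "," and convert; lapply() returns one list element per column,
# so the assignment behaves the same for one column or several
cand2[dollar.cols] <- lapply(cand2[dollar.cols],
                             function(col) as.numeric(gsub("[$,]", "", col)))

str(cand2)
```

If the real columns are factors rather than character vectors, gsub() coerces them to character first, so the same pattern should still apply. An all-NA result usually means as.numeric() was handed strings that still contained "$" or ",".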

### amen to Michael's admonition--------
>> PS -- Stop with HTML postings (seriously, it actually does mess up
>> what the rest of us see and I think it causes trouble for the
>> archives as well)
###-----------------
>
> However, I am doing something wrong: while the code correctly
> identifies the columns containing $ signs, it also returns ALL NA for
> those columns.
>
> See my initial message for this thread for example data.
>
> Any assistance is appreciated.
>
> Thanks!
>
> Dan
>
>
> On Tue, Jan 24, 2012 at 9:04 AM, R. Michael Weylandt
> <michael.weylandt at gmail.com> wrote:
>> Either
>>
>> any(grepl("$",x, fixed = TRUE)) # You probably want grepl not grep
>> any(grepl("\\$",x) )
>> ? regexpr # $ has a special meaning in a regular expression (end-of-string anchor)
>>
>> Michael
>>
>>
>>
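
A short sketch of the difference between the two forms quoted above, using a made-up three-element vector x (an assumption for illustration, not data from the thread). Since an unescaped "$" anchors the end of the string, it matches every element, which is why the original col.id came back TRUE for every column.

```r
# Hypothetical test vector: only the last element contains a dollar sign
x <- c("AL", "36106", "$251,895.80")

grepl("$", x)                # "$" as end-of-string anchor: matches every element
grepl("$", x, fixed = TRUE)  # literal "$": only the third element matches
grepl("\\$", x)              # escaped "$": same result as fixed = TRUE
```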
>> On Tue, Jan 24, 2012 at 8:49 AM, Dan Abner <dan.abner99 at gmail.com> wrote:
>>> Hello everyone,
>>>
>>> I am writing my own function to return the column index of all variables
>>> (these are currently character vectors) in a data frame that contain a
>>> dollar sign ($). A small piece of the data look like this:
>>>
>>> can_sta  can_zip   ind_ite_con  ind_uni_con
>>> AL       36106     $251,895.80  $22,874.43
>>> AL       35802     $141,373.60  $7,100.00
>>> AL       35201     $273,208.50  $18,193.66
>>> AR       72404     $186,918.00  $25,391.00
>>> AR       72217     $451,127.00  $27,255.23
>>> AR       7.28E+08  $58,336.22   $5,293.82
>>>
>>>
>>> So far I have:
>>>
>>>
>>> col.id<-function(x) any(grep("$",x))
>>> sapply(cand2,col.id)
>>>
>>> However, this returns TRUE for all columns (even those that do not
>>> contain the $).
>>>
>>> Any assistance is appreciated.
>>>
>>> Thank you,
>>>
>>> Dan
>>>
>>>        [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>

David Winsemius, MD
West Hartford, CT


