[R] function using values separated by a comma
Gabor Grothendieck
ggrothendieck at gmail.com
Fri Oct 8 16:18:33 CEST 2010
On Fri, Oct 8, 2010 at 1:19 AM, burgundy <sauburn at yahoo.com> wrote:
>
> Hello,
>
> I have a dataframe (tab separated file) which looks like the example below -
> two values separated by a comma, and tab separation between each of these.
>
> [,1] [,2] [,3] [,4]
> [1,] 0,1 1,3 40,10 0,0
> [2,] 20,5 4,2 10,40 10,0
> [3,] 0,11 1,2 120,10 0,0
>
> I would like to calculate the percentage of the smallest number separated by
> the comma by:
> 1) summing the values e.g. for [1,3] where 40,10, 40+10 = 50
> 2) taking the first value and dividing it by the total e.g. for [1,3], 40/50
> = 0.8
> 3) where the value generated by 2) is >0.5, print 1-value, otherwise, leave
> value e.g. for [1,3], where value is 0.8, print 1-0.8 = 0.2
>
> plan to generate file like:
>
> [,1] [,2] [,3] [,4]
> [1,] 1 0.25 0.2 0
> [2,] 0.2 0.33 0.2 1
> [3,] 1 0.33 0.08 0
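Try the gsubfn function in the gsubfn package (http://gsubfn.googlecode.com).
It matches a regular expression, here digits, a comma and digits, captures
the two strings of digits, passes them to a function f, and replaces each
match with the output of f. The resulting text can then be read into a
data frame. Note that min(x)/sum(x) is the same as your steps 1) to 3):
if p is the first value divided by the total, the smaller of p and 1-p is
the smaller value divided by the total.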
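library(gsubfn)
# the three data lines from the example
L <- c(" 0,1 1,3 40,10 0,0", " 20,5 4,2 10,40 10,0",
" 0,11 1,2 120,10 0,0")
# smaller value of the pair as a fraction of the pair's total
f <- function(a, b) { x <- as.numeric(c(a, b)); min(x)/sum(x) }
# replace each "a,b" match with f(a, b)
L2 <- gsubfn("(\\d+),(\\d+)", f, L)
DF <- read.table(textConnection(L2))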
which gives:
> DF
   V1        V2         V3  V4
1 0.0 0.2500000 0.20000000 NaN
2 0.2 0.3333333 0.20000000   0
3 0.0 0.3333333 0.07692308 NaN
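The NaN entries come from the 0,0 pairs, where the calculation is 0/0. If
0 is wanted there instead, as the desired output in the question suggests,
something along these lines might work. This is only a rough base R sketch
that does not use gsubfn; pair_fraction is a made-up helper name, and
returning 0 for a 0,0 pair is an assumption rather than anything gsubfn
does for you.

```r
# Same three example lines as above.
L <- c(" 0,1 1,3 40,10 0,0", " 20,5 4,2 10,40 10,0",
       " 0,11 1,2 120,10 0,0")

# Hypothetical helper: smaller value of an "a,b" pair as a fraction of the
# pair's total. Returns 0 when both values are 0 (assumed from the question).
pair_fraction <- function(pair) {
  x <- as.numeric(strsplit(pair, ",", fixed = TRUE)[[1]])
  if (sum(x) == 0) 0 else min(x) / sum(x)
}

# Split each line on whitespace, then apply the helper to every field.
fields <- strsplit(trimws(L), "[[:space:]]+")
M <- t(sapply(fields, function(row)
  sapply(row, pair_fraction, USE.NAMES = FALSE)))
DF2 <- as.data.frame(M)
```

DF2 should match DF above except that the 0,0 entries are 0 rather than NaN.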
--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com