[R] Sum of Numeric Values in a DF Column

Burhan ul haq ulhaqz at gmail.com
Tue Apr 19 06:37:16 CEST 2016


Dear Gunter /  Heiberger,

Thanks for the help. This is what I was looking for:

> ... and here is a non-dplyr solution:
>
>> z <-gsub("[^[:digit:]]"," ",dd$Lower)
>
>> sapply(strsplit(z," +"),function(x)sum(as.numeric(x),na.rm=TRUE))
> [1] 105  67  60 100  80

That also explains why "unlist" could not be used here: I did not want a
single grand total, but rather a separate sum for each row.
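
For anyone finding this thread later, the difference can be sketched in base R
as follows. The two-element vector "lower" is a made-up sample modeled on
dd$Lower (with a plain hyphen instead of the en dash), and
regmatches()/gregexpr() are used here only as a base-R stand-in for
stringr::str_extract_all(); treat it as an illustration, not as the code
posted by Gunter or Heiberger.

```r
# Hypothetical sample modeled on the "Lower" column from the thread
lower <- c("R 72-33", "R 64-35, 1 Ind.")

# Extract every run of digits, keeping one list element per row
nums <- regmatches(lower, gregexpr("[[:digit:]]+", lower))
nums <- lapply(nums, as.numeric)

# unlist() flattens the list, so row boundaries are lost: one grand total
grand_total <- sum(unlist(nums))

# sapply() applies sum() to each list element: one sum per row
row_sums <- sapply(nums, sum)

print(grand_total)  # 205
print(row_sums)     # 105 100
```

So sum(unlist(...)) answers "what is the total over the whole column", while
sapply(..., sum) answers "what is the total within each row", which is what I
needed.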


Br /

On Mon, Apr 18, 2016 at 10:57 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote:

> ... and a slightly more efficient non-dplyr 1-liner:
>
> > sapply(strsplit(dd$Lower,"[^[:digit:]]"),
> function(x)sum(as.numeric(x), na.rm=TRUE))
>
> [1] 105  67  60 100  80
>
> Cheers,
> Bert
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Mon, Apr 18, 2016 at 10:43 AM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
> > ... and here is a non-dplyr solution:
> >
> >> z <-gsub("[^[:digit:]]"," ",dd$Lower)
> >
> >> sapply(strsplit(z," +"),function(x)sum(as.numeric(x),na.rm=TRUE))
> > [1] 105  67  60 100  80
> >
> >
> > Cheers,
> > Bert
> > Bert Gunter
> >
> >
> > On Mon, Apr 18, 2016 at 10:07 AM, Richard M. Heiberger <rmh at temple.edu> wrote:
> >> ## Continuing with your data
> >>
> >> AA <- stringr::str_extract_all(dd[[2]],"[[:digit:]]+")
> >> BB <- lapply(AA, as.numeric)
> >> ## I think you are looking for one of the following two expressions
> >> sum(unlist(BB))
> >> sapply(BB, sum)
> >>
> >>
> >> On Mon, Apr 18, 2016 at 12:48 PM, Burhan ul haq <ulhaqz at gmail.com> wrote:
> >>> Hi,
> >>>
> >>> I request help with the following:
> >>>
> >>> INPUT: A data frame whose column "Lower" is a character column containing
> >>> numeric values (the number of values differs from row to row, but is
> >>> mostly 2)
> >>>
> >>>> dput(dd)
> >>> structure(list(State = c("Alabama", "Alaska", "Arizona", "Arkansas",
> >>> "California"), Lower = c("R 72–33", "R/Coalition 27(23 R, 4 D)–12 D, 1 Ind.",
> >>> "R 36–24", "R 64–35, 1 Ind.", "D 52–28"), Upper = c("R 26–8, 1 Ind.",
> >>> "R/Coalition 15(14 R, 1 D)–5 D", "R 18–12", "R 24–11", "D 26–14"
> >>> )), .Names = c("State", "Lower", "Upper"), row.names = c(NA,
> >>> 5L), class = "data.frame")
> >>>
> >>> PROBLEM: I need to extract all numeric values in each row and sum them.
> >>> There are a few exceptions, such as row 2, but these can be ignored and
> >>> will be fixed manually.
> >>>
> >>> SOLUTION SO FAR:
> >>> str_extract_all(dd[[2]],"[[:digit:]]+") returns a list of numbers as
> >>> character. I am unable to unlist it, because that mixes them all
> >>> together, ...
> >>>
> >>> And if I may add, is there a "dplyr" way of doing it ...
> >>>
> >>>
> >>> Thanks
> >>>
> >>> ______________________________________________
> >>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.
> >>
>
