[R] sum some columns for each row
Dawn
dawn1313 at gmail.com
Wed Jul 15 01:05:37 CEST 2015
I used two rows to test the data frame, as follows.
> dat <- read.table("TOV_43_Protein_Clusters_abundance1.tab",
+                   header = TRUE, sep = "\t")
> dat1 <- dat[1:2,]
> str(dat1)
'data.frame': 2 obs. of 44 variables:
$ X : Factor w/ 1075762 levels "","POV_Cluster_1000001",..: 305266 625028
$ X109DCM: Factor w/ 46 levels "","1","10","109DCM",..: 1 1
$ X109SUR: Factor w/ 41 levels "","1","10","109SUR",..: 1 1
$ X18DCM : Factor w/ 31 levels "","1","10","11",..: 1 1
$ X18SUR : Factor w/ 25 levels "","1","10","11",..: 1 1
$ X22SUR : Factor w/ 50 levels "","1","10","11",..: 1 2
$ X23DCM : Factor w/ 46 levels "","1","10","11",..: 1 1
$ X25DCM : Factor w/ 42 levels "","1","10","11",..: 1 1
$ X25SUR : Factor w/ 47 levels "","1","10","11",..: 1 1
$ X30DCM : Factor w/ 34 levels "","1","10","11",..: 1 1
$ X31SUR : Factor w/ 43 levels "","1","10","11",..: 1 1
$ X32DCM : Factor w/ 15 levels "","1","10","11",..: 1 1
$ X32SUR : Factor w/ 58 levels "","1","10","11",..: 1 1
$ X34DCM : Factor w/ 53 levels "","1","10","11",..: 1 35
$ X34SUR : Factor w/ 47 levels "","1","10","11",..: 10 14
$ X36DCM : Factor w/ 48 levels "","1","10","11",..: 2 43
$ X36SUR : Factor w/ 45 levels "","1","10","11",..: 23 38
$ X38DCM : Factor w/ 40 levels "","1","10","11",..: 3 23
$ X38SUR : Factor w/ 44 levels "","1","10","11",..: 7 41
$ X39DCM : Factor w/ 38 levels "","1","10","11",..: 34 38
$ X39SUR : Factor w/ 40 levels "","1","10","11",..: 13 40
$ X41DCM : Factor w/ 47 levels "","1","10","11",..: 13 40
$ X41SUR : Factor w/ 40 levels "","1","10","11",..: 1 1
$ X42DCM : Factor w/ 48 levels "","1","10","11",..: 2 3
$ X42SUR : Factor w/ 41 levels "","1","10","11",..: 2 1
$ X46SUR : Factor w/ 31 levels "","1","10","11",..: 2 2
$ X52DCM : Factor w/ 49 levels "","1","10","11",..: 13 23
$ X64DCM : Factor w/ 35 levels "","1","10","11",..: 1 2
$ X64SUR : Factor w/ 36 levels "","1","10","11",..: 1 1
$ X65DCM : Factor w/ 38 levels "","1","10","11",..: 1 1
$ X65SUR : Factor w/ 35 levels "","1","10","11",..: 1 1
$ X66DCM : Factor w/ 27 levels "","1","10","11",..: 1 1
$ X66SUR : Factor w/ 35 levels "","1","10","11",..: 1 1
$ X67SUR : Factor w/ 38 levels "","1","10","11",..: 1 1
$ X68DCM : Factor w/ 33 levels "","1","10","11",..: 1 1
$ X68SUR : Factor w/ 36 levels "","1","10","11",..: 1 1
$ X70MES : Factor w/ 23 levels "","1","10","11",..: 1 1
$ X70SUR : Factor w/ 37 levels "","1","10","11",..: 1 1
$ X72DCM : Factor w/ 40 levels "","1","10","11",..: 13 27
$ X72SUR : Factor w/ 38 levels "","1","10","11",..: 1 1
$ X76DCM : Factor w/ 44 levels "","1","10","11",..: 1 1
$ X76SUR : Factor w/ 34 levels "","1","10","11",..: 1 1
$ X82DCM : Factor w/ 29 levels "","1","10","11",..: 1 1
$ X85DCM : Factor w/ 30 levels "","1","10","11",..: 1 1
Thank you!!
Dawn
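
In the str() output above, every column is a factor, the levels include the empty string "", and X109DCM even has a level "109DCM", which looks like column-name text sitting inside the data. The following is only a sketch of how such columns might be inspected and converted before calling rowSums(). The data frame `toy`, its values, and the helper `to_num` are made up for illustration and are not taken from the real file.

```r
# Toy stand-in for 'dat' (hypothetical values): the columns become factors
# because of blank cells and a stray header-like string.
toy <- data.frame(
  X       = c("POV_Cluster_1", "POV_Cluster_2", "POV_Cluster_3"),
  X109DCM = c("", "3", "109DCM"),
  X18DCM  = c("1", "", "4"),
  X22SUR  = c("2", "5", ""),
  stringsAsFactors = TRUE
)

# Convert through character so the factor labels are used, not the
# internal level codes. Entries that are not numbers become NA.
to_num <- function(x) suppressWarnings(as.numeric(as.character(x)))

# Columns whose names contain "DCM"
dcm_cols <- grep("DCM", names(toy), fixed = TRUE)

# For each selected column, list the non-blank values that fail to convert
bad <- lapply(toy[dcm_cols], function(x) {
  v <- as.character(x)
  unique(v[is.na(to_num(v)) & v != ""])
})
print(bad)

# Convert the selected columns, then sum them per row, ignoring NA
toy[dcm_cols] <- lapply(toy[dcm_cols], to_num)
dcm_sum <- rowSums(toy[, dcm_cols, drop = FALSE], na.rm = TRUE)
print(dcm_sum)
```

Listing the offending values first shows what would silently turn into NA. Calling as.numeric() directly on a factor returns the level codes rather than the printed values, which is why the conversion goes through as.character().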
On Tue, Jul 14, 2015 at 3:48 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
wrote:
> I suspect your data frame "dat" has non-numeric data in some of the
> columns that have ABC in their names. Any column of a data frame can be
> numeric or not, but the data frame as a unit cannot be numeric. If your
> data file has odd characters in some of the otherwise-numeric columns, the
> whole column will be read in as a factor or as character strings. Look at the
> output of str(dat) for columns that don't show "num". If you can find the
> column, and then one of the bad rows, you can use a text editor to fix them
> manually, or show us examples of the bad data and we can suggest ways to
> fix it in R.
> ---------------------------------------------------------------------------
> Jeff Newmiller
> DCN: <jdnewmil at dcn.davis.ca.us>
> Research Engineer (Solar/Batteries/Software/Embedded Controllers)
> ---------------------------------------------------------------------------
> Sent from my phone. Please excuse my brevity.
>
> On July 14, 2015 2:35:38 PM PDT, Dawn <dawn1313 at gmail.com> wrote:
> >Hi,
> >
> >I used a small set of data (several columns and rows) and it works fine
> >using the following command:
> >abc <- rowSums(test[,grep("ABC",names(test),fixed=T)],na.rm=T)
> >
> >But when I used the real big data table, "Error in rowSums(dat[,
> >grep("ABC", names(dat), fixed = T)], na.rm = T) :
> > 'x' must be numeric"
> >Then it didn't work either using as.numeric():
> >> as.numeric(dat)
> >Error: (list) object cannot be coerced to type 'double'
> >
> >Thanks!
> >Dawn
> >
> >
> >
> >
> >On Fri, Jul 10, 2015 at 4:35 PM, Dawn <dawn1313 at gmail.com> wrote:
> >
> >> Thank you all, and sorry for the messy data. It has worked!
> >>
> >> Best,
> >> Dawn
> >>
> >> On Fri, Jul 10, 2015 at 4:15 AM, Jim Lemon <drjimlemon at gmail.com> wrote:
> >>
> >>> Hi Dawn,
> >>> Your data are a bit messed up, but try the following:
> >>>
> >>> colSums(dat[,grep("ABC",names(dat),fixed=TRUE)],na.rm=TRUE)
> >>> colSums(dat[,grep("XYZ",names(dat),fixed=TRUE)],na.rm=TRUE)
> >>>
> >>> I'm assuming that you want to discard the NA values.
> >>>
> >>> Jim
> >>>
> >>> On Fri, Jul 10, 2015 at 6:52 AM, Rui Barradas <ruipbarradas at sapo.pt>
> >>> wrote:
> >>> > Hello,
> >>> >
> >>> > Please use ?dput to give a data example, like this it's completely
> >>> > unreadable. If your data.frame is named 'dat' use
> >>> >
> >>> > dput(head(dat, 30)) # paste the output of this in your mail
> >>> >
> >>> >
> >>> > And don't post in HTML; use plain text only, as the posting guide says.
> >>> >
> >>> > Rui Barradas
> >>> >
> >>> >
> >>> > On 09-07-2015 18:12, Dawn wrote:
> >>> >>
> >>> >> Hi,
> >>> >>
> >>> >> I have a big dataframe as follows
> >>> >>
> >>> >> 109ABC 109XYZ 18ABC 18XYZ 22XYZ 23ABC 25ABC 25XYZ 30ABC 31XYZ 32ABC
> >>> >> 32XYZ 34DCM 34XYZ 36ABC 36SUR 38DCM 38XYZ 39DCM 39SUR 41DCM 41SUR
> >>> >> 42DCM 42SUR 46SUR 52DCM 64ABC 64XYZ 65ABC 65XYZ 66ABC 66XYZ 67XYZ
> >>> >> 68ABC 68SUR 70MES 70SUR 72ABC 72XYZ 76ABC 76XYZ 82ABC 85ABC POV
> >>> >> Cluster_1 17 1 3 10 14 5 2 2 1 1 1 2 2 TT:61
> >>> >> Cluster_2 1 4 20 6 5 3 6 9 9 6 10 1 3 1 4 TT:88
> >>> >> Cluster_3 3 3 6 4 17 17 18 13 17 19 22 11 5 21 8 5 18 4 7 9 TT:227
> >>> >> ........
> >>> >>
> >>> >> I want to get two columns, i.e., one that sums, for each row, all
> >>> >> columns whose names include ABC, and another that sums, for each
> >>> >> row, all columns whose names include XYZ.
> >>> >>
> >>> >> Is there some help? Thank you!
> >>> >> Dawn
> >>> >>
> >>> >> [[alternative HTML version deleted]]
> >>> >>
> >>> >> ______________________________________________
> >>> >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> >> PLEASE do read the posting guide
> >>> >> http://www.R-project.org/posting-guide.html
> >>> >> and provide commented, minimal, self-contained, reproducible code.
> >>> >>
> >>> >
> >>>
> >>
> >>
>
>