[R] sum some columns for each row
David Winsemius
dwinsemius at comcast.net
Wed Jul 15 03:28:26 CEST 2015
On Jul 14, 2015, at 4:49 PM, Dawn wrote:
> I attached the file
Well, you may have attached it, but you evidently did not read the posting guide about which filetypes are accepted by the mailserver.
> .... including the first two rows; please help me make it
> a numeric data frame. Hopefully the following command works:
>
> dcm <- rowSums(dat1[,grep("DCM",names(dat1),fixed=T)],na.rm=T)
How do you expect that to deliver anything meaningful if all of your columns are factor class?
That was the reason for this error in an earlier posting of yours:
But when I used the real big data table, "Error in rowSums(dat[,
grep("ABC", names(dat), fixed = T)], na.rm = T) :
'x' must be numeric"
You are not paying attention to the responses you have received so far.
I think Bert Gunter's suggestion that you need to work through more introductory tutorials is on point.
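
A minimal sketch of the factor problem described above, using toy data invented for illustration (the values and the helper name to_num are assumptions, not taken from the attached file). It shows one way to find the entries that prevent a column from being numeric, and then to convert by way of character before summing:

```r
# Toy data frame mimicking the str(dat1) output: counts read in as factors
# because of blank and non-numeric entries (all values here are made up).
dat1 <- data.frame(
  X       = c("POV_Cluster_1", "POV_Cluster_2"),
  X34DCM  = factor(c("", "17")),
  X36DCM  = factor(c("1", "4")),
  X109DCM = factor(c("", "109DCM")),  # a stray header-like value
  X34SUR  = factor(c("3", "20")),
  stringsAsFactors = TRUE
)

# Positions of the columns whose names contain "DCM"
dcm_cols <- grep("DCM", names(dat1), fixed = TRUE)

# Convert via character. as.numeric() applied directly to a factor returns
# the internal level codes, not the values. Blank or non-numeric entries
# become NA; the coercion warning is suppressed here (hypothetical helper).
to_num <- function(f) suppressWarnings(as.numeric(as.character(f)))
dcm_num <- as.data.frame(lapply(dat1[dcm_cols], to_num))

# List the non-blank entries that failed to convert, column by column.
# These are worth inspecting before trusting any sums.
bad <- lapply(dat1[dcm_cols], function(f) {
  v <- as.character(f)
  unique(v[is.na(to_num(f)) & !is.na(v) & nzchar(v)])
})

# Row sums over the converted columns, ignoring the NA entries
dcm <- rowSums(dcm_num, na.rm = TRUE)
dcm  # 1 and 21 for the two toy rows
```

If the entries listed in bad turn out to be a repeated header line, it is probably cleaner to fix the file or the read.table() call than to coerce after the fact.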
--
David.
>
> Thank you very much!
> Dawn
>
> On Tue, Jul 14, 2015 at 4:36 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
> wrote:
>
>> Well it is pretty obvious that all of your columns have non-numeric data
>> in them, but you are the only one who can tell which ones should have been
>> numeric, and you are also the one who can peruse your data file in a text
>> editor.
>> ---------------------------------------------------------------------------
>> Jeff Newmiller The ..... ..... Go Live...
>> DCN:<jdnewmil at dcn.davis.ca.us> Basics: ##.#. ##.#. Live Go...
>> Live: OO#.. Dead: OO#.. Playing
>> Research Engineer (Solar/Batteries O.O#. #.O#. with
>> /Software/Embedded Controllers) .OO#. .OO#. rocks...1k
>> ---------------------------------------------------------------------------
>> Sent from my phone. Please excuse my brevity.
>>
>> On July 14, 2015 4:05:37 PM PDT, Dawn <dawn1313 at gmail.com> wrote:
>>> I used two rows to test the data frame, as follows.
>>>
>>>> dat <- read.table("TOV_43_Protein_Clusters_abundance1.tab", header=TRUE,sep = "\t")
>>>> dat1 <- dat[1:2,]
>>>> str(dat1)
>>> 'data.frame': 2 obs. of 44 variables:
>>> $ X : Factor w/ 1075762 levels "","POV_Cluster_1000001",..: 305266 625028
>>> $ X109DCM: Factor w/ 46 levels "","1","10","109DCM",..: 1 1
>>> $ X109SUR: Factor w/ 41 levels "","1","10","109SUR",..: 1 1
>>> $ X18DCM : Factor w/ 31 levels "","1","10","11",..: 1 1
>>> $ X18SUR : Factor w/ 25 levels "","1","10","11",..: 1 1
>>> $ X22SUR : Factor w/ 50 levels "","1","10","11",..: 1 2
>>> $ X23DCM : Factor w/ 46 levels "","1","10","11",..: 1 1
>>> $ X25DCM : Factor w/ 42 levels "","1","10","11",..: 1 1
>>> $ X25SUR : Factor w/ 47 levels "","1","10","11",..: 1 1
>>> $ X30DCM : Factor w/ 34 levels "","1","10","11",..: 1 1
>>> $ X31SUR : Factor w/ 43 levels "","1","10","11",..: 1 1
>>> $ X32DCM : Factor w/ 15 levels "","1","10","11",..: 1 1
>>> $ X32SUR : Factor w/ 58 levels "","1","10","11",..: 1 1
>>> $ X34DCM : Factor w/ 53 levels "","1","10","11",..: 1 35
>>> $ X34SUR : Factor w/ 47 levels "","1","10","11",..: 10 14
>>> $ X36DCM : Factor w/ 48 levels "","1","10","11",..: 2 43
>>> $ X36SUR : Factor w/ 45 levels "","1","10","11",..: 23 38
>>> $ X38DCM : Factor w/ 40 levels "","1","10","11",..: 3 23
>>> $ X38SUR : Factor w/ 44 levels "","1","10","11",..: 7 41
>>> $ X39DCM : Factor w/ 38 levels "","1","10","11",..: 34 38
>>> $ X39SUR : Factor w/ 40 levels "","1","10","11",..: 13 40
>>> $ X41DCM : Factor w/ 47 levels "","1","10","11",..: 13 40
>>> $ X41SUR : Factor w/ 40 levels "","1","10","11",..: 1 1
>>> $ X42DCM : Factor w/ 48 levels "","1","10","11",..: 2 3
>>> $ X42SUR : Factor w/ 41 levels "","1","10","11",..: 2 1
>>> $ X46SUR : Factor w/ 31 levels "","1","10","11",..: 2 2
>>> $ X52DCM : Factor w/ 49 levels "","1","10","11",..: 13 23
>>> $ X64DCM : Factor w/ 35 levels "","1","10","11",..: 1 2
>>> $ X64SUR : Factor w/ 36 levels "","1","10","11",..: 1 1
>>> $ X65DCM : Factor w/ 38 levels "","1","10","11",..: 1 1
>>> $ X65SUR : Factor w/ 35 levels "","1","10","11",..: 1 1
>>> $ X66DCM : Factor w/ 27 levels "","1","10","11",..: 1 1
>>> $ X66SUR : Factor w/ 35 levels "","1","10","11",..: 1 1
>>> $ X67SUR : Factor w/ 38 levels "","1","10","11",..: 1 1
>>> $ X68DCM : Factor w/ 33 levels "","1","10","11",..: 1 1
>>> $ X68SUR : Factor w/ 36 levels "","1","10","11",..: 1 1
>>> $ X70MES : Factor w/ 23 levels "","1","10","11",..: 1 1
>>> $ X70SUR : Factor w/ 37 levels "","1","10","11",..: 1 1
>>> $ X72DCM : Factor w/ 40 levels "","1","10","11",..: 13 27
>>> $ X72SUR : Factor w/ 38 levels "","1","10","11",..: 1 1
>>> $ X76DCM : Factor w/ 44 levels "","1","10","11",..: 1 1
>>> $ X76SUR : Factor w/ 34 levels "","1","10","11",..: 1 1
>>> $ X82DCM : Factor w/ 29 levels "","1","10","11",..: 1 1
>>> $ X85DCM : Factor w/ 30 levels "","1","10","11",..: 1 1
>>>
>>>
>>> Thank you!!
>>> Dawn
>>>
>>> On Tue, Jul 14, 2015 at 3:48 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
>>>
>>>> I suspect your data frame "dat" has non-numeric data in some of the
>>>> columns that have ABC in their names. Any column of a data frame can be
>>>> numeric or not, but the data frame as a unit cannot be numeric. If your
>>>> data file has odd characters in some of the otherwise-numeric columns, the
>>>> whole column will be read in as a factor or character strings. Look at the
>>>> output of str(dat) for columns that don't show "num". If you can find the
>>>> column, and then one of the bad rows, you can use a text editor to fix them
>>>> manually, or show us examples of the bad data and we can suggest ways to
>>>> fix it in R.
>>>>
>>>>
>>>> On July 14, 2015 2:35:38 PM PDT, Dawn <dawn1313 at gmail.com> wrote:
>>>>> Hi,
>>>>>
>>>>> I used a small set of data (several columns and rows) and it works fine
>>>>> using the following command:
>>>>> abc <- rowSums(test[,grep("ABC",names(test),fixed=T)],na.rm=T)
>>>>>
>>>>> But when I used the real big data table, "Error in rowSums(dat[,
>>>>> grep("ABC", names(dat), fixed = T)], na.rm = T) :
>>>>> 'x' must be numeric"
>>>>> Then it didn't work either using as.numeric():
>>>>>> as.numeric(dat)
>>>>> Error: (list) object cannot be coerced to type 'double'
>>>>>
>>>>> Thanks!
>>>>> Dawn
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Jul 10, 2015 at 4:35 PM, Dawn <dawn1313 at gmail.com> wrote:
>>>>>
>>>>>> Thank you all and sorry for the data messing. It has worked!
>>>>>>
>>>>>> Best,
>>>>>> Dawn
>>>>>>
>>>>>> On Fri, Jul 10, 2015 at 4:15 AM, Jim Lemon <drjimlemon at gmail.com> wrote:
>>>>>>
>>>>>>> Hi Dawn,
>>>>>>> Your data are a bit messed up, but try the following:
>>>>>>>
>>>>>>> colSums(dat[,grep("ABC",names(dat),fixed=TRUE)],na.rm=TRUE)
>>>>>>> colSums(dat[,grep("XYZ",names(dat),fixed=TRUE)],na.rm=TRUE)
>>>>>>>
>>>>>>> I'm assuming that you want to discard the NA values.
>>>>>>>
>>>>>>> Jim
>>>>>>>
>>>>>>> On Fri, Jul 10, 2015 at 6:52 AM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> Please use ?dput to give a data example; like this it's completely
>>>>>>>> unreadable. If your data.frame is named 'dat' use
>>>>>>>>
>>>>>>>> dput(head(dat, 30)) # paste the output of this in your mail
>>>>>>>>
>>>>>>>>
>>>>>>>> And don't post in html, use plain text only, like the posting guide
>>>>>>>> says.
>>>>>>>>
>>>>>>>> Rui Barradas
>>>>>>>>
>>>>>>>>
>>>>>>>> On 09-07-2015 18:12, Dawn wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I have a big dataframe as follows
>>>>>>>>>
>>>>>>>>> 109ABC 109XYZ 18ABC 18XYZ 22XYZ 23ABC
>>>>> 25ABC
>>>>>>>>> 25XYZ
>>>>>>>>> 30ABC 31XYZ 32ABC 32XYZ 34DCM 34XYZ
>>> 36ABC
>>>>>>> 36SUR
>>>>>>>>> 38DCM 38XYZ 39DCM 39SUR 41DCM 41SUR 42DCM
>>>>> 42SUR
>>>>>>>>> 46SUR 52DCM 64ABC 64XYZ 65ABC 65XYZ 66ABC
>>>>> 66XYZ
>>>>>>>>> 67XYZ 68ABC 68SUR 70MES 70SUR 72ABC 72XYZ
>>>>> 76ABC
>>>>>>>>> 76XYZ 82ABC 85ABC POV
>>>>>>>>> Cluster_1
>>>>> 17
>>>>>>> 1
>>>>>>>>> 3 10 14 5 2 2 1 1 1 2
>>>>>>>>> 2 TT:61
>>>>>>>>> Cluster_2 1
>>> 4
>>>>> 20
>>>>>>>>> 6 5 3 6 9 9 6 10 1 3 1
>>>>>>>>> 4
>>> TT:88
>>>>>>>>> Cluster_3 3 3 6 4
>>>>> 17
>>>>>>>>> 17 18 13 17 19 22 11 5 21 8 5
>>> 18
>>>>> 4
>>>>>>>>> 7 9
>>>>>>>>> TT:227
>>>>>>>>> ........
>>>>>>>>>
>>>>>>>>> I want to get two new columns: one that sums, for each row, all
>>>>>>>>> columns whose names include ABC, and another that sums, for each
>>>>>>>>> row, all columns whose names include XYZ.
>>>>>>>>>
>>>>>>>>> Is there some help? Thank you!
>>>>>>>>> Dawn
>>>>>>>>>
>>>>>>>>> [[alternative HTML version deleted]]
>>>>>>>>>
>>>>>>>>> ______________________________________________
>>>>>>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>>>>> PLEASE do read the posting guide
>>>>>>>>> http://www.R-project.org/posting-guide.html
>>>>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>>>>>
David Winsemius
Alameda, CA, USA