[R] Sum of columns of a data frame equal to NA when all the elements are NA

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Thu Mar 22 17:00:34 CET 2018


I can see that one might regard having

sum( sum( 1 ), sum( NULL ) ) == sum( 1 )

be TRUE as a necessary consistency, but going down that road one might expect Bert's

v+NULL == v

for all numeric vectors also. I have always avoided that construction as poor computing practice, but if NULL is supposed to represent the empty set mathematically [1] then this would seem to follow. 
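
To make the contrast concrete, here is a minimal sketch in base R. The vector `v` is only an illustrative value I am assuming here; it does not come from earlier in the thread.

```r
# sum() treats NULL like an empty vector, and the empty sum is zero,
# so nesting sums is consistent:
sum(NULL)                            # 0
sum(sum(1), sum(NULL)) == sum(1)     # TRUE

# Arithmetic with NULL yields a zero-length result instead, so the
# analogous identity v + NULL == v does not hold element-wise:
v <- c(1, 2, 3)                      # illustrative vector (assumption)
length(v + NULL)                     # 0
identical(v + NULL, numeric(0))      # TRUE
```

So the `sum` function and the `+` operator disagree about what NULL contributes, which is the inconsistency Bert pointed out.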

[1] https://cran.r-project.org/doc/contrib/de_Jonge+van_der_Loo-Introduction_to_data_cleaning_with_R.pdf
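
For Stefano's original question further down the thread, one possible approach is sketched below. The helper name `row_sums_na` is hypothetical (it is not a base R function), and the small data frame is made up for illustration; the sketch assumes a numeric data frame or matrix and uses only `rowSums` and `is.na`.

```r
# Hypothetical helper: behaves like rowSums(x, na.rm = TRUE), but returns
# NA for rows in which every selected element is NA.
row_sums_na <- function(x) {
  x <- as.matrix(x)
  s <- rowSums(x, na.rm = TRUE)
  all_na <- rowSums(!is.na(x)) == 0   # TRUE where a row has no non-NA value
  s[all_na] <- NA_real_
  s
}

# Illustrative data (assumption): row 1 is entirely NA, row 2 partly NA.
df <- data.frame(A = c(NA, 1, 2), B = c(NA, NA, 3), C = c(NA, 5, 6))
res <- row_sums_na(df[, c("A", "B")])
res   # first element is NA, then 1 and 5
```

Counting the non-NA entries per row with a second `rowSums` call keeps the whole thing vectorised, which avoids calling `ifelse` or `apply` row by row.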

-- 
Sent from my phone. Please excuse my brevity.

On March 21, 2018 1:06:46 PM PDT, Bert Gunter <bgunter.4567 at gmail.com> wrote:
>"I see: consistency with additive identity. "
>
>Ummm, well:
>
>> 1+NULL
>numeric(0)
>
>> sum(1,NULL)
>[1] 1
>
>Of course, there could well be something here I don't get, but that
>doesn't look very consistent to me. However, as I said privately, so
>long as the corner case behavior is documented, which it is, I don't
>care.
>
>Cheers,
>Bert
>
>
>
>Bert Gunter
>
>"The trouble with having an open mind is that people keep coming along
>and sticking things into it."
>-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>On Wed, Mar 21, 2018 at 10:26 AM, Boris Steipe
><boris.steipe at utoronto.ca> wrote:
>
>> I see: consistency with additive identity. That makes sense. Thanks.
>>
>> B.
>>
>>
>> > On Mar 21, 2018, at 1:22 PM, peter dalgaard <pdalgd at gmail.com> wrote:
>> >
>> > No. The empty sum is zero. Adding it to another sum should not
>> > change it. Nothing audacious about that. This is consistent; other
>> > definitions just cause trouble.
>> >
>> > -pd
>> >
>> >> On 21 Mar 2018, at 18:05 , Boris Steipe <boris.steipe at utoronto.ca> wrote:
>> >>
>> >> Surely the result of summation of non-existent values is not
>> >> defined, is it? And since the NA values have been _removed_,
>> >> there's nothing left to sum over. In fact, pretending that the
>> >> result in that case is zero would appear audacious, no?
>> >>
>> >> Cheers,
>> >> Boris
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>> On Mar 21, 2018, at 12:58 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
>> >>>
>> >>> What do you mean by "should not"?
>> >>>
>> >>> NULL means "missing object" in R. The result of the sum function
>> >>> is always expected to be numeric... so NA_real_ or NA_integer_
>> >>> could make sense as possible return values. But you cannot compute
>> >>> on NULL so no, that doesn't work.
>> >>>
>> >>> See the note under the "Value" section of ?sum as to why zero is
>> >>> returned when all inputs are removed.
>> >>> --
>> >>> Sent from my phone. Please excuse my brevity.
>> >>>
>> >>> On March 21, 2018 9:03:29 AM PDT, Boris Steipe <boris.steipe at utoronto.ca> wrote:
>> >>>> Should not the result be NULL if you have removed the NA with
>> >>>> na.rm=TRUE ?
>> >>>>
>> >>>> B.
>> >>>>
>> >>>>
>> >>>>
>> >>>>> On Mar 21, 2018, at 11:44 AM, Stefano Sofia <stefano.sofia at regione.marche.it> wrote:
>> >>>>>
>> >>>>> Dear list users,
>> >>>>> let me ask you this trivial question. I have been working on it
>> >>>>> for a long time now.
>> >>>>> Suppose I have a data frame with NAs and want to sum some columns
>> >>>>> with rowSums:
>> >>>>>
>> >>>>> df <- data.frame(A = runif(10), B = runif(10), C = rnorm(10))
>> >>>>> df[1, ] <- NA
>> >>>>> rowSums(df[ , which(names(df) %in% c("A","B"))], na.rm = TRUE)
>> >>>>>
>> >>>>> If all the elements of the selected columns in a row are NA,
>> >>>>> rowSums returns 0 while I need NA.
>> >>>>> Is there an easy and efficient way to use rowSums within a
>> >>>>> function like
>> >>>>>
>> >>>>> function(x) ifelse(all(is.na(x)), as.numeric(NA), rowSums...)?
>> >>>>>
>> >>>>> or an equivalent function?
>> >>>>>
>> >>>>> Thank you for your help
>> >>>>> Stefano
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>      (oo)
>> >>>>> --oOO--( )--OOo----------------
>> >>>>> Stefano Sofia PhD
>> >>>>> Area Meteorologica e  Area nivologica - Centro Funzionale
>> >>>>> Servizio Protezione Civile - Regione Marche
>> >>>>> Via del Colle Ameno 5
>> >>>>> 60126 Torrette di Ancona, Ancona
>> >>>>> Uff: 071 806 7743
>> >>>>> E-mail: stefano.sofia at regione.marche.it
>> >>>>> ---Oo---------oO----------------
>> >>>>>
>> >>>>>
>> >>>>> ______________________________________________
>> >>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> >>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>> >>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> >>>>> and provide commented, minimal, self-contained, reproducible code.
>> >>>>
>> >>
>> >
>> > --
>> > Peter Dalgaard, Professor,
>> > Center for Statistics, Copenhagen Business School
>> > Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>> > Phone: (+45)38153501
>> > Office: A 4.23
>> > Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
>>
>>


