[R] [FORGED] Histogram for Left Censored Data

Sat Jan 2 17:38:51 CET 2016

> On Jan 2, 2016, at 2:24 AM, Steven Stoline <sstoline at gmail.com> wrote:
> 
> Dear David:
> 
> Thank you very much for the code; it works very well for this data set.
> 
> I just have one more question (if it is not a bother).
> 
> What if some of the non-censored (fully measured) data are equal to the detection limit?
> 
> As an example, in the data set below there are 16 censored observations with a detection limit of 0.01, and some non-censored observations are also equal to 0.01 (the detection limit). I am wondering whether we can still distinguish between them in the histogram. I tried to modify your code, but I could not make it work for this situation.

I would probably construct an intermediate copy of the dataset in which the censored items (those reported as below the detection limit) are "lowered" to a value that really is below the detection limit, and then set the breaks parameter so that the first bin holds only those lowered items and the real 0.01 items fall in the second bin.

(That actually mimics what I usually do with the actual values in regression situations. I consider the measurements "below the detection limit" to still be meaningful.)
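
A minimal sketch of that idea, rebuilt compactly from the posted vectors. The names `plot_value` and `brks`, the choice of DL/2 as the "lowered" value, the break points, and the colors are assumptions made for illustration, not the only reasonable choices:

```r
# Sketch only: plot_value, brks, and the DL/2 substitute are illustrative assumptions.
DL <- 0.01                                      # detection limit from the post

# Compact reconstruction of the posted NH3Nconcentrations and censored vectors
conc <- c(rep(0.01, 28), rep(0.02, 14), rep(0.03, 7),
          rep(0.04, 6), rep(0.05, 3), 0.06, 0.47)
censored <- rep(c(TRUE, FALSE), c(16, 44))

# Intermediate copy: censored items are moved strictly below the detection limit
plot_value <- ifelse(censored, DL / 2, conc)

# The first bin catches only the lowered items; measured 0.01 values land in the second
brks <- c(0, 0.0075, seq(0.015, 0.065, by = 0.01), 0.5)

h <- hist(plot_value, breaks = brks,
          col = c("red", rep("grey", length(brks) - 2)),
          xaxt = "n", xlab = "NH3N concentration",
          main = "Left-censored values drawn as a separate bar")

# Rebuild the x axis: label the censored bar "<0.01", the other bars by value
axis(1, at = h$mids[1:7],
     labels = c("<0.01", format(seq(0.01, 0.06, by = 0.01))))

h$counts   # the first two entries separate censored from measured 0.01 values
```

Because the breaks are unequal, hist() plots densities rather than counts by default. Passing xlim = c(0, 0.07) would zoom in on the low end, at the cost of hiding the single 0.47 value.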

-- 
David.
> 
> I created a data frame; I want to create a histogram for the variable "NH3Nconcentrations" (second column in the data frame).
> 
> 
> Once again, thank you very much for your help.
> 
> 
> 
> 
> cen<-c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
> 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0)
> 
> censored<-c(TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,
> FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,
> FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,
> FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE,FALSE)
> 
> data.original<-c("<0.01","<0.01","<0.01","<0.01","<0.01","<0.01","<0.01","<0.01","<0.01","<0.01",
> "<0.01","<0.01","<0.01","<0.01","<0.01","<0.01",0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,
> 0.01,0.01,0.02,0.02,0.02,0.02,0.02,0.02,0.02,0.02,0.02,0.02,0.02,0.02,0.02,0.02,0.03,0.03,0.03,0.03,
> 0.03,0.03,0.03,0.04,0.04,0.04,0.04,0.04,0.04,0.05,0.05,0.05,0.06,0.47)
> 
> NH3Nconcentrations<-c(0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01
> ,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.02,0.02,0.02,0.02,0.02,0.02,0.02,
> 0.02,0.02,0.02,0.02,0.02,0.02,0.03,0.03,0.03,0.03,0.03,0.03,0.03,0.04,0.04,0.04,0.04,0.04,0.04,0.05,
> 0.05,0.05,0.06,0.47)
> 
> NH3N.concentrations<-data.frame(data.original,NH3Nconcentrations,cen,censored)
> 
> attach(NH3N.concentrations)
> 
> 
> NH3N.concentrations
> 
> 
> 
> with many thanks
> steve
> 
> On Fri, Jan 1, 2016 at 3:42 PM, David Winsemius <dwinsemius at comcast.net> wrote:
> 
>> On Jan 1, 2016, at 3:45 AM, Steven Stoline <sstoline at gmail.com> wrote:
>> 
>> Dear Rolf:
>> 
>> 
>> The histogram should contain a bar (or bars) for the censored data values, replaced
>> by their detection limit(s), in a different color from the bars for the
>> noncensored values. In this example there are only 3 censored values with
>> only one detection limit, DL = 1450.
>> 
>> 
>> with many thanks
>> steve
>> 
>> 
>> 
>> On Thu, Dec 31, 2015 at 4:16 PM, Rolf Turner <r.turner at auckland.ac.nz>
>> wrote:
>> 
>>> On 31/12/15 23:20, Steven Stoline wrote:
>>> 
>>>> Dear All:
>>>> 
>>>> I need help with creating histograms for data that include left
>>>> censored observations.
>>>> 
>>>> Here is an example of left censored data
>>>> 
>>>> 
>>>> 
>>>> Sulfate.Concentration
>>>> <-matrix(c(1450,1800,1840,1820,1860,1780,1760,1800,1900,1770,1790,
>>>> 1780,1850,1760,1450,1710,1575,1475,1780,1790,1780,1450,1790,1800,
>>>> 1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0),24,2)
>>>> 
> 
>  myhist <- hist(sulfate[,1], breaks=c(1400,1451,1500,1600,1700,1800,1900), col=c(1,rep(2,5)), xaxt="n")
> #  plots with no x axis labeling
>  myhist
> #------------------
> $breaks
> [1] 1400 1451 1500 1600 1700 1800 1900
> 
> $counts
> [1]  3  1  1  0 14  5
> 
> $density
> [1] 0.0024509804 0.0008503401 0.0004166667 0.0000000000 0.0058333333 0.0020833333
> 
> $mids
> [1] 1425.5 1475.5 1550.0 1650.0 1750.0 1850.0
> 
> $xname
> [1] "sulfate[, 1]"
> 
> $equidist
> [1] FALSE
> 
> attr(,"class")
> [1] "histogram"
> #---rebuild the x-axis ----------------
>  axis(1, at=c(myhist$mids[1],myhist$breaks[-(1:2)]), labels=c("<1450", myhist$breaks[-(1:2)]))
> 
> <Rplot001.png>
> 
> -- 
> David.
> 
>>>> 
>>>> Column 2 is an indicator for censoring: "1" for left-censored
>>>> observations and "0" for non-censored (fully measured)
>>>> observations.
>>>> 
>>> 
>>> And what, pray tell, do you want the resulting histogram to look like?
>>> See e.g. fortune("mind_read").
>>> 
>>> cheers,
>>> 
>>> Rolf Turner
>>> 
>>> --
>>> Technical Editor ANZJS
>>> Department of Statistics
>>> University of Auckland
>>> Phone: +64-9-373-7599 ext. 88276
>>> 
>> 
>> 
>> 
>> -- 
>> Steven M. Stoline
>> 1123 Forest Avenue
>> Portland, ME 04112
>> sstoline at gmail.com
>> 
> 
> David Winsemius
> Alameda, CA, USA

David Winsemius
Alameda, CA, USA