[R] Finding Source of Error Message of 'Non-Unique Index Entries'

David Winsemius dwinsemius at comcast.net
Wed Jan 4 19:59:43 CET 2012


Nothing was attached. I don't know what you named the "compressed dput
output", but it did not pass the mail server's filters and you did
not copy me. If chemdata is available as a text file, then make sure
its extension is .txt and attach it.

-- 
David.

On Jan 4, 2012, at 1:31 PM, Rich Shepard wrote:

> On Wed, 4 Jan 2012, David Winsemius wrote:
>
>> You didn't ask for what was duplicated, but rather what was NOT
>> duplicated with that code. In the case of a dataframe it is the
>> entire row that is tested.
>
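
To make the point about whole-row testing concrete, here is a minimal sketch on an invented toy data frame (the names df and dup and all the values are illustrative assumptions, not taken from chemdata). It shows what duplicated() and its negation select, and one way to list every copy of a repeated row.

```r
# Invented toy data; not the poster's values.
df <- data.frame(site  = c("A", "A", "B", "A"),
                 quant = c(1, 1, 1, 2))

# duplicated() compares entire rows: TRUE marks a row identical to an earlier one.
dup <- duplicated(df)

df[dup, ]    # the second and later copies of repeated rows
df[!dup, ]   # the first occurrence of each distinct row

# To see every copy, including the first, also scan from the end.
df[dup | duplicated(df, fromLast = TRUE), ]
```

Note that df[dup, ] never shows the first copy of a repeated row, so the printed rows are only part of each duplicate group.
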
>  My original question was what was duplicated, but ... I changed the
> function by dropping the 'not'. There's something seriously wrong here
> and I need help from R gurus to tell me why.
>
>  Example:
>
> burns.tds[duplicated(burns.tds), ]
>  ...
> 25760 BC-1.5 1996-09-19      NA
> 25761 BC-1.5 1996-09-19   0.010
>  ...
>
>  But, when I query the database table I see this:
>
> select * from chemistry where site = 'BC-1.5' and sampdate = '1996-09-19'
> and param = 'TDS';
>   site  |  sampdate  | param | quant | units | qual | easting | northing |  stream  | basin
> --------+------------+-------+-------+-------+------+---------+----------+----------+--------
>  BC-1.5 | 1996-09-19 | TDS   |   935 | mg/L  |      |         |          | BurnsCrk |
> (1 row)
>
>  There is only a single row for that site, sampdate, and parameter, and
> the quantity is different from those in the R data frame.
>
>> I think you need to reduce this problem to a dataframe that you either
>> post an access method for or use dput() to include. Then you need to
>> say what your goals are and what code is not working on that example.
>
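
Since attachments can be stripped by the list, one option is to paste dput() output for a small slice directly into the message body. The sketch below assumes an invented stand-in data frame (small, txt and restored are illustrative names) and checks that the pasted text rebuilds the same object.

```r
# Invented stand-in for a few rows of the real data.
small <- data.frame(site     = c("BC-1.5", "BC-1.5", "BC-2"),
                    sampdate = c("1996-09-19", "1996-10-17", "1996-09-19"),
                    quant    = c(935, 910, 410),
                    stringsAsFactors = FALSE)

# dput() writes R code that recreates the object; capture it as plain text
# that can be pasted into the body of an email.
txt <- paste(capture.output(dput(small)), collapse = "\n")
cat(txt, "\n")

# A reader can rebuild the data frame by evaluating the pasted text.
restored <- eval(parse(text = txt))
all.equal(small, restored)
```

For a large data frame, applying dput() to head(x, 20) or to a subset that still shows the problem keeps the message short.
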
>  I'll gladly do this. Which data frame should I make available: the
> original chemdata or the subset burns.tds? I'll start with the latter.
> Compressed dput() output attached.
>
>  My goal is to produce time series plots of TDS, by site, on several
> streams over the period for which that component was measured. Lattice
> lets me superpose multiple lines on the same axis set with different
> color lines and a legend.
>
>  What's not working is something in the workflow of subsetting chemdata
> to extract all TDS data for a named stream (e.g., burns.tds and
> winters.tds), then converting them to zoo objects using read.zoo().
> Somewhere along this process my data are being mangled. It's not in the
> source data frame, chemdata:
>
> chemdata[duplicated(chemdata), ]
>  [1] site     sampdate param    quant    units    qual     easting  northing
>  [9] stream   basin
> <0 rows> (or 0-length row.names)
>
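
An empty result here only says that no two rows agree in every column. Rows can still coincide once columns are dropped, and dates can repeat across sites, which is the sort of thing a "non-unique index entries" message from zoo points to. A minimal sketch on invented data (full, narrow and by_site are illustrative names; the values are made up):

```r
# Invented toy data standing in for chemdata.
full <- data.frame(
  site     = c("BC-1.5", "BC-1.5", "BC-2"),
  sampdate = as.Date(c("1996-09-19", "1996-09-19", "1996-09-19")),
  param    = c("TDS", "pH", "TDS"),
  quant    = c(NA, NA, 410),
  stringsAsFactors = FALSE
)

# anyDuplicated() gives 0 when no row repeats, else the index of the first repeat.
anyDuplicated(full)

# Dropping param makes rows 1 and 2 indistinguishable.
narrow <- full[, c("site", "sampdate", "quant")]
anyDuplicated(narrow)

# The date alone repeats across sites, so it cannot serve as a unique
# time index for the stream as a whole.
anyDuplicated(narrow$sampdate)

# One series per site: check each piece separately.
by_site <- split(narrow, narrow$site)
sapply(by_site, function(d) anyDuplicated(d$sampdate))
```

read.zoo() also has a split argument meant for data with one series per group; whether it suits these data has not been checked here.
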
>  The command I used to subset burns.tds from chemdata was:
>
> burns.tds <- subset(chemdata, stream == 'BurnsCrk', select = c(site,
> sampdate, param == 'TDS', quant), drop = T)
>
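
One thing worth checking in that command: param == 'TDS' sits inside select, which chooses columns, not rows. In subset() the row filter is the second argument, so as written every parameter for the stream appears to be kept and the param column is dropped. A minimal sketch on invented data (chem, wrong and right are illustrative names, and the values are made up):

```r
# Toy stand-in for chemdata; values are invented for illustration only.
chem <- data.frame(
  site     = c("BC-1.5", "BC-1.5", "BC-1.5", "WC-2"),
  sampdate = as.Date(rep("1996-09-19", 4)),
  param    = c("TDS", "pH", "SO4", "TDS"),
  quant    = c(935, 7.8, 0.010, 410),
  stream   = c("BurnsCrk", "BurnsCrk", "BurnsCrk", "WintersCrk"),
  stringsAsFactors = FALSE
)

# The form from the message: the comparison inside select is not a row filter.
wrong <- subset(chem, stream == "BurnsCrk",
                select = c(site, sampdate, param == "TDS", quant))

# Row conditions combined in the subset argument; select only names columns.
right <- subset(chem, stream == "BurnsCrk" & param == "TDS",
                select = c(site, sampdate, quant))

nrow(wrong)   # all BurnsCrk rows, whatever the parameter
nrow(right)   # only the TDS rows
```

If this is what happened, burns.tds holds pH, sulfate and every other parameter alongside TDS, which would explain both the apparent duplicates and the quantities that do not match the database query.
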
> Thanks, David,
>
> Rich
>

David Winsemius, MD
West Hartford, CT


