[R] Finding Source of Error Message of 'Non-Unique Index Entries'
David Winsemius
dwinsemius at comcast.net
Wed Jan 4 19:59:43 CET 2012
Nothing attached. I don't know what you named the "compressed dput
output", but it did not pass the mail server's filters, and you did
not copy me. If chemdata is available as a text file, then make sure
its extension is .txt and attach it.
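
For reference, a minimal sketch of getting dput() output into a plain
.txt file and reading it back. The toy data frame and the file name are
placeholders made up for illustration; substitute burns.tds or chemdata.

```r
# Toy stand-in for burns.tds; the values are made up for illustration.
toy <- data.frame(site = c("BC-1.5", "BC-2"),
                  sampdate = c("1996-09-19", "1996-09-20"),
                  quant = c(935, 0.01),
                  stringsAsFactors = FALSE)

# Write the dput() representation to a plain-text file with a .txt extension.
txt_file <- file.path(tempdir(), "toy_dput.txt")
dput(toy, file = txt_file)

# Anyone receiving the file can rebuild the object with dget().
restored <- dget(txt_file)
all.equal(toy, restored)
```

An uncompressed .txt attachment is more likely to get through than a
compressed archive, although I cannot say exactly what the filters reject.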
--
David.
On Jan 4, 2012, at 1:31 PM, Rich Shepard wrote:
> On Wed, 4 Jan 2012, David Winsemius wrote:
>
>> You didn't ask for what was duplicated, but rather what was NOT duplicated
>> with that code. In the case of a dataframe it is the entire row that is
>> tested.
>
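
To make the whole-row point concrete, here is a small sketch on made-up
data (the data frame df and its values are hypothetical, not taken from
chemdata). It contrasts duplicated() on entire rows with duplicated() on
just the columns that would form the index, which is presumably where a
"non-unique index entries" complaint would come from.

```r
# Hypothetical data: three rows share the same site and date.
df <- data.frame(
  site     = c("A", "A", "A", "B"),
  sampdate = as.Date(rep("2000-01-01", 4)),
  quant    = c(1, 2, 2, 3)
)

# Whole-row test: only row 3 repeats an earlier row exactly.
whole_row <- duplicated(df)

# Key-only test: rows 2 and 3 repeat the (site, sampdate) pair of row 1.
key <- df[, c("site", "sampdate")]
key_only <- duplicated(key)

# Every row whose key occurs more than once, including the first occurrence.
all_dups <- key_only | duplicated(key, fromLast = TRUE)

df[all_dups, ]
```

A data frame can therefore have no fully duplicated rows and still have
repeated site/date pairs.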
> My original question was what was duplicated, but ... I changed the
> function by dropping the 'not'. There's something seriously wrong here and I
> need help from R gurus to tell me why.
>
> Example:
>
> burns.tds[duplicated(burns.tds), ]
> ...
> 25760 BC-1.5 1996-09-19 NA
> 25761 BC-1.5 1996-09-19 0.010
> ...
>
> But, when I query the database table I see this:
>
> select * from chemistry where site = 'BC-1.5' and sampdate = '1996-09-19'
> and param = 'TDS';
>   site  |  sampdate  | param | quant | units | qual | easting | northing |  stream  | basin
> --------+------------+-------+-------+-------+------+---------+----------+----------+--------
>  BC-1.5 | 1996-09-19 | TDS   |   935 | mg/L  |      |         |          | BurnsCrk |
> (1 row)
>
> There is only a single row for that site, sampdate, and parameter and the
> quantity is different from those in the R data frame.
>
>> I think you need to reduce this problem to a dataframe that you either
>> post an access method for or use dput() to include. Then you need to say
>> what your goals are and what code is not working on that example.
>
> I'll gladly do this. Which data frame should I make available: the
> original chemdata or the subset burns.tds? I'll start with the latter.
> Compressed dput() output attached.
>
> My goal is to produce time series plots of TDS, by site, on several
> streams over the period for which that component was measured. Lattice lets
> me superpose multiple lines on the same axis set with different color lines
> and a legend.
>
> What's not working is something in the workflow of subsetting chemdata to
> extract all TDS data for a named stream (e.g., burns.tds and winters.tds),
> then convert them to zoo objects using read.zoo(). Somewhere along this
> process my data are being mangled. It's not in the source data frame,
> chemdata:
>
> chemdata[duplicated(chemdata), ]
> [1] site sampdate param quant units qual easting northing
> [9] stream basin
> <0 rows> (or 0-length row.names)
>
> The command I used to subset burns.tds from chemdata was:
>
> burns.tds <- subset(chemdata, stream == 'BurnsCrk', select = c(site,
> sampdate, param == 'TDS', quant), drop = T)
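
One possible cause, offered as a guess since I have not seen the data:
subset() evaluates select= against column positions, so a comparison such
as param == 'TDS' placed there does not filter rows. It evaluates to
FALSE, is coerced to 0 alongside the other positions, and is silently
dropped, leaving every parameter for the stream in the result. The sketch
below uses a made-up stand-in for chemdata (all values other than the
935 mg/L TDS row are invented) to show the difference when the condition
is moved into the row filter.

```r
# Toy stand-in for chemdata; the non-TDS rows are invented for illustration.
chemdata <- data.frame(
  site     = c("BC-1.5", "BC-1.5", "BC-1.5", "WC-2"),
  sampdate = as.Date(rep("1996-09-19", 4)),
  param    = c("TDS", "Other", "Other", "TDS"),
  quant    = c(935, NA, 0.010, 410),
  stream   = c("BurnsCrk", "BurnsCrk", "BurnsCrk", "WintersCrk"),
  stringsAsFactors = FALSE
)

# Form used above: the comparison sits inside select=, so no rows are
# filtered on param and all three BurnsCrk rows come back.
bad <- subset(chemdata, stream == "BurnsCrk",
              select = c(site, sampdate, param == "TDS", quant))

# Condition moved into the row filter (second argument).
good <- subset(chemdata, stream == "BurnsCrk" & param == "TDS",
               select = c(site, sampdate, quant))

nrow(bad)
nrow(good)
```

If that is what happened, burns.tds would contain several parameters per
site and date, which would account for both the unexpected quantities and
the repeated index values when the result is handed to read.zoo().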
>
> Thanks, David,
>
> Rich
David Winsemius, MD
West Hartford, CT