[R] Excel Trend Function

Daniel Malter daniel at umd.edu
Sun Jul 13 21:19:00 CEST 2008


Disputed was probably not the correct wording for it. However, imputation
means that you make assumptions regarding the distribution of your missing
data dependent on the data that is available to you. Felipe had time series
data, and it is common to predict from the past to the future in such
models. However, I tried to outline why making these assumptions may be
critical in Felipe's case due to the "environment" in which his data was
missing (namely that the distribution of his variable around the missing
values seemed odd and raised questions whether or not they were measure
without error). 

The rule of thumb for imputation that I remember is: If you are not sure how
plausible the assumptions were that you would have to make - drop it. But
better, since newer, advice maybe available on that. However, I cannot
remember reading a single study in my field in which data was imputed (I
don't say there is none, but if they exist, they must be very rare), meaning
that observations with missing data are typically dropped. Therefore, how
you will be received for imputing data does not only depend on the
particular application, but also on the research tradition of your field.
 
General sources that review data imputation as well as associated problems
are:

Horton and Lipsitz, 2001, The American Statistician
Schafer and Graham, 2002, Psychological Methods

Best,
Daniel

-------------------------
cuncta stricte discussurus
-------------------------

-----Ursprüngliche Nachricht-----
Von: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] Im
Auftrag von paulandpen
Gesendet: Sunday, July 13, 2008 4:27 AM
An: mazatlanmexico at yahoo.com; r-help at stat.math.ethz.ch
Betreff: Re: [R] Excel Trend Function

Hi Felipe,

Daniel mentions imputation is a disputed practice.  There are
recommendations and rules of thumb for its use.  I am not sure that
imputation is disputed.  I would be interested to see some links to articles
recommending against its use.

Paul


----- Original Message -----
From: "Felipe Carrillo" <mazatlanmexico at yahoo.com>
To: <r-help at stat.math.ethz.ch>
Sent: Sunday, July 13, 2008 5:46 AM
Subject: [R] Excel Trend Function


> Hi:
> I have a dataset and need to interpolate for missing days. In Excel I 
> either average from sampled days from above and below the missing days or 
> use the TREND function to make up for the missing values. I have been 
> reading about na.approx, is this function similar to the TREND function? 
> Which is the best recommendable way to make up for missing data?
> Here's my dataset: weeks 17,18,26 and 46 have 0 daysSamp.
>
> Year Week daysSamp     Lower TotalPD Upper varTotalPD
> 2006 47 6 126988 188259 249530 1045878675
> 2006 48 7 189155 253350 317545 1148102355
> 2006 49 7 103300 132741 162182 241480186
> 2006 50 6 11801 252576 493352 16151006813
> 2006 51 7 2348 3671 4994 487926
> 2006 52 5 2606 29901 57196 215454181
> 2006 2 7 2968 4513 6058 664723
> 2006 3 7 1128 1889 2650 161231
> 2006 4 7 479 963 1447 65196
> 2006 5 7 2819 4413 6007 708094
> 2006 6 6 -1009 3128 7264 4766743
> 2006 7 7 -5239 10769 26777 71387835
> 2006 8 7 150 503 856 34685
> 2006 9 7 1858 2989 4120 356562
> 2006 10 7 193 494 795 25281
> 2006 11 7 125 346 567 13627
> 2006 12 7 432 767 1102 31189
> 2006 13 7 1229 1867 2505 113569
> 2006 14 7 813 1339 1865 77140
> 2006 15 4 -66 124 315 10105
> 2006 16 7 152 903 1654 157242
> 2006 17 0
> 2006 18 0
> 2006 19 5 0 0 0 0
> 2006 20 4 0 0 0 0
> 2006 21 5 0 0 0 0
> 2006 22 6 0 0 0 0
> 2006 23 7 -65 285 635 34112
> 2006 24 6 0 0 0 0
> 2006 25 7 0 0 0 0
> 2006 26 0
> 2006 27 4 228 931 1634 137726
> 2006 28 4 801 2231 3662 569977
> 2006 29 4 4544 9242 13939 6147522
> 2006 30 5 15798 28465 41131 44697915
> 2006 31 5 25398 41049 56701 68245523
> 2006 32 5 48197 82216 116235 322416917
> 2006 33 5 142980 230411 317841 2129630128
> 2006 34 5 227141 360468 493794 4952314336
> 2006 35 5 467244 756325 1045405 23281569629
> 2006 36 5 281049 463331 645614 9256900449
> 2006 37 2 227636 620330 1013023 42961663047
> 2006 38 3 478990 983472 1487954 70903343603
> 2006 39 7 539690 846522 1153354 26228718974
> 2006 40 7 320959 457866 594773 5221891252
> 2006 41 7 427561 582452 737343 6683813344
> 2006 42 7 271788 351103 430418 1752614293
> 2006 43 7 165019 208853 252687 535301133
> 2006 44 7 91514 117390 143266 186537178
> 2006 45 7 59061 79187 99313 112842787
> 2006 46 0
>
> Felipe D. Carrillo
> Supervisory Fishery Biologist
> Department of the Interior
> US Fish & Wildlife Service
>  California, USA
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list