[R] na.approx and columns with NA's

Mon May 28 10:11:31 CEST 2007

Dear Gabor,

In order to perform your suggestion I needed to split my 'big' 720*5551 
matrix into small ones of the type: 720*400 due to memory constraints. 
But after performing the task I get less rows in the new matrix. For 
example:

zz1<-zz[,1:400]
dim(zz1)

[1] 720 400

zz1[,1]

1985-01-05 1985-01-13 1985-01-21 1985-01-29 1985-02-06 1985-02-14 1985-02-22
        NA   16.72500   16.50000   16.68750   15.90000         NA   16.20000
1985-03-02 1985-03-10 1985-03-18 1985-03-26 1985-04-03 1985-04-11 1985-04-19
  16.50000   16.20000   15.90000   16.35000   16.27500   16.87500   16.87500
........................................................................................................................................

idx <- colSums(!!zz1, na.rm = TRUE) > 1
zz1[,idx] <- na.approx(zz1[,idx])
dim(zz1)

[1] 718 400

I've done something similar to your example with random data, but with 
the same number of rows from my original data:

u <- zoo(matrix(rnorm(4320), 6))
u<-t(u)

dim(u)
[1] 720 6

u[,2:3] <- NA
u[1, 2] <- 3
u[2, 1] <- NA  
idx <- colSums(!!u, na.rm = TRUE) > 1
u[,idx] <- na.approx(u[,idx])

dim(u)
[1] 720 6

Don't know what could be happening to my original data.

Best regards

Antonio

Gabor Grothendieck escribió:
> na.approx uses approx and has the same behavior as it.  Try this:
>
>> library(zoo)
>>
>> # test data
>> z <- zoo(matrix(1:24, 6))
>> z[,2:3] <- NA
>> z[1, 2] <- 3
>> z[2, 1] <- NA
>> z
>
> 1  1  3 NA 19
> 2 NA NA NA 20
> 3  3 NA NA 21
> 4  4 NA NA 22
> 5  5 NA NA 23
> 6  6 NA NA 24
>>
>> # TRUE for each column that has more than 1 non-NA
>> idx <- colSums(!!z, na.rm = TRUE) > 1
>> idx
> [1]  TRUE FALSE FALSE  TRUE
>>
>> z[,idx] <- na.approx(z[,idx])
>> z
>
> 1 1  3 NA 19
> 2 2 NA NA 20
> 3 3 NA NA 21
> 4 4 NA NA 22
> 5 5 NA NA 23
> 6 6 NA NA 24
>
>
> On 5/27/07, antonio rodriguez <antonio.raju at gmail.com> wrote:
>> Hi,
>>
>> I have a object 'zoo':
>>
>> dim(zz)
>> [1]  720 5551
>>
>> where some columns only have NA's values (representing land data in a
>> sea surface temperature dataset) I find straightforward the use of
>> 'na.approx' for individual columns from the zz matrix, but when applied
>> to the whole matrix:
>>
>> zz.approx<-na.approx(zz)
>> Erro en approx(along[!na], y[!na], along[na], ...) :
>>        need at least two non-NA values to interpolate
>>
>> The message is clear, but how do I could skip those 'full-NA's' columns
>> from the interpolation in order to perform the analysis over the columns
>> which represent actual data with some NA's values
>>
>> Best regards,
>>
>> Antonio
>>
>> -- 
>> =====
>> Por favor, si me mandas correos con copia a varias personas,
>> pon mi dirección de correo en copia oculta (CCO), para evitar
>> que acabe en montones de sitios, eliminando mi privacidad,
>> favoreciendo la propagación de virus y la proliferación del SPAM. 
>> Gracias.
>> -----
>> If you send me e-mail which has also been sent to several other people,
>> kindly mark my address as blind-carbon-copy (or BCC), to avoid its
>> distribution, which affects my privacy, increases the likelihood of
>> spreading viruses, and leads to more SPAM. Thanks.
>> =====
>> Antes de imprimir este e-mail piense bien si es necesario hacerlo: El 
>> medioambiente es cosa de todos.
>> Before printing this email, assess if it is really needed.
>>
>> ______________________________________________
>> R-help at stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide 
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>

-- 
=====
Por favor, si me mandas correos con copia a varias personas, 
pon mi dirección de correo en copia oculta (CCO), para evitar 
que acabe en montones de sitios, eliminando mi privacidad, 
favoreciendo la propagación de virus y la proliferación del SPAM. Gracias.
-----
If you send me e-mail which has also been sent to several other people,
kindly mark my address as blind-carbon-copy (or BCC), to avoid its
distribution, which affects my privacy, increases the likelihood of
spreading viruses, and leads to more SPAM. Thanks.
=====
Antes de imprimir este e-mail piense bien si es necesario hacerlo: El medioambiente es cosa de todos.
Before printing this email, assess if it is really needed.