[R] About clustering techniques

ctu at bigred.unl.edu
Tue Jul 29 16:17:54 CEST 2008


Hi Paco,
I ran into the same problem before. I got around it by imputing the missing
values, for example:

newdata <- as.matrix(impute(olddata, fun = "random"))

After that, you should be able to analyze your data.
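
To make the idea concrete, here is a minimal sketch of random imputation followed by kmeans, using base R only. The helper impute_random() is hypothetical and only mimics the idea behind fun = "random" (each NA is replaced by a value drawn from the observed entries of the same column); the data are made up, and the column names lat, lon and t1..t12 are assumptions based on the layout described in the quoted message.

```r
# Hypothetical helper (not from any package): fill each NA in a numeric
# vector with a value drawn at random from that vector's observed entries.
impute_random <- function(x) {
  miss <- is.na(x)
  if (any(miss)) {
    obs <- x[!miss]
    x[miss] <- obs[sample.int(length(obs), sum(miss), replace = TRUE)]
  }
  x
}

set.seed(42)

# Made-up stand-in for one annual file: lat, lon and 12 monthly temperatures.
olddata <- data.frame(
  lat = runif(30, 36, 43),
  lon = runif(30, -9, 3),
  matrix(rnorm(30 * 12, mean = 15, sd = 5), nrow = 30,
         dimnames = list(NULL, paste0("t", 1:12)))
)

# Knock out a few monthly values to imitate the invalid measurements.
olddata[3, c("t2", "t5", "t6", "t11")] <- NA
olddata[17, c("t1", "t12")] <- NA

# Impute column by column, then convert to a matrix as in the one-liner above.
newdata <- as.matrix(as.data.frame(lapply(olddata, impute_random)))

# kmeans now runs because no NA is left; k = 3 is an arbitrary choice.
fit <- kmeans(newdata[, paste0("t", 1:12)], centers = 3, nstart = 10)
table(fit$cluster)
```

Keep in mind that imputed values are invented data, so it is worth checking how much the clusters change when the imputation is repeated with a different seed.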

Hope it helps.
Chunhao


Quoting pacomet <pacomet at gmail.com>:

> Hello R users
>
> I have been working with a dataset for some time, trying to do some cluster
> analysis. Each annual file has 14 columns: geographical coordinates and
> monthly temperatures
>
> latitude - longitude - temperature 1 -..... - temperature 12
>
> I have some missing values in some cases; at some points there may be 8
> valid monthly values and four invalid ones. I don't want to suppress the
> whole row with 8 good/4 bad values, as I want to try annual and monthly
> analysis.
>
> I first tried kmeans but found a problem with missing values. When I try
> without omitting missing values, kmeans gives an error, and when I exclude
> invalid data, too many values are excluded in some years of the data series.
>
> Now I have been reading about pam, pamk and clara; I think they can handle
> missing values. But I can't find out how to perform the analysis with
> these functions. As I'm neither a statistics nor an R expert, the fpc and
> cluster package documentation is not enough for me. If you know of a website
> or a tutorial explaining how to use those functions, with examples to check
> if possible, please post them.
>
> Any other help or suggestion is greatly appreciated.
>
> Thanks in advance
>
> Paco
>
> --
> _________________________
> El ponent la mou, el llevant la plou
> Usuari Linux registrat: 363952
> -------
> Fotos: http://picasaweb.google.es/pacomet
>

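On the pam question in the quoted message: the cluster package documentation states that pam accepts missing values as long as every pair of observations has at least one variable observed in both, and it describes clara as allowing NAs as well. A minimal sketch under those assumptions follows; the data are made up, and the matrix name temps, the column names t1..t12 and the choice k = 3 are illustrative only.

```r
library(cluster)

set.seed(1)

# Made-up monthly temperatures for 20 points (rows) and 12 months (columns).
temps <- matrix(rnorm(20 * 12, mean = 15, sd = 5), nrow = 20,
                dimnames = list(NULL, paste0("t", 1:12)))

# Some invalid monthly values; every pair of rows still shares observed months.
temps[2, 1:4] <- NA
temps[9, c(6, 7)] <- NA

# pam is called on the data directly, with no rows dropped and nothing imputed.
fit_pam <- pam(temps, k = 3)

fit_pam$clustering   # cluster label for each of the 20 points
fit_pam$id.med       # row indices of the three medoids
```

If two rows have no month observed in common, their dissimilarity is undefined and pam stops with an error, so rows that are almost entirely missing still have to be removed or imputed first.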