[R] removing only rows/columns with "na" value from square ( symmetrical ) matrix.
Petr PIKAL
petr.pikal at precheza.cz
Mon May 21 15:03:36 CEST 2012
Hi
You can do it by hand: repeatedly remove the row/column pair with the largest number of NA values.
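# first pass: drop the row/column with the most NA values
rem <- which.max(colSums(is.na(M)))
M1 <- M[-rem, -rem]
# second pass: repeat on the reduced matrix
rem <- which.max(colSums(is.na(M1)))
M2 <- M1[-rem, -rem]
M2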
     1   2   3   4   5   7   8  10  11  12
1    0 143  92 134  42 123  40 107  49  93
2  143   0  77   6  99  46  47 114 138  82
3   92  77   0   2  89  24  62  59  97  52
4  134   6   2   0  71  23  43  80  35  86
5   42  99  89  71   0  68  95  27  55  14
7  123  46  24  23  68   0 124  18  53 101
8   40  47  62  43  95 124   0 126  11 129
10 107 114  59  80  27  18 126   0  31  13
11  49 138  97  35  55  53  11  31   0  75
12  93  82  52  86  14 101 129  13  75   0
I believe this can be turned into a loop that tests whether any NA
values remain: it stops once none is left, and never starts if the
matrix has no NA values to begin with.
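
A minimal sketch of such a loop, assuming a square matrix in which NA values occur symmetrically. The function name drop_na_pairs and the set.seed() call are illustrative assumptions, not part of the thread. Ties are broken by which.max(), which takes the first column with the maximum NA count, and the greedy choice is a heuristic rather than a guarantee that the largest possible NA-free submatrix is kept.

```r
# Hypothetical helper: repeatedly drop the row/column pair with the
# most NA values until no NA remains.
drop_na_pairs <- function(M) {
  while (any(is.na(M))) {
    # index of the first column with the largest number of NA values
    rem <- which.max(colSums(is.na(M)))
    # drop = FALSE keeps M a matrix even if only one row/column is left
    M <- M[-rem, -rem, drop = FALSE]
  }
  M
}

# Example built the same way as the revised example in the thread
set.seed(1)  # assumption: fixed seed so the run is reproducible
M <- matrix(sample(144), 12, 12)
M[10, 9] <- NA
M <- as.matrix(as.dist(M))
M[6, ] <- NA
M[, 6] <- NA

res <- drop_na_pairs(M)
rownames(res)
```

On this example the loop first drops row/column 6 (all NA), then row/column 9 (tied with 10, first one wins), leaving rows and columns 1:5, 7, 8, 10:12.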
Regards
Petr
> Yes the matrix is symmetric
> Gabor provided a partial solution:
> Try this:
>
> ix <- na.action(na.omit(replace(M, upper.tri(M), 0)))
> M[-ix, -ix]
>
> However this removes all rows containing an NA in the lower half of the
> matrix - even if the corresponding column has also been removed
>
> I have revised the example to show this.
>
> Thanks all for your help.
>
> In the case below I would like to retain rows and columns
> [c(1:5,7,8,10:12), c(1:5,7,8,10:12)]
> M<-matrix(sample(144),12,12)
> M[10,9]<-NA
> M<-as.matrix(as.dist(M))
> N=M
> #the above rows are to create the symmetric matrix M and a copy N
> M[6,]<-NA
> M[,6]<-NA
> #above two rows - make corresponding row and column NA
> print (M)
> ix <- na.action(na.omit(replace(M, upper.tri(M), 0)))
> M<-M[-ix, -ix]
> print (M)
>
> print("However, what I would like to retain is the maximum amount of data while removing rows or columns containing NA, i.e.:")
> print(N [c(1:5,7,8,10:12),c(1:5,7,8,10:12)])
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
>
> thanks to all
> On 21/05/2012, at 1:10 AM, peter dalgaard wrote:
>
> >
> > On May 20, 2012, at 16:37 , Bert Gunter wrote:
> >
> >> Your problem is not well-defined. In your example below, why not
> >> remove rows 1,2,6, and 10, all of which contain NA's? Is the matrix
> >> supposed to be symmetric?
> YES
>
> >> Do NA's always occur symmetrically?
> YES
> >
> > ...and even if they do, how do you decide whether to remove row/col 9 or
> > row/col 10 in the example? (Or, for that matter, between (1 and 2) and 6.
> > In that case you might choose to remove the smallest no. of row/cols but in
> > "9 vs. 10", the situation is completely symmetric.)
> >
> >>
> >> You either need to rethink what you want to do or clarify your
> >> statement of it.
> >>
> >> -- Bert
> >>
> >> On Sun, May 20, 2012 at 7:17 AM, Nevil Amos <nevil.amos at monash.edu> wrote:
> >>> I have some square matrices with na values in corresponding rows and
> >>> columns.
> >>>
> >>> M<-matrix(1:2,10,10)
> >>> M[6,1:2]<-NA
> >>> M[10,9]<-NA
> >>> M<-as.matrix(as.dist(M))
> >>> print (M)
> >>>
> >>>     1  2  3  4  5  6  7  8  9 10
> >>> 1   0  2  1  2  1 NA  1  2  1  2
> >>> 2   2  0  1  2  1 NA  1  2  1  2
> >>> 3   1  1  0  2  1  2  1  2  1  2
> >>> 4   2  2  2  0  1  2  1  2  1  2
> >>> 5   1  1  1  1  0  2  1  2  1  2
> >>> 6  NA NA  2  2  2  0  1  2  1  2
> >>> 7   1  1  1  1  1  1  0  2  1  2
> >>> 8   2  2  2  2  2  2  2  0  1  2
> >>> 9   1  1  1  1  1  1  1  1  0 NA
> >>> 10  2  2  2  2  2  2  2  2 NA  0
> >>>
> >>>
> >>> How do I remove just the row/column pair (in this trivial example row 6 and
> >>> 10 and column 6 and 10) containing the NA values?
> >>>
> >>> so that I end up with all rows/ columns that are not NA - e.g.
> >>>
> >>>   1 2 3 4 5 7 8 9
> >>> 1 0 2 1 2 1 1 2 1
> >>> 2 2 0 1 2 1 1 2 1
> >>> 3 1 1 0 2 1 1 2 1
> >>> 4 2 2 2 0 1 1 2 1
> >>> 5 1 1 1 1 0 1 2 1
> >>> 7 1 1 1 1 1 0 2 1
> >>> 8 2 2 2 2 2 2 0 1
> >>> 9 1 1 1 1 1 1 1 0
> >>>
> >>>
> >>> If I use na.omit I lose rows 1, 2, 6, and 9,
> >>> which is not what I want.
> >>>
> >>> thanks
> >>> --
> >>> Nevil Amos
> >>> Molecular Ecology Research Group
> >>> Australian Centre for Biodiversity
> >>> Monash University
> >>> CLAYTON VIC 3800
> >>> Australia
> >>>
> >>> ______________________________________________
> >>> R-help at r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.
> >>
> >>
> >>
> >> --
> >>
> >> Bert Gunter
> >> Genentech Nonclinical Biostatistics
> >>
> >> Internal Contact Info:
> >> Phone: 467-7374
> >> Website:
> >> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
> >>
> >
> > --
> > Peter Dalgaard, Professor,
> > Center for Statistics, Copenhagen Business School
> > Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> > Phone: (+45)38153501
> > Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com