[R] Help with Merge - unexpected loss of factor level

Patrick Connolly p_connolly at slingshot.co.nz
Thu Dec 17 07:51:51 CET 2009


On Thu, 17-Dec-2009 at 03:17PM +1000, Zoe van Havre wrote:

[...]

|> The problem is that I have been tracking one factor in particular
|> ('branch', values 2 or 3) and once the final merge occurs, the
|> second level of this factor seems to disappear in the last dataset,
|> even though it was present before.  See code & output below:


|> 
|> >  dim(tma)

You didn't show us the output of that one.  What size is it?

|> >  names(tma)
|> [1] "Code"       "marker"     "cell"       "tumourA"    "tumourEXP"  "int"        "stain"      "tumourPERC" "branch"
|> > levels(tma$tumourA)
|> [1] "DCIS"                       "LN Metastasis"              "Normal"                     "Primary Invasive Carcinoma"
|> #split into cancer and normal tissue
|> >  tma1<-subset(tma, tumourA=="Primary Invasive Carcinoma")
|> >   tma2<-subset(tma, tumourA=="LN Metastasis")
|> >   tmaN<-subset(tma, tumourA=="Normal")
|> 

[...]

|>  2  3
|> 91 51
|> > table(tma1.1$branch.x)
|> 
|>    2    3
|> 1806  633
|> > table(tma2.1$branch.x)
|> 
|>   3
|> 625
|> 
|> 
|> Please, can someone tell me what's going on?


I suspect you have a lot of NAs in there.  Try this to count them
column by column:

 sapply(tma, function(x) sum(is.na(x)))

If that doesn't tell you anything interesting, try it on the subsets.
Or, since table() leaves NAs out of its counts by default, call it
with the exclude=NULL argument so they show up.
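
To make that concrete, here is a minimal sketch on made-up data.  The
data frame 'toy' and its values are purely hypothetical stand-ins for
tma, so don't read anything into the numbers; it only shows how the
two checks behave when a column contains NAs.

```r
# Hypothetical toy data standing in for 'tma' (not the poster's data)
toy <- data.frame(
  Code   = c("a", "b", "c", "d", "e"),
  branch = c(2, 3, NA, 3, NA),
  stain  = c(1, NA, 2, 2, 1)
)

# Count the NAs in each column
na_counts <- sapply(toy, function(x) sum(is.na(x)))
print(na_counts)

# By default table() leaves NAs out of the counts ...
default_tab <- table(toy$branch)
print(default_tab)

# ... whereas exclude = NULL reports them as a category of their own
full_tab <- table(toy$branch, exclude = NULL)
print(full_tab)
```

If the totals from the two table() calls differ, the difference is
the number of NAs in that column, which is a good place to start
looking after a merge.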

HTH

-- 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.   
   ___    Patrick Connolly   
 {~._.~}                   Great minds discuss ideas    
 _( Y )_  	         Average minds discuss events 
(:_~*~_:)                  Small minds discuss people  
 (_)-(_)  	                      ..... Eleanor Roosevelt
	  
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.



