[R] Help with Merge - unexpected loss of factor level
Patrick Connolly
p_connolly at slingshot.co.nz
Thu Dec 17 07:51:51 CET 2009
On Thu, 17-Dec-2009 at 03:17PM +1000, Zoe van Havre wrote:
[...]
|> The problem is that I have been tracking one factor in particular
|> ('branch', values 2 or 3) and once the final merge occurs, the
|> second level of this factor seems to disappear in the last dataset,
|> even though it was present before. See code & output below:
|>
|> > dim(tma)
You didn't tell us that one. What size is it?
|> > names(tma)
|> [1] "Code" "marker" "cell" "tumourA" "tumourEXP" "int" "stain" "tumourPERC" "branch"
|> > levels(tma$tumourA)
|> [1] "DCIS" "LN Metastasis" "Normal" "Primary Invasive Carcinoma"
|> #split into cancer and normal tissue
|> > tma1<-subset(tma, tumourA=="Primary Invasive Carcinoma")
|> > tma2<-subset(tma, tumourA=="LN Metastasis")
|> > tmaN<-subset(tma, tumourA=="Normal")
|>
[...]
|> 2 3
|> 91 51
|> > table(tma1.1$branch.x)
|>
|> 2 3
|> 1806 633
|> > table(tma2.1$branch.x)
|>
|> 3
|> 625
|>
|>
|> Please, can someone tell me what's going on?
I suspect you'd have a lot of NAs in there. Try this:
sapply(tma, function(x) sum(is.na(x)))
If that doesn't tell you something interesting, try with the subsets.
Or, when you use table(), try the exclude=NULL argument, since by
default table() leaves NAs out of the counts.
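Here is a rough sketch of both checks on a made-up data frame. The name
'toy' and its values are purely illustrative and only borrow two of your
column names; I haven't seen your data, so treat it as a guess at what
to look for rather than a diagnosis.

```r
# Toy stand-in for 'tma' (hypothetical values, NOT the poster's data)
toy <- data.frame(
  tumourA = factor(c("Normal", "DCIS", NA, "Normal", "LN Metastasis")),
  branch  = factor(c(2, 3, NA, NA, 3))
)

# Number of missing values in each column
na_counts <- sapply(toy, function(x) sum(is.na(x)))
print(na_counts)

# Default call: rows where branch is NA are left out of the counts
tab_default <- table(toy$branch)
print(tab_default)

# exclude = NULL: NA is tallied as a category of its own
tab_with_na <- table(toy$branch, exclude = NULL)
print(tab_with_na)
```

If the totals from the two table() calls differ on your real data, the
difference is the number of rows where branch is NA.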
HTH
--
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
___ Patrick Connolly
{~._.~} Great minds discuss ideas
_( Y )_ Average minds discuss events
(:_~*~_:) Small minds discuss people
(_)-(_) ..... Eleanor Roosevelt
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.