[R] problems with data frames, factors and lists
Karin Lagesen
karinlag at studmed.uio.no
Wed May 21 17:33:07 CEST 2008
I have a function that creates a list based on some clustered data:
mix <- function(Y, pid) {
    hc = gethc(Y, pid)
    maxheight = max(hc$height)
    noingrp = processhc(hc)
    one = noingrp$one
    two = noingrp$two
    twoisone = "one"
    if (two != 1)
        twoisone = "more"
    out = list(pid = pid, one = noingrp$one, two = noingrp$two,
               diff = maxheight, noseqs = length(hc$labels),
               twogrp = twoisone)
    return(out)
}
Example result:
> mix(tsus_same, 77)
$pid
[1] 77
$one
[1] 9
$two
[1] 2
$diff
[1] 8.577195
$noseqs
[1] 11
$twogrp
[1] "more"
>
I then use this function in another function that just runs this
function through a lot of data:
doset <- function(sameset) {
    pids = unique(c(sameset$APID, sameset$BPID))
    for (f in pids) {
        oputframe = data.frame(rbind(oputframe, mix(sameset, f)))
    }
    return(oputframe)
}
All values except $twogrp are numbers. There are two possible values
for $twogrp, "one" and "more". The first one is more common and gets
added to the data frame first. As a result, I cannot add the rows
where the value is "more" without getting
38: In `[<-.factor`(`*tmp*`, ri, value = "more") :
invalid factor level, NAs generated
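
The warning can be reproduced outside these functions. The following is
only a minimal sketch: it assumes the twogrp column has already become a
factor whose sole level is "one", and the data frame and the names oput,
bad and ok are made up for illustration (they are not part of the code
above).

```r
# Hypothetical stand-in for the accumulated data frame: twogrp is a
# factor that has only ever seen the value "one".
oput <- data.frame(pid = c(70, 71), twogrp = factor(c("one", "one")))
levels(oput$twogrp)          # "one" is the only known level

# Assigning a value that is not a known level triggers the
# "invalid factor level" warning and stores NA instead.
bad <- oput
bad[2, "twogrp"] <- "more"
is.na(bad$twogrp[2])         # TRUE

# If both levels are declared first, the same assignment is accepted.
ok <- oput
ok$twogrp <- factor(ok$twogrp, levels = c("one", "more"))
ok[2, "twogrp"] <- "more"
as.character(ok$twogrp)      # "one" "more"
```

In this toy case the factor only accepts values that are among its
declared levels, which would explain why rows with "more" are turned
into NA once the column has been created from "one" alone.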
Now, this is a pain in the neck. How can I merge these lists into the
data frame and still have the value $twogrp as a factor?
Thanks, and I hope my code makes some sense!
Karin
--
Karin Lagesen, PhD student
karin.lagesen at medisin.uio.no
http://folk.uio.no/karinlag