[BioC] adding factors to a data frame from a dataframe

Tom Keller kellert at ohsu.edu
Wed Feb 29 23:29:51 CET 2012


Greetings,
I read a table into a dataframe that contains read metadata for DNA sequences. Each row contains the well.id and various parameters such as signal.noise, etc.
e.g.
> welldfrm[1:3,]
  well.id signal.noise contiguous.read.length num.high.quality.bases sample.score comment container_name
1      A1      195.983                    976                    907       53.629  162194        111201a
3      C1      169.206                    990                    923       53.665  162196        111201a
4      D1      126.441                    923                    832       44.197  162197        111201a


What I don't have and would like to add is the capillary that each well was loaded into. So I created a dataframe with those groupings.

I would like to add the capillary, e.g. cap1, cap2, ... cap16, to each row based on whether the well.id is one of the wells that capillary draws from. I can't quite figure out how to do that.
> capillaries$cap1
[1] A1  A3  A5  A7  A9  A11
Levels: A1 A11 A3 A5 A7 A9
> capillaries$cap5
[1] C1  C3  C5  C7  C9  C11
Levels: C1 C11 C3 C5 C7 C9

So for example, every row with a well.id in the cap1 list would get the factor level "cap1", e.g.:
  well.id signal.noise crl num score ... capillary
1 A1 195.983 976 907 53.629 ... cap1
3 C1 169.206 990 923 53.665 ... cap5

I hope that makes sense. I think one of the 'apply' functions is the way to go, or perhaps rearranging capillaries with stack (?), but I'm stumbling with the syntax (not to mention thinking in terms of complex data structures 8-).
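
To make the stack() idea concrete, here is a minimal sketch. It assumes capillaries is a data frame whose columns (cap1 ... cap16) each hold the well ids for one capillary. The toy data reproduces only the cap1 and cap5 values shown above, and the names cap.chr, lookup and idx are illustrative, not part of the real data.

```r
# Toy stand-ins for the real objects: only cap1 and cap5 are reproduced,
# and welldfrm keeps just two of its columns.
capillaries <- data.frame(
  cap1 = c("A1", "A3", "A5", "A7", "A9", "A11"),
  cap5 = c("C1", "C3", "C5", "C7", "C9", "C11"),
  stringsAsFactors = TRUE
)
welldfrm <- data.frame(
  well.id      = c("A1", "C1", "D1"),
  signal.noise = c(195.983, 169.206, 126.441),
  stringsAsFactors = TRUE
)

# stack() ignores factor columns, so convert each column to character first.
cap.chr <- lapply(capillaries, as.character)

# Long format: one row per well; 'values' is the well id, 'ind' the capillary.
lookup <- stack(cap.chr)

# match() gives, for each well.id, its row in 'lookup' (NA if it is not listed).
idx <- match(as.character(welldfrm$well.id), as.character(lookup$values))
welldfrm$capillary <- factor(lookup$ind[idx], levels = names(capillaries))
```

Wells that are not listed under any capillary (D1 in this toy example, since its capillary is not reproduced here) come out as NA. merge(welldfrm, lookup, by.x = "well.id", by.y = "values", all.x = TRUE) would be an alternative, at the cost of possibly reordering the rows.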

thanks for any suggestions

Tom
MMI DNA Services Core Facility<http://www.ohsu.edu/xd/research/research-cores/dna-analysis/>
503-494-2442
kellert at ohsu.edu
Office: 6588 RJH (CROET/BasicScience)

OHSU Shared Resources<http://www.ohsu.edu/xd/research/research-cores/index.cfm>







