[R] transform dataframe with look-up table
Brian Diggs
diggsb at ohsu.edu
Thu Jul 25 18:02:59 CEST 2013
On 7/25/2013 8:13 AM, Juan Antonio Balbuena wrote:
>
> Hello
> I hope that there is a simple solution to this apparently complex problem.
> Any help will be much appreciated:
> I have a dataframe with Left and Right readings (that is, elements in each
> row are paired). For instance,
> Left Right
> [1] 9 8
> [2] 4 3
> [3] 2 1
> [4] 6 5
> [5] 3 1
> [6] 4 1
> [7] 3 2
> [8] 4 2
> [9] 10 8
> [10] 9 10
> I need to produce a new data frame where the values are transformed
> according to a look-up table such as
> input output
> [1] 5 1
> [2] 10 1
> [3] 4 2
> [4] 8 3
> [5] 6 5
> [6] 5 6
> [7] 7 6
> [8] 2 7
> [9] 9 7
> [10] 10 7
> [11] 2 8
> So [1, ] in the new dataframe would be 7 3. Quite simple so far, but what
> makes things complicated is the multiple outputs for a single input. In this
> example, 10 corresponds to 1 and 7 so [9, ] in the input dataframe must
> yield two rows in its output counterpart: 1 3 and 7 3. Likewise the output
> for [10, ] should be 7 1 and 7 7. In addition, given that 3 and 1 are
> missing as inputs the output for [5, ] should be NA NA.
> Thank you very much for your time.
> Juan Antonio Balbuena
merge can handle both of these requirements.
First, making the two datasets reproducible:
Start <- data.frame(Left=c(9,4,2,6,3,4,3,4,10,9),
Right=c(8,3,1,5,1,1,2,2,8,10))
transformer <- data.frame(input=c(5,10,4,8,6,5,7,2,9,10,2),
output=c(1,1,2,3,5,6,6,7,7,7,8))
Then add a marker of the original row numbers so that the work can be
checked more easily later (not really needed for the calculations):
Start$rownum <- seq_len(nrow(Start))
Two merge statements with the columns specified and all.x set to TRUE
(to keep cases even without a match):
End <- merge(merge(Start, transformer, by.x="Left", by.y="input",
all.x=TRUE),
transformer, by.x="Right", by.y="input", all.x=TRUE)
Then we can look at the output, resorted by the original row numbers:
End[order(End$rownum),]
which gives
Right Left rownum output.x output.y
12 8 9 1 7 3
9 3 4 2 2 NA
1 1 2 3 7 NA
2 1 2 3 8 NA
10 5 6 4 5 6
11 5 6 4 5 1
3 1 3 5 NA NA
4 1 4 6 2 NA
5 2 3 7 NA 7
6 2 3 7 NA 8
7 2 4 8 2 7
8 2 4 8 2 8
13 8 10 9 1 3
14 8 10 9 7 3
15 10 9 10 7 1
16 10 9 10 7 7
> --
>
> Dr. Juan A. Balbuena
> Marine Zoology Unit
> Cavanilles Institute of Biodiversity and Evolutionary Biology
> University of
> Valencia
> [1]http://www.uv.es/~balbuena
> P.O. Box 22085
> [2]http://www.uv.es/cavanilles/zoomarin/index.htm
> 46071 Valencia, Spain
> [3]http://cetus.uv.es/mullpardb/index.html
> e-mail: [4]j.a.balbuena at uv.es tel. +34 963 543 658 fax +34 963 543 733
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> NOTE! For shipments by EXPRESS COURIER use the following street address:
> C/ Catedrático José Beltrán 2, 46980 Paterna (Valencia), Spain.
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> References
>
> 1. http://www.uv.es/%7Ebalbuena
> 2. http://www.uv.es/cavanilles/zoomarin/index.htm
> 3. http://cetus.uv.es/mullpardb/index.html
> 4. mailto:j.a.balbuena at uv.es
>
--
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University
More information about the R-help
mailing list