[R] combining data.frames with is.na & match (), two questions
PIKAL Petr
petr.pikal using precheza.cz
Tue Apr 23 08:59:29 CEST 2019
Hi
Please keep your replies on r-help as well; others may offer different or better solutions.
Regarding ordering, see ?order or ?sort. However, reordering rows is mainly needed only for plotting or exporting data.
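A minimal sketch of such an ordering, assuming the merged result from my earlier message is rebuilt by hand (the names m and m_desc are only illustrative, and only the two relevant columns are kept):

```r
# Merged fruit/calorie data, retyped so the example is self-contained
m <- data.frame(
  Fruit    = c("apple", "banana", "kiwi", "orange", "pear", "mango"),
  Calories = c(NA, 100L, NA, NA, 100L, 200L),
  stringsAsFactors = FALSE
)

# order() returns row positions: decreasing = TRUE puts 200 on top,
# and na.last = TRUE (the default) keeps the NA rows at the bottom
m_desc <- m[order(m$Calories, decreasing = TRUE, na.last = TRUE), ]
m_desc
```

The two 100-calorie rows end up next to each other because equal values sort together. sort() would return only the sorted Calories vector, so order() is the one to use when whole rows have to move.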
Cheers
Petr
From: Drake Gossi <drake.gossi using gmail.com>
Sent: Thursday, April 18, 2019 9:27 PM
To: PIKAL Petr <petr.pikal using precheza.cz>
Subject: Re: [R] combining data.frames with is.na & match (), two questions
Thanks Pikal,
Your answer was super helpful. I just learned a lot from you. The only thing I have to figure out now is how to rearrange the numbers, say, so that 200 is on top, and NA is on bottom, or so that the two 100 calories are together. Something like that. Perhaps I'll try an ascending/descending function.
Thank you again.
D
On Thu, Apr 18, 2019 at 1:31 AM PIKAL Petr <petr.pikal using precheza.cz> wrote:
Hi
I wonder why such a combination is made so complicated in your textbook.
Having data frames fr1 and fr2
> dput(fr1)
structure(list(Fruit = structure(c(1L, 3L, 2L), .Label = c("banana",
"mango", "pear"), class = "factor"), Calories = c(100L, 100L,
200L)), class = "data.frame", row.names = c("1", "2", "3"))
> dput(fr2)
structure(list(Fruit = structure(c(1L, 2L, 5L, 4L, 3L), .Label = c("apple",
"banana", "kiwi", "orange", "pear"), class = "factor"), Color = structure(c(3L,
4L, 1L, 2L, 1L), .Label = c("green", "orange", "red", "yellow"
), class = "factor"), Shape = structure(c(3L, 1L, 2L, 3L, 3L), .Label = c("oblong",
"pear", "round"), class = "factor"), Juice = c(1, 0, 0.5, 1,
0)), class = "data.frame", row.names = c("1", "2", "3", "4",
"5"))
>
> fr1
   Fruit Calories
1 banana      100
2   pear      100
3  mango      200
>
You can use merge() to combine those two data frames and get either all values from both
> merge(fr2, fr1, all=TRUE)
   Fruit  Color  Shape Juice Calories
1  apple    red  round   1.0       NA
2 banana yellow oblong   0.0      100
3   kiwi  green  round   0.0       NA
4 orange orange  round   1.0       NA
5   pear  green   pear   0.5      100
6  mango   <NA>   <NA>    NA      200
just values from data frame with calories
> merge(fr2, fr1, all.y=TRUE)
   Fruit  Color  Shape Juice Calories
1 banana yellow oblong   0.0      100
2   pear  green   pear   0.5      100
3  mango   <NA>   <NA>    NA      200
or just values from data frame with colours
> merge(fr2, fr1, all.x=TRUE)
   Fruit  Color  Shape Juice Calories
1  apple    red  round   1.0       NA
2 banana yellow oblong   0.0      100
3   kiwi  green  round   0.0       NA
4 orange orange  round   1.0       NA
5   pear  green   pear   0.5      100
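As for the first question, here is a rough step-by-step sketch of what the book's match()/is.na() line does. It assumes the two data frames as you posted them (only the Fruit column of fruitData is retyped, to keep it short); found, rows and calories_direct are names I made up for illustration.

```r
fruitData <- data.frame(
  Fruit = c("apple", "banana", "pear", "orange", "kiwi"),
  stringsAsFactors = FALSE
)
fruitNutr <- data.frame(
  Fruit    = c("banana", "pear", "mango"),
  Calories = c(100L, 100L, 200L),
  stringsAsFactors = FALSE
)

# For each fruit in fruitData: its row number in fruitNutr, or NA if absent
index <- match(fruitData$Fruit, fruitNutr$Fruit)

found <- !is.na(index)   # TRUE for the fruitData rows that have a match
rows  <- index[found]    # the matching row numbers in fruitNutr, NAs dropped

# Left side: the matched rows of fruitData
# Right side: the calories taken from the matching rows of fruitNutr
fruitData$Calories <- NA
fruitData$Calories[found] <- fruitNutr$Calories[rows]

# Shorter equivalent: indexing a vector with NA simply gives NA back
calories_direct <- fruitNutr$Calories[index]
```

So index[!is.na(index)] is just "the row numbers in fruitNutr, with the unmatched fruits thrown away", and both sides of the assignment then have the same length.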
Cheers
Petr
> -----Original Message-----
> From: R-help <r-help-bounces using r-project.org> On Behalf Of Drake Gossi
> Sent: Thursday, April 18, 2019 1:24 AM
> To: r-help using r-project.org
> Subject: [R] combining data.frames with is.na & match (), two questions
>
> Hello everyone,
>
> I'm working through this book, *Humanities Data in R* (Arnold & Tilton), and
> I'm just having trouble understanding this maneuver.
>
> In sum, I'm trying to combine data in two different data.frames.
>
> This data.frame is called fruitNutr
>
>    Fruit Calories
> 1 banana      100
> 2   pear      100
> 3  mango      200
>
> And this data.frame is called fruitData
>
>    Fruit  Color  Shape Juice
> 1  apple    red  round     1
> 2 banana yellow oblong     0
> 3   pear  green   pear   0.5
> 4 orange orange  round     1
> 5   kiwi  green  round     0
>
> So, as you can see, these two data.frames overlap insofar as they both have
> banana and pear. So, what happens next is the book suggests this:
>
> fruitData$Calories <- NA
>
>
> As a result, I've created a new column for the fruitData data.frame:
>
>    Fruit  Color  Shape Juice Calories
> 1  apple    red  round     1       NA
> 2 banana yellow oblong     0       NA
> 3   pear  green   pear   0.5       NA
> 4 orange orange  round     1       NA
> 5   kiwi  green  round     0       NA
>
> Then:
>
> > index <- match(x = fruitData$Fruit, table = fruitNutr$Fruit)
> > index
> [1] NA  1  2 NA NA
> > is.na(index)
> [1]  TRUE FALSE FALSE  TRUE  TRUE
> > fruitData$Calories[!is.na(index)] <- fruitNutr$Calories[index[!is.na(index)]]
> > fruitData
>
>    Fruit  Color  Shape Juice Calories
> 1  apple    red  round     1       NA
> 2 banana yellow oblong     0      100
> 3   pear  green   pear   0.5      100
> 4 orange orange  round     1       NA
> 5   kiwi  green  round     0       NA
>
> I get what the first part means, that first part being this:
> fruitData$Calories[!is.na(index)]
> go into the fruitData data.frame, specifically into the Calories column, and only
> for the rows where !is.na(index) is TRUE. But I just literally can't understand
> this last part: fruitNutr$Calories[index[!is.na(index)]]
>
> Two questions.
>
>
> 1. I just literally don't understand how this code works. It does work,
> of course, but I don't know what it's doing, specifically this
> [index[!is.na(index)]] part. Could someone explain it to me like I'm five? I'm
> new at this...
> 2. And then: is there any other way to combine these two data.frames so
> that we get this same result? maybe an easier to understand method?
>
> That same result, again, is
>
>    Fruit  Color  Shape Juice Calories
> 1  apple    red  round     1       NA
> 2 banana yellow oblong     0      100
> 3   pear  green   pear   0.5      100
> 4 orange orange  round     1       NA
> 5   kiwi  green  round     0       NA
>
>
> Drake
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.