[R] Arranging column data to create plots
Jeff Newmiller
jdnewmil at dcn.davis.ca.us
Sun Jul 16 23:07:14 CEST 2017
On Sat, 15 Jul 2017, Michael Reed via R-help wrote:
> Dear All,
>
> I need some help arranging data that was imported.
It would be helpful if you were to use dput to give us the sample data
since you say you have already imported it.
> The imported data frame looks something like this (the actual file is
> huge, so this is example data)
>
> DF:
> IDKey X1 Y1 X2 Y2 X3 Y3 X4 Y4
> Name1 21 15 25 10
> Name2 15 18 35 24 27 45
> Name3 17 21 30 22 15 40 32 55
The X3, Y3, etc. values are blank in your example, but they would be NA in
an actual data frame, so I don't know whether my workaround matches yours.
Output from dput() would have clarified the starting point.
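For anyone unfamiliar with dput, here is a minimal sketch of what I mean. The small DF below is a hypothetical stand-in for the imported object (the column subset and values are only for illustration); the point is that dput prints R code that recreates the object exactly, NA values and column types included.

```r
# Hypothetical stand-in for the imported data frame (illustration only)
DF <- data.frame(
  IDKey = c("Name1", "Name2"),
  X1 = c(21L, 15L),
  Y1 = c(15L, 18L),
  X3 = c(NA, 27L),
  stringsAsFactors = FALSE
)

# Print a structure(...) call that can be pasted into an email
dput(DF)

# Simulate a reader pasting that text back into their own session
DF2 <- eval(parse(text = paste(capture.output(dput(DF)), collapse = "\n")))
all.equal(DF, DF2)
```

For a huge file, dput(head(DF)) keeps the posted sample short.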
> I would like to create a new data frame with the following
>
> NewDF:
> IDKey X Y
> Name1 21 15
> Name1 25 10
> Name2 15 18
> Name2 35 24
> Name2 27 45
> Name3 17 21
> Name3 30 22
> Name3 15 40
> Name3 32 55
>
> With the data like this I think I can do the following
>
> ggplot(NewDF, aes(x=X, y=Y, color=IDKey) + geom_line
You are missing parentheses: the ggplot() call is never closed after aes(),
and geom_line needs () to be called. If you use the reprex package to test
your examples before posting them, you can be sure your simple errors don't
send us off on wild goose chases.
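A sketch of the call with the parentheses balanced, assuming NewDF has the layout you describe (I have typed the values in by hand from your example, so treat the data frame construction as illustrative):

```r
library(ggplot2)

# NewDF typed in from the example in the question
NewDF <- data.frame(
  IDKey = rep(c("Name1", "Name2", "Name3"), times = c(2, 3, 4)),
  X = c(21, 25, 15, 35, 27, 17, 30, 15, 32),
  Y = c(15, 10, 18, 24, 45, 21, 22, 40, 55)
)

# ggplot() is closed after aes(), and geom_line is called with ()
p <- ggplot(NewDF, aes(x = X, y = Y, color = IDKey)) + geom_line()
print(p)
```

Note that geom_line connects the points in order of X. If the points should be joined in the order they appear in the data, geom_path is the layer to use instead.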
> and get 3 lines with the various number of points.
>
> The point is that each of the XY pairs is a data point tied to NameX.
> I would like to rearrange the data so I can plot the points/lines by the
> IDKey. There will be at least 2 points, but the number of points for
> each IDKey can be as many as 4.
>
> I have tried using the gather() function from the tidyverse package, but
The tidyverse package is a virtual package that pulls in many packages;
gather() itself comes from tidyr.
> I can't make it work. The issue is that I believe I need two separate
> gather statements (one for X, another for Y) to consolidate the data.
> This causes the pairs to not stay together and the data becomes jumbled.
No, what you need is a gather-spread.
######
library(dplyr)
library(tidyr)
DF <- read.table( text=
"IDKey X1 Y1 X2 Y2 X3 Y3 X4 Y4
Name1 21 15 25 10 NA NA NA NA
Name2 15 18 35 24 27 45 NA NA
Name3 17 21 30 22 15 40 32 55
", header=TRUE, as.is=TRUE )
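NewDF <- ( DF
  # stack all X*/Y* columns into name-value pairs, keeping IDKey
  %>% gather( XY, value, -IDKey )
  # split e.g. "X1" after the first character into "X" and "1"
  %>% separate( XY, c( "Coord", "Num" ), 1 )
  # put the X and Y values back side by side, one row per IDKey/Num
  %>% spread( Coord, value )
  # drop the pairs that were empty in the wide layout
  %>% filter( !is.na( X ) & !is.na( Y ) )
)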
######
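As an aside, later releases of tidyr (1.0.0 and up) added pivot_longer, which can do the gather-separate-spread sequence in one call. This is a sketch of that alternative and not the method used above; it assumes every column name is a single X or Y followed by digits, and NewDF2 is just a name I picked to keep it apart from NewDF.

```r
library(tidyr)  # assumes tidyr >= 1.0.0 for pivot_longer

DF <- read.table( text=
"IDKey X1 Y1 X2 Y2 X3 Y3 X4 Y4
Name1 21 15 25 10 NA NA NA NA
Name2 15 18 35 24 27 45 NA NA
Name3 17 21 30 22 15 40 32 55
", header=TRUE, as.is=TRUE )

NewDF2 <- pivot_longer(
  DF,
  cols = -IDKey,
  # ".value" sends the first capture group (X or Y) to output column names;
  # the second capture group (the digits) goes into a column called Num
  names_to = c( ".value", "Num" ),
  names_pattern = "([XY])(\\d+)",
  # drop the rows where both X and Y are NA
  values_drop_na = TRUE
)
NewDF2
```

The result has columns IDKey, Num, X and Y, so the X and Y of each pair stay together without a separate spread step.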
---------------------------------------------------------------------------
Jeff Newmiller The ..... ..... Go Live...
DCN:<jdnewmil at dcn.davis.ca.us> Basics: ##.#. ##.#. Live Go...
Live: OO#.. Dead: OO#.. Playing
Research Engineer (Solar/Batteries O.O#. #.O#. with
/Software/Embedded Controllers) .OO#. .OO#. rocks...1k