[R] How to create a readable plot in R with 10000+ values in a dataframe

Carlos Ortega co|or|e @end|ng |rom gm@||@com
Wed Jul 29 21:04:43 CEST 2020


Hello Ritwik,

There is another possibility.

You can count (cross-tabulate) the number of elements for each Region and
Machine (with the table() function) and represent this table with the
geom_tile() function. With this you will get the equivalent of a heatmap,
which will give you a good sense of which Region/Machine combinations prevail.

Here you can get an example of how to use it:

   - https://www.r-graph-gallery.com/79-levelplot-with-ggplot2.html
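
As a rough sketch of the table() + geom_tile() idea, assuming ggplot2 is
installed: the data frame `df` below is a made-up stand-in for
df3_machine_region (the machine names and row count are hypothetical), so
only the shape of the code carries over to the real data.

```r
library(ggplot2)

# Hypothetical stand-in for df3_machine_region; the real data was not posted in full
set.seed(1)
df <- data.frame(
  Machine.Name = sample(paste0("machine-", 1:20), 500, replace = TRUE),
  Region       = sample(c("Americas", "APAC", "Europe"), 500, replace = TRUE)
)

# Cross-tabulate Region x Machine.Name; as.data.frame() turns the table into
# long form with the columns Region, Machine.Name and Freq
counts <- as.data.frame(table(Region = df$Region, Machine.Name = df$Machine.Name))

# One tile per Region/Machine combination, filled by its count
p <- ggplot(counts, aes(Region, Machine.Name, fill = Freq)) +
  geom_tile() +
  scale_fill_viridis_c(name = "Count")
print(p)
```

With thousands of distinct machine names the y-axis labels will overlap; one
option is to hide them with theme(axis.text.y = element_blank()) and let the
colour pattern carry the message.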

And, just in case you have to represent numeric values (a numeric scatter
plot), there is an excellent way to graph that with this package, without
leaving the ggplot ecosystem:

https://github.com/LKremer/ggpointdensity
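
A minimal sketch of how that package is typically used, assuming both ggplot2
and ggpointdensity are installed. The numeric data frame `dat` is invented
for illustration; the data in this thread is categorical, so this applies
only to the numeric case mentioned above.

```r
library(ggplot2)

# Hypothetical numeric data, for illustration only
set.seed(1)
dat <- data.frame(x = rnorm(10000), y = rnorm(10000))

# geom_pointdensity() colours each point by the number of neighbouring points,
# so dense regions stand out even when points overlap
if (requireNamespace("ggpointdensity", quietly = TRUE)) {
  p <- ggplot(dat, aes(x, y)) +
    ggpointdensity::geom_pointdensity() +
    scale_colour_viridis_c(name = "Neighbours")
  print(p)
}
```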

Thanks,
Carlos Ortega.

On Wed, Jul 29, 2020 at 11:31 AM Jim Lemon <drjimlemon using gmail.com> wrote:

> Hi Ritwik,
> I haven't seen any further answers to your request, so I'll make a
> suggestion. I don't think there is any sensible way to illustrate that
> many data points on a single plot. I would try to segment the data by
> machine type or similar and plot a number of plots.
>
> Jim
>
> On Fri, Jul 24, 2020 at 11:34 PM Ritwik Mohapatra <ritm84 using gmail.com>
> wrote:
> >
> > Hi All,
> >
> > These are the two code snippets I have used so far:
> > ggplot(df3_machine_region, aes(Region, Machine.Name)) +
> >   geom_count()
> > ggplot(df3_machine_region, aes(Region, Machine.Name)) +
> >   geom_jitter(aes(colour = Region))
> > [Attached plots: 1st Plot, 2nd Plot]
> >
> > I have to present the plot to my stakeholders, so it needs to be
> > readable and legible.
> >
> > There would be approximately 10k+ values (max) for the machine and region
> > combinations.
> >
> > I have attached the output plots for your reference. Please find below a
> > snapshot of the data.
> >
> > |Machine.Name|Region|
> > |0460-EPBS1.sga-res.com|Europe|
> > |04821-EABS1.sga-res.com|Europe|
> > |10429-EDABS1.sga-res.com|Europe|
> > |1042619-ESWEBS1.sga-res.com|Europe|
> > |ABE-L-98769.europe.shell.com|Americas|
> > |AB-L-98769.europe.shell.com|APAC|
> > |AB-L-98769.europe.shell.com|Europe|
> > |ABE-L-98769.europe.shell.com (2)|Americas|
> > |ABE-L-98769.europe.shell.com (2)|Europe|
> > |ABE-L-98840.europe.shell.com|Americas|
> > |AB-L-98840.europe.shell.com|APAC|
> > |ABE-L-98840.europe.shell.com|Europe|
> > |AB-L-98854.europe.shell.com|Americas|
> > |ABE-L-98854.europe.shell.com|Europe|
> > |ABE-L-98862.europe.shell.com|Americas|
> >
> > Regards,
> > Ritwik
> >
> > On Fri, Jul 24, 2020 at 6:05 PM Martin Maechler
> > <maechler using stat.math.ethz.ch> wrote:
> >
> > > >>>>> Ritwik Mohapatra
> > > >>>>>     on Thu, 23 Jul 2020 23:41:57 +0530 writes:
> > >
> > >     > How to create a readable and legible plot in R with 10k+ values? I
> > >     > have a dataframe with 17298 records. There are two columns: Machine
> > >     > Name (character) and Region (character). So I want to create a
> > >     > readable plot with region on the x axis and machine name on the y
> > >     > axis. How do I do that using ggplot or any other way? Please help.
> > >
> > > Good answers to this question will depend very much on how many
> > > 'Machine' and 'Region' levels there are.
> > >
> > > (and this is a case where in my opinion it'd be *MUCH* more
> > >  useful to have 'factor' instead of 'character'.. if only just
> > >  so
> > >          str(<data>)
> > > or   summary(<data>)
> > >
> > > would give useful/relevant information.)
> > >
> > > --
> > > One possibility for a somewhat cute plot is a  "good ole"
> > > sunflower plot (base graphics, but the idea must be easily
> > > transferable to grid-based graphics such as ggplot2):
> > >
> > >   help(sunflowerplot)
> > >
> > >
> > > Martin Maechler
> > > ETH Zurich
> > >
> > ______________________________________________
> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
