[R] Removing NAs from dataframe (for use in Vioplot)

David Winsemius dwinsemius at comcast.net
Sun May 1 17:43:12 CEST 2016


> On May 1, 2016, at 12:15 AM, Mike Smith <mike at hsm.org.uk> wrote:
> 
>>>> On Apr 30, 2016, at 12:58 PM, Mike Smith <mike at hsm.org.uk> wrote:
> 
>>>> Hi
> 
>>>> First post and a relative R newbie....
> 
>>>> I am using the vioplot library to produce some violin plots.
> 
> DW> It's a package,  .... not a library.
> 
>>>> I have an input CSV with columns of irregular length that contain NAs. I want to strip the NAs out and produce a multiple violin plot automatically labelled using the headers. At the moment I do this
> 
>>>> Code: 
>>>> ds1 = read.csv("http://www.lecturematerials.co.uk/data/spelling.csv")
>>>> library(vioplot)
>>>> y6<-na.omit(ds1$y6)
>>>> y5<-na.omit(ds1$y5)
>>>> y4<-na.omit(ds1$y4)
>>>> y3<-na.omit(ds1$y3)
>>>> y2<-na.omit(ds1$y2)
>>>> y1<-na.omit(ds1$y1)
>>>> vioplot(y6, y5, y4,y3,y2,y1,horizontal=TRUE, names=c("Y6", "Y5","Y4","Y3","Y2","Y1"), col = "lightblue")
> 
> 
>>>> Two queries:
> 
>>>> 1. Is there a more elegant way of automatically stripping the NAs, passing the columns to the function along with the header names??
> 
> 
>>> ds2 <- lapply( ds1, na.omit)
> 
> 
> Fantastic - that does the trick! Easy when you know how!! 
> 
> Follow-on: is there a way to feed all the lists from ds2 to vioplot? It is now a series of lists rather than a dataframe (is that right?). So this works:
> 
> library(vioplot)
> ds1 = read.csv("http://www.lecturematerials.co.uk/data/spelling.csv")
> ds2 <- lapply( ds1, na.omit)
> vioplot(ds2$y1,ds2$y2)
> 
> but this doesn't:
> 
> library(vioplot)
> ds1 = read.csv("http://www.lecturematerials.co.uk/data/spelling.csv")
> ds2 <- lapply( ds1, na.omit)
> vioplot(ds2)
> 
Error in min(data) : invalid 'type' (list) of argument


I had trouble, too. I thought, "Oh, this is easy, just use `do.call`", but I could not get the arguments passed successfully that way.

> do.call('vioplot', list(x=ds2[[6]], ds2[-6]) )
Error in min(data) : invalid 'type' (list) of argument
> do.call('vioplot', c(x=ds2[[6]], ds2[-6]) )
Error in vioplot(x1 = 5L, x2 = 10L, x3 = 6L, x4 = 7L, x5 = 7L, x6 = 6L,  : 
  argument "x" is missing, with no default
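
The shape of those two argument lists can be inspected without vioplot at all. The sketch below uses a small made-up list in place of ds2 (the values are invented, not taken from spelling.csv) and only looks at what `list()` and `c()` build. The third variant, `args3`, is an assumption about what `do.call` would need, not something tested against vioplot here.

```r
# Made-up stand-in for ds2: a list of vectors of unequal length.
ds2 <- list(y1 = c(5L, 10L, 6L), y2 = c(7L, 7L), y3 = c(6L, 8L, 9L, 4L))

# First attempt: the remaining columns stay wrapped in one inner list,
# so the second argument is itself a list rather than a numeric vector.
args1 <- list(x = ds2[[1]], ds2[-1])
str(args1)

# Second attempt: c() flattens the vector, so each value becomes its own
# argument named x1, x2, ... and nothing is named exactly "x".
args2 <- c(x = ds2[[1]], ds2[-1])
names(args2)

# Wrapping the first vector in list() keeps it intact as a single "x".
args3 <- c(list(x = ds2[[1]]), ds2[-1])
names(args3)
```

This matches the two error messages: the first call hands `min()` a list, and the second never supplies an argument called `x`.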

Eventually I re-wrote the first line of vioplot's body to behave the way I thought made the most sense:

 vioplot <- 
function (x, ..., range = 1.5, h = NULL, ylim = NULL, names = NULL, 
    horizontal = FALSE, col = "magenta", border = "black", lty = 1, 
    lwd = 1, rectCol = "black", colMed = "white", pchMed = 19, 
    at, add = FALSE, wex = 1, drawRect = TRUE) 
{
    datas <- c(list(x), ...)
# .... but keep the rest the same.

# I then get success with:

vioplot(ds2[['y1']], ds2[-6])  # success

do.call('vioplot', list(x=ds2[[6]], ds2[-6]) ) # also succeeds
do.call('vioplot', list(x=ds2[['y1']], ds2[-6]) )

This is retracing a route explored 8 years ago:

http://markmail.org/search/?q=list%3Aorg.r-project.r-help+list+argument+to+vioplot#query:list%3Aorg.r-project.r-help%20list%20argument%20to%20vioplot+page:1+mid:j6lapgri46utcod7+state:results


It's probably easier to use that helper-function approach than my efforts at hacking.
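
A minimal sketch of what such a helper might look like follows. The name `plot_columns` and the example data are hypothetical, not taken from the archived thread. To keep the sketch runnable without extra packages it calls base `boxplot()`, which also accepts several vectors plus `names=`; with vioplot loaded, passing `vioplot` as `fun` should work the same way, though that is untested here.

```r
# Hypothetical helper (name is illustrative): drop NAs from each column and
# pass the columns as separate arguments to a plotting function `fun`.
plot_columns <- function(df, fun, ...) {
  cols <- lapply(df, na.omit)
  # unname() so the first column is matched positionally to fun's first
  # argument; the column headers are passed on separately as labels.
  do.call(fun, c(unname(cols), list(names = names(cols)), list(...)))
}

# Made-up data standing in for spelling.csv.
ds1 <- data.frame(y1 = c(5, 10, NA, 6),
                  y2 = c(7, NA, NA, 7),
                  y3 = c(6, 8, 9, 4))

# boxplot() is used only so the sketch runs in base R; the null PDF
# device keeps it from writing a file.
pdf(NULL)
res <- plot_columns(ds1, boxplot, horizontal = TRUE)
invisible(dev.off())
```
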

Best of luck;

David


>>>> 2. Can I easily add the sample size to each violin plotted??
> 
>>>> ?violplot
>>> No documentation for ‘violplot’ in specified packages and libraries:
>>> you could try ‘??violplot’
> 
> DW> I see that I misspelled that _package_ name. However, after loading
> DW> it I realized that I had no way of replicating what you are
> DW> seeing, because you didn't provide that file (or even something
> DW> that resembles it). It's rather unclear how you wanted this information presented.
> 
> The original code *should* have worked as the csv was online. There doesn't seem to be any option in vioplot to add the sample size (these are all small samples which I wanted to highlight) so I don't know if this is easily done elsewhere.
> 
> Thanks again!!
> ---
> Mike Smith
> 
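
One possible way to show the sample sizes, sketched under the assumption that putting them in the axis labels is acceptable: count the values left in each column after NA removal and paste the counts onto the column names. The data below are made up, and `labels` is just an illustrative variable name.

```r
# Made-up stand-in for ds2 (the NA-free list from lapply(ds1, na.omit)).
ds2 <- list(y1 = c(5, 10, 6), y2 = c(7, 7), y3 = c(6, 8, 9, 4))

# Count the observations per column and fold the counts into the labels.
n <- sapply(ds2, length)
labels <- paste0(names(ds2), " (n=", n, ")")
labels
```

The resulting character vector could then be supplied through the `names=` argument that vioplot already takes.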

David Winsemius
Alameda, CA, USA


