[R] graphs, need urgent help (deadline :( )
Don McKenzie
dmck at u.washington.edu
Wed Jun 10 21:07:27 CEST 2015
Here is code that IS tested. I am sending Rosa the (ugly) output in a separate file. Crazy problems with argument order; I never figured out
exactly what was wrong.
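One thing that can be pinned down is the "unexpected input" error quoted below: R only accepts straight quotes (" or ') around strings, and the pasted call has curly ones (“l”), which mail clients often substitute. A minimal sketch of the failure and the fix, using a made-up one-line call rather than the real plot code:

```r
# Hypothetical call with curly quotes, as it might arrive after copy/paste from an email
bad <- "plot(1:10, type=\u201cl\u201d)"

# Parsing it fails, just like typing it at the prompt
res <- tryCatch(parse(text = bad), error = function(e) "parse error")

# Swap the curly double quotes for straight ones
good <- gsub("\u201c", "\"", bad, fixed = TRUE)
good <- gsub("\u201d", "\"", good, fixed = TRUE)

# Now the call parses
expr <- parse(text = good)
```

Retyping the quotes by hand in the R editor does the same job as the gsub() calls.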
# therapy plot
plot(therapy.df$Region[therapy.df$sample==50],therapy.df$factor.a[therapy.df$sample==50],xlab="Region",ylab="factor",type="l",col=4,ylim=c(0,1.5))
lines(therapy.df$Region[therapy.df$sample==50],therapy.df$factor.b[therapy.df$sample==50],col=2)
lines(therapy.df$Region[therapy.df$sample==50],therapy.df$factor.c[therapy.df$sample==50],col=3)
lines(therapy.df$Region[therapy.df$sample==250],therapy.df$factor.a[therapy.df$sample==250],col=4,lty=2)
lines(therapy.df$Region[therapy.df$sample==250],therapy.df$factor.b[therapy.df$sample==250],col=2,lty=2)
lines(therapy.df$Region[therapy.df$sample==250],therapy.df$factor.c[therapy.df$sample==250],col=3,lty=2)
lines(therapy.df$Region[therapy.df$sample==1000],therapy.df$factor.a[therapy.df$sample==1000],col=4,lty=3)
lines(therapy.df$Region[therapy.df$sample==1000],therapy.df$factor.b[therapy.df$sample==1000],col=2,lty=3)
lines(therapy.df$Region[therapy.df$sample==1000],therapy.df$factor.c[therapy.df$sample==1000],col=3,lty=3)
legend(7,1.4,c("factor.a","factor.b","factor.c"),col=c(4,2,3),lty=1)
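The calls above (one plot() and eight lines()) differ only in the sample size, the factor column and lty, so the same figure can probably be drawn with two nested loops. This is a sketch, not tested against the real therapy.csv: the data frame below is a made-up stand-in with the same column names, and its values are purely illustrative.

```r
# Made-up stand-in for therapy.csv: 10 regions x 3 sample sizes (illustrative values only)
therapy.df <- data.frame(
  Region = rep(1:10, times = 3),
  sample = rep(c(50, 250, 1000), each = 10)
)
therapy.df$factor.a <- 1.0 - 0.08 * therapy.df$Region
therapy.df$factor.b <- 1.2 - 0.05 * therapy.df$Region
therapy.df$factor.c <- 0.5 + 0.03 * therapy.df$Region

factors <- c("factor.a", "factor.b", "factor.c")
cols    <- c(4, 2, 3)                        # blue, red, green, as in the calls above
sizes   <- sort(unique(therapy.df$sample))   # 50, 250, 1000
ltys    <- seq_along(sizes)                  # lty 1, 2, 3: one line type per sample size

# Draw an empty frame first, then one lines() call per sample size and factor
plot(range(therapy.df$Region), c(0, 1.5), type = "n",
     xlab = "Region", ylab = "factor")
for (i in seq_along(sizes)) {
  rows <- therapy.df$sample == sizes[i]
  for (j in seq_along(factors)) {
    lines(therapy.df$Region[rows], therapy.df[[factors[j]]][rows],
          col = cols[j], lty = ltys[i])
  }
}
legend(7, 1.4, factors, col = cols, lty = 1)
```

With the real data, only the data.frame() construction would be replaced by the data frame read from the csv file; the loops stay the same.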
> On Jun 10, 2015, at 11:03 AM, Rosa Oliveira <rosita21 at gmail.com> wrote:
>
> Sorry,
>
> I thought I attached the csv file :)
>
> <therapy.csv>
>
>
> Don,
>
> I tried, but I got an error:
>
> > my.data$Region
> [1] 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10
> > my.data$sample
> [1] 50 50 50 50 50 50 50 50 50 50 250 250 250 250 250 250 250 250 250 250 1000 1000 1000 1000 1000 1000 1000 1000
> [29] 1000 1000
> > my.data$factor.a
> [1] 0.895 0.811 0.685 0.777 0.600 0.466 0.446 0.392 0.256 0.198 0.136 0.121 0.875 0.777 0.685 0.626 0.550 0.466 0.384 0.330 0.060 0.138 0.065
> [24] 0.034 0.931 0.124 0.060 0.028 0.017 0.014
>
>
> > plot(my.data$Region[my.data$sample==50],my.data$factor.a[my.data$sample==50],col=4,type=“l”,xlab=“Region”,ylab=“factor")
> Error: unexpected input in "plot(my.data$Region[my.data$sample==50],my.data$factor.a[my.data$sample==50],col=4,type=“"
>
>
> I’m really naive, right?
>
>
> Best,
> RO
>
>
> Kind regards,
> Rosa Oliveira
>
> --
> ____________________________________________________________________________
>
> <smile.jpg>
>
> Rosa Celeste dos Santos Oliveira,
>
> E-mail: rosita21 at gmail.com
> Tlm: +351 939355143
> Linkedin: https://pt.linkedin.com/in/rosacsoliveira
> ____________________________________________________________________________
> "Many admire, few know"
> Hippocrates
>
>> On 10 Jun 2015, at 18:10, Don McKenzie <dmck at u.washington.edu> wrote:
>>
>> For a legend, try (untested)
>>
>> legend(0.15,0.9,c("factora","factorb","factorc"),col=c(4,2,3),lty=1)
>>
>> If it overlaps data points move the first two arguments (0.15 and 0.9) around, or change the “ylim” argument in the plot() to ~1.2.
>>
>> To avoid clutter, put the line-type information in the figure caption (IMO).
>>
>>
>>> On Jun 10, 2015, at 10:03 AM, Don McKenzie <dmck at u.washington.edu> wrote:
>>>
>>>
>>>> On Jun 10, 2015, at 9:08 AM, Rosa Oliveira <rosita21 at gmail.com> wrote:
>>>>
>>>> Dear All,
>>>>
>>>>
>>>> I attach my data.
>>>>
>>>> Dear Jim,
>>>>
>>>> When I run your code (even the version you sent me, not on my data), I get:
>>>>
>>>> Don't know how to automatically pick scale for object of type function. Defaulting to continuous
>>>> Error in data.frame(x = c(0.1, 0.2, 0.1, 0.2, 0.1, 0.2, 0.1, 0.2, 0.1, :
>>>> arguments imply differing number of rows: 24, 0
>>>>
>>>>
>>>>
>>>> Dear Don,
>>>>
>>>> I mean that I will have 9 lines:
>>>> 3 factors - line colors
>>>> with 3 different values of “sample” for each - line types
>>>>
>>>>
>>>> (Three colors, one for each factor,
>>>> and three line types (lty=1,2,3), one for each value of “sample”, preferably dashed, thin and thick.)
>>>>
>>>>
>>>> On the x-axis I should have region (because I have 10 regions);
>>>> for each region I have the outcome of 3 different treatments (factor);
>>>> for each region and each treatment I have 3 different sample sizes.
>>>
>>> But in your original post you had 4 sample sizes: 10,20,30,40.
>>>>
>>>> I need to “see” the influence of the region on the treatment outcome for each sample size.
>>>>
>>>> So, at the end I should have 9 lines
>>>> 3 red (1 dash, 1 thin, 1 thick) - concerning factor a (dash for sample size 50, thin for sample size 250 and thick for sample size 1000)
>>>> 3 blue (1 dash, 1 thin, 1 thick) - concerning factor b (dash for sample size 50, thin for sample size 250 and thick for sample size 1000)
>>>> 3 green (1 dash, 1 thin, 1 thick) - concerning factor c (dash for sample size 50, thin for sample size 250 and thick for sample size 1000)
>>>>
>>>>
>>>>
>>>> I hope it is clear this time.
>>>>
>>>>
>>>> I also thought about doing 3 different graphs, one for each sample size; in that case I would have 3 graphs, each with 3 lines:
>>>> 1 red for factor a, 1 blue for factor b and 1 green for factor c.
>>>>
>>>> Do you all think that is better?
>>>
>>> A matter of style perhaps but I would use dotplots because you have only two data points for each “line”. The lines will be misleading. You also could use
>>> panel plots, but given your skill set (unless someone wants to spend a fair bit of time with you), it’s probably best to stay as simple as possible.
>>>
>>> But given your original post (cleaned up) # untested: apologies for any typos
>>>
>>>> region sample factora factorb factorc
>>>> 0.1 10 0.895 0.903 0.378
>>>> 0.2 10 0.811 0.865 0.688
>>>> 0.1 20 0.735 0.966 0.611
>>>> 0.2 20 0.777 0.732 0.653
>>>> 0.1 30 0.600 0.778 0.694
>>>> 0.2 30 0.466 174.592 0.461
>>>> 0.1 40 0.446 0.432 0.693
>>>> 0.2 40 0.392 0.294 0.686
>>>
>>> plot(my.data$region[my.data$sample==10],my.data$factora[my.data$sample==10],col=4,type="l",ylim=c(0,1),xlab="region",ylab="factor")
>>> lines(my.data$region[my.data$sample==10],my.data$factorb[my.data$sample==10],col=2)
>>> lines(my.data$region[my.data$sample==10],my.data$factorc[my.data$sample==10],col=3)
>>>
>>> lines(my.data$region[my.data$sample==20],my.data$factora[my.data$sample==20],col=4,lty=2)
>>> lines(my.data$region[my.data$sample==20],my.data$factorb[my.data$sample==20],col=2,lty=2)
>>> lines(my.data$region[my.data$sample==20],my.data$factorc[my.data$sample==20],col=3,lty=2)
>>>
>>> # Now do two more groups of 3, changing the parameter “lty” to 3 and then 4
>>>
>>> # Look at the syntax and note what changes and what stays constant. Do you see how this works?
>>> # there will be what looks like a vertical line where sample = 30 and factorb = 174.592. Do you see why?
>>>
>>> # then you will need a legend
>>>
>>>> Nonetheless I can’t do it :(
>>>>
>>>> best,
>>>> RO
>>>>
>>>>
>>>>
>>>>
>>>>> On 10 Jun 2015, at 14:13, John Kane <jrkrideau at inbox.com> wrote:
>>>>>
>>>>> Hi Jim,
>>>>>
>>>>> I was looking at that last night and had the same problem of visualizing what Rosa needed.
>>>>>
>>>>> Hi Rosa
>>>>> This is nothing like what you wanted and I really don't understand your data but would something like this work as a substitute or am I completely lost?
>>>>>
>>>>>
>>>>> dat1 <- structure(list(region = c(0.1, 0.2, 0.1, 0.2, 0.1, 0.2, 0.1,
>>>>> 0.2), sample = c(10L, 10L, 20L, 20L, 30L, 30L, 40L, 40L), factora = c(0.895,
>>>>> 0.811, 0.735, 0.777, 0.6, 0.466, 0.446, 0.392), factorb = c(0.903,
>>>>> 0.865, 0.966, 0.732, 0.778, 0.592, 0.432, 0.294), factorc = c(0.37,
>>>>> 0.688, 0.611, 0.653, 0.694, 0.461, 0.693, 0.686)), .Names = c("region",
>>>>> "sample", "factora", "factorb", "factorc"), class = "data.frame", row.names = c(NA,
>>>>> -8L))
>>>>>
>>>>>
>>>>> library(reshape2)  # provides melt() with variable.name / value.name
>>>>> library(ggplot2)
>>>>> mdat1 <- melt(dat1, id.vars = c("region", "sample"),
>>>>>               variable.name = "factor",
>>>>>               value.name = "value")
>>>>> str(mdat1)
>>>>>
>>>>> ggplot(mdat1, aes(region, value, colour = factor)) +
>>>>> geom_line() + facet_grid(sample ~ .)
>>>>>
>>>>> John Kane
>>>>> Kingston ON Canada
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: drjimlemon at gmail.com
>>>>>> Sent: Wed, 10 Jun 2015 20:51:52 +1000
>>>>>> To: rosita21 at gmail.com
>>>>>> Subject: Re: [R] graphs, need urgent help (deadline :( )
>>>>>>
>>>>>> Hi Rosa,
>>>>>> Like Don, I can't work out what you want and I don't even have the
>>>>>> picture. For example, your specification of color and line type leaves
>>>>>> only one point for each color and line type, and the line from one
>>>>>> point to the same point is not going to show up. Here is a possibility
>>>>>> that may lead (eventually) to a solution.
>>>>>>
>>>>>> library(plotrix)
>>>>>> par(tcl=-0.1)
>>>>>> gap.plot(x=rep(seq(10,45,by=5),3),
>>>>>> y=unlist(my.data[,c("factora","factorb","factorc")]),
>>>>>> main="A plot of factorial mystery",
>>>>>> gap=c(1.1,174),ylim=c(0,175),ylab="factor score",xlab="Group",
>>>>>> xticlab=c(" \n0.1\n10"," \n0.2\n10"," \n0.1\n20"," \n0.2\n20",
>>>>>> " \n0.1\n30"," \n0.2\n30"," \n0.1\n40"," \n0.2\n40"),
>>>>>> ytics=c(0,0.5,1,174.59),pch=rep(1:3,each=8),col=rep(c(4,2,3),each=8))
>>>>>> mtext(c("Region","Sample"),side=1,at=6,line=c(0,1))
>>>>>> lines(seq(10,45,by=5),my.data$factora,col=4)
>>>>>> lines(seq(10,45,by=5),my.data$factorb[c(1:5,NA,7,8)],col=2)
>>>>>> lines(seq(10,45,by=5),my.data$factorc,col=3)
>>>>>>
>>>>>> Jim
>>>>>>
>>>>>>
>>>>>> On Wed, Jun 10, 2015 at 10:53 AM, Rosa Oliveira <rosita21 at gmail.com>
>>>>>> wrote:
>>>>>>> Dear Don and all,
>>>>>>>
>>>>>>> I’ve read the tutorial and tried several codes before posting :)
>>>>>>> I’m really naive.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> What I was trying to do is something like the graph in the picture I drew.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Is it more clear now?
>>>>>>>
>>>>>>>
>>>>>>>> On 09 Jun 2015, at 19:23, Don McKenzie <dmck at u.washington.edu> wrote:
>>>>>>>>
>>>>>>>> The answer lies in learning to use the help (and knowing where to
>>>>>>>> start). Did you look at the tutorial that comes with the R
>>>>>>>> installation?
>>>>>>>>
>>>>>>>> ?plot
>>>>>>>> ?lines
>>>>>>>>
>>>>>>>> ?par
>>>>>>>>
>>>>>>>> In the last, look for the descriptions of “col” and “lty”.
>>>>>>>>
>>>>>>>> Using plot() and lines(), and subsetting the four unique values of
>>>>>>>> “sample”, you can create your lines.
>>>>>>>>
>>>>>>>> Here is a crude start, assuming your columns are part of a data frame
>>>>>>>> called “my.data”. Untested...
>>>>>>>>
>>>>>>>> plot(my.data$region[my.data$sample==10],my.data$factora[my.data$sample==10],col=4,type="l")  # blue line, not dashed
>>>>>>>> .
>>>>>>>> .
>>>>>>>> .
>>>>>>>> lines(my.data$region[my.data$sample==20],my.data$factorb[my.data$sample==20],col=2,lty=2)  # red dashed line
>>>>>>>>
>>>>>>>>
>>>>>>>>> On Jun 9, 2015, at 10:36 AM, Rosa Oliveira <rosita21 at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> another naive question (i’m pretty sure :( )
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I’m trying to plot a multiple line graph:
>>>>>>>>>
>>>>>>>>> region sample factora factorb factorc
>>>>>>>>> 0.1 10 0.895 0.903 0.378
>>>>>>>>> 0.2 10 0.811 0.865 0.688
>>>>>>>>> 0.1 20 0.735 0.966 0.611
>>>>>>>>> 0.2 20 0.777 0.732 0.653
>>>>>>>>> 0.1 30 0.600 0.778 0.694
>>>>>>>>> 0.2 30 0.466 174.592 0.461
>>>>>>>>> 0.1 40 0.446 0.432 0.693
>>>>>>>>> 0.2 40 0.392 0.294 0.686
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The first column should be the independent variable; the second should
>>>>>>>>> determine the line type: a bold line for sample 10 and a dashed line for sample 20.
>>>>>>>>
>>>>>>>> What about the other two values of “sample”?
>>>>>>>>
>>>>>>>>> The other variables are outcomes for each of the first scenarios, so
>>>>>>>>> the 3rd, 4th and 5th columns should be blue, red and
>>>>>>>>> green respectively.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Summary :)
>>>>>>>>>
>>>>>>>>> I should have a graph with the region on the x-axis and the factor on the
>>>>>>>>> y-axis.
>>>>>>>>> Lines:
>>>>>>>>> 1 - blue and bold for region 0.1, sample 10 and factor a
>>>>>>>>> 2 - blue and dash for region 0.2, sample 10 and factor a
>>>>>>>>> 3 - red and bold for region 0.1, sample 10 and factor b
>>>>>>>>> 4 - red and dash for region 0.2, sample 10 and factor b
>>>>>>>>> 5 - green and bold for region 0.1, sample 10 and factor c
>>>>>>>>> 6 - green and dash for region 0.2, sample 10 and factor c
>>>>>>>>
>>>>>>>> Not consistent with what you said above. These are no longer lines, but
>>>>>>>> points.
>>>>>>>>>
>>>>>>>>> Although the independent variable is nominal, I should plot a line
>>>>>>>>> graph.
>>>>>>>>>
>>>>>>>>> Can anyone help me please?
>>>>>>>>> I have my file as a csv file, so I first read that file (that I know
>>>>>>>>> how to do :)).
>>>>>>>>>
>>>>>>>>> But I have it in that format.
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> RO
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ______________________________________________
>>>>>>>>> R-help at r-project.org <mailto:R-help at r-project.org> mailing list -- To
>>>>>>>>> UNSUBSCRIBE and more, see
>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>>>>> <https://stat.ethz.ch/mailman/listinfo/r-help>
>>>>>>>>> PLEASE do read the posting guide
>>>>>>>>> http://www.R-project.org/posting-guide.html
>>>>>>>>> <http://www.r-project.org/posting-guide.html>
>>>>>>>>> and provide commented, minimal, self-contained, reproducible code.