[R] Differentiate values in a plot by colour or symbol
Ismail SEZEN
sezenismail at gmail.com
Tue May 30 21:29:46 CEST 2017
> On 30 May 2017, at 21:44, Tobias Christoph <s3tochri at uni-bayreuth.de> wrote:
>
> Okay;)
>
> First of all, many thanks to you, Ismail, for really trying to help me. I am not an expert in R and am still learning.
>
You’re welcome.
> I just checked: All columns in my data frame are numeric. The range of years is from 2005 to 2016.
>
> Please find the result of "data.frame(data)" below. I have also attached it as an Excel file.
> > data.frame(data)
> town year revenue stations
> 1 Bremen 2005 39.91036 1
> 2 Bremen 2006 43.34265 1
> 3 Bremen 2007 44.03614 1
> 4 Bremen 2008 43.19945 1
> 5 Bremen 2009 39.05230 1
> 6 Bremen 2010 44.24626 1
> 7 Bremen 2011 46.19309 35
> 8 Bremen 2012 48.59513 101
> 9 Bremen 2013 48.15778 181
> 10 Bremen 2014 48.83199 323
> 11 Bremen 2015 48.68549 463
> 12 Bremen 2016 50.00000 614
> 13 Dresden 2005 42.27858 1
> 14 Dresden 2006 50.39606 1
> 15 Dresden 2007 48.73299 1
> 16 Dresden 2008 42.69010 1
> 17 Dresden 2009 40.81174 1
> 18 Dresden 2010 47.09675 2
> 19 Dresden 2011 49.16900 43
> 20 Dresden 2012 48.13645 151
> 21 Dresden 2013 48.13645 284
> 22 Dresden 2014 49.77309 511
> 23 Dresden 2015 51.51515 773
> 24 Dresden 2016 51.00000 1057
> 25 Düsseldorf 2005 44.23227 1
> 26 Düsseldorf 2006 46.70928 1
> 27 Düsseldorf 2007 51.24008 1
> 28 Düsseldorf 2008 61.59058 1
> 29 Düsseldorf 2009 45.70021 1
> 30 Düsseldorf 2010 57.17096 8
> 31 Düsseldorf 2011 60.60122 115
> 32 Düsseldorf 2012 65.50992 339
> 33 Düsseldorf 2013 64.06870 636
> 34 Düsseldorf 2014 71.56474 1117
> 35 Düsseldorf 2015 68.20119 1622
> 36 Düsseldorf 2016 80.00000 2117
> 37 Essen 2005 37.71029 1
> 38 Essen 2006 44.61127 1
> 39 Essen 2007 41.39926 1
> 40 Essen 2008 49.34792 1
> 41 Essen 2009 38.49137 1
> 42 Essen 2010 51.57844 1
> 43 Essen 2011 48.38058 29
> 44 Essen 2012 50.17066 42
> 45 Essen 2013 49.26759 90
> 46 Essen 2014 50.20367 162
> 47 Essen 2015 47.89430 258
> 48 Essen 2016 58.00000 370
> 49 Frankfurt 2005 47.97355 1
> 50 Frankfurt 2006 50.37223 1
> 51 Frankfurt 2007 49.11292 1
> 52 Frankfurt 2008 49.65316 1
> 53 Frankfurt 2009 44.53889 3
> 54 Frankfurt 2010 54.02567 15
> 55 Frankfurt 2011 56.29475 80
> 56 Frankfurt 2012 59.10949 223
> 57 Frankfurt 2013 62.30140 488
> 58 Frankfurt 2014 62.67521 836
> 59 Frankfurt 2015 66.93712 1319
> 60 Frankfurt 2016 66.00000 1744
> 61 Hannover 2005 39.82472 1
> 62 Hannover 2006 41.25841 1
> 63 Hannover 2007 40.80456 1
> 64 Hannover 2008 42.19192 1
> 65 Hannover 2009 36.96012 1
> 66 Hannover 2010 45.83055 5
> 67 Hannover 2011 49.86364 35
> 68 Hannover 2012 51.11023 167
> 69 Hannover 2013 52.69465 351
> 70 Hannover 2014 56.22519 983
> 71 Hannover 2015 56.95612 1413
> 72 Hannover 2016 61.00000 1864
> 73 Leipzig 2005 29.05982 1
> 74 Leipzig 2006 34.52306 1
> 75 Leipzig 2007 35.97303 1
> 76 Leipzig 2008 40.03798 1
> 77 Leipzig 2009 37.67574 2
> 78 Leipzig 2010 44.19365 3
> 79 Leipzig 2011 44.72397 53
> 80 Leipzig 2012 49.55416 223
> 81 Leipzig 2013 52.92384 488
> 82 Leipzig 2014 53.50600 918
> 83 Leipzig 2015 54.62963 1517
> 84 Leipzig 2016 59.00000 2037
> 85 Nürnberg 2005 43.51885 1
> 86 Nürnberg 2006 49.13278 1
> 87 Nürnberg 2007 46.92181 1
> 88 Nürnberg 2008 52.03628 1
> 89 Nürnberg 2009 43.45030 1
> 90 Nürnberg 2010 55.44258 5
> 91 Nürnberg 2011 57.21674 48
> 92 Nürnberg 2012 62.36625 145
> 93 Nürnberg 2013 61.49312 297
> 94 Nürnberg 2014 66.22809 505
> 95 Nürnberg 2015 63.38028 813
> 96 Nürnberg 2016 72.00000 1101
> 97 Rostock 2005 32.56640 1
> 98 Rostock 2006 30.71011 1
> 99 Rostock 2007 33.71970 1
> 100 Rostock 2008 34.25922 1
> 101 Rostock 2009 34.60181 1
> 102 Rostock 2010 40.17270 1
> 103 Rostock 2011 42.06082 3
> 104 Rostock 2012 42.43937 15
> 105 Rostock 2013 43.67011 43
> 106 Rostock 2014 43.93213 93
> 107 Rostock 2015 47.35883 174
> 108 Rostock 2016 52.00000 243
> 109 Stuttgart 2005 50.72972 1
> 110 Stuttgart 2006 58.74502 1
> 111 Stuttgart 2007 53.45797 1
> 112 Stuttgart 2008 56.18432 1
> 113 Stuttgart 2009 46.29588 1
> 114 Stuttgart 2010 56.38839 2
> 115 Stuttgart 2011 58.92586 33
> 116 Stuttgart 2012 61.16505 96
> 117 Stuttgart 2013 60.12524 200
> 118 Stuttgart 2014 65.89726 409
> 119 Stuttgart 2015 71.49853 661
> 120 Stuttgart 2016 73.00000 853
> 121 Wiebaden 2005 37.40724 1
> 122 Wiebaden 2006 38.94093 1
> 123 Wiebaden 2007 38.08423 1
> 124 Wiebaden 2008 38.23657 1
> 125 Wiebaden 2009 34.98646 1
> 126 Wiebaden 2010 40.72424 2
> 127 Wiebaden 2011 44.59304 8
> 128 Wiebaden 2012 47.58078 27
> 129 Wiebaden 2013 46.86706 59
> 130 Wiebaden 2014 46.58586 110
> 131 Wiebaden 2015 48.12320 163
> 132 Wiebaden 2016 50.00000 220
>
>
Instead of copying and pasting a data.frame or attaching an Excel file, learn how to create a minimal example like the one below:
set.seed(6)
# fake data with the same columns as the original data.frame
data <- data.frame(
  town = rep(LETTERS, each = 5, times = 5)[1:60],
  year = rep(2005:2016, times = 5),
  revenue = rnorm(60, 35),
  stations = round(rnorm(60, 250, 100)))
# colour each point by the interval its year falls into
plot(data$stations, data$revenue, xlab = "stations", ylab = "revenue", pch = 16,
     col = findInterval(data$year, c(2005, 2010, 2016)))
I created a fake data.frame similar to your original one and used it for the plot. Look at the resulting plot: it works as intended.
Let’s check the result of the findInterval function:
findInterval(data$year, c(2005, 2010, 2016))
[1] 1 1 1 1 1 2 2 2 2 2 2 3 1 1 1 1 1 2 2 2 2 2 2 3 1 1 1 1 1 2 2 2 2 2 2 3 1 1 1 1 1 2 2 2 2 2 2 3 1 1 1 1 1 2 2 2 2 2 2 3
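As a small sketch of how the breakpoints behave: each value in the second argument is the lower edge of an interval. Assuming the two groups from the original question (2005-2010 and 2011-2016) are wanted, c(2005, 2011) would separate them. The names years and idx are only illustrative.

```r
# Each breakpoint is the lower (inclusive) edge of an interval:
# [2005, 2011) -> 1 and [2011, Inf) -> 2
years <- 2005:2016
idx <- findInterval(years, c(2005, 2011))
print(idx)
# [1] 1 1 1 1 1 1 2 2 2 2 2 2
```

Years below the first breakpoint would get index 0, which base graphics treats as the background colour as far as I know, so the first breakpoint should not be larger than the smallest year in the data.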
And see the result of the palette function:
palette()
[1] "black" "red" "green3" "blue" "cyan" "magenta" "yellow" "gray"
The numbers returned by findInterval are used as indices into this palette. So the black dots in the plot belong to the years 2005-2009, the red dots to 2010-2015, and the green dots to 2016 only.
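If you prefer to choose the colours yourself instead of relying on the palette order, one possible sketch is to index your own colour vector with the result of findInterval and add a legend. The names breaks, period_cols, period_pch and grp are hypothetical, and the red/blue split is only an assumption taken from the original question.

```r
set.seed(6)
# fake data, same idea as the minimal example in this thread
data <- data.frame(
  year     = rep(2005:2016, times = 5),
  revenue  = rnorm(60, 35),
  stations = round(rnorm(60, 250, 100)))

breaks      <- c(2005, 2011)     # lower edges: 2005-2010 and 2011-2016
period_cols <- c("red", "blue")  # one colour per interval (assumed choice)
period_pch  <- c(16, 17)         # one symbol per interval (assumed choice)

# interval index (1 or 2) for every row
grp <- findInterval(data$year, breaks)

plot(data$stations, data$revenue, xlab = "stations", ylab = "revenue",
     col = period_cols[grp], pch = period_pch[grp])
legend("topright", legend = c("2005-2010", "2011-2016"),
       col = period_cols, pch = period_pch)
```

The same index vector drives both _col_ and _pch_, so the colour and the symbol of a point always agree with the legend.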
I hope your next questions will follow the posting guide; that makes things easier for you and for us :)
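If you do need to share a few rows of real data, dput() is a common alternative to pasting a printed table, because its output can be pasted straight back into R. A minimal sketch, using the first rows of the table you sent (the object names small and restored are only illustrative):

```r
# a few rows copied from the posted table
small <- data.frame(
  town     = c("Bremen", "Bremen", "Dresden"),
  year     = c(2005, 2006, 2005),
  revenue  = c(39.91036, 43.34265, 42.27858),
  stations = c(1, 1, 1))

# prints R code that recreates the object; paste this into the email
dput(small)

# what the reader does on the other side: evaluate the pasted text
restored <- eval(parse(text = paste(capture.output(dput(small)), collapse = "\n")))
```

This keeps the column types intact, so nobody has to guess whether a column is numeric or not.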
>
>
> Am 30.05.2017 um 20:30 schrieb Ismail SEZEN:
>>
>>> On 30 May 2017, at 21:23, Tobias Christoph <s3tochri at uni-bayreuth.de> wrote:
>>>
>>> Ahh, okay.
>>>
>>> I think I now understand exactly what you mean. But the plot is still not working, i.e. it does not differentiate the dots by colour. I used the following formula.
>>> "plot(data$stations, data$revenue, xlab="stations", ylab="revenue", col = findInterval(data$year, c(2005, 2010, 2015))"
>>>
>>> I think the problem is still related to the term "col = findInterval(data$year, c(2005, 2010, 2015))" and its notation.
>>>
>>> Just to make sure: "data" is the name of the data table imported into R, and "year" is the label of the column where the years are listed in the data table?
>>>
>> Exactly. Make sure all the columns in the data.frame are numeric. Also, I don’t know the range of years; you should adjust the arguments to findInterval according to your data. If you send a minimal example as described in the posting guide [1], you will have your answer in the second email :).
>>
>> [1] http://www.R-project.org/posting-guide.html
>>
>>> Cheers
>>>
>>>
>>>
>>> Am 30.05.2017 um 19:57 schrieb Ismail SEZEN:
>>>>
>>>>> On 30 May 2017, at 20:48, Tobias Christoph <s3tochri at uni-bayreuth.de> wrote:
>>>>>
>>>>> Hi Ismael,
>>>>>
>>>>> thanks for your quick reply.
>>>>>
>>>>> I was now able to estimate two intervals with the "findInterval" function.
>>>>>
>>>>> x
>>>>> [1,] 2005 1
>>>>> [2,] 2006 1
>>>>> [3,] 2007 1
>>>>> [4,] 2008 1
>>>>> [5,] 2009 1
>>>>> [6,] 2010 1
>>>>> [7,] 2011 2
>>>>> [8,] 2012 2
>>>>> [9,] 2013 2
>>>>> [10,] 2014 2
>>>>> [11,] 2015 2
>>>>> [12,] 2016 2
>>>>> But I was not able to connect the intervals with the plot function. I used the following formula:
>>>>> "plot(data$stations, data$revenue, xlab="stations", ylab="revenue", col(findInterval())"
>>>>>
>>>> In fact, I should have said “feed the _col_ or _pch_ argument with the result of findInterval”, as below:
>>>>
>>>> plot(data$stations, data$revenue, xlab="stations", ylab="revenue", col = findInterval(data$year, c(2005, 2010, 2015)))
>>>>
>>>> Please note that if you have many (20-30) intervals, colour handling becomes more complex. But I assume you have at most 5-10 intervals, so the piece of code above will work for you.
>>>>
>>>>> How can I proceed and get the plot function running?
>>>>> Maybe it is not running because the years are already contained in my data frame as single numbers?
>>>>> Cheers,
>>>>>
>>>>> Toby
>>>>>
>>>>>
>>>>> Am 30.05.2017 um 18:26 schrieb Ismail SEZEN:
>>>>>>> On 30 May 2017, at 19:02, Tobias Christoph <s3tochri at uni-bayreuth.de> wrote:
>>>>>>>
>>>>>>> Hey Guys,
>>>>>>>
>>>>>>> I am trying to differentiate certain values in my plot by colour or symbol.
>>>>>>>
>>>>>>> I have panel data with three dimensions (number of stations, revenue,
>>>>>>> years). To integrate the third dimension (years) in the plot, I want to
>>>>>>> differentiate the values(number of stations, revenue) by a certain range
>>>>>>> of years.
>>>>>>>
>>>>>>> e.g.: 2005-2010: red dots; 2011-2016: blue dots
>>>>>>>
>>>>>>> For the normal plot I used the following formula:
>>>>>>>
>>>>>>> plot(data$stations, data$revenue, xlab="stations", ylab="revenue")
>>>>>>>
>>>>>>> I only found a way to mark every single year. So hopefully you can help?
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> Toby
>>>>>>>
>>>>>> See ?findInterval, especially the first 3 lines of the _Examples_ section. Use the result of findInterval as the argument to _col_ or _pch_ in the plot function.
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
> <data.xlsx>