[R] Compute the Gini coefficient
marine.regis at hotmail.fr
Fri Apr 1 02:11:38 CEST 2016
Thank you very much for your help.
How can I draw a Lorenz curve when I have several replications?
Here is an example with 4 replications:
test <- cbind(parasites,hosts,replications)
Should I first average the host frequencies over the replications, and then calculate the cumulative percentage of hosts from those averaged frequencies?
Thank you very much for your time.
Have a nice day.
From: Achim Zeileis <Achim.Zeileis at uibk.ac.at>
Sent: Wednesday, 30 March 2016 12:05
To: Erich Neuwirth
Cc: Marine Regis; r-help at r-project.org
Subject: Re: [R] Compute the Gini coefficient
On Wed, 30 Mar 2016, Erich Neuwirth wrote:
>> On 30 Mar 2016, at 02:53, Marine Regis <marine.regis at hotmail.fr> wrote:
>> I would like to build a Lorenz curve and calculate a Gini coefficient in order to find how many parasites the 20% most infected hosts support.
>> Here is my data set:
>> Number of parasites per host:
>> parasites = c(0,1,2,3,4,5,6,7,8,9,10)
>> Number of hosts associated with each number of parasites given above:
>> hosts = c(18,20,28,19,16,10,3,1,0,0,0)
>> To represent the Lorenz curve:
>> I manually calculated the cumulative percentage of parasites and hosts:
>> cumul_parasites <- cumsum(parasites)/max(cumsum(parasites))
>> cumul_hosts <- cumsum(hosts)/max(cumsum(hosts))
>> plot(cumul_hosts, cumul_parasites, type = "l")
> Your values in hosts are frequencies. So you need to calculate
> cumul_hosts = cumsum(hosts)/sum(hosts)
> cumul_parasites = cumsum(hosts*parasites)/sum(hosts*parasites)
That's what I thought as well, but Marine explicitly said that the 'hosts'
vector is _not_ a vector of weights. Hence I was confused about what this would actually mean.
Using the "ineq" package you can also do
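For illustration only, here is a minimal sketch of what such a call could look like. It assumes that Lc() from "ineq" accepts the frequencies through its n argument and that Gini() is applied to the expanded per-host data; it is not necessarily the code meant above. The requireNamespace() guard is only there so the snippet does nothing when the package is not installed.

```r
parasites <- 0:10
hosts <- c(18, 20, 28, 19, 16, 10, 3, 1, 0, 0, 0)

if (requireNamespace("ineq", quietly = TRUE)) {
  # Lorenz curve for grouped data: 'parasites' are the values,
  # 'hosts' the number of hosts observed with each value
  lc <- ineq::Lc(parasites, n = hosts)
  plot(lc)

  # Gini coefficient from the expanded data (one entry per host)
  gini_ineq <- ineq::Gini(rep(parasites, hosts))
}
```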
> The Lorenz curve starts at (0,0), so to draw it, you need to extend these vectors
> cumul_hosts = c(0,cumul_hosts)
> cumul_parasites = c(0,cumul_parasites)
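Putting these steps together, a base-R sketch could look as follows. It assumes that "hosts" really are frequencies, and it normalises the parasite totals by sum(hosts * parasites), the total number of parasites. The names total, below80 and top20_share are made up for this example; the last two lines read the share carried by the 20% most infected hosts off the curve by linear interpolation.

```r
parasites <- 0:10
hosts <- c(18, 20, 28, 19, 16, 10, 3, 1, 0, 0, 0)

# Parasites carried by each class of hosts
total <- hosts * parasites

# Cumulative shares, extended so that the curve starts at (0, 0)
cumul_hosts <- c(0, cumsum(hosts) / sum(hosts))
cumul_parasites <- c(0, cumsum(total) / sum(total))

plot(cumul_hosts, cumul_parasites, type = "l",
     xlab = "Cumulative share of hosts",
     ylab = "Cumulative share of parasites")
abline(0, 1, lty = 2)  # line of perfect equality

# Share of parasites carried by the least infected 80% of hosts,
# interpolated linearly along the curve (x values are already sorted)
below80 <- approx(cumul_hosts, cumul_parasites, xout = 0.8,
                  ties = "ordered")$y
top20_share <- 1 - below80
```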
> The Gini coefficient can be calculated as
> If you want to check, you can "recreate" the original data (number of parasites for each host) with
> num_parasites = rep(parasites,hosts)
> will also give you the Gini coefficient you want.
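The same check can be sketched in base R without any package, again assuming "hosts" are frequencies. The grouped version takes one minus twice the area under the Lorenz curve (trapezoid rule); the expanded version applies the usual rank formula for the Gini coefficient to the per-host data. The names gini_grouped and gini_expanded are made up for this example.

```r
parasites <- 0:10
hosts <- c(18, 20, 28, 19, 16, 10, 3, 1, 0, 0, 0)

# Grouped data: Lorenz curve points, starting at (0, 0)
p <- c(0, cumsum(hosts) / sum(hosts))
L <- c(0, cumsum(hosts * parasites) / sum(hosts * parasites))

# Gini = 1 - 2 * (area under the Lorenz curve), trapezoid rule
gini_grouped <- 1 - sum(diff(p) * (head(L, -1) + tail(L, -1)))

# Expanded data: one entry per host, then the rank-based formula
num_parasites <- sort(rep(parasites, hosts))
n <- length(num_parasites)
gini_expanded <- sum((2 * seq_len(n) - n - 1) * num_parasites) /
  (n * sum(num_parasites))

# Both routes should agree up to floating-point error
isTRUE(all.equal(gini_grouped, gini_expanded))
```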
>> From this Lorenz curve, how can I calculate the Gini coefficient with the function "gini" in R (package reldist), given that the vector "hosts" is not a vector of weights?
>> Thank you very much for your help.
>> Have a nice day