[R] ecdf

Dennis Murphy djmuser at gmail.com
Mon Oct 17 03:16:16 CEST 2011


Thanks for the clarification. I stand corrected.

Dennis

On Sun, Oct 16, 2011 at 5:48 PM, gj <gawesh at gmail.com> wrote:
> David is right. I am looking for the ecdf for fs$numstudents. The
> other column is just an id.
>
> I guess I don't know how to read the R documentation when it comes to functions.
>
> Looking at the documentation, I now notice that it says "Compute an
> empirical cumulative distribution function" and not a vector.
>
> But still, I would have assumed that in ecdf(x) ... the x is the argument.
>
> So ecdf(fs$numstudents)(unique(fs$numstudents))
>     ==================== ======================
>           function             arguments
>
> Yes? But I can't read that from the documentation? I suspect it has
> something to do with those dots .... in the arguments, which I don't
> understand.
>
> Why does it say usage ecdf(x) when that's clearly not the case?
>
> I don't get it.
>
> Gawesh
>
>
> On Sun, Oct 16, 2011 at 11:02 PM, David Winsemius
> <dwinsemius at comcast.net> wrote:
>>
>> On Oct 16, 2011, at 3:53 PM, Dennis Murphy wrote:
>>
>>> Hi:
>>>
>>> I don't understand what you're attempting to do. Wouldn't courseid be
>>> a categorical variable with a numeric label? If that is so, why are
>>> you trying to compute an EDF? An EDF computes cumulative relative
>>> frequency of a random variable, which by definition is numeric. If we
>>> were talking about EDFs for a distribution of student course grades on
>>> a numeric point system by course, that would make some sense, but I
>>> don't see how the course IDs themselves qualify as being on an
>>> interval scale of measurement. Could you clarify your intent?
>>
>> Huh? gawesh asked for ecdf on numstudents (not courseid)  ... pretty
>> clearly a numeric value for which an ECDF should make sense.
>>
>> --
>> David.
>>
>> --
>>>
>>> Dennis
>>>
>>> On Sun, Oct 16, 2011 at 8:31 AM, gj <gawesh at gmail.com> wrote:
>>>>
>>>> Hi,
>>>> Newbie here. I read the R for Beginners but I still don't get this.
>>>>
>>>> I have the following data (this is just an example) in a CSV file:
>>>>
>>>>   courseid numstudents
>>>>       101         209
>>>>       141          13
>>>>       246         140
>>>>       263           8
>>>>       321          10
>>>>       361          10
>>>>       364          28
>>>>       365          25
>>>>       366          23
>>>>       367          34
>>>>
>>>> I load my data using:
>>>>
>>>> fs <- read.csv(file = "C:\\num_students_inallmodules.csv", header = TRUE, sep = ",")
>>>>
>>>> I want to get the ecdf. So, I looked at the ?ecdf which says
>>>> usage:ecdf(x)
>>>>
>>>> So I expected ecdf(fs$numstudents) to work
>>>>
>>>> Instead it just returned:
>>>> Call: ecdf(fs$numstudents)
>>>>  x[1:210] =      1,      2,      3,  ...,   3717,   4538
>>>>
>>>> After Googling, got this to work:
>>>> ecdf(fs$numstudents)(unique(fs$numstudents))
>>>>
>>>> But I don't understand why if the ?ecdf says usage is ecdf(x) ... I
>>>> need to use ecdf(fs$numstudents)(unique(fs$numstudents)) to get this
>>>> to work?
>>>>
>>>> Can somebody explain this to me?
>>>>
>>>> Regards
>>>> Gawesh
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>
>> David Winsemius, MD
>> Heritage Laboratories
>> West Hartford, CT
>>
>>
>
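
The question in this thread comes down to ecdf(x) returning a function
rather than a vector of values, which is why a second pair of parentheses
is needed to get numbers out. A minimal sketch in R, assuming only the ten
numstudents values from the example table in the original post (the names
numstudents and Fn are illustrative and do not come from the thread):

```r
# Example data: the numstudents column from the original post
numstudents <- c(209, 13, 140, 8, 10, 10, 28, 25, 23, 34)

# ecdf(x) takes the data as its argument, as the Usage section says,
# but what it returns is a function (an object of class "ecdf")
Fn <- ecdf(numstudents)
is.function(Fn)        # TRUE
inherits(Fn, "ecdf")   # TRUE

# Calling the returned function evaluates the ECDF at the given points
Fn(10)                         # 0.3: three of the ten values are <= 10
Fn(sort(unique(numstudents)))  # ECDF at each distinct observed value

# The one-liner from the thread does both steps at once:
# build the function, then immediately call it
ecdf(numstudents)(10)
```

So the Usage line ecdf(x) is accurate: x is the only argument to ecdf
itself. The second set of parentheses is an ordinary call to the function
that ecdf returned. As far as the help page goes, the dots in ?ecdf appear
to belong to the plot, print, summary and quantile methods listed under
Usage, not to ecdf(x).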


