Hi Sarah,

Thanks for your very lucid explanations. Thanks also to David and Dennis.
I got it completely. I now have some nice ggplots of a couple of ECDFs in
my paper :-)

Now on to do some matrix plots of correlation matrices and some lm(). I'm like
a child in a candy shop. :-) I'm learning something about R every day.

Regards
Gawesh

On Mon, Oct 17, 2011 at 2:11 AM, Sarah Goslee <sarah.gos...@gmail.com> wrote:
> Hi,
>
> On Sun, Oct 16, 2011 at 8:48 PM, gj <gaw...@gmail.com> wrote:
>> David is right. I am looking for the ecdf for fs$numstudents. The
>> other column is just an id.
>>
>> I guess I don't know how to read the R documentation when it comes to
>> functions.
>>
>> Looking at the documentation, I now notice that it says "Compute an
>> empirical cumulative distribution function" and not a vector.
>>
>> But still I would have assumed that in ecdf(x) ... the x is the argument.
>
> ecdf() is the function you're calling.
> x is your vector, for which you want the ECDF.
>
> num.ecdf <- ecdf(fs$numstudents)
>
> There. That's the ECDF.
>
> But the ECDF is a *function* - that's what the F stands for, after all.
>
> If you're looking for the percentiles for your data, you might try:
>
> num.ecdf(fs$numstudents)
>
> You might also try working the examples given in ?ecdf yourself, so
> that you can see exactly what's going on before you try it with your
> own data.
>
>
>> So ecdf(fs$numstudents)(unique(fs$numstudents))
>>    ===============      ==================
>>    function             arguments
>>
>> Yes? But I can't read that from the documentation? I suspect it has
>> something to do with those dots .... in the arguments which I don't
>> understand.
>
> Yes.
>
> That's the condensed version of what I just proposed, done in
> one step, instead of two. The two-step version is definitely in
> the help. It doesn't have anything to do with the ..., which simply allow
> for other arguments to be passed.
>
>> Why it says usage ecdf(x) when it's clearly not the case?
>>
>> I don't get it.
>
> Clearly that is the case.
> ecdf(x) returns the empirical cumulative
> distribution *function* of the vector of data x.
>
> I'm not entirely sure what you think you should be getting. Perhaps
> if you explained your expectations, the list would be able to help
> you achieve them.
>
> Sarah
>
>> Gawesh
>>
>>
>> On Sun, Oct 16, 2011 at 11:02 PM, David Winsemius
>> <dwinsem...@comcast.net> wrote:
>>>
>>> On Oct 16, 2011, at 3:53 PM, Dennis Murphy wrote:
>>>
>>>> Hi:
>>>>
>>>> I don't understand what you're attempting to do. Wouldn't courseid be
>>>> a categorical variable with a numeric label? If that is so, why are
>>>> you trying to compute an EDF? An EDF computes cumulative relative
>>>> frequency of a random variable, which by definition is numeric. If we
>>>> were talking about EDFs for a distribution of student course grades on
>>>> a numeric point system by course, that would make some sense, but I
>>>> don't see how the course IDs themselves qualify as being on an
>>>> interval scale of measurement. Could you clarify your intent?
>>>
>>> Huh? gawesh asked for ecdf on numstudents (not courseid) ... pretty
>>> clearly a numeric value for which an ECDF should make sense.
>>>
>>> --
>>> David.
>>>
>>> --
>>>>
>>>> Dennis
>>>>
>>>> On Sun, Oct 16, 2011 at 8:31 AM, gj <gaw...@gmail.com> wrote:
>>>>>
>>>>> Hi,
>>>>> Newbie here. I read the R for Beginners but I still don't get this.
>>>>>
>>>>> I have the following data (this is just an example) in a CSV file:
>>>>>
>>>>> courseid  numstudents
>>>>> 101       209
>>>>> 141       13
>>>>> 246       140
>>>>> 263       8
>>>>> 321       10
>>>>> 361       10
>>>>> 364       28
>>>>> 365       25
>>>>> 366       23
>>>>> 367       34
>>>>>
>>>>> I load my data using:
>>>>>
>>>>> fs<-read.csv(file="C:\\num_students_inallmodules.csv",header=T, sep=',')
>>>>>
>>>>> I want to get the ecdf.
>>>>> So, I looked at the ?ecdf which says
>>>>> usage:ecdf(x)
>>>>>
>>>>> So I expected ecdf(fs$numstudents) to work
>>>>>
>>>>> Instead it just returned:
>>>>> Call: ecdf(fs$numstudents)
>>>>> x[1:210] = 1, 2, 3, ..., 3717, 4538
>>>>>
>>>>> After Googling, got this to work:
>>>>> ecdf(fs$numstudents)(unique(fs$numstudents))
>>>>>
>>>>> But I don't understand why if the ?ecdf says usage is ecdf(x) ... I
>>>>> need to use ecdf(fs$numstudents)(unique(fs$numstudents)) to get this
>>>>> to work?
>>>>>
>>>>> Can somebody explain this to me?
>>>>>
>>>>> Regards
>>>>> Gawesh
>>>>>
>>>>> ______________________________________________
>>>>> R-help@r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>
>>>
>>> David Winsemius, MD
>>> Heritage Laboratories
>>> West Hartford, CT
>>>
>>
>
> --
> Sarah Goslee
> http://www.stringpage.com
> http://www.sarahgoslee.com
> http://www.functionaldiversity.org
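
For anyone reading this thread later, here is a minimal sketch of the two-step
usage Sarah describes in the quoted reply, using only the ten example rows from
the original post. The data frame is typed in directly rather than read from
the CSV file (an assumption, made so the snippet is self-contained), and
one.step is a hypothetical name introduced only for illustration:

```r
# Example data copied from the original post (the ten sample rows only;
# typed in here instead of read.csv() so the snippet runs on its own).
fs <- data.frame(
  courseid    = c(101, 141, 246, 263, 321, 361, 364, 365, 366, 367),
  numstudents = c(209, 13, 140, 8, 10, 10, 28, 25, 23, 34)
)

# Step 1: ecdf() returns a *function*, not a vector of numbers.
num.ecdf <- ecdf(fs$numstudents)
print(is.function(num.ecdf))

# Step 2: call that function on the values of interest. Each result is the
# proportion of observations less than or equal to the given value.
print(num.ecdf(fs$numstudents))
# 1.0 0.4 0.9 0.1 0.3 0.3 0.7 0.6 0.5 0.8

# The condensed one-step form from the thread does the same thing:
# build the function and immediately call it.
one.step <- ecdf(fs$numstudents)(fs$numstudents)
print(identical(one.step, num.ecdf(fs$numstudents)))
```

Passing unique(fs$numstudents) instead, as in the original post, only drops
the duplicated 10 from the input; the ECDF function itself is unchanged.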