#34325: PercentRank confusion
-----------------------------------------+------------------------
               Reporter:  dvg            |          Owner:  nobody
                   Type:  Uncategorized  |         Status:  new
              Component:  Documentation  |        Version:  4.1
               Severity:  Normal         |       Keywords:
           Triage Stage:  Unreviewed     |      Has patch:  0
    Needs documentation:  0              |    Needs tests:  0
Patch needs improvement:  0              |  Easy pickings:  0
                  UI/UX:  0              |
-----------------------------------------+------------------------
 The [https://docs.djangoproject.com/en/4.1/ref/models/database-functions/#percentrank
 documentation for the PercentRank window function] says:

   Computes the '''''percentile rank''''' of the rows in the frame clause.
 This computation is equivalent to evaluating:
   {{{
   (rank - 1) / (total rows - 1)
   }}}

 (my emphasis)

 However, I'm not sure that
 "[https://en.wikipedia.org/w/index.php?title=Percentile&oldid=1114275310
 percentile] rank" is the correct term for this formula.

 If you look up the (statistical) term "percentile rank" online, you'll
 find various definitions,
 [https://en.wikipedia.org/w/index.php?title=Percentile_rank&oldid=1136815121
 ranging from]

 {{{
 (CF - 0.5 * F) / N
 }}}

   where CF—the cumulative frequency—is the count of all scores less than
 or equal to the score of interest, F is the frequency for the score of
 interest, and N is the number of scores in the distribution.

 [https://www.geo.fu-berlin.de/en/v/soga/Basics-of-statistics/Descriptive-Statistics/Measures-of-Position/Percentiles-and-Percentile-Rank/index.html
 to something like]

 {{{
 <number of values less than the score of interest> / <total number of
 values in the data set>
 }}}

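 However, neither of these definitions exactly matches the one in the Django
 docs: both divide by the total number of values, whereas the Django formula
 divides by `total rows - 1`.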
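
 To make the difference concrete, here is a minimal sketch that evaluates
 the three formulas quoted above on a small, made-up list of scores. The
 function names and the sample data are hypothetical, chosen only for
 illustration. The sketch also assumes that "rank" follows the usual SQL
 `RANK()` semantics in ascending order (one plus the number of rows with a
 strictly smaller value), and that there are at least two rows.

```python
from fractions import Fraction


def django_percent_rank(scores, x):
    """(rank - 1) / (total rows - 1), as quoted from the Django docs.

    Assumption: rank is 1 + the number of strictly smaller values,
    i.e. SQL RANK() over an ascending ordering. Needs len(scores) > 1.
    """
    rank = 1 + sum(1 for s in scores if s < x)
    return Fraction(rank - 1, len(scores) - 1)


def wikipedia_percentile_rank(scores, x):
    """(CF - 0.5 * F) / N, as quoted from the Wikipedia article."""
    cf = sum(1 for s in scores if s <= x)  # cumulative frequency
    f = sum(1 for s in scores if s == x)   # frequency of the score itself
    return (cf - Fraction(1, 2) * f) / len(scores)


def fraction_below(scores, x):
    """<number of values less than x> / <total number of values>."""
    return Fraction(sum(1 for s in scores if s < x), len(scores))


# Hypothetical sample data with one tie.
scores = [10, 20, 20, 30, 40]

for x in sorted(set(scores)):
    print(
        x,
        django_percent_rank(scores, x),
        wikipedia_percentile_rank(scores, x),
        fraction_below(scores, x),
    )
```

 On this sample, the three formulas give three different values for the
 score 20, which is the mismatch described above.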

 Note also that the documentation for the `percent_rank` function in the
 [https://www.sqlite.org/windowfunctions.html#built_in_window_functions
 SQLite] and [https://www.postgresql.org/docs/15/functions-window.html
 PostgreSQL] database backends does '''not''' mention "percentile rank".
 Instead, they use the term "relative rank."

 To prevent confusion, wouldn't it be better to use the same terminology as
 the database backends?

-- 
Ticket URL: <https://code.djangoproject.com/ticket/34325>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
