[Python-Dev] Re: What is a public API?

2019-07-20 Thread Serhiy Storchaka

14.07.19 05:09, Raymond Hettinger wrote:

On Jul 13, 2019, at 1:56 PM, Serhiy Storchaka  wrote:

Could we strictly define what is considered a public module interface in Python?


The RealDefinition™ is that whatever we include in the docs is public, 
otherwise not.

Beyond that, there is a question of how users can deduce what is public when they run 
"import somemodule; print(dir(somemodule))".


Run "help(somemodule)" or read the module documentation. dir() is not 
the proper tool for getting the public interface.


https://docs.python.org/3/library/functions.html#dir

* If the object is a module object, the list contains the names of the 
module’s attributes.


It says nothing about whether those names are public.


In some modules, we've been careful to use both __all__ and to use an 
underscore prefix to indicate private variables and helper functions 
(collections and random for example).  IMO, when a module has shown that care, 
future maintainers should stick with that practice.


Either we establish the rule that all non-public names must be 
underscored and do a mass renaming throughout the whole stdlib, or we 
allow non-underscored names for internal things and leave the sources in 
peace.


Note also that underscored names can be a part of the public interface 
(for example namedtuple._replace).
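
As a small illustration of that point, the sketch below calls the documented namedtuple method _replace. The Point type and its field names are made up for the example; the method is underscored only to avoid clashing with user-chosen field names, not because it is private.

```python
from collections import namedtuple

# Point and its fields are hypothetical; _replace itself is documented API.
Point = namedtuple("Point", ["x", "y"])

p = Point(1, 2)
moved = p._replace(x=10)  # returns a new tuple, leaving p unchanged
```

So a leading underscore alone cannot serve as a reliable "private" marker across the stdlib.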



The calendar module is an example of where that care was taken for many years 
and then a recent patch went against that practice.  This came to my attention 
when an end-user questioned which functions were for internal use only and 
posted their question on Twitter.  On the tracker, I then made a simple request 
to restore the module's convention but you seem steadfastly resistant to the 
suggestion.


There was never such a convention. Before those changes there were 
already non-underscored non-public members in the module. In Python 3.6:


>>> import calendar
>>> sorted(set(dir(calendar)) - set(calendar.__all__))
['EPOCH', 'FRIDAY', 'February', 'January', 'MONDAY', 'SATURDAY', 
'SUNDAY', 'THURSDAY', 'TUESDAY', 'WEDNESDAY', '_EPOCH_ORD', '__all__', 
'__builtins__', '__cached__', '__doc__', '__file__', '__loader__', 
'__name__', '__package__', '__spec__', '_colwidth', '_locale', 
'_localized_day', '_localized_month', '_spacing', 'c', 'datetime', 
'different_locale', 'error', 'format', 'formatstring', 'main', 'mdays', 
'prweek', 'repeat', 'sys', 'week']



When we do have evidence of user confusion (as in the case with the calendar 
module), we should just fix it.


The main source of user confusion is not reading the documentation. 
Recent examples: https://bugs.python.org/issue37620, 
https://bugs.python.org/issue37623.



IMO, it would be an undue burden on the user to have to check every method in 
dir() against the contents of __all__ to determine what is public (see below).


Just do not use dir() for this. It returns the list of attributes of the 
object. Use __all__ or help().
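
A minimal sketch of that advice, assuming a throwaway module object named demo (hypothetical, built only for this example): the helper returns __all__ when the module defines it and otherwise falls back to the non-underscored names, which is the same rule a star import applies.

```python
import types


def public_names(module):
    """Names a star import would bind: __all__ if defined, else non-underscored names."""
    explicit = getattr(module, "__all__", None)
    if explicit is not None:
        return sorted(explicit)
    return sorted(name for name in vars(module) if not name.startswith("_"))


# "demo" is a stand-in module created just for this illustration.
demo = types.ModuleType("demo")
demo.api = lambda: None
demo.helper = lambda: None
demo._private = 1

without_all = public_names(demo)   # falls back to the underscore rule
demo.__all__ = ["api"]
with_all = public_names(demo)      # __all__ now takes precedence
everything = dir(demo)             # dir() still lists every attribute
```

Note that dir(demo) keeps returning helper and _private in both cases, which is why it is a poor guide to what is public.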



Also, as a maintainer of the module, I would not have found it obvious whether 
the functions were public or not.  The non-public functions look just like the 
public ones.


As you said, public names are explicitly documented.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/LN5RLZ2BE2ELV4D5HLQDME6C4XOXLQFN/


[Python-Dev] Re: What is a public API?

2019-07-20 Thread Serhiy Storchaka

17.07.19 03:26, Brett Cannon wrote:

I agree with Raymond that if the calendar module was following the leading 
underscore practice (which we should probably encourage all new modules to 
follow for consistency going forward) then I think the module should be updated 
to keep the practice going.


But it was not.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/AFFTBCUD3LHVN65YZXIPHOUIWUFWKVL4/


[Python-Dev] Re: What is a public API?

2019-07-20 Thread Kyle Stanley
Brett Cannon wrote:
>  I agree with Raymond that if the calendar module was following the leading
> underscore practice (which we should probably encourage all new modules to 
> follow for
> consistency going forward) then I think the module should be updated to keep 
> the practice
> going.
> -Brett

Rather than it being on a case-by-case basis, would it be reasonable to 
establish a universal standard across stdlib for defining modules as public to 
apply to older modules as well? I think that it would prove to be quite 
beneficial to create an explicit definition of what is considered public. If we 
don't, there is likely to be further confusion on this topic, particularly from 
users. 

There would be some overhead cost associated with ensuring that every 
non-public function is preceded by an underscore, but adding public 
functions to __all__ could safely be automated with something like this 
(https://bugs.python.org/issue29446#msg287049): 

__all__ = [name for name, obj in globals().items()
           if not name.startswith('_')
           and not isinstance(obj, types.ModuleType)]

or a bit more lazily:

__all__ = [name for name in globals() if not name.startswith('_')]
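
To see how the two variants differ, here is a sketch that applies both filters to an explicit dictionary instead of a real module's globals(). The namespace contents (json, load, _cache) are invented for the example.

```python
import types

# A hypothetical module namespace: one imported module, one public
# function, one private helper, and one dunder.
namespace = {
    "json": types.ModuleType("json"),
    "load": lambda: None,
    "_cache": {},
    "__name__": "demo",
}

# First variant: skip underscored names and imported modules.
strict = [name for name, obj in namespace.items()
          if not name.startswith("_")
          and not isinstance(obj, types.ModuleType)]

# "Lazy" variant: skip underscored names only.
lazy = [name for name in namespace if not name.startswith("_")]
```

The lazy form would export every module that the file happens to import, so the two are not interchangeable.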

Personally, I think the benefit of avoiding confusion on this issue and 
providing consistency to users would far outweigh the cost of implementing it.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/QIBJF3EFHN2REFJEQPMPGL6JTAJT56GZ/


[Python-Dev] The order of operands in the comparison

2019-07-20 Thread Serhiy Storchaka
Usually the order of operands of the == operator does not matter: 
bool(a == b) should return the same as bool(b == a). A correct __eq__ 
should look like this (pseudocode):


    def __eq__(self, other):
        if not know how to compare with other:
            return NotImplemented
        return the result of comparison
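
A runnable sketch of that pattern, using two hypothetical classes (Count and Dozen) that are not from the original post. Count only knows about itself; Dozen also knows how to compare with Count. Because both return NotImplemented for types they do not understand, the result does not depend on operand order.

```python
class Count:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, Count):
            return NotImplemented  # let the other operand try
        return self.value == other.value


class Dozen:
    def __init__(self, n):
        self.n = n

    def __eq__(self, other):
        if isinstance(other, Dozen):
            return self.n == other.n
        if isinstance(other, Count):
            return 12 * self.n == other.value
        return NotImplemented


# Count.__eq__ gives up, so Python falls back to Dozen.__eq__.
forward = Count(24) == Dozen(2)
backward = Dozen(2) == Count(24)
# Neither side understands str, so the default identity test applies.
unrelated = Count(24) == "24"
```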

But we work with imperfect code written by imperfect people. 
__eq__() can return False instead of NotImplemented when comparing with 
a different type (and that is not the worst case; in the worst case it 
raises AttributeError or TypeError). So the order of operands can matter.
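
A minimal sketch of how this goes wrong, with two hypothetical classes. Strict returns False for foreign types instead of NotImplemented, so Python never consults the other operand when Strict is on the left.

```python
class Strict:
    def __eq__(self, other):
        if not isinstance(other, Strict):
            return False  # imperfect: should be NotImplemented
        return True


class Flexible:
    def __eq__(self, other):
        if isinstance(other, (Strict, Flexible)):
            return True
        return NotImplemented


left = Strict() == Flexible()    # Strict answers first and says False
right = Flexible() == Strict()   # Flexible answers first and says True
```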


See https://bugs.python.org/issue37555 as an example of a real world issue.

The typical implementation of the __contains__ method looks like:

    def __contains__(self, needle):
        for item in self:
            if item == needle:  # or needle == item
                return True
        return False

The question is where the needle should be: on the right or on the left 
side of ==?


In the __contains__ implementations of list, tuple and general iterators 
(see PySequence_Contains) the needle is on the right side. But in 
count(), index() and remove() it is on the left side. In array it is 
effectively always on the left side, since its __eq__ is not invoked.
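
The effect of the operand side can be observed in pure Python. The sketch below uses a hypothetical Recorder class and two hand-written search functions; it does not exercise the C implementations named above, whose behavior may differ between Python versions.

```python
class Recorder:
    """Logs its label whenever its __eq__ is consulted."""

    def __init__(self, label, log):
        self.label = label
        self.log = log

    def __eq__(self, other):
        self.log.append(self.label)
        return NotImplemented


def contains_needle_right(items, needle):
    for item in items:
        if item == needle:
            return True
    return False


def contains_needle_left(items, needle):
    for item in items:
        if needle == item:
            return True
    return False


log = []
contains_needle_right([Recorder("item", log)], Recorder("needle", log))
first_asked_needle_right = log[0]

log.clear()
contains_needle_left([Recorder("item", log)], Recorder("needle", log))
first_asked_needle_left = log[0]
```

Whichever operand is on the left gets the first chance to answer, so an imperfect __eq__ there decides the result.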


The question is whether we should unify the implementations and always 
put the needle on one particular side, and which side that should be.

___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/VSV4K4AOKM4CBQMOELPFV5VMYALPH464/


[Python-Dev] Re: What is a public API?

2019-07-20 Thread Serhiy Storchaka

20.07.19 09:03, Kyle Stanley wrote:

Rather than it being on a case-by-case basis, would it be reasonable to 
establish a universal standard across stdlib for defining modules as public to 
apply to older modules as well? I think that it would prove to be quite 
beneficial to create an explicit definition of what is considered public. If we 
don't, there is likely to be further confusion on this topic, particularly from 
users.

There would be some overhead cost associated with ensuring that every 
non-public function is preceded by an underscore, but adding public 
functions to __all__ could safely be automated with something like this 
(https://bugs.python.org/issue29446#msg287049):

__all__ = [name for name, obj in globals().items()
           if not name.startswith('_')
           and not isinstance(obj, types.ModuleType)]

or a bit more lazily:

__all__ = [name for name in globals() if not name.startswith('_')]

Personally, I think the benefit of avoiding confusion on this issue and 
providing consistency to users would far outweigh the cost of implementing it.


__all__ is not needed if we can make all public names non-underscored and 
all non-public names underscored. The problem in issue29446 is that we 
can't do this in the case of tkinter. We can't add an underscore to 
"wantobjects", because this name is part of the public interface, 
but we also do not want it to be imported by the star import. So we 
need an __all__ which includes all "normal" public names except "wantobjects".
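
A sketch of that situation with a stand-in module. The module name demo_gui and the Widget class are hypothetical; only the name "wantobjects" is taken from the message, and this is not tkinter's actual code.

```python
import sys
import types

# Build a throwaway module that has a public name we do not want
# exported by "from ... import *".
mod = types.ModuleType("demo_gui")
mod.Widget = type("Widget", (), {})
mod.wantobjects = 1
mod.__all__ = ["Widget"]          # everything public except wantobjects
sys.modules["demo_gui"] = mod

ns = {}
exec("from demo_gui import *", ns)  # star import honors __all__

star_imported = sorted(name for name in ns if not name.startswith("__"))
still_reachable = mod.wantobjects   # attribute access is unaffected

del sys.modules["demo_gui"]         # clean up the stand-in module
```

Here __all__ controls only the star import; the excluded name stays available as a module attribute.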

___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/UNPJYSIUTQPPM6CKCDX5AEKB7CPSLNHK/


[Python-Dev] Re: The order of operands in the comparison

2019-07-20 Thread Guido van Rossum
In an ideal world, needle is on the right. Let's replace needle with a
constant: which of the following looks more natural?

  for x in sequence:
      if x == 5: return True

or

  for x in sequence:
      if 5 == x: return True

For me, 'x == 5' wins with a huge margin. (There is a subculture of C
coders who have trained themselves to write '5 == x' because they're afraid
of accidentally typing 'x = 5', but that doesn't apply to Python.)

Should we unify the stdlib? I'm not sure -- it feels like a sufficiently
obscure area that we won't get much benefit out of it (people should fix
their __eq__ implementation to properly return NotImplemented) and changing
it would surely cause some mysterious breakage in some code we cannot
control.

--Guido

On Sat, Jul 20, 2019 at 7:31 AM Serhiy Storchaka 
wrote:

> Usually the order of operands of the == operator does not matter. bool(a
> == b) should return the same as bool(b == a). Correct __eq__ should look
> like:
>
>     def __eq__(self, other):
>         if not know how to compare with other:
>             return NotImplemented
>         return the result of comparison
>
> But we work with non-perfect code written by non-perfect people.
> __eq__() can return False instead of NotImplemented for comparison with
> different type (it is not the worst case, in worst case it raises
> AttributeError or TypeError). So the order of operands can matter.
>
> See https://bugs.python.org/issue37555 as an example of a real world
> issue.
>
> The typical implementation of the __contains__ method looks like:
>
>     def __contains__(self, needle):
>         for item in self:
>             if item == needle:  # or needle == item
>                 return True
>         return False
>
> The question is where the needle should be: at the right or at the left
> side of ==?
>
> In __contains__ implementations in list, tuple and general iterators
> (see PySequence_Contains) the needle is at the right side. But in
> count(), index() and remove() it is at the left side. In array it is
> effectively always at the left side since its __eq__ is not invoked.
>
> The question is whether we should unify implementations and always use
> the needle at some particular side and what this side should be.


-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him/his **(why is my pronoun here?)*

___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/WRNIZS3E4NO74JDANPGPX2JZEDHOTPFC/


[Python-Dev] Re: What is a public API?

2019-07-20 Thread Kyle Stanley
Serhiy Storchaka wrote:
> Either we establish the rule that all non-public names must be 
> underscored, and do mass renaming through the whole stdlib. Or allow to 
> use non-underscored names for internal things and leave the sources in 

Personally, I would most favor a mass renaming throughout the stdlib, 
at least for public-facing modules (those whose names don't start with an 
underscore, since a leading underscore already implies the entire module 
is internal). Otherwise, I have a feeling similar issues will be brought 
up repeatedly by confused end-users. 

This change would also follow the guideline of "Explicit is better than 
implicit" by explicitly defining any function in a public-facing module as 
private or public through the existence or lack of an underscore. There would 
be some cost associated with implementing this change, but it would definitely 
be worthwhile if it settled the public vs private misunderstandings.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/CUKFA46FAYGJAAKXZJN6COGOSOPHAGX2/


[Python-Dev] Re: What is a public API?

2019-07-20 Thread Steven D'Aprano
On Sat, Jul 20, 2019 at 06:03:39AM -, Kyle Stanley wrote:

> Rather than it being on a case-by-case basis, would it be reasonable 
> to establish a universal standard across stdlib for defining modules 
> as public to apply to older modules as well?

No, I don't think so. That would require code churn to "fix" modules 
which aren't currently broken, and may never be.

It also requires meeting a standard that doesn't universally apply:

1. __all__ is optional, not mandatory.

2. __all__ is a list of names to import during star imports, not a list
   of public names; while there is some overlap, we should not assume
   that the two will always match.

3. Imported modules are considered private, regardless of whether they 
   are named with a leading underscore or not;

4. Unless they are considered public, such as os.path.

5. While I don't know of any top-level examples of this, there are cases 
   in the std lib where single-underscore names are considered public,
   such as the namedtuple interface. So in principle at least, a module
   might include a single-underscore name in its __all__.

6. Dunder names are not private, and could appear in __all__.


> There would be some overhead cost associated with ensuring that every 
> non-public function is preceded by an underscore, but adding 
> public functions to __all__ could safely be automated with something 
> like this (https://bugs.python.org/issue29446#msg287049):
> 
> __all__ = [name for name, obj in globals().items()
>            if not name.startswith('_')
>            and not isinstance(obj, types.ModuleType)]

And you've just broken about a million scripts and applications that use 
os.path. As well as any modules which export public dunder names, for 
example sys.__stdout__ and friends, since your test for a private name
may be overzealous.
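
The objection can be checked directly. The sketch below wraps the quoted comprehension in a helper (auto_all is a name made up here) and applies it to the real os and sys namespaces.

```python
import os
import sys
import types


def auto_all(namespace):
    """The proposed filter, applied to an arbitrary namespace dict."""
    return [name for name, obj in namespace.items()
            if not name.startswith("_")
            and not isinstance(obj, types.ModuleType)]


os_auto = auto_all(vars(os))
sys_auto = auto_all(vars(sys))

path_is_public = "path" in os.__all__          # os exports path today
path_dropped = "path" not in os_auto           # but it is a module object
stdout_dropped = "__stdout__" not in sys_auto  # dunder fails the prefix test
```

Both os.path and sys.__stdout__ are filtered out, which is the breakage described above.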



-- 
Steven
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/G6PPD736U5JAOC2WP4RXOYD4AHM4FBQM/