[Numpy-discussion] The mu.py script will keep running and never end.

2020-10-10 Thread Hongyi Zhao
Hi,

My environment is Ubuntu 20.04 and Python 3.8.3 managed by pyenv. I
tried to run the script
<https://notebook.rcc.uchicago.edu/files/acs.chemmater.9b05047/Data/bulk/dft/mu.py>,
but it keeps running and never ends. It prints the following output
and then hangs:

$ python mu.py
[-10.999 -10.999 -10.999 ...  20. 20. 20.   ] [4.973e-84
4.973e-84 4.973e-84 ... 4.973e-84 4.973e-84 4.973e-84]

I had to terminate it with 'Ctrl + c' and obtained the following traceback:

^CTraceback (most recent call last):
  File "mu.py", line 38, in <module>
    integrand=DOS*fermi_array(energy,mu,kT)
  File "/home/werner/.pyenv/versions/datasci/lib/python3.8/site-packages/numpy/lib/function_base.py", line 2108, in __call__
    return self._vectorize_call(func=func, args=vargs)
  File "/home/werner/.pyenv/versions/datasci/lib/python3.8/site-packages/numpy/lib/function_base.py", line 2192, in _vectorize_call
    outputs = ufunc(*inputs)
  File "mu.py", line 8, in fermi
    return 1./(exp((E-mu)/kT)+1)
KeyboardInterrupt


Any help and hints on this problem will be highly appreciated.

Regards,
-- 
Hongyi Zhao 
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] The mu.py script will keep running and never end.

2020-10-10 Thread Hongyi Zhao
On Sun, Oct 11, 2020 at 1:48 AM Robert Kern  wrote:
>
> You don't need to use vectorize() on fermi(). fermi() will work just fine on 
> arrays and should be much faster.

Yes, it really does the trick. See the following for the benchmark
based on your suggestion:

$ time python mu.py
[-10.999 -10.999 -10.999 ...  20. 20. 20.   ] [4.973e-84
4.973e-84 4.973e-84 ... 4.973e-84 4.973e-84 4.973e-84]

real    0m41.056s
user    0m43.970s
sys     0m3.813s
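
Roughly speaking, the change is just to drop np.vectorize() and call
fermi() directly on the whole energy array. A minimal sketch with
made-up placeholder data (the names fermi, energy, DOS, mu and kT
follow mu.py, but the values and array sizes here are invented):

```python
import numpy as np

def fermi(E, mu, kT):
    # Plain ufunc arithmetic: works elementwise on whole arrays,
    # so np.vectorize() is not needed at all.
    return 1.0 / (np.exp((E - mu) / kT) + 1.0)

# Placeholder data standing in for the arrays loaded in mu.py.
energy = np.linspace(-11.0, 20.0, 3001)
DOS = np.ones_like(energy)
kT = 0.25

mu_all = np.linspace(10.3, 10.9, 6000)
N = np.empty_like(mu_all)
for i, mu in enumerate(mu_all):
    integrand = DOS * fermi(energy, mu, kT)   # one array expression per mu
    N[i] = np.trapz(integrand, energy)
```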


But are there any ways to further improve the efficiency?

Regards,
HY

>
> On Sat, Oct 10, 2020, 8:23 AM Hongyi Zhao  wrote:
>>
>> Hi,
>>
>> My environment is Ubuntu 20.04 and python 3.8.3 managed by pyenv. I
>> try to run the script
>> <https://notebook.rcc.uchicago.edu/files/acs.chemmater.9b05047/Data/bulk/dft/mu.py>,
>> but it will keep running and never end. When I use 'Ctrl + c' to
>> terminate it, it will give the following output:
>>
>> $ python mu.py
>> [-10.999 -10.999 -10.999 ...  20. 20. 20.   ] [4.973e-84
>> 4.973e-84 4.973e-84 ... 4.973e-84 4.973e-84 4.973e-84]
>>
>> I have to terminate it and obtained the following information:
>>
>> ^CTraceback (most recent call last):
>>   File "mu.py", line 38, in 
>> integrand=DOS*fermi_array(energy,mu,kT)
>>   File 
>> "/home/werner/.pyenv/versions/datasci/lib/python3.8/site-packages/numpy/lib/function_base.py",
>> line 2108, in __call__
>> return self._vectorize_call(func=func, args=vargs)
>>   File 
>> "/home/werner/.pyenv/versions/datasci/lib/python3.8/site-packages/numpy/lib/function_base.py",
>> line 2192, in _vectorize_call
>> outputs = ufunc(*inputs)
>>   File "mu.py", line 8, in fermi
>> return 1./(exp((E-mu)/kT)+1)
>> KeyboardInterrupt
>>
>>
>> Any helps and hints for this problem will be highly appreciated?
>>
>> Regards,
>> --
>> Hongyi Zhao 
>> ___
>> NumPy-Discussion mailing list
>> NumPy-Discussion@python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion



-- 
Hongyi Zhao 
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] The mu.py script will keep running and never end.

2020-10-10 Thread Hongyi Zhao
On Sun, Oct 11, 2020 at 1:33 PM Andrea Gavana  wrote:
>
>
>
> On Sun, 11 Oct 2020 at 07.14, Andrea Gavana  wrote:
>>
>> Hi,
>>
>> On Sun, 11 Oct 2020 at 00.27, Hongyi Zhao  wrote:
>>>
>>> On Sun, Oct 11, 2020 at 1:48 AM Robert Kern  wrote:
>>> >
>>> > You don't need to use vectorize() on fermi(). fermi() will work just fine 
>>> > on arrays and should be much faster.
>>>
>>> Yes, it really does the trick. See the following for the benchmark
>>> based on your suggestion:
>>>
>>> $ time python mu.py
>>> [-10.999 -10.999 -10.999 ...  20. 20. 20.   ] [4.973e-84
>>> 4.973e-84 4.973e-84 ... 4.973e-84 4.973e-84 4.973e-84]
>>>
>>> real0m41.056s
>>> user0m43.970s
>>> sys0m3.813s
>>>
>>>
>>> But are there any ways to further improve/increase efficiency?
>>
>>
>>
>> I believe it will get a bit better if you don’t column_stack an array 6000 
>> times - maybe pre-allocate your output first?
>>
>> Andrea.
>
>
>
> I’m sorry, scratch that: I’ve seen a ghost white space in front of your 
> column_stack call and made me think you were stacking your results very many 
> times, which is not the case.

I'm still not clear about your suggested solution to this problem.
Could you please post the corresponding snippet of your enhancement here?

Regards,
HY
>
>>
>>
>>>
>>>
>>> Regards,
>>> HY
>>>
>>> >
>>> > On Sat, Oct 10, 2020, 8:23 AM Hongyi Zhao  wrote:
>>> >>
>>> >> Hi,
>>> >>
>>> >> My environment is Ubuntu 20.04 and python 3.8.3 managed by pyenv. I
>>> >> try to run the script
>>> >> <https://notebook.rcc.uchicago.edu/files/acs.chemmater.9b05047/Data/bulk/dft/mu.py>,
>>> >> but it will keep running and never end. When I use 'Ctrl + c' to
>>> >> terminate it, it will give the following output:
>>> >>
>>> >> $ python mu.py
>>> >> [-10.999 -10.999 -10.999 ...  20. 20. 20.   ] [4.973e-84
>>> >> 4.973e-84 4.973e-84 ... 4.973e-84 4.973e-84 4.973e-84]
>>> >>
>>> >> I have to terminate it and obtained the following information:
>>> >>
>>> >> ^CTraceback (most recent call last):
>>> >>   File "mu.py", line 38, in 
>>> >> integrand=DOS*fermi_array(energy,mu,kT)
>>> >>   File 
>>> >> "/home/werner/.pyenv/versions/datasci/lib/python3.8/site-packages/numpy/lib/function_base.py",
>>> >> line 2108, in __call__
>>> >> return self._vectorize_call(func=func, args=vargs)
>>> >>   File 
>>> >> "/home/werner/.pyenv/versions/datasci/lib/python3.8/site-packages/numpy/lib/function_base.py",
>>> >> line 2192, in _vectorize_call
>>> >> outputs = ufunc(*inputs)
>>> >>   File "mu.py", line 8, in fermi
>>> >> return 1./(exp((E-mu)/kT)+1)
>>> >> KeyboardInterrupt
>>> >>
>>> >>
>>> >> Any helps and hints for this problem will be highly appreciated?
>>> >>
>>> >> Regards,
>>> >> --
>>> >> Hongyi Zhao 
>>> >> ___
>>> >> NumPy-Discussion mailing list
>>> >> NumPy-Discussion@python.org
>>> >> https://mail.python.org/mailman/listinfo/numpy-discussion
>>> >
>>> > ___
>>> > NumPy-Discussion mailing list
>>> > NumPy-Discussion@python.org
>>> > https://mail.python.org/mailman/listinfo/numpy-discussion
>>>
>>>
>>>
>>> --
>>> Hongyi Zhao 
>>> ___
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion@python.org
>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion



-- 
Hongyi Zhao 
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] The mu.py script will keep running and never end.

2020-10-10 Thread Hongyi Zhao
On Sun, Oct 11, 2020 at 2:02 PM Andrea Gavana  wrote:
>
>
>
> On Sun, 11 Oct 2020 at 07.52, Hongyi Zhao  wrote:
>>
>> On Sun, Oct 11, 2020 at 1:33 PM Andrea Gavana  
>> wrote:
>> >
>> >
>> >
>> > On Sun, 11 Oct 2020 at 07.14, Andrea Gavana  
>> > wrote:
>> >>
>> >> Hi,
>> >>
>> >> On Sun, 11 Oct 2020 at 00.27, Hongyi Zhao  wrote:
>> >>>
>> >>> On Sun, Oct 11, 2020 at 1:48 AM Robert Kern  
>> >>> wrote:
>> >>> >
>> >>> > You don't need to use vectorize() on fermi(). fermi() will work just 
>> >>> > fine on arrays and should be much faster.
>> >>>
>> >>> Yes, it really does the trick. See the following for the benchmark
>> >>> based on your suggestion:
>> >>>
>> >>> $ time python mu.py
>> >>> [-10.999 -10.999 -10.999 ...  20. 20. 20.   ] [4.973e-84
>> >>> 4.973e-84 4.973e-84 ... 4.973e-84 4.973e-84 4.973e-84]
>> >>>
>> >>> real0m41.056s
>> >>> user0m43.970s
>> >>> sys0m3.813s
>> >>>
>> >>>
>> >>> But are there any ways to further improve/increase efficiency?
>> >>
>> >>
>> >>
>> >> I believe it will get a bit better if you don’t column_stack an array 
>> >> 6000 times - maybe pre-allocate your output first?
>> >>
>> >> Andrea.
>> >
>> >
>> >
>> > I’m sorry, scratch that: I’ve seen a ghost white space in front of your 
>> > column_stack call and made me think you were stacking your results very 
>> > many times, which is not the case.
>>
>> Still not so clear on your solutions for this problem. Could you
>> please post here the corresponding snippet of your enhancement?
>
>
> I have no solution, I originally thought you were calling “column_stack” 6000 
> times in the loop, but that is not the case, I was mistaken. My apologies for 
> that.
>
> The timings of your approach is highly dependent on the size of your “energy” 
> and “DOS” array -

The sizes of the “energy” and “DOS” arrays are determined by the
problem and shouldn't be reduced arbitrarily.

> not to mention calling trapz 6000 times in a loop.

I'm currently thinking about parallelizing the execution of the for
loop, say with joblib <https://github.com/joblib/joblib>, but I still
haven't figured out the corresponding code. If you have experience
with this type of solution, could you please give me some more hints?
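
What I have in mind is something along these lines (an untested sketch
with made-up placeholder data, just to show the joblib pattern I'm
considering):

```python
import numpy as np
from joblib import Parallel, delayed

def fermi(E, mu, kT):
    return 1.0 / (np.exp((E - mu) / kT) + 1.0)

# Placeholder data standing in for the arrays loaded in mu.py.
energy = np.linspace(-11.0, 20.0, 3001)
DOS = np.ones_like(energy)
kT = 0.25
mu_all = np.linspace(10.3, 10.9, 6000)

def particle_number(mu):
    # One integration per chemical potential.
    return np.trapz(DOS * fermi(energy, mu, kT), energy)

# Spread the 6000 independent integrations over all available cores.
N_all = Parallel(n_jobs=-1)(delayed(particle_number)(mu) for mu in mu_all)
```

Whether each task is expensive enough to outweigh the scheduling
overhead is exactly what I'm unsure about.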

>  Maybe there’s a better way to do it with another approach, but at the moment 
> I can’t think of one...
>
>>
>>
>> Regards,
>> HY
>> >
>> >>
>> >>
>> >>>
>> >>>
>> >>> Regards,
>> >>> HY
>> >>>
>> >>> >
>> >>> > On Sat, Oct 10, 2020, 8:23 AM Hongyi Zhao  
>> >>> > wrote:
>> >>> >>
>> >>> >> Hi,
>> >>> >>
>> >>> >> My environment is Ubuntu 20.04 and python 3.8.3 managed by pyenv. I
>> >>> >> try to run the script
>> >>> >> <https://notebook.rcc.uchicago.edu/files/acs.chemmater.9b05047/Data/bulk/dft/mu.py>,
>> >>> >> but it will keep running and never end. When I use 'Ctrl + c' to
>> >>> >> terminate it, it will give the following output:
>> >>> >>
>> >>> >> $ python mu.py
>> >>> >> [-10.999 -10.999 -10.999 ...  20. 20. 20.   ] [4.973e-84
>> >>> >> 4.973e-84 4.973e-84 ... 4.973e-84 4.973e-84 4.973e-84]
>> >>> >>
>> >>> >> I have to terminate it and obtained the following information:
>> >>> >>
>> >>> >> ^CTraceback (most recent call last):
>> >>> >>   File "mu.py", line 38, in 
>> >>> >> integrand=DOS*fermi_array(energy,mu,kT)
>> >>> >>   File 
>> >>> >> "/home/werner/.pyenv/versions/datasci/lib/python3.8/site-packages/numpy/lib/function_base.py",
>> >>> >> line 2108, in __call__
>> >>> >> return self._vectorize_call(func=func, args=vargs)
>> >>> >>   File 
>> >>> >> "/home/werner/.pyenv/versions/datasci/lib/python3.8/site-packages/numpy/lib/function

Re: [Numpy-discussion] The mu.py script will keep running and never end.

2020-10-11 Thread Hongyi Zhao
On Sun, Oct 11, 2020 at 2:56 PM Evgeni Burovski
 wrote:
>
> The script seems to be computing the particle numbers for an array of 
> chemical potentials.
>
> Two ways of speeding it up, both are likely simpler then using dask:

What do you mean by *dask*?

>
> First: use numpy
>
> 1. Move constructing mu_all out of the loop (np.linspace)
> 2. Arrange the integrands into a 2d array
> 3. np.trapz along an axis which corresponds to a single integrand array
> (Or avoid the overhead of trapz by just implementing the trapezoid formula 
> manually)
>
> Second:
>
> Move the loop into cython.

Will this be more efficient than a scheme based on parallelization
with Python modules, say joblib?
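
In the meantime, if I understand the NumPy-only suggestion correctly,
it amounts to something like the following (my own rough sketch with
made-up placeholder data, not Evgeni's actual code):

```python
import numpy as np

def fermi(E, mu, kT):
    return 1.0 / (np.exp((E - mu) / kT) + 1.0)

# Placeholder data standing in for the arrays loaded in mu.py.
energy = np.linspace(-11.0, 20.0, 3001)
DOS = np.ones_like(energy)
kT = 0.25

# 1. Build mu_all once, outside any loop.
mu_all = np.linspace(10.3, 10.9, 6000)

# 2. Broadcast to a 2-D array of integrands, shape (len(mu_all), len(energy)).
#    Note that this intermediate array can get large for fine grids.
integrand = DOS * fermi(energy[None, :], mu_all[:, None], kT)

# 3. A single trapz call along the energy axis replaces the per-mu loop.
N_all = np.trapz(integrand, energy, axis=1)
```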

>
>
>
>
> Sun, 11 Oct 2020, 9:32 Hongyi Zhao :
>>
>> On Sun, Oct 11, 2020 at 2:02 PM Andrea Gavana  
>> wrote:
>> >
>> >
>> >
>> > On Sun, 11 Oct 2020 at 07.52, Hongyi Zhao  wrote:
>> >>
>> >> On Sun, Oct 11, 2020 at 1:33 PM Andrea Gavana  
>> >> wrote:
>> >> >
>> >> >
>> >> >
>> >> > On Sun, 11 Oct 2020 at 07.14, Andrea Gavana  
>> >> > wrote:
>> >> >>
>> >> >> Hi,
>> >> >>
>> >> >> On Sun, 11 Oct 2020 at 00.27, Hongyi Zhao  
>> >> >> wrote:
>> >> >>>
>> >> >>> On Sun, Oct 11, 2020 at 1:48 AM Robert Kern  
>> >> >>> wrote:
>> >> >>> >
>> >> >>> > You don't need to use vectorize() on fermi(). fermi() will work 
>> >> >>> > just fine on arrays and should be much faster.
>> >> >>>
>> >> >>> Yes, it really does the trick. See the following for the benchmark
>> >> >>> based on your suggestion:
>> >> >>>
>> >> >>> $ time python mu.py
>> >> >>> [-10.999 -10.999 -10.999 ...  20. 20. 20.   ] [4.973e-84
>> >> >>> 4.973e-84 4.973e-84 ... 4.973e-84 4.973e-84 4.973e-84]
>> >> >>>
>> >> >>> real0m41.056s
>> >> >>> user0m43.970s
>> >> >>> sys0m3.813s
>> >> >>>
>> >> >>>
>> >> >>> But are there any ways to further improve/increase efficiency?
>> >> >>
>> >> >>
>> >> >>
>> >> >> I believe it will get a bit better if you don’t column_stack an array 
>> >> >> 6000 times - maybe pre-allocate your output first?
>> >> >>
>> >> >> Andrea.
>> >> >
>> >> >
>> >> >
>> >> > I’m sorry, scratch that: I’ve seen a ghost white space in front of your 
>> >> > column_stack call and made me think you were stacking your results very 
>> >> > many times, which is not the case.
>> >>
>> >> Still not so clear on your solutions for this problem. Could you
>> >> please post here the corresponding snippet of your enhancement?
>> >
>> >
>> > I have no solution, I originally thought you were calling “column_stack” 
>> > 6000 times in the loop, but that is not the case, I was mistaken. My 
>> > apologies for that.
>> >
>> > The timings of your approach is highly dependent on the size of your 
>> > “energy” and “DOS” array -
>>
>> The size of the “energy” and “DOS” array is Problem-related and
>> shouldn't be reduced arbitrarily.
>>
>> > not to mention calling trapz 6000 times in a loop.
>>
>> I'm currently thinking on parallelization the execution of the for
>> loop, say, with joblib <https://github.com/joblib/joblib>, but I still
>> haven't figured out the corresponding codes. If you have some
>> experience on this type of solution, could you please give me some
>> more hints?
>>
>> >  Maybe there’s a better way to do it with another approach, but at the 
>> > moment I can’t think of one...
>> >
>> >>
>> >>
>> >> Regards,
>> >> HY
>> >> >
>> >> >>
>> >> >>
>> >> >>>
>> >> >>>
>> >> >>> Regards,
>> >> >>> HY
>> >> >>>
>> >> >>> >
>> >> >>> > On Sat, Oct 10, 2020, 8:23 AM Hongyi Zhao  
>> >> >

Re: [Numpy-discussion] The mu.py script will keep running and never end.

2020-10-11 Thread Hongyi Zhao
On Sun, Oct 11, 2020 at 3:42 PM Evgeni Burovski
 wrote:
>
> On Sun, Oct 11, 2020 at 9:55 AM Evgeni Burovski
>  wrote:
> >
> > The script seems to be computing the particle numbers for an array of 
> > chemical potentials.
> >
> > Two ways of speeding it up, both are likely simpler then using dask:
> >
> > First: use numpy
> >
> > 1. Move constructing mu_all out of the loop (np.linspace)
> > 2. Arrange the integrands into a 2d array
> > 3. np.trapz along an axis which corresponds to a single integrand array
> > (Or avoid the overhead of trapz by just implementing the trapezoid formula 
> > manually)
>
>
> Roughly like this:
> https://gist.github.com/ev-br/0250e4eee461670cf489515ee427eb99

I can't find the Cython part you suggested, i.e., moving the loop
into Cython. Furthermore, I have also learned that NumPy arrays are
optimized and have performance close to C/C++.

>
>
>
> > Second:
> >
> > Move the loop into cython.
> >
> >
> >
> >
> >> Sun, 11 Oct 2020, 9:32 Hongyi Zhao :
> >>
> >> On Sun, Oct 11, 2020 at 2:02 PM Andrea Gavana  
> >> wrote:
> >> >
> >> >
> >> >
> >> > On Sun, 11 Oct 2020 at 07.52, Hongyi Zhao  wrote:
> >> >>
> >> >> On Sun, Oct 11, 2020 at 1:33 PM Andrea Gavana  
> >> >> wrote:
> >> >> >
> >> >> >
> >> >> >
> >> >> > On Sun, 11 Oct 2020 at 07.14, Andrea Gavana  
> >> >> > wrote:
> >> >> >>
> >> >> >> Hi,
> >> >> >>
> >> >> >> On Sun, 11 Oct 2020 at 00.27, Hongyi Zhao  
> >> >> >> wrote:
> >> >> >>>
> >> >> >>> On Sun, Oct 11, 2020 at 1:48 AM Robert Kern  
> >> >> >>> wrote:
> >> >> >>> >
> >> >> >>> > You don't need to use vectorize() on fermi(). fermi() will work 
> >> >> >>> > just fine on arrays and should be much faster.
> >> >> >>>
> >> >> >>> Yes, it really does the trick. See the following for the benchmark
> >> >> >>> based on your suggestion:
> >> >> >>>
> >> >> >>> $ time python mu.py
> >> >> >>> [-10.999 -10.999 -10.999 ...  20. 20. 20.   ] [4.973e-84
> >> >> >>> 4.973e-84 4.973e-84 ... 4.973e-84 4.973e-84 4.973e-84]
> >> >> >>>
> >> >> >>> real0m41.056s
> >> >> >>> user0m43.970s
> >> >> >>> sys0m3.813s
> >> >> >>>
> >> >> >>>
> >> >> >>> But are there any ways to further improve/increase efficiency?
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> I believe it will get a bit better if you don’t column_stack an 
> >> >> >> array 6000 times - maybe pre-allocate your output first?
> >> >> >>
> >> >> >> Andrea.
> >> >> >
> >> >> >
> >> >> >
> >> >> > I’m sorry, scratch that: I’ve seen a ghost white space in front of 
> >> >> > your column_stack call and made me think you were stacking your 
> >> >> > results very many times, which is not the case.
> >> >>
> >> >> Still not so clear on your solutions for this problem. Could you
> >> >> please post here the corresponding snippet of your enhancement?
> >> >
> >> >
> >> > I have no solution, I originally thought you were calling “column_stack” 
> >> > 6000 times in the loop, but that is not the case, I was mistaken. My 
> >> > apologies for that.
> >> >
> >> > The timings of your approach is highly dependent on the size of your 
> >> > “energy” and “DOS” array -
> >>
> >> The size of the “energy” and “DOS” array is Problem-related and
> >> shouldn't be reduced arbitrarily.
> >>
> >> > not to mention calling trapz 6000 times in a loop.
> >>
> >> I'm currently thinking on parallelization the execution of the for
> >> loop, say, with joblib <https://github.com/joblib/joblib>, but I still
> >> haven't figured out the corresponding codes. If you have some
> 

Re: [Numpy-discussion] The mu.py script will keep running and never end.

2020-10-11 Thread Hongyi Zhao
On Sun, Oct 11, 2020 at 3:42 PM Evgeni Burovski
 wrote:
>
> On Sun, Oct 11, 2020 at 9:55 AM Evgeni Burovski
>  wrote:
> >
> > The script seems to be computing the particle numbers for an array of 
> > chemical potentials.
> >
> > Two ways of speeding it up, both are likely simpler then using dask:
> >
> > First: use numpy
> >
> > 1. Move constructing mu_all out of the loop (np.linspace)
> > 2. Arrange the integrands into a 2d array
> > 3. np.trapz along an axis which corresponds to a single integrand array
> > (Or avoid the overhead of trapz by just implementing the trapezoid formula 
> > manually)
>
>
> Roughly like this:
> https://gist.github.com/ev-br/0250e4eee461670cf489515ee427eb99

I tried to run this notebook, but found that none of the
functions/variables/methods can be found if I invoke them in separate
cells. See here for more details:

https://github.com/hongyi-zhao/test/blob/master/fermi_integrate_np.ipynb

Any hints on this problem?

Regards,
HY

>
>
>
> > Second:
> >
> > Move the loop into cython.
> >
> >
> >
> >
> >> Sun, 11 Oct 2020, 9:32 Hongyi Zhao :
> >>
> >> On Sun, Oct 11, 2020 at 2:02 PM Andrea Gavana  
> >> wrote:
> >> >
> >> >
> >> >
> >> > On Sun, 11 Oct 2020 at 07.52, Hongyi Zhao  wrote:
> >> >>
> >> >> On Sun, Oct 11, 2020 at 1:33 PM Andrea Gavana  
> >> >> wrote:
> >> >> >
> >> >> >
> >> >> >
> >> >> > On Sun, 11 Oct 2020 at 07.14, Andrea Gavana  
> >> >> > wrote:
> >> >> >>
> >> >> >> Hi,
> >> >> >>
> >> >> >> On Sun, 11 Oct 2020 at 00.27, Hongyi Zhao  
> >> >> >> wrote:
> >> >> >>>
> >> >> >>> On Sun, Oct 11, 2020 at 1:48 AM Robert Kern  
> >> >> >>> wrote:
> >> >> >>> >
> >> >> >>> > You don't need to use vectorize() on fermi(). fermi() will work 
> >> >> >>> > just fine on arrays and should be much faster.
> >> >> >>>
> >> >> >>> Yes, it really does the trick. See the following for the benchmark
> >> >> >>> based on your suggestion:
> >> >> >>>
> >> >> >>> $ time python mu.py
> >> >> >>> [-10.999 -10.999 -10.999 ...  20. 20. 20.   ] [4.973e-84
> >> >> >>> 4.973e-84 4.973e-84 ... 4.973e-84 4.973e-84 4.973e-84]
> >> >> >>>
> >> >> >>> real0m41.056s
> >> >> >>> user0m43.970s
> >> >> >>> sys0m3.813s
> >> >> >>>
> >> >> >>>
> >> >> >>> But are there any ways to further improve/increase efficiency?
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> I believe it will get a bit better if you don’t column_stack an 
> >> >> >> array 6000 times - maybe pre-allocate your output first?
> >> >> >>
> >> >> >> Andrea.
> >> >> >
> >> >> >
> >> >> >
> >> >> > I’m sorry, scratch that: I’ve seen a ghost white space in front of 
> >> >> > your column_stack call and made me think you were stacking your 
> >> >> > results very many times, which is not the case.
> >> >>
> >> >> Still not so clear on your solutions for this problem. Could you
> >> >> please post here the corresponding snippet of your enhancement?
> >> >
> >> >
> >> > I have no solution, I originally thought you were calling “column_stack” 
> >> > 6000 times in the loop, but that is not the case, I was mistaken. My 
> >> > apologies for that.
> >> >
> >> > The timings of your approach is highly dependent on the size of your 
> >> > “energy” and “DOS” array -
> >>
> >> The size of the “energy” and “DOS” array is Problem-related and
> >> shouldn't be reduced arbitrarily.
> >>
> >> > not to mention calling trapz 6000 times in a loop.
> >>
> >> I'm currently thinking on parallelization the execution of the for
> >> loop, say, with joblib <https://github.com/joblib/joblib>, but I still

Re: [Numpy-discussion] The mu.py script will keep running and never end.

2020-10-11 Thread Hongyi Zhao
On Sun, Oct 11, 2020 at 2:56 PM Evgeni Burovski
 wrote:
>
> The script seems to be computing the particle numbers for an array of 
> chemical potentials.
>
> Two ways of speeding it up, both are likely simpler then using dask:
>
> First: use numpy
>
> 1. Move constructing mu_all out of the loop (np.linspace)
> 2. Arrange the integrands into a 2d array
> 3. np.trapz along an axis which corresponds to a single integrand array
> (Or avoid the overhead of trapz by just implementing the trapezoid formula 
> manually)

Could you please explain in a bit more detail why doing so improves
performance?

>
> Second:
>
> Move the loop into cython.
>
>
>
>
> Sun, 11 Oct 2020, 9:32 Hongyi Zhao :
>>
>> On Sun, Oct 11, 2020 at 2:02 PM Andrea Gavana  
>> wrote:
>> >
>> >
>> >
>> > On Sun, 11 Oct 2020 at 07.52, Hongyi Zhao  wrote:
>> >>
>> >> On Sun, Oct 11, 2020 at 1:33 PM Andrea Gavana  
>> >> wrote:
>> >> >
>> >> >
>> >> >
>> >> > On Sun, 11 Oct 2020 at 07.14, Andrea Gavana  
>> >> > wrote:
>> >> >>
>> >> >> Hi,
>> >> >>
>> >> >> On Sun, 11 Oct 2020 at 00.27, Hongyi Zhao  
>> >> >> wrote:
>> >> >>>
>> >> >>> On Sun, Oct 11, 2020 at 1:48 AM Robert Kern  
>> >> >>> wrote:
>> >> >>> >
>> >> >>> > You don't need to use vectorize() on fermi(). fermi() will work 
>> >> >>> > just fine on arrays and should be much faster.
>> >> >>>
>> >> >>> Yes, it really does the trick. See the following for the benchmark
>> >> >>> based on your suggestion:
>> >> >>>
>> >> >>> $ time python mu.py
>> >> >>> [-10.999 -10.999 -10.999 ...  20. 20. 20.   ] [4.973e-84
>> >> >>> 4.973e-84 4.973e-84 ... 4.973e-84 4.973e-84 4.973e-84]
>> >> >>>
>> >> >>> real0m41.056s
>> >> >>> user0m43.970s
>> >> >>> sys0m3.813s
>> >> >>>
>> >> >>>
>> >> >>> But are there any ways to further improve/increase efficiency?
>> >> >>
>> >> >>
>> >> >>
>> >> >> I believe it will get a bit better if you don’t column_stack an array 
>> >> >> 6000 times - maybe pre-allocate your output first?
>> >> >>
>> >> >> Andrea.
>> >> >
>> >> >
>> >> >
>> >> > I’m sorry, scratch that: I’ve seen a ghost white space in front of your 
>> >> > column_stack call and made me think you were stacking your results very 
>> >> > many times, which is not the case.
>> >>
>> >> Still not so clear on your solutions for this problem. Could you
>> >> please post here the corresponding snippet of your enhancement?
>> >
>> >
>> > I have no solution, I originally thought you were calling “column_stack” 
>> > 6000 times in the loop, but that is not the case, I was mistaken. My 
>> > apologies for that.
>> >
>> > The timings of your approach is highly dependent on the size of your 
>> > “energy” and “DOS” array -
>>
>> The size of the “energy” and “DOS” array is Problem-related and
>> shouldn't be reduced arbitrarily.
>>
>> > not to mention calling trapz 6000 times in a loop.
>>
>> I'm currently thinking on parallelization the execution of the for
>> loop, say, with joblib <https://github.com/joblib/joblib>, but I still
>> haven't figured out the corresponding codes. If you have some
>> experience on this type of solution, could you please give me some
>> more hints?
>>
>> >  Maybe there’s a better way to do it with another approach, but at the 
>> > moment I can’t think of one...
>> >
>> >>
>> >>
>> >> Regards,
>> >> HY
>> >> >
>> >> >>
>> >> >>
>> >> >>>
>> >> >>>
>> >> >>> Regards,
>> >> >>> HY
>> >> >>>
>> >> >>> >
>> >> >>> > On Sat, Oct 10, 2020, 8:23 AM Hongyi Zhao  
>> >> >>> > wrote:
>> >>

Re: [Numpy-discussion] The mu.py script will keep running and never end.

2020-10-12 Thread Hongyi Zhao
On Sun, Oct 11, 2020 at 3:42 PM Evgeni Burovski
 wrote:
>
> On Sun, Oct 11, 2020 at 9:55 AM Evgeni Burovski
>  wrote:
> >
> > The script seems to be computing the particle numbers for an array of 
> > chemical potentials.
> >
> > Two ways of speeding it up, both are likely simpler then using dask:
> >
> > First: use numpy
> >
> > 1. Move constructing mu_all out of the loop (np.linspace)
> > 2. Arrange the integrands into a 2d array
> > 3. np.trapz along an axis which corresponds to a single integrand array
> > (Or avoid the overhead of trapz by just implementing the trapezoid formula 
> > manually)
>
>
> Roughly like this:
> https://gist.github.com/ev-br/0250e4eee461670cf489515ee427eb99

I've compared the real execution time of your version with that of
the original Python script, modified to use fermi() directly without
applying vectorize() to it. Very surprisingly, the latter is more
efficient than the former; see the following for more info:

$ time python fermi_integrate_np.py
[[1.0300e+01 4.55561775e+17]
 [1.03001000e+01 4.55561780e+17]
 [1.03002000e+01 4.55561786e+17]
 ...
 [1.08997000e+01 1.33654085e+21]
 [1.08998000e+01 1.33818034e+21]
 [1.08999000e+01 1.33982054e+21]]

real    1m8.797s
user    0m47.204s
sys     0m27.105s
$ time python mu.py
[[1.0300e+01 4.55561775e+17]
 [1.03001000e+01 4.55561780e+17]
 [1.03002000e+01 4.55561786e+17]
 ...
 [1.08997000e+01 1.33654085e+21]
 [1.08998000e+01 1.33818034e+21]
 [1.08999000e+01 1.33982054e+21]]

real    0m38.829s
user    0m41.541s
sys     0m3.399s

So I think the benchmark dataset you used for testing the code's
efficiency is not so appropriate. What's your view on these test
results?

Regards,
HY

>
>
>
> > Second:
> >
> > Move the loop into cython.
> >
> >
> >
> >
> >> Sun, 11 Oct 2020, 9:32 Hongyi Zhao :
> >>
> >> On Sun, Oct 11, 2020 at 2:02 PM Andrea Gavana  
> >> wrote:
> >> >
> >> >
> >> >
> >> > On Sun, 11 Oct 2020 at 07.52, Hongyi Zhao  wrote:
> >> >>
> >> >> On Sun, Oct 11, 2020 at 1:33 PM Andrea Gavana  
> >> >> wrote:
> >> >> >
> >> >> >
> >> >> >
> >> >> > On Sun, 11 Oct 2020 at 07.14, Andrea Gavana  
> >> >> > wrote:
> >> >> >>
> >> >> >> Hi,
> >> >> >>
> >> >> >> On Sun, 11 Oct 2020 at 00.27, Hongyi Zhao  
> >> >> >> wrote:
> >> >> >>>
> >> >> >>> On Sun, Oct 11, 2020 at 1:48 AM Robert Kern  
> >> >> >>> wrote:
> >> >> >>> >
> >> >> >>> > You don't need to use vectorize() on fermi(). fermi() will work 
> >> >> >>> > just fine on arrays and should be much faster.
> >> >> >>>
> >> >> >>> Yes, it really does the trick. See the following for the benchmark
> >> >> >>> based on your suggestion:
> >> >> >>>
> >> >> >>> $ time python mu.py
> >> >> >>> [-10.999 -10.999 -10.999 ...  20. 20. 20.   ] [4.973e-84
> >> >> >>> 4.973e-84 4.973e-84 ... 4.973e-84 4.973e-84 4.973e-84]
> >> >> >>>
> >> >> >>> real0m41.056s
> >> >> >>> user0m43.970s
> >> >> >>> sys0m3.813s
> >> >> >>>
> >> >> >>>
> >> >> >>> But are there any ways to further improve/increase efficiency?
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> I believe it will get a bit better if you don’t column_stack an 
> >> >> >> array 6000 times - maybe pre-allocate your output first?
> >> >> >>
> >> >> >> Andrea.
> >> >> >
> >> >> >
> >> >> >
> >> >> > I’m sorry, scratch that: I’ve seen a ghost white space in front of 
> >> >> > your column_stack call and made me think you were stacking your 
> >> >> > results very many times, which is not the case.
> >> >>
> >> >> Still not so clear on your solutions for this problem. Could you
> >> >> please post here the corresponding snippet of your enhancement?
> >> >
> >> >

Re: [Numpy-discussion] The mu.py script will keep running and never end.

2020-10-12 Thread Hongyi Zhao
On Mon, Oct 12, 2020 at 9:33 PM Andrea Gavana  wrote:
>
> Hi,
>
> On Mon, 12 Oct 2020 at 14:38, Hongyi Zhao  wrote:
>>
>> On Sun, Oct 11, 2020 at 3:42 PM Evgeni Burovski
>>  wrote:
>> >
>> > On Sun, Oct 11, 2020 at 9:55 AM Evgeni Burovski
>> >  wrote:
>> > >
>> > > The script seems to be computing the particle numbers for an array of 
>> > > chemical potentials.
>> > >
>> > > Two ways of speeding it up, both are likely simpler then using dask:
>> > >
>> > > First: use numpy
>> > >
>> > > 1. Move constructing mu_all out of the loop (np.linspace)
>> > > 2. Arrange the integrands into a 2d array
>> > > 3. np.trapz along an axis which corresponds to a single integrand array
>> > > (Or avoid the overhead of trapz by just implementing the trapezoid 
>> > > formula manually)
>> >
>> >
>> > Roughly like this:
>> > https://gist.github.com/ev-br/0250e4eee461670cf489515ee427eb99
>>
>> I've done the comparison of the real execution time for your version
>> I've compared the execution efficiency of your above method and the
>> original method of the python script by directly using fermi() without
>> executing vectorize() on it. Very surprisingly, the latter is more
>> efficient than the former, see following for more info:
>>
>> $ time python fermi_integrate_np.py
>> [[1.0300e+01 4.55561775e+17]
>>  [1.03001000e+01 4.55561780e+17]
>>  [1.03002000e+01 4.55561786e+17]
>>  ...
>>  [1.08997000e+01 1.33654085e+21]
>>  [1.08998000e+01 1.33818034e+21]
>>  [1.08999000e+01 1.33982054e+21]]
>>
>> real1m8.797s
>> user0m47.204s
>> sys0m27.105s
>> $ time python mu.py
>> [[1.0300e+01 4.55561775e+17]
>>  [1.03001000e+01 4.55561780e+17]
>>  [1.03002000e+01 4.55561786e+17]
>>  ...
>>  [1.08997000e+01 1.33654085e+21]
>>  [1.08998000e+01 1.33818034e+21]
>>  [1.08999000e+01 1.33982054e+21]]
>>
>> real0m38.829s
>> user0m41.541s
>> sys0m3.399s
>>
>> So, I think that the benchmark dataset used by you for testing code
>> efficiency is not so appropriate. What's your point of view on this
>> testing results?
>
>
>
>   Evgeni has provided an interesting example on how to speed up your code - 
> granted, he used toy data but the improvement is real. As far as I can see, 
> you haven't specified how big are your DOS etc... vectors, so it's not that 
> obvious how to draw any conclusions. I find it highly puzzling that his 
> implementation appears to be slower than your original code.
>
> In any case, if performance is so paramount for you, then I would suggest you 
> to move in the direction Evgeni was proposing, i.e. shifting your 
> implementation to C/Cython or Fortran/f2py.

If so, I think that pure C/Fortran implementations should be more
efficient than ones using Cython/f2py.


> I had much better results myself using Fortran/f2py than pure NumPy or 
> C/Cython, but this is mostly because my knowledge of Cython is quite limited. 
> That said, your problem should be fairly easy to implement in a compiled 
> language.
>
> Andrea.
>
>
>>
>>
>> Regards,
>> HY
>>
>> >
>> >
>> >
>> > > Second:
>> > >
>> > > Move the loop into cython.
>> > >
>> > >
>> > >
>> > >
>> > > Sun, 11 Oct 2020, 9:32 Hongyi Zhao :
>> > >>
>> > >> On Sun, Oct 11, 2020 at 2:02 PM Andrea Gavana  
>> > >> wrote:
>> > >> >
>> > >> >
>> > >> >
>> > >> > On Sun, 11 Oct 2020 at 07.52, Hongyi Zhao  
>> > >> > wrote:
>> > >> >>
>> > >> >> On Sun, Oct 11, 2020 at 1:33 PM Andrea Gavana 
>> > >> >>  wrote:
>> > >> >> >
>> > >> >> >
>> > >> >> >
>> > >> >> > On Sun, 11 Oct 2020 at 07.14, Andrea Gavana 
>> > >> >> >  wrote:
>> > >> >> >>
>> > >> >> >> Hi,
>> > >> >> >>
>> > >> >> >> On Sun, 11 Oct 2020 at 00.27, Hongyi Zhao  
>> > >> >> >> wrote:
>> > >> >> >>>
>> > >> >> >>> On Sun, Oct 11, 2020 at 1:48 AM Robert Kern 
>> > >> >> >

Re: [Numpy-discussion] The mu.py script will keep running and never end.

2020-10-12 Thread Hongyi Zhao
On Mon, Oct 12, 2020 at 10:41 PM Andrea Gavana  wrote:
>
> Hi,
>
> On Mon, 12 Oct 2020 at 16.22, Hongyi Zhao  wrote:
>>
>> On Mon, Oct 12, 2020 at 9:33 PM Andrea Gavana  
>> wrote:
>> >
>> > Hi,
>> >
>> > On Mon, 12 Oct 2020 at 14:38, Hongyi Zhao  wrote:
>> >>
>> >> On Sun, Oct 11, 2020 at 3:42 PM Evgeni Burovski
>> >>  wrote:
>> >> >
>> >> > On Sun, Oct 11, 2020 at 9:55 AM Evgeni Burovski
>> >> >  wrote:
>> >> > >
>> >> > > The script seems to be computing the particle numbers for an array of 
>> >> > > chemical potentials.
>> >> > >
>> >> > > Two ways of speeding it up, both are likely simpler then using dask:
>> >> > >
>> >> > > First: use numpy
>> >> > >
>> >> > > 1. Move constructing mu_all out of the loop (np.linspace)
>> >> > > 2. Arrange the integrands into a 2d array
>> >> > > 3. np.trapz along an axis which corresponds to a single integrand 
>> >> > > array
>> >> > > (Or avoid the overhead of trapz by just implementing the trapezoid 
>> >> > > formula manually)
>> >> >
>> >> >
>> >> > Roughly like this:
>> >> > https://gist.github.com/ev-br/0250e4eee461670cf489515ee427eb99
>> >>
>> >> I've done the comparison of the real execution time for your version
>> >> I've compared the execution efficiency of your above method and the
>> >> original method of the python script by directly using fermi() without
>> >> executing vectorize() on it. Very surprisingly, the latter is more
>> >> efficient than the former, see following for more info:
>> >>
>> >> $ time python fermi_integrate_np.py
>> >> [[1.0300e+01 4.55561775e+17]
>> >>  [1.03001000e+01 4.55561780e+17]
>> >>  [1.03002000e+01 4.55561786e+17]
>> >>  ...
>> >>  [1.08997000e+01 1.33654085e+21]
>> >>  [1.08998000e+01 1.33818034e+21]
>> >>  [1.08999000e+01 1.33982054e+21]]
>> >>
>> >> real1m8.797s
>> >> user0m47.204s
>> >> sys0m27.105s
>> >> $ time python mu.py
>> >> [[1.0300e+01 4.55561775e+17]
>> >>  [1.03001000e+01 4.55561780e+17]
>> >>  [1.03002000e+01 4.55561786e+17]
>> >>  ...
>> >>  [1.08997000e+01 1.33654085e+21]
>> >>  [1.08998000e+01 1.33818034e+21]
>> >>  [1.08999000e+01 1.33982054e+21]]
>> >>
>> >> real0m38.829s
>> >> user0m41.541s
>> >> sys0m3.399s
>> >>
>> >> So, I think that the benchmark dataset used by you for testing code
>> >> efficiency is not so appropriate. What's your point of view on this
>> >> testing results?
>> >
>> >
>> >
>> >   Evgeni has provided an interesting example on how to speed up your code 
>> > - granted, he used toy data but the improvement is real. As far as I can 
>> > see, you haven't specified how big are your DOS etc... vectors, so it's 
>> > not that obvious how to draw any conclusions. I find it highly puzzling 
>> > that his implementation appears to be slower than your original code.
>> >
>> > In any case, if performance is so paramount for you, then I would suggest 
>> > you to move in the direction Evgeni was proposing, i.e. shifting your 
>> > implementation to C/Cython or Fortran/f2py.
>>
>> If so, I think that the C/Fortran based implementations should be more
>> efficient than the ones using Cython/f2py.
>
>
> That is not what I meant: what I meant is: write the time consuming part of 
> your code in C or Fortran and then bridge it to Python using Cython or f2py.

I understand your meaning, but for such a small job, why not just do
it in pure C/Fortran if we have to bother with them at all?

All the best,
-- 
Hongyi Zhao 
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] The source code corresponding to numpy.invert.

2021-10-03 Thread hongyi . zhao
I noticed the following documentation on `numpy.invert`: [1]


numpy.invert
[...]
Compute bit-wise inversion, or bit-wise NOT, element-wise.

Computes the bit-wise NOT of the underlying binary representation of the 
integers in the input arrays. This ufunc implements the C/Python operator ~.
[...]
The ~ operator can be used as a shorthand for np.invert on ndarrays.

>>> x1 = np.array([True, False])
>>> ~x1
array([False,  True])


So the C/Python operator `~` is overridden by the corresponding function in 
NumPy, but where is the corresponding source-code implementation? 

[1] 
https://numpy.org/doc/stable/reference/generated/numpy.invert.html#numpy-invert 

Regards,
HZ
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/


[Numpy-discussion] Re: The source code corresponding to numpy.invert.

2021-10-03 Thread hongyi . zhao
> (the bool implementation uses the `logical_not` loop).

Do you mean the following code snippet:

https://github.com/numpy/numpy/blob/3c1e9b4717b2eb33a2bf2d495006bc300f5b8765/numpy/core/src/umath/loops.c.src#L1627-L1633
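
For a quick sanity check at the Python level that, for boolean input,
np.invert indeed behaves like logical_not (just an illustration):

```python
import numpy as np

a = np.array([True, False])

# For boolean arrays, bit-wise NOT and logical NOT agree.
print(np.invert(a))       # [False  True]
print(np.logical_not(a))  # [False  True]
print(~a)                 # [False  True]
```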
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/


[Numpy-discussion] Re: The source code corresponding to numpy.invert.

2021-10-03 Thread hongyi . zhao
Thank you for pointing this out. This is the code block which includes the 
first appearance of the keyword `logical_not`.

BTW, why can't the ~ operator be tested for equality with `np.invert`, as shown below?

``` 
In [1]: import numpy as np
In [3]: np.invert is np.bitwise_not
Out[3]: True

In [4]: np.invert is ~
  File "", line 1
np.invert is ~
  ^
SyntaxError: invalid syntax
```
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/


[Numpy-discussion] Re: The source code corresponding to numpy.invert.

2021-10-04 Thread Hongyi Zhao
On Mon, Oct 4, 2021 at 4:41 PM Robert Kern  wrote:
>
> On Mon, Oct 4, 2021 at 1:09 AM  wrote:
>>
>> Thank you for pointing this out. This is the code block which includes the 
>> first appearance of the keyword `logical_not`.
>>
>> BTW, why can't the ~ operator be tested equal to 'np.invert', as shown below:
>>
>> ```
>> In [1]: import numpy as np
>> In [3]: np.invert is np.bitwise_not
>> Out[3]: True
>>
>> In [4]: np.invert is ~
>>   File "", line 1
>> np.invert is ~
>>   ^
>> SyntaxError: invalid syntax
>> ```
>
>
> That’s just the way Python’s syntax works. Operators are not names that can 
> be resolved to objects that can be compared with the `is` operator. Instead, 
> when that operator is evaluated in an expression, the Python interpreter will 
> look up a specially-named method on the operand object (in this case 
> `__invert__`). Numpy array objects implement this method using `np.invert`.

If so, which is a symlink to which? I mean, which is the original
name, and which is an alias?

HZ
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/


[Numpy-discussion] Re: The source code corresponding to numpy.invert.

2021-10-04 Thread Hongyi Zhao
On Mon, Oct 4, 2021 at 9:33 PM Robert Kern  wrote:
>
> On Mon, Oct 4, 2021 at 5:17 AM Hongyi Zhao  wrote:
>>
>>
>> > That’s just the way Python’s syntax works. Operators are not names that 
>> > can be resolved to objects that can be compared with the `is` operator. 
>> > Instead, when that operator is evaluated in an expression, the Python 
>> > interpreter will look up a specially-named method on the operand object 
>> > (in this case `__invert__`). Numpy array objects implement this method 
>> > using `np.invert`.
>>
>> If so, which is symlink to which, I mean, which is the original name,
>> and which is an alias?
>
>
> "symlink" and "alias" are probably not the best analogies. The implementation 
> of `np.ndarry.__invert__` simply calls `np.invert` to do the actual 
> computation.

It seems that the above calling mechanism is not so clear or easy to
figure out just by reading the documentation, say, via the following
commands in IPython:

import numpy as np
help(np.invert)
np.invert?
np.info(np.invert)
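
On the other hand, the relationship Robert described can at least be
checked interactively (a quick illustration, nothing authoritative):

```python
import numpy as np

a = np.array([True, False])

# ndarray defines the special method that the ~ operator dispatches to...
print(hasattr(np.ndarray, "__invert__"))              # True

# ...and evaluating it agrees with np.invert, as described above.
print(np.array_equal(a.__invert__(), np.invert(a)))   # True
print(np.array_equal(~a, np.invert(a)))               # True
```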

Regards,
HY
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/


[Numpy-discussion] Re: The source code corresponding to numpy.invert.

2021-10-04 Thread Hongyi Zhao
On Mon, Oct 4, 2021 at 11:54 PM Stephen Waterbury
 wrote:
>
> On 10/4/21 10:07 AM, Hongyi Zhao wrote:
>
> On Mon, Oct 4, 2021 at 9:33 PM Robert Kern  wrote:
>
> On Mon, Oct 4, 2021 at 5:17 AM Hongyi Zhao  wrote:
>
> That’s just the way Python’s syntax works. Operators are not names that can 
> be resolved to objects that can be compared with the `is` operator. Instead, 
> when that operator is evaluated in an expression, the Python interpreter will 
> look up a specially-named method on the operand object (in this case 
> `__invert__`). Numpy array objects implement this method using `np.invert`.
>
> If so, which is symlink to which, I mean, which is the original name,
> and which is an alias?
>
> "symlink" and "alias" are probably not the best analogies. The implementation 
> of `np.ndarry.__invert__` simply calls `np.invert` to do the actual 
> computation.
>
> It seems that the above calling/invoking logic/mechanism is not so
> clear or easy to understand/figure out only by reading the document,
> say, by the following commands in IPython:
>
> import numpy as np
> help(np.invert)
> np.invert?
> np.info(np.invert)
>
> You probably want to read the Python Language Reference regarding "Special 
> Methods":
>
> https://docs.python.org/3.9/reference/datamodel.html#special-method-names

Thank you for directing me to this helpful documentation.

HZ
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/


[Numpy-discussion] Get a function definition/implementation hint similar to the one shown in pycharm.

2021-10-18 Thread hongyi . zhao
I've written the following Python code snippet in PyCharm:
```python
import numpy as np
from numpy import pi, sin

a = np.array([1], dtype=bool)
if np.in|vert(a) == ~a:
    print('ok')
```
When the point/cursor is placed at the position denoted by `|` in the
above code snippet, I would like to see information similar to what
PyCharm provides, as shown in the following screenshots:

https://user-images.githubusercontent.com/11155854/137619512-674e0eda-7564-4e76-af86-04a194ebeb8e.png
https://user-images.githubusercontent.com/11155854/137619524-a0b584a3-1627-4612-ab1f-05ec1af67d55.png

But I wonder: are there any other Python packages/tools that can help
me achieve this goal?
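
As a rough idea of what such tools display, the signature line can
also be pulled out of a ufunc's docstring with nothing but the
standard library (just an illustration, not a replacement for a real
language server):

```python
import inspect
import numpy as np

# For a ufunc, the first line of the (cleaned) docstring is its call signature.
print(inspect.getdoc(np.invert).splitlines()[0])
print(inspect.getdoc(np.sin).splitlines()[0])
```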

Regards,
HZ
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/


[Numpy-discussion] dtype=(bool) vs dtype=bool

2021-10-19 Thread hongyi . zhao
See the following test in an IPython shell:

In [6]: import numpy as np

In [7]: a = np.array([1], dtype=(bool))

In [8]: b = np.array([1], dtype=bool)

In [9]: a
Out[9]: array([ True])

In [10]: b
Out[10]: array([ True])

It seems that dtype=(bool) and dtype=bool are both correct usages. If so, which 
is preferable?

Regards,
HZ
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/


[Numpy-discussion] Re: dtype=(bool) vs dtype=bool

2021-10-19 Thread hongyi . zhao
> You could use `dis.dis` to compare the two expressions and see that they 
> compile to the same bytecode.

Do you mean the following:

In [1]: import numpy as np
In [2]: from dis import dis
In [7]: dis('bool')
  1           0 LOAD_NAME                0 (bool)
              2 RETURN_VALUE

In [8]: dis('(bool)')
  1           0 LOAD_NAME                0 (bool)
              2 RETURN_VALUE
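
The same kind of check also works on the complete expressions (a small
illustration; the full disassembly output is omitted here):

```python
import dis

src1 = "np.array([1], dtype=bool)"
src2 = "np.array([1], dtype=(bool))"

# The parentheses are pure grouping, so both strings compile to identical bytecode.
print(compile(src1, "<expr>", "eval").co_code ==
      compile(src2, "<expr>", "eval").co_code)   # True

dis.dis(src1)   # the disassembly is the same for either string
```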
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/


[Numpy-discussion] Re: dtype=(bool) vs dtype=bool

2021-10-19 Thread hongyi . zhao
Do I have to use it this way?
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/


[Numpy-discussion] Re: dtype=(bool) vs dtype=bool

2021-10-19 Thread Hongyi Zhao
On Wed, Oct 20, 2021 at 8:29 AM Robert Kern  wrote:
>
> On Tue, Oct 19, 2021 at 8:22 PM  wrote:
>>
>> Do I have to use it this way?
>
>
> Nothing is forcing you to, but everyone else will write it as `dtype=bool`, 
> not `dtype=(bool)`. `dtype=(bool)` is perfectly syntactically-valid Python. 
> It's just not idiomatic, so readers of your code will wonder why you wrote it 
> that way and if you meant something else and will have trouble reading your 
> code.

I use Emacs, and I find that python-lsp-server gives dtype=() as the
completion result; see here [1] for a more detailed discussion.

[1] https://github.com/python-lsp/python-lsp-server/issues/98

HZ
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/