[Inada Naoki]
>> So I tried to use LIKELY/UNLIKELY macro to teach compiler hot part.
>> But I need to use
>> "static inline" for pymalloc_alloc and pymalloc_free yet [1].
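
For readers unfamiliar with the macros under discussion, here is a minimal sketch of what LIKELY/UNLIKELY branch hints usually look like, combined with a small "static inline" fast path and a separate slow path. It assumes GCC or Clang (`__builtin_expect` is their builtin). The toy free list and the names `toy_alloc`, `toy_alloc_slow`, `toy_free` and `BLOCK_SIZE` are hypothetical illustrations, not pymalloc's actual code or the patch referenced as [1].

```c
#include <stddef.h>
#include <stdlib.h>

/* Branch hints. __builtin_expect is a GCC/Clang builtin; on other
 * compilers the macros fall back to the plain condition. */
#if defined(__GNUC__) || defined(__clang__)
#  define LIKELY(x)   __builtin_expect(!!(x), 1)
#  define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#  define LIKELY(x)   (x)
#  define UNLIKELY(x) (x)
#endif

/* Hypothetical toy free list of fixed-size blocks (NOT pymalloc's
 * real pool/arena layout). */
#define BLOCK_SIZE 64
typedef struct block { struct block *next; } block;
static block *freelist = NULL;

/* Cold path: kept in its own function so the hot path stays small. */
static void *toy_alloc_slow(void)
{
    return malloc(BLOCK_SIZE);
}

/* Hot path: small enough to be inlined into its caller. */
static inline void *toy_alloc(void)
{
    block *b = freelist;
    if (LIKELY(b != NULL)) {        /* common case: reuse a freed block */
        freelist = b->next;
        return b;
    }
    return toy_alloc_slow();        /* rare case: ask the system allocator */
}

static inline void toy_free(void *p)
{
    if (UNLIKELY(p == NULL)) {      /* rare case: nothing to free */
        return;
    }
    block *b = p;
    b->next = freelist;             /* push the block back on the free list */
    freelist = b;
}
```

The hints only tell the compiler which side of a branch to lay out as the fall-through path. Whether they help depends on the compiler and build options, which is the point debated in the replies.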
[Neil Schemenauer]
> I think LIKELY/UNLIKELY is not helpful if you compile with LTO/PGO
> enabled.
I like adding those regardl
> Mean +- std dev: [python-master] 199 ms +- 1 ms -> [python] 182 ms +-
> 4 ms: 1.10x faster (-9%)
...
> I will try to split pymalloc_alloc and pymalloc_free to smaller functions.
I did it and pymalloc is now as fast as mimalloc.
$ ./python bm_spectral_norm.py --compare-to=./python-master
python-
On Wed, Jul 10, 2019 at 5:18 PM Neil Schemenauer wrote:
>
> On 2019-07-09, Inada Naoki wrote:
> > PyObject_Malloc inlines pymalloc_alloc, and PyObject_Free inlines
> > pymalloc_free.
> > But compiler doesn't know which is the hot part in pymalloc_alloc and
> > pymalloc_free.
>
> Hello Inada,
>
>
On 2019-07-09, Inada Naoki wrote:
> PyObject_Malloc inlines pymalloc_alloc, and PyObject_Free inlines
> pymalloc_free.
> But compiler doesn't know which is the hot part in pymalloc_alloc and
> pymalloc_free.
Hello Inada,
I don't see this on my PC. I'm using GCC 8.3.0. I have configured
the bui
On 2019-07-09, Inada Naoki wrote:
> So I tried to use LIKELY/UNLIKELY macro to teach compiler hot part.
> But I need to use
> "static inline" for pymalloc_alloc and pymalloc_free yet [1].
I think LIKELY/UNLIKELY is not helpful if you compile with LTO/PGO
enabled. So, I would try that first. Also
[Inada Naoki,
looking into why mimalloc did so much better on spectral_norm]
> I compared "perf" output of mimalloc and pymalloc, and I succeeded to
> optimize pymalloc!
>
> $ ./python bm_spectral_norm.py --compare-to ./python-master
> python-master: . 199 ms +- 1 ms
> python: