New submission from Raymond Hettinger:
A number of fine-grained methods in Objects/listobject.c use PyList_Check().
They include PyList_Size, PyList_GetItem, PyList_SetItem, PyList_Insert, and
PyList_Append.
PyList_Check() works by making two sequentially dependent memory fetches: the
first loads the object's type pointer, and the second reads the type's flags
through that pointer:
    movq    8(%rdi), %rax
    testb   $2, 171(%rax)
    je      L1645
This patch proposes a fast path for the common case of an exact match, using
PyList_CheckExact() as an early-out before the PyList_Check() test:
    leaq    _PyList_Type(%rip), %rdx    # parallelizable
    movq    8(%rdi), %rax               # only 1 memory access
    cmpq    %rdx, %rax                  # macro-fusion
    je      L1604                       # early-out
    testb   $2, 171(%rax)               # fallback to 2nd memory access
    je      L1604
This technique won't help outside of Objects/listobject.c. There, the initial
LEA instruction becomes a MOV that loads the address of PyList_Type from the
global offset table, and that extra memory access nullifies the advantage.
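For readers who prefer C to assembly, the fast path described above can be
modelled roughly as follows. This is only a sketch and not the code in
list_check_fastpath.diff: the model_* structs, functions, and flag name are
hypothetical stand-ins for PyObject, PyTypeObject, PyList_CheckExact(),
PyList_Check(), and the list-subclass type flag, and the flag's bit position
is an assumption chosen to line up with the testb $2, 171(%rax) above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the list-subclass type flag (assumed bit 25). */
#define MODEL_LIST_SUBCLASS_FLAG (1UL << 25)

/* Hypothetical, stripped-down models of a type object and an object header. */
typedef struct model_type {
    unsigned long tp_flags;
} model_type;

typedef struct model_object {
    const model_type *ob_type;
} model_object;

/* Plays the role of PyList_Type: the one exact list type. */
static const model_type model_list_type = { MODEL_LIST_SUBCLASS_FLAG };

/* Models PyList_Check(): load ob_type, then load tp_flags through it.
   The second load cannot start until the first one has completed. */
static int model_list_check(const model_object *op)
{
    assert(op != NULL);
    return (op->ob_type->tp_flags & MODEL_LIST_SUBCLASS_FLAG) != 0;
}

/* Models PyList_CheckExact(): load ob_type and compare it to a known address. */
static int model_list_check_exact(const model_object *op)
{
    assert(op != NULL);
    return op->ob_type == &model_list_type;
}

/* The proposed ordering: try the exact match first and fall back to the
   flag test only for objects that are not exact lists. */
static int model_list_check_fast(const model_object *op)
{
    return model_list_check_exact(op) || model_list_check(op);
}
```

Both checks accept the same set of objects. Only the common exact-list case
changes, since it can now return after a single memory access.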
----------
assignee: serhiy.storchaka
components: Interpreter Core
files: list_check_fastpath.diff
keywords: patch
messages: 258918
nosy: rhettinger, serhiy.storchaka
priority: normal
severity: normal
stage: patch review
status: open
title: Faster type checking in listobject.c
type: performance
versions: Python 3.6
Added file: http://bugs.python.org/file41713/list_check_fastpath.diff
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue26201>
_______________________________________