[issue40230] Itertools.product() Out of Memory Errors

2020-04-08 Thread Henry Carscadden


New submission from Henry Carscadden :

The product function in itertools provides an implementation of the Cartesian 
product that, when called with many arguments, quickly gives out-of-memory 
errors. The current implementation creates a lot of unnecessary lists in this 
situation. A more appropriate implementation uses dynamic programming to avoid 
these out-of-memory issues.
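(For context on the claim above: itertools.product() yields its result tuples one at a time, so the output itself is never materialized. The sketch below, using itertools.islice, consumes the first few terms of a product with 2**50 elements; it is an illustrative check, not code from this report.)

```python
from itertools import islice, product

# 50 two-element iterables: the full product has 2**50 terms,
# but product() yields them lazily, one tuple at a time.
many_arguments = [[1, 2] for _ in range(50)]

# Take only the first three terms; memory use stays tiny.
first_three = list(islice(product(*many_arguments), 3))
print(first_three[0])  # (1, 1, ..., 1) - fifty ones
```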

--
components: Distutils
messages: 366005
nosy: Henry Carscadden, dstufft, eric.araujo
priority: normal
severity: normal
status: open
title: Itertools.product() Out of Memory Errors
type: resource usage
versions: Python 3.7

___
Python tracker 
<https://bugs.python.org/issue40230>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40230] Itertools.product() Out of Memory Errors

2020-04-09 Thread Henry Carscadden


Henry Carscadden  added the comment:

Hey, Tim, just a note of clarification. I was working on an approximation 
algorithm for a network science tool that is being released soon. Initially, I 
relied on itertools.product(), but when I moved to testing on non-trivial 
graphs, I immediately hit out-of-memory errors, even on the large-memory nodes 
of the computing cluster I was using. The issue is that a lot of extra lists 
are created as the number of iterables passed to product() grows, even though 
the actual number of elements in each iterable is relatively small.
Here is the workaround that I used to handle many arguments:

import numpy as np
from typing import Iterable

def product(*args) -> Iterable:
    '''
    Takes in several sequences and returns a generator
    that yields the Cartesian product of their elements.
    :param args: sequences whose Cartesian product is wanted
    :return: generator
    '''
    # Degenerate case: a single sequence is its own "product"
    # (the original check compared type(args), which is always a tuple).
    if len(args) == 1:
        yield from ([x] for x in args[0])
        return
    numArgs = len(args)
    # One index per input sequence, advanced like an odometer.
    indices = np.zeros(numArgs, dtype=np.uint)
    checkLen = len(args[0])
    while indices[0] < checkLen:
        curr_term = [args[0][indices[0]]]
        for i in range(1, numArgs):
            curr_term.append(args[i][indices[i]])
        # Advance the rightmost index and carry overflows leftward.
        indices[-1] += np.uint(1)
        for i in range(1, numArgs):
            updated = False
            if indices[-i] >= len(args[-i]):
                indices[-i] = np.uint(0)
                indices[-i - 1] += np.uint(1)
                updated = True
            if not updated:
                break
        yield curr_term
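(For comparison, the same lazy behaviour can be sketched without numpy or index bookkeeping as a recursive generator. This is an illustrative alternative, not the code from the message above; `lazy_product` is a hypothetical name.)

```python
from typing import Iterable, List

def lazy_product(*args) -> Iterable[List]:
    # Recursive sketch: pair each element of the first sequence with
    # every combination of the rest; only one partial result list
    # exists at any moment, so memory stays proportional to the
    # number of arguments, not the size of the product.
    if not args:
        yield []
        return
    head, *tail = args
    for x in head:
        for rest in lazy_product(*tail):
            yield [x] + rest

# First term of a 50-way product of pairs, without building 2**50 lists.
it = lazy_product(*[[1, 2] for _ in range(50)])
print(next(it))  # [1, 1, ..., 1]
```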

--




[issue40230] Itertools.product() Out of Memory Errors

2020-04-10 Thread Henry Carscadden


Henry Carscadden  added the comment:

@Tim Peters
For example, the following should reproduce the error:

many_arguments = [[1, 2] for i in range(50)]
for term in product(*many_arguments):
    print(term)
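(If memory were the culprit, a tracemalloc check of that loop would show it. The hedged sketch below truncates the iteration with itertools.islice, since 2**50 terms cannot actually be printed; the memory footprint stays flat regardless of how many terms are consumed.)

```python
import tracemalloc
from itertools import islice, product

many_arguments = [[1, 2] for i in range(50)]

tracemalloc.start()
# Consume 100,000 terms of the 2**50-element product, one at a time.
for term in islice(product(*many_arguments), 100_000):
    pass
_, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"peak traced memory: {peak} bytes")  # stays small: output never materialized
```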
In my application, I was taking the Cartesian product of the sets of all simple 
paths to several nodes. This regularly produced a lot of arguments and crashed 
the compute nodes on which my jobs were running. Perhaps this is not a salient 
concern for most uses, but I just wanted to make you aware of the issue.

--




[issue40230] Itertools.product() Out of Memory Errors

2020-04-10 Thread Henry Carscadden


Henry Carscadden  added the comment:

Tim, I'm guessing it was a misdiagnosis, then. In any case, something changed 
in my codebase that alleviated the problem. Thanks for looking into it.

--
