[issue40230] Itertools.product() Out of Memory Errors
New submission from Henry Carscadden :

The product() function in itertools provides an implementation of the Cartesian product that quickly gives out-of-memory errors when run with many arguments. The current implementation creates a lot of unnecessary lists in this situation. A more appropriate implementation would use dynamic programming to avoid these out-of-memory issues.

----------
components: Distutils
messages: 366005
nosy: Henry Carscadden, dstufft, eric.araujo
priority: normal
severity: normal
status: open
title: Itertools.product() Out of Memory Errors
type: resource usage
versions: Python 3.7

_______________________________________
Python tracker <https://bugs.python.org/issue40230>
_______________________________________
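For context on where intermediate lists could come from, here is a sketch following the rough pure-Python equivalent given in the itertools documentation (the function name below is chosen only to avoid shadowing the built-in; the documentation notes that the actual C implementation does not build up these intermediate results in memory):

from itertools import chain  # only used in the demo call below


def product_equivalent(*iterables, repeat=1):
    # Rough pure-Python equivalent from the itertools docs: each input
    # iterable is pooled into a tuple up front, and the running "result"
    # list holds every partial combination built so far.
    pools = [tuple(pool) for pool in iterables] * repeat
    result = [[]]
    for pool in pools:
        result = [x + [y] for x in result for y in pool]
    for prod in result:
        yield tuple(prod)


# Illustrative call: with n inputs of size k, "result" above grows to k**n
# partial combinations before anything is yielded.
print(list(product_equivalent([1, 2], 'ab')))
# [(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b')]

In this sketch, with n inputs of size k the result list grows to k**n partial combinations, which may be the blow-up the report has in mind, though the built-in itself does not work that way.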
[issue40230] Itertools.product() Out of Memory Errors
Henry Carscadden added the comment:

Hey, Tim, I just wanted to add a note of clarification. I was working on an approximation algorithm for a network science tool that is being released soon. Initially, I relied on itertools.product(), but when I moved to testing on non-trivial graphs, I immediately had out-of-memory errors. This happened even on the large-memory nodes of the computing cluster I was using. The issue is that a lot of extra lists are created as the number of iterables passed to product() grows, although the actual number of elements in the iterables is relatively small. Here is the workaround that I used to handle many arguments:

from typing import Iterable

import numpy as np


def product(*args) -> Iterable:
    '''
    Takes in several sequences and returns a generator that yields the
    Cartesian product of the elements in the sequences.
    :param args: sequences which you would like the Cartesian product of
    :return: generator
    '''
    if len(args) == 1 and isinstance(args[0], list):
        # Degenerate case: a single list; yield each element wrapped in a
        # one-element list so the output shape matches the general case.
        for x in args[0]:
            yield [x]
        return
    numArgs = len(args)
    # One cursor per argument; together they act like an odometer whose
    # rightmost digit advances fastest.
    indices = np.zeros(numArgs, dtype=np.uint)
    checkLen = len(args[0])
    while indices[0] < checkLen:
        # Assemble the current term from the element each cursor points at.
        curr_term = [args[0][indices[0]]]
        for i in range(1, numArgs):
            curr_term.append(args[i][indices[i]])
        # Advance the rightmost cursor and propagate any carries leftward.
        indices[-1] += np.uint(1)
        for i in range(1, numArgs):
            updated = False
            if indices[-i] >= len(args[-i]):
                indices[-i] = np.uint(0)
                indices[-i - 1] += np.uint(1)
                updated = True
            if not updated:
                break
        yield curr_term

----------

_______________________________________
Python tracker <https://bugs.python.org/issue40230>
_______________________________________
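A minimal usage sketch for the workaround above; the inputs below are illustrative rather than from the original report. The generator yields lists rather than tuples, with the rightmost argument varying fastest:

# Any indexable sequences work with the workaround above.
for term in product([1, 2], ['a', 'b'], [True]):
    print(term)
# Expected output:
# [1, 'a', True]
# [1, 'b', True]
# [2, 'a', True]
# [2, 'b', True]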
[issue40230] Itertools.product() Out of Memory Errors
Henry Carscadden added the comment:

@Tim Peters For example, the following should reproduce the error:

many_arguments = [[1, 2] for i in range(50)]
for term in product(*many_arguments):
    print(term)

In my application, I was taking the Cartesian product of the sets of all simple paths to several nodes. This regularly produced a lot of arguments and crashed the compute nodes on which my jobs were running. Perhaps this is not a salient concern for most uses, but I just wanted to make you aware of the issue.

----------

_______________________________________
Python tracker <https://bugs.python.org/issue40230>
_______________________________________
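Note that the reproducer above has 2**50 (roughly 1.1e15) terms, so printing them all will never finish regardless of memory behaviour. If it is useful to watch memory usage without waiting, a bounded sketch such as the following consumes a prefix of the same product lazily (the 1000-term limit is an arbitrary illustrative choice):

from itertools import islice, product

many_arguments = [[1, 2] for i in range(50)]
# Consume only the first 1000 terms of the same product; memory use can be
# observed while the iterator runs without exhausting all 2**50 terms.
for term in islice(product(*many_arguments), 1000):
    pass
print("consumed 1000 terms")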
[issue40230] Itertools.product() Out of Memory Errors
Henry Carscadden added the comment:

Tim, I'm guessing it was a misdiagnosis then. In any case, something changed about my codebase that alleviated the problem. Thanks for looking into it.

----------

_______________________________________
Python tracker <https://bugs.python.org/issue40230>
_______________________________________