[Numpy-discussion] Re: Improving the Thread Safety of NumPy's Test Suite

Marten van Kerkwijk via NumPy-Discussion Thu, 18 Sep 2025 19:00:46 -0700

Hi Britney,

Thanks for the write-up, and great that you are working on this!


Just wanted to support your general preference for setup within test
functions over using pytest fixtures.  Those are nice too, but I agree
with what you write, that "dependency injection via a fixture is
fundamentally *hard to debug and understand at a glance*."

One question: if there is no teardown, do things like `setup_method`
still work fine?  (For perhaps strange reasons, I find those much easier
to follow that fixtures, especially if used in not too large a test
class.)

Anyway, overall, this seems like a relatively modest amount of
relearning pain for future contributions in exchange for clear gains (of
course, quite a lot more pain for you for changing the past...).

All the best,

Marten

"Britney Whittington" <[email protected]> writes:

> Hello! My name is Britney Whittington, and I'm an intern at Quansight working 
> with Nathan Goldbaum. This is my first time working with an open-source 
> project like NumPy, and I'm excited to be here!
>
> For my project, I am working on making the NumPy test suite more thread safe, 
> to improve its support of Python's free-threaded build of Python 3.14. To do 
> this, I have been using `pytest-run-parallel` [1]. You can see the README of 
> the plugin for more details, but briefly it runs each test in a test suite 
> many times in a thread pool. This exercise results in failures in the test 
> suite, often due to thread-safety issues in pytest itself, or thread-safety 
> issues with NumPy due to use of global state in the NumPy implementation. As 
> we work on making the test suite more thread safe, we have exposed thread 
> safety issues in NumPy.
>
> Party of the difficulty of using `pytest-run-parallel` is that Pytest itself 
> and the constructs provided by it are not thread safe [2], so it takes some 
> effort to make a test suite as big as NumPy's run under `pytest-run-parallel` 
> without any failures.
>
> We've already merged some work along these lines. You may have seen some PRs 
> I've submitted so far related to this project [3]. We are getting close to 
> turning on `pytest-run-parallel` in CI. Before we do that, we want to make 
> sure there aren't any objections.
>
> So far, I've made two major changes to the test suite:
>
> ----------------------------------------------------
> 1. Refactoring xunit Setup and Teardown Methods
> ----------------------------------------------------
> #### Problem
> - NumPy makes heavy use of pytest's xunit setup and teardown methods [4], 
> which are mentioned in the testing guidelines [5].
> - When using `pytest-run-parallel`, pytest will try to run the teardown 
> before all threads complete running a test.
>
> #### Solutions
> - Replace setup/teardown with fixtures. While fixtures play more nicely with 
> threads, they are still shared between threads. If one thread mutates a 
> fixture, this mutation will carry over to all the other threads. 
> Additionally, dependency injection via a fixture is fundamentally *hard to 
> debug and understand at a glance*.
> - Use **explicit setup**. Instead of pytest calling setup methods, we modify 
> the setup methods to be manually called in each test. This fixes the teardown 
> issue, and allows us to declare variables locally and not worry about 
> mutations between threads.
>
> Currently we are favoring the usage of explicit setup. Of course, this may 
> not work for every xunit setup. For more complex cases, fixtures may be more 
> useful, or context managers.
>
> -------------------------------------------
> 2. Refactoring Global np.random Calls
> -------------------------------------------
> #### Problem
> - Calls to `np.random` use the same global instance. This results in errors 
> with tests that heavily rely on seeded results, due to threads sharing the 
> same global RNG state. While this may not necessarily cause failures, we also 
> feel that it's a fundamentally bad practice for tests to rely on global state 
> like this.
>
> #### Solution
> -  Instead of `np.random`, each test should use a local instance of 
> `np.random.RandomState`, so that threads can increment through their own 
> local RNG stream.
>
> I have made this change to the tests that fail under `pytest-run-parallel` 
> [7], and am working on making all test calls to `np.random` local.
> Note that `RandomState` uses the same MT RNG that the global RNG uses, so the 
> RNG streams are the same as before.
>
> ===============================
>
> -----------------------------------------
>> What Does This Mean Going Forward?
> -----------------------------------------
> By refactoring the test suite to be more thread safe, if we'd like to add 
> `pytest-run-parallel` CI, contributors may need to write tests in a somewhat 
> different style.
>
> It is possible for `pytest-run-parallel` to fix some of these issues on its 
> side, such as making it so xunit setup runs properly. However, with the 
> current state of pytest and the plugin, this will require a lot of work, 
> time, and maintenance. It also may make it more difficult to improve the 
> thread safety of pytest itself in the future. See this issue [8] for further 
> discussion. We think that a refactor of the NumPy test suite is more 
> straightforward for now, and can always be reverted once 
> `pytest-run-parallel` develops a way to handle thread-unsafe setup fixtures.
>
> --------------------
>> Testing Guidelines
> --------------------
> In addition to make the tests thread safe, I'd like to update the testing 
> guidelines [9]. Some things can be clarified (such as NumPy's opinion on 
> fixture usage) and it would be good to update it with current best practices, 
> and, if folks are open to it, guidelines on writing thread-safe tests.
>
> [1] https://github.com/Quansight-Labs/pytest-run-parallel
> [2] https://docs.pytest.org/en/stable/explanation/flaky.html#thread-safety
> [3] https://github.com/numpy/numpy/issues/29552 (mentioned by all PRs related 
> to this project)
> [4] https://docs.pytest.org/en/stable/how-to/xunit_setup.html
> [5] 
> https://numpy.org/doc/stable/reference/testing.html#easier-setup-and-teardown-functions-methods
> [6] https://docs.pytest.org/en/stable/explanation/fixtures.html
> [7] https://github.com/numpy/numpy/pull/29729
> [8] https://github.com/Quansight-Labs/pytest-run-parallel/issues/14
> [9] https://numpy.org/doc/stable/reference/testing.html
> _______________________________________________
> NumPy-Discussion mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
> https://mail.python.org/mailman3//lists/numpy-discussion.python.org
> Member address: [email protected]
_______________________________________________
NumPy-Discussion mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3//lists/numpy-discussion.python.org
Member address: [email protected]

[Numpy-discussion] Re: Improving the Thread Safety of NumPy's Test Suite

Reply via email to