Re: [Avocado-devel] Running tests in parallel

Zubair Lutfullah Kakakhel Wed, 23 Nov 2016 06:47:10 -0800

Hi,

On 11/23/2016 02:28 PM, Cleber Rosa wrote:


On 11/23/2016 07:07 AM, Zubair Lutfullah Kakakhel wrote:

Hi,

Thank you for your comprehensive reply!

Comments inline.

On 11/22/2016 02:11 PM, Cleber Rosa wrote:

On 11/22/2016 07:53 AM, Zubair Lutfullah Kakakhel wrote:

Hi,


Hi Zubair,

There are quite a few threads about this and a Trello card:
https://trello.com/c/xNeR2slj/255-support-running-tests-in-parallel

And the discussion leads to a complex multi-host RFC.
https://www.redhat.com/archives/avocado-devel/2016-March/msg00025.html

Our requirement is simpler.
All we want to do is run disjoint, simple tests (C executables) in
parallel.


Sounds fair enough.

I was wondering if somebody has a WIP branch that has some level of
implementation for this?


I'm not familiar with a WiP or PoC on this (yet).  If anyone has
experimented with it, I'd be happy to hear about it.

Or if somebody is familiar with the code base, I'd appreciate some
direction on how to implement this.


Avocado already runs every single test in a fresh new process.  This is,
at least theoretically, a good start.  Also, the test process is
handled based on the standard Python multiprocessing module:

https://github.com/avocado-framework/avocado/blob/master/avocado/core/runner.py#L363


The first experiment I'd try would be to use multiprocessing.Pool,
which is also part of the Python standard library:

https://docs.python.org/2.7/library/multiprocessing.html#using-a-pool-of-workers
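As a rough illustration of that experiment, here is a minimal sketch of a
pool of workers where each worker runs one test executable. It is not
Avocado code: run_test and run_tests_in_pool are hypothetical names, and it
assumes Python 3.4+ on a Unix-like host where the "fork" start method is
available.

```python
import multiprocessing
import subprocess
import sys


def run_test(command):
    """Run one test executable and return its exit code."""
    return subprocess.call(command)


def run_tests_in_pool(commands, workers=2):
    """Run the commands on a pool of worker processes (hypothetical helper)."""
    # Assumption: a Unix-like host, so the "fork" start method exists.
    context = multiprocessing.get_context("fork")
    with context.Pool(processes=workers) as pool:
        # map() blocks until all tests are done and keeps submission order.
        return pool.map(run_test, commands)


if __name__ == "__main__":
    # Stand-ins for real test executables: tiny scripts with fixed exit codes.
    demo = [[sys.executable, "-c", "import sys; sys.exit(%d)" % code]
            for code in (0, 1, 0, 2)]
    print(run_tests_in_pool(demo))
```

Note that every pool worker is itself a Python process, in addition to the
test executable it launches.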


In this case, there would be a separate python thread for each test
being run in parallel.
Each python thread would actually call the test executable using a
sub-process?


Ideally, the Avocado test runner would remain a single process, that is,
without one additional thread (or process) to manage each *test* process.

That can be OK for desktops but won't scale well for using Avocado on
memory-constrained embedded devices.


I must admit I haven't attempted to run Avocado in resource-constrained
environments.  Can you explain what your biggest concern is?


In our case, primarily memory, even for dormant processes, although CPU
usage is also a concern.

Imagine running Avocado on a slightly beefy WiFi router with 128 MB of RAM.
Even one Python process is slow/difficult. Run a few Python processes in
parallel and the kernel's out-of-memory killer starts killing processes.


Do you feel that Avocado (as a single process test *runner*) plus one
process for each *test* is not suitable to those environments?


Avocado should only be running one process ideally.
And each test should be running 'only' its process.

I think we've confused the dialogue with terminology.
Threads/processes/subprocesses/multiprocessing
I'll attempt to clarify.

My current understanding of Avocado

Avocado-runner parent process
runs - > Avocado test thread using multiprocessing.Process here [1]
         run - > Actual test executable using subprocess here [2]

Is this correct?
Is there a particular purpose the runner starts a separate thread which
actually calls the test executable?

Now coming back to running tests in parallel.

You mentioned using multiprocessing.Pool. In that case, there could be
a potential issue for constrained devices.
e.g. Running 4 tests in parallel.

Avocado-runner parent process
runs - > Avocado test thread using multiprocessing.Process here [1]
         run - > Actual test process using subprocess here [2]
runs - > Avocado test thread using multiprocessing.Process here [1]
         run - > Actual test process using subprocess here [2]
runs - > Avocado test thread using multiprocessing.Process here [1]
         run - > Actual test process using subprocess here [2]
runs - > Avocado test thread using multiprocessing.Process here [1]
         run - > Actual test process using subprocess here [2]

So 4 tests would actually result in 9 processes being created.
1 runner in Python
4 mostly dormant Python multiprocessing processes (their purpose is a bit
unclear)
4 actual executables.

Ideally, that number should be 5 for running 4 tests.
Avocado-runner parent process
run - > Actual test process using subprocess here [2]
run - > Actual test process using subprocess here [2]
run - > Actual test process using subprocess here [2]
run - > Actual test process using subprocess here [2]
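To make the five-process layout concrete, here is a minimal sketch in which
a single runner process starts every test executable directly and polls for
their exit codes. It is not how Avocado works today; run_children and
poll_interval are hypothetical names chosen for illustration.

```python
import subprocess
import sys
import time


def run_children(commands, poll_interval=0.05):
    """Start all commands at once; return exit codes in submission order."""
    # One Popen child per test, with no intermediate Python process.
    running = {index: subprocess.Popen(command)
               for index, command in enumerate(commands)}
    exit_codes = {}
    while running:
        for index, proc in list(running.items()):
            code = proc.poll()  # None while the child is still running
            if code is not None:
                exit_codes[index] = code
                del running[index]
        if running:
            time.sleep(poll_interval)  # avoid a busy loop in the runner
    return [exit_codes[index] for index in range(len(commands))]


if __name__ == "__main__":
    # Stand-ins for real C test executables.
    demo = [[sys.executable, "-c", "import sys; sys.exit(%d)" % code]
            for code in (0, 1, 2, 0)]
    print(run_children(demo))
```

With four commands this gives one runner plus four children. The demo
children are Python interpreters only so the example runs anywhere; real
tests would be the C executables themselves.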

I hope this doesn't look even more confusing :)

Regards,
ZubairLK

[1] https://github.com/avocado-framework/avocado/blob/master/avocado/core/runner.py#L363
[2] https://github.com/avocado-framework/avocado/blob/master/avocado/utils/process.py#L273


- Cleber.

Please correct me if I am reading this incorrectly.

Regards,
ZubairLK


This would most certainly lead to changes in how Avocado currently
serially waits for the test status:

https://github.com/avocado-framework/avocado/blob/master/avocado/core/runner.py#L403


Which ultimately is added to the (Job wide) results:

https://github.com/avocado-framework/avocado/blob/master/avocado/core/runner.py#L455


Since the results for many tests will now be acquired in unpredictable
order, this will require changes to the ResultEvent based plugins (such
as the UI).
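One possible way to shield such plugins from the unpredictable order,
sketched here purely as an assumption and not as an Avocado API, is a small
reorder buffer: results are accepted as they arrive but only released in
submission order. The name in_submission_order and the (index, status)
tuples are hypothetical.

```python
def in_submission_order(completions):
    """Yield (index, status) pairs in index order from out-of-order input."""
    pending = {}      # results that arrived early, keyed by test index
    next_index = 0    # the next test whose result may be released
    for index, status in completions:
        pending[index] = status
        # Release every result that is now contiguous with what was emitted.
        while next_index in pending:
            yield next_index, pending.pop(next_index)
            next_index += 1


if __name__ == "__main__":
    arrived = [(2, "PASS"), (0, "FAIL"), (1, "PASS"), (3, "PASS")]
    for index, status in in_submission_order(arrived):
        print(index, status)
```

The trade-off is latency: a slow first test holds back the reporting of
every later result, so plugins that can cope with unordered results may
prefer to consume them directly.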

Thanks

Regards,
ZubairLK


I hope this is a good initial set of pointers.  If you feel adventurous
and want to start hacking on this, you're more than welcome.

BTW: we've had quite a number of features that started as
experiments/ideas/not-really-perfect-pull-requests from the community
that Avocado "core team" members embraced and pushed all the way to
completeness.
