Re: [Interest] [External]Re: How to get QtConcurrent to do what I want?

Murphy, Sean Sun, 30 Jan 2022 18:09:44 -0800

Right now, I'm not even actually extracting anything from the original image, 
so far I'm just kind of setting up my proof-of-concept that I'm going down the 
right path.


In my current code, the tile constructor is literally empty (other than the 
automatic assignments to mPos & mSize) and my tile::process() is just a dummy 
function that randomly selects a certain number of milliseconds to sleep to 
simulate that it's doing some work:

void tile::process()
{
    // mRNG below is a QRandomNumberGenerator
    int delay = mRNG.bounded(5, 21);
    QThread::msleep(delay);
}

My current testing involves around 60,000 tiles (which is quite realistic 
given our actual image dimensions), and on my fairly old & slow laptop, that 
tile creation loop in tileManager::setup() takes around 5 seconds to create all 
60,000 tiles. So that's 5 seconds since the very first tile was created before 
that first tile even begins to process its subset region from the original 
image. I'd like to see what I can do to change that, so that as soon as the 
first tile is created, it starts processing its subset, and each subsequent 
tile begins processing its own subset region as soon as it is created, thereby 
making better use of the 5 seconds it takes to create 60,000 tiles. But as far 
as I can tell (and let me know if any of these are false):

  1.  QtConcurrent::map(tiles, processTile) needs a fully populated list to 
begin
  2.  Once QtConcurrent::map() begins executing processTile() in parallel on 
each tile, there's no way to tell which tile in the list you're working on 
other than whatever information you already populated in the tile itself before 
you called QtConcurrent::map()
  3.  There's no way to make processTile() contain any automatically 
incrementing arguments that would allow you to detect which iteration you're 
currently on

Assuming I have all those correct, I'm not seeing how I can spread out the tile 
CREATION over multiple threads in a way that correctly assigns the tile's 
subset region.
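
One possible way around points 2 and 3, offered as a sketch rather than a tested solution: if the sequence handed to QtConcurrent holds plain integer indices (0 to N-1) instead of tiles, each call can recover its own position from the index it receives, so the tile could be constructed inside the worker function. The names `TilePos` and `tilePositionForIndex` are hypothetical, and `TilePos` stands in for QPoint only so that the snippet needs no Qt headers. The index-to-position arithmetic assumes the same row-major order as the nested loop in tileManager::setup().

```cpp
#include <cassert>

// Hypothetical stand-in for QPoint so the sketch compiles without Qt.
struct TilePos {
    int x;
    int y;
};

// Map a row-major tile index to the pixel position of that tile's
// top-left corner. gridWidth is the number of tiles per row.
TilePos tilePositionForIndex(int index, int gridWidth, int tileSize)
{
    assert(index >= 0 && gridWidth > 0 && tileSize > 0);
    const int row = index / gridWidth;  // plays the role of i in the nested loop
    const int col = index % gridWidth;  // plays the role of j in the nested loop
    return TilePos{col * tileSize, row * tileSize};
}
```

A vector of integers still has to be filled before the concurrent call, but that is likely much cheaper than allocating one tile object per entry; whether it removes the 5-second delay would need to be measured.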

Since this is just a proof-of-concept, I'm happy to share the whole project 
(once I clean it up a little...). I'm also willing to refactor the design if 
needed. I'm not sure what other options there are. The list that I pass to 
QtConcurrent::map() is always going to be 60,000 items in my example project, 
so even if I'm creating QPoints instead of tiles, I still need to create 60,000 
of them. And the number of tiles in the real world will vary of course, but 
I'm betting that 60,000 tiles isn't our maximum count.

The other idea I'm considering is creating a tileManagerManager class that 
breaks up the main image into smaller (but still quite large) regions, each 
region managed by a single tileManager as we've implemented above. For example 
instead of creating one tileManager object that processes 60,000 tiles as I'm 
doing now, I could create 60 tileManagers that are each processing 1000 tiles. 
They'd still have the same for loop, but now the first tile created would only 
need to wait for 999 more tiles to be created before it starts processing, 
instead of 59,999.
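
The batching idea can be sketched independently of Qt. The snippet below only shows how 60,000 tile indices might be divided into contiguous batches of 1000; `TileBatch` and `splitIntoBatches` are hypothetical names, and how each batch would then be handed to a tileManager is left open.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Half-open range [begin, end) of row-major tile indices.
struct TileBatch {
    int begin;
    int end;
};

// Divide tileCount indices into contiguous batches of at most batchSize.
// The last batch is shorter when tileCount is not a multiple of batchSize.
std::vector<TileBatch> splitIntoBatches(int tileCount, int batchSize)
{
    assert(tileCount >= 0 && batchSize > 0);
    std::vector<TileBatch> batches;
    batches.reserve((tileCount + batchSize - 1) / batchSize);
    for (int begin = 0; begin < tileCount; begin += batchSize) {
        batches.push_back({begin, std::min(begin + batchSize, tileCount)});
    }
    return batches;
}
```

With 60,000 tiles and a batch size of 1000 this yields 60 batches, matching the example above.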

Sean
________________________________
From: Interest <interest-boun...@qt-project.org> on behalf of Tony Rietwyk 
<t...@rightsoft.com.au>
Sent: Sunday, January 30, 2022 8:26 PM
To: interest@qt-project.org <interest@qt-project.org>
Subject: Re: [Interest] [External]Re: How to get QtConcurrent to do what I want?


That looks OK.  Why does the tile object creation take so long?  Is all of 
the image handling in tile::process, or does the tile constructor extract from 
the original?


Regards, Tony


On 31/01/2022 11:59 am, Murphy, Sean wrote:
Thanks for the response, but I'm not following your suggestion - or at least 
I'm not seeing how it's different than what I'm doing? Maybe a little 
pseudocode will help. Here's what I'm currently doing:

Tile class:
private:
  QPoint mPos;
  int mSize;

tile::tile(QPoint pos, int size) :
    mPos(pos),
    mSize(size)
{
  // assigns this tile an mSize x mSize square
  // from the original image starting at mPos
  // pixel location in the original image
}

void tile::process()
{
  // does the work on the assigned subset
}


TileManager:
private:
  QVector<QSharedPointer<tile>> mTiles;

void processTile(QSharedPointer<tile>& t)
{
  t->process();
}

void tileManager::setup(QSize tileGrid, int tileSize)
{
  // generate each tile with its assignment
  for(int i=0; i < tileGrid.height(); ++i)
  {
    for(int j=0; j < tileGrid.width(); ++j)
    {
      // create the new tile and assign it its
      // region of the original image
      QSharedPointer<tile> t(new tile(
                   QPoint(j * tileSize, i * tileSize),
                   tileSize));
      mTiles.append(t);
    }
  }
  QtConcurrent::map(mTiles, processTile);
}

So I think I'm already doing what you're saying? Where I'm paying the penalty 
is that the allocation of each tile is happening in one thread and I'd like to 
see if I can thread out the object creation. But I don't see how to 
simultaneously thread out the tile object creation AND correctly assign the 
tile its location since as far as I can tell, when QtConcurrent executes 
tileManager's processTile function in parallel there's nothing I can poll 
inside tileManager::processTile() that allows me to know WHICH step I'm at.

Or am I misunderstanding what you're saying?

The best thing I can come up with is that maybe I could change the type of my 
mTiles vector to be a QVector<QPoint>, but then I'd still need to loop through 
the nested for-loops to populate all the QPoint items in the vector I want to 
pass to QtConcurrent::map(). I haven't tried that yet to see if generating 
thousands of QPoint objects is faster than generating the same number of tiles, 
but I can test that out.

Sean
________________________________
From: Interest <interest-boun...@qt-project.org> on behalf of Tony Rietwyk 
<t...@rightsoft.com.au>
Sent: Sunday, January 30, 2022 7:19 PM
To: interest@qt-project.org <interest@qt-project.org>
Subject: [External]Re: [Interest] How to get QtConcurrent to do what I want?


Hi Sean,


Can you use the position of the tile as a unique key?  Then the manager only 
needs to calculate each tile's position in the original image.  Each tile 
extracts the bits, processes and notifies the result with its position.


Regards, Tony



On 31/01/2022 10:06 am, Murphy, Sean wrote:
I'm hitting a design issue with the way I'm using the QtConcurrent module to do 
some image processing, and I'm wondering if someone can give some pointers?

At a high level, the software needs to do some processing on every pixel of an 
image. The processing can mostly be done in parallel, so I've created the 
following:

  1.  Tile class - responsible for doing the processing on a small subset of 
the original image
     *   Has a constructor that takes a Position and Size. From those 
parameters, the Tile knows what subset of the original image it is going to 
process
     *   Has a process() function which will do the work on those assigned 
pixels
  2.  TileManager class - responsible for managing the Tile objects
     *   Contains a for-loop that creates each Tile object, assigns it a unique 
Position, and adds it to the QVector<Tile> vector
     *   Has a processTile(Tile& t) function which calls t.process() to tell a 
given Tile to begin its work
     *   Calls QtConcurrent::map(tiles, processTile) to process each tile

So far this works well, but as I was timing different parts of the codebase, I 
discovered that a large portion of the time is spent allocating the 
QVector<Tile> vector (step 2a above) before I get to the concurrent processing 
call. The reason why is obvious to me - I need to ensure that each tile is 
created with a unique assignment and as far as I can see, that needs to happen 
in a single thread? If I could instead pass off the Tile creation to the 
parallel processing step, I might be able to improve the overall performance, 
but I don't see a way around it within the QtConcurrent framework.

How can I go about creating Tile objects in parallel AND ensure that each of 
them gets a unique Position assignment? I could easily move the Tile allocation 
into processTile(), but if I do that, I don't see a way to make the unique 
position assignment since I don't see how a given call to processTile() would 
know where it is in the overall parallelization sequence to determine what 
Position to assign to the Tile it creates. If I were using something like CUDA, 
I could use things like blockIdx and threadIdx to do that, but as far as I can 
see, those concepts don't exist (or at least aren't exposed) in QtConcurrent.

Any thoughts?
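
For comparison with blockIdx/threadIdx, here is a generic, non-Qt sketch of one way workers can each obtain a unique index: a shared atomic counter that every worker increments to claim its next item. It uses std::thread rather than QtConcurrent purely to stay self-contained, and `claimIndicesInParallel` is a hypothetical name. It illustrates the counter technique only and is not a statement about what QtConcurrent provides.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Each worker repeatedly claims the next unclaimed index from a shared
// atomic counter. Returns how many times each index was claimed; every
// entry should be exactly 1.
std::vector<int> claimIndicesInParallel(int itemCount, int workerCount)
{
    assert(itemCount >= 0 && workerCount > 0);
    std::vector<int> claims(itemCount, 0);
    std::atomic<int> next{0};

    auto worker = [&]() {
        for (;;) {
            const int index = next.fetch_add(1);  // unique per call
            if (index >= itemCount) {
                break;
            }
            // Only the worker that claimed this index touches this element,
            // so no two threads write to the same slot.
            ++claims[index];
        }
    };

    std::vector<std::thread> threads;
    for (int w = 0; w < workerCount; ++w) {
        threads.emplace_back(worker);
    }
    for (auto& t : threads) {
        t.join();
    }
    return claims;
}
```

In a tile setting, the claimed index could be converted to a Position with integer division and remainder by the grid width, so that each worker builds and processes its own tile.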



_______________________________________________
Interest mailing list
Interest@qt-project.org
https://lists.qt-project.org/listinfo/interest
