(Accidentally sent previous email.) To conclude my code snippet:
// ... auto step1Future = QtConcurrent::map(tiles, processTile): step1Watcher.setFuture(step1Future); // the charade with "internal" future is needed to get the progress from the second stage. // I think there's no simpler way currently - except for writing own map function? QFuture<void> step2FutureInternal; /* auto step2Future = */ step1Future.then( QtFuture::Launch::Inherit, [&] () mutable { step2FutureInternal = QtConcurrent::map(tiles, convertToGrayScale); } ); step2Watcher.setFuture(step2FutureInternal); // note that we use _internal_ version here If you want to convert your image to grayscale in place that should in principle work. If you want the result in a separate image, you'd probably need some more boilerplate though. -- Best Regards, Andrei ________________________________ From: Andrei Golubev <andrei.golu...@qt.io> Sent: Tuesday, February 1, 2022 8:48 AM To: Scott Bloom <sc...@towel42.com>; Murphy, Sean <sean.mur...@centauricorp.com>; interest@qt-project.org <interest@qt-project.org> Subject: Re: [Interest] [External]Re: How to get QtConcurrent to do what I want? Hi, I am not 100% certain what reduction does in QtConcurrent, but guessing this is generally not what you need (there's no "convert N to 1", but "convert N to N"). Also, is QtConcurrent's reduction concurrent? The docs [1] state that it is not, so you are likely to get a sequential second pass over the image. +1 to Scott's "remove QSharedPointer and QObject", IMO Tile class should be POD type (plain-old-data) needed as an abstraction of an image's region of interest. (I hope you don't copy the image data around by the way). QFutureWatcher is also a good thing, in principle it should also give you the progress of the QtConcurrent::map() and friends - you could use that for visualization purposes. I would've experimented with QFuture continuations. // ... auto step1Future = QtConcurrent::map(tiles, processTile): step1Watcher.setFuture(step1Future); auto step2Future = step1Future.then(QtFuture::Launch::Inherit, [&] () { QtConcurrent::map [1]: https://doc.qt.io/qt-6/qtconcurrent.html#mappedReduced QtConcurrent Namespace | Qt Concurrent 6.2.2<https://doc.qt.io/qt-6/qtconcurrent.html#mappedReduced> QtConcurrent Namespace. The QtConcurrent namespace provides high-level APIs that make it possible to write multi-threaded programs without using low-level threading primitives. doc.qt.io -- Best Regards, Andrei ________________________________ From: Interest <interest-boun...@qt-project.org> on behalf of Scott Bloom <sc...@towel42.com> Sent: Monday, January 31, 2022 9:42 PM To: Murphy, Sean <sean.mur...@centauricorp.com>; interest@qt-project.org <interest@qt-project.org> Subject: Re: [Interest] [External]Re: How to get QtConcurrent to do what I want? Not knowing if a partial value makes any sense to your system. Qt::Concurrent::mappedReduced might make more sense, if its purely a speedup you are looking for, and not a "keep the GUI alive during it" possibly blockingMappedReduced. If you need the gui, setting up a qfuturewatcher on the results of the mapped call, would be my approach QFutureWatcher< XXX > watcher; connect( &watcher, &finished, manager, &handleFinished); auto future = QtConcurrent::mapReduced(tiles, processTile, mergeFunction) watcher.setFuture( future ); -----Original Message----- From: Scott Bloom Sent: Monday, January 31, 2022 12:35 PM To: Murphy, Sean <sean.mur...@centauricorp.com>; interest@qt-project.org Subject: RE: [Interest] [External]Re: How to get QtConcurrent to do what I want? Ahh that does fill in a bunch of holes 😊 I would definitely scrap using QSharedPointer as well as QObject, and instead monitoring of the QObject, montor the QFuture that is returned from your map call. Scott -----Original Message----- From: Interest <interest-boun...@qt-project.org> On Behalf Of Murphy, Sean Sent: Monday, January 31, 2022 9:13 AM To: interest@qt-project.org Subject: Re: [Interest] [External]Re: How to get QtConcurrent to do what I want? Thanks for the reply, Scott. My responses below: > Couple things I would try. > > First, preallocate the size of the vector, or use a list if you don't > need random access into it. I had attempted this before my original post. My sequence of attempts before I posted went like this: 1. Started with QList<QSharedPointer<tile>>, and calling list.append() to populate. Noticed this was somewhat slow 2. Switched to QVector<QSharedPointer<tile>>, still calling list.append() to see if that made any difference. It didn't. 3. Realized that calling append() on a QVector could be slow as the vector keeps growing and is forced to reallocate, so I added a QVector::reserve() call beforehand to prevent that. Still made no appreciable difference I neglected to add the reserve() call to the code I posted here, so that's on me for not painting the full picture - more on that below. From the attempts above it didn't seem like the type of container, nor how I build it up were the problem. > Second, just send, pos, size into the tile. Only save the values in > the constrctor. When the worker thread kicks off on a tile, then > initialize it and do any computation. This includes the allocation of the > submatrix. I think this is what I'm already doing, correct? My tile constructor only assigns those values and does nothing else (ACTUAL code below, not edited pseudocode like the previous email): tile::tile(int id, QPoint pos, int size, QObject *parent) : QObject(parent), mID(id), mPos(pos), mSize(size) { } One thing to note here that I didn't include in my previous email, and as I keep investigating is starting to look like the real cause of slow allocation/assignment loop, my tile class inherits from QObject. My original thought was that I wanted to be able to use signals to monitor what's going on inside of the tile class, but I'm beginning to rethink whether I really need that at all, especially for the hassles that it causes: 1. Creating 60,000 QObjects in a single thread appears to be slow a. I've since tested creating a 60,000 item QVector<int> AND looping through it assigning each item the value 0-59999, and using QElapsedTimer to measure that takes 0.3 ms vs. 5 seconds or so to execute my current loop 2. Since QObject doesn't have a copy constructor, but QtConcurrent expects a container of type T items not T* items, I chose to wrap my tile* within a QSharedPointer to get everything to work out. As you point out below, I may have chosen the wrong smart pointer type, since I'm not really sharing the pointer around, but my goal there was to satisfy what QtConcurrent expects of the container you pass in to the map() function > > Also, does it need to be a shared pointer? Its clear who owns the > pointers, they aren't being "shared" as much as use. For me, I only > use shared pointers, when multiple objects can "own" the pointer. In this > case, the tile manager owns the tiles. From this comment, I'm realizing that posting pseudocode/stripped down code previously was a mistake as it was only painting a partial picture. The actual loop is this: // generate each tile with its assignment for(int i=0; i < nTiles.height(); ++i) { for(int j=0; j < nTiles.width(); ++j) { int id = i * nTiles.width() + j; QSharedPointer<tile> t(new tile(id, QPoint(j * tileSize, i * tileSize), tileSize, this)); connect(t.get(), &tile::loadingDone, this, &tileManager::tileLoaded); connect(t.get(), &tile::remappingDone, this, &tileManager::tileRemapped); mTiles.append(t); } } So there you can see that the tileManager class owns the shared pointer to each tile, and I also connect up a couple signals which are used to convey status back to the tileManager which in turn emits status signals back up to my UI class so it can show progress bars to the user that something is actually happening since this is a lengthy operation. > > Mainly, do as much as possible to reduce the time in the nested loop. I'm not sure there's anything else I can do to reduce the time in the loop, beyond changing the tile class to no longer inherit from QObject, and just have it be a pure data class, but I'm open to any suggestions. I probably should have mentioned in an earlier email that this is my first attempt at using QtConcurrent (if that wasn't obvious already), so I'm extremely open to any suggestions of refactoring that helps out this process. Disregarding anything I'm currently attempting, at a high level, this is the problem I'm trying to solve: 1. We have a large file on disk that contains data that we want to convert to a grayscale image to display to the user. This data is organized similar to an image in the sense that it can be thought of as a rectangular grid of "pixels", but the dynamic range of data pixel values is much larger than 0-255. 2. Converting the raw data to grayscale is a two step process: a. Loop through every data pixel to find the minimum and maximum pixel values in the entire image b. Loop through every pixel a second time, mapping the real dynamic range found in step a) to the grayscale range of 0-255 So 2a and 2b both lend themselves to parallelization with a necessary synchronization step in between them - I can't start the grayscale remapping conversion until I've determined the true dynamic range of the original data. Thanks to everyone that has responded so far... _______________________________________________ Interest mailing list Interest@qt-project.org https://lists.qt-project.org/listinfo/interest _______________________________________________ Interest mailing list Interest@qt-project.org https://lists.qt-project.org/listinfo/interest
_______________________________________________ Interest mailing list Interest@qt-project.org https://lists.qt-project.org/listinfo/interest