Hi,

Are these servers running multiple builds at a time, or is a Windows build given the full host resources, i.e. 8c/16t and 64GB of RAM? If you can monitor the resources in real time, it would be interesting to confirm whether CPU utilization is indeed significantly lower than 100%, which would mean something else is the bottleneck.
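For example, something like the sketch below could log CPU and RAM usage while a build runs. This is just a minimal sketch, assuming Python 3 and the psutil package are available on the builder; the sampling interval and output file name are placeholders, and you'd kill it once the build finishes:

    # log_build_load.py - minimal sketch: sample CPU and RAM usage during a build.
    # Assumes Python 3 and the psutil package are installed on the builder;
    # the interval and output file name are placeholders.
    import time
    import psutil

    INTERVAL = 5  # seconds between samples

    with open("build_load.csv", "w") as log:
        log.write("timestamp,cpu_percent,ram_used_gib\n")
        while True:
            cpu = psutil.cpu_percent(interval=INTERVAL)   # average CPU % over the interval
            ram = psutil.virtual_memory().used / 1024**3  # bytes -> GiB
            log.write(f"{int(time.time())},{cpu:.1f},{ram:.2f}\n")
            log.flush()

If the CPU column stays well below 100% for most of the build, that would confirm the bottleneck is elsewhere (most likely the filesystem).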
I'm not sure what the expected build time of a native (i.e. no Docker/VM/etc.) build on these servers is, but for reference, I'm seeing a bit less than 13 minutes on a stock Ryzen 9 9950X (a developer build, including test apps). I can see this number go way up with fewer cores, but still, over 100 minutes is very long.

Thanks,
Maciej

On Mon, Apr 21, 2025 at 9:15 PM Ben Cooksley <bcooks...@kde.org> wrote:

> On Tue, Apr 22, 2025 at 5:57 AM Dmitry Kazakov <dimul...@gmail.com> wrote:
>
>> Hi, Ben!
>
> Hey Dmitry,
>
>> As for Krita, most of the CI time is spent on the Windows pipeline, which builds extremely slowly due to some obscure filesystem issues (searching for includes is extremely slow). I personally don't know how to fix it. I tried: 1) PCH builds, 2) relative includes, 3) split debug info (dwo). The only solution left is to rewrite a huge portion of Krita to reduce the number of includes, which is, obviously, not an option at the moment.
>
> This is probably at least in part due to Windows on Docker having extremely poor file system performance even vs. straight NTFS (which isn't great to begin with).
> That will be fixed by VM-based CI (progress update - I have most of the tool that will manage the underlying base images written now, and just need to finish the VM provisioning part and give it some serious testing).
>
>> Another point that requires extra build time for Krita is an inappropriate timeout of 100 minutes. A lot of our Windows builds are terminated at around 95% completion because of this timeout, so we have to rerun them and, effectively, consume more and more CI time.
>
> Have you got a list of these so I can have a look to see if the timeout is set too low?
> Increasing the timeout is only a temporary fix though - we will need to find a solution for why the builds are taking so long.
>
>> ---
>> Dmitry Kazakov
>
> Cheers,
> Ben
>
>> On Fri, Apr 18, 2025 at 21:27 Ben Cooksley <bcooks...@kde.org> wrote:
>>
>>> Hi all,
>>>
>>> Over the past week or two there have been a number of complaints regarding CI builder availability, which I've done some investigating into this morning.
>>>
>>> Part of this is related to the Windows CI builders falling offline due to OOM events; however, the rest is simply due to a lack of available builder time (which is what this email focuses on).
>>>
>>> Given we have 6 Hetzner AX51 servers connected to Gitlab (each equipped with a Ryzen 7 7700 CPU, 64GB RAM and NVMe storage), the issue is not available build power - it is the number and length of the builds.
>>>
>>> This morning I ran a basic query to ascertain the top 20 projects for CI time utilisation on invent.kde.org, which revealed the following:
>>>
>>>  full_path                    |    time_used     | job_count
>>> ------------------------------+------------------+-----------
>>>  plasma/kwin                  | 320:47:04.966412 |      2387
>>>  graphics/krita               | 178:03:19.080763 |       423
>>>  multimedia/kdenlive          | 174:08:09.876842 |       697
>>>  network/ruqola               | 173:17:47.311305 |       555
>>>  plasma/plasma-workspace      | 155:10:03.618929 |       660
>>>  network/neochat              | 138:03:23.926652 |      1546
>>>  education/kstars             | 129:49:17.74229  |       329
>>>  sysadmin/ci-management       | 111:21:09.739792 |       154
>>>  plasma/plasma-desktop        | 108:56:52.849433 |       776
>>>  kde-linux/kde-linux-packages | 81:00:10.001937  |        33
>>>  kdevelop/kdevelop            | 59:40:51.54474   |       217
>>>  office/kmymoney              | 54:32:00.24623   |       271
>>>  frameworks/kio               | 53:54:19.046685  |       690
>>>  education/labplot            | 52:36:30.343671  |       245
>>>  murveit/kstars               | 52:32:56.882728  |       128
>>>  frameworks/kirigami          | 47:07:19.172935  |      1627
>>>  system/dolphin               | 46:09:58.02836   |       705
>>>  kde-linux/kde-linux          | 39:25:54.052469  |        46
>>>  utilities/kate               | 36:09:22.18958   |       356
>>>  wreissenberger/kstars        | 35:58:14.120515  |       105
>>>
>>> If we look closely, KStars has three spots on this list (totalling 216 hours of time used, making it the biggest app user of CI time).
>>>
>>> Projects on the above list are asked to please review their jobs and how they are conducting development to ensure CI time is used efficiently and appropriately.
>>>
>>> Other projects should also please review their usage and optimise accordingly even if they're not on this list, as there are efficiencies to be found in all projects.
>>>
>>> When reviewing the list of CI builds your project has enabled, it is important to consider to what degree the project benefits from each of them. One common pattern I've seen is having Alpine, SUSE Qt 6.9 and SUSE Qt 6.10 all enabled.
>>>
>>> If you need to verify building on Alpine / MUSL-type systems and wish to monitor for Qt Next regressions, then you probably don't also need a conventional Linux Qt stable build, as those two jobs between them already cover that set of permutations.
>>>
>>> I've taken a quick look at some of these and can suggest the following:
>>>
>>> KWin: has two conventional Linux jobs (suse_qt69 and suse_qt610) plus a custom reduced-feature-set job. It seems like one of the conventional Linux jobs should be dropped.
>>>
>>> KStars: appears to have a custom Linux job in addition to a conventional Linux job. Please choose one.
>>>
>>> Ruqola: appears to be following a development process whereby changes are made in stable and then immediately merged to master in an ever-continuing loop. Please discontinue this behaviour and only merge stable to master periodically.
>>> Also needs to drop one of its Linux jobs, as they duplicate functionality as noted above.
>>>
>>> Plasma Workspace/Desktop: at least in part this seems to be driven by Appium tests. Please reduce the number of these and/or streamline the process for running an Appium test. Consideration should also be given to enabling the CI option use-ccache.
>>>
>>> KDevelop: please enable the CI option use-ccache.
>>>
>>> Labplot: appears to have a strange customisation of the standard jobs in place, which shouldn't be necessary, as flags in .kde-ci.yml should permit the same thing.
>>>
>>> Thanks,
>>> Ben