On Tue, Apr 22, 2025 at 9:53 AM Maciej Jesionowski <yav...@gmail.com> wrote:

> Hi,
>

Hi Maciej,


> Are these servers running multiple builds at a time, or is a Windows build
> given the full host resources, i.e. 8c/16t and 64GB of RAM? If you can
> monitor the resources in real time, it would be interesting to confirm if
> indeed the CPU utilization is significantly lower than 100%, meaning
> something else is bottlenecking it.
>

There is a reasonable probability that the systems are processing multiple
builds at any given moment; during normal working hours in Europe that is
almost guaranteed to be the case.

I am almost certain that Krita's build times on Windows are being adversely
affected by Docker on Windows inefficiencies, and the speedups are likely to
be significant if what I saw with Craft builds is anything to go by.
(Craft builds would essentially be unworkable without the punch-through we
currently provide.)
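For anyone who wants to quantify the filesystem overhead rather than take my word for it, a quick micro-benchmark along these lines (the path is a placeholder; run it once on the host and once inside the container) would show the gap. This is just a sketch, not something we run in CI:

```python
# Rough proxy for compiler include searches: stat every header file
# under a source tree and time it. Comparing the host figure against
# the in-container figure shows how much the Docker layer costs.
import os
import time

def stat_headers(root):
    count = 0
    start = time.perf_counter()
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.endswith((".h", ".hpp")):
                os.stat(os.path.join(dirpath, name))
                count += 1
    return count, time.perf_counter() - start

# e.g. stat_headers("C:/path/to/krita")  # path is a placeholder
```

If the in-container number is several times the host number, that points squarely at the container filesystem rather than the compiler.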


>
> I'm not sure what's the expected build time of a native (i.e. no
> docker/VM/etc.) build on these servers, but for reference, I'm seeing a bit
> less than 13 minutes on a stock Ryzen 9 9950X (a developer build, including
> test apps). I can see this number go way up with less cores, but still,
> over 100 minutes is very long.
>

Is that a clean build or an incremental build?
These servers are https://www.hetzner.com/dedicated-rootserver/ax52/ for
the record.
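For an apples-to-apples comparison on one of these hosts, even a trivial timing wrapper (the build command shown is only a placeholder) would be enough to separate the clean and incremental numbers:

```python
# Time a build command so clean and incremental runs can be compared
# on the same machine under the same load.
import subprocess
import time

def timed(cmd, cwd="."):
    start = time.perf_counter()
    subprocess.run(cmd, cwd=cwd, check=True)
    return time.perf_counter() - start

# e.g. timed(["cmake", "--build", "build", "--parallel"])  # placeholder
```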


> Thanks,
> Maciej.
>

Cheers,
Ben


>
> On Mon, Apr 21, 2025 at 9:15 PM Ben Cooksley <bcooks...@kde.org> wrote:
>
>> On Tue, Apr 22, 2025 at 5:57 AM Dmitry Kazakov <dimul...@gmail.com>
>> wrote:
>>
>>> Hi, Ben!
>>>
>>
>> Hey Dmitry,
>>
>>
>>>
>>> As for Krita, most of the CI time is spent on the Windows pipeline, which
>>> builds extremely slowly due to some obscure filesystem issues (searching
>>> includes is extremely slow). I personally don't know how to fix it. I
>>> tried: 1) PCH builds, 2) relative includes, 3) split debug info (dwo). The
>>> only solution left is to rewrite a huge portion of Krita to reduce the
>>> amount of includes. Which is, obviously, not an option atm.
>>>
>>
>> This is probably at least in part due to Windows on Docker having
>> extremely poor file system performance, even vs. straight NTFS (which
>> isn't great to begin with).
>> That will be fixed by VM-based CI (progress update - I have most of the
>> tool that will manage the underlying base images written now; I just need
>> to finish the VM provisioning part and give it some serious testing).
>>
>>
>>>
>>> Another point that requires extra build time for Krita is an
>>> inappropriate timeout of 100 minutes. A lot of our Windows builds are
>>> terminated at around 95% completion because of this timeout, so we have to
>>> rerun them and, effectively, consume more and more CI time.
>>>
>>
>> Have you got a list of these so I can have a look to see if the timeout
>> is set too low?
>> Increasing the timeout is only a temporary fix though - we will need to
>> find a solution to why the build time is taking so long.
>>
>>
>>>
>>> ---
>>> Dmitry Kazakov
>>>
>>
>> Cheers,
>> Ben
>>
>>
>>>
>>> пт, 18 апр. 2025 г., 21:27 Ben Cooksley <bcooks...@kde.org>:
>>>
>>>> Hi all,
>>>>
>>>> Over the past week or two there have been a number of complaints
>>>> regarding CI builder availability, which I've been investigating this
>>>> morning.
>>>>
>>>> Part of this is related to the Windows CI builders falling offline due
>>>> to OOM events, however the rest is simply due to a lack of builder time
>>>> availability (which is what this email is focused on).
>>>>
>>>> Given we have 6 Hetzner AX51 servers connected to Gitlab (each equipped
>>>> with a Ryzen 7 7700 CPU, 64GB RAM and NVMe storage) the issue is not
>>>> available build power - it is the number of builds and the length of those
>>>> builds that is at issue.
>>>>
>>>> This morning I ran a basic query to ascertain the top 20 projects for
>>>> CI time utilisation on invent.kde.org which revealed the following:
>>>>
>>>>           full_path           |    time_used     | job_count
>>>> ------------------------------+------------------+-----------
>>>> plasma/kwin                  | 320:47:04.966412 |      2387
>>>> graphics/krita               | 178:03:19.080763 |       423
>>>> multimedia/kdenlive          | 174:08:09.876842 |       697
>>>> network/ruqola               | 173:17:47.311305 |       555
>>>> plasma/plasma-workspace      | 155:10:03.618929 |       660
>>>> network/neochat              | 138:03:23.926652 |      1546
>>>> education/kstars             | 129:49:17.74229  |       329
>>>> sysadmin/ci-management       | 111:21:09.739792 |       154
>>>> plasma/plasma-desktop        | 108:56:52.849433 |       776
>>>> kde-linux/kde-linux-packages | 81:00:10.001937  |        33
>>>> kdevelop/kdevelop            | 59:40:51.54474   |       217
>>>> office/kmymoney              | 54:32:00.24623   |       271
>>>> frameworks/kio               | 53:54:19.046685  |       690
>>>> education/labplot            | 52:36:30.343671  |       245
>>>> murveit/kstars               | 52:32:56.882728  |       128
>>>> frameworks/kirigami          | 47:07:19.172935  |      1627
>>>> system/dolphin               | 46:09:58.02836   |       705
>>>> kde-linux/kde-linux          | 39:25:54.052469  |        46
>>>> utilities/kate               | 36:09:22.18958   |       356
>>>> wreissenberger/kstars        | 35:58:14.120515  |       105
>>>>
>>>> If we look closely, KStars has three spots on this list (totalling
>>>> roughly 218 hours of time used, making it the biggest app user of CI
>>>> time).
>>>>
>>>> Projects on the above list are asked to please review their jobs and
>>>> how they are conducting development to ensure CI time is used efficiently
>>>> and appropriately.
>>>>
>>>> Other projects should also please review their usage and optimise
>>>> accordingly, even if they're not on this list, as there are efficiencies
>>>> to be found in all projects.
>>>>
>>>> When reviewing the list of CI builds projects have enabled, it is
>>>> important to consider to what degree your project benefits from having
>>>> various builds enabled. One common pattern I've seen is having Alpine,
>>>> SUSE Qt 6.9 and SUSE Qt 6.10 all enabled.
>>>>
>>>> If you need to verify building on Alpine / MUSL type systems and wish
>>>> to monitor for Qt Next regressions, then you probably don't also need a
>>>> conventional Linux Qt stable build, as those two jobs between them
>>>> already cover that set of permutations.
>>>>
>>>> I've taken a quick look at some of these and can suggest the following:
>>>>
>>>> KWin: it has two conventional Linux jobs (suse_qt69 and suse_qt610)
>>>> plus a custom reduced feature set job. It seems like one of these
>>>> conventional Linux jobs should be dropped.
>>>>
>>>> KStars: Appears to have a custom Linux job in addition to a
>>>> conventional Linux job. Choose one please.
>>>>
>>>> Ruqola: Appears to be conducting a development process whereby changes
>>>> are made in stable then immediately merged to master in an
>>>> ever-continuing loop. Please discontinue this behaviour and only
>>>> periodically merge stable to master.
>>>>
>>>> It also needs to drop one of its Linux jobs, as they're duplicating
>>>> functionality as noted above.
>>>>
>>>> Plasma Workspace/Desktop: At least in part this seems to be driven by
>>>> Appium tests. Please reduce the number of these and/or streamline the
>>>> process for running an Appium test. Consideration should be given to
>>>> enabling the CI option use-ccache as well.
>>>>
>>>> KDevelop: Please enable the CI option use-ccache.
>>>>
>>>> Labplot: Appears to have a strange customisation of the standard jobs
>>>> in place, which shouldn't be necessary, as flags in .kde-ci.yml should
>>>> permit the same result.
>>>>
>>>> Thanks,
>>>> Ben
>>>>
>>>
