Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services
The problem of data transfer costs is not new in cloud environments. At work we usually just opt to pay for it, since dev time is scarcer. For private projects, though, I opt for aggressive (remote) caching: you can set up a global cache in Google Cloud Storage and more local caches wherever your executors are (reduces egress as much as possible). This setup works great with Bazel and Pants, among others. Note that these systems are pretty hermetic, in contrast to Meson.

IIRC Eric by now works at Google. They internally use Blaze, which AFAIK does aggressive caching too. So maybe using any of these systems would be a way of not having to sacrifice any of the current functionality. The downside is that you lose a bit of dev productivity, since you cannot eyeball your build definitions anymore.

my 2c

On Fri, 28 Feb 2020 at 20:34, Eric Anholt wrote:
> On Fri, Feb 28, 2020 at 12:48 AM Dave Airlie wrote:
> >
> > On Fri, 28 Feb 2020 at 18:18, Daniel Stone wrote:
> > >
> > > On Fri, 28 Feb 2020 at 03:38, Dave Airlie wrote:
> > > > b) we probably need to take a large step back here.
> > > >
> > > > Look at this from a sponsor POV, why would I give X.org/fd.o
> > > > sponsorship money that they are just giving straight to google to pay
> > > > for hosting credits? Google are profiting in some minor way from these
> > > > hosting credits being bought by us, and I assume we aren't getting any
> > > > sort of discounts here. Having google sponsor the credits costs google
> > > > substantially less than having any other company give us money to do
> > > > it.
> > >
> > > The last I looked, Google GCP / Amazon AWS / Azure were all pretty
> > > comparable in terms of what you get and what you pay for them.
> > > Obviously providers like Packet and Digital Ocean who offer bare-metal
> > > services are cheaper, but then you need to find someone who is going
> > > to properly administer the various machines, install decent
> > > monitoring, make sure that more storage is provisioned when we need
> > > more storage (which is basically all the time), make sure that the
> > > hardware is maintained in decent shape (pretty sure one of the fd.o
> > > machines has had a drive in imminent-failure state for the last few
> > > months), etc.
> > >
> > > Given the size of our service, that's a much better plan (IMO) than
> > > relying on someone who a) isn't an admin by trade, b) has a million
> > > other things to do, and c) hasn't wanted to do it for the past several
> > > years. But as long as that's the resources we have, then we're paying
> > > the cloud tradeoff, where we pay more money in exchange for fewer
> > > problems.
> >
> > Admin for gitlab and CI is a full-time role anyway. The system is
> > definitely not self-sustaining without time being put in by you and
> > anholt still. If we have $75k to burn on credits, and it was diverted
> > to just pay an admin to admin the real hw + gitlab/CI, would that not
> > be a better use of the money? I don't know if we can afford $75k for
> > an admin, but suddenly we can afford it for gitlab credits?
>
> As I think about the time that I've spent at google in less than a
> year on trying to keep the lights on for CI and optimize our
> infrastructure in the current cloud environment, that's more than the
> entire yearly budget you're talking about here. Saying "let's just
> pay for people to do more work instead of paying for full-service
> cloud" is not a cost optimization.
>
> > > Yes, we could federate everything back out so everyone runs their own
> > > builds and executes those. Tinderbox did something really similar to
> > > that IIRC; not sure if Buildbot does as well. Probably rules out
> > > pre-merge testing, mind.
> >
> > Why? does gitlab not support the model? having builds done in parallel
> > on runners closer to the test runners seems like it should be a thing.
> > I guess artifact transfer would then cost less as a result.
>
> Let's do some napkin math. The biggest artifacts cost we have in Mesa
> is probably meson-arm64/meson-arm (60MB zipped from meson-arm64,
> downloaded by 4 freedreno and 6ish lava, about 100 pipelines/day,
> makes ~1.8TB/month ($180 or so). We could build a local storage next
> to the lava dispatcher so that the artifacts didn't have to contain
> the rootfs that came from the container (~2/3 of the insides of the
> zip file), but that's another service to build and maintain. Building
> the drivers once locally and storing them would save downloading the
> other ~1/3 of the inside of the zip file, but that requires a big
> enough system to do builds in time.
>
> I'm planning on doing a local filestore for google's lava lab, since I
> need to be able to move our xml files off of the lava DUTs to get the
> xml results we've become accustomed to, but this would not bubble up
> to being a priority for my time if I wasn't doing it anyway. If it
> takes me a single day to set all this up (I estimate a couple of
> weeks), that costs
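[Editor's note: Eric's napkin math above can be spelled out; the egress price below is an assumed ~$0.10/GB, which is not stated in the thread.]

```python
# Napkin math for the meson-arm64 artifact egress, using the numbers
# from Eric's mail; egress_usd_per_gb is an assumption, not a quote.
artifact_mb = 60             # zipped meson-arm64 artifact size
downloads_per_pipeline = 10  # ~4 freedreno + ~6 lava jobs
pipelines_per_day = 100
days_per_month = 30
egress_usd_per_gb = 0.10     # assumed typical cloud egress rate

gb_per_month = (artifact_mb * downloads_per_pipeline
                * pipelines_per_day * days_per_month) / 1000
print(f"~{gb_per_month / 1000:.1f} TB/month, ~${gb_per_month * egress_usd_per_gb:.0f}/month")
# → ~1.8 TB/month, ~$180/month
```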
Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services
On Fri, Apr 3, 2020 at 7:12 AM Michel Dänzer wrote:
>
> On 2020-03-01 6:46 a.m., Marek Olšák wrote:
> > For Mesa, we could run CI only when Marge pushes, so that it's a strictly
> > pre-merge CI.
>
> Thanks for the suggestion! I implemented something like this for Mesa:
>
> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432

I wouldn't mind manually triggering pipelines, but unless there is some trick I'm not realizing, it is super cumbersome. I.e. you have to click first the container jobs.. then wait.. then the build jobs.. then wait some more.. and then finally the actual runners. That would be a real step back in terms of usefulness of CI.. one might call it a regression :-(

Is there a possible middle ground where pre-marge pipelines that touch a particular driver trigger that driver's CI jobs, but MRs that don't touch that driver but do touch shared code don't until triggered by marge? I.e. if I have an MR that only touches nir, it's probably ok to not run freedreno jobs until marge triggers it. But if I have an MR that is touching freedreno, I'd really rather not have to wait until marge triggers the freedreno CI jobs.

Btw, I was under the impression (from periodically skimming the logs in #freedesktop, so I could well be missing or misunderstanding something) that caching/etc had been improved and mesa's part of the egress wasn't the bigger issue at this point?

BR,
-R
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services
On Saturday, 4 April 2020 at 08:11 -0700, Rob Clark wrote:
> On Fri, Apr 3, 2020 at 7:12 AM Michel Dänzer wrote:
> > On 2020-03-01 6:46 a.m., Marek Olšák wrote:
> > > For Mesa, we could run CI only when Marge pushes, so that it's a strictly
> > > pre-merge CI.
> >
> > Thanks for the suggestion! I implemented something like this for Mesa:
> >
> > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432
>
> I wouldn't mind manually triggering pipelines, but unless there is
> some trick I'm not realizing, it is super cumbersome. I.e. you have to
> click first the container jobs.. then wait.. then the build jobs..
> then wait some more.. and then finally the actual runners. That would
> be a real step back in terms of usefulness of CI.. one might call it a
> regression :-(

On the GStreamer side we have moved some existing pipelines to manual mode. As we use needs: between jobs, we could simply set the first job to manual (in our case it's a single job called manifest; in your case it would be the N container jobs). This way you can have a manual pipeline that is triggered in a single click (or a few clicks). Here's an example:

https://gitlab.freedesktop.org/gstreamer/gstreamer/pipelines/128292

Those are our post-merge pipelines; we only trigger them if we suspect a problem.

> Is there a possible middle ground where pre-marge pipelines that touch
> a particular driver trigger that driver's CI jobs, but MRs that don't
> touch that driver but do touch shared code don't until triggered by
> marge? I.e. if I have an MR that only touches nir, it's probably ok to
> not run freedreno jobs until marge triggers it. But if I have an MR
> that is touching freedreno, I'd really rather not have to wait until
> marge triggers the freedreno CI jobs.
> Btw, I was under the impression (from periodically skimming the logs
> in #freedesktop, so I could well be missing or misunderstanding
> something) that caching/etc had been improved and mesa's part of the
> egress wasn't the bigger issue at this point?
>
> BR,
> -R
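[Editor's note: a minimal sketch of the needs:-based setup Nicolas describes; the job and script names are made up for illustration, not GStreamer's actual config. One manual gate job, with everything downstream chained via needs:, so a single click releases the whole pipeline.]

```
stages: [gate, container, build]

manifest:                 # the one job you click
  stage: gate
  when: manual
  allow_failure: false    # keeps the pipeline "blocked" (not "passed") until clicked
  script:
    - echo "pipeline released"

container-arm64:
  stage: container
  needs: [manifest]       # starts automatically once manifest has run
  script:
    - ./build-container.sh

build-arm64:
  stage: build
  needs: [container-arm64]
  script:
    - ./build-mesa.sh
```

Note that `allow_failure: false` matters: manual jobs default to `allow_failure: true`, which would let the pipeline report success without the gate ever being clicked.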
Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services
On Sat, Apr 4, 2020 at 10:47 AM Nicolas Dufresne wrote:
>
> On Saturday, 4 April 2020 at 08:11 -0700, Rob Clark wrote:
> > On Fri, Apr 3, 2020 at 7:12 AM Michel Dänzer wrote:
> > > On 2020-03-01 6:46 a.m., Marek Olšák wrote:
> > > > For Mesa, we could run CI only when Marge pushes, so that it's a
> > > > strictly pre-merge CI.
> > >
> > > Thanks for the suggestion! I implemented something like this for Mesa:
> > >
> > > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432
> >
> > I wouldn't mind manually triggering pipelines, but unless there is
> > some trick I'm not realizing, it is super cumbersome. I.e. you have to
> > click first the container jobs.. then wait.. then the build jobs..
> > then wait some more.. and then finally the actual runners. That would
> > be a real step back in terms of usefulness of CI.. one might call it a
> > regression :-(
>
> On the GStreamer side we have moved some existing pipelines to manual
> mode. As we use needs: between jobs, we could simply set the first job
> to manual (in our case it's a single job called manifest; in your case
> it would be the N container jobs). This way you can have a manual
> pipeline that is triggered in a single click (or a few clicks). Here's
> an example:
>
> https://gitlab.freedesktop.org/gstreamer/gstreamer/pipelines/128292
>
> Those are our post-merge pipelines; we only trigger them if we suspect
> a problem.

I'm not sure that would work for mesa since the hierarchy of jobs branches out pretty far.. i.e. if I just clicked the arm64 build + test container jobs, and everything else ran automatically after that, it would end up running all the CI jobs for all the arm devices (or at least all the 64b ones).

I'm not sure why gitlab works this way; a more sensible approach would be to click on the last jobs you want to run and have that automatically propagate up to run the jobs needed to run the clicked job.
BR,
-R
Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services
On Sat, Apr 4, 2020 at 11:16 AM Rob Clark wrote:
>
> On Sat, Apr 4, 2020 at 10:47 AM Nicolas Dufresne wrote:
> >
> > On Saturday, 4 April 2020 at 08:11 -0700, Rob Clark wrote:
> > > On Fri, Apr 3, 2020 at 7:12 AM Michel Dänzer wrote:
> > > > On 2020-03-01 6:46 a.m., Marek Olšák wrote:
> > > > > For Mesa, we could run CI only when Marge pushes, so that it's a
> > > > > strictly pre-merge CI.
> > > >
> > > > Thanks for the suggestion! I implemented something like this for Mesa:
> > > >
> > > > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432
> > >
> > > I wouldn't mind manually triggering pipelines, but unless there is
> > > some trick I'm not realizing, it is super cumbersome. I.e. you have to
> > > click first the container jobs.. then wait.. then the build jobs..
> > > then wait some more.. and then finally the actual runners. That would
> > > be a real step back in terms of usefulness of CI.. one might call it a
> > > regression :-(
> >
> > On the GStreamer side we have moved some existing pipelines to manual
> > mode. As we use needs: between jobs, we could simply set the first job
> > to manual (in our case it's a single job called manifest; in your case
> > it would be the N container jobs). This way you can have a manual
> > pipeline that is triggered in a single click (or a few clicks). Here's
> > an example:
> >
> > https://gitlab.freedesktop.org/gstreamer/gstreamer/pipelines/128292
> >
> > Those are our post-merge pipelines; we only trigger them if we suspect
> > a problem.
>
> I'm not sure that would work for mesa since the hierarchy of jobs
> branches out pretty far.. i.e. if I just clicked the arm64 build + test
> container jobs, and everything else ran automatically after that, it
> would end up running all the CI jobs for all the arm devices (or at
> least all the 64b ones)

update: pepp pointed out on #dri-devel that the path-based rules should still apply to prune out hw CI jobs for hw not affected by the MR.
If that is the case, and we only need to click the container jobs (without then doing the wait & click dance), then this doesn't sound as bad as I feared.

BR,
-R
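[Editor's note: the path-based pruning pepp refers to is GitLab's rules:changes mechanism; a rough sketch follows, with hypothetical paths and job names, not Mesa's actual config.]

```
test-freedreno-a630:
  stage: test
  rules:
    # run only when freedreno or shared compiler code is touched
    - changes:
        - src/freedreno/**/*
        - src/gallium/drivers/freedreno/**/*
        - src/compiler/**/*
      when: on_success
    - when: never   # otherwise the job is dropped from the pipeline
```

With rules like these, clicking the container gate jobs still only fans out to the hw jobs whose paths the MR actually touched.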
Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services
On Sat, Apr 4, 2020 at 11:41 AM Rob Clark wrote:
>
> On Sat, Apr 4, 2020 at 11:16 AM Rob Clark wrote:
> >
> > On Sat, Apr 4, 2020 at 10:47 AM Nicolas Dufresne wrote:
> > >
> > > On Saturday, 4 April 2020 at 08:11 -0700, Rob Clark wrote:
> > > > On Fri, Apr 3, 2020 at 7:12 AM Michel Dänzer wrote:
> > > > > On 2020-03-01 6:46 a.m., Marek Olšák wrote:
> > > > > > For Mesa, we could run CI only when Marge pushes, so that it's a
> > > > > > strictly pre-merge CI.
> > > > >
> > > > > Thanks for the suggestion! I implemented something like this for Mesa:
> > > > >
> > > > > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4432
> > > >
> > > > I wouldn't mind manually triggering pipelines, but unless there is
> > > > some trick I'm not realizing, it is super cumbersome. I.e. you have to
> > > > click first the container jobs.. then wait.. then the build jobs..
> > > > then wait some more.. and then finally the actual runners. That would
> > > > be a real step back in terms of usefulness of CI.. one might call it a
> > > > regression :-(
> > >
> > > On the GStreamer side we have moved some existing pipelines to manual
> > > mode. As we use needs: between jobs, we could simply set the first job
> > > to manual (in our case it's a single job called manifest; in your case
> > > it would be the N container jobs). This way you can have a manual
> > > pipeline that is triggered in a single click (or a few clicks). Here's
> > > an example:
> > >
> > > https://gitlab.freedesktop.org/gstreamer/gstreamer/pipelines/128292
> > >
> > > Those are our post-merge pipelines; we only trigger them if we suspect
> > > a problem.
> >
> > I'm not sure that would work for mesa since the hierarchy of jobs
> > branches out pretty far.. i.e.
> > if I just clicked the arm64 build + test container jobs, and
> > everything else ran automatically after that, it would end up running
> > all the CI jobs for all the arm devices (or at least all the 64b ones)
>
> update: pepp pointed out on #dri-devel that the path-based rules
> should still apply to prune out hw CI jobs for hw not affected by the
> MR. If that is the case, and we only need to click the container jobs
> (without then doing the wait & click dance), then this doesn't sound as
> bad as I feared.

PS. I should add that in these wfh days, I'm relying on CI to be able to test changes on some generations of hw that I don't physically have with me. It's easy to take for granted; I did, until I thought about what I'd do without CI. So big thanks to all the people who are working on CI, it's more important these days than you might realize :-)

BR,
-R