[ https://issues.apache.org/jira/browse/LUCENE-10195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17434267#comment-17434267 ]
Dawid Weiss edited comment on LUCENE-10195 at 10/26/21, 11:02 AM: ------------------------------------------------------------------ I did take a look at the patch. Thank you - some of the things look like nice improvements. This command is not what should be the benchmark though: {code} ./gradlew clean build -Ptests.seed=deadbeef -Ptests.nightly=false -Ptests.neverUpToDate=false --scan {code} instead, the second invocation of this (full incremental build) should be measured instead (tests are a separate story): {code} ./gradlew check -x test {code} I understand you're advocating for gradle cache and I think it's great but I don't think it should be the default setting - sorry, this is my honest opinion. Unless you have a bunch of corporate CI servers it'll only pollute your home with a gazillion of megabytes of data that will simply not be reused much. And we do want folks to run stuff in their own environments because this is a good regression test (different VMs, operating systems). If somebody wants a local cache, they can enable it but it shouldn't be forced down their throats. As for the recomendations, here are my thoughts. bq. Fixing tests.seed obviously help to benefit from up-to-date check, I get the point about randomization, but this is a trade-off with expensive cost of resources Yes, it is a tradeoff we're willing to take. Again - if somebody wants a locally fixed seed, they can do it. You'd be surprised how frequently those tests fail on boundary conditions in only certain environment combinations. bq. Use the standard Gradle wrapper There is a reason why non-standard wrapper is used - please look up the relevant issue in Jira (source release shouldn't ship a binary artifact). bq. Setup local gradle.properties in Gradle home folder rather than having an automatic generation from gradle/generation/local-settings.gradle There is a reason why gradle.properties is generated (it adjusts the defaults to the local machine). I wish gradle had a mechanism of tuning, say, max-workers dynamically, but I don't think it does. bq. Do not override Gradle daemon TTL to 15mn, this is way to short If you run the build regularily switching VMs then background gradle daemons eat up all your memory. So no, I don't think it's too short. bq. Do not create / commit generated test files to src directory (frenchArticles.txt, Top50KWiki.utf8, bq. CambridgeMA.utf8, Latin-dont-break-on-hyphens.rbbi) You don't understand why they're there. One of these generated files requires 16GB memory and over 15 minutes on a decent server to generate. Even if you use the gradle cache, the first run on your old-ish laptop will kill your build with an OOM. Some of these resources require specific environments (like Linux toolchain). I don't think there is a mechanism in gradle which would allow only regenerating these resources if their source input triggers actually change. I'll go through the changes you suggested and will cherry-pick some of the improvements, thank you. was (Author: dweiss): I did take a look at the patch. Thank you - some of the things look like nice improvements. This command is not what should be the benchmark though: {code} ./gradlew clean build -Ptests.seed=deadbeef -Ptests.nightly=false -Ptests.neverUpToDate=false --scan {code} instead, the second invocation of this (full incremental build) should be measured instead (tests are a separate story): {code} ./gradlew check -x test {code} I understand you're advocating for gradle cache and I think it's great but I don't think it should be the default setting - sorry, this is my honest opinion. Unless you have a corporate CI servers it'll only pollute your home with a gazillion of megabytes of data that will simply not be reused much. And we do want folks to run stuff in their own environments because this is a good regression test (different VMs, operating systems). If somebody wants a local cache, they can enable it but it shouldn't be forced down their throats. As for the recomendations, here are my thoughts. > Fixing tests.seed obviously help to benefit from up-to-date check, I get the > point about randomization, but this is a trade-off with expensive cost of > resources Yes, it is a tradeoff we're willing to take. Again - if somebody wants a locally fixed seed, they can do it. You'd be surprised how frequently those tests fail on boundary conditions in only certain environment combinations. > Use the standard Gradle wrapper There is a reason why non-standard wrapper is used - please look up the relevant issue in Jira (source release shouldn't ship a binary artifact). > Setup local gradle.properties in Gradle home folder rather than having an > automatic generation from gradle/generation/local-settings.gradle There is a reason why gradle.properties is generated (it adjusts the defaults to the local machine). I wish gradle had a mechanism of tuning, say, max-workers dynamically, but I don't think it does. > Do not override Gradle daemon TTL to 15mn, this is way to short If you run the build regularily switching VMs then background gradle daemons eat up all your memory. So no, I don't think it's too short. > Do not create / commit generated test files to src directory > (frenchArticles.txt, Top50KWiki.utf8, > CambridgeMA.utf8, Latin-dont-break-on-hyphens.rbbi) You don't understand why they're there. One of these generated files requires 16GB memory and over 15 minutes on a decent server to generate. Even if you use the gradle cache, the first run on your old-ish laptop will kill your build with an OOM. Some of these resources require specific environments (like Linux toolchain). I don't think there is a mechanism in gradle which would allow only regenerating these resources if their source input triggers actually change. I'll go through the changes you suggested and will cherry-pick some of the improvements, thank you. > Gradle build speed improvement > ------------------------------ > > Key: LUCENE-10195 > URL: https://issues.apache.org/jira/browse/LUCENE-10195 > Project: Lucene - Core > Issue Type: Improvement > Reporter: Jerome Prinet > Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Increase Gradle build speed with help of Gradle built-in features, mostly > cache and up-to-date checks > -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org