AlexanderAshitkin commented on code in PR #394:
URL:
https://github.com/apache/maven-build-cache-extension/pull/394#discussion_r2423051347
##########
src/main/java/org/apache/maven/buildcache/CacheControllerImpl.java:
##########
@@ -498,25 +498,23 @@ public void save(
final MavenProject project = context.getProject();
final MavenSession session = context.getSession();
try {
+ attachedResourcesPathsById.clear();
+ attachedResourceCounter = 0;
+
final HashFactory hashFactory = cacheConfig.getHashFactory();
+ final HashAlgorithm algorithm = hashFactory.createAlgorithm();
final org.apache.maven.artifact.Artifact projectArtifact =
project.getArtifact();
- final List<org.apache.maven.artifact.Artifact> attachedArtifacts;
- final List<Artifact> attachedArtifactDtos;
- final Artifact projectArtifactDto;
- if (project.hasLifecyclePhase("package")) {
- final HashAlgorithm algorithm = hashFactory.createAlgorithm();
- attachGeneratedSources(project);
- attachOutputs(project);
- attachedArtifacts = project.getAttachedArtifacts() != null
- ? project.getAttachedArtifacts()
- : Collections.emptyList();
- attachedArtifactDtos = artifactDtos(attachedArtifacts,
algorithm, project);
- projectArtifactDto = artifactDto(project.getArtifact(),
algorithm, project);
- } else {
- attachedArtifacts = Collections.emptyList();
- attachedArtifactDtos = new ArrayList<>();
- projectArtifactDto = null;
- }
+ final boolean hasPackagePhase =
project.hasLifecyclePhase("package");
+
+ attachGeneratedSources(project);
+ attachOutputs(project);
Review Comment:
Conditioning this logic on `package` served two main purposes:
1. It prevented premature caching of outputs if the corresponding lifecycles
have not run. Doing otherwise could lead to costly and incorrect caching or
restoration of the previous branch's compilation output. Here’s a problematic
sequence:
- Compile Branch A and then checkout Branch B.
- Run `process-resources` in Branch B. Result: the compiled sources of
Branch A are cached, under checksum for Branch B.
- Run the compilation in Branch B. Result: the compiled classes from
Branch A are restored in Branch B, potentially interfering with the compilation
of Branch B.
Although such cached entries can be replaced later, it is still not
desirable.
2. Conditioning on `package` also reduces the number of caching and
unpacking operations. Specifically, it avoids the need to zip, download, or
upload files during every compilation, which helps maintain performance. When
an engineer is actively working on the code, is it really beneficial to cache
intermediate results? It is challenging to estimate how useful this would be in
practice, and I assume it won’t always be desirable.
Just thinking aloud, from the user's perspective, intentionally invoking
restoration seems to offer better usability. In that sense, Gradle cache model
is also flawed - saving a zip of classes with every compile task seems
excessive and results in the io/cpu spent on creation of unnecessary cache
entries that contain intermediate compilation results. In that sense `package`
is not that bad - it allows to throttle costly operation and run it together
with other costly IO work.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]