AlexanderAshitkin commented on code in PR #394:
URL: 
https://github.com/apache/maven-build-cache-extension/pull/394#discussion_r2423051347


##########
src/main/java/org/apache/maven/buildcache/CacheControllerImpl.java:
##########
@@ -498,25 +498,23 @@ public void save(
         final MavenProject project = context.getProject();
         final MavenSession session = context.getSession();
         try {
+            attachedResourcesPathsById.clear();
+            attachedResourceCounter = 0;
+
             final HashFactory hashFactory = cacheConfig.getHashFactory();
+            final HashAlgorithm algorithm = hashFactory.createAlgorithm();
             final org.apache.maven.artifact.Artifact projectArtifact = 
project.getArtifact();
-            final List<org.apache.maven.artifact.Artifact> attachedArtifacts;
-            final List<Artifact> attachedArtifactDtos;
-            final Artifact projectArtifactDto;
-            if (project.hasLifecyclePhase("package")) {
-                final HashAlgorithm algorithm = hashFactory.createAlgorithm();
-                attachGeneratedSources(project);
-                attachOutputs(project);
-                attachedArtifacts = project.getAttachedArtifacts() != null
-                        ? project.getAttachedArtifacts()
-                        : Collections.emptyList();
-                attachedArtifactDtos = artifactDtos(attachedArtifacts, 
algorithm, project);
-                projectArtifactDto = artifactDto(project.getArtifact(), 
algorithm, project);
-            } else {
-                attachedArtifacts = Collections.emptyList();
-                attachedArtifactDtos = new ArrayList<>();
-                projectArtifactDto = null;
-            }
+            final boolean hasPackagePhase = 
project.hasLifecyclePhase("package");
+
+            attachGeneratedSources(project);
+            attachOutputs(project);

Review Comment:
   Conditioning this logic on `package` served two main purposes:
   
   1. It prevented premature caching of outputs if the corresponding lifecycles 
have not run. Doing otherwise could lead to costly and incorrect caching or 
restoration of the previous branch's compilation output. Here’s a problematic 
sequence:
      - Compile Branch A and then checkout Branch B.
      - Run `process-resources` in Branch B. Result: the compiled sources of 
Branch A are cached, under checksum for Branch B.
      - Run the compilation in Branch B. Result: the compiled classes from 
Branch A are restored in Branch B, potentially interfering with the compilation 
of Branch B.
      Although such cached entries can be replaced later, it is still not 
desirable.
   
   2. Conditioning on `package` also reduces the number of caching and 
unpacking operations. Specifically, it avoids the need to zip, download, or 
upload files during every compilation, which helps maintain performance. When 
an engineer is actively working on the code, is it really beneficial to cache 
intermediate results? It is challenging to estimate how useful this would be in 
practice, and I assume it won’t always be desirable.
   
   Just thinking aloud, from the user's perspective, intentionally invoking 
restoration seems to offer better usability. In that sense, Gradle cache model 
is also flawed - saving a zip of classes with every compile task seems 
excessive and results in the io/cpu spent on creation of unnecessary cache 
entries that contain intermediate compilation results. In that sense `package` 
is not that bad - it allows to throttle costly operation and run it together 
with other costly IO work.
   
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to