Copilot commented on code in PR #2728:
URL: https://github.com/apache/tika/pull/2728#discussion_r3017899281


##########
tika-server/tika-server-core/src/test/java/org/apache/tika/server/core/CXFTestBase.java:
##########
@@ -152,7 +152,7 @@ protected static String 
getStringFromInputStream(InputStream in) throws IOExcept
     public static InputStream gzip(InputStream is) throws IOException {
         ByteArrayOutputStream bos = new ByteArrayOutputStream();
         OutputStream gz = new GzipCompressorOutputStream(bos);
-        IOUtils.copy(is, gz);
+        is.transferTo(gz);
         gz.flush();
         gz.close();
         return new ByteArrayInputStream(bos.toByteArray());

Review Comment:
   `gzip(InputStream)` creates a `GzipCompressorOutputStream` and closes it 
only on the happy path. If `transferTo()` throws, the gzip stream won’t be 
closed, which can leave the compressor in an inconsistent state and leak 
resources. Use try-with-resources (or a finally) to guarantee the gzip stream 
is closed even on exceptions.



##########
tika-server/tika-server-core/src/test/java/org/apache/tika/server/core/CXFTestBase.java:
##########
@@ -201,7 +202,7 @@ public void setUp() throws Exception {
 
             // Initialize PipesParsingHelper for pipes-based parsing
             // Merge the fetcher config with basePath pointing to the temp 
directory
-            this.pipesConfigPath = mergeFetcherConfig(this.pipesConfigPath, 
inputTempDirectory);
+            this.pipesConfigPath = mergeFetcherConfig(tmpPipesConfigPath, 
inputTempDirectory);
             TikaJsonConfig tikaJsonConfig = 
TikaJsonConfig.load(this.pipesConfigPath);

Review Comment:
   In `setUp()`, `tmpPipesConfigPath` is always a temp file created by 
`mergePassbackAllStrategy()`/`createDefaultTestConfig()`, but it is never 
deleted after `mergeFetcherConfig(...)` writes a new temp config. This leaves 
an intermediate temp file behind for each test run. Please delete 
`tmpPipesConfigPath` once the merged config has been written (ideally in a 
finally that won’t delete the final `pipesConfigPath`).



##########
tika-server/tika-server-core/src/test/java/org/apache/tika/server/core/CXFTestBase.java:
##########
@@ -268,6 +271,8 @@ private Path mergePassbackAllStrategy(InputStream 
pipesConfigInputStream) throws
     /**
      * Merges the tika-server-fetcher configuration into the pipes config.
      * The fetcher is configured with basePath pointing to the input temp 
directory.
+     * 
+     * @return new (different) JSON configPath in the temp director

Review Comment:
   Javadoc typo: "temp director" should be "temp directory", and the `@return` 
sentence should end with a period.
   ```suggestion
        * @return new (different) JSON configPath in the temp directory.
   ```



##########
tika-server/tika-server-core/src/test/java/org/apache/tika/server/core/CXFTestBase.java:
##########
@@ -186,12 +186,13 @@ public void setUp() throws Exception {
             Files.copy(getTikaConfigInputStream(), tmp, 
StandardCopyOption.REPLACE_EXISTING);
 
             InputStream pipesConfigInputStream = getPipesConfigInputStream();
+            Path tmpPipesConfigPath;
             if (pipesConfigInputStream != null) {
                 // Test provided its own pipes config - merge in PASSBACK_ALL 
emit strategy
-                this.pipesConfigPath = 
mergePassbackAllStrategy(pipesConfigInputStream);
+                tmpPipesConfigPath = 
mergePassbackAllStrategy(pipesConfigInputStream);
             } else {
                 // Create a default pipes config, merging metadata-filters 
from tika config
-                this.pipesConfigPath = createDefaultTestConfig(tmp);
+                tmpPipesConfigPath = createDefaultTestConfig(tmp);
             }

Review Comment:
   `pipesConfigInputStream` is obtained from `getPipesConfigInputStream()` and 
passed into `mergePassbackAllStrategy(...)` without being closed. Even though 
the current base implementation returns a `ByteArrayInputStream`, subclasses 
can override this to return a stream that requires closing. Wrap 
`pipesConfigInputStream` in try-with-resources when non-null.



##########
tika-server/tika-server-core/src/test/java/org/apache/tika/server/core/CXFTestBase.java:
##########
@@ -452,9 +459,8 @@ protected Map<String, byte[]> 
readZipArchiveBytes(InputStream inputStream) throw
                 Enumeration<ZipArchiveEntry> entries = zip.getEntries();
                 while (entries.hasMoreElements()) {
                     ZipArchiveEntry entry = entries.nextElement();
-                    ByteArrayOutputStream bos = new ByteArrayOutputStream();
-                    IOUtils.copy(zip.getInputStream(entry), bos);
-                    data.put(entry.getName(), bos.toByteArray());
+                    byte[] bytes = zip.getInputStream(entry).readAllBytes();
+                    data.put(entry.getName(), bytes);
                 }

Review Comment:
   In `readZipArchiveBytes(...)`, the per-entry stream returned by 
`zip.getInputStream(entry)` is not closed. Wrap that stream in 
try-with-resources before calling `readAllBytes()` to avoid leaking file 
descriptors.



##########
tika-server/tika-server-core/src/test/java/org/apache/tika/server/core/CXFTestBase.java:
##########
@@ -429,9 +437,8 @@ protected Map<String, String> readZipArchive(InputStream 
inputStream) throws IOE
                 Enumeration<ZipArchiveEntry> entries = zip.getEntries();
                 while (entries.hasMoreElements()) {
                     ZipArchiveEntry entry = entries.nextElement();
-                    ByteArrayOutputStream bos = new ByteArrayOutputStream();
-                    IOUtils.copy(zip.getInputStream(entry), bos);
-                    data.put(entry.getName(), 
DigestUtils.md5Hex(bos.toByteArray()));
+                    byte[] bytes = zip.getInputStream(entry).readAllBytes();
+                    data.put(entry.getName(), DigestUtils.md5Hex(bytes));
                 }

Review Comment:
   In `readZipArchive(...)`, the per-entry stream returned by 
`zip.getInputStream(entry)` is not closed. This can leak file handles across 
many entries. Wrap the entry stream in try-with-resources before calling 
`readAllBytes()`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to