lhotari opened a new pull request, #25568:
URL: https://github.com/apache/pulsar/pull/25568

   ### Motivation
   
   While investigating a 
`PulsarFunctionsJavaProcessTest.testAutoSchemaFunctionTest` flake on master 
(run 
[24775322750/job/72493182352](https://github.com/apache/pulsar/actions/runs/24775322750/job/72493182352),
 attempt 1), the `functions_worker.log` showed a five-minute silent gap between 
the function being scheduled and the test timing out:
   
   ```
   11:31:05  FunctionActioner — Starting function 
{functionName=test-autoschema-fn-…}
   11:31:05  FunctionActioner — Function package file will be downloaded {…}
             (nothing)
   11:35:59  test tears down, TestNG fires ThreadTimeoutException
   ```
   
   No follow-up was logged between `will be downloaded` and the expected 
`ProcessBuilder starting the process`, and no function consumer ever subscribed 
to the input topic. With the existing logging we can't tell whether:
   
   - the BookKeeper download hung,
   - the hard-link step failed silently,
   - `ProcessRuntime.startProcess()` was never reached, or
   - `processBuilder.start()` returned a process that died before 
`process.exitValue()` was consulted.
   
   This PR adds two cheap, targeted log lines so the next occurrence of this 
class of flakes can be diagnosed from logs alone, without changing any behavior.
   
   ### Modifications
   
   - **`FunctionActioner.downloadFile`** — after the HTTP / package-management 
/ BookKeeper download branch returns, emit a new INFO log `Function package 
file downloaded` with `sizeBytes` and `durationMs`. The pre-existing `Function 
package file will be downloaded` (before) and `Function package file is linked` 
/ `has been downloaded` (after hard-link) remain, so a stuck download now shows 
up as a missing `downloaded` line, and a slow download shows a large 
`durationMs`.
   - **`ProcessRuntime.startProcess`** — immediately after 
`processBuilder.start()` returns, emit `Process started {pid}`. Also add the 
`pid` attribute to the existing `Started process successfully` and `Instance 
Process quit unexpectedly` log entries. This lets container `docker.log` lines 
be correlated with the worker log by PID, and distinguishes "never forked" from 
"forked but exited silently".
   
   No production behavior change; no public API, schema, config, metrics, 
threading, protocol, REST, CLI, or deployment surface is touched.
   
   ### Verifying this change
   
   - [x] Make sure that the change passes the CI checks.
   
   This change is a trivial rework / code cleanup without any test coverage. 
The added log lines carry values already present in existing scopes (PID from 
the spawned `Process`, byte count from the downloaded temp file, duration from 
a local timer), so they cannot fail at runtime unless the existing logging path 
does.
   
   ### Does this pull request potentially affect one of the following parts:
   
   - [ ] Dependencies (add or upgrade a dependency)
   - [ ] The public API
   - [ ] The schema
   - [ ] The default values of configurations
   - [ ] The threading model
   - [ ] The binary protocol
   - [ ] The REST endpoints
   - [ ] The admin CLI options
   - [ ] The metrics
   - [ ] Anything that affects deployment


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to