Re: [External] : Re: Should mapConcurrent() respect time order instead of input order?

Viktor Klang Tue, 03 Jun 2025 02:14:00 -0700

The general feedback received thus far has been primarily positive. There have 
been a few behavior-related enhancements over the previews to better handle 
interruption (there's still room to improve there, as per our concurrent 
conversation) as well as some improvements to work-in-progress tracking.

It will be interesting to see which Gatherer-based operations will be devised 
by Java developers in the future.

Cheers,
√

Viktor Klang
Software Architect, Java Platform Group
Oracle
________________________________
From: Jige Yu <yuj...@gmail.com>
Sent: Monday, 2 June 2025 18:54
To: Viktor Klang <viktor.kl...@oracle.com>
Cc: core-libs-dev@openjdk.org <core-libs-dev@openjdk.org>
Subject: Re: [External] : Re: Should mapConcurrent() respect time order instead 
of input order?

Hi Viktor,

Thanks for your reply and for sharing your experience regarding user 
preferences. I appreciate that perspective.

You're right, if an unordered version of mapConcurrent proves to be widely 
beneficial and is implemented and adopted by the community, it could certainly 
make a strong case for future inclusion in the JDK.

I wanted to clarify a nuance regarding user preference that I might not have 
articulated clearly before. If the question is simply "ordered or unordered?", 
in isolation, I can see why many, myself included, might lean towards "ordered" 
as a general preference.

However, the decision becomes more complex when the associated trade-offs are 
considered. If the question were phrased more like, "Do you prefer an ordered 
mapConcurrent by default, even if it entails potential performance overhead and 
limitations for certain use cases like race() operations, versus an unordered 
version that offers higher throughput and broader applicability in such 
scenarios?" my (and perhaps others') answer might differ. The perceived cost 
versus benefit of ordering changes significantly when these factors are 
explicit.

My initial suggestion stemmed from the belief that the performance and 
flexibility gains of an unordered approach for I/O-bound tasks would, in many 
practical situations, outweigh the convenience of default ordering, especially 
since ordering can be reintroduced relatively easily, and explicitly, when 
needed.

Thanks again for the discussion.

Best regards,

On Mon, Jun 2, 2025 at 8:51 AM Viktor Klang 
<viktor.kl...@oracle.com> wrote:
>My perspective is that strict adherence to input order for mapConcurrent() 
>might not be the most common or beneficial default behavior for users.

If there is indeed a majority who would benefit from an unordered version of 
mapConcurrent (my experience is that the majority prefer ordered) then, since 
it is possible to implement such a Gatherer outside of the JDK, one will be 
constructed and widely used, and someone will then propose adding something 
similar to the JDK.

>While re-implementing the gatherer is a possibility, the existing 
>implementation is non-trivial, and creating a custom, robust alternative 
>represents a significant undertaking.

The existing version needs to maintain order, which adds to the complexity of 
the implementation. Implementing an unordered version would likely look 
different.
I'd definitely encourage taking the opportunity to attempt to implement it.

Cheers,
√

Viktor Klang
Software Architect, Java Platform Group
Oracle

________________________________
From: Jige Yu <yuj...@gmail.com>
Sent: Monday, 2 June 2025 17:05
To: Viktor Klang <viktor.kl...@oracle.com>
Cc: core-libs-dev@openjdk.org <core-libs-dev@openjdk.org>
Subject: Re: [External] : Re: Should mapConcurrent() respect time order instead 
of input order?

Thank you for your response and for considering my feedback on the 
mapConcurrent() gatherer. I understand and respect that the final decision 
rests with the JDK maintainers.

I would like to offer a couple of further points for consideration. My 
perspective is that strict adherence to input order for mapConcurrent() might 
not be the most common or beneficial default behavior for users. I'd be very 
interested to see any research or data that suggests otherwise, as that would 
certainly inform my understanding.

From my experience, a more common need is for higher throughput in 
I/O-intensive operations. The ability to support use cases like race()—where 
the first successfully completed operation determines the outcome—also seems 
like a valuable capability that is currently infeasible due to the ordering 
constraint.

As I see it, if a developer specifically requires the input order to be 
preserved, this can be achieved with relative ease by applying a subsequent 
sorting operation. For instance:

.gather(mapConcurrent(...))
.sorted(Comparator.comparing(Result::getInputSequenceId))

The primary challenge in these scenarios is typically the efficient fan-out and 
execution of concurrent tasks, not the subsequent sorting of results.
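
A minimal sketch of that sort-afterwards pattern, assuming each result carries 
the index of its input element. The Result record and its field names are 
hypothetical, chosen only to mirror the snippet above; they are not part of any 
library.

```java
import java.util.Comparator;
import java.util.List;

final class RestoreInputOrder {

    // Hypothetical result type: the payload plus the position of its input element.
    record Result(int inputSequenceId, String payload) {}

    // Sorts results that arrived in completion order back into input order.
    static List<String> inInputOrder(List<Result> completed) {
        return completed.stream()
            .sorted(Comparator.comparingInt(Result::inputSequenceId))
            .map(Result::payload)
            .toList();
    }

    public static void main(String[] args) {
        // Pretend tasks 0..2 finished in the order 2, 0, 1.
        List<Result> completed = List.of(
            new Result(2, "c"), new Result(0, "a"), new Result(1, "b"));
        System.out.println(inInputOrder(completed)); // prints [a, b, c]
    }
}
```

One design consequence worth keeping in mind: sorted() has to buffer every 
result before it can emit the first one, so this approach fits finite streams 
whose results are consumed as a batch.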

Conversely, as you've noted, there isn't a straightforward way to modify the 
current default ordered behavior to achieve the higher throughput or race() 
semantics that an unordered approach would naturally provide.

While re-implementing the gatherer is a possibility, the existing 
implementation is non-trivial, and creating a custom, robust alternative 
represents a significant undertaking. My hope was that an unordered option 
could be a valuable addition to the standard library, benefiting a wider range 
of developers.

Thank you again for your time and consideration.

On Mon, Jun 2, 2025 at 7:48 AM Viktor Klang 
<viktor.kl...@oracle.com> wrote:
>Even if it by default preserves input order, when I explicitly called 
>stream.unordered(), could mapConcurrent() respect that and in return achieve 
>higher throughput with support for race?

The Gatherer doesn't know whether the Stream is unordered or ordered. The 
operation should be semantically equivalent anyway.

Cheers,
√

Viktor Klang
Software Architect, Java Platform Group
Oracle
________________________________
From: Jige Yu <yuj...@gmail.com>
Sent: Monday, 2 June 2025 16:29
To: Viktor Klang <viktor.kl...@oracle.com>; 
core-libs-dev@openjdk.org <core-libs-dev@openjdk.org>
Subject: [External] : Re: Should mapConcurrent() respect time order instead of 
input order?

Sorry. Forgot to copy to the mailing list.

On Mon, Jun 2, 2025 at 7:27 AM Jige Yu 
<yuj...@gmail.com> wrote:
Thanks Viktor!

I was thinking from my own experience that I wouldn't have automatically 
assumed that a concurrent fanout library would by default preserve input order.

And I think wanting high throughput with real-life utilities like race would be 
more commonly useful.

But I could be wrong.

Regardless, mapConcurrent() can do both, no?

Even if it by default preserves input order, when I explicitly called 
stream.unordered(), could mapConcurrent() respect that and in return achieve 
higher throughput with support for race?

On Mon, Jun 2, 2025 at 2:33 AM Viktor Klang 
<viktor.kl...@oracle.com> wrote:
Hi!

In a similar vein to the built-in Collectors, the built-in Gatherers provide 
solutions to common stream-related problems, but they also serve as 
"inspiration" for developers for what is possible to implement using Gatherers.

If someone, for performance reasons, and with a use-case which does not require 
encounter-order, wants to take advantage of that combination of circumstances, 
it is definitely possible to implement their own Gatherer which has that 
behavior.
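
As a rough illustration of what such a custom Gatherer might look like (a 
sketch only, assuming JDK 24 or later; it is neither the JDK's mapConcurrent 
implementation nor code from this thread), the following emits results in 
completion order. The name mapConcurrentUnordered and the slowEcho helper are 
hypothetical. Error and interruption handling are deliberately simplistic: a 
failing mapper aborts the stream, and an interrupt just stops emission.

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.stream.Gatherer;
import java.util.stream.Stream;

final class UnorderedMapConcurrentSketch {

    // Hypothetical helper (not a JDK API): emits results as tasks complete.
    static <T, R> Gatherer<T, ?, R> mapConcurrentUnordered(
            int maxConcurrency, Function<? super T, ? extends R> mapper) {
        if (maxConcurrency < 1) throw new IllegalArgumentException("maxConcurrency < 1");

        class State {
            final ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor();
            final CompletionService<R> finished = new ExecutorCompletionService<>(executor);
            int inFlight = 0;
            boolean proceed = true; // false once downstream wants no more elements

            // Waits for whichever task completes next and pushes its result downstream.
            boolean emitNext(Gatherer.Downstream<? super R> downstream) {
                try {
                    R result = finished.take().get();
                    inFlight--;
                    proceed = downstream.push(result);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    proceed = false;
                } catch (ExecutionException e) {
                    executor.shutdownNow();
                    throw new IllegalStateException("mapper failed", e.getCause());
                }
                if (!proceed) executor.shutdownNow(); // interrupt tasks still running
                return proceed;
            }
        }

        return Gatherer.<T, State, R>ofSequential(
            State::new,
            (state, element, downstream) -> {
                if (!state.proceed) return false;
                state.finished.submit(() -> mapper.apply(element));
                state.inFlight++;
                // Below capacity: accept more input. At capacity: wait for any one task.
                return state.inFlight < maxConcurrency || state.emitNext(downstream);
            },
            (state, downstream) -> {
                // End of input: drain the remaining tasks in completion order.
                while (state.proceed && state.inFlight > 0) state.emitNext(downstream);
                state.executor.shutdownNow();
            });
    }

    // Simulated I/O call: sleeps for the given time, then returns it.
    static int slowEcho(int millis) {
        try {
            Thread.sleep(millis);
            return millis;
        } catch (InterruptedException e) {
            throw new IllegalStateException("cancelled", e);
        }
    }

    // The slow first element does not hold back the faster ones.
    static List<Integer> completionOrderDemo() {
        return Stream.of(600, 50, 300)
            .gather(mapConcurrentUnordered(3, UnorderedMapConcurrentSketch::slowEcho))
            .toList();
    }

    // Race: the first completed result wins and the other tasks are interrupted.
    static Optional<Integer> raceDemo() {
        return Stream.of(600, 50, 300)
            .gather(mapConcurrentUnordered(3, UnorderedMapConcurrentSketch::slowEcho))
            .findFirst();
    }

    public static void main(String[] args) {
        System.out.println(completionOrderDemo()); // completion order: 50, 300, 600
        System.out.println(raceDemo());            // the 50 ms task finishes first
    }
}
```

The sketch leans on ExecutorCompletionService to hand back futures in the order 
they finish, which is why it needs no ordering bookkeeping of its own.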

Cheers,
√

Viktor Klang
Software Architect, Java Platform Group
Oracle
________________________________
From: core-libs-dev <core-libs-dev-r...@openjdk.org> on behalf of Jige Yu 
<yuj...@gmail.com>
Sent: Sunday, 1 June 2025 21:08
To: core-libs-dev@openjdk.org <core-libs-dev@openjdk.org>
Subject: Should mapConcurrent() respect time order instead of input order?

It seems like for most people, input order isn't that important for concurrent 
work, and concurrent results being in non-deterministic order is often expected.

If mapConcurrent() emitted results in the order tasks complete (time order) 
rather than in input order:

It would be able to achieve higher throughput when an early task is slow. For 
example, with concurrency=2, if the first task takes 10 minutes to run, 
mapConcurrent() can currently only process 2 tasks within the first 10 
minutes; with completion order, the first task being slow wouldn't block the 
3rd - 100th elements from being processed and output.

mapConcurrent() could be used to implement useful concurrent semantics, for 
example race semantics. Suppose I need to send a request to 10 candidate 
backends and take whichever succeeds first; I'd be able to do:

backends.stream()
    .gather(mapConcurrent(
        backends.size(), // maxConcurrency: query every candidate at once
        backend -> {
          try {
            return backend.fetchOrder();
          } catch (RpcException e) {
            return null; // failed to fetch but not fatal
          }
        }))
    .filter(Objects::nonNull)
    .findFirst(); // first success then cancel the rest
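
To put rough numbers on the throughput point above (concurrency=2, a 10-minute 
first task), here is a small deterministic model rather than a benchmark. It 
assumes, as described above, that the ordered variant frees a slot only once 
the oldest in-flight task is done, and it assumes the other 99 tasks take 1 
second each (a figure chosen for illustration, not taken from any measurement).

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.PriorityQueue;
import java.util.Queue;

final class HeadOfLineBlockingModel {

    // Counts tasks finished within `horizon` seconds. `inFlight` holds finish
    // times; when all slots are busy, the next task starts only after the
    // finish time returned by poll().
    static int completedBy(long horizon, int concurrency, long[] durations,
                           Queue<Long> inFlight) {
        long now = 0;
        int completed = 0;
        for (long duration : durations) {
            if (inFlight.size() == concurrency) {
                now = Math.max(now, inFlight.poll()); // wait for a slot to free up
            }
            long finish = now + duration;
            inFlight.add(finish);
            if (finish <= horizon) completed++;
        }
        return completed;
    }

    // Ordered: FIFO queue, so a slot frees only when the oldest task is done.
    static int ordered(long horizon, int concurrency, long[] durations) {
        return completedBy(horizon, concurrency, durations, new ArrayDeque<>());
    }

    // Unordered: min-heap, so a slot frees as soon as any task is done.
    static int unordered(long horizon, int concurrency, long[] durations) {
        return completedBy(horizon, concurrency, durations, new PriorityQueue<>());
    }

    // 100 tasks: the first takes 600 s, the rest 1 s each (assumed figure).
    static long[] scenario() {
        long[] durations = new long[100];
        Arrays.fill(durations, 1);
        durations[0] = 600;
        return durations;
    }

    public static void main(String[] args) {
        System.out.println(ordered(600, 2, scenario()));   // prints 2
        System.out.println(unordered(600, 2, scenario())); // prints 100
    }
}
```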

Cheers,
