Re: Fwd: [DISCUSS] Improvements on client function execution API

2019-08-23 Thread Alberto Gomez
Hi Dan,

Please, see my answers below.

On 22/8/19 20:29, Dan Smith wrote:
> For the two new execute() methods you are proposing, are you also
> suggesting that they be implemented on the server?
I was only thinking about the client.
>
> I took a look at the C++ API. The C++ API also has a
> ResultCollector.getResult(timeout) method. Does that timeout do anything,
> given that execute(timeout) already waited for the given timeout?
The DefaultResultCollector in the C++ API handles the timeout with a
condition variable in the getResult method (unlike the Java API), but as
Jacob Barrett pointed out in another e-mail, given that the execute
method is blocking, the timeout is pointless here too.
>
> Long term, are you thinking that we should remove execute() with no
> parameters and getResult(timeout) and just have these two blocking
> execute(timeout) methods on the server and clients? I think you are right
> that we shouldn't change the behavior of the existing execute() to avoid
> breaking our users. But I also want to understand what our long term plan
> is for a consistent and sensible API.

It depends on the option that gets selected.

If we go for option a), the idea would be to keep execute() with no
parameters on the clients, add an execute() with a timeout, and remove the
getResult(timeout) method, keeping just getResult() (which would stay
non-blocking, as it is today).

On the servers I would leave things as they are: execute() is not
blocking. I am not sure whether it makes sense to keep these differences
between clients and servers and, if it does, how we could express them in
the API. The fact is that execute() currently behaves very differently on
clients and on servers.

If we go for option b), the idea would be to keep execute() with no
parameters on the clients and also offer an execute() with a timeout, so
that we do not lose the ability to close connections when the read
timeout expires. The socket timeout exception would then have to be
thrown by the getResult() method instead of by execute(). The execute()
methods would be non-blocking, and getResult() would be the blocking
call, on both servers and clients.

For option b), in order to deprecate the blocking behavior of execute(),
we could keep both implementations (blocking and non-blocking) and add a
system property to select between them; the blocking behavior could
remain the default in the release in which the property is introduced.
Long term, the aim would be to keep only the non-blocking implementation
of execute().
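
As a rough sketch of that switch (the property name below is just an
example, not an existing Geode property):

class BlockingExecuteSwitch {
  // Hypothetical property name, for illustration only.
  static final String PROPERTY = "geode.function.blocking-execute";

  // Defaults to the current (blocking) behavior in the release that
  // introduces the switch; the default would flip in a later release and
  // the blocking code path would eventually be removed.
  static boolean blockingExecuteEnabled() {
    return Boolean.parseBoolean(System.getProperty(PROPERTY, "true"));
  }
}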

>
> -Dan
- Alberto


Re: [DISCUSS] Improvements on client function execution API

2019-08-23 Thread Alberto Gomez
Hi Jake,

Please, see my answers below.

On 22/8/19 21:16, Jacob Barrett wrote:
>
>> On Aug 21, 2019, at 8:49 AM, Alberto Gomez  wrote:
>> 2. Timeout in ResultCollector::getResult() and Execution::execute() blocking
>>
>> Regarding the timeout in the ResultCollector::getResult() method problem and 
>> the blocking/non-blocking confusion for Execution::execute() two 
>> alternatives are considered:
>>
>> a) Remove the possibility of setting a timeout on the 
>> ResultCollector::getResult() method on the client side as with the current 
>> client implementation it is useless. This could be done by removing the 
>> method with the timeout parameter from the public API.
>>
>> It would be advisable to make explicit in the documentation that the 
>> getResult() method does not wait for results to arrive as that should have 
>> already been done in the Execution::execute() invocation.
>>
>> This alternative is very simple and would keep things pretty much as they 
>> are today.
> To be honest I think this approach would go against what a user “thinks” is
> going on. Given that there hasn’t been a timeout on execute I assumed it was
> asynchronous and that the getResult blocked until timeout or results arrived.
> Typically these two calls were done back to back.

You are right if you look at the Java client. But if you look at the
native clients, the timeout is there in both the C++ and the C# cases,
which would indicate that it is a blocking call.

See
https://geode.apache.org/releases/latest/cppdocs/a00725.html#aa918a5e193745950e12ca4feb9c5d776
and
https://geode.apache.org/releases/latest/dotnetdocs/a00882.html#ae0a814049482ca424f89c13ab1099c3d
>
>> b) Transform the Execution::execute() method on the client side into a 
>> non-blocking method.
>>
>> This alternative is more complex and requires changes in all the clients. 
>> Apart from that it has implications on the public client API it requires 
>> moving the exceptions offered currently by the Execution::execute() method 
>> to the ResultCollector::getResult() and new threads will have to be managed.
> I think this is more in line with what users expect is going on based on the
> current API, I know I have. If we are going to make any change I think this
> is the one. I don’t think the behavior change is a problem since it's what is
> expected to be happening anyway.
>
>> An outline of a possible implementation for option b) would be:
>>
>>   *   Instead of invoking the ServerRegionProxy::executeFunction() directly 
>> as it is done today, create a Future that invokes this method and returns 
>> the resultCollector passed as parameter.
> Do you really think we need to introduce Futures here into the API? I feel 
> like the ResultCollector acts as the Future. I don’t think any change needs 
> to be made to the API in this regard. The ResultCollector implementation 
> would just simply block as implied by the api for the timeout period. I would 
> change the method to have units though and deprecate the method without units.

I did not mean to introduce Futures in the API. My idea was to use Java
Futures internally, so that the ResultCollector returned by the execute()
method would wrap the Java Future together with the ResultCollector that
actually holds the result.
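
Very roughly, something like the following (the interface below is a
simplified stand-in for ResultCollector, not the real one, and the names
are just for illustration):

import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Simplified stand-in for the ResultCollector contract; the real
// interface also has addResult(), endResults() and clearResults().
interface SimpleResultCollector<S> {
  S getResult() throws Exception;
  S getResult(long timeout, TimeUnit unit) throws Exception;
}

// Collector handed back by a non-blocking execute(): it wraps the Future
// that drives the server call plus the collector that accumulates the
// results, and it only blocks when getResult() is called.
class FutureBackedCollector<S> implements SimpleResultCollector<S> {
  private final Future<SimpleResultCollector<S>> execution;

  FutureBackedCollector(Future<SimpleResultCollector<S>> execution) {
    this.execution = execution;
  }

  @Override
  public S getResult() throws Exception {
    // Future.get() returns the inner collector once executeFunction()
    // has delivered all results into it; the waiting happens here.
    return execution.get().getResult();
  }

  @Override
  public S getResult(long timeout, TimeUnit unit) throws Exception {
    return execution.get(timeout, unit).getResult();
  }
}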

An alternative would be to leave the blocking logic to each
ResultCollector implementation. In the case of the
DefaultResultCollector, we could use a CountDownLatch that would be
decremented when endResults() is called, so that getResult() blocks by
calling CountDownLatch.await(...).
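
A rough sketch of that (much simplified compared to the real
DefaultResultCollector, which also tracks members, supports
clearResults(), etc.):

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class LatchingResultCollector<T> {
  private final List<T> results = new ArrayList<>();
  private final CountDownLatch complete = new CountDownLatch(1);

  // Called for each result arriving from the servers.
  synchronized void addResult(T result) {
    results.add(result);
  }

  // Called once the last result has arrived; releases waiting readers.
  void endResults() {
    complete.countDown();
  }

  // Blocks until endResults() is called or the timeout expires.
  List<T> getResult(long timeout, TimeUnit unit)
      throws InterruptedException, TimeoutException {
    if (!complete.await(timeout, unit)) {
      throw new TimeoutException("Timed out waiting for function results");
    }
    synchronized (this) {
      return new ArrayList<>(results);
    }
  }
}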

The advantage of using Futures internally is that the blocking logic 
would not have to be implemented on every ResultCollector implementation.

>
> -Jake
>
>
- Alberto


Re: Failed: apache/geode-native#2059 (rel/v1.9.1.RC1 - add53da)

2019-08-23 Thread Blake Bender
This looks spurious - apparently a download issue with the Apache Rat jar
file.  I just ran Rat locally at this Git tag and it passed.

Thanks,

Blake


On Thu, Aug 22, 2019 at 7:33 PM Travis CI  wrote:

> Build Update for apache/geode-native
> -
>
> Build: #2059
> Status: Failed
>
> Duration: 1 hr, 39 mins, and 48 secs
> Commit: add53da (rel/v1.9.1.RC1)
> Author: Dave Barnes
> Message: User Guide: Add a link to the official XSD declarative cache
> schema
>
> View the changeset:
> https://github.com/apache/geode-native/compare/rel/v1.9.1.RC1
>
> View the full build log and details:
> https://travis-ci.org/apache/geode-native/builds/575617339?utm_medium=notification&utm_source=email
>
> --
>
> You can unsubscribe from build emails from the apache/geode-native
> repository going to
> https://travis-ci.org/account/preferences/unsubscribe?repository=11948127&utm_medium=notification&utm_source=email
> .
> Or unsubscribe from *all* email updating your settings at
> https://travis-ci.org/account/preferences/unsubscribe?utm_medium=notification&utm_source=email
> .
> Or configure specific recipients for build notifications in your
> .travis.yml file. See https://docs.travis-ci.com/user/notifications.
>
>


[DISCUSS] Version 1.11 serialization ordinal is wrong

2019-08-23 Thread Bruce Schuchardt
We've been incrementing the serialization version by 5 for each x.x.0 
release but I see that we went from 105 in 1.10.0 to 107 in develop for 
1.11.0.  That gives us only one opportunity to make a serialization 
change if we need to release patches for 1.10.  I don't think that's 
safe & we need to bump the ordinal for 1.11.0 to 110.




Re: [DISCUSS] Version 1.11 serialization ordinal is wrong

2019-08-23 Thread Kirk Lund
I think they did that because the datatypes in the Version class are too
small. I'm not sure why bytes and shorts were chosen.

The version constants are defined as bytes:

private static final *byte* GEODE_1_11_0_ORDINAL = 107;

While the ordinal is defined as a short:

private final *short* ordinal;

I'm assuming this was done because Version ends up on the wire. Does it go
into every message or just handshake/join messages? I suppose we can't
change these datatypes to integers until Geode 2.0 because they would end
up being a breaking change.

Maybe we should scan through all serialized datatypes for bytes/shorts that
are going to run out...

On Fri, Aug 23, 2019 at 9:02 AM Bruce Schuchardt 
wrote:

> We've been incrementing the serialization version by 5 for each x.x.0
> release but I see that we went from 105 in 1.10.0 to 107 in develop for
> 1.11.0.  That gives us only one opportunity to make a serialization
> change if we need to release patches for 1.10.  I don't think that's
> safe & we need to bump the ordinal for 1.11.0 to 110.
>
>


Re: [DISCUSS] Version 1.11 serialization ordinal is wrong

2019-08-23 Thread Bruce Schuchardt
Yeah, that code is confusing.  I think the ordinal constants should all 
be changed to shorts and the constructor for Version should take a short 
as well.  I can see why someone creating a new ordinal would be hesitant 
to do that - it will need to be tested.


The on-wire representation of a Version isn't limited to a byte. We use 
writeOrdinal() to serialize version ordinals.


I'll open a JIRA to address this.


On 8/23/19 9:16 AM, Kirk Lund wrote:

> I think they did that because the datatypes in the Version class are too
> small. I'm not sure why bytes and shorts were chosen.
>
> The version constants are defined as bytes:
>
> private static final *byte* GEODE_1_11_0_ORDINAL = 107;
>
> While the ordinal is defined as a short:
>
> private final *short* ordinal;
>
> I'm assuming this was done because Version ends up on the wire. Does it go
> into every message or just handshake/join messages? I suppose we can't
> change these datatypes to integers until Geode 2.0 because they would end
> up being a breaking change.
>
> Maybe we should scan through all serialized datatypes for bytes/shorts that
> are going to run out...
>
> On Fri, Aug 23, 2019 at 9:02 AM Bruce Schuchardt
> wrote:
>
>> We've been incrementing the serialization version by 5 for each x.x.0
>> release but I see that we went from 105 in 1.10.0 to 107 in develop for
>> 1.11.0.  That gives us only one opportunity to make a serialization
>> change if we need to release patches for 1.10.  I don't think that's
>> safe & we need to bump the ordinal for 1.11.0 to 110.




Re: Need help: Jetty dunit tests blocking creation of geode-log4j

2019-08-23 Thread Robert Houghton
If the distributedTest jobs are swapping then we have a bigger problem than
just this test. We can crank down the number of parallel jobs and see
if that relieves the memory pressure, or we can look for machine types
with more memory.

I can look at this on Monday, or maybe ping @Sean Goller


On Thu, Aug 22, 2019 at 9:35 AM Kirk Lund  wrote:

> So far Jens and I are unable to reproduce these test failures [1]. This
> does not appear to be a test or java code problem. Based on google searches
> [2] [3], the cause might be either:
>
> 1) the distributed test job is out of swap space when the Tomcat/Jetty
> tests run
>
> 2) the /proc/sys/vm/overcommit_memory and /proc/sys/vm/overcommit_ratio may
> be misconfigured
>
> I don't have sufficient concourse knowledge to attempt to fix or change
> either of the above two settings in our precheckin environment. Can someone
> please help me with this?
>
> The failure:
>
>
> *OpenJDK 64-Bit Server VM warning: INFO:
> os::commit_memory(0x7f74a4ba4000, 65536, 1) failed; error='Not enough
> space' (errno=12) [thread 26510 also had an error] *
>
> [1] https://concourse.apachegeode-ci.info/builds/88240
> [2] https://bugs.openjdk.java.net/browse/JDK-8216619
> [3]
>
> https://stackoverflow.com/questions/46801741/jvm-crashes-with-error-cannot-allocate-memory-errno-12
>
>
> On Wed, Aug 21, 2019 at 8:36 AM Jens Deppe  wrote:
>
> > I can work with you on this if you're still blocked.
> >
> > --Jens
> >
> > On Tue, Aug 20, 2019 at 4:12 PM Kirk Lund  wrote:
> >
> > > Does anyone know how to debug geode-assembly Jetty dunit tests that
> fail
> > to
> > > launch modify_war?
> > >
> > > It passes 100% locally in intellij and with gradle cli. It only fails
> in
> > > concourse PR precheckin.
> > >
> > > Right now, this is the last thing blocking me from submitting a
> non-draft
> > > PR to move all log4j-core code from geode-core to geode-log4j. This is
> > > blocking the creation of geode-log4j.
> > >
> > > The only changes in my branch are moving all log4j-core code from
> > > geode-core to geode-log4j.
> > >
> > > If anyone else wants to see this change make it to develop, then I need
> > > help!
> > >
> > > When the test tries to execute modify_war, it fails with the following
> > > output and stack trace. No other info is available for debugging as
> > > apparently this kills the gradle daemon.
> > >
> > > Failed PR precheckin dunit job:
> > > https://concourse.apachegeode-ci.info/builds/88240
> > > PR: https://github.com/apache/geode/pull/3914
> > >
> > > > Task :geode-assembly:distributedTest
> > >
> > > org.apache.geode.session.tests.Jetty9CachingClientServerTest >
> > > containersShouldHavePersistentSessionData FAILED
> > > java.lang.RuntimeException: Something very bad happened when trying
> > to
> > > start container
> > >
> > >
> >
> JETTY9_client-server_containersShouldHavePersistentSessionData_0_a6ebd229-072b-47db-a9bf-ca3713175f05_
> > >
> > > Caused by:
> > > java.lang.RuntimeException: Something very bad happened to this
> > > container when starting. Check the cargo_logs folder for container
> logs.
> > >
> > > Caused by:
> > > java.io.IOException: Unable to run modify_war script,
> > command:
> > >
> > >
> >
> [/tmp/geode_container_install17845041006471328987/cargo_modules/Apache_Geode_Modules-1.11.0-SNAPSHOT-AppServer/bin/modify_war,
> > > -J, -Xmx2096m, -w,
> > >
> > >
> >
> /home/geode/geode/geode-assembly/build/distributedTest254/../../../extensions/session-testing-war/build/libs/session-testing-war.war,
> > > -t, client-server, -o,
> > >
> > >
> >
> /tmp/geode_container_install17845041006471328987/cargo_wars/JETTY9_client-server_containersShouldHavePersistentSessionData_0_a6ebd229-072b-47db-a9bf-ca3713175f053692095078744488223.war,
> > > -p, gemfire.cache.enable_local_cache=true, -p,
> > >
> > >
> >
> gemfire.property.log-file=/home/geode/geode/geode-assembly/build/distributedTest254/cargo_logs/JETTY9_client-server_containersShouldHavePersistentSessionData_0_a6ebd229-072b-47db-a9bf-ca3713175f05/gemfire.log,
> > > -p,
> > >
> > >
> >
> gemfire.property.cache-xml-file=/home/geode/geode/geode-assembly/build/distributedTest254/cargo_logs/JETTY9_client-server_containersShouldHavePersistentSessionData_0_a6ebd229-072b-47db-a9bf-ca3713175f05/cache-client.xml]
> > > log file:
> > > ERROR: Error updating web.xml
> > > ng: INFO: os::commit_memory(0x00077d00, 2147483648,
> > 0)
> > > failed; error='Not enough space' (errno=12)
> > >
> >
>


Re: [DISCUSS] Add a public API to add endpoints to Geode's HTTP server

2019-08-23 Thread Aaron Lindsey
Would it be practical to remove the dependency on Jetty from the
HttpService interface? I admit that I don't know a lot about this area of
the code, but I noticed that the current HttpService public interface
doesn't have any dependencies on Jetty (except the getHttpService() method,
which would be changed by this proposal). It seems desirable to decouple
the interface from the underlying web server in case we need to change it
later.

I like this proposal's intent and think it's an important area of work,
because any Geode user who wants to add a custom HTTP endpoint currently
has to rely on unstable internal APIs.

- Aaron


On Tue, Aug 20, 2019 at 1:34 PM Dale Emery  wrote:

> I’ve drafted a proposal to add a public API to add endpoints to Geode’s
> HTTP server.
>
>
> https://cwiki.apache.org/confluence/display/GEODE/%5BDraft%5D+Public+API+to+Add+Endpoints+to+Geode+HTTP+Server
> <
> https://cwiki.apache.org/confluence/display/GEODE/[Draft]+Public+API+to+Add+Endpoints+to+Geode+HTTP+Server
> 
> >
>
> Currently it is possible to serve additional HTTP content only by creating
> a separate server or by using internal Geode APIs to add endpoints to the
> existing server.
>
> We would like to give users a public way to serve additional HTTP content.
> For example, a user may wish to use a PrometheusMeterRegistry to publish
> Geode’s Micrometer-based metrics via HTTP. Geode’s existing HTTP server
> would be a convenient way to publish that information.
>
> I invite your feedback and ideas.
>
> Dale
>
> —
> Dale Emery
> dem...@pivotal.io
>
>


Re: [DISCUSS] Add a public API to add endpoints to Geode's HTTP server

2019-08-23 Thread Dale Emery
In my proposal as drafted, I assumed Jetty.

If we want to support some additional set of server implementations, we will 
have to define an abstract version of Handler, and perhaps other types, that 
would be adaptable to all of those implementations.

To do that, we would have to have some set of server implementations in mind.

I don’t have any insight into that, so I don’t know how to assess the cost or 
value of making this independent of Jetty.
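
If we did go down that path, I imagine the abstraction would look
something like this (all of these names are made up for illustration;
none of these types exist today):

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Map;

// A user-supplied endpoint, independent of any particular web server.
interface HttpEndpoint {
  String path();  // e.g. "/metrics"
  void handle(HttpExchangeContext exchange) throws IOException;
}

// The minimal request/response surface an endpoint needs; each server
// integration (Jetty today, possibly others later) would adapt its own
// Handler/Servlet machinery to this.
interface HttpExchangeContext {
  String method();
  Map<String, String> requestHeaders();
  InputStream requestBody() throws IOException;
  OutputStream startResponse(int status, String contentType) throws IOException;
}

// What the public API would expose instead of Jetty types.
interface EndpointRegistry {
  void addEndpoint(HttpEndpoint endpoint);
}

How much of Jetty's Handler contract (async I/O, security, and so on)
would need to surface through such an abstraction is exactly the part I
can't assess yet.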

—
Dale Emery
dem...@pivotal.io



> On Aug 23, 2019, at 9:59 AM, Aaron Lindsey  wrote:
> 
> Would it be practical to remove the dependency on Jetty from the
> HttpService interface? I admit that I don't know a lot about this area of
> the code, but I noticed that the current HttpService public interface
> doesn't have any dependencies on Jetty (except the getHttpService() method,
> which would be changed by this proposal). It seems desirable to decouple
> the interface from the underlying web server in case we need to change it
> later.
> 
> I like this proposal's intent and think it's an important area of work,
> because any Geode user who wants to add a custom HTTP endpoint currently
> has to rely on unstable internal APIs.
> 
> - Aaron
> 
> 
> On Tue, Aug 20, 2019 at 1:34 PM Dale Emery  wrote:
> 
>> I’ve drafted a proposal to add a public API to add endpoints to Geode’s
>> HTTP server.
>> 
>> 
>> https://cwiki.apache.org/confluence/display/GEODE/%5BDraft%5D+Public+API+to+Add+Endpoints+to+Geode+HTTP+Server
>> <
>> https://cwiki.apache.org/confluence/display/GEODE/[Draft]+Public+API+to+Add+Endpoints+to+Geode+HTTP+Server
>> 
>>> 
>> 
>> Currently it is possible to serve additional HTTP content only by creating
>> a separate server or by using internal Geode APIs to add endpoints to the
>> existing server.
>> 
>> We would like to give users a public way to serve additional HTTP content.
>> For example, a user may wish to use a PrometheusMeterRegistry to publish
>> Geode’s Micrometer-based metrics via HTTP. Geode’s existing HTTP server
>> would be a convenient way to publish that information.
>> 
>> I invite your feedback and ideas.
>> 
>> Dale
>> 
>> —
>> Dale Emery
>> dem...@pivotal.io
>> 
>> 



Re: [DISCUSS] Version 1.11 serialization ordinal is wrong

2019-08-23 Thread Jacob Barrett
I commented in the JIRA about using int.

> On Aug 23, 2019, at 9:54 AM, Bruce Schuchardt  wrote:
> 
> Yeah, that code is confusing.  I think the ordinal constants should all be 
> changed to shorts and the constructor for Version should take a short as 
> well.  I can see why someone creating a new ordinal would be hesitant to do 
> that - it will need to be tested.
> 
> The on-wire representation of a Version isn't limited to a byte. We use 
> writeOrdinal() to serialize version ordinals.
> 
> I'll open a JIRA to address this.
> 
> 
>> On 8/23/19 9:16 AM, Kirk Lund wrote:
>> I think they did that because the datatypes in the Version class are too
>> small. I'm not sure why bytes and shorts were chosen.
>> 
>> The version constants are defined as bytes:
>> 
>> private static final *byte* GEODE_1_11_0_ORDINAL = 107;
>> 
>> While the ordinal is defined as a short:
>> 
>> private final *short* ordinal;
>> 
>> I'm assuming this was done because Version ends up on the wire. Does it go
>> into every message or just handshake/join messages? I suppose we can't
>> change these datatypes to integers until Geode 2.0 because they would end
>> up being a breaking change.
>> 
>> Maybe we should scan through all serialized datatypes for bytes/shorts that
>> are going to run out...
>> 
>> On Fri, Aug 23, 2019 at 9:02 AM Bruce Schuchardt 
>> wrote:
>> 
>>> We've been incrementing the serialization version by 5 for each x.x.0
>>> release but I see that we went from 105 in 1.10.0 to 107 in develop for
>>> 1.11.0.  That gives us only one opportunity to make a serialization
>>> change if we need to release patches for 1.10.  I don't think that's
>>> safe & we need to bump the ordinal for 1.11.0 to 110.
>>> 
>>> 


Passed: apache/geode-examples#342 (release/1.9.1 - fc97f58)

2019-08-23 Thread Travis CI
Build Update for apache/geode-examples
-

Build: #342
Status: Passed

Duration: 20 mins and 50 secs
Commit: fc97f58 (release/1.9.1)
Author: Owen Nichols
Message: temporary commit to allow CI access to the Release Candidate staging 
maven

View the changeset: 
https://github.com/apache/geode-examples/compare/3d887324bc51...fc97f585ab17

View the full build log and details: 
https://travis-ci.org/apache/geode-examples/builds/575956161?utm_medium=notification&utm_source=email

--

You can unsubscribe from build emails from the apache/geode-examples repository 
going to 
https://travis-ci.org/account/preferences/unsubscribe?repository=11483039&utm_medium=notification&utm_source=email.
Or unsubscribe from *all* email updating your settings at 
https://travis-ci.org/account/preferences/unsubscribe?utm_medium=notification&utm_source=email.
Or configure specific recipients for build notifications in your .travis.yml 
file. See https://docs.travis-ci.com/user/notifications.



Failed: apache/geode-examples#344 (tag-test-deleteme - e50c908)

2019-08-23 Thread Travis CI
Build Update for apache/geode-examples
-

Build: #344
Status: Failed

Duration: 2 mins and 14 secs
Commit: e50c908 (tag-test-deleteme)
Author: Owen Nichols
Message: this commit should follow the tag

View the changeset: 
https://github.com/apache/geode-examples/compare/51abdfae1724^...e50c908c48f2

View the full build log and details: 
https://travis-ci.org/apache/geode-examples/builds/575974603?utm_medium=notification&utm_source=email

--

You can unsubscribe from build emails from the apache/geode-examples repository 
going to 
https://travis-ci.org/account/preferences/unsubscribe?repository=11483039&utm_medium=notification&utm_source=email.
Or unsubscribe from *all* email updating your settings at 
https://travis-ci.org/account/preferences/unsubscribe?utm_medium=notification&utm_source=email.
Or configure specific recipients for build notifications in your .travis.yml 
file. See https://docs.travis-ci.com/user/notifications.



Failed: apache/geode-examples#343 (rel/vTEST - 51abdfa)

2019-08-23 Thread Travis CI
Build Update for apache/geode-examples
-

Build: #343
Status: Failed

Duration: 2 mins and 8 secs
Commit: 51abdfa (rel/vTEST)
Author: Owen Nichols
Message: this is the commit that should be tagged

View the changeset: https://github.com/apache/geode-examples/compare/rel/vTEST

View the full build log and details: 
https://travis-ci.org/apache/geode-examples/builds/575974599?utm_medium=notification&utm_source=email

--

You can unsubscribe from build emails from the apache/geode-examples repository 
going to 
https://travis-ci.org/account/preferences/unsubscribe?repository=11483039&utm_medium=notification&utm_source=email.
Or unsubscribe from *all* email updating your settings at 
https://travis-ci.org/account/preferences/unsubscribe?utm_medium=notification&utm_source=email.
Or configure specific recipients for build notifications in your .travis.yml 
file. See https://docs.travis-ci.com/user/notifications.