Hello Geode devs,

Currently Geode "swallows" all output sent to stdout and stderr by
default, and there's no way of changing this behavior when starting members
through *gfsh*.
Among other things, this prevents users from playing around with
*System.out.println()* during development and from getting thread dumps by
executing a plain *kill -3* or *kill -QUIT* against the process ID, which
is critical when troubleshooting.
There are two internal flags that can be used to prevent this default
behavior. Both have to be set at the same time and both are very
counterintuitive: *gemfire.OSProcess.ENABLE_OUTPUT_REDIRECTION=true* and
*gemfire.OSProcess.DISABLE_OUTPUT_REDIRECTION=false*. These flags, however,
don't work when starting members through *gfsh*, because the relevant
commands read them from the system properties too early in the command's
execution lifecycle, before they have actually been set:


*StartXXXXXCommand.java*

@CliCommand(value = CliStrings.START_XXXXX, help = CliStrings.START_XXXXX__HELP)
@CliMetaData(shellOnly = true,
    relatedTopic = {CliStrings.TOPIC_GEODE_XXXXX, CliStrings.TOPIC_GEODE_LIFECYCLE})
public Result startXXXXX(...) throws Exception {
    (...)
    final boolean redirectOutput =
        Boolean.getBoolean(OSProcess.ENABLE_OUTPUT_REDIRECTION_PROPERTY);
    XXXXXLauncher.Builder serverXXXXXBuilder =
        new XXXXXLauncher.Builder()
            .setRedirectOutput(redirectOutput)
            (...)
}

At this stage during the execution, the system properties used when
starting the members haven't been fully parsed yet and the flags are only
present within the sun.java.command system property, so
*Boolean.getBoolean(OSProcess.ENABLE_OUTPUT_REDIRECTION_PROPERTY)* will
always return *false*. There's a JIRA created with this same description,
and I've started to work on a fix for it: GEODE-4101
<https://issues.apache.org/jira/browse/GEODE-4101>.
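To illustrate the point, here is a minimal, self-contained sketch (not Geode code). It assumes the flag reaches the command only as text inside a command line, for example via a *--J=-D...* argument, and shows that *Boolean.getBoolean()* consults the JVM system properties only. The class name, the helper methods and the sample command line are hypothetical.

```java
public class RedirectFlagDemo {
    // Property name as quoted in this email; the real constant lives in OSProcess.
    static final String PROP = "gemfire.OSProcess.ENABLE_OUTPUT_REDIRECTION";

    // Mimics what the start command does today: look at the system properties only.
    static boolean fromSystemProperties() {
        return Boolean.getBoolean(PROP);
    }

    // Hypothetical helper: finds -D<PROP>=<value> inside a raw command line string.
    static boolean fromCommandLine(String commandLine) {
        String marker = "-D" + PROP + "=";
        for (String token : commandLine.split("\\s+")) {
            int idx = token.indexOf(marker);
            if (idx >= 0) {
                return Boolean.parseBoolean(token.substring(idx + marker.length()));
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Sample command line, assumed for illustration only.
        String cmd = "start server --name=server1 --J=-D" + PROP + "=true";
        System.out.println("system properties: " + fromSystemProperties());
        System.out.println("command line:      " + fromCommandLine(cmd));
    }
}
```

Unless the JVM itself was started with the property set, the first lookup misses the flag even though the second one finds it in the command text.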

The proposal is to add a new flag, *--redirect-output*, to the start
commands in GFSH and to deprecate the properties
*OSProcess.DISABLE_OUTPUT_REDIRECTION* and
*OSProcess.ENABLE_OUTPUT_REDIRECTION*. To avoid major code changes, the
start commands will take this new flag as a parameter and will also set a
new internal system property,
*OSProcess.DISABLE_REDIRECTION_CONFIGURATION*, to *true*. As its name
implies, this property disables the other two when set. In the next major
release, all three properties could then be deleted without major changes.
Do you see any flaws here?
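The intended precedence can be sketched roughly as follows. This is only an illustration of the proposal, not the actual patch: the class and method names are hypothetical, the *gemfire.* prefix on the new property is assumed, and the legacy branch simply models the "both flags at once" behavior described above.

```java
import java.util.Properties;

public class RedirectResolution {
    // Legacy property names as quoted earlier in this email.
    static final String ENABLE = "gemfire.OSProcess.ENABLE_OUTPUT_REDIRECTION";
    static final String DISABLE = "gemfire.OSProcess.DISABLE_OUTPUT_REDIRECTION";
    // Proposed internal property; the full name is an assumption.
    static final String DISABLE_CONFIG =
        "gemfire.OSProcess.DISABLE_REDIRECTION_CONFIGURATION";

    // Decides whether output should be redirected for a member being started.
    static boolean shouldRedirect(Properties props, boolean redirectOutputFlag) {
        // When gfsh sets the new property, only the command-line flag counts.
        if (Boolean.parseBoolean(props.getProperty(DISABLE_CONFIG))) {
            return redirectOutputFlag;
        }
        // Otherwise fall back to the two deprecated properties (assumed semantics).
        boolean enable = Boolean.parseBoolean(props.getProperty(ENABLE));
        boolean disable = Boolean.parseBoolean(props.getProperty(DISABLE));
        return enable && !disable;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(DISABLE_CONFIG, "true");
        props.setProperty(ENABLE, "true");
        // The legacy ENABLE value is ignored because DISABLE_CONFIG is set.
        System.out.println(shouldRedirect(props, false));
    }
}
```

Keeping the decision in one place like this would make it straightforward to delete the legacy branch in the next major release.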

I've tested these changes and the output from *System.out.println()* (from
a function or listener, as an example) goes to the member's log file as
expected. However, no matter what I do, I can't get the output from *kill
-3 / kill -QUIT*, nor can I find a place within the source code where this
signal is caught to explain why the thread dump is not printed in the
member's log file. Am I missing something?
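One possible explanation, offered as an assumption to verify rather than a confirmed diagnosis: HotSpot handles SIGQUIT inside the JVM itself and writes the dump to the process's native standard output instead of going through *System.out*, so there would be no handler to find in the Geode sources. As a stopgap, a similar dump can be produced from inside the member with the standard *java.lang.management* API. The sketch below is generic Java, not Geode code, and the class name is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDumpSketch {
    // Builds a plain-text dump of all live threads with their stack traces.
    static String dump() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // false/false: skip locked monitor and synchronizer details.
        for (ThreadInfo info : bean.dumpAllThreads(false, false)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("\tat ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Goes through System.out, so it follows whatever redirection is in place.
        System.out.println(dump());
    }
}
```

Because the text is written through *System.out*, it would land wherever the redirected output goes, for example when called from a function executed on the member.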

Last but not least, when redirecting *stdout/stderr* within a locator with
Pulse embedded, all of the deploy steps get logged using a different format
(*this output was being swallowed before*):

...
[info 2017/12/19 11:12:12.123 ART locator1 <main> tid=0x1] Initializing Spring root WebApplicationContext
Dec 19, 2017 11:12:12 AM org.springframework.web.context.ContextLoader initWebApplicationContext
INFO: Root WebApplicationContext: initialization started
Dec 19, 2017 11:12:12 AM org.springframework.web.context.support.XmlWebApplicationContext prepareRefresh
INFO: Refreshing Root WebApplicationContext: startup date [Tue Dec 19 11:12:12 ART 2017]; root of context hierarchy
Dec 19, 2017 11:12:12 AM org.springframework.beans.factory.xml.XmlBeanDefinitionReader loadBeanDefinitions
INFO: Loading XML bean definitions from ServletContext resource [/WEB-INF/mvc-dispatcher-servlet.xml]
Dec 19, 2017 11:12:12 AM org.springframework.beans.factory.xml.XmlBeanDefinitionReader loadBeanDefinitions
INFO: Loading XML bean definitions from ServletContext resource [/WEB-INF/spring-security.xml]
...

This probably happens because Jetty uses *StdErrLog* by default and
*log4j2* gets reconfigured using the *log4j2.xml* file from Pulse (ignoring
the format and options defined by
*org.apache.geode.internal.logging.LogService*).
What would be the recommended approach here? Add *geode-core* as a compile
dependency of *geode-pulse* and directly use *LogService* instead of the
default *LogManager*? Define a custom *LogService* in *geode-pulse* that
checks (through JMX, maybe?) whether a parent context is already defined
and uses it instead of *LogManager*? Tweak *JettyHelper* to somehow treat
*pulse* differently and disable this deploy logging, as happens today?
Or leave it as it is and create a new JIRA to address this separately
(maybe moving the internal logging to a separate module)?

Best regards
