cshannon opened a new issue, #5000: URL: https://github.com/apache/accumulo/issues/5000
### Background and Motivation

The Accumulo RPC layer uses Apache Thrift as both the transport and the protocol, but I think it would be worthwhile to consider switching to gRPC for the RPC layer in the future. I have done a lot of investigation and prototyping of alternatives, and the findings are summarized below.

There are a few reasons to switch (highlighted in the gRPC advantages section below), but the primary motivation that started the investigation into gRPC is support for async RPC calls on the server side. Async RPC calls would let the server handle many more connections and requests at a time and let us do things like long polling without blocking the IO threads. The initial use case that started this is described in #4664, but there are other use cases as well.

Thrift is synchronous, and its Async API, which in theory could be used to accomplish this, is unfortunately quite limited (as described below). This makes it difficult to handle a lot of connections concurrently with Thrift, especially if those requests are long lived.

**Note:** There is still one more Jetty prototype to work on and I will update the issue with those results when done.

### Prototypes

I have investigated and prototyped both gRPC and Thrift with Async processors, to compare each against the current sync Thrift API as well as against each other. Below are the advantages and disadvantages I have found for both. I am also planning to test a third alternative, an async REST API with Jetty, because Jetty should support the authentication mechanisms that gRPC and async Thrift do not. I will report back and link that prototype and its findings when that is done as well.

#### gRPC Results

PR: https://github.com/apache/accumulo/pull/4715

**Advantages:**

1. gRPC is Netty and HTTP/2 based, so it offers high performance and a non-blocking architecture. Netty is well tested and the most well-known NIO framework.
2. gRPC is async by default out of the box, but it is still easy to write sync RPC services if desired.
3. gRPC supports streaming, which would be quite useful for the client for scans.
4. The gRPC serialization format is flexible. By default it supports protobuf, but the format is pluggable, so we could potentially use Thrift as the binary format and keep the existing Thrift objects.
5. There is good SSL support out of the box.
6. gRPC supports OAuth2 and has an API for plugging in authentication.
7. gRPC supports async on the client side as well, if a client wants to make a non-blocking RPC call.
8. The documentation is pretty good and gRPC has wide adoption.

**Disadvantages:**

1. The code changes needed to replace an entire framework are larger. However, we would likely start with one service at a time; we do not have to switch everything at once.
2. **SASL is not supported (potential blocker)**

#### Async Thrift

PR: https://github.com/apache/accumulo/pull/4931

**Advantages:**

1. We already use Thrift, so the changes are much smaller than switching out an entire RPC layer.
2. We can continue to use all the existing RPC services and objects and only implement Async APIs for the services we want.

**Disadvantages:**

1. Thrift does not support multiplexing async processors, which means we would have to add that support or end up opening a different server for every service. There is an open [issue](https://issues.apache.org/jira/browse/THRIFT-2427) for this and an old [PR](https://github.com/apache/thrift/pull/747) that was closed and could be reopened to support this.
2. **SASL is not supported**. While Thrift did add a non-blocking server implementation for SASL, it only supports the sync API and not the Async API, so we would have to implement this ourselves and contribute it back. (potential blocker)
3. **SSL is NOT supported** at all for non-blocking servers. Non-blocking server implementations are a requirement for the Async API, and they have no SSL support. Non-blocking SSL in Java requires using the SSLEngine and is extremely difficult to implement correctly. This is best left to frameworks like Netty, and it seems unlikely Thrift will add support for it. This is a big blocker.
4. The documentation for Async Thrift is basically nonexistent. There is almost zero information, and it doesn't seem to be well supported or tested. I had to essentially reverse engineer the source code and trace things with breakpoints to figure out how the async processors worked.

#### Async Jetty (wip prototype)

PR: Todo

**This section will be updated after the prototype is done.**

**Potential advantages:**

1. Jetty should support all the different authentication mechanisms, including SSL and SASL.
2. Jetty is already used in the monitor, and REST is a well-known, established pattern.
3. Jetty supports the Async servlet API, so we can accomplish long polling.
4. REST APIs usually support JSON, and Thrift already has support for JSON serialization, but we could explore just using binary as well.

**Potential disadvantages:**

1. Using REST and JSON is probably not ideal for all RPC calls.
2. This would be a one-off server that would only be used for the CompactionCoordinator service.
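
To make the "long polling without blocking the IO threads" goal from the Background section concrete, here is a minimal, framework-neutral sketch using only the JDK. It is not code from either prototype PR: the class `LongPollSketch`, its methods, and the job names are hypothetical, and a real implementation would sit behind whichever async server API is chosen (gRPC, async Thrift, or an async servlet). The point is only that a request with nothing to return yet is parked as a pending future rather than as a blocked thread.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

// Hypothetical illustration only; not Accumulo, Thrift, or gRPC code.
class LongPollSketch {
  // Requests that arrived while no job was available.
  private final Queue<CompletableFuture<String>> waiters = new ArrayDeque<>();
  // Jobs that arrived while no request was waiting.
  private final Queue<String> jobs = new ArrayDeque<>();

  // Called when a request arrives. Always returns immediately: if no job is
  // queued, the returned future completes later, so no thread is held.
  synchronized CompletableFuture<String> nextJob() {
    String job = jobs.poll();
    if (job != null) {
      return CompletableFuture.completedFuture(job);
    }
    CompletableFuture<String> pending = new CompletableFuture<>();
    waiters.add(pending);
    return pending;
  }

  // Called when work becomes available. Hands the job to the oldest waiting
  // request, or queues it if nobody is waiting.
  void addJob(String job) {
    CompletableFuture<String> waiter;
    synchronized (this) {
      waiter = waiters.poll();
      if (waiter == null) {
        jobs.add(job);
        return;
      }
    }
    // Complete outside the lock so reply callbacks do not run while holding it.
    waiter.complete(job);
  }

  synchronized int pendingRequests() {
    return waiters.size();
  }

  public static void main(String[] args) {
    LongPollSketch coordinator = new LongPollSketch();
    // A request arrives before any work exists; the reply is deferred.
    CompletableFuture<String> reply = coordinator.nextJob();
    reply.thenAccept(job -> System.out.println("sent reply: " + job));
    // Work shows up later and completes the pending request.
    coordinator.addJob("compaction-1");
  }
}
```

With a synchronous server, each of those parked requests would instead occupy a thread until a job arrived, which is the scaling limit described above. The sketch leaves out timeouts and client disconnects, which a real long-poll endpoint would need to handle.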
