amunra opened a new issue, #172:
URL: https://github.com/apache/arrow-java/issues/172

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   We're experimenting with Arrow Flight SQL and wrote an initial basic prototype server.
   
   I've tried querying it with the Java client library, but the client fails on this data set (details below).
   
   The data set I'm querying is a few GBs of data across 10'000'000 rows and 21 columns.
   There are 10 double columns and 10 symbol (`Dictionary<Int32, Utf8>`) columns.
   
   See: https://github.com/timescale/tsbs to get an idea of the type of data being queried.
   
   The Java Flight SQL client appears to fail while reading the response and throws a `StackOverflowError`:
   
   ```
   java.lang.StackOverflowError
        at java.base/java.util.Spliterator.getExactSizeIfKnown(Spliterator.java:414)
        at java.base/java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:526)
        at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:513)
        at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
        at java.base/java.util.stream.FindOps$FindOp.evaluateSequential(FindOps.java:150)
        at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
        at java.base/java.util.stream.IntPipeline.findFirst(IntPipeline.java:552)
        at java.base/java.text.DecimalFormatSymbols.findNonFormatChar(DecimalFormatSymbols.java:844)
        at java.base/java.text.DecimalFormatSymbols.initialize(DecimalFormatSymbols.java:815)
        at java.base/java.text.DecimalFormatSymbols.<init>(DecimalFormatSymbols.java:115)
        at java.base/sun.util.locale.provider.DecimalFormatSymbolsProviderImpl.getInstance(DecimalFormatSymbolsProviderImpl.java:85)
        at java.base/java.text.DecimalFormatSymbols.getInstance(DecimalFormatSymbols.java:182)
        at java.base/java.util.Formatter.zero(Formatter.java:2450)
        at java.base/java.util.Formatter$FormatSpecifier.getZero(Formatter.java:4450)
        at java.base/java.util.Formatter$FormatSpecifier.localizedMagnitude(Formatter.java:4466)
        at java.base/java.util.Formatter$FormatSpecifier.print(Formatter.java:3276)
        at java.base/java.util.Formatter$FormatSpecifier.print(Formatter.java:3261)
        at java.base/java.util.Formatter$FormatSpecifier.printInteger(Formatter.java:2957)
        at java.base/java.util.Formatter$FormatSpecifier.print(Formatter.java:2918)
        at java.base/java.util.Formatter.format(Formatter.java:2689)
        at java.base/java.util.Formatter.format(Formatter.java:2625)
        at java.base/java.lang.String.format(String.java:4141)
        at org.apache.arrow.memory.util.HistoricalLog.recordEvent(HistoricalLog.java:82)
        at org.apache.arrow.memory.BufferLedger.retain(BufferLedger.java:182)
        at org.apache.arrow.memory.BufferLedger.retain(BufferLedger.java:169)
        at org.apache.arrow.vector.ipc.message.ArrowRecordBatch.<init>(ArrowRecordBatch.java:92)
        at org.apache.arrow.vector.ipc.message.ArrowRecordBatch.<init>(ArrowRecordBatch.java:69)
        at org.apache.arrow.vector.ipc.message.MessageSerializer.deserializeRecordBatch(MessageSerializer.java:438)
        at org.apache.arrow.vector.ipc.message.MessageSerializer.deserializeDictionaryBatch(MessageSerializer.java:514)
        at org.apache.arrow.vector.ipc.message.MessageSerializer.deserializeDictionaryBatch(MessageSerializer.java:529)
        at org.apache.arrow.flight.ArrowMessage.asDictionaryBatch(ArrowMessage.java:273)
        at org.apache.arrow.flight.FlightStream.next(FlightStream.java:264)
        at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
        at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
        at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
        at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
        at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
        at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
        at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
        at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
        at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
        at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
        at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
        at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
        at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
        at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
        at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
        at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
        at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
        at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
        at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
        at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
        at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
           ....
   ```
   
   Looking at `FlightStream.java:280`, there's a recursive call to `next()`.
   
   ```java
   public boolean next() {
       ...
             } else if (msg.getMessageType() == HeaderType.DICTIONARY_BATCH) {
               ...
               return next();      //  <------------- culprit
       ...
     }
   ```
   
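   Each dictionary batch handled this way adds one stack frame, so a response with enough consecutive dictionary batches overflows the stack. Replacing the recursion with a loop should allow querying large datasets.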
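   
   For illustration, here is a minimal sketch of how a recursive call of this shape could be replaced with a loop. It is a toy model, not the actual `FlightStream` code: the class `DictionaryBatchLoop`, the `MessageType` enum, and the helper `stream(...)` are hypothetical names invented for this example, and the real method does more work per message than is shown here.
   
   ```java
   import java.util.Iterator;
   import java.util.stream.IntStream;
   import java.util.stream.Stream;

   // Toy model only: none of these names come from the Arrow code base.
   public class DictionaryBatchLoop {
       enum MessageType { DICTIONARY_BATCH, RECORD_BATCH }

       // Recursive shape, as in the issue: each consecutive dictionary
       // batch adds one stack frame before a record batch is reached.
       static boolean nextRecursive(Iterator<MessageType> messages) {
           if (!messages.hasNext()) {
               return false;
           }
           if (messages.next() == MessageType.DICTIONARY_BATCH) {
               // (the dictionary would be loaded here)
               return nextRecursive(messages);
           }
           return true;
       }

       // Iterative shape: same result, constant stack depth.
       static boolean nextIterative(Iterator<MessageType> messages) {
           while (messages.hasNext()) {
               if (messages.next() == MessageType.DICTIONARY_BATCH) {
                   // (the dictionary would be loaded here)
                   continue;
               }
               return true;
           }
           return false;
       }

       // Builds a stream of n dictionary batches followed by one record batch.
       static Iterator<MessageType> stream(int dictionaryBatches) {
           return Stream.concat(
                   IntStream.range(0, dictionaryBatches)
                           .mapToObj(i -> MessageType.DICTIONARY_BATCH),
                   Stream.of(MessageType.RECORD_BATCH))
                   .iterator();
       }

       public static void main(String[] args) {
           // A million consecutive dictionary batches are handled without
           // growing the call stack.
           System.out.println(nextIterative(stream(1_000_000)));
       }
   }
   ```
   
   With the loop, the stack depth no longer depends on how many dictionary batches precede the next record batch. The recursive variant gives the same answer on short streams but needs one frame per dictionary batch.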
   
   Before I forget, here's the version I'm using:
   
   ```xml
           <dependency>
               <groupId>org.apache.arrow</groupId>
               <artifactId>flight-sql</artifactId>
               <version>11.0.0</version>
           </dependency>
   ```
   
   
   ### Component(s)
   
   Java

