aleksdikanski commented on issue #7270:
URL: https://github.com/apache/pinot/issues/7270#issuecomment-995777801


   Hi, I had a look at this issue, as I'm currently investigating Pulsar and 
Pinot integration for my company.
   I first ran into the `InvalidProtocolBufferException` (Kubernetes 
deployment) and later, with a different setup (local Docker), the 
`IndexOutOfBoundsException`.
   
   So far, it does not look like a problem with incorrectly shaded libraries. 
As the stack traces show, the errors occur in libraries that are already 
shaded by the Pulsar client library.
   
   I looked into the second error, the `IndexOutOfBoundsException`, first and 
think I found the cause. When constructing a `MessageIdStreamOffset` from a 
string, the current implementation tries to parse a Pulsar `MessageId` from the 
UTF-8 bytes of that string:
   ```
     /**
      * returns the class object from string message id in the format ledgerId:entryId:partitionId
      * throws {@link IOException} if message if format is invalid.
      * @param messageId
      */
     public MessageIdStreamOffset(String messageId) {
       try {
         _messageId = MessageId.fromByteArray(messageId.getBytes(StandardCharsets.UTF_8));
       } catch (IOException e) {
         LOGGER.warn("Cannot parse message id " + messageId, e);
       }
     }
   ```
   
   As the comment shows, the code assumes that the string has the form 
`ledgerId:entryId:partitionId`. Unfortunately, the string passed in is only the 
`MessageId.toString()` representation, not the wire-format byte array that 
`MessageId.fromByteArray` expects. Hence the parsing fails.
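
   To illustrate the mismatch, here is a minimal sketch using only the JDK. It 
assumes the wire format is protobuf-encoded (consistent with the 
`InvalidProtocolBufferException` above) and uses the standard protobuf tag 
layout, `(fieldNumber << 3) | wireType`. The class name `TagPeek` and the sample 
id `12:34:0` are made up for illustration.

   ```java
   import java.nio.charset.StandardCharsets;

   // Hypothetical helper, not part of Pinot or Pulsar.
   final class TagPeek {
     /** Field number a protobuf reader would see in a single-byte tag. */
     static int fieldNumber(byte tag) {
       return (tag & 0xFF) >>> 3;
     }

     /** Wire type (lowest three bits) of a single-byte tag. */
     static int wireType(byte tag) {
       return tag & 0x07;
     }

     public static void main(String[] args) {
       // The toString() form is plain ASCII text, not a serialized message.
       byte[] bytes = "12:34:0".getBytes(StandardCharsets.UTF_8);
       System.out.println("length=" + bytes.length
           + " field=" + fieldNumber(bytes[0])
           + " wireType=" + wireType(bytes[0]));
     }
   }
   ```

   The first byte is the character `1` (0x31), which a protobuf reader would 
interpret as a tag for field 6 with wire type 1 (fixed 64-bit) rather than as 
part of a ledger id. So the bytes are read as something they are not, and how 
exactly the parse fails would depend on the digits in the string.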
   I have currently fixed this by splitting the incoming string and using the 
constructors of `MessageIdImpl` and `BatchMessageIdImpl` to create a 
`MessageId` from its parts. With that change I no longer saw either exception 
and could run the airline stats example.
   I can create a PR for this, but I'm not sure that splitting the string and 
constructing the MessageId via the impl classes is a good solution. I would 
rather have the wire format of the MessageId passed in as input, but I have not 
found where this happens. Some pointers here would be appreciated. Thanks
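
   For reference, a rough sketch of the splitting step, using only the JDK so it 
runs without the Pulsar client. `MessageIdText` is a hypothetical helper, not 
the actual change. It assumes the text form is 
`ledgerId:entryId:partitionIndex` with an optional fourth `:batchIndex` part, 
and that the parsed values would then be passed to the `MessageIdImpl` or 
`BatchMessageIdImpl` constructor.

   ```java
   // Hypothetical helper, not part of Pinot or Pulsar.
   final class MessageIdText {
     final long ledgerId;
     final long entryId;
     final int partitionIndex;
     final int batchIndex;   // only meaningful when isBatch is true
     final boolean isBatch;

     private MessageIdText(long ledgerId, long entryId, int partitionIndex,
         int batchIndex, boolean isBatch) {
       this.ledgerId = ledgerId;
       this.entryId = entryId;
       this.partitionIndex = partitionIndex;
       this.batchIndex = batchIndex;
       this.isBatch = isBatch;
     }

     /** Parses "ledgerId:entryId:partitionIndex[:batchIndex]" (assumed format). */
     static MessageIdText parse(String text) {
       String[] parts = text.split(":", -1);
       if (parts.length != 3 && parts.length != 4) {
         throw new IllegalArgumentException("Unexpected message id format: " + text);
       }
       try {
         long ledgerId = Long.parseLong(parts[0]);
         long entryId = Long.parseLong(parts[1]);
         int partitionIndex = Integer.parseInt(parts[2]);
         boolean isBatch = parts.length == 4;
         int batchIndex = isBatch ? Integer.parseInt(parts[3]) : 0;
         // With pulsar-client on the classpath, the values would presumably go to
         // new MessageIdImpl(ledgerId, entryId, partitionIndex) or
         // new BatchMessageIdImpl(ledgerId, entryId, partitionIndex, batchIndex).
         return new MessageIdText(ledgerId, entryId, partitionIndex, batchIndex, isBatch);
       } catch (NumberFormatException e) {
         throw new IllegalArgumentException("Non-numeric message id part in: " + text, e);
       }
     }

     public static void main(String[] args) {
       MessageIdText id = parse("12:34:0");
       System.out.println(id.ledgerId + " " + id.entryId + " " + id.partitionIndex);
     }
   }
   ```

   Throwing on malformed input, instead of logging a warning and leaving 
`_messageId` unset as the current constructor does, is a design choice in this 
sketch and not something taken from the existing code.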


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


