mneedham opened a new pull request, #9134: URL: https://github.com/apache/pinot/pull/9134
This is a bug fix for an issue I found when using the timestamp index with streaming data. The problem is that the schema passed into `LLRealtimeSegmentDataManager` (and then into `MutableSegmentImpl`) doesn't know about the extra timestamp fields. As a result, rows are indexed without the new fields, and when Pinot tries to commit the segment we get this type of exception:

```
java.lang.NullPointerException: null
	at org.apache.pinot.segment.spi.creator.ColumnIndexCreationInfo.getDistinctValueCount(ColumnIndexCreationInfo.java:67) ~[pinot-all-0.11.0-SNAPSHOT-jar-with-dependencies.jar:0.11.0-SNAPSHOT-0c1037ed90d75bb7cd95315cd6a6bdd00f34a6c2]
	at org.apache.pinot.segment.local.segment.creator.impl.SegmentColumnarIndexCreator.init(SegmentColumnarIndexCreator.java:201) ~[pinot-all-0.11.0-SNAPSHOT-jar-with-dependencies.jar:0.11.0-SNAPSHOT-0c1037ed90d75bb7cd95315cd6a6bdd00f34a6c2]
	at org.apache.pinot.segment.local.segment.creator.impl.SegmentIndexCreationDriverImpl.build(SegmentIndexCreationDriverImpl.java:216) ~[pinot-all-0.11.0-SNAPSHOT-jar-with-dependencies.jar:0.11.0-SNAPSHOT-0c1037ed90d75bb7cd95315cd6a6bdd00f34a6c2]
	at org.apache.pinot.segment.local.realtime.converter.RealtimeSegmentConverter.build(RealtimeSegmentConverter.java:123) ~[pinot-all-0.11.0-SNAPSHOT-jar-with-dependencies.jar:0.11.0-SNAPSHOT-0c1037ed90d75bb7cd95315cd6a6bdd00f34a6c2]
	at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager.buildSegmentInternal(LLRealtimeSegmentDataManager.java:851) [pinot-all-0.11.0-SNAPSHOT-jar-with-dependencies.jar:0.11.0-SNAPSHOT-0c1037ed90d75bb7cd95315cd6a6bdd00f34a6c2]
	at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager.buildSegmentForCommit(LLRealtimeSegmentDataManager.java:778) [pinot-all-0.11.0-SNAPSHOT-jar-with-dependencies.jar:0.11.0-SNAPSHOT-0c1037ed90d75bb7cd95315cd6a6bdd00f34a6c2]
	at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager$PartitionConsumer.run(LLRealtimeSegmentDataManager.java:677) [pinot-all-0.11.0-SNAPSHOT-jar-with-dependencies.jar:0.11.0-SNAPSHOT-0c1037ed90d75bb7cd95315cd6a6bdd00f34a6c2]
	at java.lang.Thread.run(Thread.java:829) [?:?]
```

I have updated the streaming QuickStart to add the timestamp index. While doing that I had to change the value for `mtime`, because there is another bug: when Pinot tries to add the extra date columns, it runs the following expression:

```
dateTrunc('DAY', '2022-07-29 11:18:23')
```

This doesn't work because the second parameter of this function needs to be a `LONG` value, and at that point the `DataTypeTransformer` hasn't yet coerced the string to that type. I'm not sure what the proper fix for that issue should be, so I'm working around it for the sake of this PR.
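To illustrate the type mismatch behind the `dateTrunc` failure, here is a minimal sketch using only `java.time`, not Pinot's own function. The class and method names (`DateTruncSketch`, `toEpochMillis`, `truncateToDay`) are hypothetical, and it assumes the timestamp string is in UTC with the `yyyy-MM-dd HH:mm:ss` pattern. It shows the coercion to a `LONG` epoch-millisecond value that would need to happen before a day-level truncation can be applied.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

// Hypothetical illustration only; this is not Pinot code.
class DateTruncSketch {
    // Assumed input pattern, matching the string in the failing expression.
    private static final DateTimeFormatter FORMAT =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Coerce the timestamp string to epoch milliseconds (assumes UTC).
    static long toEpochMillis(String timestamp) {
        return LocalDateTime.parse(timestamp, FORMAT)
            .toInstant(ZoneOffset.UTC)
            .toEpochMilli();
    }

    // Truncate an epoch-millisecond value to the start of its UTC day.
    static long truncateToDay(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis)
            .truncatedTo(ChronoUnit.DAYS)
            .toEpochMilli();
    }

    public static void main(String[] args) {
        // The truncation operates on a long, so the string is coerced first.
        long millis = toEpochMillis("2022-07-29 11:18:23");
        System.out.println(truncateToDay(millis));
    }
}
```

In this sketch, passing the raw string straight to `truncateToDay` would not even compile. The failing expression is the runtime analogue of that: the truncation runs before the value has been converted to a `LONG`.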