GXM2333 opened a new issue #8294: URL: https://github.com/apache/pinot/issues/8294
I use Kafka streaming to build my Pinot realtime partitioned table. The partition key is `studentID`, which I defined as `Integer` in both the Pinot table schema and the Kafka producer. I use Integer's `hashCode` function to compute partitions, and I wrote a custom partitioner class for the Kafka producer. The partition function looks like this:

```
public int partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster) {
    List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
    int numPartitions = partitions.size();
    // Note: `key` is NOT converted to String before hashing
    return Utils.toPositive(key.hashCode() % numPartitions);
}
```

I found that I get no results when I use `where studentID = 111` as the predicate. The reason is that the broker calculates the partition number for the partition column this way: https://github.com/apache/pinot/blob/46ed731c4e60c308c9559e46349a984b0ce05ce6/pinot-broker/src/main/java/org/apache/pinot/broker/routing/segmentpruner/PartitionSegmentPruner.java#L216 which converts the field value to a String before calling `hashCode`.

In my opinion, both Kafka and Pinot have type systems, and different types have their own `hashCode` functions. So would it be better to use each type's own `hashCode` function to calculate the partition number for partition columns, rather than converting the value to a String first?
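To illustrate the mismatch, here is a minimal sketch that hashes the same key both ways. The class name, method names, and the partition count of 4 are assumptions made for the example; they are not Pinot or Kafka APIs. The sign bit is masked before the modulo so the result always stays in range, which gives the same answer as the partitioner above for non-negative hashes. How Pinot itself makes the hash non-negative is not reproduced here.

```java
// Hypothetical sketch: names are illustrative, not part of Pinot or Kafka.
final class PartitionMismatchSketch {

    // Producer side: hash the key with its own type's hashCode().
    static int partitionByNativeHash(Object key, int numPartitions) {
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    // Broker side as described in this issue: convert the value to a String, then hash.
    static int partitionByStringHash(Object key, int numPartitions) {
        return (key.toString().hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        int numPartitions = 4;   // assumed partition count
        Integer studentID = 111; // the key from the failing query

        // Integer.hashCode() is the int value itself: 111 % 4 = 3
        System.out.println("native hash -> partition "
                + partitionByNativeHash(studentID, numPartitions));

        // "111".hashCode() is 48657: 48657 % 4 = 1
        System.out.println("string hash -> partition "
                + partitionByStringHash(studentID, numPartitions));
    }
}
```

With these assumed values the producer writes `studentID = 111` to partition 3, while a String-based lookup expects partition 1, so segment pruning would skip the segment that actually holds the row.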