satishd commented on PR #12538: URL: https://github.com/apache/pinot/pull/12538#issuecomment-2032592795
>@satishd I guess you mean collision between two primary keys where one is UUID conformant and the other is not? Right. >Unless we prepend the length of the string in the UUID conformant serialization scheme, there will always be theoretical chances of collision. I think prepending the length of the string removes chances of such collision beyond any reasonable doubt. I got rid of the null terminated strings because those felt too easy to collide (e.g. (foo, bar) and (fo, obar)). @ankitsultana Other approach like falling back to Murmur3 may create collisions but reduce size. I understand that you are trying to make a tradeoff here for consistency and storage by trying to generate the unique keys even when the values are not conforming to the respective UUID format. But it can cause collisions in very rare scenarios. This is because it uses `value.toString().getBytes(StandardCharsets.UTF_8)`. The null values and strings as "null" result the same based on the concat method implementation. These are rare scenarios and the proposed implementation seems reasonable to me. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org