jhyao commented on issue #11948: URL: https://github.com/apache/pinot/issues/11948#issuecomment-1797769758
After publishing the first 1M ids, the producer went on to send another 2M upsert records that reuse the same 1M ids. The producer code looks like this:

```python
import json

# producer, TOPIC, CONTENT_LENGTH, get_time(), generate_random_string() and
# generate_random_number() are defined elsewhere in the script (not shown).

def generate_record(id):
    record = {
        'UID': id,
        'UpdatedTime': get_time(),
        'Content': generate_random_string(CONTENT_LENGTH)
    }
    # Add ten numeric columns, JTD1 .. JTD10.
    for i in range(1, 11):
        record[f'JTD{i}'] = generate_random_number()
    return json.dumps(record)

# Three rounds over the same 1M ids: round 0 inserts, rounds 1 and 2 upsert.
for i in range(3):
    for id in range(1_000_000):
        record = generate_record(id)
        producer.send(TOPIC, key=str(id).encode(), value=record.encode())
        if id % 10000 == 0:
            print(f'Published {i} round, {id} messages')
            producer.flush()
producer.flush()
```

I tested again without upsert and the issue did not occur, so it only affects the upsert table.