cypherean opened a new pull request, #17249:
URL: https://github.com/apache/pinot/pull/17249
Fix for #17233
This PR adds `skipCrcCheckOnLoad` as a table-level config, as an alternative to
the existing server-level config, to allow fine-grained, per-table control over
disabling the CRC check.
New config added: `skipCrcCheckOnLoad` under segmentsConfig in table config
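For illustration, a minimal sketch of where the new key would sit in a table config. Only the key name `skipCrcCheckOnLoad` and its location under `segmentsConfig` come from this PR; the boolean value type, the helper name `with_skip_crc_check`, and the other fields shown are assumptions made for the example.

```python
import json


def with_skip_crc_check(table_config: dict, enabled: bool = True) -> dict:
    """Return a copy of table_config with skipCrcCheckOnLoad set under segmentsConfig.

    Hypothetical helper for illustration; not part of Pinot.
    Assumes the config value is a boolean.
    """
    # Deep copy via a JSON round trip so the caller's dict is not mutated.
    updated = json.loads(json.dumps(table_config))
    updated.setdefault("segmentsConfig", {})["skipCrcCheckOnLoad"] = enabled
    return updated


# Illustrative table config; fields other than skipCrcCheckOnLoad are placeholders.
base = {
    "tableName": "airlineStats",
    "tableType": "OFFLINE",
    "segmentsConfig": {"replication": "1"},
}

print(json.dumps(with_skip_crc_check(base), indent=2))
```

The printed JSON shows the flag nested under `segmentsConfig`, alongside whatever segment settings the table already has.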
## Test Plan
Tested with Pinot batch quickstart:
1. With skipCrcCheckOnLoad config enabled in airlineStats table:
```
Handling message: ZnRecord=cb7bee95-dde7-41c9-a0ad-53e4a26e6381,
{CREATE_TIMESTAMP=1763551164828, EXECUTE_START_TIMESTAMP=1763551164866,
MSG_ID=cb7bee95-dde7-41c9-a0ad-53e4a26e6381, MSG_STATE=new,
MSG_SUBTYPE=REFRESH_SEGMENT, MSG_TYPE=USER_DEFINE_MSG,
PARTITION_NAME=airlineStats_OFFLINE_16074_16074_0,
RESOURCE_NAME=airlineStats_OFFLINE, RETRY_COUNT=0,
SRC_CLUSTER=QuickStartCluster, SRC_INSTANCE_TYPE=PARTICIPANT,
SRC_NAME=Controller_127.0.0.1_9002, TGT_NAME=Server_127.0.0.1_7050,
TGT_SESSION_ID=1005c2cb4d30018, TIMEOUT=-1,
segmentName=airlineStats_OFFLINE_16074_16074_0,
tableName=airlineStats_OFFLINE}{}{}, Stat=Stat {_version=0,
_creationTime=1763551164837, _modifiedTime=1763551164837, _ephemeralOwner=0}
Replacing segment: airlineStats_OFFLINE_16074_16074_0 in table:
airlineStats_OFFLINE
Replacing segment: airlineStats_OFFLINE_16074_16074_0
END: GenericClusterController.onMessage() for cluster QuickStartCluster
144 END:INVOKE CallbackHandler 26,
/QuickStartCluster/INSTANCES/Broker_127.0.0.1_8000/MESSAGES listener:
org.apache.helix.controller.GenericHelixController@10bdfbcc type: CALLBACK
Took: 1ms
The latency of message e3a32ea1-8025-40c7-ac3f-7bfb9f2bf8a7 is 38 ms
END: InstanceMessagesCache.refresh(), 2 of Messages read from ZooKeeper.
took 1 ms.
Start to refresh stale message cache
END: InstanceMessagesCache.refresh(), 2 of Messages read from ZooKeeper.
took 1 ms.
Start to refresh stale message cache
Waiting for lock to reload/refresh: airlineStats_OFFLINE_16074_16074_0,
queue-length: 0
Acquired lock to reload/refresh segment: airlineStats_OFFLINE_16074_16074_0
(lock-time=0ms, queue-length=0)
Skipping replacing segment: airlineStats_OFFLINE_16074_16074_0 even though
its CRC has changed from: 2686808396 to: 2686808519 because skipCrcCheckOnLoad
is enabled
Message cb7bee95-dde7-41c9-a0ad-53e4a26e6381 completed.
Scheduling message e3a32ea1-8025-40c7-ac3f-7bfb9f2bf8a7:
brokerResource:airlineStats_OFFLINE, null->null
Submit task: e3a32ea1-8025-40c7-ac3f-7bfb9f2bf8a7 to pool:
java.util.concurrent.ThreadPoolExecutor@6350aec6[Running, pool size = 0, active
threads = 0, queued tasks = 0, completed tasks = 0]
Message: e3a32ea1-8025-40c7-ac3f-7bfb9f2bf8a7 handling task scheduled
377 END:INVOKE CallbackHandler 28,
/QuickStartCluster/INSTANCES/Broker_127.0.0.1_8000/MESSAGES listener:
org.apache.helix.messaging.handling.HelixTaskExecutor@702c26cd type: CALLBACK
Took: 7ms
handling task: e3a32ea1-8025-40c7-ac3f-7bfb9f2bf8a7 begin, at: 1763551164872
Refreshing segment: airlineStats_OFFLINE_16074_16074_0 for table:
airlineStats_OFFLINE
Refreshed segment: airlineStats_OFFLINE_16074_16074_0 for table:
airlineStats_OFFLINE
Message e3a32ea1-8025-40c7-ac3f-7bfb9f2bf8a7 completed.
Handled request from 127.0.0.1 POST
http://localhost:9002/v2/segments?tableName=airlineStats&tableType=OFFLINE,
content-type multipart/form-data; charset=ISO-8859-1;
boundary=httpclient_boundary_8c757373-6c11-4c17-a0c3-2d3fad879992 status code
200 OK
Delete message cb7bee95-dde7-41c9-a0ad-53e4a26e6381 from zk!
message finished: cb7bee95-dde7-41c9-a0ad-53e4a26e6381, took 11
Message: cb7bee95-dde7-41c9-a0ad-53e4a26e6381 (parent: null) handling task
for airlineStats_OFFLINE:airlineStats_OFFLINE_16074_16074_0 completed at:
1763551164877, results: true. FrameworkTime: 11 ms; HandlerTime: 1 ms.
Delete message e3a32ea1-8025-40c7-ac3f-7bfb9f2bf8a7 from zk!
message finished: e3a32ea1-8025-40c7-ac3f-7bfb9f2bf8a7, took 10
```
2. With skipCrcCheckOnLoad left at its default (disabled) on the billing table:
```
Handling message: ZnRecord=c5bb3f12-3937-4761-99ed-b7833f02789b,
{CREATE_TIMESTAMP=1763552941563, EXECUTE_START_TIMESTAMP=1763552941592,
MSG_ID=c5bb3f12-3937-4761-99ed-b7833f02789b, MSG_STATE=new,
MSG_SUBTYPE=REFRESH_SEGMENT, MSG_TYPE=USER_DEFINE_MSG,
PARTITION_NAME=billing_OFFLINE_0, RESOURCE_NAME=billing_OFFLINE, RETRY_COUNT=0,
SRC_CLUSTER=QuickStartCluster, SRC_INSTANCE_TYPE=PARTICIPANT,
SRC_NAME=Controller_127.0.0.1_9002, TGT_NAME=Server_127.0.0.1_7050,
TGT_SESSION_ID=1005c43051f0018, TIMEOUT=-1, segmentName=billing_OFFLINE_0,
tableName=billing_OFFLINE}{}{}, Stat=Stat {_version=0,
_creationTime=1763552941572, _modifiedTime=1763552941572, _ephemeralOwner=0}
CallbackHandler26, Subscribing to path:
/QuickStartCluster/INSTANCES/Broker_127.0.0.1_8000/MESSAGES took: 0
START: GenericClusterController.onMessage() for cluster QuickStartCluster
Replacing segment: billing_OFFLINE_0 in table: billing_OFFLINE
Replacing segment: billing_OFFLINE_0
CallbackHandler28, Subscribing to path:
/QuickStartCluster/INSTANCES/Broker_127.0.0.1_8000/MESSAGES took: 0
END: GenericClusterController.onMessage() for cluster QuickStartCluster
144 END:INVOKE CallbackHandler 26,
/QuickStartCluster/INSTANCES/Broker_127.0.0.1_8000/MESSAGES listener:
org.apache.helix.controller.GenericHelixController@382d549a type: CALLBACK
Took: 0ms
START: InstanceMessagesCache.refresh()
The latency of message dd9dfcb1-3f9c-4c3e-b009-9ba5f24ef8bb is 30 ms
END: InstanceMessagesCache.refresh(), 2 of Messages read from ZooKeeper.
took 1 ms.
Start to refresh stale message cache
Waiting for lock to reload/refresh: billing_OFFLINE_0, queue-length: 0
Acquired lock to reload/refresh segment: billing_OFFLINE_0 (lock-time=0ms,
queue-length=0)
Replacing segment: billing_OFFLINE_0 because its CRC has changed from:
2262953635 to: 2262953636
Downloading and loading segment: billing_OFFLINE_0
Downloading segment: billing_OFFLINE_0 from:
http://127.0.0.1:9002/segments/billing/billing_OFFLINE_0
Acquiring instance level segment download semaphore for segment:
billing_OFFLINE_0, queue-length: 0
Acquired instance level segment download semaphore for segment:
billing_OFFLINE_0 (lock-time=0ms, queue-length=0).
```
Also tested on our test clusters with realtime tables that use a Lucene index
and upsert compaction.
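The two log excerpts above differ at one decision point on segment refresh. A rough sketch of that decision, modeled only on the log messages: the function and class names are hypothetical, the unchanged-CRC branch is an assumption (it does not appear in the logs), and this is not the actual Pinot server code.

```python
from dataclasses import dataclass


@dataclass
class RefreshDecision:
    replace: bool  # True if the segment should be downloaded and reloaded
    reason: str    # Message mirroring the wording seen in the logs


def decide_refresh(segment: str, local_crc: int, new_crc: int,
                   skip_crc_check_on_load: bool) -> RefreshDecision:
    """Hypothetical model of the refresh decision visible in the logs."""
    if local_crc == new_crc:
        # Assumption: nothing to do when the CRC matches (not shown in the logs).
        return RefreshDecision(False, f"CRC unchanged for segment: {segment}")
    if skip_crc_check_on_load:
        # Case 1 (airlineStats): CRC differs, but the flag suppresses the replace.
        return RefreshDecision(
            False,
            f"Skipping replacing segment: {segment} even though its CRC has "
            f"changed from: {local_crc} to: {new_crc} because "
            f"skipCrcCheckOnLoad is enabled",
        )
    # Case 2 (billing): CRC differs and the flag is off, so the segment is replaced.
    return RefreshDecision(
        True,
        f"Replacing segment: {segment} because its CRC has changed from: "
        f"{local_crc} to: {new_crc}",
    )


# CRC values taken from the two log excerpts.
print(decide_refresh("airlineStats_OFFLINE_16074_16074_0", 2686808396, 2686808519, True).reason)
print(decide_refresh("billing_OFFLINE_0", 2262953635, 2262953636, False).reason)
```

How the table-level flag combines with the server-level one is not covered here; the sketch takes a single effective flag as input.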
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]