fx19880617 opened a new pull request #5864:
URL: https://github.com/apache/incubator-pinot/pull/5864


   ## Description
   
   Current Pinot server start ServerInstance before ControllerLeaderLocator 
creation.
   This is typically ok but in one extreme case that during realtime server 
bootstrap, one segment is consumed to threshold and trying to persist, it will 
use `ControllerLeaderLocator` in 
`ServerSegmentCompletionProtocolHandler.createSegmentCompletionUrl(...)` which 
may not be created yet.
   
   Below is a sample logs:
   ```
   2020/08/14 00:20:05.855 INFO [PluginManager] [main] Plugins root dir is 
[/opt/pinot/plugins]
   2020/08/14 00:20:05.869 INFO [PluginManager] [main] Loading all plugins. 
Please use env variable 'plugins.include' to customize.
   2020/08/14 00:20:05.869 INFO [PluginManager] [main] Trying to load plugin 
[pinot-batch-ingestion-hadoop] from location 
[/opt/pinot/plugins/pinot-batch-ingestion/pinot-batch-ingestion-hadoop]
   2020/08/14 00:20:05.870 INFO [PluginManager] [main] Successfully loaded 
plugin [pinot-batch-ingestion-hadoop] from jar file 
[/opt/pinot/plugins/pinot-batch-ingestion/pinot-batch-ingestion-hadoop/pinot-batch-ingestion-hadoop-0.5.0-SNAPSHOT-shaded.jar]
   ...
   ...
   2020/08/14 00:20:06.199 INFO [ServerInstance] [main] Initializing server 
instance
   2020/08/14 00:20:06.199 INFO [ServerInstance] [main] Initializing server 
metrics
   2020/08/14 00:20:06.220 INFO [ServerInstance] [main] Initializing instance 
data manager of class: 
org.apache.pinot.server.starter.helix.HelixInstanceDataManager
   2020/08/14 00:20:06.223 INFO [HelixInstanceDataManager] [main] Initializing 
Helix instance data manager
   2020/08/14 00:20:06.224 INFO [HelixInstanceDataManagerConfig] [main] 
InstanceDataManagerConfig, key: datadir , value: /var/pinot/server/data/index
   2020/08/14 00:20:06.225 INFO [HelixInstanceDataManagerConfig] [main] 
InstanceDataManagerConfig, key: segmenttardir , value: 
/var/pinot/server/data/segment
   2020/08/14 00:20:06.225 INFO [HelixInstanceDataManagerConfig] [main] 
InstanceDataManagerConfig, key: data.manager.class , value: 
org.apache.pinot.server.starter.helix.HelixInstanceDataManager
   ...
   2020/08/14 00:20:08.114 INFO [HelixTaskExecutor] [main] Scheduling message 
553a8091-14b7-48a9-91b6-61c9c722112d: 
testTable_REALTIME:testTable__0__4__20200813T2127Z, OFFLINE->CONSUMING
   2020/08/14 00:20:08.115 INFO [HelixTaskExecutor] [main] Submit task: 
553a8091-14b7-48a9-91b6-61c9c722112d to pool: 
java.util.concurrent.ThreadPoolExecutor@48b0e701[Running, pool size = 1, active 
threads = 1, queued tasks = 0, completed tasks = 0]
   2020/08/14 00:20:08.203 INFO [HelixStateTransitionHandler] 
[HelixTaskExecutor-message_handle_thread] Instance 
Server_my-host-server-0.my-host-server-headless.my-host.svc.cluster.local_8098, 
partition testTable__0__4__20200813T2127Z received state transition from 
OFFLINE to CONSUMING on session 100e1f0f4d00022, message id: 
553a8091-14b7-48a9-91b6-61c9c722112d
   2020/08/14 00:20:08.203 INFO 
[SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel] 
[HelixTaskExecutor-message_handle_thread] 
SegmentOnlineOfflineStateModel.onBecomeConsumingFromOffline() : 
ZnRecord=553a8091-14b7-48a9-91b6-61c9c722112d, {CREATE_TIMESTAMP=1597364407619, 
ClusterEventName=CurrentStateChange, EXECUTE_START_TIMESTAMP=1597364408128, 
EXE_SESSION_ID=100e1f0f4d00022, FROM_STATE=OFFLINE, 
MSG_ID=553a8091-14b7-48a9-91b6-61c9c722112d, MSG_STATE=read, 
MSG_TYPE=STATE_TRANSITION, PARTITION_NAME=testTable__0__4__20200813T2127Z, 
READ_TIMESTAMP=1597364407995, RESOURCE_NAME=testTable_REALTIME, 
RESOURCE_TAG=testTable_REALTIME, 
SRC_NAME=my-host-controller-0.my-host-controller-headless.my-host.svc.cluster.local_9000,
 SRC_SESSION_ID=100e1f0f4d0001e, 
STATE_MODEL_DEF=SegmentOnlineOfflineStateModel, 
STATE_MODEL_FACTORY_NAME=DEFAULT, 
TGT_NAME=Server_my-host-server-0.my-host-server-headless.my-host.svc.cluster.local_8098,
 TGT_SESSION_ID=100e1f0f4d00022, TO_STATE=CONSUMING}{}{}
 , Stat=Stat {_version=0, _creationTime=1597364407632, 
_modifiedTime=1597364407632, _ephemeralOwner=0}
   2020/08/14 00:20:08.203 INFO 
[SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel] 
[HelixTaskExecutor-message_handle_thread] 
SegmentOnlineOfflineStateModel.onBecomeOnlineFromOffline() : 
ZnRecord=553a8091-14b7-48a9-91b6-61c9c722112d, {CREATE_TIMESTAMP=1597364407619, 
ClusterEventName=CurrentStateChange, EXECUTE_START_TIMESTAMP=1597364408128, 
EXE_SESSION_ID=100e1f0f4d00022, FROM_STATE=OFFLINE, 
MSG_ID=553a8091-14b7-48a9-91b6-61c9c722112d, MSG_STATE=read, 
MSG_TYPE=STATE_TRANSITION, PARTITION_NAME=testTable__0__4__20200813T2127Z, 
READ_TIMESTAMP=1597364407995, RESOURCE_NAME=testTable_REALTIME, 
RESOURCE_TAG=testTable_REALTIME, 
SRC_NAME=my-host-controller-0.my-host-controller-headless.my-host.svc.cluster.local_9000,
 SRC_SESSION_ID=100e1f0f4d0001e, 
STATE_MODEL_DEF=SegmentOnlineOfflineStateModel, 
STATE_MODEL_FACTORY_NAME=DEFAULT, 
TGT_NAME=Server_my-host-server-0.my-host-server-headless.my-host.svc.cluster.local_8098,
 TGT_SESSION_ID=100e1f0f4d00022, TO_STATE=CONSUMING}{}{}, S
 tat=Stat {_version=0, _creationTime=1597364407632, 
_modifiedTime=1597364407632, _ephemeralOwner=0}
   2020/08/14 00:20:08.203 INFO [HelixInstanceDataManager] 
[HelixTaskExecutor-message_handle_thread] Adding segment: 
testTable__0__4__20200813T2127Z to table: testTable_REALTIME
   2020/08/14 00:20:09.093 INFO 
[LLRealtimeSegmentDataManager_testTable__0__4__20200813T2127Z] 
[HelixTaskExecutor-message_handle_thread] RealtimeDataResourceZKMetadata 
contains no information about sorted column for segment 
testTable__0__4__20200813T2127Z
   2020/08/14 00:20:09.099 INFO 
[LLRealtimeSegmentDataManager_testTable__0__4__20200813T2127Z] 
[HelixTaskExecutor-message_handle_thread] Creating new stream consumer, reason: 
Starting
   2020/08/14 00:20:09.129 INFO 
[LLRealtimeSegmentDataManager_testTable__0__4__20200813T2127Z] 
[HelixTaskExecutor-message_handle_thread] Creating new stream metadata 
provider, reason: Starting
   ...
   2020/08/14 00:20:09.173 INFO [FixedByteSVMutableForwardIndex] 
[HelixTaskExecutor-message_handle_thread] Allocating 200000 bytes for: 
testTable__0__4__20200813T2127Z:ItemStatus.sv.unsorted.fwd
   ...
   2020/08/14 00:20:09.228 INFO 
[MutableSegmentImpl_testTable__0__4__20200813T2127Z_testTables] 
[HelixTaskExecutor-message_handle_thread] Metrics aggregation is disabled.
   2020/08/14 00:20:09.272 INFO 
[LLRealtimeSegmentDataManager_testTable__0__4__20200813T2127Z] 
[HelixTaskExecutor-message_handle_thread] Starting consumption on realtime 
consuming segment testTable__0__4__20200813T2127Z maxRowCount 50000 maxEndTime 
2020-08-14T00:30:09.229Z
   2020/08/14 00:20:09.292 INFO 
[LLRealtimeSegmentDataManager_testTable__0__4__20200813T2127Z] 
[HelixTaskExecutor-message_handle_thread] Created new consumer thread 
Thread[testTable__0__4__20200813T2127Z,5,main] for 
org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager@a12cbb0
   2020/08/14 00:20:09.299 INFO [testTable_REALTIME-RealtimeTableDataManager] 
[HelixTaskExecutor-message_handle_thread] Initialize RealtimeSegmentDataManager 
- testTable__0__4__20200813T2127Z
   2020/08/14 00:20:09.299 INFO [HelixInstanceDataManager] 
[HelixTaskExecutor-message_handle_thread] Added segment: 
testTable__0__4__20200813T2127Z to table: testTable_REALTIME
   2020/08/14 00:20:09.300 INFO 
[LLRealtimeSegmentDataManager_testTable__0__4__20200813T2127Z] 
[testTable__0__4__20200813T2127Z] Starting consumption loop start offset 
200000, finalOffset null
   2020/08/14 00:20:09.383 INFO [HelixTask] 
[HelixTaskExecutor-message_handle_thread] Message: 
553a8091-14b7-48a9-91b6-61c9c722112d (parent: null) handling task for 
testTable_REALTIME:testTable__0__4__20200813T2127Z completed at: 1597364409383, 
results: true. FrameworkTime: 53 ms; HandlerTime: 1211 ms.
   2020/08/14 00:20:09.717 INFO [Metadata] [testTable__0__4__20200813T2127Z] 
Cluster ID: 21TWeeruS9eTKd-gAXF1qQ
   2020/08/14 00:20:14.127 INFO [AbstractCoordinator] 
[testTable__0__4__20200813T2127Z] [Consumer clientId=consumer-2, groupId=] 
Discovered group coordinator 
b-3.gc-pinot-kafka.hgsrac.c3.kafka.us-west-2.amazonaws.com:9092 (id: 2147483644 
rack: null)
   2020/08/14 00:20:18.489 INFO 
[LLRealtimeSegmentDataManager_testTable__0__4__20200813T2127Z] 
[testTable__0__4__20200813T2127Z] Stopping consumption due to row limit 
nRows=50000 numRowsIndexed=50000, numRowsConsumed=50000
   2020/08/14 00:20:18.489 INFO 
[LLRealtimeSegmentDataManager_testTable__0__4__20200813T2127Z] 
[testTable__0__4__20200813T2127Z] Stopping consumption due to row limit 
nRows=50000 numRowsIndexed=50000, numRowsConsumed=50000
   2020/08/14 00:20:18.489 ERROR 
[LLRealtimeSegmentDataManager_testTable__0__4__20200813T2127Z] 
[testTable__0__4__20200813T2127Z] Exception while in work
   java.lang.RuntimeException: Not yet created
        at 
org.apache.pinot.server.realtime.ControllerLeaderLocator.getInstance(ControllerLeaderLocator.java:79)
 
~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-8ab032fa05b514ab6614396d632b9c9e52b9100d]
        at 
org.apache.pinot.server.realtime.ServerSegmentCompletionProtocolHandler.createSegmentCompletionUrl(ServerSegmentCompletionProtocolHandler.java:188)
 
~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-8ab032fa05b514ab6614396d632b9c9e52b9100d]
        at 
org.apache.pinot.server.realtime.ServerSegmentCompletionProtocolHandler.segmentConsumed(ServerSegmentCompletionProtocolHandler.java:170)
 
~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-8ab032fa05b514ab6614396d632b9c9e52b9100d]
        at 
org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager.postSegmentConsumedMsg(LLRealtimeSegmentDataManager.java:926)
 
~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-8ab032fa05b514ab6614396d632b9c9e52b9100d]
        at 
org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager$PartitionConsumer.run(LLRealtimeSegmentDataManager.java:553)
 
[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-8ab032fa05b514ab6614396d632b9c9e52b9100d]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_265]
   ```
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to