lksvenoy-r7 opened a new issue #8014: URL: https://github.com/apache/pinot/issues/8014
We are seeing an issue in production where the RealtimeToOfflineSegments task is taking a very long time to complete, and timing out in the end. There are two areas where it is slow, during segment mapping: `2022-01-12T12:40:28Z Initialized mapper with 43 record readers` And during row sorting `2022-01-13T10:10:09Z Start sorting on numRows: 42757049, numSortFields: 9` We are using Pinot 0.9.3 with 8 minions, with 8 GB heap, 16GB memory and 4 cores of CPU. Task config: ``` "taskTypeConfigsMap": { "RealtimeToOfflineSegmentsTask": { "bucketTimePeriod": "3h", "bufferTimePeriod": "2d", "mergeType": "dedup", "maxNumRecordsPerSegment": 10000000, "roundBucketTimePeriod": "1h" } } ``` See log files and profiling screenshots **Full run log** [log1.txt](https://github.com/apache/pinot/files/7862331/log1.txt) Sometimes segment mapping happens with up to 90 record readers. This stage then takes 35 minutes up to 1 hour. Once that’s done, it starts the usual reduction and sorting phase. (Even though the task should have been cancelled by now due to hitting timeout). It then starts destroying segments, and then immediately finishes with a timeout after destroying segments. I’m not sure exactly what happens here, but I’m wondering if this is causing an issue where the task is not properly cleaning up, causing loss of data. The logs above show the final moments where the task times out after issuing delete segments. I have been talking with @richardstartin on slack, and he pointed me towards this issue: https://github.com/apache/pinot/issues/7929 To circumvent this problem, I disabled all raw indexes. After doing that, I ran another task, here is the log output: [log2.txt](https://github.com/apache/pinot/files/7862358/log2.txt) Looking at this log file, we can see that sorting rows took 1.42 hours for 75 million rows ``` 2022-01-12T14:19:37Z Start sorting on numRows: 75094464, numSortFields: 9 2022-01-12T15:45:24Z Finish sorting in 5147498ms ``` We decided to profile the application to try figure out what is going on using: `jcmd <pid> JFR.start duration=300s filename=minion.jfr settings=profile` **Profiling during segment mapping:** `2022-01-13T09:53:44Z Initialized mapper with 45 record readers, output dir: /var/pinot/minion/data/RealtimeToOfflineSegmentsTask/tmp-a4085745-be54-4411-abfa-c0dfb8fa05a1/workingDir/mapper_output, timeHandler: class org.apache.pinot.core.segment.processing.timehandler.EpochTimeHandler, partitioners: class org.apache.pinot.core.segment.processing.partitioner.TableConfigPartitioner` **Top 5 methods during segment mapping:**      **Top 5 TLAB during segment mapping:**      **Profiling during segment sorting** `2022-01-13T10:10:09Z Start sorting on numRows: 42757049, numSortFields: 9` **Top 5 methods during segment sorting:**      **Top 5 TLAB during segment sorting:**      Looking at the EBS volume attached to the minion, which has 500 GB and 3000 IOPs, it's maxing out the iOPS on reads.  Here is the resource usage of the minion:    @richardstartin indicated that this is due a 9 dimensional sort taking place (on 42 million rows) due to dedup being enabled. I am currently trying to work-around the issue by disabling dedup to see how that goes. This still does not explain why the segment mapping is slow. Here is the code Richard Startin linked which is causing the excessive disk usage ``` _sortedRowIds = new int[numRows]; for (int i = 0; i < numRows; i++) { _sortedRowIds[i] = i; } Arrays .quickSort(0, _endRowId, (i1, i2) -> _fileReader.compare(_sortedRowIds[i1], _sortedRowIds[i2]), (i1, i2) -> { int temp = _sortedRowIds[i1]; _sortedRowIds[i1] = _sortedRowIds[i2]; _sortedRowIds[i2] = temp; }); ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org