itschrispeck opened a new issue, #15444: URL: https://github.com/apache/pinot/issues/15444
We observe that the server reads far more data than required on startup. In the screenshot below, we see ~6TB of data read from disk upon server restart. The servers have less than 400GB of memory allocated, so it doesn't make sense to prefetch this much data. No segment format changes occurred during this restart that would explain the reads.

This behavior significantly increases server startup time and also affects colocated services in a cloud environment. In this case, one Pinot server restarting saturated disk I/O and impacted a Pinot server in another cluster.

<img width="1200" alt="Image" src="https://github.com/user-attachments/assets/3f6c4d77-d443-4b78-8c70-6270e6baa7e1" />

Some possible improvements:

1. Make the existing data prefetch and header prefetch thresholds configurable. The cluster where this is observed has hundreds of tables with hundreds to thousands of columns each. Prefetching the index header for each column, combined with our typical read-ahead, could account for a significant percentage of the overall data size.
2. Limit the overall amount of data prefetched. If I understand correctly, it never makes sense to prefetch more header data than we have memory.
3. Use a different madvise setting. I'm curious whether others using MADV_RANDOM have noticed any improvement in this area.
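As a rough illustration of improvement 2 (a limit on overall data prefetched), here is a minimal sketch of a process-wide byte budget that a prefetch path could consult before touching pages. `PrefetchBudget`, `tryAcquire`, and `remaining` are hypothetical names for this example and are not part of Pinot; how such a budget would be wired into segment loading is not shown and is an assumption.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical global cap on prefetched bytes (not an existing Pinot class).
 * Callers reserve bytes before prefetching and skip the prefetch on failure.
 */
final class PrefetchBudget {
    private final AtomicLong remainingBytes;

    PrefetchBudget(long maxBytes) {
        if (maxBytes < 0) {
            throw new IllegalArgumentException("maxBytes must be non-negative");
        }
        this.remainingBytes = new AtomicLong(maxBytes);
    }

    /** Atomically reserves the given number of bytes; returns false if the budget is too small. */
    boolean tryAcquire(long bytes) {
        if (bytes < 0) {
            throw new IllegalArgumentException("bytes must be non-negative");
        }
        while (true) {
            long current = remainingBytes.get();
            if (bytes > current) {
                return false; // over budget: caller should skip this prefetch
            }
            if (remainingBytes.compareAndSet(current, current - bytes)) {
                return true;
            }
        }
    }

    /** Bytes still available for prefetching. */
    long remaining() {
        return remainingBytes.get();
    }
}
```

With a budget sized at or below available memory, a loader would call `tryAcquire(headerSizeBytes)` per column index and fall back to on-demand paging once it returns false.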